██████╗  ██████╗ ██╗  ██╗ █████╗ 
╚════██╗██╔═══██╗██║  ██║██╔══██╗
 █████╔╝██║   ██║███████║███████║
 ╚═══██╗██║   ██║██╔══██║██╔══██║
██████╔╝╚██████╔╝██║  ██║██║  ██║
╚═════╝  ╚═════╝ ╚═╝  ╚═╝╚═╝  ╚═╝

Welcome to 3OHA, a place for random notes, thoughts, and factoids that I want to share or remember


2 April 2026

On IPv4 address obfuscation

IPv4 addresses are 32-bit integers. They are commonly represented in human-readable notation for textual contexts, such as when writing them in a configuration file or providing them as variable inputs. The most common of these representations is dot-decimal notation, consisting of four decimal numbers, each ranging from 0 to 255, separated by dots (for example, 192.168.1.15).
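To make the relationship between the two views concrete, here is a minimal sketch of the arithmetic. The helper names pack_ipv4() and unpack_ipv4() are mine, chosen for illustration, and do not belong to any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack the four octets of a.b.c.d into one 32-bit value (host byte order). */
uint32_t pack_ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8)  |  (uint32_t)d;
}

/* Split a 32-bit value back into its four octets, most significant first. */
void unpack_ipv4(uint32_t value, uint8_t out[4])
{
    assert(out != NULL);
    out[0] = (uint8_t)(value >> 24);
    out[1] = (uint8_t)(value >> 16);
    out[2] = (uint8_t)(value >> 8);
    out[3] = (uint8_t)value;
}
```

For 192.168.1.15 this gives 192 * 2^24 + 168 * 2^16 + 1 * 2^8 + 15 = 3232235791, or 0xC0A8010F in hexadecimal.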

The implementation of IP networking in 4.2BSD introduced a function named inet_aton(), which parses a character string as an IP address. The problem its authors faced was that the textual representations of IP addresses had historically been loosely defined, and the standard dotted-octet format was never formally specified in an RFC. Consequently, inet_aton() opted to support the various IP address representations mentioned in several RFCs at the time. For example, the Mail Transfer Protocol (MTP) specified in RFC 780 accepts both a single number representing the entire 32-bit address and dot-separated octet values. The inet_aton() function also admits two intermediate syntaxes: a.b.c, in which c is treated as a 16-bit value, and a.b, in which b is treated as a 24-bit value.

Furthermore, it allows these constituent parts of an address (octets, 16-bit, or 24-bit values) to be specified in decimal, octal, or hexadecimal. These features are still supported, although some consider them non-standard, notwithstanding the lack of a primary standard. As an example, consider how the current implementation of inet_aton() in Bionic (Google's implementation of libc for Android) parses IPv4 addresses:

/* 
 * Check whether "cp" is a valid ascii representation
 * of an Internet address and convert to a binary address.
 * Returns 1 if the address is valid, 0 if not.
 * This replaces inet_addr, the return value from which
 * cannot distinguish between failure and a local broadcast address.
 */
int
inet_aton(const char *cp, struct in_addr *addr)
{
	in_addr_t val;
	int base, n;
	char c;
	u_int parts[4];
	u_int *pp = parts;
	c = *cp;
	for (;;) {
		/*
		 * Collect number up to ``.''.
		 * Values are specified as for C:
		 * 0x=hex, 0=octal, isdigit=decimal.
		 */
		if (!isdigit(c))
			return (0);
		val = 0; base = 10;
		if (c == '0') {
			c = *++cp;
			if (c == 'x' || c == 'X')
				base = 16, c = *++cp;
			else
				base = 8;
		}
		for (;;) {
			if (isascii(c) && isdigit(c)) {
				val = (val * base) + (c - '0');
				c = *++cp;
			} else if (base == 16 && isascii(c) && isxdigit(c)) {
				val = (val << 4) |
					(c + 10 - (islower(c) ? 'a' : 'A'));
				c = *++cp;
			} else
				break;
		}
		if (c == '.') {
			/*
			 * Internet format:
			 *	a.b.c.d
			 *	a.b.c	(with c treated as 16 bits)
			 *	a.b	(with b treated as 24 bits)
			 */
			if (pp >= parts + 3)
				return (0);
			*pp++ = val;
			c = *++cp;
		} else
			break;
	}
	/*
	 * Check for trailing characters.
	 */
	if (c != '\0' && (!isascii(c) || !isspace(c)))
		return (0);
	/*
	 * Concoct the address according to
	 * the number of parts specified.
	 */
	n = pp - parts + 1;
	switch (n) {
	case 0:
		return (0);		/* initial nondigit */
	case 1:				/* a -- 32 bits */
		break;
	case 2:				/* a.b -- 8.24 bits */
		if ((val > 0xffffff) || (parts[0] > 0xff))
			return (0);
		val |= parts[0] << 24;
		break;
	case 3:				/* a.b.c -- 8.8.16 bits */
		if ((val > 0xffff) || (parts[0] > 0xff) || (parts[1] > 0xff))
			return (0);
		val |= (parts[0] << 24) | (parts[1] << 16);
		break;
	case 4:				/* a.b.c.d -- 8.8.8.8 bits */
		if ((val > 0xff) || (parts[0] > 0xff) || (parts[1] > 0xff) || (parts[2] > 0xff))
			return (0);
		val |= (parts[0] << 24) | (parts[1] << 16) | (parts[2] << 8);
		break;
	}
	if (addr)
		addr->s_addr = htonl(val);
	return (1);
}

The equivalent function in glibc 2.43 is __inet_network(), which inherits the same features from the original inet_aton():

/*
 * Internet network address interpretation routine.
 * The library routines call this routine to interpret
 * network numbers.
 */
uint32_t
__inet_network (const char *cp)
{
	uint32_t val, base, n, i;
	char c;
	uint32_t parts[4], *pp = parts;
	int digit;

again:
	val = 0; base = 10; digit = 0;
	if (*cp == '0')
		digit = 1, base = 8, cp++;
	if (*cp == 'x' || *cp == 'X')
		digit = 0, base = 16, cp++;
	while ((c = *cp) != 0) {
		if (val > 0xff)
			return (INADDR_NONE);
		if (isdigit(c)) {
			if (base == 8 && (c == '8' || c == '9'))
				return (INADDR_NONE);
			val = (val * base) + (c - '0');
			cp++;
			digit = 1;
			continue;
		}
		if (base == 16 && isxdigit(c)) {
			val = (val << 4) + (tolower (c) + 10 - 'a');
			cp++;
			digit = 1;
			continue;
		}
		break;
	}
	if (!digit)
		return (INADDR_NONE);
	if (pp >= parts + 4 || val > 0xff)
		return (INADDR_NONE);
	if (*cp == '.') {
		*pp++ = val, cp++;
		goto again;
	}
	while (isspace(*cp))
		cp++;
	if (*cp)
		return (INADDR_NONE);
	if (pp >= parts + 4 || val > 0xff)
		return (INADDR_NONE);
	*pp++ = val;
	n = pp - parts;
	for (val = 0, i = 0; i < n; i++) {
		val <<= 8;
		val |= parts[i] & 0xff;
	}
	return (val);
}

Thus, the following IPv4 addresses are all equivalent:
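As an illustration, the sketch below lists several spellings of 192.168.1.15 and checks that inet_aton() maps each of them to the same 32-bit value. The address, the array EXAMPLE_SPELLINGS, and the helpers parse_ipv4() and all_parse_to() are my own choices for this example. I am assuming a libc whose inet_aton() follows the BSD behaviour described above.

```c
#define _DEFAULT_SOURCE /* glibc: expose inet_aton() under strict -std= modes */
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Example spellings of 192.168.1.15 (0xC0A8010F). */
const char *const EXAMPLE_SPELLINGS[] = {
    "192.168.1.15",     /* a.b.c.d, all decimal                  */
    "192.168.271",      /* a.b.c,   c = 1*256 + 15 (16 bits)     */
    "192.11010319",     /* a.b,     b = 168*65536 + 271 (24 bits)*/
    "3232235791",       /* single 32-bit value, decimal          */
    "0xC0A8010F",       /* single 32-bit value, hexadecimal      */
    "030052000417",     /* single 32-bit value, octal            */
    "0300.0250.01.017", /* four octal parts                      */
    "0xC0.168.0x1.017", /* mixed bases                           */
};
const size_t EXAMPLE_SPELLING_COUNT =
    sizeof EXAMPLE_SPELLINGS / sizeof EXAMPLE_SPELLINGS[0];

/* Parse text with inet_aton(); on success store the host-order value. */
int parse_ipv4(const char *text, uint32_t *out)
{
    struct in_addr addr;

    assert(text != NULL && out != NULL);
    if (inet_aton(text, &addr) == 0)
        return 0;
    *out = ntohl(addr.s_addr);
    return 1;
}

/* Return 1 only if every string parses and yields the expected value. */
int all_parse_to(const char *const *texts, size_t count, uint32_t expected)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t value;
        if (!parse_ipv4(texts[i], &value) || value != expected)
            return 0;
    }
    return 1;
}
```

Calling all_parse_to(EXAMPLE_SPELLINGS, EXAMPLE_SPELLING_COUNT, 0xC0A8010F) is how I would check a given platform, since the result depends on the libc in use.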

The 4.2BSD inet_aton() was widely copied and became the de facto standard for the textual representation of IPv4 addresses. As a consequence, applications whose string-to-address functions are based on inet_aton() typically accept all these representations interchangeably. In the cases I have tested, the same applies to programs using getaddrinfo() and inet_addr(), such as curl and nc, as well as bind().
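One way to check the getaddrinfo() case on a given system is the sketch below. It asks getaddrinfo() to treat the string purely as a numeric literal (AI_NUMERICHOST, so no DNS query is made) and returns the resulting address. The helper name resolve_numeric_ipv4() is hypothetical, and whether the non-dotted forms are accepted depends on the libc; I am assuming glibc-like behaviour here.

```c
#define _POSIX_C_SOURCE 200112L /* expose getaddrinfo() under strict -std= modes */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Return 1 and store the host-order address if getaddrinfo() accepts
 * `host` as a numeric IPv4 literal; return 0 otherwise.
 */
int resolve_numeric_ipv4(const char *host, uint32_t *out)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;

    assert(host != NULL && out != NULL);
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST; /* literal only, never query DNS */

    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return 0;
    *out = ntohl(((struct sockaddr_in *)res->ai_addr)->sin_addr.s_addr);
    freeaddrinfo(res);
    return 1;
}
```

On the systems I would expect this to run on, strings such as "2130706433" and "0x7f.1" come back as 127.0.0.1, but that is an observation about particular libc implementations rather than a guarantee.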

Bash has no native socket API, but its built-in pseudo-device for networking (/dev/tcp/<host>/<port>) passes the host string to getaddrinfo(), so all numeric representations should work there as well.

The inet_pton() function (the "p" stands for "presentation") was introduced in 1997 as part of RFC 2133, which extended the socket interface for IPv6. Unlike inet_aton(), inet_pton() follows a stricter validation procedure and, for IPv4, supports only the dot-decimal form with exactly four decimal parts.
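The difference is easy to observe side by side. In this sketch, accepted_by_aton() and accepted_by_pton() are hypothetical wrappers of mine that only report whether each parser accepts a given string.

```c
#define _DEFAULT_SOURCE /* glibc: expose inet_aton() under strict -std= modes */
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>

/* 1 if the lenient BSD parser accepts the string, 0 otherwise. */
int accepted_by_aton(const char *text)
{
    struct in_addr addr;

    assert(text != NULL);
    return inet_aton(text, &addr) != 0;
}

/* 1 if the strict parser accepts the string as an IPv4 address. */
int accepted_by_pton(const char *text)
{
    struct in_addr addr;

    assert(text != NULL);
    return inet_pton(AF_INET, text, &addr) == 1;
}
```

With these, "192.168.1.15" is accepted by both functions, whereas "0xC0A8010F", "3232235791", and "192.168.271" should pass accepted_by_aton() but fail accepted_by_pton().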

Malware and obfuscated IP addresses

Several malware families have used IPv4 addresses written in notations other than dot-decimal. This practice seeks to evade static analysis tools that do not recognize all available representations. One example is a 2023 Shellbot campaign that used hexadecimal representations of IPv4 addresses. Some reports refer to this practice as a form of obfuscation specific to IPv4 addresses, but in reality it simply exploits incomplete pattern-matching rules that are unable to recognize IPv4 addresses written in unusual, yet valid, notations.
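A matching rule can sidestep the problem by normalising each candidate string before comparing it. The sketch below is one possible approach, not a description of how any particular tool works: it parses the candidate with inet_aton() and prints it back in dot-decimal with inet_ntop(). The names canonical_ipv4() and matches_indicator() are hypothetical, and I am assuming the candidate token has already been extracted from the sample.

```c
#define _DEFAULT_SOURCE /* glibc: expose inet_aton() under strict -std= modes */
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Rewrite any spelling accepted by inet_aton() as dot-decimal.
 * `out` should hold at least INET_ADDRSTRLEN bytes. Returns 1 on success.
 */
int canonical_ipv4(const char *candidate, char *out, size_t out_size)
{
    struct in_addr addr;

    assert(candidate != NULL && out != NULL);
    if (inet_aton(candidate, &addr) == 0)
        return 0;
    return inet_ntop(AF_INET, &addr, out, (socklen_t)out_size) != NULL;
}

/* Compare a candidate, in any notation, with a dot-decimal indicator. */
int matches_indicator(const char *candidate, const char *indicator)
{
    char canon[INET_ADDRSTRLEN];

    assert(indicator != NULL);
    return canonical_ipv4(candidate, canon, sizeof canon) &&
           strcmp(canon, indicator) == 0;
}
```

With this in place, a rule written for 192.168.1.15 also matches 0xC0A8010F or 0300.0250.01.017 without listing every notation explicitly.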

Incidentally, the term "IPfuscation" has been used for a different obfuscation technique. In this case, the payload is camouflaged as an array of IPv4 addresses written in the standard dot-decimal notation. Each of these addresses (strings) is translated to binary using RtlIpv4StringToAddressA(), and the combined sequence of 32-bit blocks forms a blob of shellcode.
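From an analyst's point of view, recovering the hidden bytes amounts to parsing each string and concatenating the results. The sketch below shows the idea with a harmless payload. It uses inet_pton() as a portable stand-in for RtlIpv4StringToAddressA(), which is my substitution and not what the malware calls; decode_ip_array() is likewise a hypothetical name.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Decode `count` dot-decimal strings into a byte buffer, four bytes per
 * string. Returns the number of bytes written, or 0 if a string fails
 * to parse or the buffer is too small.
 */
size_t decode_ip_array(const char *const *ips, size_t count,
                       unsigned char *out, size_t out_size)
{
    size_t written = 0;

    assert(ips != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        struct in_addr addr;

        if (out_size - written < sizeof addr.s_addr)
            return 0;
        if (inet_pton(AF_INET, ips[i], &addr) != 1)
            return 0;
        /* s_addr is in network byte order, so the bytes come out in the
         * same order as the octets appear in the string. */
        memcpy(out + written, &addr.s_addr, sizeof addr.s_addr);
        written += sizeof addr.s_addr;
    }
    return written;
}
```

For instance, the three strings "72.101.108.108", "111.44.32.119", and "111.114.108.100" decode to the twelve ASCII bytes of "Hello, world".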



© 2026 Juan Tapiador