Welcome to 3OHA, a place for random notes, thoughts, and factoids that I want to share or remember
2 April 2026
IPv4 addresses are 32-bit integers. They are commonly represented in human-readable notation for textual contexts, such as when writing them in a configuration file or providing them as variable inputs. The most common of these representations is dot-decimal notation, consisting of four decimal numbers, each ranging from 0 to 255, separated by dots (for example, 192.168.1.15).
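The relationship between the 32-bit integer and its dot-decimal form can be sketched in a few lines of C. This is only an illustration: the helper names pack_octets() and format_dot_decimal() are hypothetical and do not come from any library.

```c
#include <stdint.h>
#include <stdio.h>

/* Pack four octets into a single 32-bit value (host byte order). */
static uint32_t pack_octets(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8) | (uint32_t)d;
}

/* Write addr in dot-decimal notation; returns the length snprintf() reports. */
static int format_dot_decimal(uint32_t addr, char *buf, size_t len)
{
    return snprintf(buf, len, "%u.%u.%u.%u",
                    (unsigned)((addr >> 24) & 0xff),
                    (unsigned)((addr >> 16) & 0xff),
                    (unsigned)((addr >> 8) & 0xff),
                    (unsigned)(addr & 0xff));
}
```

With these helpers, pack_octets(192, 168, 1, 15) yields 0xC0A8010F (3232235791 in decimal), and format_dot_decimal() turns that value back into 192.168.1.15.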
The implementation of IP networking in 4.2BSD introduced a function named inet_aton(), which parses a character string as an IP address. The problem its programmers faced was that the textual representations of IP addresses had historically been loosely defined, and the standard dotted-octet format was never formally specified in an RFC. Consequently, inet_aton() supports the various IP address representations mentioned in several RFCs of the time. For example, the Mail Transfer Protocol (MTP) specified in RFC 780 allows both a single number representing the entire 32-bit address and dot-separated octet values. The inet_aton() function also admits two intermediate syntaxes, a.b.c and a.b, in which the last part fills the remaining 16 or 24 bits, and it reads each part as decimal, octal (leading 0), or hexadecimal (leading 0x):
/*
 * Check whether "cp" is a valid ascii representation
 * of an Internet address and convert to a binary address.
 * Returns 1 if the address is valid, 0 if not.
 * This replaces inet_addr, the return value from which
 * cannot distinguish between failure and a local broadcast address.
 */
int
inet_aton(const char *cp, struct in_addr *addr)
{
    in_addr_t val;
    int base, n;
    char c;
    u_int parts[4];
    u_int *pp = parts;
    c = *cp;
    for (;;) {
        /*
         * Collect number up to ``.''.
         * Values are specified as for C:
         * 0x=hex, 0=octal, isdigit=decimal.
         */
        if (!isdigit(c))
            return (0);
        val = 0; base = 10;
        if (c == '0') {
            c = *++cp;
            if (c == 'x' || c == 'X')
                base = 16, c = *++cp;
            else
                base = 8;
        }
        for (;;) {
            if (isascii(c) && isdigit(c)) {
                val = (val * base) + (c - '0');
                c = *++cp;
            } else if (base == 16 && isascii(c) && isxdigit(c)) {
                val = (val << 4) |
                    (c + 10 - (islower(c) ? 'a' : 'A'));
                c = *++cp;
            } else
                break;
        }
        if (c == '.') {
            /*
             * Internet format:
             * a.b.c.d
             * a.b.c (with c treated as 16 bits)
             * a.b (with b treated as 24 bits)
             */
            if (pp >= parts + 3)
                return (0);
            *pp++ = val;
            c = *++cp;
        } else
            break;
    }
    /*
     * Check for trailing characters.
     */
    if (c != '\0' && (!isascii(c) || !isspace(c)))
        return (0);
    /*
     * Concoct the address according to
     * the number of parts specified.
     */
    n = pp - parts + 1;
    switch (n) {
    case 0:
        return (0); /* initial nondigit */
    case 1: /* a -- 32 bits */
        break;
    case 2: /* a.b -- 8.24 bits */
        if ((val > 0xffffff) || (parts[0] > 0xff))
            return (0);
        val |= parts[0] << 24;
        break;
    case 3: /* a.b.c -- 8.8.16 bits */
        if ((val > 0xffff) || (parts[0] > 0xff) || (parts[1] > 0xff))
            return (0);
        val |= (parts[0] << 24) | (parts[1] << 16);
        break;
    case 4: /* a.b.c.d -- 8.8.8.8 bits */
        if ((val > 0xff) || (parts[0] > 0xff) || (parts[1] > 0xff) || (parts[2] > 0xff))
            return (0);
        val |= (parts[0] << 24) | (parts[1] << 16) | (parts[2] << 8);
        break;
    }
    if (addr)
        addr->s_addr = htonl(val);
    return (1);
}
The equivalent function in glibc 2.43 is __inet_network(), which inherits the same features from the original inet_aton():
/*
 * Internet network address interpretation routine.
 * The library routines call this routine to interpret
 * network numbers.
 */
uint32_t
__inet_network (const char *cp)
{
    uint32_t val, base, n, i;
    char c;
    uint32_t parts[4], *pp = parts;
    int digit;
again:
    val = 0; base = 10; digit = 0;
    if (*cp == '0')
        digit = 1, base = 8, cp++;
    if (*cp == 'x' || *cp == 'X')
        digit = 0, base = 16, cp++;
    while ((c = *cp) != 0) {
        if (val > 0xff)
            return (INADDR_NONE);
        if (isdigit(c)) {
            if (base == 8 && (c == '8' || c == '9'))
                return (INADDR_NONE);
            val = (val * base) + (c - '0');
            cp++;
            digit = 1;
            continue;
        }
        if (base == 16 && isxdigit(c)) {
            val = (val << 4) + (tolower (c) + 10 - 'a');
            cp++;
            digit = 1;
            continue;
        }
        break;
    }
    if (!digit)
        return (INADDR_NONE);
    if (pp >= parts + 4 || val > 0xff)
        return (INADDR_NONE);
    if (*cp == '.') {
        *pp++ = val, cp++;
        goto again;
    }
    while (isspace(*cp))
        cp++;
    if (*cp)
        return (INADDR_NONE);
    if (pp >= parts + 4 || val > 0xff)
        return (INADDR_NONE);
    *pp++ = val;
    n = pp - parts;
    for (val = 0, i = 0; i < n; i++) {
        val <<= 8;
        val |= parts[i] & 0xff;
    }
    return (val);
}
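A short sketch of how the public inet_network() wrapper might be exercised. The helper parse_network() is a hypothetical name used only for this illustration. Reading the listing above, every part appears to be capped at 0xff, so shortened forms with a wide last part (such as 192.168.271) may be rejected here even though inet_aton() accepts them. That behaviour can differ between C libraries, so the example sticks to four-part inputs.

```c
#define _DEFAULT_SOURCE /* exposes inet_network() on glibc under strict -std= modes */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>

/* Returns 1 and stores the host-byte-order value if inet_network() accepts text. */
static int parse_network(const char *text, uint32_t *out)
{
    in_addr_t v = inet_network(text);

    if (v == INADDR_NONE) /* note: also the value of 255.255.255.255 */
        return 0;
    *out = (uint32_t)v;
    return 1;
}
```

Unlike inet_aton(), inet_network() returns its result in host byte order, so parse_network("0xC0.0xA8.0x1.0xF", &v) leaves 0xC0A8010F in v without any call to ntohl().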
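Thus, the same IPv4 address can be written in many equivalent ways: as four, three, or two dot-separated parts or as a single number, with each part given in decimal, octal, or hexadecimal.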
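One way to check such equivalences is to feed each spelling to the system's inet_aton() and compare the results. The sketch below assumes a libc whose inet_aton() follows the BSD behaviour shown earlier. The helper names are hypothetical, and the sample strings are illustrative spellings of 192.168.1.15 worked out for this example rather than a list taken from the text.

```c
#define _DEFAULT_SOURCE /* exposes inet_aton() on glibc under strict -std= modes */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>

/* Parse text with inet_aton(); on success store the address in host byte order. */
static int parse_ipv4(const char *text, uint32_t *out)
{
    struct in_addr a;

    if (inet_aton(text, &a) == 0)
        return 0;
    *out = ntohl(a.s_addr);
    return 1;
}

/* Illustrative spellings of 192.168.1.15 (0xC0A8010F). */
static const char *const spellings[] = {
    "192.168.1.15",        /* four decimal parts */
    "0xC0.0xA8.0x01.0x0F", /* four hexadecimal parts */
    "0300.0250.01.017",    /* four octal parts */
    "192.168.271",         /* a.b.c, c is 16 bits: 1*256 + 15 */
    "192.11010319",        /* a.b, b is 24 bits: 168*65536 + 271 */
    "3232235791",          /* single 32-bit decimal number */
    "0xC0A8010F",          /* single 32-bit hexadecimal number */
    "030052000417",        /* single 32-bit octal number */
};

/* Returns 1 if every spelling parses to the expected address, 0 otherwise. */
static int all_spellings_match(uint32_t expected)
{
    for (size_t i = 0; i < sizeof spellings / sizeof spellings[0]; i++) {
        uint32_t v;
        if (!parse_ipv4(spellings[i], &v) || v != expected)
            return 0;
    }
    return 1;
}
```

On a libc with BSD-style parsing, all_spellings_match(0xC0A8010F) returns 1, which shows that every string in the table denotes the same address.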
Because the 4.2BSD inet_aton() was widely copied, it became the de facto standard for the textual representation of IPv4 addresses, and applications whose string-to-address functions derive from it typically accept all of these representations interchangeably. In the cases I have tested, the same applies to programs using getaddrinfo() and inet_addr(), such as curl and nc, as well as bind().
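To try this with getaddrinfo() directly, a sketch along the following lines may help. The helper resolve_numeric() is a hypothetical name. The AI_NUMERICHOST flag keeps the call from performing any DNS lookup, so only numeric parsing is exercised. Whether non-decimal spellings are accepted depends on the C library in use, which is why the helper reports the outcome instead of assuming it.

```c
#define _DEFAULT_SOURCE /* exposes getaddrinfo() on glibc under strict -std= modes */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <string.h>

/*
 * Resolve a numeric host string through getaddrinfo() without DNS.
 * On success, writes the dot-decimal form into out and returns 0;
 * otherwise returns a nonzero error code.
 */
static int resolve_numeric(const char *host, char *out, size_t outlen)
{
    struct addrinfo hints, *res = NULL;
    const struct sockaddr_in *sin;
    const char *p;
    int rc;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;       /* IPv4 only */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST; /* parse only, never query DNS */

    rc = getaddrinfo(host, NULL, &hints, &res);
    if (rc != 0)
        return rc;

    sin = (const struct sockaddr_in *)res->ai_addr;
    p = inet_ntop(AF_INET, &sin->sin_addr, out, (socklen_t)outlen);
    freeaddrinfo(res);
    return p != NULL ? 0 : -1;
}
```

Calling resolve_numeric("0xC0A8010F", buf, sizeof buf) and printing buf is a quick way to see whether a given platform treats the hexadecimal form as 192.168.1.15.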
Bash has no native socket API. When using Bash's built-in pseudo-device for networking (/dev/tcp/<host>/<port>), it passes the host string to getaddrinfo(), so all numeric representations should work correctly.
The inet_pton() function (the "p" stands for "presentation") was introduced in 1997 as part of RFC 2133, which extended the socket interface for IPv6. Unlike inet_aton(), inet_pton() follows a stricter validation procedure and supports only the four-decimal variant of IP addresses.
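The difference in strictness can be observed by passing the same strings to both functions. This is a minimal sketch with hypothetical helper names; it assumes a POSIX system that provides both inet_pton() and a BSD-style inet_aton().

```c
#define _DEFAULT_SOURCE /* exposes inet_aton() on glibc under strict -std= modes */
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Returns 1 if inet_pton() accepts text as an IPv4 address, 0 otherwise. */
static int accepted_by_pton(const char *text)
{
    struct in_addr a;

    return inet_pton(AF_INET, text, &a) == 1;
}

/* Returns 1 if inet_aton() accepts text as an IPv4 address, 0 otherwise. */
static int accepted_by_aton(const char *text)
{
    struct in_addr a;

    return inet_aton(text, &a) != 0;
}
```

Both helpers accept 192.168.1.15, but accepted_by_pton() rejects 0xC0A8010F and 192.168.271, whereas accepted_by_aton() accepts them on a libc with the BSD behaviour.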
Several malware families have used IPv4 addresses written in notations other than dot-decimal. This practice seeks to evade static analysis tools that do not recognize all available representations. One example is a 2023 Shellbot campaign that used hexadecimal representations of IPv4 addresses. Some reports refer to this practice as a form of obfuscation specific to IPv4 addresses, but in reality it simply exploits incomplete pattern-matching rules that are unable to recognize IPv4 addresses written in unusual, yet valid, notations.
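A matching rule can sidestep this gap by normalizing every candidate string before comparing it. The sketch below is one possible approach, not a description of any particular tool: it parses with inet_aton() and prints the result back in dot-decimal form with inet_ntop(). The names canonical_ipv4() and matches_indicator() are hypothetical.

```c
#define _DEFAULT_SOURCE /* exposes inet_aton() on glibc under strict -std= modes */
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/* Rewrite any spelling inet_aton() accepts as dot-decimal; returns 1 on success. */
static int canonical_ipv4(const char *text, char *out, size_t outlen)
{
    struct in_addr a;

    if (inet_aton(text, &a) == 0)
        return 0;
    return inet_ntop(AF_INET, &a, out, (socklen_t)outlen) != NULL;
}

/* Returns 1 if candidate denotes the same address as the dot-decimal indicator. */
static int matches_indicator(const char *candidate, const char *indicator)
{
    char canon[INET_ADDRSTRLEN];

    if (!canonical_ipv4(candidate, canon, sizeof canon))
        return 0;
    return strcmp(canon, indicator) == 0;
}
```

With this in place, matches_indicator("0xC0A8010F", "192.168.1.15") succeeds even though the two strings share no textual pattern.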
Incidentally, the term "IPfuscation" has been used for a different obfuscation technique. In this case, the payload is camouflaged as an array of IPv4 addresses written in the standard dot-decimal notation. Each of these addresses (strings) is translated to binary using RtlIpv4StringToAddressA(), and the combined sequence of 32-bit blocks forms a blob of shellcode.
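For analysts who want to see what such an array decodes to, the transformation can be reproduced with harmless data. In this sketch, inet_pton() stands in for RtlIpv4StringToAddressA(), which is an assumption made for portability. The helper decode_ipv4_array() is a hypothetical name, and the sample octets are simply the ASCII codes of a text string.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stddef.h>
#include <string.h>

/*
 * Decode n dot-decimal strings into 4*n raw bytes.
 * Returns the number of bytes written, or 0 if a string fails to parse
 * or the output buffer is too small.
 */
static size_t decode_ipv4_array(const char *const *addrs, size_t n,
                                unsigned char *out, size_t outlen)
{
    size_t written = 0;

    for (size_t i = 0; i < n; i++) {
        struct in_addr a;

        if (outlen - written < sizeof a.s_addr)
            return 0;
        if (inet_pton(AF_INET, addrs[i], &a) != 1)
            return 0;
        /* s_addr is in network order, so its bytes follow the octet order. */
        memcpy(out + written, &a.s_addr, sizeof a.s_addr);
        written += sizeof a.s_addr;
    }
    return written;
}

/* Harmless sample: the octets are the ASCII codes of "Hello, world". */
static const char *const sample[] = {
    "72.101.108.108",  /* H e l l */
    "111.44.32.119",   /* o , space w */
    "111.114.108.100", /* o r l d */
};
```

Decoding the three sample strings yields the 12 bytes of "Hello, world", which illustrates why a list of ordinary-looking addresses can carry arbitrary binary content.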