How IP Addresses Work

Table Of Contents

How IP Addresses Work
Resources

Introduction

Welcome to Part 4 of the Network Fundamentals study notes! If you haven’t already, we recommend watching the video first.

In Part 2, we introduced IP addresses as a way for devices to find each other across networks. Now we’ll dig into how they actually work — what the numbers mean, how the address space has evolved over time, and how subnet masks let us carve it up efficiently.

IPv4 Address Structure

IP addresses come in two flavours: IPv4 and IPv6. IPv6 is newer, but IPv4 is still the most common — so that’s what we’ll focus on here.

An IPv4 address looks like this: 172.16.0.1. It’s four numbers separated by dots. Each number is called an octet, because each one is an 8-bit value. A useful memory trick: an octopus has eight tentacles, and an octet has eight bits.

Because each octet is 8 bits, each number can range from 0 to 255. That means the lowest possible address is 0.0.0.0 and the highest is 255.255.255.255. With 32 bits in total, that is 2^32, or roughly 4.3 billion, addresses. The full range of all possible addresses is called the IP space.
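To make the octet idea concrete, here is a minimal Python sketch that writes an address out in binary and packs it into a single 32-bit number. The function names to_binary_octets and to_integer are made up for these notes and are not part of any library.

```python
def to_binary_octets(address: str) -> str:
    """Return the address with each octet written as 8 binary digits."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not a valid IPv4 address: {address!r}")
    return ".".join(format(octet, "08b") for octet in octets)


def to_integer(address: str) -> int:
    """Pack the four octets into one 32-bit number, first octet highest."""
    value = 0
    for part in address.split("."):
        value = (value << 8) | int(part)
    return value


print(to_binary_octets("172.16.0.1"))  # 10101100.00010000.00000000.00000001
print(to_integer("0.0.0.0"))           # 0
print(to_integer("255.255.255.255"))   # 4294967295
```

The last value is 2^32 - 1, which is why the IP space holds 2^32 addresses in total.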

Understanding binary is important for working with IP addresses. If you need a refresher, check out the binary video linked in the YouTube description.

Two Addresses in One

An IP address is actually two addresses in one — it tells you both the address of the device, and the address of the network that device is in.

Take 172.16.0.1 as an example. The first part (172.16) identifies the network. The second part (0.1) identifies the specific host within that network. All devices whose addresses start with 172.16 are on the same network. A device whose address starts with 172.17 is on a different network, and traffic between the two networks has to pass through a router.
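As a rough sketch of that split, the Python below cuts an address into its network and host parts. It assumes, as in the example above, that the first two octets are the network. The helper names split_address and same_network are hypothetical.

```python
def split_address(address: str, network_octets: int = 2) -> tuple[str, str]:
    """Split a dotted address into (network part, host part).

    Assumes the network part is a whole number of octets, two by default.
    """
    octets = address.split(".")
    network = ".".join(octets[:network_octets])
    host = ".".join(octets[network_octets:])
    return network, host


def same_network(first: str, second: str, network_octets: int = 2) -> bool:
    """True when both addresses share the same network part."""
    return (split_address(first, network_octets)[0]
            == split_address(second, network_octets)[0])


print(split_address("172.16.0.1"))               # ('172.16', '0.1')
print(same_network("172.16.0.1", "172.16.9.4"))  # True
print(same_network("172.16.0.1", "172.17.0.1"))  # False, so a router is needed
```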

But how do we know where the network part ends and the host part begins? That’s changed over time, so let’s look at the history.

Classful Addressing

When IP was first created, the first octet always represented the network and the remaining three were for hosts. With 8 bits for the network, this allowed for up to 256 networks with over 16 million hosts each. That seemed fine at first, because the internet was small and used by only a handful of organisations. But as the internet grew, 256 networks clearly wasn’t going to be enough.

In 1981, a new method was introduced that divided the IP space into five classes, Class A through E. Classes A, B, and C are used for assigning addresses to devices. Class D is reserved for multicast, and Class E is reserved for experimental and future use.

Class A

Class A uses the first octet for the network and the remaining three for hosts, similar to the original method. The catch is that the first bit of the address is always 0, leaving 7 bits for the network. That gives us 128 Class A networks, each supporting over 16 million hosts. Networks starting with 0 and 127 are reserved, so the 126 usable Class A networks run from 1.0.0.0 to 126.0.0.0.

Class B

Class B uses the first two octets for the network and the last two for hosts. The first two bits are always 1 0, leaving 14 bits for networks. That gives us 16,384 Class B networks, each supporting over 65,000 hosts. The Class B space runs from 128.0.0.0 to 191.255.0.0.

Class C

Class C uses the first three octets for the network and only the last octet for hosts. The first three bits are always 1 1 0, leaving 21 bits for networks, which gives us over 2 million Class C networks. But with only one octet left for hosts, each network has just 256 addresses. Two of those (the first, which names the network itself, and the last, which is the broadcast address) can’t be assigned to devices, leaving 254 for hosts. The Class C space runs from 192.0.0.0 to 223.255.255.0.
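The network and address counts quoted for Classes A, B, and C all come from the same arithmetic: count the free network bits and the host bits, then raise 2 to that power. Here is a small Python sketch of that calculation. The table CLASS_LAYOUT and the function class_sizes are names invented for these notes.

```python
# For each class: how many leading bits are fixed, and how many octets
# make up the network part.
CLASS_LAYOUT = {
    "A": {"fixed_bits": 1, "network_octets": 1},
    "B": {"fixed_bits": 2, "network_octets": 2},
    "C": {"fixed_bits": 3, "network_octets": 3},
}


def class_sizes(name: str) -> tuple[int, int]:
    """Return (number of networks, addresses per network) for a class."""
    layout = CLASS_LAYOUT[name]
    network_bits = 8 * layout["network_octets"] - layout["fixed_bits"]
    host_bits = 32 - 8 * layout["network_octets"]
    return 2 ** network_bits, 2 ** host_bits


for name in CLASS_LAYOUT:
    networks, addresses = class_sizes(name)
    print(f"Class {name}: {networks:,} networks, {addresses:,} addresses each")
# Class A: 128 networks, 16,777,216 addresses each
# Class B: 16,384 networks, 65,536 addresses each
# Class C: 2,097,152 networks, 256 addresses each
```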

You can identify a class by looking at the first few bits of an address. A device can do this automatically — if it sees the bits “10” at the start, it knows it’s a Class B address and can assume the first two octets are the network.
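Here is one way that check could look in Python. It is only a sketch: address_class is a hypothetical helper, and it reads the leading bits by shifting the first octet to the right until only those bits remain.

```python
def address_class(address: str) -> str:
    """Work out the class of an IPv4 address from its leading bits."""
    first_octet = int(address.split(".")[0])
    if first_octet >> 7 == 0b0:      # leading bit   0
        return "A"
    if first_octet >> 6 == 0b10:     # leading bits  10
        return "B"
    if first_octet >> 5 == 0b110:    # leading bits  110
        return "C"
    if first_octet >> 4 == 0b1110:   # leading bits  1110 (multicast)
        return "D"
    return "E"                       # leading bits  1111


# Network octets a device would assume for each assignable class.
DEFAULT_NETWORK_OCTETS = {"A": 1, "B": 2, "C": 3}

for example in ("10.0.0.1", "172.16.0.1", "192.168.1.1", "224.0.0.1"):
    print(example, "is Class", address_class(example))
# 10.0.0.1 is Class A
# 172.16.0.1 is Class B
# 192.168.1.1 is Class C
# 224.0.0.1 is Class D
```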

CIDR and Subnet Masks

As the internet continued to grow, even classful addressing couldn’t keep up — we were running out of IP addresses again. In 1993, a new system was introduced: Classless Inter-Domain Routing, or CIDR (pronounced “cider”).

With CIDR, we throw away the idea of fixed classes and instead use a subnet mask to define where the network ends and the host begins. The subnet mask is also four octets, and it lines up with the IP address bit-for-bit. Bits set to 1 in the mask indicate the network portion. Bits set to 0 indicate the host portion. All the 1s are always grouped on the left, and all the 0s on the right.

For example, a Class B address would have a subnet mask of 255.255.0.0 — 16 ones for the network, 16 zeros for hosts. A Class C would use 255.255.255.0 — 24 ones and 8 zeros.
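Lining the mask up with the address bit-for-bit is a bitwise AND. The sketch below uses Python’s standard ipaddress module to turn the dotted text into numbers. The function split_with_mask is a made-up name for illustration, and it does not check that the 1s in the mask are contiguous.

```python
import ipaddress


def split_with_mask(address: str, mask: str) -> tuple[str, int]:
    """Return (network address, host number) for an address and mask."""
    address_bits = int(ipaddress.IPv4Address(address))
    mask_bits = int(ipaddress.IPv4Address(mask))
    # Keep only the bits where the mask is 1: the network portion.
    network_bits = address_bits & mask_bits
    # Keep only the bits where the mask is 0: the host portion.
    host_bits = address_bits & ~mask_bits & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(network_bits)), host_bits


print(split_with_mask("172.16.0.1", "255.255.0.0"))    # ('172.16.0.0', 1)
print(split_with_mask("172.16.5.9", "255.255.0.0"))    # ('172.16.0.0', 1289)
print(split_with_mask("172.16.5.9", "255.255.255.0"))  # ('172.16.5.0', 9)
```

The same address lands in a different network depending on the mask, which is exactly the flexibility that fixed classes lacked.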

Subnetting

The real power of CIDR is subnetting — the ability to break a large network into smaller, more useful pieces.

Imagine you have the Class B network 172.16.0.0, which with a subnet mask of 255.255.0.0 gives you roughly 65,000 hosts. That’s fine for one giant office, but what if you have several smaller offices? You don’t want to allocate 65,000 IPs to each.

Instead, you can change the subnet mask to 255.255.255.0. Now you’ve borrowed an extra 8 bits for the network portion, splitting that original network into 256 subnets, each with 256 addresses (254 usable by hosts, since the first and last address in each subnet are reserved). Traffic between subnets still needs a router, just like traffic between separate networks.
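You can try this split with Python’s standard ipaddress module. This is a small sketch rather than a full planning tool, and the variable names are just for illustration. The prefixlen_diff=8 argument corresponds to the 8 borrowed bits.

```python
import ipaddress

# The original network, written as address/mask.
office_block = ipaddress.ip_network("172.16.0.0/255.255.0.0")

# Borrow 8 more bits for the network portion.
subnets = list(office_block.subnets(prefixlen_diff=8))

print(len(subnets))                # 256
print(subnets[0].network_address)  # 172.16.0.0
print(subnets[1].network_address)  # 172.16.1.0
print(subnets[255].network_address)  # 172.16.255.0
print(subnets[0].netmask)          # 255.255.255.0
print(subnets[0].num_addresses)    # 256
```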

CIDR Notation

Writing out a full subnet mask every time is cumbersome. Instead, we use CIDR notation — a slash followed by the number of 1-bits in the mask. For example, 172.16.1.0/24 means the first 24 bits are the network. This is equivalent to the subnet mask 255.255.255.0. You’ll see this notation used constantly in networking, so it’s worth getting comfortable with it.
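Converting between the two forms is a matter of counting bits. Here is a minimal Python sketch with two hypothetical helpers, prefix_to_mask and mask_to_prefix, written out by hand so the bit arithmetic is visible.

```python
def prefix_to_mask(prefix_length: int) -> str:
    """Turn a CIDR prefix length such as 24 into a dotted subnet mask."""
    if not 0 <= prefix_length <= 32:
        raise ValueError("prefix length must be between 0 and 32")
    # Start with 32 ones, push zeros in from the right, keep 32 bits.
    mask = (0xFFFFFFFF << (32 - prefix_length)) & 0xFFFFFFFF
    return ".".join(str((mask >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def mask_to_prefix(mask: str) -> int:
    """Count the 1-bits in a dotted subnet mask."""
    bits = "".join(format(int(octet), "08b") for octet in mask.split("."))
    if "01" in bits:
        raise ValueError("the 1s in a mask must all be grouped on the left")
    return bits.count("1")


print(prefix_to_mask(24))               # 255.255.255.0
print(prefix_to_mask(16))               # 255.255.0.0
print(mask_to_prefix("255.255.255.0"))  # 24
```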

Does Classful Still Matter?

Classful addressing has been replaced by CIDR in modern networks — but it hasn’t disappeared entirely. You may still encounter classful concepts in exam questions. You’ll also see echoes of it in everyday tools: when you type an IP address into Windows, it may automatically suggest a subnet mask based on the class of the address you entered.

Many people also think of subnetting as starting with a classful network and breaking it down — and while that’s not always strictly true, it’s a useful mental model to have. The reverse process — joining smaller networks into a larger one — is called supernetting.
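As a quick illustration of supernetting, the sketch below uses Python’s standard ipaddress module to join two neighbouring /24 networks into a single /23. The example networks are chosen for these notes. Only blocks that sit next to each other and line up on the right boundary can be merged this way.

```python
import ipaddress

office_a = ipaddress.ip_network("172.16.0.0/24")
office_b = ipaddress.ip_network("172.16.1.0/24")

# collapse_addresses merges adjacent networks where it can.
merged = list(ipaddress.collapse_addresses([office_a, office_b]))

print(merged[0])                       # 172.16.0.0/23
print(merged[0].netmask)               # 255.255.254.0
print(office_a.subnet_of(merged[0]))   # True
print(office_b.subnet_of(merged[0]))   # True
```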

Resources

Test your knowledge with the Introduction to Networking quizzes.