It’s really all about memory. But to start at the beginning, the rough stack looks like this:
- Userspace application
- Kernel driver
- Hardware device
I find it easier to think about this from the middle out. On Linux, the kernel exposes hardware devices as files backed by the
/dev virtual filesystem. Userspace can do normal syscalls like
mmap on them, as well as the less typical
ioctl (for more arbitrary, device-specific functionality).1.
The files are created by kernel drivers which are modules of kernel code whose sole purpose is to interface with and abstract hardware so it can be used by other parts of the operating system, or userspace. They are implemented implemented using internal driver “frameworks” in the kernel, e.g. the I2C or SPI frameworks. When you interface with a file in
/dev, you are directly triggering callback handlers in a driver which execute in the process context.
That’s how userspace interfaces with the kernel. How do drivers interface with hardware? These days, mostly via memory mapped I/O (MMIO)2. This is when device hardware “appears” at certain physical addresses, and can be interfaced with via load and store instructions using an “API” that the device defines. For example, you can read data from a sensor by simply reading a physical address, or write data out to a device by writing to an address. The technical term for the hardware component these reads/writes interface with is “registers” (i.e. memory mapped registers).
(Aside: Other than MMIO, the other main interface the kernel has with hardware is interrupts, for interrupt driven I/O processing (as opposed to polling, which is what MMIO enables). I’m not very knowledgeable about this, so I won’t get into it other than to say drivers can register handlers for specific IRQ (interrupt requests) numbers, which will be invoked by the kernel’s generic interrupt handling infrastructure.)
Using MMIOs looks a lot like embedded bare metal programming you might do on a microcontroller like a PIC. At the lowest level, a kernel driver is really just embedded bare metal programming.
Here’s an example of a device driver for UART (serial port) hardware for ARM platforms: linux/drivers/tty/serial/amba-pl011.c. If you’re debugging an ARM Linux system via a serial connection, this is might be the driver being used to e.g. show the boot messages.
The lines like:
cr = readb(uap->port.membase + UART010_CR);
are where the real magic happens.
This is simply doing a read from a memory address derived from some base address for the device, plus some offset of the specific register in question. In this case it’s reading some control information from a Control Register.
#define UART010_CR 0x14 /* Control register. */
Device interfaces may range from having just a few to many registers.
To go one step deeper down the rabbit hole, how do devices “end up” at certain physical addresses? How is this physical memory map interface implemented?3
The device/physical address mapping is implemented in digital logic outside the CPU, either on the System on Chip (SOC) (for embedded systems), or on the motherboard (PCs)4. The CPU’s physical interface include the address, data, and control buses. Digital logic converts bits of the address bus into signals that mutually exclusively enable devices that are physically connected to the bus. The implementations of load/store instructions in the CPU set a read/write bit appropriately in the Control bus, which lets devices know whether a read or write is happening. The data bus is where data is either transferred out from or into the CPU.
In practice, documentation for real implementations of these systems can be hard to find, unless you’re a customer of the SoC manufacturer. But there are some out there for older chips, e.g.
Here’s a block diagram for the Tegra 2 SoC architecture, which shipped in products like the Motorola Atrix 4G, Motorola Droid X2, and Motorola Photon. Obviously it’s much more complex than my description above. Other than the two CPU cores in the top left, and the data bus towards the middle, I can’t make sense of it. (link)
While not strictly a “System on Chip”, a class PIC microcontroller has many shared characteristics of a SoC (CPU, memory, peripherals, all in one chip package), but is much more approachable.
We can see the single MIPS core connected to a variety of peripheral devices on the peripheral bus. There’s even layers of peripheral bussing, with a “Peripheral Bridge” connected to a second peripheral bus for things like I2C and SPI.
- ioctl is kind of like a meta-syscall that multiplexes device specific operations via an additional set of ioctl numbers. These ioctl numbers and other driver specific constants or structures are declared in header files that get exposed as the kernel’s userspace API. These live somewhere in
/usr/include/linux/and you’ll see them accessed via
- The legacy alternative is port mapped I/O, which is increasingly being excluded from architectures (ARM doesn’t have it while x86(64) does). It’s conceptually similar, except it introduces a separate “address space” for I/O devices and also uses specific I/O instructions to access it (e.g. IN/OUT on x86), as opposed to standard load/store.
- Note: What follows is my rough, academic understanding, which is likely oversimplified.
- These days, the boundary between SoC and PC is blurred as laptops like the MacbookPro, which may not typically be considered an embedded device, use Apple’s custom M1/M2 System on Chips