Category Archives: Linux

Back to basics: FIFOs (Named Pipes) and simple IPC logging

FIFOS (“First-in-first-out”s aka Named Pipes) are are somewhat obscure part of Unix based operating systems — at least based on my personal experience. I’ve been using Linux for ten years and simply have never needed to interact with them much.

But today, I actually thought of a use for them! If you’ve ever had a program which had human readable output on stdout, diagnostics on stderr, and you wanted to observe the two as separate streams of text, without intermingled output — then FIFOs might be for you. More on that in a bit.

Youtube version of this blog post:

FIFOs 101

As background, FIFOs are one of the simpler IPC (Interprocess Communication) mechanisms. Like normal (un-named) pipes, they are a unidirectional channel for data transmission. Unlike normal pipes, they can be used between processes that don’t have a parent-child relationship. This happens via the filesystem — FIFOs exist as filesystem entities that can be operated on with the typical file API (open, close, read, write), also similar to normal pipes.

The simplest experiment you can try with them is to send data between two processes running cat and echo.

First, create the FIFO:

$ mkfifo my_fifo

Then, start “listening” on the FIFO. It will hang waiting to receive something.

$ cat my_fifo

In a separate terminal, “send a message” to the FIFO:

$ echo hello > my_fifo

This should return immediately, and you should see a message come in on the listening side:

$ cat my_fifo
hello1

FIFO-based logging, v1

Today I was working on a long-running program that produces a lot of human reading terminal output as it runs. There are some warning conditions in the code that I wanted to monitor, but not stop execution for.

I could print to stdout when the warning conditions occurred, but they’d get lost in the sea of terminal output that normally comes from the program. You can also imagine situations where a program is writing structured data to stdout, and you wouldn’t want a stray log text to be interspersed with the output.

The typical solution is to write the logs to stderr, which separates the two data streams, but as an exercise, we can try writing the logs to a FIFO, which can then be monitored by a separate process in another terminal pane. This lets us run our program with its normal output, while also monitoring the warning logs when they come.

In python, it would look like this (assuming the fifo is created manually):

import time

fifo = open('logging_fifo', 'w')

def log(msg):
    fifo.write(msg + '\n')
    fifo.flush()

i = 0
while True:
    print(i)
    i += 1

    if i % 5 == 0:
        log(f'reached a multiple of 5: {i}')

    time.sleep(.1)

In a separate pane, we can monitor for logs with a simple:

$ while true; do cat logging_fifo ; done

In all, it looks like this:

~/c/2/fifoplay ❯ python3 writer.py
0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
$ while true; do cat logging_fifo ; done
reached a multiple of 5: 5
reached a multiple of 5: 10
reached a multiple of 5: 15
reached a multiple of 5: 20
reached a multiple of 5: 25

Note that the listener process must be running or else the writer will block!

FIFO-based logging, v2

An even more elegant and UNIX-y way to do this would be to still write to stderr in our writer application, but use shell redirection to redirect stderr to the fifo.

import time
import sys

i = 0
while True:
    print(i)
    i += 1

    if i % 5 == 0:
        print(f'reached a multiple of 5: {i}', file=sys.stderr)

    time.sleep(.1)

Which you then invoke as:

$ python3 writer2.py 2> logging_fifo

This yields the same result, just is more semantically correct.

Syscall ABI compatibility: Linux vs Windows/macOS

The Linux kernel has an interesting difference compared to the Windows and macOS kernels: it offers syscall ABI compatibility.

This means that applications that program directly against the raw syscall interface are more or less guaranteed to always keep working, even with arbitrarily newer kernel versions. “Programming against the raw syscall interface” means including assembly code in your app that triggers syscalls:

  • setting the appropriate syscall number in the syscall register
  • setting arguments in the defined argument registers
  • executing a syscall instruction
  • reading the syscall return value register

Here are the ABIs for some common architectures.

Syscall Number RegisterSyscall ArgumentsSyscall Return Value
x86EAXEBX, ECX, EDX, ESI, EDI, EBPEAX
x86_64RAXRDI, RSI, RDX, R10, R8, R9RAX
Armv7R7R0-R6R0
AArch64X8X0-X5X0
Manticore is my go-to source to quickly look these up: https://github.com/trailofbits/manticore

Once you’ve done this, now you’re relying on the kernel to not change any part of this. If the kernel changes any of these registers, or changes the syscall number mapping, your app will not longer trigger the desired syscall correctly and will break.

Aside from writing raw assembly in your app, there’s a more innocuous way of accidentally “programming directly against the syscall interface”: statically linking to libc. When you statically link to a library, that library’s code is directly included in your binary. libc is generally the system component responsible for implementing the assembly to trigger syscalls, and by statically linking to it, you effectively inline those assembly instructions directly into your application.

So why does Linux offer this and Windows and macOS don’t?

In general, compatibility is cumbersome. As a developer, if you can avoid having to maintain compatibility, it’s better. You have more freedom to change, improve, and refactor in the future. So by default it’s preferable to not maintain compatibility — including for kernel development.

Windows and macOS are able to not offer compatibility because they control the libc for their platforms and the rules for using it. And one of their rules is “you are not allowed to statically link libc”. For the exact reason that this would encourage apps that depend directly on the syscall ABI, hindering the kernel developers’ ability to freely change the kernel’s implementation.

If all app developers are forced to dynamically link against libc, then as long as kernel developers also update libc with the corresponding changes to the syscall ABI, everything works. Old apps run on a new kernel will dynamically link against the new libc, which properly implements the new ABI. Compatibility is of course still maintained at the app/libc level — just not at the libc/kernel level.

Linux doesn’t control the libc in the same way Windows and macOS do because in the Linux world, there is a distinct separation between kernel and userspace that isn’t present in commercial operating systems. This is rooted in the history of Linux, which was originally designed to target a userspace developed by a separate organization (GNU).

So strictly speaking Linux is just the kernel, and you’re free to run whatever userspace on top. Most people run GNU userspace components (glibc), but alternatives are not unheard of (musl libc, also bionic libc on Android).

So because Linux kernel developers can’t 100% control the libc that resides on the other end of the syscall interface, they bite the bullet and retain ABI compatibility. This technically allows you to statically link with more confidence than on other OSs. That said, there are other reasons why you shouldn’t statically link libc, even on Linux.


Links:

https://news.ycombinator.com/item?id=21908824
https://www.kernel.org/doc/Documentation/ABI/README

This directory documents the interfaces that the developer has
defined to be stable.  Userspace programs are free to use these
interfaces with no restrictions, and backward compatibility for
them will be guaranteed for at least 2 years.  Most interfaces
(like syscalls) are expected to never change and always be
available.

kernel docs

What:		The kernel syscall interface
Description:
	This interface matches much of the POSIX interface and is based
	on it and other Unix based interfaces.  It will only be added to
	over time, and not have things removed from it.

	Note that this interface is different for every architecture
	that Linux supports.  Please see the architecture-specific
	documentation for details on the syscall numbers that are to be
	mapped to each syscall.

apple developer docs

Q:  I'm trying to link my binary statically, but it's failing to link because it can't find crt0.o. Why?
A: Before discussing this issue, it's important to be clear about terminology:

A static library is a library of code that can be linked into a binary that will, eventually, be dynamically linked to the system libraries and frameworks.
A statically linked binary is one that does not import system libraries and frameworks dynamically, but instead makes direct system calls into the kernel.
Apple fully supports static libraries; if you want to create one, just start with the appropriate Xcode project or target template.

Apple does not support statically linked binaries on Mac OS X. A statically linked binary assumes binary compatibility at the kernel system call interface, and we do not make any guarantees on that front. Rather, we strive to ensure binary compatibility in each dynamically linked system library and framework.

If your project absolutely must create a statically linked binary, you can get the Csu (C startup) module from Darwin and try building crt0.o for yourself. Obviously, we won't support such an endeavor.

stackoverflow

  • Solaris also stopped supporting static linking against libc.

What they don’t tell you about demand paging in school

This post details my adventures with the Linux virtual memory subsystem, and my discovery of a creative way to taunt the OOM (out of memory) killer by accumulating memory in the kernel, rather than in userspace.

Keep reading and you’ll learn:

  • Internal details of the Linux kernel’s demand paging implementation
  • How to exploit virtual memory to implement highly efficient sparse data structures
  • What page tables are and how to calculate the memory overhead incurred by them
  • A cute way to get killed by the OOM killer while appearing to consume very little memory (great for parties)

Note: Victor Michel wrote a great follow up to this post here.

Continue reading

Read bits are not enforced for memory mappings

Filmed a screencast exploring some neat mmap behavior — read bits are not enforced for memory mappings. This is because the underlying x86 page table entries have a single bit to toggle between “Read” and “Read/Write”.

Being pedantic about C++ compilation

Takeaways:

  • Don’t assume it’s safe to use pre-built dependencies when compiling C++ programs. You might want to build from source, especially if you can’t determine how a pre-built object was compiled, or if you want to use a different C++ standard than was used to compile it.
  • Ubuntu has public build logs which can help you determine if you can use a pre-built object, or if you should compile from source.
  • pkg-config is useful for generating the flags needed to compile a complex third-party dependency. CMake’s PkgConfig module can make it easy to integrate a dep into your build system.
  • Use CMake IMPORTED targets (e.g. BZip2::Bzip2) versus legacy variables (e.g. BZIP2_INCLUDE_DIRS and BZIP2_LIBRARIES).
Continue reading

struct stat notes

struct stat on Linux is pretty interesting

  • the struct definition in the man page is not exactly accurate
  • glibc explicitly pads the struct with unused members which is intersting. I guess to reserve space for expansion of fields
    • if you want to see the real definition, a trick you can use is writing a test program that uses a struct stat, and compiling with -E to stop after preprocessing then look in that output for the definition
  • you can look in the glibc sources and the linux sources and see that they actually have to make their struct definitions match! (i think). since kernel space is populating the struct memory and usespace is using it, they need to exactly agree on where what members are
    • you can find some snarky comments in linux about the padding, which is pretty funny. for example (arch/arm/include/uapi/asm/stat.h)
  • because the structs are explicitly padded, if you do a struct designator initialization, you CANNOT omit the designators. if you do, the padded members will be initialized instead of the fields you wanted!

How setjmp and longjmp work (2016)

Pretty recently I learned about setjmp() and longjmp(). They’re a neat pair of libc functions which allow you to save your program’s current execution context and resume it at an arbitrary point in the future (with some caveats1). If you’re wondering why this is particularly useful, to quote the manpage, one of their main use cases is “…for dealing with errors and interrupts encountered in a low-level subroutine of a program.” These functions can be used for more sophisticated error handling than simple error code return values.

I was curious how these functions worked, so I decided to take a look at musl libc’s implementation for x86. First, I’ll explain their interfaces and show an example usage program. Next, since this post isn’t aimed at the assembly wizard, I’ll cover some basics of x86 and Linux calling convention to provide some required background knowledge. Lastly, I’ll walk through the source, line by line.

Continue reading