Author Archives: Mark

About Mark

Ten years into my journey towards becoming a pro systems programmer, sharing what I learn along the way. Also on Twitter: @offlinemark.

If you're reading this, I'd love to meet you. Please email mark@offlinemark.com and introduce yourself!

Diminishing returns of worrying

Writing this just because I’ve never heard anyone talk about it before:

Worrying about things has increasing, diminishing, and negative returns, just like anything else.

The increasing returns are when a bit of worrying about something causes you to prepare for a situation or otherwise act differently in a way that benefits you.

But after a point, the worrying starts to saturate. You’re already aware of the potential problem, and more worrying doesn’t necessarily help you become more prepared or choose better actions.

Lastly, worrying even more can actively harm you. Maybe it causes undue stress or prompts you to make poor investments, relative to the likelihood of the event you’re worrying about.

So worry, but just enough.

How to be consistent and achieve success

I think the most important part of achieving consistency is detaching yourself from the outcome of each individual work session, whatever that might be. Here are some example ‘work sessions’ from my life:

  • A particular workout
  • Releasing a song
  • Releasing a blog post
  • Doing a stream
  • Making a youtube video

Attaching yourself to the outcome (e.g. number of views) will only set you up for failure, since inevitably one of the work sessions will ‘flop’.

To detach yourself from individual outcomes, you have to love the long-term journey of whatever you’re doing. The absolute most important part is simply being there, day after day, week after week, over a long period of time.

This can be compressed down to “Showing Up = Winning”.

If you can reframe the requirement for “winning” from “getting a lot of views” or “breaking my personal record” to simply “I showed up”, you give yourself a massive psychological advantage.


P.S. One extra tip for the creatives:

Reframe each release as another piece of progress in building your large public body of work. It may not be today, but someday, your large public body of work will make it all happen for you, and every release is a step towards that, no matter how “well” it does.

P.S. Another tip:

Establish the habit by simply doing the activity at the same time each week/day and scheduling your life around that as much as possible. Ideally find a time slot with the least contention against other things that come up.

For streaming, I found that Sunday afternoons were usually free and didn’t compete too much against other plans.

But “scheduling your life around it” is where the rubber really meets the road. That’s where you prove to yourself that this is a high priority, by putting your time where your mouth is.

Tips for going to conferences alone

Going to a conference alone can be an intimidating experience, but it’s completely doable (I’ve done it many times). Here are my tips:

Optional: Look people up ahead of time and reach out

If you can, research ahead of time who will be attending the conference and reach out online with a LinkedIn or Twitter message. This might give you a nice head start.

Be friendly, open, and seek out others in your situation

You might be surprised how many other solo attendees are at conferences or conventions. These will be the easiest people to meet as your ‘first friends’ — don’t be afraid to approach and say hello!

Set a goal: Don’t eat dinner alone

If a conference doesn’t include dinner, set an explicit goal for yourself to not have dinner alone.

Actively try to meet people throughout the day, specifically seeking out other solo attendees who might want to get dinner later.

Exchange contact info with people you enjoyed meeting, and float the idea of possibly getting dinner if they don’t already have plans.

Detach politely from uninteresting people

Don’t spend excessively long around people you don’t connect with.

After meeting someone, if you don’t find them very interesting and would prefer to keep mingling, it’s completely acceptable to do so. You can say something like “Well it was great to meet you — I think I’d like to mingle around a bit more. Have a great conference.”

Just try to make one new friend

Don’t set the bar too high for what would make it a successful event for you. For me, if I make even one solid new friend or connection, I consider it a win.

Just try to have one takeaway from talks

This is unrelated to going solo, but like the above tip, I set the bar pretty low for what I aim to get out of talks. If I get even one solid insight, thought, or takeaway, I consider it a win. You’d be surprised how hard it is to get one solid takeaway from some talks.

Volunteer

Volunteering can be a great way to automatically meet people (organizers, other volunteers) and get in contact with well known people in the community.

Make it easy for others to strike up a conversation

You can do yourself a favor by wearing slightly more interesting clothing or accessories than you typically might. For example, for me it might be wearing a shirt for my favorite band. Or maybe something topical for the conference/convention. The goal is to give people something easy to comment on, which you can then talk about to help get a conversation going, or keep one going if you run out of things to talk about.

How to do custom commands/targets in CMake

To run custom build stages in CMake, you’ll often want what I call a “custom command/target pair”:

set(CLEAN_FS_IMG ${CMAKE_CURRENT_BINARY_DIR}/clean-fs.img)
add_custom_target(CleanFsImage DEPENDS ${CLEAN_FS_IMG})
add_custom_command(
    OUTPUT ${CLEAN_FS_IMG}
    COMMAND ${CMAKE_CURRENT_BINARY_DIR}/fsformat ${CLEAN_FS_IMG} 1024 ${FS_IMG_FILES}
    WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}
    DEPENDS fsformat ${FS_IMG_FILES}
    VERBATIM
)

This is an example from my CMake-rewrite of the JOS OS build system. It generates a filesystem image, using a custom host tool (fsformat). The fs image includes a number of files generated by the build process. If any of those files changes, we want to rebuild the fs image.

At the core, we want to have a CMake target, just like an executable or library, that runs an arbitrary command line instead of a compiler.

Surprisingly, we need this “command/target pair” pattern to accomplish this.

add_custom_target() alone is not correct. A custom target is always considered out of date. If we put the command there, we will regenerate the fs image on every build, even if nothing has changed. This makes custom targets only suitable for helper commands that must always run (e.g. perhaps a helper target to run QEMU).

add_custom_command() does implement efficient rebuilding, only when an input has changed, but is also not sufficient alone, because it does not produce a named target.

This is admittedly a matter of personal taste: I prefer having named targets available because they let me manually trigger just this target from a build command. That would not otherwise be possible with a custom command alone.

If you don’t have this requirement, just a custom command could be fine, since you can depend on the output file path elsewhere in your build.

The combination of both produces what I want:

  • A named target that can be independently triggered with a build command
  • A build stage that runs an arbitrary command that only rebuilds when necessary

In other words, what this pair states is:

  • Build CleanFsImage target always
  • When building it, ensure ${CLEAN_FS_IMG} is created/up to date by running whatever operation necessary (i.e. the custom command)
  • Then it’s up to the custom command to decide to run the command or not, based on if it’s necessary
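
The “only rebuilds when necessary” decision is made by the generated build tool (Make, Ninja, etc.), not by CMake at configure time. As a rough mental model only, and not how any particular tool is actually implemented, the check boils down to comparing the output's timestamp against its dependencies. The names needs_rebuild and Mtimes below are made up for this sketch, and modification times are plain integers rather than real filesystem timestamps.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical model: file name -> last-modified time (a plain counter here).
using Mtimes = std::map<std::string, long>;

// Returns true when `output` should be regenerated: it does not exist yet,
// or at least one dependency is newer than it.
bool needs_rebuild(const Mtimes &mtimes, const std::string &output,
                   const std::vector<std::string> &deps) {
    auto out = mtimes.find(output);
    if (out == mtimes.end()) {
        return true; // output was never built
    }
    for (const auto &dep : deps) {
        auto d = mtimes.find(dep);
        // Simplification: a missing dependency counts as "out of date".
        // A real tool would try to build it first or report an error.
        if (d == mtimes.end() || d->second > out->second) {
            return true;
        }
    }
    return false;
}
```

In terms of the example above, clean-fs.img plays the role of `output`, while fsformat and the files in ${FS_IMG_FILES} are the `deps`.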

A gotcha

One gotcha to be aware of is with chaining these command/target pairs.

set(FS_IMG ${CMAKE_CURRENT_BINARY_DIR}/fs.img)
add_custom_target(FsImage ALL DEPENDS ${FS_IMG})
add_custom_command(
    OUTPUT ${FS_IMG}
    COMMAND cp ${CLEAN_FS_IMG} ${FS_IMG} 
    # We cannot depend on CleanFsImage target here, because custom targets don't create
    # a rebuild dependency (only native targets do).
    DEPENDS ${CLEAN_FS_IMG}
    VERBATIM
)

In my build, I also have a FsImage target that just copies the clean one. This is the one mounted by the OS, and it might be mutated.

This custom command cannot depend on the CleanFsImage target, but rather must depend on the ${CLEAN_FS_IMG} path directly. That’s because custom targets don’t create a rebuild dependency (unlike native targets), just a build ordering.

In practice, when I depended on the target, FsImage wasn’t being regenerated when CleanFsImage was. To properly create a rebuild dependency, you must depend on the custom command’s output path.

Pure GNU Make is (unfortunately) not enough

I’d love to simply use GNU Make for my C/C++ projects because it’s so simple to get started with. Unfortunately it’s lacking a few essential quality of life features:

  • Out of tree builds
  • Automatic header dependency detection
  • Recompile on CFLAGS change
  • First class build targets

Out of tree builds

If you want your build to happen in an isolated build directory (as opposed to creating object files in your source tree), you need to implement this yourself.

It involves a lot of juggling paths. Not fun!

Automatic header dependency detection

In C/C++, when a header file changes, you must recompile all translation units (i.e. object files, roughly) that depend on (i.e. include) that header. If you don’t, the object file will become stale and none of your changes to constants, defines, or struct definitions (for example) will be picked up.

In Make rules, you typically express dependencies between source files and object files, e.g.:

# Compile each .c file into the matching .o file (recipe lines must start with a tab)
%.o: %.c
	$(CC) $(CFLAGS) -c $< -o $@

This will recompile the object file when the source file changes, but won’t recompile when any headers included by that source file change. So it’s not good enough out of the box.

To fix this, you need to manually implement this by:

  1. Passing the proper flags to the compiler to cause it to emit header dependency information. (Something like -MMD. I don’t know them exactly because that’s my whole point =)
  2. Instructing the build to include that generated dependency info (something like -include $(OBJECTS:.o=.d))

The generated dependency info looks like this:

pmap.c.obj: \
 kern/pmap.c \
 inc/x86.h \
 inc/types.h \
 inc/mmu.h \
 inc/error.h \
 inc/string.h \
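
To make the format concrete, here is a minimal sketch of reading one such rule back into a target and its list of prerequisites. This is only an illustration: Make parses these files itself when they are included, and parse_depfile and DepRule are names invented for this example. It assumes a single rule per file and paths without spaces or colons.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

struct DepRule {
    std::string target;
    std::vector<std::string> prerequisites;
};

// Parses one rule of the form "target: dep dep \<newline> dep ...".
// Returns an empty DepRule if there is no ':' separator.
DepRule parse_depfile(const std::string &text) {
    // Treat backslash-newline continuations as plain whitespace.
    std::string flat;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '\\' && i + 1 < text.size() && text[i + 1] == '\n') {
            flat += ' ';
            ++i; // skip the newline as well
        } else {
            flat += text[i];
        }
    }

    DepRule rule;
    std::size_t colon = flat.find(':');
    if (colon == std::string::npos) {
        return rule;
    }

    std::istringstream target_in(flat.substr(0, colon));
    target_in >> rule.target;

    // Everything after the colon is a whitespace-separated list of files.
    std::istringstream deps_in(flat.substr(colon + 1));
    std::string dep;
    while (deps_in >> dep) {
        rule.prerequisites.push_back(dep);
    }
    return rule;
}
```

Each listed header becomes a prerequisite of the object file, so changing any of them makes the object file out of date.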

Recompile on CFLAGS change

In addition to recompiling if headers change, you also want to recompile if any of your compiler, linker, or other dev tool flags change.

Make doesn’t provide this out of the box, so you’ll have to implement this yourself too.

This is somewhat nontrivial. For an example, check out how the JOS OS (from MIT 6.828 (2018)) does it: https://github.com/offlinemark/jos/blob/1d95b3e576dd5f84b739fa3df773ae569fa2f018/kern/Makefrag#L48
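
One common way to get this behavior is a “stamp file”: write the current flags to a file, but only touch the file when the contents actually differ, then make every object file depend on it. The sketch below shows just the compare-and-write step, as an illustration of the general idea rather than a transcription of the linked Makefile. update_flags_stamp is a made-up name, and error handling is omitted.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Rewrites `stamp_path` only when `flags` differs from what the file already
// holds. Returns true if the file was (re)written, i.e. its timestamp changed
// and anything depending on it would be rebuilt.
bool update_flags_stamp(const std::string &stamp_path, const std::string &flags) {
    std::ifstream in(stamp_path);
    if (in) {
        std::string old((std::istreambuf_iterator<char>(in)),
                        std::istreambuf_iterator<char>());
        if (old == flags) {
            return false; // unchanged: leave the timestamp alone
        }
    }
    // Missing file or different contents: (re)write it.
    std::ofstream out(stamp_path, std::ios::trunc);
    out << flags;
    return true;
}
```

The important property is that an unchanged set of flags leaves the file's modification time untouched, so a timestamp-based build tool sees nothing new and skips the recompile.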

First class build targets

In general, it’s nice to have build targets as a first class concept. They express source files compiled by the module, and include paths to reach headers. Targets can depend on each other and seamlessly access the headers of another target (the build system makes sure all -I flags are passed correctly).

This is also something you’d have to implement yourself, and there are probably limitations to how well you can do it in pure Make.


Make definitely has its place for certain tasks (easily executing commonly used command lines), but I find it hard to justify using it for anything non-trivially sized compared to more modern alternatives like CMake, Meson, Bazel, etc.

That said, large projects like Linux use it, so somehow they must make it work!

Show and tell: Suicide C Compiler

https://github.com/offlinemark/suicide

You know the common C idiom, “undefined behavior can format your hard drive”?

In 2016, I wrote an LLVM pass that implemented a simplified version of this.

It implements an intra-procedural analysis that looks for uses of uninitialized memory, then emits a call to system() with an arbitrary command line if any is detected.

Here’s roughly how the analysis works:

  • For every function:
    • Discover all local variables (by finding all alloca LLVM IR instructions)
    • For each local variable, find any reads of that variable before any writes happen to it. To do this:
      • Do a DFS of the function’s control flow graph, visiting every basic block
      • If a basic block contains a load from the alloca, record that as a UB instance
      • If the basic block contains a store to the alloca, stop searching in the basic block, and also terminate that path of the DFS
      • If the basic block contains a function call where the alloca is passed, we can’t be sure if later loads are UB, because the function might have written to that memory. (We don’t know, because this is just an intra-procedural analysis.) Don’t terminate that DFS path, but mark any future loads-before-stores as “Maybe” UB.

That’s it. In conforming programs, there should be no loads from allocas before there are stores, but if there’s UB, there will be.

Then for every UB load, emit a call to system() at that point.

The analysis probably has many limitations, a key one being a possible mischaracterization of definite UB results as “Maybe”. This is because each basic block is only visited once. If a load-before-store was detected in a “Maybe” path first, but that basic block is also reachable via a definite UB path, the latter won’t be recognized because that block won’t be visited again.
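
The search described above can be modeled without LLVM at all. The following is a simplified sketch on a toy control flow graph, tracking a single variable. All of the types and names (Op, Block, Finding, find_uninit_loads) are invented for illustration, and unlike the real pass it carries the “Maybe” state along with each stack entry instead of using a depth counter.

```cpp
#include <cassert>
#include <cstddef>
#include <stack>
#include <utility>
#include <vector>

// Toy stand-ins for IR instructions that touch the one variable under analysis.
enum class Op { Load, Store, CallWithPtr };

struct Block {
    std::vector<Op> ops;
    std::vector<std::size_t> succs; // indices of successor blocks
};

enum class Verdict { Sure, Maybe };

struct Finding {
    std::size_t block;
    std::size_t op_index;
    Verdict verdict;
};

std::vector<Finding> find_uninit_loads(const std::vector<Block> &cfg,
                                       std::size_t entry) {
    assert(entry < cfg.size());
    std::vector<Finding> findings;
    std::vector<bool> visited(cfg.size(), false);
    // Each entry: (block index, was the pointer passed to a call on this path?)
    std::stack<std::pair<std::size_t, bool>> work;
    work.push({entry, false});

    while (!work.empty()) {
        std::size_t idx = work.top().first;
        bool escaped = work.top().second;
        work.pop();
        if (visited[idx]) {
            continue; // every block is examined at most once
        }
        visited[idx] = true;

        bool stored = false;
        for (std::size_t i = 0; i < cfg[idx].ops.size() && !stored; ++i) {
            switch (cfg[idx].ops[i]) {
            case Op::Load: // read with no earlier store on this path
                findings.push_back(
                    {idx, i, escaped ? Verdict::Maybe : Verdict::Sure});
                break;
            case Op::Store: // initialized from here on: stop this path
                stored = true;
                break;
            case Op::CallWithPtr: // callee might have written to it
                escaped = true;
                break;
            }
        }
        if (stored) {
            continue;
        }
        for (std::size_t s : cfg[idx].succs) {
            if (!visited[s]) {
                work.push({s, escaped});
            }
        }
    }
    return findings;
}
```

The visit-once limitation shows up in this model too: if a block containing a load is reachable both through a path with a call and through a path without one, the verdict depends on which path the DFS happens to explore first.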

Overall, a fun project and nice exercise to try out LLVM.

#include "llvm/IR/CFG.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/InstIterator.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/Module.h"
#include "llvm/Pass.h"
#include "llvm/Support/raw_ostream.h"

#include <stack>
#include <string>
#include <unordered_set>
#include <vector>

using namespace llvm;

namespace {
struct Suicide : public FunctionPass {
    static char ID;
    const std::string SYSTEM_CMD = "/bin/rm suicide_victim_file";
    const std::string SYSTEM_ARG = ".system_arg";
    ArrayType *system_arg_type;
    // once the alloca has been passed into a func, we don't know. record
    // how deep we dfs after that, so we can know when state is known again
    // must start at zero ("state known"); it was previously left uninitialized
    unsigned state_unknown_dfs_depth = 0;

    Suicide() : FunctionPass(ID) {
    }
    bool runOnFunction(Function &F) override;

    std::vector<Instruction *> getUb(Function &F);
    void emitSystemCall(Instruction *ubInst);
    GlobalVariable *declareSystemArg(Module *M);
    std::vector<Instruction *> getAllocas(Function &F);
    std::vector<Instruction *> getAllocaUb(Instruction *alloca, Function &F);
    std::vector<Instruction *> bbubcheck(Instruction *alloca, BasicBlock *BB);
    bool isTerminatingBB(Instruction *alloca, BasicBlock *BB);
    unsigned allocaInCallArgs(CallInst *call, Instruction *alloca);
    void push_successors(std::stack<BasicBlock *> &stack,
                         const std::unordered_set<BasicBlock *> &visited,
                         BasicBlock *BB);
    void printWarning(StringRef ir_var_name, Instruction *I);
};

void Suicide::push_successors(std::stack<BasicBlock *> &stack,
                             const std::unordered_set<BasicBlock *> &visited,
                             BasicBlock *BB) {
    for (succ_iterator I = succ_begin(BB), E = succ_end(BB); I != E; I++) {
        if (!visited.count(*I)) {
            stack.push(*I);
            if (state_unknown_dfs_depth) {
                state_unknown_dfs_depth++;
            }
        }
    }
}

template <typename T> void vec_append(std::vector<T> &a, std::vector<T> &b) {
    a.insert(a.end(), b.begin(), b.end());
}

bool Suicide::runOnFunction(Function &F) {
    Module *M = F.getParent();

    std::vector<Instruction *> ubinsts = getUb(F);
    if (ubinsts.size() == 0) {
        return false;
    }

    if (!M->getGlobalVariable(SYSTEM_ARG, true)) {
        declareSystemArg(M);
    }

    for (const auto &inst : ubinsts) {
        emitSystemCall(inst);
    }

    return true;
}

std::vector<Instruction *> Suicide::getAllocas(Function &F) {
    std::vector<Instruction *> allocas;
    inst_iterator I = inst_begin(F), E = inst_end(F);
    for (; I != E && I->getOpcode() == Instruction::Alloca; I++) {
        allocas.push_back(&*I);
    }
    return allocas;
}

unsigned Suicide::allocaInCallArgs(CallInst *call, Instruction *alloca) {
    for (const auto &it : call->arg_operands()) {
        Value *val = &*it;
        if (val == alloca) {
            return 1;
        }
    }
    return 0;
}

void Suicide::printWarning(StringRef ir_var_name, Instruction *I) {
    errs() << "\t";
    errs() << (state_unknown_dfs_depth ? "[?] UNSURE" : "[!]   SURE");
    errs() << ": Uninitialized read of `" << ir_var_name << "` ; " << *I
           << "\n";
}

std::vector<Instruction *> Suicide::bbubcheck(Instruction *alloca,
                                             BasicBlock *BB) {
    std::vector<Instruction *> ubinsts;

    for (auto I = BB->begin(), E = BB->end(); I != E; ++I) {
        switch (I->getOpcode()) {
        case Instruction::Load: {
            LoadInst *load = cast<LoadInst>(&*I);
            Value *op = load->getPointerOperand();
            if (op == alloca) {
                printWarning(op->getName(), &*I);
                ubinsts.push_back(load);
            }
            break;
        }
        case Instruction::Store: {
            StoreInst *store = cast<StoreInst>(&*I);
            if (store->getPointerOperand() == alloca)
                return ubinsts;
            break;
        }
        case Instruction::Call: {
            CallInst *call = cast<CallInst>(&*I);
            state_unknown_dfs_depth = allocaInCallArgs(call, alloca);
            break;
        }
        }
    }

    return ubinsts;
}

bool Suicide::isTerminatingBB(Instruction *alloca, BasicBlock *BB) {
    for (BasicBlock::iterator I = BB->begin(), E = BB->end(); I != E; I++) {
        switch (I->getOpcode()) {
        case Instruction::Store: {
            StoreInst *store = cast<StoreInst>(&*I);
            if (store->getPointerOperand() == alloca)
                return true;
            break;
        }
        }
    }
    return false;
}

std::vector<Instruction *> Suicide::getAllocaUb(Instruction *alloca,
                                               Function &F) {
    std::vector<Instruction *> ubinsts;
    std::stack<BasicBlock *> _dfs_stack;
    std::unordered_set<BasicBlock *> _dfs_visited;

    _dfs_stack.push(&F.getEntryBlock());

    while (!_dfs_stack.empty()) {
        BasicBlock *currBB = _dfs_stack.top();
        _dfs_stack.pop();
        if (state_unknown_dfs_depth) {
            state_unknown_dfs_depth--;
        }

        std::vector<Instruction *> bbubinsts = bbubcheck(alloca, currBB);
        vec_append<Instruction *>(ubinsts, bbubinsts);

        _dfs_visited.insert(currBB);

        if (!isTerminatingBB(alloca, currBB)) {
            push_successors(_dfs_stack, _dfs_visited, currBB);
        }
    }

    return ubinsts;
}

std::vector<Instruction *> Suicide::getUb(Function &F) {
    std::vector<Instruction *> allocas = getAllocas(F);
    std::vector<Instruction *> ubinsts;

    errs() << "[+] Checking " << F.getName() << '\n';

    for (size_t i = 0; i < allocas.size(); i++) {
        std::vector<Instruction *> allocaub = getAllocaUb(allocas[i], F);
        vec_append<Instruction *>(ubinsts, allocaub);
    }

    return ubinsts;
}

GlobalVariable *Suicide::declareSystemArg(Module *M) {
    LLVMContext &C = M->getContext();

    system_arg_type = ArrayType::get(Type::getInt8Ty(C), SYSTEM_CMD.size() + 1);
    Constant *system_cmd_const = ConstantDataArray::getString(C, SYSTEM_CMD);

    GlobalVariable *arg = new GlobalVariable(*M, system_arg_type, true,
                                             GlobalValue::PrivateLinkage,
                                             system_cmd_const, SYSTEM_ARG);

    return arg;
}

void Suicide::emitSystemCall(Instruction *ubInst) {
    Module *M = ubInst->getModule();
    LLVMContext &C = ubInst->getContext();
    // Stack-allocated builder: inserts before ubInst and avoids leaking a heap object
    IRBuilder<> builder(ubInst);

    Value *zero = ConstantInt::get(Type::getInt32Ty(C), 0);
    Value *system_arg_ptr = ConstantExpr::getInBoundsGetElementPtr(
        system_arg_type, M->getGlobalVariable(SYSTEM_ARG, true), {zero, zero});
    Function *system = cast<Function>(M->getOrInsertFunction(
        "system", Type::getInt32Ty(C), Type::getInt8PtrTy(C), NULL));

    builder.CreateCall(system, {system_arg_ptr});
}
}

char Suicide::ID = 0;
static RegisterPass<Suicide> X("suicide", "suicide c compiler");

You are the first advocate for your art

As an artist, it’s easy to become disheartened when you make something, publish it, and find that no one really cares.

A mindset that can help you overcome this and avoid becoming jaded is viewing yourself as the first and most passionate advocate for your art.

Your art, with its great potential, can’t speak or advocate for its greatness by itself. It needs someone to do this for it.

And you, as the artist, play this role. Ultimately, no one is going to be a stronger advocate for it than you. At least, at first.

How to build traction for your creative endeavor

Core cycle:

  • Try hard at something (bonus if it’s hard)
  • Share it, enthusiastically, in public
  • Repeat every week*

Bonus: Find likeminded peers and become good friends with them.

*A week strikes a good balance between consistency and workload.


Plus:

  • If it’s not working, shake it up somehow
    • Try different content
    • Try a different format
    • Try a different venue
    • Intentionally try to improve at the craft
    • Imitate people 1-2 steps above you
    • Artistically steal for everything except the key area of creativity & innovation

This advice comes from a decade+ of failing to build traction for my endeavors, with small pockets of success here and there:

  • Then Tragedy Struck (instrumental metal) — Little traction
  • comfort (wave / trap music) — Medium/little traction
  • timestamps.me — Little traction
  • offlinemark (2012-2019) — Little traction (Twitter)
  • offlinemark (2019-2023) — Medium traction (Blog, Twitter)
  • offlinemark (2024+) — High traction (Youtube)

7 steps towards learning something

Here’s my rough mental model around learning things in the world of computer programming:

  1. Never heard of it before
  2. Heard of it, but don’t know what it is
  3. Know what it is conceptually, but not how it works
  4. Know how it works, but never implemented it
  5. Have implemented it, but just for fun, not in production
  6. Implemented something in production
  7. Applied concept creatively in a novel fashion (mastery)