Surgical formatting with git-clang-format

If you’re already a 10x engineer, you probably won’t need this article. But for the rest of us, this is what I wish I knew about clang-format as an inexperienced C++ programmer: how to only format the changes in your pull request.


You may have already heard of clang-format. It auto-formats source files for languages including C and C++. You can aim it at a source file and format the entire thing using clang-format -i file.cpp.

If you’re contributing to a project that is already 100% clang-format clean, this workflow works fine. But you’ll occasionally encounter a project that is not quite 100% formatted, such as LLVM, osquery, or Electron. [1]

For these projects, the “format entire files” workflow doesn’t work because you’ll incidentally format parts of the files that are unrelated to your contribution. This adds noise to your diff and makes it harder to review.

In this case, you need a way to surgically format only the lines changed in your contribution. To do this, you can use the clang-format git extension. This article covers the basics of git-clang-format, including a practical workflow that lets you develop messily and format at the end.

git clang-format

The clang-format git extension comes with the clang-format package on Ubuntu. You can also install it manually by downloading the git-clang-format script from the LLVM source tree, putting it in your PATH, and making sure it is executable. [2] Then you’ll be able to run git clang-format in your shell.
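
To see why dropping a script into your PATH is enough, here is a minimal sketch with a throwaway extension. It assumes git is installed; the name git-hello and the temporary directory are made up for illustration and have nothing to do with clang-format itself.

```shell
# Create a throwaway directory to hold a hypothetical extension.
ext_dir=$(mktemp -d)

# Any executable named git-<name> on PATH can be called as `git <name>`.
cat > "$ext_dir/git-hello" <<'EOF'
#!/bin/sh
echo "hello from a git extension"
EOF
chmod +x "$ext_dir/git-hello"

# Put the directory on PATH for this shell only, then call it through git.
PATH="$ext_dir:$PATH"
greeting=$(git hello)
echo "$greeting"

# The same lookup tells you whether git-clang-format is installed.
command -v git-clang-format || echo "git-clang-format is not on PATH"
```

If the last line prints a path, git clang-format should be available as a subcommand.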

Formatting a single commit

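By default, git clang-format formats the lines you have changed since the last commit. Staging your changes first makes sure that newly added files are included. The workflow looks like this: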

  • Change files to your heart’s content. Be messy.
  • Stage your changes by running git add
  • Format changes by running git clang-format

For example, I’ve added a new cpp file and staged it:

$ git diff --staged
diff --git a/x.cpp b/x.cpp
new file mode 100644
index 0000000..af14ed5
--- /dev/null
+++ b/x.cpp
@@ -0,0 +1,3 @@
+int main() {
+
+}

Then I ran git clang-format. It says it changed a file:

$ git clang-format
changed files:
    x.cpp

Now git status shows that I have both staged and unstaged changes.

$ git status
On branch master
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
	new file:   x.cpp

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   x.cpp

git diff will show the unstaged changes – the changes created by clang-format.

$ git diff
diff --git a/x.cpp b/x.cpp
index af14ed5..237c8ce 100644
--- a/x.cpp
+++ b/x.cpp
@@ -1,3 +1 @@
-int main() {
-
-}
+int main() {}

This is nice because you can review git clang-format’s changes independently of your development changes. If you don’t like them, you can discard them from your working tree using git checkout. If you like them, you can stage the formatting changes using git add.
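
Both outcomes can be sketched in a scratch repository. To keep the example runnable without clang-format installed, the formatter’s edit is simulated by overwriting the file by hand; the file name and contents mirror the example above, and the temporary repository is an assumption for illustration.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

# Stage the messy version, as in the example above.
printf 'int main() {\n\n}\n' > x.cpp
git add x.cpp

# Stand-in for `git clang-format`: an unstaged edit on top of the staged file.
printf 'int main() {}\n' > x.cpp

# Option 1: reject the formatting. This restores x.cpp to the staged version.
git checkout -- x.cpp
after_discard=$(cat x.cpp)

# Option 2: accept the formatting by staging it.
printf 'int main() {}\n' > x.cpp
git add x.cpp
unstaged=$(git diff --name-only)   # lists files that still have unstaged edits
```

On newer versions of git, git restore x.cpp does the same job as git checkout -- x.cpp, as the hints in the git status output suggest.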

Specify formatting style

You can use the --style argument to change the formatting style.

You can use one of the built-in styles, such as LLVM, Google, Chromium, Mozilla, or WebKit, or a custom one defined in a .clang-format file. For the latter, you literally pass the string “file” as the argument to --style.

$ git clang-format --style=WebKit
$ git clang-format --style=file # if you have an existing .clang-format
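
With --style=file, clang-format looks for a .clang-format file in the directory of the file being formatted or in one of its parent directories. Here is a minimal sketch of such a file; the option values are illustrative assumptions, not a recommendation, and the scratch directory stands in for a repository root.

```shell
# Write a small .clang-format into a scratch directory.
project=$(mktemp -d)
cat > "$project/.clang-format" <<'EOF'
# Start from a built-in style and override a few options.
BasedOnStyle: LLVM
IndentWidth: 4
ColumnLimit: 100
EOF

cat "$project/.clang-format"
# From inside a real repository you would then run:
#   git clang-format --style=file
```

Basing the file on a built-in style keeps it short, since you only list the options that differ.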

Formatting a messy dev branch

But there’s a problem. What if you have been making many incremental, messily formatted commits during development? Since git clang-format only looks at uncommitted changes by default, are you forced to run it before every commit in your dev process? Or can you format all the changes you’ve accumulated in your development branch after the fact?

Here’s my solution:

  • Make a new dev branch and squash all commits into one using git rebase
  • Use git reset --soft to bring the single squashed commit into the staging tree
  • Run git clang-format to format the squashed commit and add formatting changes to working tree
  • git checkout back to original dev branch to apply the formatting changes for the whole branch in one commit

For example, here are a few new changes I made in a dev branch. Let’s pretend they’re a bit messy because I was focused on the code itself, not the formatting.

$ git log --graph --oneline --all --decorate
* 170205b (HEAD -> dev) messy 3
* 93eaf10 messy 2
* 7689e0a messy 1
* a3d4418 (master) init

To prepare my dev branch for code review, I create a new branch and squash all the commits in dev into a single commit.

$ git checkout -b dev-squash
$ git rebase -i master

Then, in the rebase editor, I mark every commit except the first as a fixup (f). This squashes everything into the first commit and throws away the other commit messages, since we don’t need them.

pick 7689e0a messy 1
f 93eaf10 messy 2
f 170205b messy 3
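
If you would rather not edit the todo list by hand, the same edit can be scripted. The sketch below relies on git’s GIT_SEQUENCE_EDITOR environment variable, which names the program used to edit the rebase todo list. The scratch repository, the demo identity, and the sed expression are assumptions for illustration.

```shell
# Scratch repository with one base commit and three messy commits on top.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make the first branch "master"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo '// init' > x.cpp
git add x.cpp
git commit -q -m init
git checkout -q -b dev
for i in 1 2 3; do
  echo "// messy $i" >> x.cpp
  git commit -q -am "messy $i"
done

# Squash on a copy of the branch. The sequence editor rewrites the todo list
# so that every "pick" after the first line becomes "fixup".
git checkout -q -b dev-squash
GIT_SEQUENCE_EDITOR="sed -i.bak '2,\$s/^pick/fixup/'" git rebase -q -i master

# Count the commits that dev-squash has on top of master.
squashed_count=$(git rev-list --count master..dev-squash)
echo "$squashed_count"
git log --format=%s master..dev-squash
```

The squashed branch has the same file contents as dev, which you can confirm with git diff dev dev-squash.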

Now my branches look like this:

$ git log --graph --oneline --all --decorate
* dc83503 (HEAD -> dev-squash) messy 1
| * 170205b (dev) messy 3
| * 93eaf10 messy 2
| * 7689e0a messy 1
|/
* a3d4418 (master) init

The next step is to move the dev-squash commit into the staging tree so we can run git clang-format on it.

$ git reset --soft master
$ git status # the squashed commit is now in the staging tree
On branch dev-squash
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        modified:   x.cpp

Run git clang-format.

$ git clang-format
changed files:
    x.cpp
$ git diff
diff --git a/x.cpp b/x.cpp
index f0217c2..83e7f04 100644
--- a/x.cpp
+++ b/x.cpp
@@ -1,5 +1,4 @@
 // init
-int    main() {
-      // comment
-
-          }
+int main() {
+  // comment
+}
$ git status
On branch dev-squash
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        modified:   x.cpp

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   x.cpp

Check out the original dev branch again.

$ git checkout dev
M       x.cpp
Switched to branch 'dev'

And commit the formatting changes.

$ git add -u
$ git commit -m 'Formatting changes'
$ git log --graph --oneline --all --decorate
* fb3c51c (HEAD -> dev) Formatting changes
* 170205b messy 3
* 93eaf10 messy 2
* 7689e0a messy 1
* a3d4418 (master, dev-squash) init

There! Now we have our original dev branch intact, but we’ve run clang-format on all the changes in the branch and added the result as a commit at the tip of the branch. This has allowed us to develop messily and format everything at the end.
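
The steps can be collected into a small helper. This is a sketch under stated assumptions, not a polished tool: the function name format_branch and the FORMATTER override are hypothetical, and it skips the interactive rebase, because a soft reset from the tip of the copied branch leaves the same changes staged as squashing first would.

```shell
# format_branch BASE: add one formatting commit on top of the current branch.
# FORMATTER is a hypothetical override so the steps can be tried without
# clang-format installed; by default the function runs git clang-format.
format_branch() {
  base=$1
  dev=$(git rev-parse --abbrev-ref HEAD)
  scratch="$dev-squash"

  # Copy the branch, then move all of its changes into the staging tree.
  git checkout -q -b "$scratch"
  git reset -q --soft "$base"

  # Format the staged changes; the edits land in the working tree.
  # Carry on even if the formatter reports a non-zero exit status.
  ${FORMATTER:-git clang-format} || true

  # Return to the original branch, keeping the formatted working tree,
  # then remove the scratch branch and commit the formatting.
  git checkout -q "$dev"
  git branch -q -D "$scratch"
  git add -u
  git commit -q -m 'Formatting changes'
}
```

From the dev branch you would call it as format_branch master. If the formatter changes nothing, the final git commit fails because there is nothing to commit.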

Conclusion

clang-format is a powerful tool, but using it in real life requires a bit more than clang-format -i. In practice, developers run clang-format on their specific changes using tools like git clang-format.

By default, git clang-format works on uncommitted changes and leaves its edits unstaged, which makes it easy to review formatting changes separately from development changes. This adds complexity when formatting an entire dev branch, but it’s nothing that can’t be solved with a few git commands. You can develop messily and be assured that you can format everything when you’re ready to submit for code review.


Learn something new? Let me know!

Did you learn something from this post? I’d love to hear what it was — tweet me @offlinemark!

  1. Projects may avoid nuking all files with clang-format because it tends to disrupt the VCS history. That said, git blame’s “ignore revs” feature is a solution for this.
  2. Fun fact: git extensions are simply executable scripts in your path that start with “git-”!
