Articles

Open source is like a second job
Published on 2025-06-24, edited on 2025-06-25
When I started working as a software engineer, I really wanted to work full time in open source. I've been lucky enough to spend a decent chunk of my career doing exactly that. Much of that work has been incredibly rewarding, but for readers interested in taking a similar path, I want to share an insight and a warning.
Tagged: career open-source

Preserving comments when parsing and formatting code
Published on 2023-11-02
Some tools in the Bazel ecosystem can preserve comments when formatting a BUILD file, even after significant modifications. In this article, I'll compare Go's parser and formatter with the library used by the Bazel tools. Both libraries can preserve comments, but the Bazel library does it better.
Tagged: bazel compilers go parsers

Goroutines: the concurrency model we wanted all along
Published on 2023-07-02, edited on 2023-07-07
Goroutines are the single feature that distinguishes Go from other languages. They look very much like threads, but they are cheap to create and manage. Go's runtime schedules goroutines onto real threads efficiently so you can easily create lots of goroutines.
Tagged: concurrency go java
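
As a rough illustration of how cheap goroutines are, here is a minimal sketch that starts one goroutine per work item and waits for all of them. The function name `sumSquares` and the workload are assumptions made up for this example; they don't come from the article.

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares starts one goroutine per integer in 1..n and adds up the squares.
// The name and the workload are hypothetical, chosen only for illustration.
func sumSquares(n int) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int
	)
	for i := 1; i <= n; i++ {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			sq := v * v
			// The mutex protects total, which all goroutines share.
			mu.Lock()
			total += sq
			mu.Unlock()
		}(i)
	}
	// Block until every goroutine has called Done.
	wg.Wait()
	return total
}

func main() {
	// Ten thousand goroutines is a small number for the Go runtime.
	fmt.Println(sumSquares(10000))
}
```

Creating one operating system thread per item would usually be too expensive at this scale, which is why thread-based programs tend to reach for a pool instead.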

How GitHub's upgrade broke Bazel builds
Published on 2023-02-08
Last week, GitHub upgraded the version of Git they use to produce repository archives. Upgrading Git regularly is generally a good idea, but this change broke a huge number of Bazel projects. What happened? Most Bazel projects fetch at least some of their dependencies using http_archive rules in their WORKSPACE files.
Tagged: bazel dependency-management

ctxio: A new Go package for cancellable I/O
Published on 2023-01-15, edited on 2023-01-16
ctxio is a small Go library that lets you cancel long-running I/O operations using context.Context. Say you're writing a command-line tool that copies large files, perhaps on a slow network file system. When the user presses ^C, you want to cancel ongoing copy operations and clean up partially written files.
Tagged: go

Go Editor Support in Bazel Workspaces
Published on 2022-11-21, edited on 2022-11-26
The story of how Go editor support was implemented in Bazel workspaces. This article is the script I wrote for that talk, together with my slides.
Tagged: bazel editors go

Curiosities of the Go testing package
Published on 2022-05-26, edited on 2022-06-10
go test and the testing package have a pretty unique way of doing things. Their implementations are clever, but far from obvious, and the answers weren't clear until I started working on the testing package itself.
Tagged: go

Internals of Go's new fuzzing system
Published on 2022-02-17
Go 1.18 is coming out soon. It's a huge release, but native fuzzing has a special place in my heart. Not much has been written yet on how Go's fuzzing system actually works, so I'll talk a bit about that here.
Tagged: fuzzing go

Leaving Google
Published on 2021-10-22
Last Friday was my last day at Google. This article is a reflection on the last seven years of my life, thinking about what was important and what I'll do differently in the future.
Tagged: bazel career go

Life of a Go module
Published on 2021-03-26
Go's module system is decentralized. An author can publish a new version by simply creating a tag in the module's source repository. How exactly does that work? What does the go command download, and from where?
Tagged: git go modules

Error handling guidelines for Go
Published on 2020-12-28
Error handling is one of the most ambiguous parts of programming. There are so many ways to do it. One approach is usually better than others, but it's not always clear what that is, especially in a new language or environment.
Tagged: go

Organizing Bazel WORKSPACE files
Published on 2020-09-13
WORKSPACE has several functions, but its main purpose is to declare external dependencies using repository rules. In this article, I'll explain how WORKSPACE is evaluated, then I'll give some guidelines for organizing WORKSPACE files to avoid confusion and ambiguity.
Tagged: bazel

Accessing private GitHub repositories over HTTP
Published on 2020-05-10
This article provides instructions on configuring Git to authenticate with private GitHub repositories. This is helpful for downloading Go modules in a CI environment where HTTP is preferred, credentials can't be stored permanently, and typing a password or tapping a security key is not possible.
Tagged: git go

Export data, the secret of Go's fast builds
Published on 2020-03-07
Each compiled Go package contains export data, a binary description of its exported definitions. When the compiler handles an import, it can quickly scan export data to learn about the imported package's definitions. This is much faster than parsing and type-checking sources for imported packages, so building large programs is very fast.
Tagged: compilers go

Writing Bazel rules: data and runfiles
Published on 2020-02-01, edited on 2023-10-12
Bazel has a neat feature that can simplify a lot of work with tests and executables: the ability to make data files available at run-time using `data` attributes. This is useful for all kinds of things such as plugins, configuration files, certificates and keys, and resources.
Tagged: bazel go

Writing Bazel rules: platforms and toolchains
Published on 2019-12-07, edited on 2020-02-01
Bazel can isolate a build from the host system using platforms and toolchains. In this article, we'll walk through the process of configuring our simple set of rules to use toolchains. When this is done, the rules will be almost completely independent from the host system.
Tagged: bazel go

Writing Bazel rules: repository rules
Published on 2019-11-09, edited on 2020-02-01
In this article, we'll define a repository rule that downloads a Go toolchain and generates a custom build file. This lets us avoid depending on the host toolchain, which aids reproducibility and makes remote execution possible.
Tagged: bazel go

Writing Bazel rules: moving logic to execution
Published on 2018-12-26, edited on 2023-10-15
Bazel's execution phase has many advantages, and you should prefer to implement rule logic there if at all possible. Execution code has I/O access to source files. It can be written in any language. Work can be distributed across many machines, so it can be faster for everyone.
Tagged: bazel go

Writing Bazel rules: library rule, depsets, providers
Published on 2018-08-15, edited on 2020-02-01
We'll define a go_library rule, which can be depended on by other libraries and binaries. We'll also cover structs, providers, and depsets. They are data structures used to pass information between rules, and we'll need them to gather information about dependencies.
Tagged: bazel go

Writing Bazel rules: simple binary rule
Published on 2018-07-31, edited on 2023-09-10
Bazel lets you write rules in Starlark to support new languages. This time, we'll cover writing a simple rule that compiles and links a Go binary from sources. Bazel rules are highly structured, and learning this structure takes time. However, this structure helps you avoid introducing unnecessary complication in complex builds.
Tagged: bazel go

An update on Gypsum and CodeSwitch
Published on 2018-04-22
Observant readers will notice I haven't written anything about Gypsum or CodeSwitch in a while. Work has reached a manageable pace though, and I'm ready to start tinkering on side projects again. It's time for a change in direction: I plan to focus more on CodeSwitch and less on Gypsum.
Tagged: codeswitch gypsum

Minibox: A miniature Linux container runner
Published on 2017-12-23
I've been curious about how Linux containers work for a long time. I've played around with Docker, but it basically seems like magic. I decided to learn more about them by writing my own tiny container implementation.
Tagged: linux

Intermediate Linux command line tutorial
Published on 2017-10-23
This is a concise collection of tips that will help you be more productive on the command line without getting into Linux internals or non-standard tools. Advice here generally applies not only to Linux but also to macOS (which has the same shell).
Tagged: bash linux

Thoughts from GopherCon 2017
Published on 2017-07-17
Just got back from GopherCon 2017, my first Go conference. It was great to meet a lot of prominent people and hear some new ideas. I'm writing down some thoughts and ideas while they're still fresh.
Tagged: go

How Python parses white space
Published on 2017-06-17
One of the most distinctive and remarkable things about Python's syntax is its use of white space. Python uses white space for two things: newlines are used to terminate logical lines, and changes in indentation are used to delimit blocks. Both of these behaviors are somewhat contextual.
Tagged: compilers python

Lambdas in Gypsum
Published on 2017-04-27
I've finally added a feature to Gypsum that I've wanted for a long time: lambdas. They're useful for defining short, compact functions that can be passed to other functions as callbacks. They're especially important for functional programming (map, filter, reduce, and friends).
Tagged: compilers gypsum

Migrating to Bazel: Part 2
Published on 2017-03-16
Previously, I focused mostly on getting Python and C++ code to build. This time, I'll talk about adding support for building Gypsum packages. I'll also give a bit more background on the Skylark language and how Bazel deals with extensions.
Tagged: bazel gypsum

Migrating to Bazel: Part 1
Published on 2017-02-21
Bazel is the open source version of Google's internal build system. Gypsum is a cross-language project, and I wanted something that could easily be extended to work with Gypsum itself. Bazel was a natural choice.
Tagged: bazel gypsum

Traits in Gypsum
Published on 2017-01-22
I'm happy to announce the launch of a major new language feature in Gypsum: traits. Traits are like interfaces in Java. They provide a form of multiple inheritance. When you define a trait, you create a new type and a set of methods that go with it, some of which may be abstract. Classes and other traits that inherit from a trait will inherit those methods.
Tagged: compilers gypsum

Static and dynamic types in pattern matching
Published on 2016-12-28
Gypsum now supports pattern matching using both static and dynamic type information. In general, pattern matching involves checking at run-time whether a value has a given type. By incorporating static type information about the value being matched, we can perform some checks that wouldn't normally be safe to perform at run-time.
Tagged: compilers gypsum

Machine learning and compiler optimization
Published on 2016-09-19
Compilers use a heuristic function that predicts whether inlining will be beneficial. This seems like the kind of problem that machine learning was born to solve.
Tagged: compilers machine-learning v8

Futures are better than callbacks
Published on 2016-08-14
This class was ridiculously large and complicated because of the way it dealt with concurrency. Callbacks are cumbersome for a number of reasons. Futures are better in pretty much every situation I can think of.
Tagged: android concurrency java

CodeSwitch assembly glue for native functions
Published on 2016-07-21
Last time, I discussed native functions, but I didn't really talk about how CodeSwitch makes the transition from interpreted code to native code. CodeSwitch uses a bit of assembly glue code to load arguments from the interpreter's stack into the right places.
Tagged: codeswitch

CodeSwitch API: native functions
Published on 2016-06-12
I've added the capability for CodeSwitch to call native functions written in C++. This means that when you write a package, part of it can be written in Gypsum, and part of it in C++. This is useful for implementing new low-level primitives, such as files and sockets.
Tagged: codeswitch gypsum

CodeSwitch API improvements
Published on 2016-03-18
CodeSwitch is designed to be a library that can be embedded in any application. A good API is crucial. While I can't say that CodeSwitch's C++ API is completely stable yet, I think it's gotten to a pretty usable state.
Tagged: codeswitch gypsum virtual-machines

Existential types in Gypsum
Published on 2016-02-04
Existential types allow you to express that you have an object with a known class, but you don't know what's inside it. For example, instead of having a list of strings, you have a list of "something". In technical terms, you have an instance of some parameterized class, but you don't know the type arguments. Existential types are similar to wildcard types in Java.
Tagged: gypsum

Arrays in Gypsum
Published on 2016-01-26
In most languages (like C or Java), arrays are a primitive that stands on its own. You can build other data structures like array lists and hash maps out of them. In Gypsum, array elements can be integrated into any class. The normal class members come first in memory, then the array elements follow immediately.
Tagged: gypsum

Pattern matching in Gypsum
Published on 2016-01-02
You might think of pattern matching as a switch statement on steroids. You examine a value using several patterns, then execute an expression based on which of the patterns successfully matched.
Tagged: gypsum

Importing symbols in Gypsum
Published on 2015-10-28
The import statement is one of several new language features I added to Gypsum this summer. Just like the import statement in Java, it makes definitions from another package available in the scope containing the import statement. Unlike Java, multiple definitions can be imported in the same statement. Definitions can also be renamed.
Tagged: gypsum

Memory management in CodeSwitch
Published on 2015-09-12
CodeSwitch has its own garbage collected heap, which is used not only for objects allocated by interpreted code, but also for most internal data structures. In this article, I'll describe how the heap is built, how the garbage collector works, and how it tracks pointers to the heap from C++ code.
Tagged: codeswitch garbage-collection gypsum virtual-machines

CodeSwitch bytecode and interpretation
Published on 2015-08-27
The interpreter is essentially a loop with a big switch-statement. In each iteration, it reads one instruction, switches on the opcode, branches to the appropriate case, then executes some code for that instruction.
Tagged: codeswitch gypsum interpreter virtual-machines
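
The loop-and-switch shape described above can be sketched in a few lines. This is a toy stack machine written in Go for illustration, not CodeSwitch's interpreter: the opcodes (`opPush`, `opAdd`, `opMul`, `opHalt`) and the function `run` are all assumptions invented for this example.

```go
package main

import "fmt"

// Hypothetical opcodes for a toy stack machine; not CodeSwitch's bytecode.
const (
	opPush = iota // push the next byte onto the stack as an integer
	opAdd         // pop two values, push their sum
	opMul         // pop two values, push their product
	opHalt        // stop and return the top of the stack
)

// run executes code and returns the value on top of the stack at opHalt.
func run(code []byte) (int, error) {
	var stack []int
	pc := 0
	for pc < len(code) {
		// Read one instruction, then switch on its opcode.
		op := code[pc]
		pc++
		switch op {
		case opPush:
			if pc >= len(code) {
				return 0, fmt.Errorf("truncated push at offset %d", pc-1)
			}
			stack = append(stack, int(code[pc]))
			pc++
		case opAdd, opMul:
			if len(stack) < 2 {
				return 0, fmt.Errorf("stack underflow at offset %d", pc-1)
			}
			a, b := stack[len(stack)-2], stack[len(stack)-1]
			stack = stack[:len(stack)-2]
			if op == opAdd {
				stack = append(stack, a+b)
			} else {
				stack = append(stack, a*b)
			}
		case opHalt:
			if len(stack) == 0 {
				return 0, fmt.Errorf("empty stack at halt")
			}
			return stack[len(stack)-1], nil
		default:
			return 0, fmt.Errorf("unknown opcode %d at offset %d", op, pc-1)
		}
	}
	return 0, fmt.Errorf("program ended without halt")
}

func main() {
	// Computes (2 + 3) * 4.
	program := []byte{opPush, 2, opPush, 3, opAdd, opPush, 4, opMul, opHalt}
	result, err := run(program)
	fmt.Println(result, err) // prints "20 <nil>"
}
```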

Package loading in CodeSwitch
Published on 2015-07-27
CodeSwitch manages code in chunks called packages. Each package is stored in a separate file. A package has a name, a version, and a list of dependencies (other packages it depends on). Each dependency has a name, a minimum version, and a maximum version (both versions are optional).
Tagged: codeswitch

How CodeSwitch got its name
Published on 2015-07-10
Code switching is a linguistic term for when a person speaks in one language, then switches to another language mid-sentence. I want programmers to be able to do that with code.
Tagged: codeswitch

Packages in Gypsum and CodeSwitch
Published on 2015-05-31
Packages are named bundles of related code. They make code easier to understand and distribute. Each package is compiled into a single file, and has a unique name, a version, and a list of dependencies.
Tagged: codeswitch gypsum

Thoughts on automated testing
Published on 2015-05-07
I'm writing this article to organize my thoughts on the subject of testing so that I can better understand the obstacles which make it more difficult, and hopefully eliminate them in the future.
Tagged: object-oriented testing

Type parameter bounds and variance
Published on 2015-02-11
Type parameter bounds and variance provide a huge amount of flexibility and precision in the Gypsum type system. They let you handle many cases where you would normally have to fall back to casting and run-time type checking.
Tagged: compilers gypsum

A weird problem in the Scala type system
Published on 2015-01-19
I've been trying to formalize the type system in Gypsum. There are two operations in particular that I want to put on a sound theoretical foundation.
Tagged: compilers gypsum scala

Gypsum now has type parameters!
Published on 2014-12-09
Type parameters are also known as generics in other languages. They enable parametric polymorphism, providing abstraction over types for functions and classes.
Tagged: compilers gypsum

Structure of the Gypsum compiler (part 3)
Published on 2014-09-27
In this article, I discuss closure conversion, class flattening, CodeSwitch bytecode, semantic analysis, and serialization.
Tagged: compilers gypsum

Structure of the Gypsum compiler (part 2)
Published on 2014-09-12
In this article, I discuss the Gypsum intermediate representation, declaration analysis, inheritance analysis, and type analysis.
Tagged: compilers gypsum

Structure of the Gypsum compiler (part 1)
Published on 2014-07-20
Gypsum is an experimental language, so the compiler is designed to be very flexible, easy to change and extend. The nice thing about side projects is that you can spend some extra time making sure the code is clean and elegant. You don't have to take on any technical debt to meet deadlines.
Tagged: compilers gypsum

Introducing Gypsum
Published on 2014-07-06
Gypsum is a new compiled, statically-typed, object-oriented programming language. When the compiler is more complete, it will be functional as well. Its syntax is inspired by Python and Scala.
Tagged: gypsum

Pinky and the god: Avoiding Emacs-induced RSI
Published on 2014-06-24
A story about creating and open sourcing a minor mode to help prevent "Emacs pinky"
Tagged: emacs

Emacs: Run git-blame on the current line
Published on 2014-05-10
This function runs git-blame on the current line and prints the short commit id, author, and commit date into the minibuffer.
Tagged: emacs

The Strahler number
Published on 2014-03-13
I was browsing Wikipedia and came across an article on the Strahler number, which measures the branching of a tree. The number originally came from hydrology and was used to describe systems of rivers and streams.
Tagged: compilers

How to build a parser by hand
Published on 2014-02-13
Building a hand-written parser is actually not much harder than using a tool. You can easily build something simple, efficient, and flexible, but perhaps not that elegant. I'll show how I built a parser for a simple template language I use to generate HTML for this blog.
Tagged: parsers python web

New and improved blog software
Published on 2014-02-05
If you're a regular reader, you might notice a few changes around here! I just launched a modernization of my blogging software.
Tagged: python web

Bit rot is real!
Published on 2013-11-16
A debugging war story about bad memory.
Tagged: debugging

A tour of V8: Garbage Collection
Published on 2013-09-23, edited on 2014-01-26
In this article, I reveal the secrets of V8's garbage collector! Tagged pointers, generational collection, incremental marking, and parallel sweeping are demystified!
Tagged: garbage-collection javascript v8 virtual-machines

A tour of V8: Crankshaft, the optimizing compiler
Published on 2013-04-10, edited on 2013-12-13
Crankshaft is V8's optimizing compiler. Once unoptimized code has been running for a while, V8 identifies hot functions and recompiles them with Crankshaft, greatly improving performance.
Tagged: javascript optimization v8 virtual-machines

A tour of V8: object representation
Published on 2012-12-24, edited on 2013-12-13
JavaScript allows developers to define objects in a very flexible way. This article looks at the clever optimizations V8 uses to make accessing JavaScript objects as fast as objects in class-based languages.
Tagged: javascript v8 virtual-machines

New Android app: Snowflakes live wallpaper
Published on 2012-11-25
Today, I launched my first Android app. It's a live wallpaper which renders falling snowflakes using OpenGL ES 2.0.
Tagged: 3d android opengl

A tour of V8: full compiler
Published on 2012-11-04, edited on 2015-11-28
An overview of the high level structure of the V8 JavaScript Virtual Machine, with details on the full compiler and inline caches.
Tagged: javascript optimization v8 virtual-machines

Water simulation demo in WebGL
Published on 2012-05-28, edited on 2012-06-18
A fully interactive water simulation demo in WebGL and GLSL using a custom built framework.
Tagged: 3d webgl

Interviewing from the other side of the table
Published on 2012-04-22
Being the interviewer rather than the interviewee is still new to me. I've come to the conclusion that interviewing potential engineers is a technical skill that needs to be learned and refined like any other.

A tale of two benchmarks
Published on 2011-10-24, edited on 2020-04-05
Good benchmarks are extremely useful tools for software developers. They tell you the weaknesses of your hardware and software, so you know what to optimize. Customers also care about them a lot, and higher benchmark scores mean more sales.
Tagged: benchmarks javascript

Polymorphic Inline Caches explained
Published on 2011-07-24
Polymorphic inline caches are a way to optimize function calls in dynamic languages
Tagged: optimization virtual-machines

How to use Emacs like GNU Screen
Published on 2011-06-09
Tagged: emacs

A brief explanation of Scala 2.8 collections
Published on 2011-06-04
Tagged: scala

A simple interpreter from scratch in Python (part 4)
Published on 2011-05-12, edited on 2014-04-12
We will build the last component of the interpreter, the evaluator, by modelling the program environment and by coding how each statement and expression should be executed.
Tagged: compilers imp python

A simple interpreter from scratch in Python (part 3)
Published on 2011-05-09, edited on 2014-04-12
We create an abstract syntax tree for our toy language and write a parser using our parser combinator library.
Tagged: compilers imp parsers python

A simple interpreter from scratch in Python (part 2)
Published on 2011-03-29, edited on 2014-04-12
In this article, we will write a small parser combinator library. The combinator library is language agnostic, so you could use it to write a parser for any language.
Tagged: compilers imp parsers python

A simple interpreter from scratch in Python (part 1)
Published on 2011-02-06, edited on 2014-04-12
In the first part in this series, we build the lexer, the first component of an interpreter for our toy language, IMP.
Tagged: compilers imp python

Emacs etags: a quick introduction
Published on 2010-11-23
Emacs etags lets you quickly locate a definition by its name anywhere in your code base.
Tagged: emacs

Parsing key=value pairs in bash
Published on 2010-10-14
Here's a neat text processing trick for Bash. Let's say you have a text file in which you want to replace several words with new words. The words and their replacements are supplied to you as key=value pairs. How can you parse the pairs?
Tagged: bash

Water simulation in GLSL
Published on 2010-06-02, edited on 2011-11-26
A quick demo of reflective water with waves using GLSL in OpenGL
Tagged: 3d opengl

Trapping floating point exceptions in Linux
Published on 2010-05-09
Tagged: linux

Convenient updates for immutable objects in Scala
Published on 2010-04-30
Tagged: scala

Parallelization: Harder than it looks
Published on 2010-04-12
Tagged: c++ optimization parallelization

Tutorial: Reverse debugging with GDB 7
Published on 2009-12-01
Tagged: gdb linux

Processing large files in Scala
Published on 2009-10-01, edited on 2012-03-24
Assorted tips for quickly processing very large files in Scala programs
Tagged: scala

Midrange computers are amazingly cheap
Published on 2009-08-25

Scala's IDE support is terrible
Published on 2009-07-14

Tutorial: Function Interposition in Linux
Published on 2009-06-30
Tagged: debugging linux

Demise of the BBS
Published on 2009-06-17

Keeping a developer diary
Published on 2009-05-24

How to use HTTP cookies in Python
Published on 2009-02-08
Tagged: python web

Currying and why we don't pass arguments as tuples
Published on 2009-01-16
Tagged: compilers fenris functional-programming

New Blogging System with Python/MySQL
Published on 2008-01-01
Tagged: python web

Type inference and generics post-mortem
Published on 2008-01-01
Tagged: compilers fenris

Type inference
Published on 2008-01-01
Tagged: compilers fenris

Limitations on generics
Published on 2008-01-01
Tagged: compilers fenris

Type tagging format
Published on 2008-01-01
Tagged: compilers fenris