Error handling guidelines for Go
Error handling is one of the most ambiguous parts of programming. There are so many ways to do it. One approach is usually better than others, but it's not always clear what that is, especially in a new language or environment.
Organizing Bazel WORKSPACE files
WORKSPACE has several functions, but its main purpose is to declare external dependencies using repository rules. In this article, I'll explain how WORKSPACE is evaluated, then I'll give some guidelines for organizing WORKSPACE files to avoid confusion and ambiguity.
Accessing private GitHub repositories over HTTP
This article provides instructions on configuring Git to authenticate with private GitHub repositories. This is helpful for downloading Go modules in a CI environment where HTTP is preferred, credentials can't be stored permanently, and typing a password or tapping a security key is not possible.
Export data, the secret of Go's fast builds
Each compiled Go package contains export data, a binary description of its exported definitions. When the compiler handles an import, it can quickly scan export data to learn about the imported package's definitions. This is much faster than parsing and type-checking sources for imported packages, so building large programs is very fast.
Writing Bazel rules: platforms and toolchains
Bazel can isolate a build from the host system using platforms and toolchains. In this article, we'll walk through the process of configuring our simple set of rules to use toolchains. When this is done, the rules will be almost completely independent from the host system.
Writing Bazel rules: repository rules
In this article, we'll define a repository rule that downloads a Go toolchain and generates a custom build file. This lets us avoid depending on the host toolchain, which aids reproducibility and makes remote execution possible.
Writing Bazel rules: moving logic to execution
Bazel's execution phase has many advantages, and you should prefer to implement rule logic there if at all possible. Execution code has I/O access to source files. It can be written in any language. Work can be distributed across many machines, so it can be faster for everyone.
Writing Bazel rules: data and runfiles
Bazel has a neat feature that can simplify a lot of work with tests and executables: the ability to make data files available at run-time using `data` attributes. This is useful for all kinds of things such as plugins, configuration files, certificates and keys, and resources.
Writing Bazel rules: library rule, depsets, providers
We'll define a go_library rule, which can be depended on by other libraries and binaries. We'll also cover structs, providers, and depsets. They are data structures used to pass information between rules, and we'll need them to gather information about dependencies.
Writing Bazel rules: simple binary rule
Bazel lets you write rules in Starlark to support new languages. This time, we'll cover writing a simple rule that compiles and links a Go binary from sources. Bazel rules are highly structured, and learning this structure takes time. However, this structure helps you avoid introducing unnecessary complication in complex builds.
An update on Gypsum and CodeSwitch
Observant readers will notice I haven't written anything about Gypsum or CodeSwitch in a while. Work has reached a manageable pace though, and I'm ready to start tinkering on side projects again. It's time for a change in direction: I plan to focus more on CodeSwitch and less on Gypsum.
Minibox: A miniature Linux container runner
I've been curious about how Linux containers work for a long time. I've played around with Docker, but it basically seems like magic. I decided to learn more about them by writing my own tiny container implementation.
Intermediate Linux command line tutorial
This is a concise collection of tips that will help you be more productive on the command line without getting into Linux internals or non-standard tools. Advice here generally applies not only to Linux but also to macOS (which has the same shell).
Thoughts from GopherCon 2017
Just got back from GopherCon 2017, my first Go conference. It was great to meet a lot of prominent people and hear some new ideas. I'm writing down some thoughts and ideas while they're still fresh.
How Python parses white space
One of the most distinct and remarkable things about Python's syntax is its use of white space. Python uses white space for two things: newlines are used to terminate logical lines, and changes in indentation are used to delimit blocks. Both of these behaviors are somewhat contextual.
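As a rough illustration of that contextual behavior, the standard `tokenize` module can show which newlines end a logical line (`NEWLINE`), which do not (`NL`), and where indentation changes open and close blocks (`INDENT`, `DEDENT`). The sample source string is made up for this sketch.

```python
import io
import tokenize

# Made-up sample: the first newline sits inside parentheses, so it does not
# terminate the logical line; the indented line opens a block.
SOURCE = "x = (1 +\n     2)\nif x:\n    y = 1\n"

def whitespace_tokens(source):
    """Return the names of the white-space-related tokens, in order."""
    wanted = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT, tokenize.DEDENT}
    names = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in wanted:
            names.append(tokenize.tok_name[tok.type])
    return names

tokens = whitespace_tokens(SOURCE)
print(tokens)
```

The newline inside the parentheses is reported as `NL`, while the others are reported as `NEWLINE`.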
Lambdas in Gypsum
I've finally added a feature to Gypsum that I've wanted for a long time: lambdas. They're useful for defining short, compact functions that can be passed to other functions as callbacks. They're especially important for functional programming (map, filter, reduce, and friends).
Migrating to Bazel: Part 2
Previously, I focused mostly on getting Python and C++ code to build. This time, I'll talk about adding support for building Gypsum packages. I'll also give a bit more background on the Skylark language and how Bazel deals with extensions.
Migrating to Bazel: Part 1
Bazel is the open source version of Google's internal build system. Gypsum is a cross-language project, and I wanted something that could easily be extended to work with Gypsum itself. Bazel was a natural choice.
Traits in Gypsum
I'm happy to announce the launch of a major new language feature in Gypsum: traits. Traits are like interfaces in Java. They provide a form of multiple inheritance. When you define a trait, you create a new type and a set of methods that go with it, some of which may be abstract. Classes and other traits that inherit from a trait will inherit those methods.
Static and dynamic types in pattern matching
Gypsum now supports pattern matching using both static and dynamic type information. In general, pattern matching involves checking at run-time whether a value has a given type. By incorporating static type information about the value being matched, we can perform some checks that wouldn't normally be safe to perform at run-time.
Machine learning and compiler optimization
Compilers use a heuristic function that predicts whether inlining will be beneficial. This seems like the kind of problem that machine learning was born to solve.
Futures are better than callbacks
This class was ridiculously large and complicated because of the way it dealt with concurrency. Callbacks are cumbersome for a number of reasons. Futures are better in pretty much every situation I can think of.
CodeSwitch assembly glue for native functions
Last time, I discussed native functions, but I didn't really talk about how CodeSwitch makes the transition from interpreted code to native code. CodeSwitch uses a bit of assembly glue code to load arguments from the interpreter's stack into the right places.
CodeSwitch API: native functions
I've added the capability for CodeSwitch to call native functions written in C++. This means that when you write a package, part of it can be written in Gypsum, and part of it in C++. This is useful for implementing new low-level primitives, such as files and sockets.
CodeSwitch API improvements
CodeSwitch is designed to be a library that can be embedded in any application. A good API is crucial. While I can't say that CodeSwitch's C++ API is completely stable yet, I think it's gotten to a pretty usable state.
Existential types in Gypsum
Existential types allow you to express that you have an object with a known class, but you don't know what's inside it. For example, instead of having a list of strings, you have a list of "something". In technical terms, you have an instance of some parameterized class, but you don't know the type arguments. Existential types are similar to wildcard types in Java.
Arrays in Gypsum
In most languages (like C or Java), arrays are primitives that stand on their own. You can build other data structures like array lists and hash maps out of them. In Gypsum, array elements can be integrated into any class. The normal class members come first in memory, then the array elements follow immediately.
Pattern matching in Gypsum
You might think of pattern matching as a switch statement on steroids. You examine a value using several patterns, then execute an expression based on which of the patterns successfully matched.
Importing symbols in Gypsum
The import statement is one of several new language features I added to Gypsum this summer. Just like the import statement in Java, it makes definitions from another package available in the scope containing the import statement. Unlike Java, multiple definitions can be imported in the same statement. Definitions can also be renamed.
Memory management in CodeSwitch
CodeSwitch has its own garbage collected heap, which is used not only for objects allocated by interpreted code, but also for most internal data structures. In this article, I'll describe how the heap is built, how the garbage collector works, and how it tracks pointers to the heap from C++ code.
CodeSwitch bytecode and interpretation
The interpreter is essentially a loop with a big switch-statement. In each iteration, it reads one instruction, switches on the opcode, branches to the appropriate case, then executes some code for that instruction.
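A minimal sketch of that loop shape, in Python rather than C++. The opcodes (`PUSH`, `ADD`, `MUL`, `HALT`) and the `(opcode, operand)` instruction format are invented for illustration and are not CodeSwitch's actual bytecode.

```python
# Hypothetical opcodes, for illustration only.
PUSH, ADD, MUL, HALT = range(4)

def run(code):
    """Execute a list of (opcode, operand) pairs on a value stack."""
    stack = []
    pc = 0  # index of the next instruction to read
    while True:
        # Read one instruction, then branch on its opcode.
        opcode, operand = code[pc]
        pc += 1
        if opcode == PUSH:
            stack.append(operand)
        elif opcode == ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == MUL:
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif opcode == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode: {opcode}")

# Computes (2 + 3) * 4.
program = [(PUSH, 2), (PUSH, 3), (ADD, None), (PUSH, 4), (MUL, None), (HALT, None)]
result = run(program)
print(result)
```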
Package loading in CodeSwitch
CodeSwitch manages code in chunks called packages. Each package is stored in a separate file. A package has a name, a version, and a list of dependencies (other packages it depends on). Each dependency has a name, a minimum version, and a maximum version (both versions are optional).
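One way to picture the dependency check, as a sketch under stated assumptions: versions are modeled as tuples of integers, both bounds are treated as inclusive, and the names `Dependency` and `accepts` are hypothetical. None of this is taken from CodeSwitch's real loader.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumption: a version is a tuple of integers, e.g. (1, 2) for "1.2".
Version = Tuple[int, ...]

@dataclass
class Dependency:
    name: str
    min_version: Optional[Version] = None  # None means no lower bound
    max_version: Optional[Version] = None  # None means no upper bound

    def accepts(self, name: str, version: Version) -> bool:
        """Report whether a package with this name and version satisfies the dependency."""
        if name != self.name:
            return False
        # Assumption: both bounds are inclusive.
        if self.min_version is not None and version < self.min_version:
            return False
        if self.max_version is not None and version > self.max_version:
            return False
        return True

dep = Dependency("std.io", min_version=(1, 2))
print(dep.accepts("std.io", (1, 3)))
```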
How CodeSwitch got its name
Code switching is a linguistic term for when a person speaks in one language, then switches to another language mid-sentence. I want programmers to be able to do that with code.
Packages in Gypsum and CodeSwitch
Packages are named bundles of related code. They make code easier to understand and distribute. Each package is compiled into a single file, and has a unique name, a version, and a list of dependencies.
Thoughts on automated testing
I'm writing this article to organize my thoughts on the subject of testing so that I can better understand the obstacles which make it more difficult, and hopefully eliminate them in the future.
Type parameter bounds and variance
Type parameter bounds and variance provide a huge amount of flexibility and precision in the Gypsum type system. They let you handle many cases where you would normally have to fall back to casting and run-time type checking.
A weird problem in the Scala type system
I've been trying to formalize the type system in Gypsum. There are two operations in particular that I want to put on a sound theoretical foundation.
Gypsum now has type parameters!
Type parameters are also known as generics in other languages. They enable parametric polymorphism, providing abstraction over types for functions and classes.
Structure of the Gypsum compiler (part 3)
In this article, I discuss closure conversion, class flattening, CodeSwitch bytecode, semantic analysis, and serialization.
Structure of the Gypsum compiler (part 2)
In this article, I discuss the Gypsum intermediate representation, declaration analysis, inheritance analysis, and type analysis.
Structure of the Gypsum compiler (part 1)
Gypsum is an experimental language, so the compiler is designed to be very flexible, easy to change and extend. The nice thing about side projects is that you can spend some extra time making sure the code is clean and elegant. You don't have to take on any technical debt to meet deadlines.
Gypsum is a new compiled, statically-typed, object-oriented programming language. When the compiler is more complete, it will be functional as well. Its syntax is inspired by Python and Scala.
Pinky and the god: Avoiding Emacs-induced RSI
A story about creating and open sourcing a minor mode to help prevent "Emacs pinky"
Emacs: Run git-blame on the current line
This function runs git-blame on the current line and prints the short commit id, author, and commit date into the minibuffer.
The Strahler number
I was browsing Wikipedia and came across an article on the Strahler number, which measures the branching of a tree. The number originally came from hydrology and was used to describe systems of rivers and streams.
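For reference, the usual definition is easy to sketch: a leaf has Strahler number 1; an internal node takes the largest number among its children, plus one if that largest number occurs in two or more children. The nested-list tree representation below is an assumption made for this example.

```python
def strahler(tree):
    """Compute the Strahler number of a tree.

    Assumed representation: a node is a list of its child subtrees,
    so a leaf is an empty list.
    """
    if not tree:
        return 1
    numbers = sorted((strahler(child) for child in tree), reverse=True)
    top = numbers[0]
    # The number only increases when the maximum is shared by at least two children.
    if len(numbers) > 1 and numbers[1] == top:
        return top + 1
    return top

leaf = []
balanced = [[leaf, leaf], [leaf, leaf]]   # two branches of equal order
lopsided = [[leaf, leaf], leaf]           # one branch dominates
print(strahler(balanced), strahler(lopsided))
```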
How to build a parser by hand
Building a hand-written parser is actually not much harder than using a tool. You can easily build something simple, efficient, and flexible, but perhaps not that elegant. I'll show how I built a parser for a simple template language I use to generate HTML for this blog.
New and improved blog software
If you're a regular reader, you might notice a few changes around here! I just launched a modernization of my blogging software.
Bit rot is real!
A debugging war story about bad memory.
A tour of V8: Garbage Collection
In this article, I reveal the secrets of V8's garbage collector! Tagged pointers, generational collection, incremental marking, and parallel sweeping are demystified!
A tour of V8: Crankshaft, the optimizing compiler
Crankshaft is V8's optimizing compiler. Once unoptimized code has been running for a while, V8 identifies hot functions and recompiles them with Crankshaft, greatly improving performance.
A tour of V8: object representation
New Android app: Snowflakes live wallpaper
Today, I launched my first Android app. It's a live wallpaper which renders falling snowflakes using OpenGL ES 2.0.
A tour of V8: full compiler
Water simulation demo in WebGL
A fully interactive water simulation demo in WebGL and GLSL using a custom built framework.
Interviewing from the other side of the table
Being the interviewer rather than the interviewee is still new to me. I've come to the conclusion that interviewing potential engineers is a technical skill that needs to be learned and refined like any other.
A tale of two benchmarks
Good benchmarks are extremely useful tools for software developers. They tell you the weaknesses of your hardware and software, so you know what to optimize. Customers also care about them a lot, and higher benchmark scores mean more sales.
Polymorphic Inline Caches explained
Polymorphic inline caches are a way to optimize function calls in dynamic languages
A simple interpreter from scratch in Python (part 4)
We will build the last component of the interpreter, the evaluator, by modelling the program environment and by coding how each statement and expression should be executed.
A simple interpreter from scratch in Python (part 3)
We create an abstract syntax tree for our toy language and write a parser using our parser combinator library.
A simple interpreter from scratch in Python (part 2)
In this article, we will write a small parser combinator library. The combinator library is language agnostic, so you could use it to write a parser for any language.
A simple interpreter from scratch in Python (part 1)
In the first part of this series, we build the lexer, the first component of an interpreter for our toy language, IMP.
Emacs etags: a quick introduction
Emacs etags lets you quickly locate a definition by its name anywhere in your code base.
Parsing key=value pairs in bash
Here's a neat text processing trick for Bash. Let's say you have a text file in which you want to replace several words with new words. The words and their replacements are supplied to you as key=value pairs. How can you parse the pairs?
Water simulation in GLSL
A quick demo of reflective water with waves using GLSL in OpenGL
Processing large files in Scala
Assorted tips for quickly processing very large files in Scala programs