How CodeSwitch got its name

Published on 2015-07-10
Tagged: codeswitch

I've written several articles over the last year about my experimental programming language, Gypsum, and its compiler. So far, I haven't said much about CodeSwitch, the virtual machine that executes Gypsum code. Today, I'll start going into more detail about CodeSwitch with a series of short articles. Before I get into the technical stuff, though, I want to talk about what I want CodeSwitch to be in the future.

Code switching is a linguistic term for when a person speaks in one language, then switches to another language mid-sentence. I want programmers to be able to do that with code. There are a lot of different programming languages that are well-suited for different purposes. I don't think it's actually possible to design a language that is better than every other language at every task. It's better to use the right tool for the job, and it should be easy to build multi-language projects that exercise each language's strengths.

It's frustrating how hard that is today, though. To give an example, I work on performance optimizations for a suite of Android apps that are written in a mix of Java, JavaScript, and C++. Android provides some interoperability between Java and C++ with JNI, but JNI is clunky, difficult to use, and error-prone; it's really easy to leak references to garbage-collected objects. Android doesn't have any built-in support for executing JavaScript, so we bundle V8, which only has a C++ API. To handle calls from Java to JavaScript, we generate Java classes with stub methods that call into generated C++ code, which in turn calls JavaScript methods using the V8 APIs. None of this is free from a performance perspective either. The generated code takes up space in memory and in our APK. Cross-language calls are more expensive than normal calls, and they certainly can't be inlined by the optimizer. The two garbage collectors aren't aware of each other, so we have to be careful not to introduce reference cycles across the two heaps, which keep dead objects from being freed. We also can't use the same tools to debug, test, and analyze code in all three languages.
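The garbage collection problem described above can be made concrete with a toy model. This is only a sketch under simplifying assumptions, not how Android's runtime or V8 actually work: each `Heap` is a tiny mark-and-sweep collector that treats any object referenced from the other heap as a root, and the names `Heap`, `Obj`, `hold`, and `demo` are all hypothetical.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []     # references to objects in the same heap
        self.handles = []  # (heap, obj) pairs: objects this one keeps alive in another heap


class Heap:
    """A toy mark-and-sweep heap that knows nothing about the other heap."""

    def __init__(self, name):
        self.name = name
        self.objects = []
        self.roots = []    # local roots (stack, globals)
        self.pinned = []   # objects kept alive because a foreign object holds a handle

    def alloc(self, name):
        obj = Obj(name)
        self.objects.append(obj)
        return obj

    def collect(self):
        # Mark everything reachable from local roots and from pinned objects.
        marked = set()
        worklist = list(self.roots) + list(self.pinned)
        while worklist:
            obj = worklist.pop()
            if id(obj) in marked:
                continue
            marked.add(id(obj))
            worklist.extend(obj.refs)
        # Sweep: unmarked objects die and release the handles they were holding.
        dead = [o for o in self.objects if id(o) not in marked]
        self.objects = [o for o in self.objects if id(o) in marked]
        for obj in dead:
            for heap, target in obj.handles:
                heap.pinned.remove(target)
        return [o.name for o in dead]


def hold(owner, target, target_heap):
    """Make `owner` keep `target` (which lives in `target_heap`) alive."""
    owner.handles.append((target_heap, target))
    target_heap.pinned.append(target)


def demo():
    java, js = Heap("java"), Heap("js")

    # One-way reference: a -> b. Neither object is rooted, so both get freed.
    a, b = java.alloc("a"), js.alloc("b")
    hold(a, b, js)
    one_way = (java.collect(), js.collect())

    # Cross-heap cycle: c -> d -> c. Each collector sees its object as pinned.
    c, d = java.alloc("c"), js.alloc("d")
    hold(c, d, js)
    hold(d, c, java)
    cycle = (java.collect(), js.collect())

    return one_way, cycle
```

In this model, the one-way case frees both objects, while the cycle survives every collection even though nothing in the program can reach it. A single collector that could see both heaps would find the cycle unreachable and free it.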

I wasn't around at the time, but I'm sure a lot of work went into setting this up. Every developer who creates a multi-language project will go through the same pain. And this is not an isolated problem; anyone who wants to share code between back-end, front-end, and mobile is familiar with it.

I think these problems can be solved with a portable virtual machine that can understand multiple languages. I don't assume CodeSwitch is going to be that VM, but I think it's an interesting research project. Currently, CodeSwitch only understands its own bytecode, but it is designed to make it straightforward to import bytecode generated for other VMs, and compilers could be written to target CodeSwitch bytecode directly. While we're at it, Gypsum could export packages for other VMs and environments, effectively acting as a universal translator. For example, you could write an app in a mix of Gypsum and Lua, then export it as Dalvik bytecode for Android, Objective-C for iOS, and JavaScript for the web.

Optimization is another problem CodeSwitch could solve. CodeSwitch will have a JIT compiler and optimizer in the future. Languages that are traditionally pretty slow, like Python and Ruby, could target CodeSwitch and get a big speed boost.

This is the future vision anyway. Today, CodeSwitch is the simplest possible thing that works: a package loader, a memory manager, and an unoptimized interpreter. It has a long way to go before it gets to that vision.

As always, you can find the source code to Gypsum and CodeSwitch on GitHub.