Lambdas in Gypsum

Published on 2017-04-27
Tagged: compilers gypsum

I've finally added a feature to Gypsum that I've wanted for a long time: lambdas.

Lambdas are in just about every language now, but in case you aren't familiar with them, they are anonymous functions declared as expressions. They're useful for defining short, compact functions that can be passed to other functions as callbacks. They're especially important for functional programming (map, filter, reduce, and friends).

let x = 2
let addx = lambda (y: i64) x + y
let z = addx(x, 2)  // z is 4

Lambda expressions consist of the keyword lambda, followed by an optional parameter list ((y: i64) above), followed by a body expression (x + y above).

Gypsum now also has arrow types, which are syntactic sugar for Function trait types. You can now write Foo -> Bar, which desugars to Function1[Bar, Foo]. (Foo, Bar) -> Baz desugars to Function2[Baz, Foo, Bar].

Function traits are regular traits in the standard library, so they have some limitations. Traits have a fixed number of type parameters (for now), so there is a maximum number of parameter types that can be listed here (currently 9). Type parameters only work with object types, so you also can't write arrow types involving primitives types. I hope to remove both of these limitations in the future.

Example: filtering items in a list

Here's a short example that shows lambdas in action. The filter function below filters a List according to a given predicate, pred, which has the arrow type T -> Boolean. In the main function below, the argument passed in for pred is a lambda expression.

import std.Boolean, I64, List

def filter[static T](pred: <mark>T -> Booleanmark>, list: List[T]): List[T] =
  let filtered = List[T]()
  var i = 0i32
  while (i < list.length)
    let e = list.get(i)
    if (pred(e).value)
      filtered.push(e)
    i += 1i32
  filtered

def main =
  let list = List[I64]()
  var i = 0
  while (i < 10)
    list.push(I64.of(i))
    i += 1

  let evens = filter[I64]((<mark>lambda (x: I64) Boolean.of(x.value % 2 == 0)mark>), list)
  print(evens.to-string + "\n")
  // prints "[0, 2, 4, 6, 8]"

How lambdas work

When a lambda expression is evaluated, it produces a closure value that references the function being called and the environment in which the closure was defined. The environment is what lets closures access variables in enclosing scopes.

To make this concrete, here's a small function that returns a closure that concatenates a prefix with another string. The closure captures the prefix from its defining scope.

def add-prefix(prefix: String): String -> String =
  lambda (s: String) prefix + s

The Gypsum compiler implements all of this with classes. For each scope that contains a captured variable, Gypsum creates a context class which is instantiated when the scope is entered. Captured variables are transformed into fields of the context class. This allows captured variables to outlive their defining scope.

For each lambda expression, the compiler creates a synthetic closure class, which is instantiated when whenever the lambda expression is evaluated. The lambda function becomes a method of this class. If the return type and parameter types are all objects, the class will implement a Function trait. The closure class has fields for the contexts it captures, which are set by the constructor.

After these transformations, add-prefix is desugared into something resembling the code below.

class AddPrefixContext
  var prefix: String

class AddPrefixLambda(ctx: AddPrefixContext)
  def call(s: String) = ctx.prefix + s

def add-prefix(prefix: String) =
  let ctx = AddPrefixContext)
  ctx.prefix = prefix
  return AddPrefixLambda(ctx)

As an implementation note, Gypsum already had most of the machinery for contexts, closures, and capturing since inner functions are implemented the same way as lambdas. The only real difference between inner functions and lambdas is that lambdas don't have names. There isn't any syntax to get the closure value of an inner function (yet), but the value does exist.

Calling lambdas

Call expressions in Gypsum now allow any expression as the "callee". Previously, the compiler only allowed variable and property expressions that named real functions. Now, if a callee expression does not directly name a function to call (for example, because it is a lambda or a local variable holding a closure), the compiler examines the type to decide what to do with it. If the callee is an object with a call method, the compiler will call that. This means any class can be made callable by adding a call method. If the callee is an instance of a closure class for an inner function, the compiler will call the one method of that class. These calls require some additional handling because inner functions are allowed to have type parameters.

For the curious, the call logic is in DefinitionTypeVisitor.handleValueCall.

Problems and next steps

There are still several limitations and caveats for lambdas and closures in Gypsum. I'd like to fix these some day, but I need to implement some more advanced features first.

In order to use arrow types, the return type and parameter types need to be object types: no primitives allowed. This is because arrow types are syntactic sugar for Function traits, which are parameterized using static type parameters. You might have noticed Java has the same limitation. Java generics are the same as Gypsum static type parameters. In the future, I'm planning to add template type parameters, which will work more like those in C# or C++. This will allow primitive arrow types and lots of other good things.

Another issue with Function traits is that we need a separate trait for each arity: we have a trait for 0-parameter functions, a trait for 1-parameter functions, etc. The largest is Function9, which means if you have a 10-parameter lambda, you can't write a type for it. I'm planning to add variadic type parameters some time in the future. This will let us have just one Function trait.

One last problem: I don't like the way lambda expressions look. I'd like to get rid of the lambda keyword and just have an arrow from parameters to body, like JavaScript does.

def add-prefix(prefix: String) = (s: String) -> prefix + s

This is difficult to parse, since you won't know if you should be parsing expressions or patterns after the '('. JavaScript seems to be fine with it though, so it should be possible.

Despite these caveats, lambdas are here, and they're useful! I'll probably start adding some functional methods like map, filter, and reduce to containers to take advantage of these.