An Overview of the Starlark language

Starlark is a small programming language, designed as a simple dialect of Python and intended primarily for embedded use in applications. Some people might say it’s a bit like Lua with Python syntax, but I think there are many interesting bits to discuss. The language is now open-source and used by many applications and companies. As I led the design and implementation of Starlark, I’d like to write a bit more about it.

What it looks like

Here’s a code example:

def fizz_buzz(n):
  """Print Fizz Buzz numbers from 1 to n."""
  for i in range(1, n + 1):
    s = ""
    if i % 3 == 0:
      s += "Fizz"
    if i % 5 == 0:
      s += "Buzz"
    print(s if s else i)

fizz_buzz(20)

I told you it looks like Python. Actually, this is valid Python 3 code as well as Starlark code.

Like Python, it is a dynamically typed language with high-level data types, first-class functions with lexical scope, and garbage collection. But it’s much simpler than Python and most Python features are not supported.

In my experience on a large codebase with tens of thousands of developers, most people are comfortable using the language, without having to learn it first.

Principles

The language was designed with the following principles in mind.

  • Deterministic evaluation. If you execute the same code twice, you will get the same results. In Python, code is not deterministic if it relies on the output of functions like “id” or “hash”, iterates over a hashtable, has race conditions, or measures execution time.
  • Hermetic execution. During the execution, the code cannot access the file system, network, or even look at the current date. You may be able to execute untrusted code.
  • Parallel evaluation. In Python, import order matters, as each import is evaluated successively and may have side effects. In Starlark, modules can safely be loaded in parallel. The data shared across multiple modules becomes immutable, to ensure the code is always thread safe.
  • Simplicity. The language is small and its specification is short. The number of concepts is relatively limited, so new users should be able to read and write code quickly. We removed multiple pitfalls that are present in Python.
  • Focus on tooling. As the code base grows, it becomes important to be able to analyze and edit the Starlark code automatically. When a language is too dynamic, it can be hard to refactor and migrate the existing code. In Starlark, it’s easier to do static analysis and have stronger guarantees than in Python.
  • Python-like. Python is a very popular language and its syntax is well known among developers. Starlark was designed to feel familiar to reduce the learning curve and make the semantics more obvious to users.
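
To make the "parallel evaluation" principle more concrete, here is a rough sketch in plain Python of the idea that data becomes immutable once a module has finished loading. The names `freeze` and `load_module` are hypothetical, and this only approximates the behavior: a real Starlark interpreter freezes values in place rather than converting them to other types.

```python
from concurrent.futures import ThreadPoolExecutor
from types import MappingProxyType

def freeze(value):
    """Return a deeply read-only copy of nested lists and dicts."""
    if isinstance(value, dict):
        return MappingProxyType({k: freeze(v) for k, v in value.items()})
    if isinstance(value, (list, tuple)):
        return tuple(freeze(v) for v in value)
    return value

def load_module(name):
    """Hypothetical loader: build a module's globals, then freeze them."""
    module_globals = {"name": name, "deps": ["//lib:" + name]}
    return freeze(module_globals)

names = ["a", "b", "c"]

# Modules are loaded concurrently; since nothing they export can be
# mutated afterwards, the loading order cannot affect the result.
with ThreadPoolExecutor() as pool:
    modules = list(pool.map(load_module, names))  # same order as `names`

try:
    modules[0]["name"] = "changed"  # attempt to mutate shared data
    mutation_allowed = True
except TypeError:
    mutation_allowed = False
```

Because every exported value is read-only, any number of threads can read `modules` without locks.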

We get lots of guarantees with Starlark, and that’s what chiefly sets it apart from other popular languages. Some people may believe it’s easy to avoid non-determinism or other issues, but I assure you they happen frequently in large codebases.

Differences with Python

Starlark is much simpler than Python. The library is minimalist: there are about 30 functions, a few types, and methods on lists, strings and dicts. Of course, when embedding Starlark, the host application can provide additional functions.
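
As a loose analogy for how a host application exposes extra functions, here is a sketch in plain Python. It is not the API of any Starlark implementation: `run_config` and `service` are hypothetical names, and Python's `exec` is not a sandbox, so this only mimics the shape of "the script sees what the host predeclares".

```python
def run_config(source, predeclared):
    """Run `source` with only the host-provided names visible.

    Returns the globals the script defined itself.
    NOTE: an analogy only; exec() gives no real isolation.
    """
    env = {"__builtins__": {}}  # hide Python's own built-ins
    env.update(predeclared)
    exec(source, env)
    return {
        key: value
        for key, value in env.items()
        if key != "__builtins__" and key not in predeclared
    }

# Host side: a function the application chooses to expose (hypothetical).
registered = []

def service(name, port):
    registered.append((name, port))

# Script side: a configuration file written against the host's functions.
CONFIG = """
service(name = "web", port = 8080)
service(name = "db", port = 5432)
DEFAULT_REGION = "eu"
"""

script_globals = run_config(CONFIG, {"service": service})
print(registered)      # [('web', 8080), ('db', 5432)]
print(script_globals)  # {'DEFAULT_REGION': 'eu'}
```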

Some differences with Python: There are no exceptions, no “while”, no “yield”, no “is”, no reflection. Global variables cannot be reassigned. Iteration order is always specified. Implicit string concatenation is not allowed. This list is not exhaustive. For more information on the topic, see the Language Design page.
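
To illustrate a couple of these differences, here is a small sketch written in the subset shared by Python 3 and Starlark, as far as I can tell. The function `collatz_steps` and its `limit` parameter are just an example of mine; the point is the bounded `for` loop standing in for `while`, and the explicit `+` for string concatenation.

```python
# Explicit "+" rather than implicit concatenation ("a" "b"),
# which Starlark does not allow.
GREETING = "Hello, " + "Starlark"

def collatz_steps(n, limit = 1000):
    """Count Collatz steps from n down to 1; return -1 if limit is exceeded."""
    steps = 0
    # There is no `while`: loop over a bounded range and exit early.
    for _ in range(limit):
        if n == 1:
            return steps
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps if n == 1 else -1

print(GREETING)
print(collatz_steps(6))  # 8
```

A side effect of the bounded loop is that the function always terminates, which fits the goal of predictable evaluation.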

Applications

People often use Starlark as an extension language (similar to Lua), or as a configuration language. Starlark is available as a library, to be used from other applications. There are currently three implementations of Starlark, in different languages:

  • In Java (part of the Bazel codebase). This was the original implementation. It is used for Bazel and a couple of Google projects, but it doesn’t have a stable interface.
  • In Rust: https://github.com/facebookexperimental/starlark-rust. It is maintained by Facebook. See the Starlark-Rust announcement. It is used at Facebook for their build system, Buck, as well as multiple other internal tools.
  • In Go: https://github.com/google/starlark-go/. It is used by many open-source projects and companies, such as IBM, Stripe (skycfg), Chromium (lucicfg), Cruise Automation (isopod), the Go debugger Delve, and many more.

Why did you create Starlark?

Historically, all the code in the Google codebase was built using Makefiles. As the codebase grew, people wrote Python scripts to generate the Makefiles, using a relatively declarative syntax. Around 2007, Bazel was created and it relied on the Python interpreter to understand the build scripts. After we faced scalability, performance and maintenance issues, we needed a language that provides much stronger guarantees. I started the language design and implementation in 2015. In 2017, we migrated all the Python build scripts inside Google to Starlark (hundreds of thousands of files). To help the migration, we decided to keep the language as similar as possible to Python.

If you plan to use Starlark in your application, I’d love to hear about it. If you face adoption blockers, I’d love to hear about those as well. I’ve just created a Starlark subreddit for news and discussions. I don’t know yet how useful it will be, but I’ll monitor it.
