Writing Bazel rules: data and runfiles

Published on 2020-02-01
Edited on 2025-09-08
Tagged: bazel go

This article is part of the series "Writing Bazel rules".

Writing Bazel rules: simple binary rule
Writing Bazel rules: library rule, depsets, providers
Writing Bazel rules: data and runfiles
Writing Bazel rules: moving logic to execution
Writing Bazel rules: repository rules
Writing Bazel rules: platforms and toolchains
Writing Bazel rules: module extensions

Bazel has a neat feature that can simplify a lot of work with tests and executables: the ability to make data files available at run-time using data attributes. You may have seen these in rules like this:

cc_library(
    name = "server_lib",
    srcs = ["server.cc"],
    data = ["private.key"],
)

When a file is listed in a data attribute (or something that behaves like a data attribute), Bazel makes that file available at run-time to executables started with bazel run and to tests started with bazel test. This is useful for things like plugins, configuration files, certificates, keys, and resources.

In this article, we'll add data attributes to the go_library and go_binary rules in rules_go_simple, the set of rules we've been working on. We'll be working on the v3 branch. This won't take long: we only need to add a few lines of code for each rule.

Data and runfiles

We can start by adding a data attribute to our rules. Here's the new declaration for go_library. The attribute in go_binary is similar.

go_library = rule(
    implementation = _go_library_impl,
    attrs = {
        "srcs": attr.label_list(
            allow_files = [".go"],
            doc = "Source files to compile",
        ),
        "deps": attr.label_list(
            providers = [GoLibraryInfo],
            doc = "Direct dependencies of the library",
        ),
        "data": attr.label_list(
            allow_files = True,
            doc = "Data files available to binaries using this library",
        ),
        "importpath": attr.string(
            mandatory = True,
            doc = "Name by which the library may be imported",
        ),
        "_stdlib": attr.label(
            allow_single_file = True,
            default = "//internal:stdlib",
            doc = "Hidden dependency on the Go standard library",
        ),
    },
    doc = "Compiles a Go archive from Go sources and dependencies",
)

Bazel tracks files that should be made available at run-time using runfiles objects. You can create new runfiles objects with ctx.runfiles. That function works much like depset: it takes a list of direct files (files) and a depset of transitive files (transitive_files).

To collect runfiles for go_library, we'll first add a helper function, since we'll need to do the same thing in go_binary.

def _collect_runfiles(ctx, direct_files, indirect_targets):
    """Builds a runfiles object for the current target.

    Args:
        ctx: analysis context.
        direct_files: list of Files to include directly.
        indirect_targets: list of Targets to gather transitive runfiles from.
    Returns:
        A runfiles object containing direct_files and the runfiles of
        indirect_targets. The default outputs of indirect_targets are not
        included unless those targets also list them in their own runfiles.
    """
    return ctx.runfiles(
        files = direct_files,
        transitive_files = depset(
            transitive = [target[DefaultInfo].default_runfiles.files for target in indirect_targets],
        ),
    )

We call it in go_library like this:

    runfiles = _collect_runfiles(
        ctx,
        direct_files = ctx.files.data,
        indirect_targets = ctx.attr.data + ctx.attr.deps,
    )

In order to actually make files available, you need to populate the runfiles field in the DefaultInfo provider returned by your rule. Recall that DefaultInfo is used to list the output files and executables produced by a rule.

Here's how we create the DefaultInfo provider for go_library. Again, go_binary is similar.

return [
    DefaultInfo(
        files = depset([archive]),
        runfiles = runfiles,
    ),
    ...
]

If for some reason you need to combine runfiles objects from dependencies, you can do so with runfiles.merge or runfiles.merge_all. This might happen if you return a runfiles object as part of a provider. In simple cases though, you may be able to just use the default_runfiles field of the DefaultInfo provider as we did here.

Avoid using the collect_data and collect_default arguments of the ctx.runfiles function. Setting these flags causes ctx.runfiles to automatically collect files from certain attributes, but this functionality is no longer recommended. An earlier version of this tutorial used those flags before they were deprecated.

Testing data and runfiles

We test our new support for runfiles with a simple binary that depends on a library. Both binary and library have data files, and the test verifies they are present.

sh_test(
    name = "data_test",
    srcs = ["data_test.sh"],
    args = ["$(location :list_data_bin)"],
    data = [":list_data_bin"],
)

go_binary(
    name = "list_data_bin",
    srcs = ["list_data_bin.go"],
    deps = [":list_data_lib"],
    data = ["foo.txt"],
)

go_library(
    name = "list_data_lib",
    srcs = ["list_data_lib.go"],
    data = ["bar.txt"],
    importpath = "rules_go_simple/tests/list_data_lib",
)

You can run this test with the following command:

bazel test //tests/...

Accessing runfiles, cross-platform

You should use a library to find and open runfiles, especially in tests. When Bazel executes a binary on Unix platforms, it creates a tree of symbolic links to the binary's runfiles. If your code only ever runs on Unix platforms, you can open a runfile by opening its relative path within the workspace.

Bazel handles runfiles differently on Windows, so this is not generally safe. Before about 2019, Windows required you to be an administrator to create symbolic links. Even now, you need to enable "Developer Mode" to create them, which requires administrator access. Operations on small files, including symbolic links, are also remarkably slow on Windows.

To avoid these problems, Bazel uses another strategy: it creates a manifest file that maps logical runfile paths to absolute paths for the real files in Bazel's cache. The RUNFILES_MANIFEST_FILE environment variable points to the manifest and is set for tests. Nothing points to the manifest file for binaries run with bazel run, but you should find a file named MANIFEST in the initial working directory of the binary. (Incidentally, you can override this and force symbolic links with the Bazel flag --enable_runfiles.)
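To make the manifest mapping concrete, here is a rough Python sketch of a lookup. It is for illustration only and makes several assumptions: each manifest line holds a logical path and a real path separated by the first space, no escaping of special characters is handled, and the helper names load_manifest and find_runfile are hypothetical rather than part of any Bazel library.

```python
import os


def load_manifest(manifest_path):
    """Parses a manifest file into a dict of logical path -> real path.

    Assumes one entry per line, split at the first space. An entry with no
    real path maps to an empty string.
    """
    entries = {}
    with open(manifest_path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            logical, _, real = line.partition(" ")
            entries[logical] = real
    return entries


def find_runfile(logical_path, env=None):
    """Returns the real path for logical_path, or None if it is unknown.

    Only consults the manifest named by RUNFILES_MANIFEST_FILE; it does not
    fall back to a symbolic link tree or a MANIFEST file on disk.
    """
    env = os.environ if env is None else env
    manifest_path = env.get("RUNFILES_MANIFEST_FILE")
    if not manifest_path:
        return None
    return load_manifest(manifest_path).get(logical_path)
```

A test could call find_runfile with a logical path such as the workspace name followed by the file's path within the workspace, then open the returned path.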

It is best to use a library if one is available for your language, rather than parsing the manifest file on your own. Bazel's runfiles semantics have changed over time (they changed again with Bzlmod), and using a library will keep your code working. Most languages provide such a library:

C++: @bazel_tools//tools/cpp/runfiles
Bash: @bazel_tools//tools/bash/runfiles
Java: @bazel_tools//tools/java/runfiles
Python: @rules_python//python/runfiles
Go: @io_bazel_rules_go//go/runfiles
Rust: @rules_rust//tools/runfiles

You can learn more by watching "Runfiles and where to find them", Fabian Meumertzheim's excellent talk from BazelCon 2022.