Writing Bazel rules: data and runfiles
Bazel has a neat feature that can simplify a lot of work with tests and executables: the ability to make data files available at run-time using data attributes. You may have seen these in rules like this:
cc_library(
    name = "server_lib",
    srcs = ["server.cc"],
    data = ["private.key"],
)
When a file is listed in a data attribute (or something that behaves like a data attribute), Bazel makes that file available at run-time to executables started with bazel run. This is useful for all kinds of things such as plugins, configuration files, certificates, keys, and resources.
In this article, we'll add data attributes to the go_library and go_binary rules in rules_go_simple, the set of rules we've been working on. We'll be working on the v3 branch. This won't take long: we only need to add a few lines of code for each rule.
Data and runfiles
We can start by adding a data attribute to our rules. Here's the new declaration for go_library. The attribute in go_binary is similar.
go_library = rule(
    implementation = _go_library_impl,
    attrs = {
        "srcs": attr.label_list(
            allow_files = [".go"],
            doc = "Source files to compile",
        ),
        "deps": attr.label_list(
            providers = [GoLibraryInfo],
            doc = "Direct dependencies of the library",
        ),
        "data": attr.label_list(
            allow_files = True,
            doc = "Data files available to binaries using this library",
        ),
        "importpath": attr.string(
            mandatory = True,
            doc = "Name by which the library may be imported",
        ),
        "_stdlib": attr.label(
            allow_single_file = True,
            default = "//internal:stdlib",
            doc = "Hidden dependency on the Go standard library",
        ),
    },
    doc = "Compiles a Go archive from Go sources and dependencies",
)
Bazel tracks files that should be made available at run-time using runfiles objects. You can create new runfiles objects with ctx.runfiles. That function is very similar to depset in that it takes a list of direct files and a depset of transitive files.
To collect runfiles for go_library, we'll first add a helper function, since we'll need to do the same thing in go_binary.
def _collect_runfiles(ctx, direct_files, indirect_targets):
    """Builds a runfiles object for the current target.

    Args:
        ctx: analysis context.
        direct_files: list of Files to include directly.
        indirect_targets: list of Targets to gather transitive runfiles from.

    Returns:
        A runfiles object containing direct_files and runfiles from
        indirect_targets. The output files of indirect_targets won't be
        included unless they are also part of those targets' runfiles.
    """
    return ctx.runfiles(
        files = direct_files,
        transitive_files = depset(
            transitive = [target[DefaultInfo].default_runfiles.files for target in indirect_targets],
        ),
    )
We call it in go_library like this:
runfiles = _collect_runfiles(
    ctx,
    direct_files = ctx.files.data,
    indirect_targets = ctx.attr.data + ctx.attr.deps,
)
In order to actually make files available, you need to populate the runfiles field in the DefaultInfo provider returned by your rule. Recall that DefaultInfo is used to list the output files and executables produced by a rule.
Here's how we create the DefaultInfo provider for go_library. Again, go_binary is similar.
return [
    DefaultInfo(
        files = depset([archive]),
        runfiles = runfiles,
    ),
    ...
]
If for some reason you need to combine runfiles objects from dependencies, you can do so with runfiles.merge or runfiles.merge_all. This might happen if you return a runfiles object as part of a provider. In simple cases though, you may be able to just use the default_runfiles field of the DefaultInfo provider as we did here.
Avoid using the collect_data and collect_default arguments of the ctx.runfiles function. Setting these flags causes ctx.runfiles to automatically collect files from certain attributes, but this functionality is no longer recommended. An earlier version of this tutorial used those flags before they were deprecated.
Testing data and runfiles
We test our new support for runfiles with a simple binary that depends on a library. Both binary and library have data files, and the test verifies they are present.
sh_test(
    name = "data_test",
    srcs = ["data_test.sh"],
    args = ["$(location :list_data_bin)"],
    data = [":list_data_bin"],
)

go_binary(
    name = "list_data_bin",
    srcs = ["list_data_bin.go"],
    deps = [":list_data_lib"],
    data = ["foo.txt"],
)

go_library(
    name = "list_data_lib",
    srcs = ["list_data_lib.go"],
    data = ["bar.txt"],
    importpath = "rules_go_simple/tests/list_data_lib",
)
You can run this test with bazel test //tests/...
Accessing runfiles, cross-platform
You should use a library to find and open runfiles, especially in tests. When Bazel executes a binary on Unix platforms, it creates a tree of symbolic links to the binary's runfiles. If your code only ever runs on Unix platforms, you can open a runfile by opening its relative path within the workspace.
Bazel handles runfiles differently on Windows, so this is not generally safe. Before about 2019, Windows required you to be an administrator to create symbolic links. Even now, you need to enable "Developer Mode" to create symbolic links, which requires administrator access. Operations on small files, including symbolic links, are also remarkably slow on Windows. To avoid these problems, Bazel uses another strategy: it creates a manifest file that maps logical runfile paths to absolute paths for the real files in Bazel's cache. The manifest is pointed to by the RUNFILES_MANIFEST_FILE environment variable, which is set for tests. Nothing points to the manifest file for binaries run with bazel run, but you should find a file named MANIFEST in the initial working directory of the binary. (Incidentally, you can override this and force symbolic links with the Bazel flag --enable_runfiles.)
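To make the manifest idea concrete, here is a small Python sketch that looks up a logical path. It assumes the simple manifest format of one entry per line, with the logical path and the real path separated by a single space, and it ignores any escaping rules that newer Bazel versions may apply. The function names are hypothetical, and the sketch is for illustration only.

```python
import os


def load_manifest(manifest_path):
    """Parses 'logical_path real_path' lines into a dict (assumed format)."""
    mapping = {}
    with open(manifest_path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\r\n")
            if not line:
                continue
            # Split at the first space only, so real paths may contain spaces.
            logical, _, real = line.partition(" ")
            mapping[logical] = real
    return mapping


def resolve_runfile(logical_path, environ=None):
    """Returns the real path for logical_path, or None if it is unknown."""
    environ = os.environ if environ is None else environ
    manifest_path = environ.get("RUNFILES_MANIFEST_FILE")
    if not manifest_path:
        # No manifest advertised; the caller must find the file another way.
        return None
    return load_manifest(manifest_path).get(logical_path)
```

A caller would pass a logical path such as the workspace-prefixed path of a data file and open whatever real path comes back.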
It is best to use a library if one is available for your language, rather than parsing the manifest file on your own. Bazel's runfile semantics change over time (they changed again with Bzlmod), and using a library will keep your code working. Most languages provide such a library:
- C++: @bazel_tools//tools/cpp/runfiles
- Bash: @bazel_tools//tools/bash/runfiles
- Java: @bazel_tools//tools/java/runfiles
- Python: @rules_python//python/runfiles
- Go: @io_bazel_rules_go//go/runfiles
- Rust: @rules_rust//tools/runfiles
You can learn more by watching "Runfiles and where to find them", Fabian Meumertzheim's excellent talk from BazelCon 2022.