Writing Bazel rules: platforms and toolchains
One of Bazel's biggest strengths is its ability to isolate a build from the host system. This enables reproducible builds and remote execution, which lets Bazel scale to huge projects. This isolation isn't completely automatic; rules must cooperate with Bazel to ensure the correct tools are used when the host, execution, and target platforms may all be different.
In the previous article, we defined a repository rule that let us download and verify a Go toolchain. This time, we'll configure our simple set of rules to use that toolchain. After this, our rules will be almost completely independent from the host system. Our users will be able to check out and build a Go project without needing to install Go themselves.
Concepts
Before we get to the actual code, let's go over some platform and toolchain jargon. There's a lot this time. You may also want to read through the official documentation on Platforms and Toolchains.
A platform is a description of where software can run, defined with the platform rule. The host platform is where Bazel itself runs. The execution platform is where actions run. Normally, this is the same as the host platform, but if you're using remote execution, the execution platform may be different. The target platform is where the software you're building should run. By default, this is also the same as the host platform, but if you're cross-compiling, it can be different.
A platform is described by a list of constraint values, defined with the constraint_value rule. A constraint value is a fact about a platform, for example, the CPU is x86_64, or the operating system is Linux. There are a number of constraint values defined in the platforms module. Bazel itself depends on this module, but if you use it, you should declare your own dependency with bazel_dep in MODULE.bazel so you get a predictable minimum version. You can list the constraints in that module with bazel query "@platforms//...". You can also define your own.
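As a concrete sketch, a custom platform is just a platform target listing constraint values. The constraint labels below come from the platforms module; the package and target names are hypothetical:

```starlark
# Hypothetical BUILD.bazel fragment defining a platform for 64-bit ARM Linux.
# The constraint labels come from the @platforms module.
platform(
    name = "linux_arm64",
    constraint_values = [
        "@platforms//os:linux",
        "@platforms//cpu:arm64",
    ],
)
```

With suitably constrained toolchains registered, you could then build for this platform by passing a flag like --platforms=//platforms:linux_arm64.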
A constraint setting is a category of constraint values, at most one of which may be true for any platform. A constraint setting is defined with the constraint_setting rule. @platforms//os:os and @platforms//cpu:cpu are the two main settings to worry about, but again, you can define your own.
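To make "define your own" concrete, here's a hypothetical custom setting; the names are invented for illustration and don't appear in rules_go_simple:

```starlark
# Hypothetical BUILD.bazel fragment: a custom constraint_setting with two
# mutually exclusive values. A platform may list at most one of them.
constraint_setting(name = "go_compiler")

constraint_value(
    name = "gc",
    constraint_setting = ":go_compiler",
)

constraint_value(
    name = "tinygo",
    constraint_setting = ":go_compiler",
)
```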
A toolchain is a special target defined with the toolchain rule that associates a toolchain implementation with a list of constraint values for both the target and execution platforms.

A toolchain type is a target defined with the toolchain_type rule, which is a name that identifies a kind of toolchain.

A toolchain implementation is a target that represents the actual toolchain by listing the files that are part of the toolchain (for example, the compiler and standard library) and code needed to use the toolchain. A toolchain implementation must return a ToolchainInfo provider.
So that's a lot to take in. How does it all fit together?
Anyone who's defining a toolchain needs to declare a toolchain_type target. This is just a unique symbol.
The actual toolchains are defined with toolchain targets that point to implementations. We'll define a go_toolchain rule for our implementation, but you can use any rule that returns a ToolchainInfo provider.
A rule can request a toolchain using its type by setting the toolchains parameter in its rule declaration. The rule implementation can then access the toolchain through ctx.toolchains.
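Putting those two pieces together, a minimal sketch of a rule that requests our toolchain type might look like this (the rule name is hypothetical; the toolchain type label is the one we define later in this article):

```starlark
def _my_rule_impl(ctx):
    # Look up the resolved toolchain by its type label.
    toolchain = ctx.toolchains["@rules_go_simple//:toolchain_type"]
    # ... use the ToolchainInfo fields to create actions ...
    return []

my_rule = rule(
    implementation = _my_rule_impl,
    # Declaring the toolchain type here is what makes the lookup above work.
    toolchains = ["@rules_go_simple//:toolchain_type"],
)
```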
Users register toolchains they'd like to use by calling the register_toolchains function in their MODULE.bazel file or by passing the --extra_toolchains flag on the command line.
Finally, when Bazel begins a build, it checks the constraints for the execution and target platforms. It then selects a suitable set of toolchains that are compatible with those constraints. Bazel will provide the ToolchainInfo objects of those toolchains to the rules that request them.
…
Got all that? Actually I'm not sure I do either. It's an elegant system, but it's difficult to grasp. If you want to see how Bazel selects or rejects registered toolchains, use the --toolchain_resolution_debug flag.
The key insight is that Bazel's toolchains are a dynamic dependency injection system. If you've ever used something like Dagger or Guice, Bazel's toolchains are conceptually similar. A toolchain type is like an interface. A toolchain is like a static method with a @Provides annotation. A rule that requires a toolchain is like a constructor with an @Inject annotation. The system automatically finds a suitable implementation for every injected interface, based on the constraint values in the execution and target platforms.
Migrating rules to toolchains
Let's start using toolchains in rules_go_simple. We're now on the v5 branch.
First, we'll declare a toolchain_type. Rules can request this with the label @rules_go_simple//:toolchain_type.
toolchain_type(
    name = "toolchain_type",
    visibility = ["//visibility:public"],
)
Since a toolchain_type is basically an interface, we should document what can be done with that interface. Starlark is a dynamically typed language, and there's no place to write down required method or field names. I declared a dummy provider in providers.bzl with some documentation, but you could write this in a README or wherever makes sense for your project.
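A documentation-only provider along these lines would record the expected interface. This is a sketch: the build file comments later refer to a GoToolchain provider, but the field descriptions here are my paraphrase, not the actual providers.bzl contents:

```starlark
# providers.bzl (sketch): GoToolchain is never returned by any rule; it
# exists solely to document the fields of the ToolchainInfo that our
# toolchain implementation returns.
GoToolchain = provider(
    doc = "Documents the public interface of a Go toolchain.",
    fields = {
        "compile": "Function that registers an action compiling one Go package.",
        "link": "Function that registers an action linking a Go executable.",
        "build_test": "Function that registers actions building a test executable.",
        "internal": "Private files and metadata; for use by the toolchain only.",
    },
)
```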
Next, we'll create our toolchain implementation rule, go_toolchain.
def _go_toolchain_impl(ctx):
    # Find important files and paths.
    go_cmd = find_go_cmd(ctx.files.tools)
    env = {"GOROOT": paths.dirname(paths.dirname(go_cmd.path))}

    # Return a ToolchainInfo provider. This is the object that rules get
    # when they ask for the toolchain.
    return [platform_common.ToolchainInfo(
        # Functions that generate actions. Rules may call these.
        # This is the public interface of the toolchain.
        compile = go_compile,
        link = go_link,
        build_test = go_build_test,

        # Internal data. Contents may change without notice.
        # Think of these like private fields in a class. Actions may use these
        # (they are methods of the class) but rules may not (they are clients).
        internal = struct(
            go_cmd = go_cmd,
            env = env,
            builder = ctx.executable.builder,
            tools = ctx.files.tools,
            stdlib = ctx.file.stdlib,
        ),
    )]

go_toolchain = rule(
    implementation = _go_toolchain_impl,
    attrs = {
        "builder": attr.label(
            mandatory = True,
            executable = True,
            cfg = "exec",
            doc = "Executable that performs most actions",
        ),
        "tools": attr.label_list(
            mandatory = True,
            doc = "Compiler, linker, and other executables from the Go distribution",
        ),
        "stdlib": attr.label(
            mandatory = True,
            allow_single_file = True,
            cfg = "target",
            doc = "Package files for the standard library compiled by go_stdlib",
        ),
    },
    doc = "Gathers functions and file lists needed for a Go toolchain",
)
go_toolchain is a normal rule that returns a ToolchainInfo provider. When rules request the toolchain, they will get one of these structs. There are no mandatory fields, so you can put anything in here. I included three "methods" (which are actually just functions): compile, link, and build_test. These correspond with the actions our rules need to create, so rules will call these instead of creating actions directly. I also included an internal struct field, which includes private files and metadata. Our methods may access the internal struct, but clients of the toolchain should not, since these values can change without notice.
Next, we'll declare go_toolchain and toolchain targets in BUILD.bazel.go_download.tpl. This file is a template that gets expanded into a build file for the go_download repository rule. See the previous article for details.
# toolchain_impl gathers information about the Go toolchain.
# See the GoToolchain provider.
go_toolchain(
    name = "toolchain_impl",
    builder = ":builder",
    stdlib = ":stdlib",
    tools = [":tools"],
)

# toolchain is a Bazel toolchain that expresses execution and target
# constraints for toolchain_impl. This target should be registered by
# calling register_toolchains in a MODULE.bazel file.
toolchain(
    name = "toolchain",
    exec_compatible_with = [
        {exec_constraints},
    ],
    target_compatible_with = [
        {target_constraints},
    ],
    toolchain = ":toolchain_impl",
    toolchain_type = "@rules_go_simple//:toolchain_type",
)
We use the goos and goarch attributes to set the {exec_constraints} and {target_constraints} template parameters in the go_download rule. See repo.bzl.
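The mapping itself can be simple dictionary lookups. Here's a hypothetical sketch of how goos/goarch values might be translated into the constraint-label lists substituted into the template; the helper name, dictionaries, and formatting are my assumptions, not the actual repo.bzl code:

```python
# Hypothetical sketch: translate goos/goarch into @platforms constraint
# labels for substitution into {exec_constraints} and {target_constraints}.
_GOOS_CONSTRAINTS = {
    "darwin": "@platforms//os:macos",
    "linux": "@platforms//os:linux",
    "windows": "@platforms//os:windows",
}

_GOARCH_CONSTRAINTS = {
    "amd64": "@platforms//cpu:x86_64",
    "arm64": "@platforms//cpu:arm64",
}

def format_constraints(goos, goarch):
    """Returns a quoted, comma-separated constraint list for the template."""
    constraints = [_GOOS_CONSTRAINTS[goos], _GOARCH_CONSTRAINTS[goarch]]
    return ", ".join(['"%s"' % c for c in constraints])
```

The repository rule would then expand the template with a substitution like {exec_constraints} mapped to format_constraints(goos, goarch).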
To complete the toolchain implementation, we'll modify our go_compile, go_link, and go_build_test functions. They can obtain the toolchain using ctx.toolchains. Here's go_compile after this change:
def go_compile(ctx, *, srcs, importpath, deps, out):
    """Compiles a single Go package from sources.

    Args:
        ctx: analysis context.
        srcs: list of source Files to be compiled.
        importpath: the path other libraries may use to import this package.
        deps: list of GoLibraryInfo objects for direct dependencies.
        out: output .a File.
    """
    toolchain = ctx.toolchains["@rules_go_simple//:toolchain_type"]

    args = ctx.actions.args()
    args.add("compile")
    args.add("-stdlib", toolchain.internal.stdlib.path)
    dep_infos = [d.info for d in deps]
    args.add_all(dep_infos, before_each = "-arc", map_each = _format_arc)
    if importpath:
        args.add("-p", importpath)
    args.add("-o", out)
    args.add_all(srcs)

    inputs = (srcs +
              [dep.info.archive for dep in deps] +
              [toolchain.internal.stdlib] +
              toolchain.internal.tools)
    ctx.actions.run(
        outputs = [out],
        inputs = inputs,
        executable = toolchain.internal.builder,
        arguments = [args],
        env = toolchain.internal.env,
        mnemonic = "GoCompile",
    )
Finally, we'll update our rules to request the toolchain and call these functions. Here's go_library after this change.
def _go_library_impl(ctx):
    # Load the toolchain.
    toolchain = ctx.toolchains["@rules_go_simple//:toolchain_type"]

    # Declare an output file for the library package and compile it from srcs.
    archive = ctx.actions.declare_file("{name}.a".format(name = ctx.label.name))
    toolchain.compile(
        ctx,
        srcs = ctx.files.srcs,
        importpath = ctx.attr.importpath,
        deps = [dep[GoLibraryInfo] for dep in ctx.attr.deps],
        out = archive,
    )

    # Return the output file and metadata about the library.
    runfiles = _collect_runfiles(
        ctx,
        direct_files = ctx.files.data,
        indirect_targets = ctx.attr.data + ctx.attr.deps,
    )
    return [
        DefaultInfo(
            files = depset([archive]),
            runfiles = runfiles,
        ),
        GoLibraryInfo(
            info = struct(
                importpath = ctx.attr.importpath,
                archive = archive,
            ),
            deps = depset(
                direct = [dep[GoLibraryInfo].info for dep in ctx.attr.deps],
                transitive = [dep[GoLibraryInfo].deps for dep in ctx.attr.deps],
            ),
        ),
    ]

go_library = rule(
    implementation = _go_library_impl,
    attrs = {
        "srcs": attr.label_list(
            allow_files = [".go"],
            doc = "Source files to compile",
        ),
        "deps": attr.label_list(
            providers = [GoLibraryInfo],
            doc = "Direct dependencies of the library",
        ),
        "data": attr.label_list(
            allow_files = True,
            doc = "Data files available to binaries using this library",
        ),
        "importpath": attr.string(
            mandatory = True,
            doc = "Name by which the library may be imported",
        ),
    },
    doc = "Compiles a Go archive from Go sources and dependencies",
    toolchains = ["@rules_go_simple//:toolchain_type"],
)
Registering toolchains
Users must register toolchains so Bazel can make use of them. This happens in MODULE.bazel. Before doing this, we'll use the go_download repository rule to define two repos: one for linux/amd64 and one for darwin/arm64.
go_download = use_repo_rule("//:go.bzl", "go_download")

go_download(
    name = "go_darwin_arm64",
    goarch = "arm64",
    goos = "darwin",
    sha256 = "544932844156d8172f7a28f77f2ac9c15a23046698b6243f633b0a0b00c0749c",
    urls = ["https://go.dev/dl/go1.25.0.darwin-arm64.tar.gz"],
)

go_download(
    name = "go_linux_amd64",
    goarch = "amd64",
    goos = "linux",
    sha256 = "2852af0cb20a13139b3448992e69b868e50ed0f8a1e5940ee1de9e19a123b613",
    urls = ["https://go.dev/dl/go1.25.0.linux-amd64.tar.gz"],
)

register_toolchains(
    "@go_darwin_arm64//:toolchain",
    "@go_linux_amd64//:toolchain",
)
When Bazel starts, it considers the target platforms (set with --platforms), the execution platforms (set with register_execution_platforms or --extra_execution_platforms), and all registered toolchains (set with register_toolchains or --extra_toolchains). Bazel attempts to select a toolchain for each execution platform and target platform that satisfies all platform constraints. If multiple toolchains are compatible, Bazel picks the first registered toolchain. Modules using register_toolchains are considered in breadth-first pre-order, so toolchains registered in the main module take priority over others. You can get Bazel to show its work using the --toolchain_resolution_debug flag, which takes a regular expression matching the toolchain type.
NOTE: Registering toolchains declared in repository rules as we're doing above has a major disadvantage: Bazel needs to evaluate the repository rule in order to read the toolchain definition. This means we need to download both the macOS and Linux archives, even though we only want to use one. We'll fix this in the next article when we get to module extensions and toolchainization.
Using toolchains
Let's check whether this works by building //tests:hello, a minimal "hello world" go_binary:
$ bazel build --subcommands --toolchain_resolution_debug=. //tests:hello
Starting local Bazel server (8.3.1) and connecting to it...
INFO: Invocation ID: b6fa5c3d-a819-4252-9790-860e5b47e8db
INFO: ToolchainResolution: Target platform @@platforms//host:host: Selected execution platform @@platforms//host:host,
INFO: ToolchainResolution: Performing resolution of //:toolchain_type for target platform @@platforms//host:host
ToolchainResolution: Toolchain @@+_repo_rules+go_darwin_arm64//:toolchain_impl is compatible with target platform, searching for execution platforms:
ToolchainResolution: Compatible execution platform @@platforms//host:host
ToolchainResolution: All execution platforms have been assigned a //:toolchain_type toolchain, stopping
ToolchainResolution: Recap of selected //:toolchain_type toolchains for target platform @@platforms//host:host:
ToolchainResolution: Selected @@+_repo_rules+go_darwin_arm64//:toolchain_impl to run on execution platform @@platforms//host:host
INFO: ToolchainResolution: Target platform @@platforms//host:host: Selected execution platform @@platforms//host:host, type //:toolchain_type -> toolchain @@+_repo_rules+go_darwin_arm64//:toolchain_impl
INFO: ToolchainResolution: Target platform @@platforms//host:host: Selected execution platform @@platforms//host:host,
INFO: ToolchainResolution: Target platform @@platforms//host:host: Selected execution platform @@platforms//host:host,
INFO: Analyzed target //tests:hello (65 packages loaded, 7253 targets configured).
INFO: Found 1 target...
Target //tests:hello up-to-date:
bazel-bin/tests/hello
INFO: Elapsed time: 3.056s, Critical Path: 0.03s
INFO: 1 process: 9 action cache hit, 1 internal.
INFO: Build completed successfully, 1 total action
Since we're not cross-compiling, Bazel only considered the platform @@platforms//host:host. Let's see what constraints that has.
$ bazel query --output=build @@platforms//host
platform(
    name = "host",
    constraint_values = ["@platforms//cpu:aarch64", "@platforms//os:osx"],
)
And what constraints did we say that toolchain was compatible with?
$ bazel query --output=build @go_darwin_arm64//:toolchain
toolchain(
    name = "toolchain",
    toolchain_type = "//:toolchain_type",
    exec_compatible_with = ["@platforms//os:macos", "@platforms//cpu:aarch64"],
    target_compatible_with = ["@platforms//os:macos", "@platforms//cpu:aarch64"],
    toolchain = "@go_darwin_arm64//:toolchain_impl",
)
Almost the same. @platforms//os:macos is an alias pointing to @platforms//os:osx, so this toolchain satisfies all constraints from the platform.
Conclusion
Platforms and toolchains are a mechanism for decoupling a set of rules from the tools they depend on. This is most immediately useful for isolating the build from the machine it runs on. It also provides flexibility for users: it lets developers (not necessarily the original rule authors) write their own rules compatible with existing toolchains and their own toolchains compatible with existing rules. In our case, someone could create a toolchain for gccgo or TinyGo, and it would work with rules_go_simple as long as it satisfies the interface we documented for our toolchain_type. Someone else could write a go_proto_library rule that builds generated code with the same compiler as go_library.
Ultimately, the toolchain system separates what is being built (rules) from how to build it (toolchain). This means when you change one component, you don't need to rewrite all the build files in your repository. Change is isolated, which is important in any system that needs to scale.