Writing Bazel rules: library rule, depsets, providers

Published on 2018-08-15
Edited on 2025-09-08
Tagged: bazel go


This article is part of the series "Writing Bazel rules".

In the last article, we built a go_binary rule that compiled and linked a Go executable from a list of sources. This time, we'll define a go_library rule that can compile a Go package that other libraries and binaries can depend on.

This article focuses on rules that communicate with each other to build a dependency graph that can be used by a linker (or a linker-like action). All of the code is from rules_go_simple on the v2 branch.

Once again, you don't need to know Go to understand this. I'm just using Go as an example because that's what I like to work in.

Background

Before we jump in, we need to cover three important concepts: structs, providers, and depsets. They are data structures used to pass information between rules, and we'll need them to gather information about dependencies.

Structs

A struct value is a dictionary of key-value pairs, kind of like an object in Python or JavaScript. Although the struct is a basic data structure, it's provided by Bazel and is not technically part of the Starlark language. You can create a struct value by calling the struct function:

my_value = struct(
    foo = 12,
    bar = 34,
)

You can access fields in the struct the same way you would access fields in an object in Python.

print(my_value.foo + my_value.bar)

You can use the dir function to get a list of field names of a struct. getattr and hasattr work the way you'd expect, but you can't modify or delete attributes after they're set because struct values are immutable. You can serialize a struct (or any other JSON-compatible value) with json.encode and parse JSON with json.decode. Note that json.decode returns JSON objects as dicts, not structs.
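If you don't have a Bazel workspace handy, the behavior described above can be approximated in plain Python. This is only a sketch: the struct function below is a hypothetical stand-in built on namedtuple, not Bazel's implementation, and Python raises an exception on assignment where Bazel would fail the build.

```python
from collections import namedtuple

def struct(**fields):
    """Hypothetical stand-in for Bazel's struct: an immutable record."""
    return namedtuple("struct", fields.keys())(**fields)

my_value = struct(foo = 12, bar = 34)
total = my_value.foo + my_value.bar

# hasattr and getattr behave the way the article describes.
has_foo = hasattr(my_value, "foo")
bar = getattr(my_value, "bar")
missing = getattr(my_value, "baz", None)

# Assigning to a field fails, mirroring struct immutability.
try:
    my_value.foo = 99
    mutated = True
except AttributeError:
    mutated = False
```

Immutability matters later in this article: values stored in a depset need to be hashable, and a record that cannot change after creation is safe to hash.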

Providers

A provider is a named struct type that conveys information about a rule. A rule implementation function returns provider structs when it's evaluated. A provider can be read by any rule that depends on a rule that returns it. In the last article, our go_binary rule returned a DefaultInfo provider (one of the built-in providers). In this article we'll define a GoLibraryInfo provider that carries metadata about our libraries.

You can define a new provider by calling the provider function.

MyProvider = provider(
    doc = "My custom provider",
    fields = {
        "foo": "A foo value",
        "bar": "A bar value",
    },
)

You can create a provider value just like a struct value:

my_provider = MyProvider(foo = 12, bar = 34)
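To make the "named struct type" idea concrete, here is a rough plain-Python model. Everything in it is an assumption for illustration: the provider function is a hypothetical imitation, and a target is modeled as a dict keyed by provider type so that the subscript syntax target[MyProvider] reads like it does in Starlark.

```python
def provider(doc = "", fields = None):
    """Hypothetical model of Bazel's provider(): returns a constructor.

    When fields is given, the constructor rejects undeclared field names.
    """
    allowed = None if fields == None else set(fields)

    class _Instance:
        def __init__(self, **kwargs):
            if allowed != None:
                unknown = set(kwargs) - allowed
                if unknown:
                    raise ValueError("unexpected fields: " + ", ".join(sorted(unknown)))
            self.__dict__.update(kwargs)

    return _Instance

MyProvider = provider(
    doc = "My custom provider",
    fields = {
        "foo": "A foo value",
        "bar": "A bar value",
    },
)

my_provider = MyProvider(foo = 12, bar = 34)

# Model a target as a mapping from provider type to provider value.
target = {MyProvider: my_provider}
foo = target[MyProvider].foo

# Undeclared fields are rejected. (Bazel reports an error instead of
# raising a catchable exception.)
try:
    MyProvider(foo = 1, qux = 2)
    rejected = False
except ValueError:
    rejected = True
```

The provider symbol itself acts as the lookup key, which is why a rule that wants to be compatible with another rule needs access to the same provider definition.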

Depsets

Bazel provides a special purpose data structure called a depset. Like any set, a depset is a set of unique values. Depsets distinguish themselves by being fast to merge and by having a well-defined iteration order.

Depsets are typically used to accumulate information like sources or header files over large dependency graphs. A dependency graph may contain hundreds of thousands of nodes, so it's important that all depset operations run in linear time and space. In this article, we'll use depsets to accumulate information about Go dependencies like import paths and compiled file names. The linker will be able to use this information without requiring go_binary to explicitly list all transitive dependencies.

A depset comprises a list of direct elements, a list of transitive depset children, and an iteration order.

[Figure: diagram of a depset]

Constructing a depset is fast because it just involves creating an object with direct and transitive lists. This takes O(D+T) time where D is the number of elements in the direct list and T is the number of transitive children. Bazel deduplicates elements of both lists when constructing sets. Iterating a depset or converting it to a list takes O(n) time where n is the number of elements in the set and all of its children, including duplicates.
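The cost model above is easier to see in a toy version. The class below is a simplified plain-Python sketch, not Bazel's implementation: the name Depset, the postorder-style traversal, and the choice to deduplicate only while flattening are all assumptions made for illustration.

```python
class Depset:
    """Hypothetical, simplified model of a Bazel depset."""

    def __init__(self, direct = (), transitive = ()):
        # Construction only stores references to the two lists.
        # Nothing is flattened here, so the cost is O(D + T).
        self.direct = tuple(direct)
        self.transitive = tuple(transitive)

    def to_list(self):
        # Flatten by visiting children first, then direct elements
        # (similar to Bazel's "postorder"), skipping repeated values.
        seen = set()
        visited = set()
        out = []

        def visit(node):
            if id(node) in visited:
                return
            visited.add(id(node))
            for child in node.transitive:
                visit(child)
            for value in node.direct:
                if value not in seen:
                    seen.add(value)
                    out.append(value)

        visit(self)
        return out

# A diamond: foo depends on bar and baz, and bar also depends on baz.
baz = Depset(direct = ["baz"])
bar = Depset(direct = ["bar"], transitive = [baz])
foo = Depset(direct = ["foo"], transitive = [bar, baz])

flattened = foo.to_list()  # ["baz", "bar", "foo"]
```

Merging never copies elements. The expensive step is to_list, which is why it's best to flatten a depset as late as possible, ideally only in the rule that actually needs the full list.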

Defining go_library

The GoLibraryInfo provider

OK, the theory is out of the way. Let's get to the code.

First, we define a new provider. GoLibraryInfo carries information about each library and its dependencies. We define it in a new file, providers.bzl.

GoLibraryInfo = provider(
    doc = "Contains information about a Go library",
    fields = {
        "info": """A struct containing information about this library.
        Has the following fields:
            importpath: Name by which the library may be imported.
            archive: The .a file compiled from the library's sources.
        """,
        "deps": "A depset of info structs for this library's dependencies",
    },
)

doc sets a documentation string for Stardoc. fields lists the allowed fields in the provider, along with their documentation strings. You can set the init argument to a custom constructor function, which may be useful if you want to perform more advanced validation or initialization. See provider for details.

The go_library rule

Now we can define the go_library rule. Here's the new rule declaration in rules.bzl.

go_library = rule(
    implementation = _go_library_impl,
    attrs = {
        "srcs": attr.label_list(
            allow_files = [".go"],
            doc = "Source files to compile",
        ),
        "deps": attr.label_list(
            providers = [GoLibraryInfo],
            doc = "Direct dependencies of the library",
        ),
        "importpath": attr.string(
            mandatory = True,
            doc = "Name by which the library may be imported",
        ),
        "_stdlib": attr.label(
            allow_single_file = True,
            default = "//internal:stdlib",
            doc = "Hidden dependency on the Go standard library",
        ),
    },
    doc = "Compiles a Go archive from Go sources and dependencies",
)

There are four attributes here. srcs is a list of labels that refer to source .go files or rules that generate .go files. deps is a list of labels that refer to other Go library rules. They don't have to be go_library specifically, but they have to return GoLibraryInfo providers to be compatible. importpath is just a string. We'll use that to generate the importcfg files that the compiler and linker use to map import strings to compiled .a files. And finally, _stdlib is a hidden dependency (name starts with _) on the compiled standard library, same as in go_binary.

Here's the implementation of the rule.

def _go_library_impl(ctx):
    # Declare an output file for the library package and compile it from srcs.
    archive = ctx.actions.declare_file("{name}.a".format(name = ctx.label.name))
    go_compile(
        ctx,
        srcs = ctx.files.srcs,
        importpath = ctx.attr.importpath,
        stdlib = ctx.file._stdlib,
        deps = [dep[GoLibraryInfo] for dep in ctx.attr.deps],
        out = archive,
    )

    # Return the output file and metadata about the library.
    return [
        DefaultInfo(files = depset([archive])),
        GoLibraryInfo(
            info = struct(
                importpath = ctx.attr.importpath,
                archive = archive,
            ),
            deps = depset(
                direct = [dep[GoLibraryInfo].info for dep in ctx.attr.deps],
                transitive = [dep[GoLibraryInfo].deps for dep in ctx.attr.deps],
            ),
        ),
    ]

First, we use ctx.actions.declare_file to declare our compiled output file, then go_compile to declare the compile command. We did the same thing in go_binary.

Look at the different ways we access our attributes here.

ctx.files.srcs gives us a list of all files from the srcs attribute. This list may not be the same length as the list of labels passed to srcs: for example, one of those labels might be a filegroup containing any number of files.

ctx.attr.importpath gives us a string value, since importpath is a string attribute.

ctx.file._stdlib gives us a single File (actually a directory) for _stdlib, which is allowed because it was declared with allow_single_file = True.

ctx.attr.deps gives us a list of Target objects. The subscript expression dep[GoLibraryInfo] gives us the GoLibraryInfo provider returned by that target.

Finally, we return a list of two providers, DefaultInfo and GoLibraryInfo. The GoLibraryInfo.info field is a struct with information about the library being compiled. It's important that this struct is immutable and is relatively small, since it will be added to a depset (the GoLibraryInfo.deps field of other libraries) and hashed.

go_compile and go_link

There was an important change to go_compile and go_link. Did you catch it? Both now accept a deps argument, a list of GoLibraryInfo providers for direct dependencies. In both cases, we use this list to generate an importcfg file. The Go compiler and linker use importcfg files to map import strings to compiled .a files.

For go_compile, the importcfg file only needs to list direct dependencies, so we generate its content like this:

dep_importcfg_text = "\n".join([
    "packagefile {importpath}={filepath}".format(
        importpath = dep.info.importpath,
        filepath = dep.info.archive.path,
    )
    for dep in deps
])
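To see what this produces, here is the same comprehension run in plain Python against made-up data. The namedtuple types, the import paths, and the archive paths are all hypothetical; real archive paths would come from Bazel's output tree.

```python
from collections import namedtuple

# Hypothetical stand-ins for Bazel's File, the info struct, and GoLibraryInfo.
File = namedtuple("File", ["path"])
Info = namedtuple("Info", ["importpath", "archive"])
GoLibraryInfo = namedtuple("GoLibraryInfo", ["info", "deps"])

deps = [
    GoLibraryInfo(
        info = Info("example.com/bar", File("bazel-out/bin/bar.a")),
        deps = (),
    ),
    GoLibraryInfo(
        info = Info("example.com/baz", File("bazel-out/bin/baz.a")),
        deps = (),
    ),
]

# Same expression as in go_compile: one line per direct dependency.
dep_importcfg_text = "\n".join([
    "packagefile {importpath}={filepath}".format(
        importpath = dep.info.importpath,
        filepath = dep.info.archive.path,
    )
    for dep in deps
])

print(dep_importcfg_text)
# packagefile example.com/bar=bazel-out/bin/bar.a
# packagefile example.com/baz=bazel-out/bin/baz.a
```

Only the info field is read here. The deps field is ignored at compile time because the compiler only needs archives for packages that are imported directly.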

For go_link, the importcfg file needs to contain all transitive dependencies, so we create a depset first, then iterate over that. This is why we needed GoLibraryInfo.deps.

deps_set = depset(
    direct = [d.info for d in deps],
    transitive = [d.deps for d in deps],
)
dep_importcfg_text = "\n".join([
    "packagefile {importpath}={filepath}".format(
        importpath = dep.importpath,
        filepath = dep.archive.path,
    )
    for dep in deps_set.to_list()
])

I'll skip over the actual bash script this is injected into since it's ugly and not relevant to writing rules for other languages. We'll clean it up in a later article.

go_binary has one other change: it now includes a deps attribute and calls go_link with GoLibraryInfo providers from those targets. I won't reproduce the entire source here because it's a very small change from last time.

Exposing a public interface

All our definitions are in an internal directory, and we need to make them available for other people to use. So we load them in def.bzl, which just contains our public definitions. We expose both go_library and GoLibraryInfo. The latter will be needed by anyone who wants to implement compatible rules.

load(
    "//internal:rules.bzl",
    _go_binary = "go_binary",
    _go_library = "go_library",
)
load(
    "//internal:providers.bzl",
    _GoLibraryInfo = "GoLibraryInfo",
)

go_binary = _go_binary
go_library = _go_library
GoLibraryInfo = _GoLibraryInfo

Testing the go_library rule

We'll test go_library the same way we tested go_binary before: with an sh_test that runs a go_binary, this time one that depends on our new libraries:

sh_test(
    name = "bin_with_libs_test",
    srcs = ["bin_with_libs_test.sh"],
    args = ["$(rootpath :bin_with_libs)"],
    data = [":bin_with_libs"],
)

go_binary(
    name = "bin_with_libs",
    srcs = ["bin_with_libs.go"],
    deps = [":foo"],
)

go_library(
    name = "foo",
    srcs = ["foo.go"],
    importpath = "rules_go_simple/tests/foo",
    deps = [
        ":bar",
        ":baz",
    ],
)

go_library(
    name = "bar",
    srcs = ["bar.go"],
    importpath = "rules_go_simple/tests/bar",
    deps = [":baz"],
)

go_library(
    name = "baz",
    srcs = ["baz.go"],
    importpath = "rules_go_simple/tests/baz",
)

You can test this out with bazel test //tests/....
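To get a feel for what the linker sees for this graph, here is a plain-Python sketch that mimics how go_library accumulates GoLibraryInfo.deps and how go_link flattens it. It is a model under stated assumptions, not the rule code: the Depset class and the go_library function are hypothetical stand-ins, archives are plain strings rather than File objects, and the iteration order is an assumed postorder-style traversal.

```python
from collections import namedtuple

Info = namedtuple("Info", ["importpath", "archive"])
GoLibraryInfo = namedtuple("GoLibraryInfo", ["info", "deps"])

class Depset:
    """Hypothetical, simplified depset: children first, then direct."""

    def __init__(self, direct = (), transitive = ()):
        self.direct = tuple(direct)
        self.transitive = tuple(transitive)

    def to_list(self):
        seen, visited, out = set(), set(), []

        def visit(node):
            if id(node) in visited:
                return
            visited.add(id(node))
            for child in node.transitive:
                visit(child)
            for value in node.direct:
                if value not in seen:
                    seen.add(value)
                    out.append(value)

        visit(self)
        return out

def go_library(name, deps = ()):
    # Mirrors the shape of the GoLibraryInfo returned by _go_library_impl.
    return GoLibraryInfo(
        info = Info(
            importpath = "rules_go_simple/tests/" + name,
            archive = name + ".a",  # assumed file name, as a plain string
        ),
        deps = Depset(
            direct = [d.info for d in deps],
            transitive = [d.deps for d in deps],
        ),
    )

baz = go_library("baz")
bar = go_library("bar", deps = [baz])
foo = go_library("foo", deps = [bar, baz])

# What go_link would build for a binary with deps = [":foo"].
link_set = Depset(direct = [foo.info], transitive = [foo.deps])
link_archives = [info.archive for info in link_set.to_list()]
# link_archives == ["baz.a", "bar.a", "foo.a"]
```

Although baz is reachable through both foo and bar, it appears once in the flattened list, and the binary only had to name foo directly.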