Add a Nix flake #9023

Open
wants to merge 4 commits into
base: main

@sellout sellout commented Nov 14, 2024

Motivation

I manage all of my systems with Nix, so this grew out of me doing various work around Zebra the past few months. I currently merge this branch into whatever I’m working on (which is easy, because this basically only adds new files[1], so no conflicts).

I also saw @teor2345 had previously put up a Nix derivation in #1479, which roughly corresponds to nix/packages/zebra in this PR.

Solution

Here is a summary of what’s added. ${system} can be replaced with any of aarch64-darwin, aarch64-linux, x86_64-darwin, x86_64-linux.

├───apps
│   └───${system}
│       ├───default: app
│       ├───zebra-scanner: app
│       └───zebrad: app
├───checks
│   └───${system}
│       ├───audit: derivation 'crate-audit-0.0.0'
│       ├───clippy: derivation 'zebra-clippy-2.0.1'
│       ├───deny: derivation 'zebra-deny-2.0.1'
│       ├───doc: derivation 'zebra-doc-2.0.1'
│       └───fmt: derivation 'zebra-fmt-2.0.1'
├───darwinConfigurations
│   ├───aarch64-darwin-example: nix-darwin configuration
│   └───x86_64-darwin-example: nix-darwin configuration
├───darwinModules
│   ├───default: nix-darwin module
│   └───zebra: nix-darwin module
├───devShells
│   └───${system}
│       └───default: development environment 'nix-shell'
├───formatter
│   └───${system}: package 'alejandra-3.0.0'
├───homeConfigurations: unknown
│   └───${system}-example: Home Manager configuration
├───homeModules
│   ├───default: Home Manager module
│   └───zebra: Home Manager module
├───nixosConfigurations
│   ├───aarch64-linux-example: NixOS configuration
│   └───x86_64-linux-example: NixOS configuration
├───nixosModules
│   ├───default: NixOS module
│   └───zebra: NixOS module
├───overlays
│   └───default: Nixpkgs overlay
└───packages
    └───${system}
        ├───default: package 'zebra-2.0.1'
        ├───zebra: package 'zebra-2.0.1'
        └───zebra-deps: package 'zebra-deps-2.0.1'
  • packages are builds of the main content of this repo
  • checks are various tests (other than cargo test, which is covered by packages)
  • *Modules allow you to configure zebrad like this
  • *Configurations are examples of those configurations that are built as tests of much of the flake
  • devShells provide a sandboxed development environment with rustc, cargo, etc.
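
As a rough illustration of how the outputs above might be consumed from another flake, here is a minimal sketch. It assumes a local checkout of this branch (the `path:` URL is a placeholder), and the configuration name `example` is hypothetical. It only uses output names that appear in the tree above, and it does not set any module options, since those are not listed here. Boot loader and file system settings are omitted, so it is not a complete system configuration.

```nix
{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/release-24.11";
    # Assumption: a local clone of the branch from this PR.
    zebra.url = "path:/local/path/to/zebra/clone";
  };

  outputs = {nixpkgs, zebra, ...}: {
    # "example" is a hypothetical configuration name.
    nixosConfigurations.example = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [
        # The NixOS module exported by the flake (see nixosModules above).
        zebra.nixosModules.default
        # Install the package directly, independent of any module options.
        {
          environment.systemPackages = [zebra.packages.x86_64-linux.default];
        }
      ];
    };
  };
}
```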

Tests

Everything is built on aarch64-darwin, aarch64-linux, and x86_64-linux at garnix, which also runs the various checks (clippy, fmt, etc.), and it builds the example configurations which implicitly tests the overlays, modules, etc.

For these CI builds to run on not-my-fork, the ZcashFoundation org would need to get a garnix account, or would need to add some Nix-based GitHub workflow (ideally with some caching solution, which garnix handles automatically).

Follow-up Work

The current solution has everything in there, but I think the Nix/Rust tooling could be improved with respect to cache-friendliness. It currently uses crane, which condenses everything into only two packages: “zebra”, containing all the crates in this repo, and “zebra-deps”, containing all the dependency crates not in this repo. So if any part of Zebra changes, all of Zebra gets rebuilt, and if a dependency changes, all of Zebra and all deps get rebuilt. Having a separate package for each crate would minimize rebuilds, so a solution like cargo2nix or crate2nix is probably the way to go longer-term[2].
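
To make the two-package split concrete, here is a minimal sketch of the usual crane pattern. It is not the flake from this PR: the real build presumably needs extra native inputs and environment variables that are left out here, and the single hard-coded `system` is only for brevity.

```nix
{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/release-24.11";
    crane.url = "github:ipetkov/crane";
  };

  outputs = {nixpkgs, crane, ...}: let
    system = "x86_64-linux";
    pkgs = nixpkgs.legacyPackages.${system};
    craneLib = crane.mkLib pkgs;

    # Arguments shared by both derivations so that they agree on the source.
    commonArgs = {
      src = craneLib.cleanCargoSource ./.;
      strictDeps = true;
    };

    # All dependency crates in one derivation; reused until they change.
    zebra-deps = craneLib.buildDepsOnly commonArgs;

    # All workspace crates in one derivation; rebuilt on any source change.
    zebra = craneLib.buildPackage (commonArgs // {cargoArtifacts = zebra-deps;});
  in {
    packages.${system} = {
      inherit zebra zebra-deps;
      default = zebra;
    };
  };
}
```

Because `zebra` takes `zebra-deps` as its `cargoArtifacts`, a change to any dependency invalidates both derivations, which is the rebuild behavior described above.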

The derivation is built with ZEBRA_SKIP_NETWORK_TESTS, because of Nix sandboxing[3]. But even so, there are a number of failing tests that I’ve explicitly skipped. That file conditionalizes them so you can see in which contexts they fail. One thing I didn’t conditionally enable is tests that pass outside of a sandbox, because I think that makes the dev / CI divide confusing. E.g., a number of disabled tests can pass if __darwinAllowLocalNetworking is enabled, but that can only be done outside of a sandbox. Also interesting is that there are a number of tests that only fail when the elasticsearch feature is enabled (and only on macOS).

This PR doesn’t provide a default.nix or shell.nix (#1479 did, just under a different name), because I do everything with flakes, but it’s easy enough to expose them with flake-compat if that’s desired.
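
For reference, a `default.nix` exposed through flake-compat usually looks like the sketch below, which follows the pattern documented in the flake-compat README. It assumes the flake declares an input named `flake-compat`, which this PR does not necessarily do.

```nix
# default.nix: evaluates the flake for non-flake Nix commands.
(import
  (
    let
      lock = builtins.fromJSON (builtins.readFile ./flake.lock);
      # Assumption: flake.nix has an input called "flake-compat".
      nodeName = lock.nodes.root.inputs.flake-compat;
    in
      fetchTarball {
        url =
          lock.nodes.${nodeName}.locked.url
          or "https://github.com/edolstra/flake-compat/archive/${lock.nodes.${nodeName}.locked.rev}.tar.gz";
        sha256 = lock.nodes.${nodeName}.locked.narHash;
      }
  )
  {src = ./.;})
.defaultNix
```

A `shell.nix` would be the same file with `.shellNix` in place of `.defaultNix`.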

PR Author's Checklist

  • The PR name will make sense to users.
  • The PR provides a CHANGELOG summary.
  • The solution is tested.
  • The documentation is up to date.
  • The PR has a priority label.

PR Reviewer's Checklist

  • The PR Author's checklist is complete.
  • The PR resolves the issue.

Footnotes

  1. There are three minor changes to existing files:

    1. ignored the Nix build result in .gitignore,
    2. fixed a clippy complaint in disk_db.rs, and
    3. removed an invalid Alias from systemd/zebrad.service.
  2. They’re not being used yet because of a bug in Nix (NixOS/nix#4119, “[OS X] Derivation fails with sandbox”) that prevents packages with a lot of dependencies from building in a sandbox on macOS (which would cause failures on garnix CI, as it requires everything to be sandboxed).

  3. There are ways to enable network access in a sandbox, so that might allow all (or at least more) tests to be enabled.

@sellout sellout requested a review from a team as a code owner November 14, 2024 19:06
@sellout sellout requested review from arya2 and removed request for a team November 14, 2024 19:06
@github-actions github-actions bot added the C-trivial Category: A trivial change that is not worth mentioning in the CHANGELOG label Nov 14, 2024
@gustavovalverde gustavovalverde added A-devops Area: Pipelines, CI/CD and Dockerfiles C-feature Category: New features A-compatibility Area: Compatibility with other nodes or wallets, or standard rules P-Optional ✨ extra-reviews This PR needs at least 2 reviews to merge and removed C-trivial Category: A trivial change that is not worth mentioning in the CHANGELOG labels Nov 15, 2024
This adds a fairly rich Nix environment. It provides a shell that includes rustfmt and
rust-analyzer, and a Home Manager module for configuring & running Zebra via Nix.
Simple CI for Nix
@github-actions github-actions bot added the C-trivial Category: A trivial change that is not worth mentioning in the CHANGELOG label Nov 30, 2024
@arya2 arya2 added do-not-merge Tells Mergify not to merge this PR and removed do-not-merge Tells Mergify not to merge this PR labels Dec 5, 2024
@mpguerra mpguerra added the no-review-reminders Turn off review reminders label Jan 13, 2025
@shielded-nate

Hello. I'm currently building and running zebra on nixos and I created my own flake.nix locally before finding this. I intend to test and review this to provide a second set of eyes on it.

@shielded-nate

I'm often learning nix as I go, and the same is true with this PR. I started by rebasing this branch onto a more recent main, and then running nix build.

That asked me a series of questions, so I started documenting my process and stopped to post this:

Security Analysis

Here's a tangent to document development/infrastructure security for the nix system. This could be useful for my own understanding as well as spreading knowledge of the security profile for the rest of the Zebra ecosystem.

The question is "as a developer working on my local machine OR using CI infrastructure, what vectors are there for someone to inject malicious code onto my systems?"

Signing Key(?) Authentication

The nix build process prompts me to accept/reject these keys:

  • cache.garnix.io:CTFPyKSLcx5RMJKfLo5EEPUObbA78b0YQ2DTCJXqr9g=
  • nix-community.cachix.org-1:mB9FSh9qf2dCimDSUo8Zy7bkq5CX+/rkCWyvRCYg3Fs=

Presumably these are public signing keys.

  • anyone: Verify these are public signing keys/fingerprints.
  • @sellout: can you verify these match your local perspective?

Package (Derivation) Write Authority

Next, I want to understand if I build and run a flake A, which depends on B, C, D, and those each in turn depend on others, who has the ability to install arbitrary code on my machine?

Transitive Pinning of Source

My understanding of nix with flakes is that the author of A pins all of the direct dependencies, and those in turn pin their dependencies and so on. So, A is the only direct authority for selecting code on my machine. A often comes from git, so for example in this case, I'm relying on github and anyone with write access to github can inject arbitrary code.

Assuming github isn't compromised, A selects all source code transitively. In practice people are very likely to just rely on publicly available dependencies, so in practice the authors of B, C, D, etc... and often the servers that host their code have the opportunity to inject malicious code. However because of pinning, if any upstream package does contain malicious code, there's a permanent record of that due to A's flake.lock.

That's about as good as modern systems get. (-unless we also committed A to a censorship resistant ledger! 😉)

Caching Security

Next question: since we're relying on caching systems to substitute binaries for a given source, I'd like to document what attack surface that presents for malicious code injection.

My guess is that the key material above are signing keys for caching systems, and I believe that's equivalent security to fetching files over https with a fingerprint-pinned certificate. In other words, there's a server with a private key that can serve up any arbitrary bytes, claiming they are the binaries corresponding to a given source package.

  • Verify if this is the case for caching in nix.

One saving grace for nix here is that every package build should be reproducible, so we could "spot" check caching servers by selecting a random transitive dependency and building it locally to verify it matches what the caching server claims.

  • Check if there's a way to enable random spot checks like this.

@sellout
Author

sellout commented Jan 28, 2025

Thanks, @shielded-nate, these are some great questions.

> The question is "as a developer working on my local machine OR using CI infrastructure, what vectors are there for someone to inject malicious code onto my systems?"
>
> The nix build process prompts me to accept/reject these keys:
>
>   • cache.garnix.io:CTFPyKSLcx5RMJKfLo5EEPUObbA78b0YQ2DTCJXqr9g=
>   • nix-community.cachix.org-1:mB9FSh9qf2dCimDSUo8Zy7bkq5CX+/rkCWyvRCYg3Fs=
>
> Presumably these are public signing keys.

Correct. Accepting them allows you to use cached Nix artifacts rather than building everything locally. It’s also fairly easy to create your own cache to use, either on a service like Cachix, or self-hosted, so then you can eschew the ones you have less control over.

The links below point to the source of the keys, and they match.

  • cache.garnix.io – garnix is a Nix-based CI that caches everything it builds. I have garnix enabled for this PR so that most of the outputs are tested automatically. It’s a small company run by people I personally trust and I use it on all of my projects.
  • nix-community.cachix.org-1 – this cache holds artifacts from projects provided by @nix-community. The ones used in this flake are Fenix, which provides the Rust toolchains, and Home Manager, which is used for testing the homeModules and example homeConfigurations.
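
The prompt most likely comes from a `nixConfig` block in the flake. The sketch below shows what such a block typically looks like. The two keys are the ones quoted above and the URLs are the public endpoints of those caches, but the exact contents of this PR's `flake.nix` are an assumption here.

```nix
{
  # Settings a flake asks the user to accept before substituting binaries.
  nixConfig = {
    extra-substituters = [
      "https://cache.garnix.io"
      "https://nix-community.cachix.org"
    ];
    extra-trusted-public-keys = [
      "cache.garnix.io:CTFPyKSLcx5RMJKfLo5EEPUObbA78b0YQ2DTCJXqr9g="
      "nix-community.cachix.org-1:mB9FSh9qf2dCimDSUo8Zy7bkq5CX+/rkCWyvRCYg3Fs="
    ];
  };

  # Placeholder so that the sketch is a well-formed flake.
  outputs = {self, ...}: {};
}
```

Rejecting these settings does not break the build. Nix then simply builds locally whatever it cannot fetch from caches it already trusts.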

> Next, I want to understand if I build and run a flake A, which depends on B, C, D, and those each in turn depend on others, who has the ability to install arbitrary code on my machine?
>
> My understanding of nix with flakes is that the author of A pins all of the direct dependencies, and those in turn pin their dependencies and so on. So, A is the only direct authority for selecting code on my machine. A often comes from git, so for example in this case, I'm relying on github and anyone with write access to github can inject arbitrary code.

The author of A can also selectively pin transitive dependencies. E.g.,

inputs = {
  nixpkgs.url = "github:NixOS/nixpkgs/release-24.11";
  b.url = "github:user/b";
  b.inputs.nixpkgs.follows = "nixpkgs";
  c.url = "github:org/c";
  c.inputs.nixpkgs.follows = "nixpkgs";
  d.url = "github:other-user/d";
  d.inputs.nixpkgs.follows = "nixpkgs";
}

ensures that B, C, and D are all using the same version of Nixpkgs as A. This is most commonly done to avoid downloading many copies of nearly-identical flakes, and it can make things fragile in some cases, but it does give A control over exactly which versions they pull.

NB: It should be possible to override inputs at arbitrary depths, but there is a Nix bug that currently prevents it.

> Assuming github isn't compromised, A selects all source code transitively. In practice people are very likely to just rely on publicly available dependencies, so in practice the authors of B, C, D, etc... and often the servers that host their code have the opportunity to inject malicious code. However because of pinning, if any upstream package does contain malicious code, there's a permanent record of that due to A's flake.lock.

Yes. And the commands nix flake lock and nix flake update allow you to see what will be used without evaluating any Nix code (and thus without building anything).
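
Since `flake.lock` is plain JSON, the pins can also be listed with a few lines of Nix. This is only a sketch of one way to audit the file. The file name `audit.nix` is hypothetical, and it assumes the expression sits next to `flake.lock`.

```nix
# audit.nix: summarize every pinned input recorded in flake.lock.
let
  lock = builtins.fromJSON (builtins.readFile ./flake.lock);
  # Drop the root node, which describes this flake rather than an input.
  pinned = builtins.removeAttrs lock.nodes [lock.root];
in
  builtins.mapAttrs (_name: node: {
    type = node.locked.type or null;
    owner = node.locked.owner or null;
    repo = node.locked.repo or null;
    rev = node.locked.rev or null;
    narHash = node.locked.narHash or null;
  })
  pinned
```

Evaluating it with `nix-instantiate --eval --strict --json audit.nix` prints one entry per locked node, transitive inputs included, without fetching or building anything.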

One other thing to be aware of here is that flakes still allow the traditional Nix ways of referencing other URLs, which are not directly reflected in flake.lock. On the plus side, a content hash is required, so any change to those non-flake dependencies results in a change to the flake itself.

Here are two examples of what that can look like (assume this is referenced by B’s flake somewhere):

user-project = stdenv.mkDerivation {
  src = pkgs.fetchFromGitHub {
    owner = "user";
    repo = "project";
    rev = "some-branch";
    hash = "sha256-yMgDJ7D1pa37tHIX8SgO++eMqNCUOM0Bx+A5p10vWWg=";
  };

  patches = [
    (pkgs.fetchpatch {
      name = "fix-issue.patch";
      url = "https://patch-diff.githubusercontent.com/raw/user/project/pull/103.patch";
      hash = "sha256-/XhrSIKDqaitV3Kk+JkOgflgl3821m/8gLrP0yHENP0=";
    })
  ];
};

These can still be overridden with overrideAttrs in your Nix code. And you can combine that with a bit more work to lift them into flake.lock, which then allows you to audit them via that file:

{
  inputs = {
    b.url = "github:user/b";
    user-project.url = "github:user/project/some-branch";
    user-project.flake = false;
    fix-user-project-issue.url = "https://patch-diff.githubusercontent.com/raw/user/project/pull/103.patch";
    fix-user-project-issue.flake = false;
  };

  outputs = {b, user-project, fix-user-project-issue, ...}: {
    overlays.default = final: prev: {
      ## Here is where the earlier derivation from `B` has its sources overridden.
      user-project = b.user-project.overrideAttrs {
        src = user-project;
        patches = [fix-user-project-issue];
      };
    };
  };
}

But when you re-pin any upstream flake, you have to identify whether it’s pulling in other sources. There are various ways to explore the Nix derivation graph, though, so you don’t need to discover this by poring over the sources (especially if you only use parts of a flake, as is often the case, these dependencies may never even be fetched, let alone evaluated).

There are probably good sources out there for how to ease this kind of auditing in Nix – commands for diffing the graphs, etc. But I haven’t explored that.

> Next question: since we're relying on caching systems to substitute binaries for a given source, I'd like to document what attack surface that presents for malicious code injection.
>
> My guess is that the key material above are signing keys for caching systems, and I believe that's equivalent security to fetching files over https with a fingerprint-pinned certificate. In other words, there's a server with a private key that can serve up any arbitrary bytes, claiming they are the binaries corresponding to a given source package.

This is correct, at least as far as this PR and current common Nix practice goes. However, Nix has “beta-level” support for content-addressed derivations, which can eliminate the arbitrary-bytes issue. I am very interested in this, but haven’t used it at all, and can’t say how feasible it would be to use CA derivations for something like Zebra. Maybe it could at least reduce the number of derivations that aren’t content-addressed … but it’s probably a bigger initial win just to control the cache.
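
For context, opting a single derivation into content addressing looks roughly like the sketch below. It assumes the `ca-derivations` experimental feature is enabled, and it is unrelated to how this PR builds Zebra. The name `ca-example` is made up for illustration.

```nix
let
  pkgs = import <nixpkgs> {};
in
  # The store path of this output is derived from its contents rather than
  # from its inputs, so a cache cannot substitute different bytes for it
  # without the path changing.
  pkgs.runCommand "ca-example" {
    __contentAddressed = true;
    outputHashMode = "recursive";
    outputHashAlgo = "sha256";
  } ''
    echo "hello" > $out
  ''
```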

> One saving grace for nix here is that every package build should be reproducible, so we could "spot" check caching servers by selecting a random transitive dependency and building it locally to verify it matches what the caching server claims.

I don’t know if any tooling exists for this, but it should be easy enough to implement something basic. However, I feel like the number of transitive dependencies for any project is quite large, and only one needs to be compromised, which would then necessitate a lot of spot checks, so just managing our own cache is probably better.

Also, while reproducibility is certainly a goal, the guarantees are limited. Content-addressed derivations will help enforce this more, and there are other limited uses of content hashes (like the sources for the derivation I showed above and fixed-output derivations). But it is very easy to create a derivation that isn’t reproducible:

pkgs.runCommand "random" { } "echo $RANDOM > $out"

I’m not sure where rustc lands on this, but GHC has a number of cases where its compiler output isn’t reproducible (and they’ve been addressing them as tools like Nix become more widespread).

@mpguerra
Contributor

Thank you @sellout for this PR and @shielded-nate for reviewing it!

We're not really sure what to do with this one since we don't have the expertise to review it and take care of any ongoing maintenance that could result from merging this PR.

Is there anywhere else that this flake could be hosted? If so, we would be happy to link to it from our documentation.

@sellout
Author

sellout commented Jan 29, 2025

> Is there anywhere else that this flake could be hosted? If so, we would be happy to link to it from our documentation.

Yes! It’s easy to have the flake in its own repo, and to add this repo as an input. In that case I’m inclined to have it in sellout/zebra-nix, but I’m happy for it to be ZcashFoundation/zebra-nix (or whatever other name), if you would prefer to have more control over and responsibility for it (and a more official sheen).

I basically just need to add

inputs.zebra.url = "github:ZcashFoundation/zebra/v2.1.0";
inputs.zebra.flake = false;

to get it to fetch all the code from here rather than from the repo containing the flake.

I initially put it here for the ease of my own Zebra development, but having it cloned in a separate directory and pointing it to my clone of Zebra by locally changing

inputs.zebra.url = "github:ZcashFoundation/zebra/v2.1.0";

to

inputs.zebra.url = "path:/local/path/to/zebra/clone";

is actually easier (so long as this PR remains unmerged), since I don’t have to merge/unmerge the branch as I make other changes to Zebra.
