Senior Software Reliability Engineer, Nix
Job Description
This position is onsite in either Marina Del Rey (LA) or San Francisco. We are open to connecting with local candidates and those willing to relocate.
About Revel
Revel is building the software infrastructure that controls the world’s most critical hardware systems—across aerospace, energy, robotics, defense, and advanced manufacturing.
Our platform replaces outdated tooling with a modern, high-performance stack for commanding, monitoring, and deploying real-world systems. Engineers rely on Revel to operate hardware where safety, speed, reliability, and clarity matter.
As we scale, we are tackling complex data challenges around ingesting and analyzing large volumes of high-frequency, high-cardinality telemetry.
Role Overview
Revel's engineering team has rebuilt its software build system on Nix and is just getting started. We're extending that investment into NixOS and a broader Nix-based deployment and CI platform. As we scale, our platform growth depends on an efficient, reliable, and high-signal testing and verification strategy that leverages a large compute pool. Our deployment tooling is one of the most important problems left to solve in order to keep shipping reliably as the team grows. Revel will deploy workloads into the cloud as well as to on-premise installations embedded within customer networks across the globe.
We're looking for a Senior Software Reliability Engineer (Nix) to join our Software Platform team and help us ship code into production with zero-downtime deployments while keeping engineers fast and productive. You'll work alongside our existing Nix lead and the broader platform team on the build, test, CI, and deployment pipeline end-to-end. You'll have real influence over how we manage and scale our compute infrastructure as demand grows.
What You’ll Do
Design, build, and maintain Nix-native CI and deployment pipelines that take code from commit to production with zero-downtime deploys
Maintain and improve our internal package set to enable software distribution across the company and to our customers, and drive adoption of NixOS across our infrastructure
Own and optimize our CI compute pool — improve pipeline throughput, identify and resolve bottlenecks, and manage resource allocation across a growing, compute-intensive test and verification workload
Partner with engineering leadership on hardware capacity planning and infrastructure procurement as compute needs scale
Improve developer ergonomics around Nix — tooling, documentation, standards, and workflows that keep engineers productive
Define and evolve our deployment strategy, including how software is delivered to customers across multiple regions/markets
Operate and support high-consequence, customer-facing production systems with rigor and care
Provide mentorship and Nix expertise to the broader engineering org as the platform team grows
Must have experience:
5+ years building internal developer platforms, build systems, or CI/CD infrastructure
Hands-on experience with a Nix-based or comparable build system used for a real product with real customers
Proven experience delivering software reliably for a live system with a large user base, including zero-downtime deployment practices
Experience managing high-consequence systems where reliability and correctness directly impact customers
Experience owning and optimizing a large CI system and compute pool — resource management, pipeline performance, and bottleneck analysis
Deep understanding of how build systems and packaging ecosystems work (Nix, Nixpkgs, NixOS, or comparable systems such as Bazel)
Experience with infrastructure-as-code and configuration management tooling (e.g., Terraform, Ansible, Chef, Salt)
Experience managing and owning cloud infrastructure (AWS, Google Cloud, or Microsoft Azure) or on-premise hardware systems
Able to take a vaguely defined infrastructure problem, break it down, and ship a solution
Nice to have:
Experience with NixOS in production (not just Nix for builds/packaging)
Experience with Bazel or other large-scale build systems
Experience with globally distributed deployment and the operational challenges that come with it
Experience with hardware/compute procurement and capacity planning
Experience scaling platform infrastructure at a fast-growing startup
Experience with FedRAMP, ITAR, SOX, or other relevant security/compliance frameworks
Knowledge of the Nix packaging ecosystem and/or the NixOS module system
Familiarity with the build tooling behind one of Revel's core language stacks, e.g., CMake/pkg-config (C++), Cargo/rustc (Rust), Go, Vite (TypeScript), or setuptools/uv/hatchling (Python)
Linux fundamentals: kernel and kernel modules, Kconfig, systemd, Secure Boot, and system hardening
Experience with Linux on embedded targets
Why Revel
Work on systems that support real-world hardware operations
Tackle challenging problems in large-scale data infrastructure
High ownership and impact within a fast-moving engineering team
Opportunity to help shape the future of our platform
ITAR Requirements
To conform to U.S. Government export regulations, applicants must be a (i) U.S. citizen or national, (ii) U.S. lawful permanent resident (green card holder), (iii) Refugee under 8 U.S.C. § 1157, or (iv) Asylee under 8 U.S.C. § 1158, or be eligible to obtain required authorizations from the U.S. Department of State.