Vehicle Site Reliability Engineer

Wayve | Yokohama, Japan | Japan | Asia | Today
Hybrid | Rust | Cloud & Infrastructure | Python | SolidJS | AWS | GCP | Azure | Docker | Kubernetes | Prometheus | Grafana | Datadog

Job Description

We are looking for a Site Reliability Engineer (SRE) who is passionate about pushing the limits of innovation and scale. As part of our team, you will play a crucial role in ensuring the resilience and efficiency of our cutting-edge AI-driven autonomous driving systems. This is a unique opportunity to contribute to a future where autonomous vehicles will transform the way we live, work, and move around our cities. Join us in our mission to drive the next level of growth and set new standards in the industry.

  • Elevate Operational Excellence: Guarantee the seamless operation of our autonomous vehicles on public roads, enhancing our ability to transform urban mobility.

  • Innovate and Automate: Drive the development of cutting-edge tools and automation to streamline operations, from first-line support enhancements to fleet management.

  • Metrics and Monitoring Mastery: Develop key metrics and monitoring/logging systems that preempt problems and promote rapid resolution, ensuring the highest levels of performance and reliability.

Day-to-day / scope of the role

  • Master the intricate dance of software and hardware that powers autonomous navigation, crafting solutions that set new industry benchmarks.

  • Spearhead the adoption of forward-thinking tools and technologies that align with our visionary goals, both now and in the future.

  • Champion automation to continuously elevate our efficiency, aiming to make manual interventions a thing of the past.

  • Participate in the team on-call rotation.

Top hard requirements (skills/experience)

  • Proven experience in Site Reliability Engineering or a similar role, especially in a production environment.

  • Expertise in Python, C++, or Rust, with a solid foundation in cloud computing platforms (AWS, GCP, or Azure).

  • Expertise in CI/CD processes and containerization (Docker, Kubernetes), and a deep understanding of networking, distributed systems, and databases.

  • Expertise with monitoring and troubleshooting utilities (Datadog, Prometheus, Grafana, ELK stack, Splunk, Humio, etc.).

This is a full-time role based in our office in Japan. You will take part in a team on-call rotation, providing out-of-hours support for live systems when required.
At Wayve, we want the best of all worlds, so we operate a hybrid working policy that combines time together in our offices and workshops, to fuel innovation, culture, relationships and learning, with time spent working from home.
