GPU acceleration engineer
Job Description
GPU Acceleration Engineer - Calculation Engine
🎯 Main Mission
Massively accelerate the sparse calculation engine of a UK B2B SaaS company specializing in enterprise planning and analytics by porting critical algorithms from Rust/C++ to the GPU (CUDA). Turn calculations that are currently infeasible (requiring thousands of years of CPU time) into operations that complete in minutes.
📊 Context
The company, a UK B2B SaaS provider of enterprise planning and analytics, manages planning models that reach 64 quadrillion cells and billions of time periods. Our Hyperblock/Polaris engine is currently limited by:
- Legacy CPU architecture (Java/Rust/C++)
- Memory constraints on massive sparse structures
- Prohibitive calculation times on complex scenarios
Objective: Achieve performance gains of 100x to 1000x via GPU offloading.
🔧 Main Responsibilities
GPU Offloading
- Port existing Rust/C++ algorithms to CUDA/GPU
- Identify and extract critical calculation paths to accelerate
- Optimize sparse matrix operations for GPU architecture
- Develop performant Rust ↔ CUDA wrappers
- Benchmark and validate performance gains
Memory Optimization
- Design GPU memory management strategies for massive datasets
- Implement efficient patterns for sparse structures
- Optimize CPU ↔ GPU memory transfers
- Manage GPU memory limitations on large-scale calculations
Technical Collaboration
- Work with the engineering team on integration
- Document GPU porting patterns
- Participate in code reviews and design reviews
- Train the team on GPU best practices
💻 Technical Stack
Languages (in order of importance)
- CUDA - Primary GPU development
- Rust - Source language for algorithms to port
- C++ - Legacy components and CUDA interoperability
- (Java - platform context, no dev required)
Key Technologies
- NVIDIA CUDA (toolkit, libraries: cuBLAS, cuSPARSE)
- Rust (ownership model, unsafe blocks, FFI)
- GPU Programming (kernels, memory hierarchy, optimization)
- Sparse Matrix Operations (compression, storage formats)
- Profiling Tools (nvprof, Nsight, perf)
✅ Required Profile
Essential Skills
GPU & CUDA (Essential)
- ✅ Significant CUDA programming experience (3+ years)
- ✅ Mastery of GPU kernel optimization
- ✅ Deep knowledge of NVIDIA GPU architecture (memory hierarchy, warps, occupancy)
- ✅ Experience with sparse calculations on GPU (cuSPARSE or equivalent)
Rust (Essential)
- ✅ Production Rust development
- ✅ Mastery of the ownership and borrowing system
- ✅ Experience with unsafe Rust and FFI (Foreign Function Interface)
- ✅ Ability to analyze and refactor existing Rust code
C++ (Required)
- ✅ Modern C++ (C++11/14/17)
- ✅ C++ ↔ CUDA integration
- ✅ Templates and metaprogramming (a plus)
Algorithms (Required)
- ✅ Data structures for scientific computing
- ✅ Sparse matrix algorithms (CSR, COO, etc.)
- ✅ Performance optimization and profiling
- ✅ Parallelization and concurrency concepts
Highly Valued Experience
- 🎯 Documented CPU → GPU porting projects
- 🎯 HPC experience (supercomputers, GPU clusters)
- 🎯 Memory optimization for large-scale datasets
- 🎯 Scientific computing or numerical simulation
- 🎯 Rust interop with other languages (C/C++/Python)
📍 Working Arrangements
Location & Travel
- 100% remote (based in France/Europe preferred)
- Occasional travel to London
- Frequency: ~1 week/month for team sprints
- Project kickoff + key reviews
- Intensive collaboration sessions
- Start date: As soon as possible