Senior Simulation Data Engineer
We are a deep-tech company with roots in numerical physics and Formula One, dedicated to accelerating hardware innovation at the speed of software. We are building an AI-driven simulation software stack for engineering and manufacturing across advanced industries. This role sits at the intersection of HPC engineering and data engineering, orchestrating long-running CFD simulations at scale.
Location: London, United Kingdom
Work Type: Hybrid (Shoreditch office and work-from-home)
The Role
The Senior Simulation Data Engineer will extend and operate the infrastructure that powers our research Data Factory. Responsibilities include geometry preparation, simulation orchestration, validation, post-processing, and delivery to downstream ML training systems.
What You Will Do
- Extend and operate the Data Factory infrastructure that orchestrates thousands of CFD simulations per day on cloud compute
- Design job-scheduling systems that maximize throughput while handling failures gracefully
- Build monitoring and alerting for simulation failures, convergence issues, and resource bottlenecks
- Build high-performance data pipelines that turn simulation outputs into ML-ready training data
- Implement geometry preprocessing workflows (mesh preparation, morphing, watertightness validation)
- Design and operate post-processing pipelines: surface decimation, field interpolation, format conversion
- Optimize I/O performance for large mesh datasets
- Implement comprehensive validation checks at every pipeline stage
- Build systems to detect and quarantine bad data before it reaches training pipelines
- Deliver validated datasets to ML training infrastructure in optimized formats
What You Bring to the Table
- 5+ years of experience in data engineering, HPC engineering, or simulation infrastructure
- Strong experience with orchestration systems: SLURM, Kubernetes, Temporal
- Production data pipeline experience with large volumes of data
- Proficiency in Python for pipeline development and automation
- Systems engineering fundamentals: Linux, networking, storage systems, performance debugging
- Experience with cloud infrastructure; ideally CoreWeave or similar GPU/HPC-focused clouds
- Background in HPC for simulation engineering (CFD, FEA, etc.)
- Experience with geometry processing: mesh manipulation, CAD formats, PyVista
- Familiarity with scientific data formats: HDF5, VTK, NetCDF, Zarr
- Data quality engineering experience: validation frameworks, anomaly detection, data observability
What We Offer
Join us and build what actually matters. Work with a high-caliber team and enjoy hybrid flexibility, equity options, a 10% employer pension contribution, free office lunches, enhanced parental leave, 25 days of annual leave, private medical insurance, a Wellhub subscription, and more.
Apply via the company careers page at [Employer hidden].ai/careers.