Lead Engineer - Software & HPC Engineering

🔒 Confidential Employer
Posted 7 May 2026
LOCATION: Oxford
TYPE: Full-time
LEVEL: Mid-Senior level
CATEGORY: Technology
This employer holds a UK Home Office sponsor licence. Sponsorship for this specific role is at the employer's discretion.

SKILLS

Linux, HPC Systems, MPI, C++, Fortran, Python, Slurm, Ansible

FULL DESCRIPTION

Oxford, Oxfordshire | Negotiable | On-site | Full-time

About the Company

We are a pioneering UK-based deep-tech company developing next-generation solutions at the cutting edge of advanced physics, simulation, and machine learning. Our work is focused on unlocking scalable, clean energy through breakthrough approaches, supported by world-class computational capabilities and innovative engineering. Alongside our core mission, we collaborate with leading organisations across advanced industries, applying our proprietary simulation tools and technologies to solve complex, high-impact challenges.

The Role

We're seeking a Lead HPC Engineer - or an experienced Senior HPC Engineer ready to step up - to take ownership of a large-scale, high-performance computing environment. You'll support and evolve an HPC cluster of over 10,000 cores, ensuring reliability, performance, and scalability for workloads ranging from single high-precision runs to thousands of parallel simulations. Working within the Software & HPC Engineering team, you'll collaborate closely with computational scientists, data engineers, and IT specialists to deliver a robust platform that underpins cutting-edge research and development.

Key Responsibilities

  • Maintain and optimise HPC hardware, working with external vendors where required
  • Manage core system software and ensure platform stability
  • Monitor performance, troubleshoot issues, and drive continuous improvements
  • Oversee backups of critical data and system configurations
  • Schedule and perform maintenance aligned with user activity
  • Profile workloads and enhance system efficiency
  • Communicate system status, updates, and major issues to stakeholders
  • Capture user requirements and contribute to upgrade and capacity planning
  • Support procurement processes and vendor negotiations
  • Produce clear documentation for both technical teams and end users
  • Collaborate across engineering and IT teams on shared infrastructure

Current Environment

  • Large-scale multi-vendor server infrastructure (AMD EPYC, Intel Xeon)
  • High-speed networking (100Gb LAN) and high-performance storage systems
  • Linux-based environments (AlmaLinux, Ubuntu)
  • Distributed file systems (Lustre, GlusterFS, NFS)
  • HPC tooling including Slurm, Ansible, and monitoring frameworks
  • Development ecosystems supporting C++, Fortran, MPI, and Python

About You

Essential:

  • Degree in Computer Science (or equivalent experience)
  • Strong expertise in Linux, HPC systems, storage, and networking
  • Experience with MPI and scientific computing environments (C++, Fortran)
  • Familiarity with job schedulers and workload management systems
  • Scripting skills (Shell, Python) and version control (Git)
  • Ability to design, implement, and support complex HPC systems
  • Strong analytical thinking and problem-solving skills
  • Excellent communication and collaboration abilities

Desirable:

  • Deep expertise in HPC optimisation and performance profiling
  • Experience with configuration management tools (e.g. Ansible)
  • Knowledge of containerisation (e.g. Singularity, Apptainer)
  • Experience working with secure or air-gapped environments
  • Familiarity with HPC accounting systems and SQL databases
  • Experience supporting and training end users