Senior Machine Learning Engineer

🔒 Confidential Employer

Posted 23 April 2026

LOCATION

United Kingdom

TYPE

Full-time

LEVEL

Mid-Senior level

SKILLS

Python, PyTorch, Diffusion models, LLMs, Fine-tuning, GPU, Model evaluation, ML systems

FULL DESCRIPTION

Join [Employer hidden — view at passion-project.co.uk] as a Senior Machine Learning Engineer and help develop AI solutions across media modalities including text, image, video, 3D, and audio. We're building an AI media creation platform designed to change how content is generated.

What You'll Be Doing

Integrate open-source and third-party models into our inference platform
Lead fine-tuning initiatives (LoRA, adapters, PEFT, domain adaptation)
Optimise inference workloads for latency, batching, memory efficiency, and throughput
Benchmark model quality against cost and performance across modalities
Improve inference startup times and stability under high load
Build evaluation frameworks and internal tooling for model validation
Work closely with Infrastructure and Backend teams on scalable serving systems
Monitor production performance and drive continuous optimisation
Mentor engineers and help raise the ML engineering bar across the team

What We’re Looking For

Proven experience delivering ML systems to production environments
Strong, low-level Python skills and deep hands-on experience with PyTorch
Experience working with diffusion models, LLMs, or multimodal architectures
Practical experience fine-tuning large models (LoRA, PEFT, adapters, etc.)
Experience optimising inference workloads in GPU environments
Strong understanding of model evaluation, experimentation, and monitoring
Ability to debug performance, memory, and reliability issues in production
Strong systems thinking, including an understanding of how ML decisions impact infrastructure
High ownership and comfort operating in a fast-paced startup environment

Nice to Have

Experience with vLLM or custom inference servers
Experience with Kubernetes or containerised ML workloads
Experience working in high-throughput distributed systems
Background in AI media generation (image, video, audio)
Experience building internal ML tooling or developer-facing APIs
Experience writing CUDA/C++ kernels