Senior Research Engineer - Audio Post-Training

🔒 Confidential Employer
Posted 7 May 2026
LOCATION
Remote
TYPE
Full-time
LEVEL
Mid-Senior level
CATEGORY
Technology
This employer holds a UK Home Office sponsor license — sponsorship for this specific role is at the employer’s discretion

SKILLS

Generative Modeling, Large Language Models (LLMs), PyTorch, Distributed Training, Audio/Speech Processing, Deep Learning, Software Engineering, Model Optimization

FULL DESCRIPTION

[Employer hidden — sign up to reveal] - Remote Europe - Posted Jan 19, 2026

About the Company

Welcome to the video-first world. From your everyday PowerPoint presentations to Hollywood movies, AI will transform the way we create and consume content. Today, people want to watch and listen, not read, both at home and at work. If you’re reading this and nodding, check out our brand video.

Meet [Employer hidden — sign up to reveal]: We're on a mission to make video easy for everyone. Born in an AI lab, our AI video communications platform simplifies the entire video production process. We’re trusted by leading brands such as Heineken, Zoom, Xerox, McDonald’s and more. In February 2024, G2 named us the fastest growing company in the world. Today, we're at a $2.1bn valuation and we recently raised our Series D.

What you'll do at [Employer hidden — sign up to reveal]:

As a Research Engineer you will join a team of 40+ Researchers and Engineers within the R&D Department working on cutting-edge challenges in the Generative AI space, with a focus on creating high-quality, expressive and real-time synthetic voices. You will join our Audio Post-Training Team, which works on generative speech and voice synthesis. Typical projects include:

  • Adapt models for new conditioning inputs (emotion, speed, prosody, speaker control, etc.).
  • Fine-tune and optimize speech models using advanced techniques such as DPO, LoRA, and other parameter-efficient methods to improve voice quality and expressiveness.
  • Implement post-training optimization techniques (quantization, pruning, distillation) to improve efficiency and latency in real-time speech generation.
  • Integrate and test novel architectures, such as neural codecs, diffusion, or flow-matching models, to enhance realism and responsiveness.
  • Design and implement new evaluation metrics for TTS systems, including automated Mean Opinion Score (MOS) prediction models for continuous quality assessment.
  • Stay updated with the latest research in audio diffusion, autoregressive models, neural codecs, and multimodal LLMs.

What we're looking for:

  • Strong understanding of generative modelling, ideally applied to sequential or multimodal data.
  • Hands-on experience with large language models (LLMs) or similar transformer-based architectures.
  • High proficiency in PyTorch, including experience with distributed training and model optimization.
  • Solid grasp of time-series modelling and tokenization, preferably in the context of audio or speech.
  • Demonstrated ability to prototype quickly, test hypotheses, and iterate efficiently.
  • Proven experience in training deep learning models end-to-end, from data preparation to evaluation.
  • Strong general software engineering skills, enabling contributions to a large, shared research infrastructure.

Nice-to-have experience:

  • Familiarity with state-of-the-art architectures in audio and speech generation (e.g., diffusion models, neural codecs, flow-matching models, autoregressive decoders).
  • Experience with speech-to-speech or text-to-speech (TTS) systems.
  • Evidence of original research contributions, such as publications at top-tier venues (e.g., ICASSP, Interspeech, NeurIPS, ICML) or open-source work.

Why join [Employer hidden — sign up to reveal]?

We’re living in the golden age of AI. The next decade will yield the next iconic companies, and we dare to say we have what it takes to become one.

Our culture:

  • At [Employer hidden — sign up to reveal] we’re passionate about building, not talking.
  • We strive to hire the smartest, kindest and most unrelenting people.
  • Serving 50,000+ customers (and 50% of the Fortune 500).
  • Proprietary AI technology built in-house.
  • AI Safety, Ethics and Security – People first. Always.

The good stuff:

  • Competitive compensation (salary + stock options + bonus)
  • Fully remote from Europe or hybrid work setting
  • 25 days of annual leave + public holidays
  • Great company culture
  • Other benefits depending on your location
