Reinforcement Learning Specialist

🔒 Confidential Employer

Posted 13 August 2025

LOCATION

United Kingdom

TYPE

Full-time

LEVEL

Mid-Senior level

SKILLS

Reinforcement Learning, Deep Learning, PyTorch, TensorFlow, Python, OpenAI Gym, Machine Learning, Git

FULL DESCRIPTION

Summary

A confidential employer is hiring a Reinforcement Learning Specialist to design and develop RL algorithms for real-world applications. Responsibilities include building simulation environments, optimizing policy learning, collaborating with data scientists, engineers, and researchers, and contributing to AI research. The ideal candidate has expertise in reinforcement learning and deep learning frameworks such as PyTorch and TensorFlow, a solid understanding of core RL concepts such as MDPs and reward shaping, strong Python coding skills, and an advanced degree in a related field.

🔧 What You’ll Be Working On:

Designing and developing RL algorithms for real-world applications (e.g., robotics, recommendation systems, finance)
Building simulation environments for training intelligent agents
Optimizing policy learning using techniques such as Q-learning, PPO, A3C, and DDPG
Collaborating with data scientists, engineers, and researchers to deploy RL models into production
Experimenting with model architectures (e.g., actor-critic, deep Q-networks, model-based RL)
Publishing findings and contributing to cutting-edge AI research and development

🎯 What We’re Looking For:

Strong expertise in reinforcement learning and deep learning frameworks (e.g., PyTorch, TensorFlow)
Solid understanding of MDPs, reward shaping, exploration-exploitation tradeoffs, and sample efficiency
Experience with simulation platforms (e.g., OpenAI Gym, MuJoCo, Unity ML-Agents)
Background in applied mathematics, statistics, and control theory is a plus
Strong coding skills in Python and familiarity with version control (Git)
Advanced degree (Master’s or PhD) in Machine Learning, Computer Science, Robotics, or a related field