Software Engineer (SRE) - Virtualisation

🔒 Confidential Employer
Posted 6 May 2026
Location: London
Type: Full-time
Level: Mid-Senior level
Category: Software Engineering
This employer holds a UK Home Office sponsor licence; sponsorship for this specific role is at the employer’s discretion.

SKILLS

Go, Java, Cloud Operations, SLO/SLI definition, Observability (Splunk, Grafana, Prometheus), Infrastructure as Code (IaC), CI/CD, Virtualization (KVM, QEMU)

FULL DESCRIPTION

[Employer hidden] Services Engineering (ASE) is seeking a Software Engineer (SRE) for the Compute team to build and enhance large-scale clusters for virtual machines and containers. The role involves designing tooling, defining SLOs, leading incident response, and ensuring the reliability of [Employer hidden]'s services.

Description

[Employer hidden] Services Engineering (ASE)’s Compute team is seeking a highly motivated software engineer with strong technical and communication skills to join our SRE team on our quest to build and enhance massive clusters hosting Virtual Machines, Containers, and associated infrastructure that can scale to meet the demands of [Employer hidden]’s Services offerings. You will work with world-class engineers on core components of Virtualization and Containerization technologies, customize them to fit [Employer hidden]’s diverse needs, and engage with the upstream community to drive [Employer hidden]’s requirements. Ultimately, you will help build the platform that delivers our applications at scale to our end users. As a Compute Site Reliability Engineer, you will be part of the team responsible for providing the platform that lets mission-critical cloud systems maintain constant uptime, scale seamlessly, and allow new applications and services to flourish.

Responsibilities

  • Design and develop tooling, frameworks, and automation in Go and Java to improve reliability, scalability, and operational efficiency of compute infrastructure (VMs, containers, orchestration).
  • Define and implement SLOs/SLIs for compute services and build the observability pipelines (metrics, logging, tracing) to measure and enforce them.
  • Lead incident response for compute infrastructure, driving triage, root cause analysis, and postmortem corrective actions.
  • Develop and maintain infrastructure-as-code and CI/CD pipelines, ensuring reproducibility, automated testing, and staged rollouts across the fleet.
  • Contribute to compute platform architecture through design reviews, technical design documents, production readiness reviews, capacity planning, and disaster recovery exercises.
  • Partner cross-functionally with engineering, QA, and program management to embed reliability into the development lifecycle, upholding best practices in code review, testing, and documentation.

Minimum Qualifications

  • Expert-level, in-depth professional experience with cloud operations, with a focus on “infrastructure-as-a-service” (compute, storage, and network virtualization).
  • Strong software development skills in Go and Java, with experience building production services, tools, or automation frameworks.
  • Experience with software development lifecycle practices including version control, code review, CI/CD, and automated testing.
  • Experience operating and engineering large-scale, multi-tenant infrastructure as a managed service.
  • Ability to articulate complex technical concepts to both technical and non-technical stakeholders.

Preferred Qualifications

  • Experience with Infrastructure as a Service orchestration tools (OpenStack, CloudStack, etc.).
  • Experience with Linux system virtualization (libvirt, QEMU, KVM, etc.), along with their APIs.
  • Ability to implement and coordinate telemetry using monitoring and observability tools such as Splunk, Grafana, and Prometheus.
  • Experience building internal platforms or developer tooling, and familiarity with distributed systems concepts.