Software Engineer (SRE) - Virtualisation

🔒 Confidential Employer
Posted 6 May 2026
LOCATION: London
TYPE: Full-time
LEVEL: Mid-Senior level
CATEGORY: Software Engineering
This employer holds a UK Home Office sponsor license — sponsorship for this specific role is at the employer’s discretion

SKILLS

Go, Java, Infrastructure-as-Code, CI/CD, SLO/SLI Definition, Incident Response, Virtualization (KVM/QEMU/Libvirt), Observability/Monitoring

FULL DESCRIPTION

[Employer hidden — sign up to reveal] Services Engineering (ASE) builds and provides systems and infrastructure that power [Employer hidden — sign up to reveal]’s services. We are the foundation on which [Employer hidden — sign up to reveal]’s software developers build the products that our customers love. Our services have to scale globally, stay highly available, and 'just work.' If you love designing, engineering and running systems and infrastructure that will help millions of customers, then this is the place for you!

Description

[Employer hidden — sign up to reveal] Services Engineering (ASE)’s Compute team is seeking a highly motivated software engineer with strong technical and communication skills to join our SRE team. Our mission is to build and enhance massive clusters hosting virtual machines, containers, and associated infrastructure that can scale to meet the demands of [Employer hidden — sign up to reveal]’s Services offerings. You will work with world-class engineers on core components of virtualization and containerization technologies, customize them to fit [Employer hidden — sign up to reveal]’s diverse needs, and engage with the upstream community to drive [Employer hidden — sign up to reveal]’s requirements. Ultimately, you will help build the platform that delivers our applications at scale to our end users. As a Compute Site Reliability Engineer, you will be part of the team that provides the platform on which mission-critical cloud systems maintain constant uptime and scale seamlessly, and on which new applications and services can flourish.

Responsibilities

  • Design and develop tooling, frameworks, and automation in Go and Java to improve reliability, scalability, and operational efficiency of compute infrastructure (VMs, containers, orchestration).
  • Define and implement SLOs/SLIs for compute services and build the observability pipelines (metrics, logging, tracing) to measure and enforce them.
  • Lead incident response for compute infrastructure, driving triage, root cause analysis, and postmortem corrective actions.
  • Develop and maintain infrastructure-as-code and CI/CD pipelines, ensuring reproducibility, automated testing, and staged rollouts across the fleet.
  • Contribute to compute platform architecture through design reviews, technical design documents, production readiness reviews, capacity planning, and disaster recovery exercises.
  • Partner cross-functionally with engineering, QA, and program management to embed reliability into the development lifecycle, upholding best practices in code review, testing, and documentation.

Minimum Qualifications

  • Expert-level, in-depth professional experience with cloud operations, with a focus on infrastructure-as-a-service (compute, storage, and network virtualization).
  • Strong software development skills in Go and Java, with experience building production services, tools or automation frameworks.
  • Experience with software development lifecycle practices including version control, code review, CI/CD, and automated testing.
  • Experience operating and engineering large-scale, multi-tenant infrastructure as a managed service.
  • Ability to articulate complex technical concepts to both technical and non-technical stakeholders.

Preferred Qualifications

  • Experience with infrastructure-as-a-service orchestration tools (OpenStack, CloudStack, etc.).
  • Experience with Linux system virtualization (Libvirt, QEMU, KVM, etc.) and their APIs.
  • Ability to implement and coordinate telemetry using monitoring and observability tools such as Splunk, Grafana, and Prometheus.
  • Experience building internal platforms or developer tooling, and familiarity with distributed systems concepts.

At [Employer hidden — sign up to reveal], we're not all the same. And that's our greatest strength. We draw on the differences in who we are, what we've experienced and how we think. Because to create products that serve everyone, we believe in including everyone. Therefore, we are committed to treating all applicants fairly and equally. As a registered Disability Confident employer, we will work with applicants to make any reasonable accommodations. [Employer hidden — sign up to reveal] will consider for employment all qualified applicants with criminal backgrounds in a manner consistent with applicable law.
