Sr. Network Site Reliability Engineer (SRE)
Job Information
- Date Opened: 09/12/2025
- Job Type: Contract
- Industry: IT Services
- Work Experience: 5+ years
- City: London
- Province: City of London
- Country: United Kingdom
- Postal Code: EC1A
About Us
We provide end-to-end IT solutions and services including Applications services, Data & Analytics services, AI/ML Technologies and Professional services in the UK and EU market.
Job Description
Overview
We are seeking a highly experienced Senior Network SRE with deep expertise across multi-vendor network infrastructure, automation, and reliability engineering. The ideal candidate will possess strong technical leadership, hands-on engineering capabilities, and a passion for building resilient, scalable, and observable network environments.
Key Responsibilities
- Design, implement, and maintain highly available network solutions across routing, switching, firewalling, and wireless technologies.
- Apply SRE principles to improve network reliability, scalability, and performance.
- Develop and maintain automation workflows using Ansible, Salt, and related frameworks to reduce operational toil.
- Build and operate monitoring, alerting, and observability dashboards using tools such as Grafana and Splunk.
- Proactively identify network bottlenecks, performance issues, and reliability risks, implementing long-term fixes rather than reactive solutions.
- Support incident response, root cause analysis, and post-incident reviews with a focus on continuous improvement.
- Collaborate with cross-functional engineering, security, and operations teams to ensure network solutions meet business and technical requirements.
- Contribute to documentation, runbooks, design artifacts, and operational standards.
- Participate in capacity planning, network modernization initiatives, and automation-first strategies.
Required Skills & Experience
- 10+ years of hands-on experience in enterprise or service provider network engineering.
- Expertise in multi-vendor routing, switching, firewalling, and wireless technologies.
- Deep understanding of network protocols (BGP, OSPF, EIGRP, STP, VXLAN, VPNs, QoS, MPLS, etc.).
- Strong experience with infrastructure automation using Ansible and Salt.
- Proficiency with observability tooling such as Grafana, Splunk, or equivalent.
- Solid understanding of SRE practices including SLIs, SLOs, error budgets, and proactive reliability engineering.
- Strong troubleshooting, analytical, and performance optimization skills.
- Excellent communication and collaboration skills, with the ability to influence and guide technical stakeholders.
Nice to Have
- Experience with network programmability (Python, API-driven networking, NETCONF/RESTCONF).
- Exposure to cloud networking (AWS, Azure, GCP).
- Knowledge of zero-trust, SD-WAN, and network security best practices.
- Experience creating self-healing or fully automated network workflows.