Principal Data Engineer
🔒 Confidential Employer
Posted 24 March 2026
LOCATION: London
TYPE: Full-time
LEVEL: Mid-Senior level
CATEGORY: Technology
This employer holds a UK Home Office sponsor licence; sponsorship for this specific role is at the employer's discretion.
SKILLS
ETL/ELT data pipelines
Python
Scala/Spark
Microsoft Azure
Databricks
Terraform
Data Warehouse Design
SQL
FULL DESCRIPTION
Lead [Employer hidden — view at passion-project.co.uk]’s data transformation as Principal Data Engineer: build a robust, scalable platform powering scientific insights, business intelligence, and global supply chain integrity.
Key Responsibilities:
Data Architecture & Strategy
- Platform Leadership: Define and own the technical strategy and architecture for our entire data platform, covering ingestion, storage, processing, governance, and consumption. This includes use cases supporting Operations, Data Science, customer-facing portals, and Business Intelligence.
- Pipeline Design: Design and implement highly scalable, performant, and reliable ETL/ELT data pipelines to handle diverse data sources, including complex scientific datasets and supply chain inputs alongside business information.
- Technology Selection: Evaluate, recommend, and drive the adoption of new data services and modern data tools to ensure we have a future-proof data ecosystem.
- Data Modeling: Lead the design of canonical data models for our data warehouse and operational data stores, ensuring data quality, consistency, and integrity across the platform.
- Single Source of Truth: Define and maintain identifiers for clients, suppliers and transactions to ensure consistency across systems (e.g. Salesforce, Netsuite, internal databases) and portals.
Implementation & Technical Excellence
- Hands-on Development: Serve as the most senior, hands-on developer, writing high-quality, production-grade code (primarily Python and/or Scala/Spark) to build initial pipelines and core data services.
- Data Governance & Security: Architect data security and governance policies, ensuring compliance and best practices around data access, masking, and retention, particularly for sensitive origin data.
- Data Quality: Implement automated deduplication, conflict resolution and anomaly detection to maintain data integrity across ingestion sources.
- Operational Health: Implement robust monitoring, logging, and alerting for all data pipelines and infrastructure, ensuring high data reliability and performance.
- Infrastructure as Code (IaC): Work closely with the Infrastructure team to define and automate the provisioning of all Azure data resources using Terraform or similar IaC tools.
Cross-Functional Leadership
- Scientific Collaboration: Partner closely with the Science teams to understand the structure, complexity, and requirements of raw scientific data, ensuring accurate data translation and ingestion.
- Mentorship: Provide technical guidance and mentorship to software engineers on best practices for interacting with and consuming data services.
- Product Partnership: Collaborate with the Product Director to understand commercial and user-facing data requirements, translating these needs into actionable data infrastructure features.
Skills & Experience
- Principal/Lead Expertise: Extensive experience (typically 7+ years) focused on data engineering, including significant time spent in a Principal, Lead, or Architect role defining data strategy from the ground up.
- Databricks: Deep, practical, and architectural experience of the Databricks platform.
- Azure Data Stack: Hands-on experience building and operating workloads within the Microsoft Azure data ecosystem (e.g., Azure Data Factory, Azure Data Lake, Azure Synapse Analytics, Azure SQL/Cosmos DB).
- Coding Proficiency: Expert-level proficiency in Python (or Scala) and SQL, with a strong focus on writing clean, tested, and highly performant data processing code.
- Data Warehouse Design: Proven track record designing and implementing scalable data warehouses/data marts for analytical and operational use cases.
- Pipeline Automation: Strong experience with workflow orchestration tools and implementing CI/CD for data pipelines.
- Cloud Infrastructure: Familiarity with Infrastructure as Code (Terraform) and containerisation.