GenAI Data Engineer
🔒 Confidential Employer
Posted 7 May 2026
LOCATION
London
TYPE
Contract
LEVEL
Mid-Senior level
CATEGORY
Technology
This employer holds a UK Home Office sponsor licence; sponsorship for this specific role is at the employer's discretion.
SKILLS
PySpark
Python
AWS
GenAI/LLM
RAG
SQL
Delta Lake
ETL
FULL DESCRIPTION
Job details
Posted: 28 April 2026
Job Description
Your Responsibilities:
- Design and maintain scalable data pipelines using PySpark, Python, and distributed computing frameworks to support high‑volume data processing.
- Architect and optimize AWS-based data and AI infrastructure, ensuring secure, performant, and cost‑efficient ingestion, transformation, and storage.
- Develop, fine-tune, benchmark, and evaluate GenAI/LLM models, including custom training and inference optimization.
- Implement and maintain RAG pipelines, vector databases, and document-processing workflows for enterprise GenAI applications.
- Build reusable frameworks for prompt management, evaluation, and GenAI operations.
- Collaborate with cross-functional teams to integrate GenAI capabilities into production systems and ensure high-quality data, governance, and operational reliability.
Your Profile:
- Strong experience with PySpark, distributed data processing, and large-scale ETL/ELT pipelines.
- Strong SQL expertise including star/snowflake schema design, indexing strategies, writing optimized queries, and implementing CDC, SCD Type 1/2/3 patterns for reliable data warehousing.
- Advanced proficiency in Python for data engineering, automation, and ML/GenAI integration.
- Hands‑on expertise with AWS services (S3, Glue, Lambda, EMR, Bedrock / custom model hosting).
- Practical experience with GenAI/LLM model creation, fine-tuning, benchmarking, and evaluation.
- Solid understanding of RAG architectures, embeddings, vector stores, and LLM evaluation methods.
- Experience working with structured and unstructured datasets (documents, logs, text, images).
- Familiarity with scalable data storage solutions (Delta Lake, Parquet, Redshift, DynamoDB).
- Understanding of model optimization techniques (quantization, distillation, inference optimization).
- Strong capability to debug, tune, and optimize distributed systems and AI pipelines.
Contact
[Employer hidden — sign up to reveal] - Vaishali Srivastava
Email: [Employer hidden — sign up to reveal]
Phone: [contact hidden]
If this position is of interest to you, apply now!