Research Intern - Computer Vision (Visuals)

🔒 Confidential Employer

Posted 25 March 2026

LOCATION

London

TYPE

Internship

LEVEL

Internship

SKILLS

Large Language Models, Computer Vision, Multi-modality learning, PyTorch, Diffusion Models, Research, Coding, Communication

FULL DESCRIPTION

Research Intern - Computer Vision (Visuals)

We are looking for a Research Intern to join our London Research Centre’s Computer Vision team. In particular, we seek someone with experience in (Visual) Large Language Models and/or multi-modality learning.

Key Responsibilities:

Leading and participating in cutting-edge research projects focusing on multi-modality learning and visual large language models.
Writing research papers: we like to show the world our latest findings.

Person Specification:

Required:

Strong research track record. It is mandatory to have at least one first-author paper in a top-tier venue (CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR).
Enrolled as a Ph.D. student in computer vision or machine learning for the entire duration of the internship (we also encourage you to apply if you are a strong undergraduate or graduate student with first-author papers in the venues listed above).
Hands-on experience with (visual) large language models or diffusion models.
Strong coding skills: ability to quickly prototype in PyTorch (it is okay if you only have TensorFlow or JAX experience).
Keep up to date with the state of the art, and enjoy reading lots of papers.
Have good written and oral communication skills.

Desired:

Have experience with both Large Language Models and diffusion models, or with their combination, such as Diffusion Large Language Models,