Alberto Baldrati

Research Scientist at Samsung AI Cambridge


I am Alberto Baldrati, a Research Scientist at the Samsung AI Center in Cambridge, UK, focusing on computer vision and machine learning research.

Previously, I successfully defended my PhD thesis in February 2025 as part of the AI Italian National Doctorate program at the University of Pisa. During my PhD, I was hosted by the University of Florence and worked at the Media Integration and Communication Center (MICC) under the supervision of Prof. Marco Bertini and Andrew David Bagdanov.

During my PhD, my research revolved around two areas: vision and language, with a particular focus on prompt learning and composed image retrieval; and fashion image generation, with a focus on multimodal fashion image editing and virtual try-on. As part of my PhD, I also interned as a Computer Vision Research Scientist at Huawei Finland Research Center from March to September 2024, where I worked on video generation.

Currently, my research focuses on efficient vision-and-language applications, particularly efficient VLLMs (see this paper).

News

Apr 6, 2026 One paper about multi-image VLLMs accepted at ACL 2026 (Findings).
Feb 20, 2026 One paper about efficient VLLMs accepted at CVPR 2026.
Jan 10, 2026 The extended version of our ICCV 2023 paper on multimodal fashion image editing has been accepted at ACM TOMM.
Jul 26, 2025 The extended version of our ICCV 2023 paper on composed image retrieval has been accepted at TPAMI.
Mar 3, 2025 Joined Samsung AI Center in Cambridge as a Research Scientist.

Selected Publications

2026

  1. CVPR
    VISion On Request: Enhanced VLLM efficiency with sparse, dynamically selected, vision-language interactions
    A. Bulat*, A. Baldrati*, I. Metaxas*, Y. Ouali, and G. Tzimiropoulos
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2026

2025

  1. ICLR
    Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion
    M. Mistretta*, A. Baldrati*, L. Agnolucci*, M. Bertini, and A. Bagdanov
    In The Thirteenth International Conference on Learning Representations, 2025

2023

  1. ICCV
    Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing
    A. Baldrati*, D. Morelli*, G. Cartella, M. Cornia, M. Bertini, and R. Cucchiara
    In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Oct 2023
  2. ICCV
    Zero-Shot Composed Image Retrieval with Textual Inversion
    A. Baldrati*, L. Agnolucci*, M. Bertini, and A. Del Bimbo
    In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Oct 2023