Postgraduate
Email: M.Agarwal-2@sms.ed.ac.uk
Location: 2.01 Alexander Graham Bell Building
Personal Page: https://madhav1ag.github.io/
Social Media: LinkedIn, Google Scholar
Research Institutes: Imaging, Data and Communications

Biography

I am a Ph.D. student at IDCOM in the School of Engineering, University of Edinburgh. I am a member of the Vision Group and VIOS, working under the supervision of Dr. Steven McDonagh and Dr. Laura Sevilla. My interests lie in Multimodal Learning, Spatial-Temporal Understanding in Foundation Models, and Generative AI.

Before moving to the UK, I spent a wonderful year in Germany building lip-syncing and synthetic media generation models. I also spent three months at the Visual Computing & Artificial Intelligence group at the Technical University of Munich with Prof. Matthias Nießner.

I completed an MS by Research at CVIT, IIIT Hyderabad under the guidance of Prof. C.V. Jawahar and Prof. Vinay P. Namboodiri. My graduate research focused on Lip-Sync, Talking Head Generation, and Face Reenactment, along with their optimization for real-world problems. Additionally, I worked on high-accuracy Table Detection in Document Images under the supervision of Prof. C.V. Jawahar and Dr. Ajoy Mondal.

Prior to this, I worked as a Data Scientist and team lead at several companies, broadly in the domains of Facial Recognition, Video Surveillance using AI, and Document Image Processing. My work has been published in top computer vision and machine learning conferences. I am also actively involved with start-ups as an advisor and consultant.