MALeR: Improving Compositional Fidelity in Layout-Guided Generation
Shivank Saxena, Dhruv Srivastava, Makarand Tapaswi
ACM Transactions on Graphics, ACM-TG, 2025
Google Rank: 264
What You See is What You Ask: Evaluating Audio Descriptions
Divy Kala, Eshika Khandelwal, Makarand Tapaswi
Conference on Empirical Methods in Natural Language Processing, EMNLP, 2025
Core Rank: A* Google Rank: 193
Investigating Mechanisms for In-Context Vision Language Binding
Darshana S, Makarand Tapaswi, Vineet Gandhi
Computer Vision and Pattern Recognition Conference workshops, CVPR-W, 2025
Core Rank: - Google Rank: -
IdentifyMe: A Challenging Mention Resolution Benchmark for LLMs
S Kawshik Manikantan, Makarand Tapaswi, Vineet Gandhi, Shubham Toshniwal
North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT, 2025
Core Rank: A Google Rank: 132
VELOCITI: Benchmarking Video-Language Compositional Reasoning with Strict Entailment
Darshana S, Varun Gupta, Darshan Singh S, Zeeshan Khan, Vineet Gandhi, Makarand Tapaswi
Computer Vision and Pattern Recognition, CVPR, 2025
Core Rank: A* Google Rank: 440
The Sound of Water: Inferring Physical Properties from Pouring Liquids
Piyush Bagad, Makarand Tapaswi, Cees G. M. Snoek, Andrew Zisserman
International Conference on Acoustics, Speech, and Signal Processing, ICASSP, 2025
Core Rank: B Google Rank: 129
No Detail Left Behind: Revisiting Self-Retrieval for Fine-Grained Image Captioning
Manu Gaur, Darshan Singh S, Makarand Tapaswi
Transactions on Machine Learning Research, TMLR, 2025
Core Rank: - Google Rank: -
Seeing Eye to AI: Comparing Human Gaze and Model Attention in Video Memorability
Prajneya Kumar, Eshika Khandelwal, Makarand Tapaswi, Vishnu Sreekumar
Winter Conference on Applications of Computer Vision, WACV, 2025
Core Rank: - Google Rank: 109
Detect, Describe, Discriminate: Moving Beyond VQA for MLLM Evaluation
Manu Gaur, Darshan Singh S, Makarand Tapaswi
Workshop on Emergent Visual Abilities and Limits of Foundation Models, EVAL-FoMo W, 2024
Localizing Auditory Concepts in CNNs
Pratyaksh Gautam, Makarand Tapaswi, Vinoo Alluri R
ICML Mechanistic Interpretability Workshop, ICMLMI-W, 2024