MALeR: Improving Compositional Fidelity in Layout-Guided Generation
ACM Transactions on Graphics, ACM-TG, 2025
Google Rank: 264
What You See is What You Ask: Evaluating Audio Descriptions
Conference on Empirical Methods in Natural Language Processing, EMNLP, 2025
Core Rank: A*, Google Rank: 193
Investigating Mechanisms for In-Context Vision Language Binding
Computer Vision and Pattern Recognition Conference Workshops, CVPR-W, 2025
Core Rank: -, Google Rank: -
IdentifyMe: A Challenging Mention Resolution Benchmark for LLMs
North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT, 2025
Core Rank: A, Google Rank: 132
VELOCITI: Benchmarking Video-Language Compositional Reasoning with Strict Entailment
Computer Vision and Pattern Recognition, CVPR, 2025
Core Rank: A*, Google Rank: 440
The Sound of Water: Inferring Physical Properties from Pouring Liquids
International Conference on Acoustics, Speech, and Signal Processing, ICASSP, 2025
Core Rank: B, Google Rank: 129
No Detail Left Behind: Revisiting Self-Retrieval for Fine-Grained Image Captioning
Transactions on Machine Learning Research, TMLR, 2025
Core Rank: -, Google Rank: -
Seeing Eye to AI: Comparing Human Gaze and Model Attention in Video Memorability
Winter Conference on Applications of Computer Vision, WACV, 2025
Core Rank: -, Google Rank: 109
Detect, Describe, Discriminate: Moving Beyond VQA for MLLM Evaluation
Workshop on Emergent Visual Abilities and Limits of Foundation Models, EVAL-FoMo-W, 2024
Localizing Auditory Concepts in CNNs
ICML Mechanistic Interpretability Workshop, ICMLMI-W, 2024