Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation
Computer Vision and Pattern Recognition, CVPR, 2022
Core Rank: A*; Google Rank: 440
Learning Object Manipulation Skills from Video via Approximate Differentiable Physics
International Conference on Intelligent Robots and Systems, IROS, 2022
Core Rank: A; Google Rank: 86
Instruction-driven history-aware policies for robotic manipulations
Conference on Robot Learning, CoRL, 2022
Core Rank: -; Google Rank: 88
Can we Adopt Self-supervised Pretraining for Chest X-Rays?
Machine Learning for Health Workshop, ML4H, 2022
Core Rank: -; Google Rank: -
Language Conditioned Spatial Relation Reasoning for 3D Object Grounding
Neural Information Processing Systems, NeurIPS, 2022
Core Rank: A*; Google Rank: 337
Sonus Texere! Automated Dense Soundtrack Construction for Books using Movie Adaptations
International Society for Music Information Retrieval, ISMIR, 2022
Core Rank: -; Google Rank: 40
Grounded Video Situation Recognition
Neural Information Processing Systems, NeurIPS, 2022
Core Rank: A*; Google Rank: 337
Long term spatio-temporal modeling for action detection
Computer Vision and Image Understanding, CVIU, 2021
Core Rank: -; Google Rank: 48
Airbert: In-domain Pretraining for Vision-and-Language Navigation
International Conference on Computer Vision, ICCV, 2021
Core Rank: A*; Google Rank: 291
Feature Generation for Long-tail Classification
Indian Conference on Computer Vision, Graphics and Image Processing, ICVGIP, 2021
Core Rank: -; Google Rank: -