IIITH

LineTR : Unified Text Line Segmentation for Challenging Palm Leaf Manuscripts

International conference on Pattern Recognition, ICPR, 2024

Core Rank : B Google Rank :56

@inproceedings{bib_Line_2024,
  AUTHOR = {Agarwal, Vaibhav and Niharika and Waseem, HM and Joseph, Amal and Chitluri, Sreenya and Sarvadevabhatla, Ravi Kiran},
  TITLE = {LineTR : Unified Text Line Segmentation for Challenging Palm Leaf Manuscripts},
  BOOKTITLE = {International conference on Pattern Recognition},
  YEAR = {2024}
}

Abstract

The dense and unstructured text in historical manuscripts presents significant challenges for precise line segmentation due to large diversity in sizes, scripts and appearances of the documents. Existing approaches tackle this complexity either by performing dataset-specific processing or training per-dataset models. This strategy hampers maintainability and scalability as newer manuscript collections get digitized and annotated. In this paper, we propose LineTR, a novel two-stage line segmentation approach which can process a diverse variety of challenging handwritten documents in a unified, dataset-agnostic manner. LineTR's first stage processes context-adaptive image patches. It consists of a novel DETR-style network which generates parametric representations of text strike-through lines (scribbles) and a novel hybrid CNN-transformer network which generates a text energy map. A dataset-agnostic and robust post-processing procedure is applied on first-stage outputs to obtain document-level scribbles. In the second stage, these scribbles and the text energy map are used within a seam generation framework to obtain highly precise polygons enclosing the manuscript text lines. We also introduce three new diverse text line segmentation datasets comprising challenging Indic and South-East Asian manuscripts. Through experiments, ablations and evaluations, we show that LineTR generates significantly superior line segmentations - all with a single model. Our results also highlight the effectiveness of our unified model for good quality zero-shot inference on the newly introduced datasets.
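The seam generation idea in the second stage can be illustrated with a generic minimum-energy seam search. The sketch below is not LineTR's implementation: it assumes a plain dynamic-programming seam over a small 2D energy grid, ignores the scribble constraints described in the abstract, and uses the hypothetical names `min_energy_seam` and `toy_energy`.

```python
def min_energy_seam(energy):
    """Return, for each column, the row of a left-to-right minimum-energy seam."""
    rows, cols = len(energy), len(energy[0])
    # cost[r][c] = cheapest cumulative energy of any seam ending at (r, c)
    cost = [[energy[r][0]] + [0.0] * (cols - 1) for r in range(rows)]
    back = [[0] * cols for _ in range(rows)]
    for c in range(1, cols):
        for r in range(rows):
            # A seam may move at most one row up or down between columns.
            candidates = [p for p in (r - 1, r, r + 1) if 0 <= p < rows]
            best = min(candidates, key=lambda p: cost[p][c - 1])
            cost[r][c] = energy[r][c] + cost[best][c - 1]
            back[r][c] = best
    # Start from the cheapest end point and follow the back-pointers.
    r = min(range(rows), key=lambda i: cost[i][cols - 1])
    seam = [r]
    for c in range(cols - 1, 0, -1):
        r = back[r][c]
        seam.append(r)
    seam.reverse()
    return seam


# Toy energy map (assumed values): high energy marks text, row 2 is the gap.
toy_energy = [
    [9, 9, 9, 9],
    [8, 9, 8, 9],
    [1, 0, 1, 0],
    [9, 8, 9, 9],
]
seam = min_energy_seam(toy_energy)
```

On this toy grid the seam stays in the low-energy gap between the two "text" bands, which is the intuition behind separating adjacent lines with seams.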

Lupus Nephritis Subtype Classification with only Slide Level labels

Medical Imaging with Deep Learning, MIDL, 2024

Google Rank :41

@inproceedings{bib_Lupu_2024,
  AUTHOR = {Sharma, Amit and Chauhan, Ekansh and Uppin, Megha S and Rajasekhar, Liza and V, Jawahar C and Krishnanunni, Vinod Palakkad},
  TITLE = {Lupus Nephritis Subtype Classification with only Slide Level labels},
  BOOKTITLE = {Medical Imaging with Deep Learning},
  YEAR = {2024}
}

Abstract

Lupus Nephritis (LN) classification has historically relied on labor-intensive and meticulous glomerular-level labeling of renal structures in whole slide images (WSIs). This approach is tedious and resource-intensive, which limits its scalability and practicality in clinical settings. In response to this challenge, our work introduces a novel methodology that utilizes only slide-level labels, eliminating the need for granular glomerular-level labeling. A comprehensive multi-stained lupus nephritis digital histopathology WSI dataset was created from the Indian population, which is the largest of its kind. LupusNet, a deep learning model based on multiple instance learning (MIL), was developed to classify LN subtypes. The results underscore its effectiveness, achieving an AUC score of 91.0%, an F1 score of 77.3%, and an accuracy of 81.1% on our dataset in distinguishing membranous and diffused classes of LN.
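To make the slide-level (MIL) setting concrete, the sketch below shows one common way to aggregate patch features into a single slide feature: softmax attention pooling. The abstract does not say how LupusNet aggregates instances, so this is an assumed illustration only; `attention_mil_pool`, the scoring vector, and the toy features are hypothetical.

```python
import math


def attention_mil_pool(instance_feats, scoring_vec):
    """Pool instance features into one bag feature using softmax attention."""
    # One scalar attention logit per instance: dot product with a scoring vector.
    logits = [sum(f * w for f, w in zip(feat, scoring_vec)) for feat in instance_feats]
    # Numerically stable softmax over the instances in the bag.
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    attn = [e / total for e in exps]
    # The bag feature is the attention-weighted sum of the instance features.
    dim = len(instance_feats[0])
    bag = [sum(a * feat[d] for a, feat in zip(attn, instance_feats)) for d in range(dim)]
    return bag, attn


# Toy bag (assumed values): three patches, only the last one is informative.
patches = [[0.0, 0.0], [0.0, 0.0], [4.0, 0.0]]
bag, attn = attention_mil_pool(patches, scoring_vec=[1.0, 0.0])
```

Only the slide label is needed to train such a model, because the loss is applied to a classifier on `bag` rather than to individual patches.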

Hardware Deployable Edge AI Solution for Posture Classification using mmWave Radar and Low Computation Machine Learning Model

IEEE Sensors Journal, SJ, 2024

Core Rank : - Google Rank :105

@article{bib_Hard_2024,
  AUTHOR = {Singh, Yash Pratap and Gupta, Aham and Chaudhary, Devansh and Mahajan, Pranjal and Wajid, Mohd. and Srivastava, Abhishek},
  TITLE = {Hardware Deployable Edge AI Solution for Posture Classification using mmWave Radar and Low Computation Machine Learning Model},
  JOURNAL = {IEEE Sensors Journal},
  YEAR = {2024}
}

Abstract

Identifying correct human postures is crucial in areas like patient care in hospitals. However, the traditional vision-based methods widely used for this purpose raise privacy concerns for the subject, and wearable sensor-based approaches are impractical for real-world scenarios. In this paper, we propose a contactless, privacy-conscious, and memory-efficient posture classification system based on millimeter wave (mmWave) radar. This system utilizes three-dimensional (3D) point-cloud data captured using Texas Instruments’ IWR1843BOOST Frequency Modulated Continuous Wave (FMCW) radar module to classify the posture of the subject. Two types of datasets are extracted from this radar data: (i) an image dataset derived from the isometric view of the point-cloud data, and (ii) a spatial coordinates dataset also extracted from the point-cloud data. A low-computation Tiny Machine Learning (TinyML) model is employed on the datasets for efficient implementation on embedded hardware, a Raspberry Pi 3 B+. The proposed model’s parameters were quantized to 8 bits (int8), and the quantized model accurately classifies four postures, i.e., standing, sitting, lying, and bending, with an accuracy of 98.97% for the image data. To make it more computationally efficient, the int8 quantized TinyML model was also trained on the spatial coordinates dataset, giving an accuracy of 96.12%. This highlights the efficiency and effectiveness of our proposed lightweight model, which can be deployed on edge devices for real-world applications.
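As a rough illustration of what quantizing parameters to int8 involves, the sketch below applies symmetric per-tensor quantization to a small list of weights. The abstract does not specify the quantization scheme or toolchain, so the scheme, the function names, and the example weights are all assumptions.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: floats -> int8 codes plus one scale."""
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to 127; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from int8 codes."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only for illustration.
weights = [0.50, -1.27, 0.03, 1.00]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
```

Each weight is stored in one byte instead of four, and the round-trip error per weight is bounded by half the scale, which is why accuracy often drops only slightly after quantization.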

Transparency in Sleep Staging: Deep Learning Method for EEG Sleep Stage Classification with Model Interpretability

Technical Report, arXiv, 2024

Core Rank : - Google Rank :-

@inproceedings{bib_Tran_2024,
  AUTHOR = {Sharma, Shivam Kumar and Maiti, Suvadeep and Mythirayee, S. and Rajendran, Srijithesh and Surampudi, Bapi Raju},
  TITLE = {Transparency in Sleep Staging: Deep Learning Method for EEG Sleep Stage Classification with Model Interpretability},
  BOOKTITLE = {Technical Report},
  YEAR = {2024}
}

Abstract

Automated sleep stage classification using raw single-channel EEG is a critical tool for sleep quality assessment and disorder diagnosis. However, modelling the complexity and variability inherent in this signal is a challenging task, which limits the practicality and effectiveness of automated methods in clinical settings. To mitigate these challenges, this study presents an end-to-end deep learning (DL) model which integrates squeeze-and-excitation blocks within a residual network to extract features and a stacked Bi-LSTM to capture complex temporal dependencies. A distinctive aspect of this study is the adaptation of Grad-CAM for sleep staging, marking the first instance of an explainable DL model in this domain whose decision-making aligns with sleep experts' insights. We evaluated our model on the publicly available datasets (SleepEDF-20, SleepEDF-78, and SHHS), achieving Macro-F1 scores of 82.5, 78.9, and 81.9, respectively. Additionally, a novel training efficiency enhancement strategy was implemented by increasing stride size, leading to 8x faster training times with minimal impact on performance. Comparative analyses show that our model outperforms all existing baselines, indicating its potential for clinical usage.
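The Macro-F1 metric reported above averages the per-class F1 scores with equal weight, so rare sleep stages count as much as common ones. A minimal sketch of the computation follows; the stage labels and predictions are made-up toy values, not data from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy epochs (assumed): three stages, six scored epochs.
y_true = ["W", "W", "N2", "N2", "REM", "REM"]
y_pred = ["W", "N2", "N2", "N2", "REM", "W"]
score = macro_f1(y_true, y_pred)
```

Here the per-class F1 values are 0.8 for N2, 2/3 for REM, and 0.5 for W, so the macro average is about 0.656 (65.6 when expressed on the percentage scale used in the abstract).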

Fairness and Privacy Guarantees in Federated Contextual Bandits

International Conference on Autonomous Agents and Multiagent Systems, AAMAS, 2024

Core Rank : A* Google Rank :54

@inproceedings{bib_Fair_2024,
  AUTHOR = {Solanki, Sambhav and Gujar, Sujit Prakash and Jain, Shweta},
  TITLE = {Fairness and Privacy Guarantees in Federated Contextual Bandits},
  BOOKTITLE = {International Conference on Autonomous Agents and Multiagent Systems},
  YEAR = {2024}
}

Abstract

This paper studies the contextual multi-armed bandit problem with fairness and privacy guarantees in a federated setting. It proposes a collaborative algorithm, Fed-FairX-LinUCB, that achieves sub-linear fairness regret and can be adapted to ensure differential privacy. The key challenge is designing a communication protocol that balances privacy and regret. The proposed protocol achieves both sub-linear fairness regret and effective use of the privacy budget. Experiments validate the efficacy of both Fed-FairX-LinUCB and its private counterpart, Priv-FairX-LinUCB.

LimGen: Probing the LLMs for Generating Suggestive Limitations of Research Papers

European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databa, PKDD/ECML, 2024

Core Rank : A

@inproceedings{bib_LimG_2024,
  AUTHOR = {Faizullah, Abdur Rahman Bin Mohammed and Urlana, Ashok and Mishra, Rahul},
  TITLE = {LimGen: Probing the LLMs for Generating Suggestive Limitations of Research Papers},
  BOOKTITLE = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databa},
  YEAR = {2024}
}

Abstract

Examining limitations is a crucial step in the scholarly research reviewing process, revealing aspects where a study might lack decisiveness or require enhancement. This aids readers in considering broader implications for further research. In this article, we present a novel and challenging task of Suggestive Limitation Generation (SLG) for research papers. We compile a dataset called LimGen, encompassing 4068 research papers and their associated limitations from the ACL anthology. We investigate several approaches to harness large language models (LLMs) for producing suggestive limitations, thoroughly examining the related challenges, practical insights, and potential opportunities. Our LimGen dataset and code can be accessed at https://github.com/arbmf/LimGen.

Adaptive Control of Quadrotor under Actuator Loss and Unknown State-dependent Dynamics

International Conference on Automation Science and Engineering, ICASE, 2024

Core Rank : - Google Rank :29

@inproceedings{bib_Adap_2024,
  AUTHOR = {Munish, Gupta Saksham and Sharma, Amitabh and Mulgundkar, Aditya Srinivas and Yadav, Rishabh Dev and Roy, Spandan},
  TITLE = {Adaptive Control of Quadrotor under Actuator Loss and Unknown State-dependent Dynamics},
  BOOKTITLE = {International Conference on Automation Science and Engineering},
  YEAR = {2024}
}

Abstract

This paper examines the enhancement of quadrotor efficiency through adaptive control to address the critical need for Fault-Tolerant Control (FTC) in quadrotors amidst operational uncertainties and component inefficiency. State-of-the-art adaptive FTC strategies often assume the uncertainties to be bounded by a constant a priori; however, imprecise knowledge of inertial system parameters leads to state-dependent uncertainties that violate this assumption. Left unattended, state-dependent uncertainties can lead to instability, especially under actuator faults. The proposed adaptive FTC offers actuator fault mitigation while tackling unknown (state-dependent) uncertainties via suitably designed adaptive laws. Additionally, real-time fault detection and control allocation are used simultaneously to avoid conservative control application. The closed-loop system stability is studied analytically, and the effectiveness of the proposed solution is verified on a realistic simulator in comparison to the state of the art.

QueSTMaps: Queryable Semantic Topological Maps for 3D Scene Understanding

International Conference on Intelligent Robots and Systems, IROS, 2024

Core Rank : A Google Rank :86

@inproceedings{bib_QueS_2024,
  AUTHOR = {Mehan, Yash and Gupta, Kumaraditya and Sreekanth, Jayanti Rohit and Govil, Anirudh and Garg, Sourav and Krishna, K Madhava},
  TITLE = {QueSTMaps: Queryable Semantic Topological Maps for 3D Scene Understanding},
  BOOKTITLE = {International Conference on Intelligent Robots and Systems},
  YEAR = {2024}
}

Abstract

Understanding the structural organisation of 3D indoor scenes in terms of rooms is often accomplished via floorplan extraction. Robotic tasks such as planning and navigation require a semantic understanding of the scene as well. This is typically achieved via object-level semantic segmentation. However, such methods struggle to segment out topological regions like “kitchen” in the scene. In this work, we introduce a two-step pipeline. First, we extract a topological map, i.e., floorplan of the indoor scene using a novel multi-channel occupancy representation. Then, we generate CLIP-aligned features and semantic labels for every room instance based on the objects it contains using a self-attention transformer. Our language-topology alignment supports natural language querying, e.g. a “place to cook” locates the “kitchen”. We outperform the current state-of-the-art on room segmentation by ∼20% and room classification by ∼12%. Our detailed qualitative analysis and ablation studies provide insights into the problem of joint structural and semantic 3D scene understanding.
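The natural language querying step can be pictured as a nearest-neighbour search in a shared embedding space: the query text and each room are represented by vectors, and the room with the highest cosine similarity is returned. The sketch below assumes tiny hand-written 3-D vectors in place of real CLIP-aligned features; `query_rooms`, `room_feats`, and `query_vec` are hypothetical names.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def query_rooms(query_vec, room_feats):
    """Return the room whose feature vector is most similar to the query."""
    return max(room_feats, key=lambda name: cosine(query_vec, room_feats[name]))


# Stand-in room features (assumed values, not real CLIP embeddings).
room_feats = {
    "kitchen": [0.9, 0.1, 0.0],
    "bedroom": [0.1, 0.9, 0.1],
    "bathroom": [0.0, 0.2, 0.9],
}
# Stand-in for the text embedding of the query "place to cook".
query_vec = [0.8, 0.2, 0.1]
best = query_rooms(query_vec, room_feats)
```

With these toy vectors the query resolves to the kitchen; in a real system both sides would come from the same language-aligned encoder.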

Security Analysis of IoT-based Remote Labs

Future Internet of Things and Cloud, FiCloud, 2024

@inproceedings{bib_Secu_2024,
  AUTHOR = {Walchatwar, Nagesh Laxman and Gureja, Akshit and Ihita, Gangavarapu Vigneswara and Ojha, Adhishree and Chaudhari, Sachin},
  TITLE = {Security Analysis of IoT-based Remote Labs},
  BOOKTITLE = {Future Internet of Things and Cloud},
  YEAR = {2024}
}

Abstract

Remote labs are online laboratories that allow users to access and interact with experimental hardware setups remotely. They are helpful in providing practical learning experiences of theoretical concepts from anywhere in the world with an internet connection. Ensuring the security of these remote labs is essential for protecting the confidentiality, integrity, and availability (CIA) of data and services. This paper presents a comprehensive security analysis of the hardware, communication interfaces, and platform of the IoT-based Remote Labs (RLabs) deployed at IIIT Hyderabad, India. We conducted a detailed vulnerability assessment of potential threats using industry-standard tools and performed four targeted attacks to exploit identified weaknesses. The attacks include remote physical hardware manipulation, creating unavailability of experiments for legitimate users, unauthorized access to the platform and data sniffing between the components. To address these vulnerabilities, we propose mitigation strategies to enhance the experience of user experimentation in remote labs.

Television Discourse Decoded: Comprehensive Multimodal Analytics at Scale

ACM International Conference on Knowledge Discovery and Data Mining, KDD, 2024

Core Rank : A* Google Rank :124

@inproceedings{bib_Tele_2024,
  AUTHOR = {Agarwal, Anmol and Priyadarshi, Pratyush and Sinha, Shiven and Gupta, Shrey and Jangra, Hitkul and Kumaraguru, Ponnurangam and Garimella, Kiran},
  TITLE = {Television Discourse Decoded: Comprehensive Multimodal Analytics at Scale},
  BOOKTITLE = {ACM International Conference on Knowledge Discovery and Data Mining},
  YEAR = {2024}
}

Abstract

In this paper, we tackle the complex task of analyzing televised debates, with a focus on a prime time news debate show from India. Previous methods, which often relied solely on text, fall short in capturing the multimedia essence of these debates [27]. To address this gap, we introduce a comprehensive automated toolkit that employs advanced computer vision and speech-to-text techniques for large-scale multimedia analysis. Utilizing state-of-the-art computer vision algorithms and speech-to-text methods, we transcribe, diarize, and analyze thousands of YouTube videos of prime-time television debates in India. These debates are a central part of Indian media but have been criticized for compromised journalistic integrity and excessive dramatization [18]. Our toolkit provides concrete metrics to assess bias and incivility, capturing a comprehensive multimedia perspective that includes text, audio utterances, and video frames. Our findings reveal significant biases in topic selection and panelist representation, along with alarming levels of incivility. This work offers a scalable, automated approach for future research in multimedia analysis, with profound implications for the quality of public discourse and democratic debate. We will make our data analysis pipeline and collected data publicly available to catalyze further research in this domain.