@inproceedings{bib_Imag_2024, AUTHOR = {Aniket Singh and Anoop Namboodiri}, TITLE = {Image Attribution by Generating Images}, BOOKTITLE = {International Conference on Acoustics, Speech, and Signal Processing}, YEAR = {2024}}
We introduce GPNN-CAM, a novel method for CNN explanation that bridges two distinct areas of computer vision: Image Attribution, which aims to explain a predictor by highlighting image regions it finds important, and Single Image Generation (SIG), which focuses on learning how to generate variations of a single sample. GPNN-CAM incorporates samples generated by Generative Patch Nearest Neighbors (GPNN) into a Class Activation Map (CAM) flavored attribution scheme. Our findings reveal that the incorporation of these samples yields remarkably effective results, enabling GPNN-CAM to demonstrate superior performance across multiple classifier architectures and datasets.
@inproceedings{bib_TRAQ_2024, AUTHOR = {Om Rajendra Kathalkar and Nitin Nilesh and Sachin Chaudhari and Anoop Namboodiri}, TITLE = {TRAQID - Traffic-Related Air Quality Image Dataset}, BOOKTITLE = {Indian Conference on Computer Vision, Graphics and Image Processing}, YEAR = {2024}}
Air quality estimation through sensor-based methods is widely used. Nevertheless, their frequent failures and maintenance challenges constrain the scalability of air pollution monitoring efforts. Recently, it has been demonstrated that air quality estimation can be done using image-based methods. These methods offer several advantages including ease of use, scalability, and low cost. However, the accuracy of these methods hinges significantly on the diversity and magnitude of the dataset utilized. The advancement of air quality estimation through image analysis has been limited due to the lack of available datasets. Addressing this gap, we present TRAQID - Traffic-Related Air Quality Image Dataset, a novel dataset capturing 26,678 front and rear images of traffic alongside co-located weather parameters, multiple levels of Particulate Matters (PM) and Air Quality Index (AQI) values. Spanning over multiple seasons, with over 70 hours of data collection in the twin cities of Hyderabad and Secunderabad, India, the TRAQID offers diverse day and night imagery amid unstructured traffic conditions, encompassing six AQI categories ranging from “Good” to “Severe”. State-of-the-art air quality estimation techniques, which were trained on a smaller and less-diverse dataset, showed poor results on the dataset presented in this paper. TRAQID models various uncertainty types, including seasonal changes, unstructured traffic patterns, and lighting conditions. The information from the two views (front and rear) of the traffic can be combined to improve the estimation performance in such challenging conditions. As such, the TRAQID serves as a benchmark for image-based air quality estimation tasks and AQI prediction, given its diversity and magnitude.
CLIP4Sketch: Enhancing Sketch to Mugshot Matching through Dataset Augmentation using Diffusion Models
@inproceedings{bib_CLIP_2024, AUTHOR = {Kushal Kumar Jain and Steve Grosz and Anoop Namboodiri and Anil K Jain}, TITLE = {CLIP4Sketch: Enhancing Sketch to Mugshot Matching through Dataset Augmentation using Diffusion Models}, BOOKTITLE = {International Joint Conference on Biometrics}, YEAR = {2024}}
Forensic sketch-to-mugshot matching is a challenging task in face recognition, primarily hindered by the scarcity of annotated forensic sketches and the modality gap between sketches and photographs. To address this, we propose CLIP4Sketch, a novel approach that leverages diffusion models to generate a large and diverse set of sketch images, which helps in enhancing the performance of face recognition systems in sketch-to-mugshot matching. Our method utilizes Denoising Diffusion Probabilistic Models (DDPMs) to generate sketches with explicit control over identity and style. We combine CLIP and Adaface embeddings of a reference mugshot, along with textual descriptions of style, as the conditions to the diffusion model. We demonstrate the efficacy of our approach by generating a comprehensive dataset of sketches corresponding to mugshots and training a face recognition model on our synthetic data. Our results show significant improvements in sketch-to-mugshot matching accuracy over training on an existing, limited amount of real face sketch data, validating the potential of diffusion models in enhancing the performance of face recognition systems across modalities. We also compare our dataset with datasets generated using GAN-based methods to show its superiority.
PS-StyleGAN: Illustrative Portrait Sketching using Attention-Based Style Adaptation
@inproceedings{bib_PS-S_2024, AUTHOR = {Kushal Kumar Jain and Ankith Varun J and Anoop Namboodiri}, TITLE = {PS-StyleGAN: Illustrative Portrait Sketching using Attention-Based Style Adaptation}, BOOKTITLE = {International Conference on Pattern Recognition}, YEAR = {2024}}
Portrait sketching involves capturing identity-specific attributes of a real face with abstract lines and shades. Unlike photo-realistic images, a good portrait sketch generation method needs selective attention to detail, making the problem challenging. This paper introduces Portrait Sketching StyleGAN (PS-StyleGAN), a style transfer approach tailored for portrait sketch synthesis. We leverage the semantic $W+$ latent space of StyleGAN to generate portrait sketches, allowing us to make meaningful edits, like pose and expression alterations, without compromising identity. To achieve this, we propose the use of Attentive Affine transform blocks in our architecture, and a training strategy that allows us to change StyleGAN's output without finetuning it. These blocks learn to modify style latent code by paying attention to both content and style latent features, allowing us to adapt the outputs of StyleGAN in an inversion-consistent manner. Our approach uses only a few paired examples ($\sim 100$) to model a style and has a short training time. We demonstrate PS-StyleGAN's superiority over the current state-of-the-art methods on various datasets, qualitatively and quantitatively.
Advancing Fingerprint Recognition Quality Assessment: Introducing the FRBQ Metric for Enhanced Fingerprint Recognition
@inproceedings{bib_Adva_2024, AUTHOR = {Prateek Jaiswal and Anoop Namboodiri}, TITLE = {Advancing Fingerprint Recognition Quality Assessment: Introducing the FRBQ Metric for Enhanced Fingerprint Recognition}, BOOKTITLE = {Indian Conference on Computer Vision, Graphics and Image Processing}, YEAR = {2024}}
In the field of biometric security, the quality assessment of fingerprint images is paramount for boosting the accuracy of fingerprint recognition systems. These systems are fundamental for the secure and efficient authentication and identification of individuals. Our research presents FRBQ (Fingerprint Recognition-Based Quality), an innovative quality metric designed to navigate the limitations of the NFIQ2 model. FRBQ exploits deep learning algorithms in a weakly supervised setting and utilizes matching scores from DeepPrint, a Fixed-Length Fingerprint Representation Model. Each score is paired with labels indicating the robustness of fingerprint image matches. However, in a fully referenced setting, these labels can be subjective, lacking a clear definition of what "image quality" inherently means. This weakly labeled approach strives to capture diverse perspectives on image quality, potentially making it a more encompassing metric. In comparison to NFIQ2, our research showcases the superior performance of the FRBQ model. It not only correlates better with recognition scores but also effectively evaluates challenging images that NFIQ2 struggles with. Validated by the esteemed FVC 2004 dataset, FRBQ proves its efficacy in fingerprint image quality assessment. This study underscores the transformative potential of AI in biometrics, emphasizing its capability to capture details that traditional methods might overlook. Our work stresses the critical role of precise quality assessment in the evolution of fingerprint recognition systems.
@inproceedings{bib_Enha_2024, AUTHOR = {Gavas Ekta Balkrishna and Kaustubh Olpadkar and Anoop Namboodiri}, TITLE = {Enhancement-Driven Pretraining for Robust Fingerprint Representation Learning}, BOOKTITLE = {International Conference on Computer Vision Theory and Applications}, YEAR = {2024}}
Fingerprint recognition stands as a pivotal component of biometric technology, with diverse applications from identity verification to advanced search tools. In this paper, we propose a unique method for deriving robust fingerprint representations by leveraging enhancement-based pre-training. Building on the achievements of U-Net-based fingerprint enhancement, our method employs a specialized encoder to derive representations from fingerprint images in a self-supervised manner. We further refine these representations, aiming to enhance the verification capabilities. Our experimental results, tested on publicly available fingerprint datasets, reveal a marked improvement in verification performance against established self-supervised training techniques. Our findings not only highlight the effectiveness of our method but also pave the way for potential advancements. Crucially, our research indicates that it is feasible to extract meaningful fingerprint representations from degraded images without relying on enhanced samples.
@inproceedings{bib_An_E_2023, AUTHOR = {Tushar Chandra and Anoop Namboodiri}, TITLE = {An Efficient Model for Non-Distal Presentation Attack Detection}, BOOKTITLE = {Indian Conference on Computer Vision, Graphics and Image Processing}, YEAR = {2023}}
The increasing prominence of fingerprint recognition as a biometric identifier has made it more vulnerable to presentation attacks, specifically non-distal attacks that exploit ridge and minutiae patterns found in non-distal phalanges. In this study, we present presentation attacks through non-distal/toe prints and a state-of-the-art lightweight inverted residual network that excels at differentiating between distal and non-distal prints, providing unrivaled performance in terms of accuracy, inference time, and false negative rate (FNR). Our proposed model surpasses other statistical machine learning methods, such as variable-margin SVM, and lightweight models like MobileNet v2, MobileNet v3, and ResNet18. We meticulously evaluate our model using a diverse array of datasets, including the NIST dataset, an in-house collected dataset, a toe dataset, a synthetic dataset generated by VeriFinger software, and a six-class dataset. To assess performance when only minutiae points are available, we develop an algorithm that converts fingerprints to minutiae points and subsequently reconstructs fingerprints. Furthermore, we examine the ridge density of distal and non-distal prints across datasets, emphasizing their similarities and underscoring the need for advanced detection techniques. To the best of our knowledge, this study represents the first endeavor to propose a solution for presentation attack detection in non-distal phalanges. Our research demonstrates various challenges of presentation attacks and the effectiveness of our approach, which holds the potential to significantly influence the domain of fingerprint recognition and security. By sharing our dataset, model, and experimental details with the research community, we aim to foster further advancements in this crucial area. Upon publication, we will make our dataset and experimental details available alongside the paper.
Ayan Biswas,Sai Amrit Patnaik,A. H. Abdul Hafez,Anoop Namboodiri
@inproceedings{bib_Char_2023, AUTHOR = {Ayan Biswas and Sai Amrit Patnaik and A. H. Abdul Hafez and Anoop Namboodiri}, TITLE = {Characterizing Face Recognition for Resource Efficient Deployment on Edge}, BOOKTITLE = {International Conference on Computer Vision Workshops}, YEAR = {2023}}
Deployment of Face Recognition systems on the edge has seen significant growth due to advancements in hardware design and efficient neural architectures. However, tailoring SOTA Face Recognition solutions to a specific edge device is still not easy and is vastly unexplored. Although benchmark data is available for some combinations of model, device, and framework, it is neither comprehensive nor scalable. We propose an approximation to determine the relationship between a model and its inference time in an edge deployment scenario. Using a small number of data points, we are able to predict the throughput of custom models in an explainable manner. The prediction errors are small enough to be considered noise in observations. We also analyze which approaches are most efficient and make better use of hardware in terms of accuracy and error rates to gain a better understanding of their behaviour. Related and necessary modules such as Face Anti-Spoofing are also analyzed. To the best of our knowledge, we are the first to tackle this issue directly. The data and code along with future updates to the models and hardware will be made available at https://github.com/AyanBiswas19/Resource_Efficient_FR.
@inproceedings{bib_S-BA_2023, AUTHOR = {Praguna Manvi and Desai Achintya Manohar and Srinathan Kannan and Anoop Namboodiri}, TITLE = {S-BAN: Secure Biometric Authentication using Noise}, BOOKTITLE = {Indian Conference on Computer Vision, Graphics and Image Processing}, YEAR = {2023}}
Biometric signal consisting of irrelevant or non-distinctive features can contain useful correlational properties that privacy-preserving verification schemes can exploit. While an efficient protocol for iris verification using noise has been presented, it is not applicable to other widely used modalities, i.e., face and fingerprint, since the methods of noise extraction and comparison are different. In this work, we design a verification protocol for secure dot product computation and also propose noise extraction mechanisms for face and fingerprint modalities. We evaluate the performance of the protocol on CFP, LFW, CelebA, FVC 2004 DB1A, DB2A, DB3A, and SOCOFing datasets. While the protocol exhibits a slight degradation in accuracy, it provides information-theoretic security with a practical computational complexity.
Finger-UNet: A U-Net Based Multi-Task Architecture for Deep Fingerprint Enhancement
@inproceedings{bib_Fing_2023, AUTHOR = {Gavas Ekta Balkrishna and Anoop Namboodiri}, TITLE = {Finger-UNet: A U-Net Based Multi-Task Architecture for Deep Fingerprint Enhancement}, BOOKTITLE = {International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applicat}, YEAR = {2023}}
For decades, fingerprint recognition has been prevalent for security, forensics, and other biometric applications. However, good-quality fingerprints are often hard to obtain, making recognition difficult. Fingerprint images might be degraded with a poor ridge structure and noisy or low-contrast backgrounds. Hence, fingerprint enhancement plays a vital role in the early stages of the fingerprint recognition/verification pipeline. In this paper, we investigate and improve the encoder-decoder style architecture and suggest intuitive modifications to U-Net to enhance low-quality fingerprints effectively. We investigate the use of Discrete Wavelet Transform (DWT) for fingerprint enhancement and use a wavelet attention module instead of max pooling, which proves advantageous for our task. Moreover, we replace regular convolutions with depthwise separable convolutions, which significantly reduces the memory footprint of the model without degrading the performance. We also demonstrate that incorporating domain knowledge through a fingerprint minutiae prediction task can improve fingerprint reconstruction through multi-task learning. Furthermore, we integrate the orientation estimation task to propagate the knowledge of ridge orientations and enhance the performance further. We present the experimental results and evaluate our model on FVC 2002 and NIST SD302 databases to show the effectiveness of our approach compared to previous works.
System and method for generating dewarped image using projection patterns captured from omni-directional stereo cameras
Rajat Aggarwal,Anoop Namboodiri,Parikshit Sakurikar
United States Patent, US patent, 2023
@inproceedings{bib_Syst_2023, AUTHOR = {Rajat Aggarwal and Anoop Namboodiri and Parikshit Sakurikar}, TITLE = {System and method for generating dewarped image using projection patterns captured from omni-directional stereo cameras}, BOOKTITLE = {United States Patent}, YEAR = {2023}}
A system for generating high-resolution de-warped omni-directional stereo image from captured omni-directional stereo image by correcting optical distortions using projection patterns is provided. The system includes a projection pattern capturing arrangement, a projector or a display, and a de-warping server. The projection pattern capturing arrangement includes one or more omni-directional cameras to capture projection patterns from the captured omni-directional stereo image from each omni-directional stereo camera. The projector or the display displays the projection patterns. The de-warping server obtains the projection patterns and processes the projection patterns to generate high-resolution de-warped omni-directional stereo image by correcting optical distortions in the captured omni-directional stereo image and mapping the captured omni-directional stereo image and the high-resolution de-warped omni-directional stereo image.
AdvGen: Physical Adversarial Attack on Face Presentation Attack Detection Systems
Sai Amrit Patnaik,Shivali Chansoriya,Anoop Namboodiri,Anil K. Jain
International Joint Conference on Biometrics, IJCBio, 2023
@inproceedings{bib_AdvG_2023, AUTHOR = {Sai Amrit Patnaik and Shivali Chansoriya and Anoop Namboodiri and Anil K. Jain}, TITLE = {AdvGen: Physical Adversarial Attack on Face Presentation Attack Detection Systems}, BOOKTITLE = {International Joint Conference on Biometrics}, YEAR = {2023}}
Evaluating the risk level of adversarial images is essential for safely deploying face authentication models in the real world. Popular approaches for physical-world attacks, such as print or replay attacks, suffer from limitations such as the introduction of physical and geometrical artifacts. Recently, adversarial attacks, which try to digitally deceive the learning strategy of a recognition system using slight modifications to the captured image, have gained traction. While most previous research assumes that the adversarial image could be digitally fed into the authentication systems, this is not always the case for systems deployed in the real world. This paper demonstrates the vulnerability of face authentication systems to adversarial images in physical world scenarios. We propose AdvGen, an automated Generative Adversarial Network, to simulate print and replay attacks and generate adversarial images that can fool state-of-the-art PADs in a physical domain attack setting. Using this attack strategy, the attack success rate goes up to 82.01%. We test AdvGen extensively on four datasets and ten state-of-the-art PADs. We also demonstrate the effectiveness of our attack by conducting experiments in a realistic, physical environment.
On Attribute Aware Open-set Face Verification
Arun Kumar Subramaniam,Anoop Namboodiri
International Conference on Computer Vision Theory and Applications, VISAPP, 2023
@inproceedings{bib_On_A_2023, AUTHOR = {Arun Kumar Subramaniam and Anoop Namboodiri}, TITLE = {On Attribute Aware Open-set Face Verification}, BOOKTITLE = {International Conference on Computer Vision Theory and Applications}, YEAR = {2023}}
Deep Learning on face recognition problems has shown extremely high accuracy owing to its ability to find strongly discriminating features. However, face images in the wild show variations in pose, lighting, expressions, and the presence of facial attributes (for example eyeglasses). We ask, why then are these variations not detected and used during the matching process? We demonstrate that this is indeed possible while restricting ourselves to facial attribute variation, to prove the case in point. We show two ways of doing so. a) By using the face attribute labels as a form of prior, we bin the matching template pairs into three bins depending on whether each template of the matching pair possesses a given facial attribute or not. By operating on each bin and averaging the result, we better the EER of SOTA by over 1% over a large set of matching pairs. b) We use the attribute labels and correlate them with each neuron of an embedding generated by a SOTA architecture pre-trained DNN on a large Face dataset and fine-tuned on face-attribute labels. We then suppress a set of maximally correlating neurons and perform matching after doing so. We demonstrate this improves the EER by over 2%.
Split and Knit: 3D Fingerprint Capture with a Single Camera
Apoorva Srivastava,Anoop Namboodiri
Indian Conference on Computer Vision, Graphics and Image Processing, ICVGIP, 2022
@inproceedings{bib_Spli_2022, AUTHOR = {Apoorva Srivastava and Anoop Namboodiri}, TITLE = {Split and Knit: 3D Fingerprint Capture with a Single Camera}, BOOKTITLE = {Indian Conference on Computer Vision, Graphics and Image Processing}, YEAR = {2022}}
3D fingerprint capture is less sensitive to skin moisture levels and avoids skin deformation, which is common in contact-based sensors, in addition to capturing depth information. Unfortunately, its adoption is limited due to high cost and system complexity. Photometric stereo provides an opportunity to build low-cost, simple sensors capable of high-quality 3D capture. However, it assumes that the surface being imaged is Lambertian (unlike our fingers). We introduce the Split and Knit algorithm (SnK), a 3D reconstruction pipeline based on photometric stereo for finger surfaces. It introduces an efficient way of estimating the direct illumination component, thus allowing us to do a higher-quality reconstruction of the entire finger surface. The algorithm also introduces a novel method to obtain the overall finger shape under NIR illumination, all using a single camera. Finally, we combine the overall finger shape and the ridge-valley point cloud to obtain a 3D finger phalange. The high-quality 3D reconstruction also results in better matching accuracy of the captured fingerprints.
Cycle Consistency Based Method for Learning Disentangled Representation for Stochastic Video Prediction
Ujjwal Tiwari,P Aditya Sreekar,Anoop Namboodiri
International Conference on Image Analysis and Processing, ICIAP, 2022
@inproceedings{bib_Cycl_2022, AUTHOR = {Ujjwal Tiwari and P Aditya Sreekar and Anoop Namboodiri}, TITLE = {Cycle Consistency Based Method for Learning Disentangled Representation for Stochastic Video Prediction}, BOOKTITLE = {International Conference on Image Analysis and Processing}, YEAR = {2022}}
Video frame prediction is an interesting computer vision problem of predicting the future frames of a video sequence from a given set of context frames. Video prediction models have found wide-scale prospective applications in autonomous navigation, representation learning, and healthcare. However, predicting future frames is challenging due to the high-dimensional and stochastic nature of video data. This work proposes a novel cycle consistency loss to disentangle video representation into low-dimensional time-dependent pose and time-independent content latent factors in two different VAE-based video prediction models. The key motivation behind cycle consistency loss is that future frame predictions are more plausible and realistic if they reconstruct the previous frames. The proposed cycle consistency loss is also generic because it can be applied to other VAE-based stochastic video prediction architectures with slight architectural modifications. We validate our disentanglement hypothesis and the quality of long-range predictions on standard synthetic and challenging real-world datasets such as Stochastic Moving MNIST and BAIR.
SYSTEM AND METHOD FOR GENERATING DEWARPED IMAGE USING PROJECTION PATTERNS CAPTURED FROM OMNI-DIRECTIONAL STEREO CAMERAS
Rajat Agarwal,Anoop Namboodiri,Parikshit Sakurikar
United States Patent, US patent, 2022
@inproceedings{bib_SYST_2022, AUTHOR = {Rajat Agarwal and Anoop Namboodiri and Parikshit Sakurikar}, TITLE = {SYSTEM AND METHOD FOR GENERATING DEWARPED IMAGE USING PROJECTION PATTERNS CAPTURED FROM OMNI-DIRECTIONAL STEREO CAMERAS}, BOOKTITLE = {United States Patent}, YEAR = {2022}}
A system for generating high-resolution de-warped omni-directional stereo image from captured omni-directional stereo image by correcting optical distortions using projection patterns is provided. The system includes a projection pattern capturing arrangement, a projector or a display, and a de-warping server. The projection pattern capturing arrangement includes one or more omni-directional cameras to capture projection patterns from the captured omni-directional stereo image from each omni-directional stereo camera. The projector or the display displays the projection patterns. The de-warping server obtains the projection patterns and processes the projection patterns to generate high-resolution de-warped omni-directional stereo image by correcting optical distortions in the captured omni-directional stereo image and mapping the captured omni-directional stereo image and the high-resolution de-warped omni-directional stereo image.
Transformer based Fingerprint Feature Extraction
Saraansh Tandon,Anoop Namboodiri
International Conference on Pattern Recognition, ICPR, 2022
@inproceedings{bib_Tran_2022, AUTHOR = {Saraansh Tandon and Anoop Namboodiri}, TITLE = {Transformer based Fingerprint Feature Extraction}, BOOKTITLE = {International Conference on Pattern Recognition}, YEAR = {2022}}
Fingerprint feature extraction is a task that is solved using either a global or a local representation. State-of-the-art global approaches use heavy deep learning models to process the full fingerprint image at once, which makes the corresponding approach memory intensive. On the other hand, local approaches involve minutiae based patch extraction, multiple feature extraction steps and an expensive matching stage, which make the corresponding approach time intensive. However, both these approaches provide useful and sometimes exclusive insights for solving the problem. Using both approaches together for extracting fingerprint representations is semantically useful but quite inefficient. Our convolutional transformer based approach with an in-built minutiae extractor provides a time and memory efficient solution to extract a global as well as a local representation of the fingerprint. The use of these …
One-Shot Sensor and Material Translator: A Bilinear Decomposer for Fingerprint Presentation Attack Generalization
Gowri Lekshmy,Anoop Namboodiri
International Joint Conference on Biometrics, IJCBio, 2022
@inproceedings{bib_One-_2022, AUTHOR = {Gowri Lekshmy and Anoop Namboodiri}, TITLE = {One-Shot Sensor and Material Translator: A Bilinear Decomposer for Fingerprint Presentation Attack Generalization}, BOOKTITLE = {International Joint Conference on Biometrics}, YEAR = {2022}}
Automatic fingerprint recognition systems are currently under the constant threat of presentation attacks (PAs). Existing fingerprint presentation attack detection (FPAD) solutions improve cross-sensor and cross-material generalization by utilizing style-transfer-based augmentation wrappers over a two-class PAD classifier. These solutions synthesize data by learning the style as a single entity, containing both sensor and material characteristics. However, these strategies necessitate learning the entire style upon adding a new sensor for an already known material or vice versa. We propose a bilinear decomposition-based wrapper called OSMT to improve cross-sensor and cross-material FPAD. OSMT uses one PA fingerprint to learn the corresponding sensor and material representations by disentanglement. Our approach also reduces the computational complexity by generating compact representations and utilizing fewer combinations of sensors and materials to produce several styles. We present the improvement in PAD performance using our technique on the publicly available LivDet datasets (2015, 2017, 2019 and 2021
SIAN: Secure Iris Authentication using Noise
Praguna Manvi,Achintya Desai,Srinathan Kannan,Anoop Namboodiri
International Joint Conference on Biometrics, IJCBio, 2022
@inproceedings{bib_SIAN_2022, AUTHOR = {Praguna Manvi and Achintya Desai and Srinathan Kannan and Anoop Namboodiri}, TITLE = {SIAN: Secure Iris Authentication using Noise}, BOOKTITLE = {International Joint Conference on Biometrics}, YEAR = {2022}}
Biometric noise is often discarded in many biometric template protection systems. However, the noise ratio between two templates encodes specific correlational properties that template protection schemes can exploit. Biometric authentication usually occurs between mutually distrusting parties, which calls for privacy-preserving techniques. In this paper, we propose a novel biometric authentication protocol, SIAN (Secure Iris Authentication using Noise), adapting secure two-party computation and incorporating uncertainty constraints from biometric noise for security. We evaluate it on three iris datasets: MMU v1, Ubiris v1, and IITD v1, and observe a low EER degradation. The proposed protocol has information-theoretic security and low computational complexity, making it suitable for practical real-time applications.
CLActive: Episodic Memories for Rapid Active Learning
Sidhant Subramanian,Sri Aurobindo Munagala,Shyamgopal Karthik,Ameya Prabhu,Anoop Namboodiri
Conference on Lifelong Learning Agents. PMLR, LLA@PMLR, 2022
@inproceedings{bib_CLAC_2022, AUTHOR = {Sidhant Subramanian and Sri Aurobindo Munagala and Shyamgopal Karthik and Ameya Prabhu and Anoop Namboodiri}, TITLE = {CLActive: Episodic Memories for Rapid Active Learning}, BOOKTITLE = {Conference on Lifelong Learning Agents. PMLR}, YEAR = {2022}}
Active Learning aims to alleviate labelling costs for large-scale datasets by selecting a subset of data to effectively train on. Deep Active Learning (DAL) techniques typically involve repeated training of a model for sample acquisition over the entire subset of labelled data available in each round. This can be prohibitively expensive to run in real-world scenarios with large and constantly growing data. Some work has been done to address this; notably, Selection via Proxy (SVP) proposed the use of a separate, smaller “proxy” model for acquisition. We explore further optimizations to the standard DAL setup and propose CLActive, an optimization procedure that brings significant speedups by maintaining a constant training time for the selection model across rounds and retaining information from past rounds using Experience Replay. We demonstrate large improvements in total train-time compared to the fully-trained baselines and SVP. We achieve up to 89×, 7×, 61× speedups over the fully-trained baseline at 50% of dataset collection in CIFAR, ImageNet and Amazon Review datasets, respectively, with little accuracy loss. We also show that CLActive is robust against catastrophic forgetting in a challenging class-incremental active-learning setting. Overall, we believe that CLActive can effectively enable rapid prototyping and deployment of deep AL algorithms in real-world use cases across a variety of settings.
System and method for capturing omni-stereo videos using multi-sensors
RAJAT AGGARWAL,Anoop Namboodiri
United States Patent, Us patent, 2022
@inproceedings{bib_Syst_2022, AUTHOR = {RAJAT AGGARWAL, Anoop Namboodiri}, TITLE = {System and method for capturing omni-stereo videos using multi-sensors}, BOOKTITLE = {United States Patent}. YEAR = {2022}}
A method of calibrating cameras used to collect images to form an omni-stereo image is disclosed. The method may comprise determining intrinsic and extrinsic camera parameters for each of a plurality of left eye cameras and right eye cameras arranged along a viewing circle or ellipse and angled tangentially with respect to the viewing circle or ellipse; categorizing left-right pairs of the plurality of left eye cameras and the plurality of right eye cameras into at least a first category, a second category or a third category; aligning the left-right pairs of cameras that fall into the first category; aligning the left-right pairs of cameras that fall into the second category; and aligning the left-right pairs of cameras that fall into the third category by using extrinsic parameters of the left-right pairs that fall into the first category, and of the left-right pairs that fall into the second category.
Supervision by Landmarks: An Enhanced Facial De-occlusion Network for VR-based Applications
Surabhi Gupta,Jinka Sai Sagar,Avinash Sharma,Anoop Namboodiri
European Conference on Computer Vision Workshops, ECCV-W, 2022
@inproceedings{bib_Supe_2022, AUTHOR = {Surabhi Gupta, Jinka Sai Sagar, Avinash Sharma, Anoop Namboodiri}, TITLE = {Supervision by Landmarks: An Enhanced Facial De-occlusion Network for VR-based Applications}, BOOKTITLE = {European Conference on Computer Vision Workshops}. YEAR = {2022}}
The face possesses a rich spatial structure that can provide valuable cues to guide various face-related tasks. The eyes are considered an important sociovisual cue for effective communication. They are an integral feature of facial expressions as they are an important aspect of interpersonal communication. However, virtual reality headsets occlude a significant portion of the face and restrict the visibility of certain facial features, particularly the eye region. Reproducing this region with realistic content and handling complex eye movements such as blinks is challenging. Previous facial inpainting methods are not capable enough to capture subtle eye movements. In view of this, we propose a working solution to refine the reconstructions, particularly around the eye region, by leveraging inherent eye structure. We introduce spatial supervision and a novel landmark predictor module to regularize per-frame reconstructions obtained from an existing image-based facial de-occlusion network. Experiments verify the usefulness of our approach in enhancing the quality of reconstructions to capture subtle eye movements.
A Unified Model for Fingerprint Authentication and Presentation Attack Detection
Additya Popli,Saraansh Tandon,Joshua J. Engelsma,Naoyuki Onoe,Atsushi Okubo,Anoop Namboodiri
International Joint Conference on Biometrics, IJCBio, 2021
@inproceedings{bib_A_Un_2021, AUTHOR = {Additya Popli, Saraansh Tandon, Joshua J. Engelsma, Naoyuki Onoe, Atsushi Okubo, Anoop Namboodiri}, TITLE = {A Unified Model for Fingerprint Authentication and Presentation Attack Detection}, BOOKTITLE = {International Joint Conference on Biometrics}. YEAR = {2021}}
Typical fingerprint recognition systems are comprised of a spoof detection module and a subsequent recognition module, running one after the other. In this paper, we reformulate the workings of a typical fingerprint recognition system. In particular, we posit that both spoof detection and fingerprint recognition are correlated tasks. Therefore, rather than performing the two tasks separately, we propose a joint model for spoof detection and matching to simultaneously perform both tasks without compromising the accuracy of either task. We demonstrate the capability of our joint model to obtain an authentication accuracy (1:1 matching) of TAR = 100% @ FAR = 0.1% on the FVC 2006 DB2A dataset while achieving a spoof detection ACE of 1.44% on the LivDet 2015 dataset, both maintaining the performance of stand-alone methods. In practice, this reduces the time and memory requirements of the fingerprint recognition system by 50% and 40%, respectively; a significant advantage for recognition systems running on resource-constrained devices and communication channels.
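The two figures quoted above, TAR at a fixed FAR and ACE, are standard biometric metrics. The sketch below shows one common way to compute them from match scores and liveness decisions; the function names and the toy scores are illustrative assumptions and are not taken from the paper's evaluation code.

```python
import numpy as np

def tar_at_far(genuine, impostor, far=0.001):
    """True accept rate at the strictest threshold whose false accept rate is <= far."""
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.sort(np.asarray(impostor, dtype=float))[::-1]  # descending
    k = int(np.floor(far * impostor.size))  # impostor scores allowed above threshold
    # Scores strictly above the (k+1)-th highest impostor score are accepted.
    threshold = impostor[k] if k < impostor.size else -np.inf
    return float(np.mean(genuine > threshold)), float(threshold)

def average_classification_error(is_live, predicted_live):
    """ACE: mean of the live-rejected rate and the spoof-accepted rate."""
    is_live = np.asarray(is_live, dtype=bool)
    predicted_live = np.asarray(predicted_live, dtype=bool)
    ferr_live = np.mean(~predicted_live[is_live])   # live samples called spoof
    ferr_fake = np.mean(predicted_live[~is_live])   # spoof samples called live
    return float(0.5 * (ferr_live + ferr_fake))

# Toy scores (made up for illustration only).
tar, thr = tar_at_far([0.9, 0.8, 0.7, 0.2], [0.1, 0.3, 0.5, 0.05], far=0.25)
ace = average_classification_error([1, 1, 0, 0], [1, 0, 1, 1])
print(tar, thr, ace)
```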
System and method for capturing omni-stereo videos using multi-sensors
RAJAT AGGARWAL,Anoop Namboodiri
United States Patent, Us patent, 2021
@inproceedings{bib_Syst_2021, AUTHOR = {RAJAT AGGARWAL, Anoop Namboodiri}, TITLE = {System and method for capturing omni-stereo videos using multi-sensors}, BOOKTITLE = {United States Patent}. YEAR = {2021}}
A system for capturing Omni-Stereo videos using multi sensor includes left cameras, right cameras and a viewing circle. A method of capturing omni stereo videos using multi-sensor approach includes steps of: capturing images of a scene using left cameras, capturing images of a scene using right cameras, processing each image from the left camera and right camera using a computation method, and obtaining a final omni stereo frame through the computation method.
Region Pooling with Adaptive Feature Fusion for End-to-End Person Recognition
Guntireddy Vijay Kumar,Anoop Namboodiri,Jawahar C V
Winter Conference on Applications of Computer Vision, WACV, 2020
@inproceedings{bib_Regi_2020, AUTHOR = {Guntireddy Vijay Kumar, Anoop Namboodiri, Jawahar C V}, TITLE = {Region Pooling with Adaptive Feature Fusion for End-to-End Person Recognition}, BOOKTITLE = {Winter Conference on Applications of Computer Vision}. YEAR = {2020}}
Current approaches for person recognition train an ensemble of region specific convolutional neural networks for representation learning, and then adopt naive fusion strategies to combine their features or predictions during testing. In this paper, we propose a unified end-to-end architecture that generates a complete person representation based on pooling and aggregation of features from multiple body regions. Our network takes a person image and the predetermined locations of body regions as input, and generates common feature maps that are shared across all the regions. Multiple features corresponding to different regions are then pooled and combined with an aggregation block, where the adaptive weights required for aggregation are obtained through an attention mechanism. Evaluations on three person recognition datasets - PIPA, Soccer and Hannah show that a single model trained end-to-end is computationally faster, requires fewer parameters and achieves improved performance over separately trained models.
Reducing the Variance of Variational Estimates of Mutual Information by Limiting the Critic's Hypothesis Space to RKHS
P ADITYA SREEKAR,Ujjwal Tiwari,Anoop Namboodiri
International conference on Pattern Recognition, ICPR, 2020
@inproceedings{bib_Redu_2020, AUTHOR = {P ADITYA SREEKAR, Ujjwal Tiwari, Anoop Namboodiri}, TITLE = {Reducing the Variance of Variational Estimates of Mutual Information by Limiting the Critic's Hypothesis Space to RKHS}, BOOKTITLE = {International conference on Pattern Recognition}. YEAR = {2020}}
Mutual information (MI) is an information-theoretic measure of dependency between two random variables. Several methods to estimate MI from samples of two random variables with unknown underlying probability distributions have been proposed in the literature. Recent methods realize parametric probability distributions or critic as a neural network to approximate unknown density ratios. The approximated density ratios are used to estimate different variational lower bounds of MI. While these methods provide reliable estimation when the true MI is low, they produce high variance estimates in cases of high MI. We argue that the high variance characteristic is due to the uncontrolled complexity of the critic's hypothesis space. In support of this argument, we use the data-driven Rademacher complexity of the hypothesis space associated with the critic's architecture to analyse the generalization error bound of variational lower bound estimates of MI. In the proposed work, we show that it is possible to negate the high variance characteristics of these estimators by constraining the critic's hypothesis space to a Reproducing Kernel Hilbert Space (RKHS), which corresponds to a kernel learned using Automated Spectral Kernel Learning (ASKL). By analysing the aforementioned generalization error bounds, we augment the overall optimisation objective with an effective regularisation term. We empirically demonstrate the efficacy of this regularization in enforcing a proper bias-variance tradeoff on four variational lower bounds, namely NWJ, MINE, JS and SMILE.
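To make two of the variational lower bounds named above concrete, the sketch below evaluates the Donsker-Varadhan bound (the one used by MINE) and the NWJ bound on correlated Gaussian samples. It is only an illustration: the critic is the analytically known log density ratio of the bivariate Gaussian, not a neural network or an RKHS-constrained critic, and the sample size and correlation are arbitrary assumptions.

```python
import numpy as np

def dv_bound(f_joint, f_marg):
    """Donsker-Varadhan bound: E_p[f] - log E_q[exp(f)]."""
    return float(np.mean(f_joint) - np.log(np.mean(np.exp(f_marg))))

def nwj_bound(f_joint, f_marg):
    """NWJ bound: E_p[f] - E_q[exp(f - 1)]."""
    return float(np.mean(f_joint) - np.mean(np.exp(f_marg - 1.0)))

def gaussian_log_ratio(x, y, rho):
    """log p(x, y) / (p(x) p(y)) for standard normals with correlation rho."""
    s = 1.0 - rho ** 2
    return -0.5 * np.log(s) - (x**2 - 2 * rho * x * y + y**2) / (2 * s) + (x**2 + y**2) / 2

rng = np.random.default_rng(0)
rho, n = 0.5, 20000
x = rng.standard_normal(n)
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)
y_shuffled = rng.permutation(y)  # breaks the pairing: samples from p(x) p(y)

true_mi = -0.5 * np.log(1 - rho**2)
f_joint = gaussian_log_ratio(x, y, rho)
f_marg = gaussian_log_ratio(x, y_shuffled, rho)

mine_est = dv_bound(f_joint, f_marg)
nwj_est = nwj_bound(f_joint + 1.0, f_marg + 1.0)  # optimal NWJ critic is 1 + log ratio
print(true_mi, mine_est, nwj_est)
```

Both bounds contain an average of exponentiated critic values over the marginal samples; that term is a natural place for the high variance discussed in the abstract to show up as the true MI grows.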
Mutual Information Based Method for Unsupervised Disentanglement of Video Representation
P ADITYA SREEKAR,Ujjwal Tiwari,Anoop Namboodiri
International conference on Pattern Recognition, ICPR, 2020
@inproceedings{bib_Mutu_2020, AUTHOR = {P ADITYA SREEKAR, Ujjwal Tiwari, Anoop Namboodiri}, TITLE = {Mutual Information Based Method for Unsupervised Disentanglement of Video Representation}, BOOKTITLE = {International conference on Pattern Recognition}. YEAR = {2020}}
Video Prediction is an interesting and challenging task of predicting future frames from a given set of context frames that belong to a video sequence. Video prediction models have found prospective applications in Maneuver Planning, Health care, Autonomous Navigation and Simulation. One of the major challenges in future frame generation is due to the high dimensional nature of visual data. In this work, we propose the Mutual Information Predictive Auto-Encoder (MIPAE) framework, which simplifies the task of predicting high dimensional video frames by factorising video representations into content and low dimensional pose latent variables that are easy to predict. A standard LSTM network is used to predict these low dimensional pose representations. Content and the predicted pose representations are decoded to generate future frames. Our approach leverages the temporal structure of the latent generative factors of a video and a novel mutual information loss to learn disentangled video representations. We also propose a metric based on mutual information gap (MIG) to quantitatively assess the effectiveness of disentanglement on DSprites and MPI3D-real datasets. MIG scores corroborate the visual superiority of frames predicted by MIPAE. We also compare our method quantitatively on evaluation metrics LPIPS, SSIM and PSNR.
Towards Accurate Vehicle Behaviour Classification With Multi-Relational Graph Convolutional Networks
Mylavarapu Venkata Sai Sravan,MAHTAB SANDHU,Priyesh Vijayan,K Madhava Krishna,Balaraman Ravindran,Anoop Namboodiri
Intelligent Vehicles symposium, IV, 2020
@inproceedings{bib_Towa_2020, AUTHOR = {Mylavarapu Venkata Sai Sravan, MAHTAB SANDHU, Priyesh Vijayan, K Madhava Krishna, Balaraman Ravindran, Anoop Namboodiri}, TITLE = {Towards Accurate Vehicle Behaviour Classification With Multi-Relational Graph Convolutional Networks}, BOOKTITLE = {Intelligent Vehicles symposium}. YEAR = {2020}}
Understanding on-road vehicle behaviour from a temporal sequence of sensor data is gaining in popularity. In this paper, we propose a pipeline for understanding vehicle behaviour from a monocular image sequence or video. A monocular sequence along with scene semantics, optical flow and object labels are used to get spatial information about the object (vehicle) of interest and other objects (semantically contiguous set of locations) in the scene. This spatial information is encoded by a Multi-Relational Graph Convolutional Network (MR-GCN), and a temporal sequence of such encodings is fed to a recurrent network to label vehicle behaviours. The proposed framework can classify a variety of vehicle behaviours to high fidelity on datasets that are diverse and include European, Chinese and Indian on-road scenes. The framework also provides for seamless transfer of models across datasets without entailing re-annotation, retraining and even fine-tuning. We show comparative performance gain over baseline Spatio-temporal classifiers and detail a variety of ablations to showcase the efficacy of the framework.
Understanding Dynamic Scenes using Graph Convolution Networks
Mylavarapu Venkata Sai Sravan,MAHTAB SANDHU,K Madhava Krishna,Balaraman Ravindran,Anoop Namboodiri
International Conference on Intelligent Robots and Systems, IROS, 2020
@inproceedings{bib_Unde_2020, AUTHOR = {Mylavarapu Venkata Sai Sravan, MAHTAB SANDHU, K Madhava Krishna, Balaraman Ravindran, Anoop Namboodiri}, TITLE = {Understanding Dynamic Scenes using Graph Convolution Networks}, BOOKTITLE = {International Conference on Intelligent Robots and Systems}. YEAR = {2020}}
We present a novel Multi Relational Graph Convolutional Network (MRGCN) to model on-road vehicle behaviours from a sequence of temporally ordered frames as grabbed by a moving monocular camera. The input to MRGCN is a Multi Relational Graph (MRG) where the nodes of the graph represent the active and passive participants/agents in the scene while the bidirectional edges that connect every pair of nodes are encodings of the spatio-temporal relations. The bidirectional edges of the graph encode the temporal interactions between the agents that constitute the two nodes of the edge. The proposed method of obtaining this encoding is shown to be specifically suited for the problem at hand as it outperforms more complex end-to-end learning methods that do not use such intermediate representations of evolved spatio-temporal relations between agent pairs. We show significant performance gain in the form of behaviour classification accuracy on a variety of datasets from different parts of the globe over prior methods as well as show seamless transfer without any resort to fine-tuning across multiple datasets. Such behaviour prediction methods find immediate relevance in a variety of navigation tasks such as behaviour planning, state estimation as well as in applications relating to detection of traffic violations over videos.
CineFilter: Unsupervised Filtering for Real Time Autonomous Camera Systems
ACHARY SUDHEER,Kommu Lakshmi Bhanu Moorthy,Ashar Javed,P Nikitha Shravan,Vineet Gandhi,Anoop Namboodiri
Eurographics Workshop on Intelligent Cinematography and Editing, WICED, 2020
@inproceedings{bib_Cine_2020, AUTHOR = {ACHARY SUDHEER, Kommu Lakshmi Bhanu Moorthy, Ashar Javed, P Nikitha Shravan, Vineet Gandhi, Anoop Namboodiri}, TITLE = {CineFilter: Unsupervised Filtering for Real Time Autonomous Camera Systems}, BOOKTITLE = {Eurographics Workshop on Intelligent Cinematography and Editing}. YEAR = {2020}}
Autonomous camera systems are often subjected to an optimization/filtering operation to smoothen and stabilize the rough trajectory estimates. Most common filtering techniques do reduce the irregularities in data; however, they fail to mimic the behavior of a human cameraman. Global filtering methods modeling human camera operators have been successful; however, they are limited to offline settings. In this paper, we propose two online filtering methods called Cinefilters, which produce smooth camera trajectories that are motivated by cinematographic principles. The first filter (CineConvex) uses a sliding window-based convex optimization formulation, and the second (CineCNN) is a CNN based encoder-decoder model. We evaluate the proposed filters in two different settings, namely a basketball dataset and a stage performance dataset. Our models outperform previous methods and baselines on both quantitative and qualitative metrics. The CineConvex and CineCNN filters operate at about 250fps and 1000fps, respectively, with a minor latency (half a second), making them apt for a variety of real-time applications.
IDD: A dataset for exploring problems of autonomous navigation in unconstrained environments
Girish Varma,Anbumani Subramanian,Manmohan Chandraker,Anoop Namboodiri,Jawahar C V
Winter Conference on Applications of Computer Vision, WACV, 2019
@inproceedings{bib_IDD:_2019, AUTHOR = {Girish Varma, Anbumani Subramanian, Manmohan Chandraker, Anoop Namboodiri, Jawahar C V}, TITLE = {IDD: A dataset for exploring problems of autonomous navigation in unconstrained environments}, BOOKTITLE = {Winter Conference on Applications of Computer Vision}. YEAR = {2019}}
While several datasets for autonomous navigation have become available in recent years, they have tended to focus on structured driving environments. This usually corresponds to well-delineated infrastructure such as lanes, a small number of well-defined categories for traffic participants, low variation in object or background appearance and strong adherence to traffic rules. We propose IDD, a novel dataset for road scene understanding in unstructured environments where the above assumptions are largely not satisfied. It consists of 10,004 images, finely annotated with 34 classes collected from 182 drive sequences on Indian roads. The label set is expanded in comparison to popular benchmarks such as Cityscapes, to account for new classes. It also reflects label distributions of road scenes significantly different from existing datasets, with most classes displaying greater within-class diversity. Consistent with …
Universal Material Translator: Towards Spoof Fingerprint Generalization
ROHIT KUMAR GAJAWADA,Additya Popli,Tarang Chugh,Anoop Namboodiri,Anil K. Jain
International conference on Biometrics, IJCB, 2019
@inproceedings{bib_Univ_2019, AUTHOR = {ROHIT KUMAR GAJAWADA, Additya Popli, Tarang Chugh, Anoop Namboodiri, Anil K. Jain}, TITLE = {Universal Material Translator: Towards Spoof Fingerprint Generalization}, BOOKTITLE = {International conference on Biometrics}. YEAR = {2019}}
Spoof detectors are classifiers that are trained to distinguish spoof fingerprints from bona fide ones. However, state-of-the-art spoof detectors do not generalize well to unseen spoof materials. This study proposes a style transfer based augmentation wrapper that can be used on any existing spoof detector and can dynamically improve the robustness of the spoof detection system on spoof materials for which very little data is available. Our method synthesizes new spoof images from a few spoof examples by transferring the style or material properties of the spoof examples to the content of bona fide fingerprints, generating a larger number of examples to train the classifier on. We demonstrate the effectiveness of our approach on materials in the publicly available LivDet 2015 dataset and show that the proposed approach leads to robustness to fingerprint spoofs of the target material.
System and method for capturing omni-stereo videos using multi-sensors
Rajat Aggarwal,Anoop Namboodiri
@inproceedings{bib_Syst_2019, AUTHOR = {Rajat Aggarwal, Anoop Namboodiri}, TITLE = {System and method for capturing omni-stereo videos using multi-sensors}, BOOKTITLE = {}. YEAR = {2019}}
A system and method for capturing Omni-Stereo videos using multi-sensor is disclosed. The system includes left cameras, right cameras and a viewing circle. The method of capturing omni stereo videos using multi-sensor approach includes steps of: capturing images of a scene using left cameras, capturing images of a scene using right cameras, processing each image from the left camera and right camera using a computation method, and obtaining a final omni stereo frame through the computation method.
Semi-Adversarial Networks: Convolutional Autoencoders for Imparting Privacy to Face Images
Vahid Mirjalili,Sebastian Raschka,Anoop Namboodiri,Arun Ross
International conference on Biometrics, IJCB, 2018
@inproceedings{bib_Semi_2018, AUTHOR = {Vahid Mirjalili, Sebastian Raschka, Anoop Namboodiri, Arun Ross}, TITLE = {Semi-Adversarial Networks: Convolutional Autoencoders for Imparting Privacy to Face Images}, BOOKTITLE = {International conference on Biometrics}. YEAR = {2018}}
In this paper, we design and evaluate a convolutional autoencoder that perturbs an input face image to impart privacy to a subject. Specifically, the proposed autoencoder transforms an input face image such that the transformed image can be successfully used for face recognition but not for gender classification. In order to train this autoencoder, we propose a novel training scheme, referred to as semi-adversarial training in this work. The training is facilitated by attaching a semi-adversarial module consisting of an auxiliary gender classifier and an auxiliary face matcher to the autoencoder. The objective function utilized for training this network has three terms: one to ensure that the perturbed image is a realistic face image; another to ensure that the gender attributes of the face are confounded; and a third to ensure that biometric recognition performance due to the perturbed image is not impacted. Extensive experiments confirm the efficacy of the proposed architecture in extending gender privacy to face images.
Semi-supervised annotation of faces in image collection
VIJAYA KUMAR R,Anoop Namboodiri,Jawahar C V
Signal, Image and Video Processing, SIViP, 2018
@inproceedings{bib_Semi_2018, AUTHOR = {VIJAYA KUMAR R, Anoop Namboodiri, Jawahar C V}, TITLE = {Semi-supervised annotation of faces in image collection}, BOOKTITLE = {Signal, Image and Video Processing}. YEAR = {2018}}
Deep expander networks: Efficient deep networks from graph theory
PRABHU AMEYA PANDURANG,Girish Varma,Anoop Namboodiri
European Conference on Computer Vision, ECCV, 2018
@inproceedings{bib_Deep_2018, AUTHOR = {PRABHU AMEYA PANDURANG, Girish Varma, Anoop Namboodiri}, TITLE = {Deep expander networks: Efficient deep networks from graph theory}, BOOKTITLE = {European Conference on Computer Vision}. YEAR = {2018}}
Efficient CNN designs like ResNets and DenseNet were proposed to improve accuracy vs efficiency trade-offs. They essentially increased the connectivity, allowing efficient information flow across layers. Inspired by these techniques, we propose to model connections between filters of a CNN using graphs which are simultaneously sparse and well connected. Sparsity results in efficiency while well connectedness can preserve the expressive power of the CNNs. We use a well-studied class of graphs from theoretical computer science that satisfies these properties known as Expander graphs. Expander graphs are used to model connections between filters in CNNs to design networks called X-Nets. We present two guarantees on the connectivity of X-Nets: Each node influences every node in a layer in logarithmic steps, and the number of paths between two sets of nodes is proportional to the product of their sizes. We also propose efficient training and inference algorithms, making it possible to train deeper and wider X-Nets effectively. Expander based models give a 4% improvement in accuracy on MobileNet over grouped convolutions, a popular technique, which has the same sparsity but worse connectivity. X-Nets give better performance trade-offs than the original ResNet and DenseNet-BC architectures. We achieve model sizes comparable to state-of-the-art pruning techniques using our simple architecture design, without any pruning. We hope that this work motivates other approaches to utilize results from graph theory to develop efficient network architectures.
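The contrast drawn above between expander-style connectivity and grouped convolutions (same sparsity, different connectivity) can be illustrated with connectivity masks over channels. The sketch assumes, for illustration only, that each output channel is wired to a fixed number of input channels chosen uniformly at random; the layer sizes, the degree and the helper names are hypothetical and do not reproduce the paper's construction or training code.

```python
import numpy as np

def random_sparse_mask(n_out, n_in, degree, seed=0):
    """Each output channel connects to `degree` randomly chosen input channels."""
    rng = np.random.default_rng(seed)
    mask = np.zeros((n_out, n_in), dtype=bool)
    for o in range(n_out):
        mask[o, rng.choice(n_in, size=degree, replace=False)] = True
    return mask

def grouped_mask(n, group_size):
    """Block-diagonal mask: channels only connect within their own group."""
    groups = np.arange(n) // group_size
    return groups[:, None] == groups[None, :]

def two_layer_reach(mask):
    """Fraction of (output, input) pairs connected by a path through two layers."""
    paths = mask.astype(int) @ mask.astype(int)
    return float(np.mean(paths > 0))

n, degree = 64, 8
expander_like = random_sparse_mask(n, n, degree)
grouped = grouped_mask(n, degree)

print(expander_like.mean(), grouped.mean())  # identical sparsity
print(two_layer_reach(expander_like), two_layer_reach(grouped))
```

Both masks keep one eighth of the connections, but stacking two grouped layers never mixes information across groups, whereas the randomly wired mask reaches a larger share of inputs after two layers.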
Distribution-aware binarization of neural networks for sketch recognition
PRABHU AMEYA PANDURANG,BATCHU VENKAT VISHAL,SRI AUROBINDO MUNAGALA,ROHIT KUMAR GAJAWADA,Anoop Namboodiri
Winter Conference on Applications of Computer Vision, WACV, 2018
@inproceedings{bib_Dist_2018, AUTHOR = {PRABHU AMEYA PANDURANG, BATCHU VENKAT VISHAL, SRI AUROBINDO MUNAGALA, ROHIT KUMAR GAJAWADA, Anoop Namboodiri}, TITLE = {Distribution-aware binarization of neural networks for sketch recognition}, BOOKTITLE = {Winter Conference on Applications of Computer Vision}. YEAR = {2018}}
Deep neural networks are highly effective at a range of computational tasks. However, they tend to be computationally expensive, especially in vision-related problems, and also have large memory requirements. One of the most effective methods to achieve significant improvements in computational/spatial efficiency is to binarize the weights and activations in a network. However, naive binarization results in accuracy drops when applied to networks for most tasks. In this work, we present a highly generalized, distribution-aware approach to binarizing deep networks that allows us to retain the advantages of a binarized network, while reducing accuracy drops. We also develop efficient implementations for our proposed approach across different architectures. We present a theoretical analysis of the technique to show the effective representational power of the resulting layers, and explore the forms of data they …
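For context, the naive binarization that the abstract contrasts itself with can be sketched as replacing a weight tensor by its signs times a single scaling factor. The sketch below uses the mean absolute weight as that factor, which is the least-squares optimal scale for a sign matrix; this is an assumed baseline for illustration and not the distribution-aware method proposed in the paper.

```python
import numpy as np

def binarize(weights):
    """Approximate weights by alpha * sign(weights) with alpha = mean(|weights|)."""
    w = np.asarray(weights, dtype=float)
    alpha = float(np.mean(np.abs(w)))       # least-squares optimal scale for sign(w)
    signs = np.where(w >= 0, 1.0, -1.0)     # every entry becomes +1 or -1
    return alpha, signs

rng = np.random.default_rng(0)
w = rng.standard_normal((16, 16))           # stand-in for a layer's weights
alpha, signs = binarize(w)
approx = alpha * signs

# Relative reconstruction error shows how much information one bit per weight loses.
rel_error = float(np.linalg.norm(w - approx) / np.linalg.norm(w))
print(alpha, rel_error)
```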
Hybrid binary networks: optimizing for accuracy, efficiency and memory
PRABHU AMEYA PANDURANG,BATCHU VENKAT VISHAL,ROHIT KUMAR GAJAWADA,SRI AUROBINDO MUNAGALA,Anoop Namboodiri
Winter Conference on Applications of Computer Vision, WACV, 2018
@inproceedings{bib_Hybr_2018, AUTHOR = {PRABHU AMEYA PANDURANG, BATCHU VENKAT VISHAL, ROHIT KUMAR GAJAWADA, SRI AUROBINDO MUNAGALA, Anoop Namboodiri}, TITLE = {Hybrid binary networks: optimizing for accuracy, efficiency and memory}, BOOKTITLE = {Winter Conference on Applications of Computer Vision}. YEAR = {2018}}
Binarization is an extreme network compression approach that provides large computational speedups along with energy and memory savings, albeit at significant accuracy costs. We investigate the question of where to binarize inputs at layer-level granularity and show that selectively binarizing the inputs to specific layers in the network could lead to significant improvements in accuracy while preserving most of the advantages of binarization. We analyze the binarization tradeoff using a metric that jointly models the input binarization-error and computational cost and introduce an efficient algorithm to select layers whose inputs are to be binarized. Practical guidelines based on insights obtained from applying the algorithm to a variety of models are discussed. Experiments on the ImageNet dataset using AlexNet and ResNet-18 models show 3-4% improvements in accuracy over fully binarized networks with minimal
Cross-modal style transfer
SAHIL CHELARAMANI,Abhishek Jha,Anoop Namboodiri
International Conference on Image Processing, ICIP, 2018
@inproceedings{bib_Cros_2018, AUTHOR = {SAHIL CHELARAMANI, Abhishek Jha, Anoop Namboodiri}, TITLE = {Cross-modal style transfer}, BOOKTITLE = {International Conference on Image Processing}. YEAR = {2018}}
We, humans, have the ability to easily imagine scenes that depict sentences such as “Today is a beautiful sunny day” or “There is a Christmas feel, in the air”. While it is hard to precisely describe what one person may imagine, the essential high-level themes associated with such sentences largely remain the same. The ability to synthesize novel images that depict the feel of a sentence is very useful in a variety of applications such as education, advertisement, and entertainment. While existing papers tackle this problem given a style image, we aim to provide a far more intuitive and easy to use solution that synthesizes novel renditions of an existing image, conditioned on a given sentence. We present a method for cross-modal style transfer between an English sentence and an image, to produce a new image that imbibes the essential theme of the sentence. We do this by modifying the style transfer mechanism
System and method for capturing horizontal disparity stereo panorama
Anoop Namboodiri,Rajat Aggarwal ,Amrisha Vohra
@inproceedings{bib_Syst_2018, AUTHOR = {Anoop Namboodiri, Rajat Aggarwal , Amrisha Vohra}, TITLE = {System and method for capturing horizontal disparity stereo panorama}, BOOKTITLE = {}. YEAR = {2018}}
A system for capturing horizontal disparity stereo panorama is disclosed. The system includes a multi surface selective light reflector unit, a secondary reflector and a computing unit. The multi surface selective light reflector unit (a) obtains light rays from a 3D scene of outside world that are relevant to create (i) a left eye panorama and (ii) a right eye panorama and (b) reflects the light rays without internal reflections between the light rays. The secondary reflector (a) obtains the reflected light rays from the multi surface selective light reflector unit and (b) reflects the light rays through the viewing aperture. The computing unit captures (i) the reflected light rays from the secondary reflector and (ii) the upper part of the 3D scene from a concave lens as a warped image and processes the warped image to (a) the left eye panorama and (b) the right eye panorama.
Learning Deep and Compact Models for Gesture Recognition
KOUSTAV MULLICK,Anoop Namboodiri
International Conference on Image Processing, ICIP, 2017
@inproceedings{bib_Lear_2017, AUTHOR = {KOUSTAV MULLICK, Anoop Namboodiri}, TITLE = {Learning Deep and Compact Models for Gesture Recognition}, BOOKTITLE = {International Conference on Image Processing}. YEAR = {2017}}
We look at the problem of developing a compact and accurate model for gesture recognition from videos in a deep-learning framework. Towards this we propose a joint 3DCNN-LSTM model that is end-to-end trainable and is shown to be better suited to capture the dynamic information in actions. The solution achieves close to state-of-the-art accuracy on the ChaLearn dataset, with only half the model size. We also explore ways to derive a much more compact representation in a knowledge distillation framework followed by model compression. The final model is less than 1 MB in size, which is less than one hundredth of our initial model, with a drop of 7% in accuracy, and is suitable for real-time gesture recognition on mobile devices.
Multiscale Two-view Stereo using Convolutional Neural Networks for Unrectified Images
Y N PRAMOD,Anoop Namboodiri
Machine Vision and Applications, MVA, 2017
@inproceedings{bib_Mult_2017, AUTHOR = {Y N PRAMOD, Anoop Namboodiri}, TITLE = {Multiscale Two-view Stereo using Convolutional Neural Networks for Unrectified Images}, BOOKTITLE = {Machine Vision and Applications}. YEAR = {2017}}
The two-view stereo problem is a well-researched problem in 3D computer vision. Algorithms proposed in the past have focused on rectified stereo images where the epipolar lines are parallel to the horizontal axis. The general problem of computing stereo correspondences for unrectified images without any knowledge of calibration parameters is an important problem but unexplored as yet. Our idea in this paper is to predict depth maps from two unrectified stereo images using a modified Flownet architecture. Since datasets for depth map reconstruction for unrectified stereo images for deep learning do not exist, we have created a dataset of turntable sequences of 3D models from Google 3D Warehouse. Following the concepts of attention modelling, we implement an architecture for combining correlations computed at multiple resolutions using a simple element-wise multiplication of the correlations to aid the architecture to resolve correspondences for textureless and repeated textured surfaces. Our experiments show both qualitative and quantitative improvements of depth maps over the original Flownet architecture.
Words speak for Actions: Using Text to find Video Highlights
KUDI SUKANYA HANMANTRAO,Anoop Namboodiri
Asian Conference on Pattern Recognition, ACPR, 2017
@inproceedings{bib_Word_2017, AUTHOR = {KUDI SUKANYA HANMANTRAO, Anoop Namboodiri}, TITLE = {Words speak for Actions: Using Text to find Video Highlights}, BOOKTITLE = {Asian Conference on Pattern Recognition}. YEAR = {2017}}
Video highlights are a selection of the most interesting parts of a video. The problem of highlight detection has been explored for video domains like egocentric, sports, movies, and surveillance videos. Existing methods are limited to finding visually important parts of the video but do not necessarily learn semantics. Moreover, the available benchmark datasets contain audio-muted, single-activity, short videos, which lack any context apart from a few keyframes that can be used to understand them. In this work, we explore highlight detection in the TV series domain, which features complex interactions with the surroundings. The existing methods would fare poorly in capturing the video semantics in such videos. To incorporate the importance of dialogues/audio, we propose using the descriptions of shots of the video as cues to learning visual importance. Note that while the audio information is used to determine visual importance during training, the highlight detection still works using only the visual information from videos. We use publicly available text ranking algorithms to rank the descriptions. The ranking scores are used to train a visual pairwise shot ranking model (VPSR) to find the highlights of the video. The results are reported on TV series videos of the Video Set dataset and a season of the Buffy the Vampire Slayer TV series.
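The idea of training a visual ranker from text ranking scores can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the loss, so a margin-based hinge loss is assumed here, and the names `make_pairs`, `pairwise_hinge`, `ranking_loss` and the `margin` value are hypothetical, not taken from the paper.

```python
def make_pairs(text_scores, min_gap=0.0):
    """Return (higher, lower) shot index pairs, ordered by text ranking score."""
    pairs = []
    for i, score_i in enumerate(text_scores):
        for j, score_j in enumerate(text_scores):
            if score_i - score_j > min_gap:
                pairs.append((i, j))
    return pairs


def pairwise_hinge(score_high, score_low, margin=1.0):
    """Zero when the higher-ranked shot wins by at least `margin` (assumed loss)."""
    return max(0.0, margin - (score_high - score_low))


def ranking_loss(visual_scores, text_scores, margin=1.0):
    """Mean hinge loss of the visual scores over all text-ordered shot pairs."""
    pairs = make_pairs(text_scores)
    if not pairs:
        return 0.0
    total = sum(
        pairwise_hinge(visual_scores[i], visual_scores[j], margin)
        for i, j in pairs
    )
    return total / len(pairs)


if __name__ == "__main__":
    text_scores = [0.9, 0.1, 0.5]    # e.g. output of a text ranking algorithm
    visual_scores = [2.0, 0.0, 1.5]  # e.g. output of a visual model
    print(ranking_loss(visual_scores, text_scores))
```

At test time only `visual_scores` would be needed, which matches the abstract's point that highlight detection works from visual information alone.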
Pose-aware person recognition
VIJAYA KUMAR R,Anoop Namboodiri,Manohar Paluri,Jawahar C V
Computer Vision and Pattern Recognition, CVPR, 2017
@inproceedings{bib_Pose_2017, AUTHOR = {VIJAYA KUMAR R, Anoop Namboodiri, Manohar Paluri, Jawahar C V}, TITLE = {Pose-aware person recognition}, BOOKTITLE = {Computer Vision and Pattern Recognition}. YEAR = {2017}}
Person recognition methods that use multiple body regions have shown significant improvements over traditional face-based recognition. One of the primary challenges in full-body person recognition is the extreme variation in pose and view point. In this work, (i) we present an approach that tackles pose variations utilizing multiple models that are trained on specific poses, and combined using pose-aware weights during testing. (ii) For learning a person representation, we propose a network that jointly optimizes a single loss over multiple body regions. (iii) Finally, we introduce new benchmarks to evaluate person recognition in diverse scenarios and show significant improvements over previously proposed approaches on all the benchmarks including the photo album setting of PIPA.
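Combining pose-specific models with pose-aware weights can be illustrated with a minimal sketch. It assumes the weights come from some pose estimator and that fusion is a normalized weighted sum of per-model identity scores; both are assumptions for illustration, and `fuse_scores` and `predict` are hypothetical names rather than the paper's implementation.

```python
def fuse_scores(pose_weights, model_scores):
    """Weighted sum of per-pose model scores.

    pose_weights: one non-negative weight per pose-specific model.
    model_scores: one list of identity scores per pose-specific model.
    """
    if len(pose_weights) != len(model_scores):
        raise ValueError("need exactly one weight per pose-specific model")
    total = sum(pose_weights)
    if total <= 0:
        raise ValueError("pose weights must have a positive sum")
    fused = [0.0] * len(model_scores[0])
    for weight, scores in zip(pose_weights, model_scores):
        for k, score in enumerate(scores):
            fused[k] += (weight / total) * score
    return fused


def predict(pose_weights, model_scores):
    """Index of the identity with the highest fused score."""
    fused = fuse_scores(pose_weights, model_scores)
    return max(range(len(fused)), key=fused.__getitem__)


if __name__ == "__main__":
    weights = [0.5, 0.25, 0.25]                    # e.g. frontal, profile, back
    scores = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]  # two candidate identities
    print(fuse_scores(weights, scores), predict(weights, scores))
```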
Robust Transgender Face Recognition: Approach based on Appearance and Therapy Factors
Vijay Kumar,R. Raghavendra,Anoop Namboodiri,Christoph Busch
International Conference on Identity, Security and Behavior Analysis, ISBA, 2016
@inproceedings{bib_Robu_2016, AUTHOR = {Vijay Kumar, R. Raghavendra, Anoop Namboodiri, Christoph Busch}, TITLE = {Robust Transgender Face Recognition: Approach based on Appearance and Therapy Factors}, BOOKTITLE = {International Conference on Identity, Security and Behavior Analysis}. YEAR = {2016}}
Transgender face recognition is gaining increasing attention in the face recognition community because of its potential in real-life applications. Despite extensive progress in the traditional face recognition domain, it is very challenging to recognize faces under the transgender setting. The gender transformation results in significant face variations, both in shape and texture, gradually over time. This introduces additional complexities to existing face recognition algorithms to achieve a reliable performance. In this paper, we present a novel framework that incorporates an appearance factor and a transformation factor caused due to Hormone Replacement Therapy (HRT) for recognition. To this extent, we employ Hidden Factor Analysis (HFA) to jointly model a face under therapy as a linear combination of appearance and transformation factors. This is based on the intuition that the appearance factor captures the features that are unaffected by the therapy and the transformation factor captures the feature changes due to therapy. Extensive experiments carried out on a publicly available HRT transgender face database show the efficacy of the proposed scheme with a recognition accuracy of 82.36%.
Panoramic Stereo Videos with a Single Camera
RAJAT AGGARWAL,AMRISHA VOHRA,Anoop Namboodiri
Computer Vision and Pattern Recognition, CVPR, 2016
@inproceedings{bib_Pano_2016, AUTHOR = {RAJAT AGGARWAL, AMRISHA VOHRA, Anoop Namboodiri}, TITLE = {Panoramic Stereo Videos with a Single Camera}, BOOKTITLE = {Computer Vision and Pattern Recognition}. YEAR = {2016}}
We present a practical solution for generating 360° stereo panoramic videos using a single camera. Current approaches either use a moving camera that captures multiple images of a scene, which are then stitched together to form the final panorama, or use multiple cameras that are synchronized. A moving camera limits the solution to static scenes, while multi-camera solutions require dedicated calibrated setups. Our approach improves upon the existing solutions in two significant ways: It solves the problem using a single camera, thus minimizing the calibration problem and providing us the ability to convert any digital camera into a panoramic stereo capture device. It captures all the light rays required for stereo panoramas in a single frame using a compact custom designed mirror, thus making the design practical to manufacture and easier to use. We analyze several properties of the design as well as present panoramic stereo and depth estimation results.
Leveraging Multiple Tasks to Regularize Fine-Grained Classification
RIDDHIMAN DASGUPTA,Anoop Namboodiri
International conference on Pattern Recognition, ICPR, 2016
@inproceedings{bib_Leve_2016, AUTHOR = {RIDDHIMAN DASGUPTA, Anoop Namboodiri}, TITLE = {Leveraging Multiple Tasks to Regularize Fine-Grained Classification}, BOOKTITLE = {International conference on Pattern Recognition}. YEAR = {2016}}
Fine-grained classification is an extremely challenging problem in computer vision, compounded by subtle differences in shape, pose, illumination and appearance. While convolutional neural networks have become the versatile jack-of-all-trades tool in modern computer vision, approaches for fine-grained recognition still rely on localization of keypoints and parts to learn discriminative features for recognition. In order to achieve this, most approaches use a localization module and subsequently learn classifiers for the inferred locations, thus necessitating large amounts of manual annotations for bounding boxes and key points. In order to tackle this problem, we aim to leverage the (taxonomic and/or semantic) relationships present among fine-grained classes. The ontology tree is a free source of labels that can be used as auxiliary tasks to train a multi-task loss. Additional tasks can act as regularizers, and increase the generalization capabilities of the network. Multiple tasks try to take the network in diverging directions, and the network has to reach a common minimum by adapting and learning features common to all tasks in its shared layers. We train a multi-task network using auxiliary tasks extracted from taxonomical or semantic hierarchies, using a novel method to update task-wise learning rates to ensure that the related tasks aid and unrelated tasks do not hamper performance on the primary task. Experiments on the popular CUB-200-2011 dataset show that employing super-classes in an end-to-end model improves performance, compared to methods employing additional expensive annotations such as key points and bounding boxes and/or using multi-stage pipelines.
Detection and Segmentation of Mirror-like Surfaces Using Structured Illumination
RAJAT AGGARWAL,Anoop Namboodiri
Indian Conference on Computer Vision, Graphics and Image Processing, ICVGIP, 2016
@inproceedings{bib_Dete_2016, AUTHOR = {RAJAT AGGARWAL, Anoop Namboodiri}, TITLE = {Detection and Segmentation of Mirror-like Surfaces Using Structured Illumination}, BOOKTITLE = {Indian Conference on Computer Vision, Graphics and Image Processing}. YEAR = {2016}}
In computer vision, many active illumination techniques employ Projector-Camera systems to extract useful information from the scenes. Known illumination patterns are projected onto the scene and their deformations in the captured images are then analyzed. We observe that the local frequencies in the captured pattern for the mirror-like surfaces are different from those in the projected pattern. This property allows us to design a custom Projector-Camera system to segment mirror-like surfaces by analyzing the local frequencies in the captured images. The system projects a sinusoidal pattern and captures the images from the projector’s point of view. We present segmentation results for the scenes including multiple reflections and inter-reflections from the mirror-like surfaces. The method can further be used in the separation of direct and global components for the mirror-like surfaces by illuminating the non-mirror-like objects separately. We show how our method is also useful for accurate estimation of shape of the non-mirror-like regions in the presence of mirror-like regions in a scene.
Laplacian pyramids for deep feature inversion
ANIKET SINGH,Anoop Namboodiri
Asian Conference on Pattern Recognition, ACPR, 2015
@inproceedings{bib_Lapl_2015, AUTHOR = {ANIKET SINGH, Anoop Namboodiri}, TITLE = {Laplacian pyramids for deep feature inversion}, BOOKTITLE = {Asian Conference on Pattern Recognition}. YEAR = {2015}}
Modern feature extraction pipelines, especially the ones using deep networks, involve an increasing variety of elements. With layered approaches heaping abstraction upon abstraction, it becomes difficult to understand what it is that these features are capturing. One appealing way of solving this puzzle is feature visualization, where features are mapped back to the image domain. Our work improves the generic approach of performing gradient descent (GD) in the image space to match a given set of features to achieve a visualization. Specifically, we note that coarse features of an image like blobs, outlines etc. are useful by themselves for classification purposes. We develop an inversion scheme based on this idea by recovering coarse features of the image before finer details. This is done by modeling the image as the composition of a Laplacian Pyramid. We show that by performing GD on the pyramid in a level-wise manner, we can recover meaningful images. Results are presented for inverting a shallow network: the densely calculated SIFT as well as a deep network: Krizhevsky et al.’s ImageNet CNN (AlexNet).
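The coarse-before-fine structure of a Laplacian pyramid can be shown with a small sketch. For brevity it works on a 1D signal and assumes pair averaging and sample repetition as stand-ins for the usual Gaussian blur and interpolation; the function names are hypothetical and the gradient descent step of the paper is not reproduced here.

```python
def downsample(signal):
    """Halve the resolution by averaging adjacent pairs (stand-in for blur + subsample)."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]


def upsample(signal):
    """Double the resolution by repeating each sample."""
    return [value for value in signal for _ in range(2)]


def build_pyramid(signal, levels):
    """Return [detail_0, ..., detail_{levels-1}, coarse_residual], finest band first."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    pyramid = []
    current = list(signal)
    for _ in range(levels):
        coarse = downsample(current)
        predicted = upsample(coarse)
        # The detail band stores what the coarser level cannot explain.
        pyramid.append([c - p for c, p in zip(current, predicted)])
        current = coarse
    pyramid.append(current)
    return pyramid


def reconstruct(pyramid, detail_levels=None):
    """Collapse the pyramid coarse-to-fine, adding only the coarsest
    `detail_levels` detail bands (all of them when None)."""
    details, current = pyramid[:-1], pyramid[-1]
    if detail_levels is None:
        detail_levels = len(details)
    for depth, band in enumerate(reversed(details)):
        current = upsample(current)
        if depth < detail_levels:
            current = [c + d for c, d in zip(current, band)]
    return current


if __name__ == "__main__":
    signal = [4.0, 8.0, 6.0, 2.0, 1.0, 3.0, 5.0, 7.0]
    pyramid = build_pyramid(signal, levels=2)
    for k in range(3):
        print(k, reconstruct(pyramid, detail_levels=k))
```

Calling `reconstruct` with an increasing `detail_levels` mirrors the level-wise order described in the abstract: the coarse layout appears first and finer detail is added afterwards.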
Learning Clustered Sub-spaces for Sketch-based Image Retrieval
KOUSTAV GHOSAL,PRABHU AMEYA PANDURANG,RIDDHIMAN DASGUPTA,Anoop Namboodiri
Asian Conference on Pattern Recognition, ACPR, 2015
@inproceedings{bib_Lear_2015, AUTHOR = {KOUSTAV GHOSAL, PRABHU AMEYA PANDURANG, RIDDHIMAN DASGUPTA, Anoop Namboodiri}, TITLE = {Learning Clustered Sub-spaces for Sketch-based Image Retrieval}, BOOKTITLE = {Asian Conference on Pattern Recognition}. YEAR = {2015}}
Most of the traditional sketch-based image retrieval systems compare sketches and images using morphological features. Since these features belong to two different modalities, they are compared either by reducing the image to a sparse sketch-like form or by transforming the sketches to a denser image-like representation. However, this cross-modal transformation leads to information loss or adds undesirable noise to the system. We propose a method in which, instead of comparing the two modalities directly, a cross-modal correspondence is established between the images and sketches. Using an extended version of Canonical Correlation Analysis (CCA), the samples are projected onto a lower dimensional subspace, where the images and sketches of the same class are maximally correlated. We test the efficiency of our method on images from Caltech, PASCAL and sketches from the TU-BERLIN dataset. Our results show significant improvement in retrieval performance with the cross-modal correspondence.
Fine pose estimation of known objects in cluttered scene images
SUDIPTO BANERJEE,SANCHIT AGGARWAL,Anoop Namboodiri
Asian Conference on Pattern Recognition, ACPR, 2015
@inproceedings{bib_Fine_2015, AUTHOR = {SUDIPTO BANERJEE, SANCHIT AGGARWAL, Anoop Namboodiri}, TITLE = {Fine pose estimation of known objects in cluttered scene images}, BOOKTITLE = {Asian Conference on Pattern Recognition}. YEAR = {2015}}
Understanding the precise 3D structure of an environment is one of the fundamental goals of computer vision and is challenging due to a variety of factors such as appearance variation, illumination, pose, noise, occlusion and scene clutter. A generic solution to the problem is ill-posed due to the loss of depth information during imaging. In this paper, we consider a specific but common situation, where the scene contains known objects. Given 3D models of a set of known objects and a cluttered scene image, we try to detect these objects in the image, and align 3D models to their images to find their exact pose. We develop an approach that poses this as a 3D-to-2D alignment problem. We also deal with pose estimation of 3D articulated objects in images. We evaluate our proposed method on Big Bird dataset and our own tabletop dataset, and present experimental comparisons with state-of-the-art methods.
Visual phrases for exemplar face detection
N VIJAY KUMAR,Anoop Namboodiri,Jawahar C V
International Conference on Computer Vision, ICCV, 2015
@inproceedings{bib_Visu_2015, AUTHOR = {N VIJAY KUMAR, Anoop Namboodiri, Jawahar C V}, TITLE = {Visual phrases for exemplar face detection}, BOOKTITLE = {International Conference on Computer Vision}. YEAR = {2015}}
Recently, exemplar based approaches have been successfully applied for face detection in the wild. Contrary to traditional approaches that model face variations from a large and diverse set of training examples, exemplar-based approaches use a collection of discriminatively trained exemplars for detection. In this paradigm, each exemplar casts a vote using retrieval framework and generalized Hough voting, to locate the faces in the target image. The advantage of this approach is that by having a large database that covers all possible variations, faces in challenging conditions can be detected without having to learn explicit models for different variations. Current schemes, however, make an assumption of independence between the visual words, ignoring their relations in the process. They also ignore the spatial consistency of the visual words. Consequently, every exemplar word contributes equally during voting regardless of its location. In this paper, we propose a novel approach that incorporates higher order information in the voting process. We discover visual phrases that contain semantically related visual words and exploit them for detection along with the visual words. For spatial consistency, we estimate the spatial distribution of visual words and phrases from the entire database and then weigh their occurrence in exemplars. This ensures that a visual word or a phrase in an exemplar makes a major contribution only if it occurs at its semantic location, thereby suppressing the noise significantly. We perform extensive experiments on standard FDDB, AFW and G-album datasets and show significant improvement over previous exemplar …
Online handwriting recognition using depth sensors
RAJAT AGGARWAL,SIRNAM SWETHA,Anoop Namboodiri,Jayanthi Sivaswamy,Jawahar C V
International Conference on Document Analysis and Recognition, ICDAR, 2015
@inproceedings{bib_Onli_2015, AUTHOR = {RAJAT AGGARWAL, SIRNAM SWETHA, Anoop Namboodiri, Jayanthi Sivaswamy, Jawahar C V}, TITLE = {Online handwriting recognition using depth sensors}, BOOKTITLE = {International Conference on Document Analysis and Recognition}. YEAR = {2015}}
In this work, we propose an online handwriting solution, where the data is captured with the help of depth sensors. Users may write in the air and our method recognizes it in real time using the proposed feature representation. Our method uses an efficient fingertip tracking approach and reduces the necessity of pen-up/pen-down switching. We validate our method on two depth sensors, Kinect and Leap Motion Controller. On a dataset collected from 20 users, we achieve a recognition accuracy of 97.59% for character recognition. We also demonstrate how this system can be extended for lexicon recognition with reliable performance. We have also prepared a dataset containing 1,560 characters and 400 words with the intention of providing a common benchmark for handwritten character recognition using depth sensors and related research.
Semantic Classification of Boundaries of an RGBD Image.
Anoop Namboodiri,Jawahar C V,Srikumar Ramalingam
British Machine Vision Conference, BMVC, 2015
@inproceedings{bib_Sema_2015, AUTHOR = {Anoop Namboodiri, Jawahar C V, Srikumar Ramalingam}, TITLE = {Semantic Classification of Boundaries of an RGBD Image.}, BOOKTITLE = {British Machine Vision Conference}. YEAR = {2015}}
The problem of labeling the edges present in a single color image as convex, concave, and occluding entities is one of the fundamental problems in computer vision. It has been shown that this information can contribute to segmentation, reconstruction and recognition problems. Recently, it has been shown that this classification is not straightforward even using RGBD data. This makes us wonder whether this apparently simple cue has more information than a depth map. In this paper, we propose a novel algorithm using random forest for classifying edges into convex, concave and occluding entities. We release a data set with more than 500 RGBD images with pixel-wise ground labels. Our method produces promising results and achieves an F-score of 0.84 on the data set.
Learning minutiae neighborhoods: A new binary representation for matching fingerprints
AKHIL VIJ,Anoop Namboodiri
Computer Vision and Pattern Recognition Conference workshops, CVPR-W, 2014
@inproceedings{bib_Lear_2014, AUTHOR = {AKHIL VIJ, Anoop Namboodiri}, TITLE = {Learning minutiae neighborhoods: A new binary representation for matching fingerprints}, BOOKTITLE = {Computer Vision and Pattern Recognition Conference workshops}. YEAR = {2014}}
Representation of fingerprints is one of the key factors that limits the accuracy and efficiency of matching algorithms. Most popular methods represent each fingerprint as an unordered set of minutiae with variable cardinality and the matching algorithms are left with the task of finding the best correspondence between the two sets of minutiae. While this makes the representation more flexible and matching more accurate, the task becomes computationally intensive. Fixed length representations with aligned features are highly efficient to match. However, creating an aligned representation without the knowledge of the sample to which it is to be matched makes the problem of representation more complex. Some of the fixed-length representations only provide partial alignment, leaving the rest to the matching stage. In this paper, we propose a fixed length representation for fingerprints that provides exact alignment between the features, thus enabling high-speed matching with minimal computational effort. The representation extends the idea of object representation using bag of words into a bag of minutiae neighborhoods. The representation is provably invariant to affine transformations (rotation, translation and uniform scaling), and is shown to be highly discriminative for the task of verification. Experimental results on FVC 2002 and 2004 datasets clearly show the superiority of the representation over competing methods. As the proposed representation can be computed from the standard minutiae templates, the method is applicable to existing datasets, where the original fingerprint images are not available.
Estimating Floor Regions in Cluttered Indoor Scenes from First Person Camera View
SANCHIT AGGARWAL,Anoop Namboodiri,Jawahar C V
International conference on Pattern Recognition, ICPR, 2014
@inproceedings{bib_Esti_2014, AUTHOR = {SANCHIT AGGARWAL, Anoop Namboodiri, Jawahar C V}, TITLE = {Estimating Floor Regions in Cluttered Indoor Scenes from First Person Camera View}, BOOKTITLE = {International conference on Pattern Recognition}. YEAR = {2014}}
The ability to detect floor regions from an image enables a variety of applications such as indoor scene understanding, mobility assessment, robot navigation, path planning and surveillance. In this work, we propose a framework for estimating floor regions in cluttered indoor environments. The problem of floor detection and segmentation is challenging in situations where floor and non-floor regions have similar appearances. It is even harder to segment floor regions when clutter, specular reflections, shadows and textured floors are present within the scene. Our framework utilizes a generic classifier trained from appearance cues as well as floor density estimates, both trained from a variety of indoor images. The results of the classifier are then adapted to a specific test image where we integrate appearance, position and geometric cues in an iterative framework. A Markov Random Field framework is used to integrate the cues to segment floor regions. In contrast to previous settings that relied on optical flow, depth sensors or multiple images in a calibrated setup, our method can work on a single image. It is also more flexible as we avoid assumptions like Manhattan world scene or restricting clutter only to wall-floor boundaries. Experimental results on the public MIT Scene dataset, as well as a more challenging dataset that we acquired, demonstrate the robustness and efficiency of our framework in the above-mentioned complex situations.
Face recognition in videos by label propagation
N VIJAY KUMAR,Anoop Namboodiri,Jawahar C V
International conference on Pattern Recognition, ICPR, 2014
@inproceedings{bib_Face_2014, AUTHOR = {N VIJAY KUMAR, Anoop Namboodiri, Jawahar C V}, TITLE = {Face recognition in videos by label propagation}, BOOKTITLE = {International conference on Pattern Recognition}. YEAR = {2014}}
We consider the problem of automatic identification of faces in videos such as movies, given a dictionary of known faces from a public or an alternate database. This has applications in video indexing, content based search, surveillance, and real time recognition on wearable computers. We propose a two stage approach for this problem. First, we recognize the faces in a video using a sparse representation framework using l1-minimization and select a few key-frames based on a robust confidence measure. We then use transductive learning to propagate the labels from the key-frames to the remaining frames by incorporating constraints simultaneously in temporal and feature spaces. This is in contrast to some of the previous approaches where every test frame/track is identified independently, ignoring the correlation between the faces in video tracks. Having a few key frames belonging to few subjects for label propagation rather than a large dictionary of actors reduces the amount of confusion. We evaluate the performance of our algorithm on Movie Trailer face dataset and five movie clips, and achieve a significant improvement in labeling accuracy compared to previous approaches.
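The second stage, propagating labels from key-frames to the remaining frames, can be sketched with a generic graph-based label propagation. This is an assumption-laden illustration and not the paper's formulation: the affinity (a Gaussian kernel on features plus a bonus for temporally adjacent frames), the parameters `sigma` and `temporal_weight`, and the function names are all hypothetical.

```python
import math


def affinity(features, sigma=1.0, temporal_weight=1.0):
    """Frame-to-frame weights from feature similarity plus temporal adjacency."""
    n = len(features)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dist2 = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
            weights[i][j] = math.exp(-dist2 / (2 * sigma ** 2))
            if abs(i - j) == 1:  # consecutive frames in the track
                weights[i][j] += temporal_weight
    return weights


def propagate(weights, seeds, num_classes, iterations=50):
    """Spread key-frame labels over the graph; `seeds` maps frame index to label."""
    n = len(weights)
    scores = [[0.0] * num_classes for _ in range(n)]
    for frame, label in seeds.items():
        scores[frame][label] = 1.0
    for _ in range(iterations):
        updated = []
        for i in range(n):
            if i in seeds:  # key-frame labels stay clamped
                updated.append(scores[i][:])
                continue
            total = sum(weights[i])
            updated.append([
                sum(weights[i][j] * scores[j][c] for j in range(n)) / total
                for c in range(num_classes)
            ])
        scores = updated
    return [max(range(num_classes), key=row.__getitem__) for row in scores]


if __name__ == "__main__":
    # Six frames with 1D features; frames 0 and 5 are labelled key-frames.
    features = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
    labels = propagate(affinity(features), seeds={0: 0, 5: 1}, num_classes=2)
    print(labels)
```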
Providing Services on Demand By User Action Modeling on Smart Phones
KUMAR VISHAL,ROMIL BANSAL,Anoop Namboodiri,Jawahar C V
international joint conference on pervasive and ubiquitous computing, Ubicomp, 2014
@inproceedings{bib_Prov_2014, AUTHOR = {KUMAR VISHAL, ROMIL BANSAL, Anoop Namboodiri, Jawahar C V}, TITLE = {Providing Services on Demand By User Action Modeling on Smart Phones}, BOOKTITLE = {international joint conference on pervasive and ubiquitous computing}. YEAR = {2014}}
One-Time Biometric Token based Authentication
KULKARNI ROHAN PRADEEP,Anoop Namboodiri
Indian Conference on Computer Vision, Graphics and Image Processing, ICVGIP, 2014
@inproceedings{bib_One-_2014, AUTHOR = {KULKARNI ROHAN PRADEEP, Anoop Namboodiri}, TITLE = {One-Time Biometric Token based Authentication}, BOOKTITLE = {Indian Conference on Computer Vision, Graphics and Image Processing}. YEAR = {2014}}
Widely used online commerce systems require a user to submit his sole banking credentials or credit card details for availing desired services, thus involving high risks with untrusted service providers. Often used one-time password based systems provide additional transaction security, but are still incapable of differentiating between a genuine user trying to authenticate and an adversary with stolen credentials. This brings out a strong need for biometrics based one-time password systems. In this paper we propose a one-time biometric token based authentication protocol which works within the framework of current online transaction schemes allowing a user to carry out a financial transaction with a service provider which completes with an authorization from the bank. The proposed protocol is based on key-binding biometric cryptosystems and upholds the requirements of secure authentication, template protection and revocability while providing privacy to individual’s biometrics and anonymity from the service provider. We demonstrate our system’s security and performance using iris biometrics to authenticate individuals.
A Sketch-Based Approach To Video Retrieval Using Qualitative Features
KOUSTAV GHOSAL,Anoop Namboodiri
Indian Conference on Computer Vision, Graphics and Image Processing, ICVGIP, 2014
@inproceedings{bib_A_Sk_2014, AUTHOR = {KOUSTAV GHOSAL, Anoop Namboodiri}, TITLE = {A Sketch-Based Approach To Video Retrieval Using Qualitative Features}, BOOKTITLE = {Indian Conference on Computer Vision, Graphics and Image Processing}. YEAR = {2014}}
Motion trajectories extracted from certain videos contain sufficient spatio-temporal information which can be effectively used to characterize those videos. But the task of framing text-based queries for such videos in content-based video retrieval systems is very complicated. Sketch based query is an efficient tool to construct motion-based queries but perceptual differences like spatial and temporal variability pose serious challenges to query modelling. In this work we propose a new method of modelling sketch based queries which attempts to extract the qualitative features of motion by minimizing the perceptual variability. We also develop a multilevel filter for indexing a query, in which the search results are refined at each stage using a cumulative scoring mechanism. Finally, we show the effectiveness of our algorithm on a dataset of real pool videos and a synthetic dataset containing simulated videos having very complex motion trajectories.
Fingerprint enhancement using unsupervised hierarchical feature learning
MIHIR SAHASRABUDHE,Anoop Namboodiri
Indian Conference on Computer Vision, Graphics and Image Processing, ICVGIP, 2014
@inproceedings{bib_Fing_2014, AUTHOR = {MIHIR SAHASRABUDHE, Anoop Namboodiri}, TITLE = {Fingerprint enhancement using unsupervised hierarchical feature learning}, BOOKTITLE = {Indian Conference on Computer Vision, Graphics and Image Processing}. YEAR = {2014}}
We present an approach for learning low- and high-level fingerprint structures in an unsupervised manner, which we use for enhancement of fingerprint images and estimation of orientation fields, frequency images, and region masks. We incorporate the use of a convolutional deep belief network to learn features from greyscale, clean fingerprint images. We also show that reconstruction performed by the learnt network works as a suitable enhancement of the fingerprint, and hierarchical probabilistic inference is able to estimate overall fingerprint structures as well. Our approach performs better than Gabor-based enhancement and short time Fourier transform-assisted enhancement on images it was trained on. We further use information from the learnt features in the first layer, which are short and oriented ridge structures, to extract the orientation field, frequency image, and region mask of input fingerprints.
Secure hamming distance based biometric authentication
KULKARNI ROHAN PRADEEP,Anoop Namboodiri
International conference on Biometrics, IJCB, 2013
@inproceedings{bib_Secu_2013, AUTHOR = {KULKARNI ROHAN PRADEEP, Anoop Namboodiri}, TITLE = {Secure hamming distance based biometric authentication}, BOOKTITLE = {International conference on Biometrics}. YEAR = {2013}}
Concerns of privacy, template security and efficiency of biometric authentication systems have received considerable research attention in the recent past. Binary template representations have been shown to provide significant improvements in efficiency without compromising accuracy for a variety of biometric modalities including fingerprints, palmprints and iris. Most of the secure authentication protocols work with generic feature representations or propose new secure templates for specific modalities. In this work, we propose an authentication protocol that works with any binary biometric representation that employs Hamming distance for comparison and converts it into a secure, privacy-preserving protocol that provides template protection. We also provide the ability to use masks while computing the Hamming distance. The protocol retains the accuracy of the underlying representation as demonstrated by our experiments.
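As an illustration of the comparison that the protocol protects, the following is a minimal plaintext sketch of a masked Hamming distance. It is not the secure protocol from the paper. The function name, the use of Python integers as bit vectors, and the normalisation by the number of valid bits are all assumptions made for this example.

```python
def masked_hamming(a: int, b: int, mask_a: int, mask_b: int) -> float:
    """Fraction of differing bits among positions marked valid in BOTH masks.

    Templates and masks are plain Python integers used as bit vectors
    (an assumption for this sketch; a set bit in a mask means "valid").
    """
    valid = mask_a & mask_b                # positions usable in both templates
    n_valid = bin(valid).count("1")
    if n_valid == 0:
        raise ValueError("the two masks share no valid bit positions")
    differing = (a ^ b) & valid            # disagreements restricted to valid bits
    return bin(differing).count("1") / n_valid


# Hypothetical 4-bit templates; the top bit of the second template is masked out.
distance = masked_hamming(0b1100, 0b1010, 0b1111, 0b0111)
```

A verifier would accept the probe when this distance falls below a threshold chosen for the modality.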
Sparse document image coding for restoration
Vijay Kumar,AMIT KUMAR BANSAL,GOUTAM HARI TULSIYAN,ANAND MISHRA,Anoop Namboodiri,Jawahar C V
International Conference on Document Analysis and Recognition, ICDAR, 2013
@inproceedings{bib_Spar_2013, AUTHOR = {Vijay Kumar, AMIT KUMAR BANSAL, GOUTAM HARI TULSIYAN, ANAND MISHRA, Anoop Namboodiri, Jawahar C V}, TITLE = {Sparse document image coding for restoration}, BOOKTITLE = {International Conference on Document Analysis and Recognition}. YEAR = {2013}}
Sparse representation based image restoration techniques have been shown to be successful in solving various inverse problems such as denoising, inpainting, and super-resolution on natural images and videos. In this paper, we explore the use of sparse representation based methods specifically to restore degraded document images. While natural images form a very small subset of all possible images admitting the possibility of sparse representation, document images are significantly more restricted and are expected to be ideally suited for such a representation. However, the binary nature of textual document images makes dictionary learning and coding techniques unsuitable to be applied directly. We leverage the fact that different characters possess similar strokes, curves, and edges, and learn a dictionary that gives sparse decomposition for patches. Experimental results show significant improvement in image quality and OCR performance on documents collected from a variety of sources such as magazines and books. This method is therefore ideally suited for restoring highly degraded images in repositories such as digital libraries.
A ballistic stroke representation of online handwriting for recognition
PRABHU TEJA S,Anoop Namboodiri
International Conference on Document Analysis and Recognition, ICDAR, 2013
@inproceedings{bib_A_ba_2013, AUTHOR = {PRABHU TEJA S, Anoop Namboodiri}, TITLE = {A ballistic stroke representation of online handwriting for recognition}, BOOKTITLE = {International Conference on Document Analysis and Recognition}. YEAR = {2013}}
Robust segmentation of ballistic strokes from online handwritten traces is critical in parameter estimation of stroke based models for applications such as recognition, synthesis, and writer identification. In this paper we propose a new method for segmenting ballistic strokes from online handwriting. Traditional methods of ballistic stroke segmentation rely on detection of local minima of pen speed. Unfortunately, this approach is highly sensitive to sensing noise in both the spatial and temporal dimensions. We decompose the problem into two steps, where the spatial noise is filtered out in the first step. The ballistic stroke boundaries are then detected at the local curvature maxima, which we show to be invariant to temporal sampling noise. We also propose a bag-of-strokes representation based on ballistic stroke segmentation for online character recognition that improves the state-of-the-art recognition accuracies on multiple datasets.
Ink-bleed reduction using layer separation
SHRIKANT BARONIA,Anoop Namboodiri
International Conference on Document Analysis and Recognition, ICDAR, 2013
@inproceedings{bib_Ink-_2013, AUTHOR = {SHRIKANT BARONIA, Anoop Namboodiri}, TITLE = {Ink-bleed reduction using layer separation}, BOOKTITLE = {International Conference on Document Analysis and Recognition}. YEAR = {2013}}
We present a novel method for reducing the effects of ink-bleed in handwritten documents. We go beyond the existing works on ink-bleed detection and removal. We consider each pixel in a document as the result of a combination of foreground, ink-bleed and background. We carry out a decomposition of the document image into separate foreground ink, ink-bleed, and background layers. We propose an efficient MRF formulation to achieve this separation. A degradation model for the ink and paper is also proposed. The ability to extract the contributions of the three components to each pixel allows us to recover finer details of the writing. Quantitative and qualitative results on a set of historic manuscripts as well as synthetically generated documents demonstrate the effectiveness of our approach.
Learning fingerprint orientation fields using continuous restricted Boltzmann machines
MIHIR SAHASRABUDHE,Anoop Namboodiri
Asian Conference on Pattern Recognition, ACPR, 2013
@inproceedings{bib_Lear_2013, AUTHOR = {MIHIR SAHASRABUDHE, Anoop Namboodiri}, TITLE = {Learning fingerprint orientation fields using continuous restricted Boltzmann machines}, BOOKTITLE = {Asian Conference on Pattern Recognition}. YEAR = {2013}}
We aim to learn local orientation field patterns in fingerprints and correct distorted field patterns in noisy fingerprint images. This is formulated as a learning problem and achieved using two continuous restricted Boltzmann machines. The learnt orientation fields are then used in conjunction with traditional Gabor based algorithms for fingerprint enhancement. Orientation fields extracted by gradient-based methods are local, and do not consider neighboring orientations. If some amount of noise is present in a fingerprint, then these methods perform poorly when enhancing the image, affecting fingerprint matching. This paper presents a method to correct the resulting noisy regions over patches of the fingerprint by training two continuous restricted Boltzmann machines. The continuous RBMs are trained with clean fingerprint images and applied to overlapping patches of the input fingerprint. Experimental results show that one can successfully restore patches of noisy fingerprint images.
An ICA based Approach for Complex Color Scene Text Binarization
SIDDHARTH KHERADA,Anoop Namboodiri
Asian Conference on Pattern Recognition, ACPR, 2013
@inproceedings{bib_An_I_2013, AUTHOR = {SIDDHARTH KHERADA, Anoop Namboodiri}, TITLE = {An ICA based Approach for Complex Color Scene Text Binarization}, BOOKTITLE = {Asian Conference on Pattern Recognition}. YEAR = {2013}}
Binarization of text in natural scene images is a challenging task due to the variations in color, size, and font of the text, and the results are often affected by complex backgrounds, different lighting conditions, shadows and reflections. A robust solution to this problem can significantly enhance the accuracy of scene text recognition algorithms, leading to a variety of applications such as scene understanding, automatic localization and navigation, and image retrieval. In this paper, we propose a method to extract and binarize text from images that contain complex backgrounds. We use an Independent Component Analysis (ICA) based technique to map out the text region, which is inherently uniform in nature, while removing shadows, specularity and reflections, which are included in the background. The technique identifies the text regions from the components extracted by ICA using a global thresholding method to isolate the foreground text. We show the results of our algorithm on some of the most complex word images from the ICDAR 2003 Robust Word Recognition Dataset and compare with previously reported methods.
Sparse representation based face recognition with limited labeled samples
Vijay Kumar,Anoop Namboodiri,Jawahar C V
Asian Conference on Pattern Recognition, ACPR, 2013
@inproceedings{bib_Spar_2013, AUTHOR = {Vijay Kumar, Anoop Namboodiri, Jawahar C V}, TITLE = {Sparse representation based face recognition with limited labeled samples}, BOOKTITLE = {Asian Conference on Pattern Recognition}. YEAR = {2013}}
Sparse representations have emerged as a powerful approach for encoding images in a large class of machine recognition problems including face recognition. These methods rely on the use of an over-complete basis set for representing an image. This often assumes the availability of a large number of labeled training images, especially for high dimensional data. In many practical problems, the number of labeled training samples is very limited, leading to significant degradations in classification performance. To address the problem of lack of training samples, we propose a semi-supervised algorithm that labels the unlabeled samples through a multi-stage label propagation combined with sparse representation. In this representation, each image is decomposed as a linear combination of its nearest basis images, which has the advantage of both locality and sparsity. Extensive experiments on publicly available face databases show that the results are significantly better compared to state-of-the-art face recognition methods in semi-supervised setting and are on par with fully supervised techniques.
Shape reconstruction from single relief image
HARSHIT AGRAWAL,Anoop Namboodiri
Asian Conference on Pattern Recognition, ACPR, 2013
@inproceedings{bib_Shap_2013, AUTHOR = {HARSHIT AGRAWAL, Anoop Namboodiri}, TITLE = {Shape reconstruction from single relief image}, BOOKTITLE = {Asian Conference on Pattern Recognition}. YEAR = {2013}}
Reconstructing geometric models of relief carvings is of great importance in preserving cultural heritage digitally. In the case of reliefs, using laser scanners and structured lighting techniques is not always feasible, or is very expensive, given the uncontrolled environment. Single image shape from shading is an under-constrained problem that tries to solve for the surface normals given the intensity image. Various constraints are used to make the problem tractable. To avoid the uncontrolled lighting, we use a pair of images with and without the flash and compute an image under a known illumination. This image is used as an input to the shape reconstruction algorithms. We present techniques that try to reconstruct the shape from relief images using the prior information learned from examples. We learn the variations in geometric shape corresponding to image appearances under different lighting conditions using sparse representations. Given a new image, we estimate the most appropriate shape that will result in the given appearance under the specified lighting conditions. We integrate the prior with the normals computed from the reflectance equation in a MAP framework. We test our approach on relief images and compare them with the state-of-the-art shape from shading algorithms.
Salient object detection in SfM point cloud
DIVYANSH AGARWAL,SONI NISHITKUMAR HITESHKUMAR,Anoop Namboodiri
National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics, NCVPRIPG, 2013
@inproceedings{bib_Sali_2013, AUTHOR = {DIVYANSH AGARWAL, SONI NISHITKUMAR HITESHKUMAR, Anoop Namboodiri}, TITLE = {Salient object detection in SfM point cloud}, BOOKTITLE = {National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics}. YEAR = {2013}}
In this paper we present a max-flow min-cut based salient object detection in 3D point clouds that result from a Structure from Motion (SfM) pipeline. The SfM pipeline generates a noisy point cloud due to the unwanted scenes captured along with the object in the image dataset of SfM. The background points being sparse and not meaningful, it becomes necessary to remove them. Hence, any further processes (like surface reconstruction) utilizing the cleaned up model will have no hindrance from the noise removed. We present a novel approach where the camera centers are used to segment out the salient object. The algorithm is completely autonomous and does not need any user input. We test our proposed method on Indian historical models reconstructed through SfM. We evaluate the results in terms of selectivity and specificity.
MRF and DP based specular surface reconstruction
Ravindra Redddy K,Anoop Namboodiri
National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics, NCVPRIPG, 2013
@inproceedings{bib_MRF__2013, AUTHOR = {Ravindra Redddy K, Anoop Namboodiri}, TITLE = {MRF and DP based specular surface reconstruction}, BOOKTITLE = {National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics}. YEAR = {2013}}
This paper addresses the problem of reconstruction of specular surfaces using a combination of Dynamic Programming and Markov Random Fields formulation. Unlike traditional methods that require the exact position of environment points to be known, our method requires only the relative position of the environment points to be known for computing approximate normals and inferring shape from them. We present an approach which estimates the depth from a dynamic programming routine and MRF stereo matching, and uses MRF optimization to fuse the results to get a robust estimate of shape. We use a smooth color gradient image as our environment texture so that shape can be recovered using just a single shot. We evaluate our method using synthetic experiments on 3D models like the Stanford bunny and show real experiment results on a golden statue and a silver coated statue.
Improving Realism of 3D Texture using Component Based Modeling
SIDDHARTH KHERADA,PRATEEK PANDEY,Anoop Namboodiri
Winter Conference on Applications of Computer Vision Workshops, WACV-W, 2012
@inproceedings{bib_Impr_2012, AUTHOR = {SIDDHARTH KHERADA, PRATEEK PANDEY, Anoop Namboodiri}, TITLE = {Improving Realism of 3D Texture using Component Based Modeling}, BOOKTITLE = {Winter Conference on Applications of Computer Vision Workshops}. YEAR = {2012}}
3D textures are often described by parametric functions for each pixel that model the variation in its appearance with respect to varying lighting direction. However, parametric models such as Polynomial Texture Maps (PTMs) tend to smoothen the changes in appearance. We propose a technique to effectively model natural material surfaces and their interactions with changing light conditions. We show that the direct and global components of the image have different natures and, when modeled separately, lead to a more accurate and compact model of the 3D surface texture. For a given lighting position, both components are computed separately and combined to render a new image. This method models sharp shadows and specularities, while preserving the structural relief and surface color. The images thus rendered have enhanced photorealism as compared to images rendered by existing single pixel models such as PTMs.
Cascaded filtering for fingerprint identification using random projections
ATIF IQBAL,Anoop Namboodiri
Computer Vision and Pattern Recognition Conference workshops, CVPR-W, 2012
@inproceedings{bib_Casc_2012, AUTHOR = {ATIF IQBAL, Anoop Namboodiri}, TITLE = {Cascaded filtering for fingerprint identification using random projections}, BOOKTITLE = {Computer Vision and Pattern Recognition Conference workshops}. YEAR = {2012}}
Biometric identification often involves explicit comparison of a probe against each template stored in a database. This process becomes extremely time-consuming as the size of the database increases. Filtering approaches use a lightweight comparison to select a smaller set of candidate templates from the database for explicit comparison. However, most existing filtering schemes use specific features that are hand-crafted for the biometric trait at each stage of the filtering. In this work, we explore the effectiveness of weak features in a cascade for filtering fingerprint databases. We start with a set of potential indexing features computed from minutiae triplets and minutiae quadruplets. Each stage of filtering consists of projecting the probe onto a specific line and the removal of database samples outside a window around the probe. The critical problem in this process is the selection of lines for projection at each stage of the filtering. We show that by using a set of random lines and the proposed fitness function, one can achieve better results than optimization methods such as PCA or LDA. Experimental results show that using an ensemble of projections we can reduce the penetration to 26% at a hit rate of 99%. As each stage of the cascade is extremely fast, and filtering is progressive along the cascade, one can terminate the cascade at any point to achieve the desired performance. One can also combine this method with other indexing methods to improve the overall accuracy and speed. We present detailed experimental results on various aspects of the process on the FVC 2002 dataset.
Fingerprint indexing based on local arrangements of minutiae neighborhoods
AKHIL VIJ,Anoop Namboodiri
Computer Vision and Pattern Recognition Conference workshops, CVPR-W, 2012
@inproceedings{bib_Fing_2012, AUTHOR = {AKHIL VIJ, Anoop Namboodiri}, TITLE = {Fingerprint indexing based on local arrangements of minutiae neighborhoods}, BOOKTITLE = {Computer Vision and Pattern Recognition Conference workshops}. YEAR = {2012}}
This paper proposes a hash-based indexing method to speed up fingerprint identification in large databases. For each minutia, its local neighborhood information is computed with features defined based on the geometric arrangements of its neighboring minutiae points. The features used are provably invariant to translation, rotation, scale and shear. These features are used to create an affine invariant local descriptor, called an arrangement vector, for each minutia. To account for missing and spurious minutiae, we consider subsets of the neighboring minutiae and hashes of these structures are used in the indexing process. The primary goal of the work is to explore the effectiveness of affine invariant features for representing local minutiae structures. Experiments on FVC 2002 databases show that representation is quite effective even though the technique performs slightly below the state-of-the-art methods. One could use the representation in combination with other techniques to improve the overall performance.
Object detecting using PCA image reconstruction and Optical flow
Sajjad Einy,Anoop Namboodiri
International Journal of Engineering Research and Technology, IJERT, 2012
@inproceedings{bib_Obje_2012, AUTHOR = {Sajjad Einy, Anoop Namboodiri}, TITLE = {Object detecting using PCA image reconstruction and Optical flow}, BOOKTITLE = {International Journal of Engineering Research and Technology}. YEAR = {2012}}
This paper addresses the issue of tracking moving objects in video. The two processing steps are PCA reconstruction for classifying the area and optical flow-based tracking of feature points. The optical flow-based tracking algorithm predicts and restores the features of the area during real-time object tracking. The proposed system is computationally more efficient for learning dynamic objects and for tracking objects using shape information. Compared to other methods, the PCA-based method is a powerful algorithm for object detection, and the optical flow-based tracking algorithm can be used to decrease false point tracking and remove motion tracking errors in a real-time tracking system. The proposed algorithm tracks a set of feature points; during tracking, features are restored inside the predicted region. One important contribution of this work is a restoration process for missing feature points, which occur at almost every frame in realistic, noisy environments.
Detection and segmentation of approximate repetitive patterns in relief images
HARSHIT AGRAWAL,Anoop Namboodiri
Indian Conference on Computer Vision, Graphics and Image Processing, ICVGIP, 2012
@inproceedings{bib_Dete_2012, AUTHOR = {HARSHIT AGRAWAL, Anoop Namboodiri}, TITLE = {Detection and segmentation of approximate repetitive patterns in relief images}, BOOKTITLE = {Indian Conference on Computer Vision, Graphics and Image Processing}. YEAR = {2012}}
Algorithms for detection of repeating patterns in images often assume that the repetition is regular and highly similar across the instances. Approximate repetitions are also of interest in many domains such as hand carved sculptures, wall decorations, groups of natural objects, etc. Detection of such repetitive structures can help in applications such as image retrieval, image inpainting, 3D reconstruction, etc. In this work, we look at a specific class of approximate repetitions: those in images of hand carved relief structures. We present a robust hierarchical method for detecting such repetitions. Given a single image with reliefs, our algorithm finds dense matches of local features across the image at various scales. The matching features are then grouped based on their geometric configuration to find repeating elements. We also propose a method to group the repeating elements to segment the repetitive patterns in an image. In relief images, foreground and background have nearly the same texture, and matching of a single feature would not provide reliable evidence of repetition. Our grouping algorithm integrates evidences of repetition to reliably find repeating patterns. Input image is processed on a scale-space pyramid to effectively detect all possible repetitions at different scales. Our method has been tested on images with large varieties of complex repetitive patterns and the qualitative results show the robustness of our approach.
Efficient texture mapping by homogeneous patch discovery
R. VIKRAM PRATAP SINGH,Anoop Namboodiri
Indian Conference on Computer Vision, Graphics and Image Processing, ICVGIP, 2012
@inproceedings{bib_Effi_2012, AUTHOR = {R. VIKRAM PRATAP SINGH, Anoop Namboodiri}, TITLE = {Efficient texture mapping by homogeneous patch discovery}, BOOKTITLE = {Indian Conference on Computer Vision, Graphics and Image Processing}. YEAR = {2012}}
Texture mapping algorithms use mesh parameterization methods to find an optimal map for the vertices of a 3D model in texture space. These techniques vary in the properties they try to optimize such as stretch and skewness of the texture when mapped onto the surface. While most of them do well in terms of quality, they tend to be computationally intensive for large mesh models, which limits their use in interactive applications. We propose a greedy alternative that is significantly faster than current algorithms and achieves comparable quality. We use a priority queue to store polygons and use tangential vectors to guide the texture over the surface. Our algorithm is simple to implement and can texture over a million polygons per second on a typical desktop. The algorithm does not impose any constraints on the mesh topology and we do not require the model to be cut into patches before texturing. Stretch and distortion measures are stable across models and are comparable to current algorithms. We also propose a method to generate self tileable textures for use in conjunction with our texture mapping algorithm. We present qualitative and quantitative results in comparison with several other texture mapping algorithms. The efficiency and robustness of our algorithm make it useful in interactive modeling applications and in texture mapping large mesh models such as heritage monuments.
Cascaded filtering for biometric identification using random projections
ATIF IQBAL,Anoop Namboodiri
Computer Vision and Pattern Recognition Conference workshops, CVPR-W, 2011
@inproceedings{bib_Casc_2011, AUTHOR = {ATIF IQBAL, Anoop Namboodiri}, TITLE = {Cascaded filtering for biometric identification using random projections}, BOOKTITLE = {Computer Vision and Pattern Recognition Conference workshops}. YEAR = {2011}}
Biometric identification often involves explicit comparison of a probe template against each template stored in a database. This approach becomes extremely time-consuming as the size of the database increases. Filtering approaches use a lightweight comparison to reduce the database to a smaller set of candidates for explicit comparison. However, most existing filtering schemes use specific features that are hand-crafted for the biometric trait at each stage of the filtering. In this work, we show that a cascade of simple linear projections on random lines can achieve significant levels of filtering. Each stage of filtering consists of projecting the probe onto a specific line and removal of database samples outside a window around the probe. The approach provides a way of automatic generation of filters and avoids the need of developing specific features for different biometric traits. The method also provides us with a variety of parameters such as the projection lines, the number and order of projections, and the window sizes to customize the filtering process to a specific application. Experimental results show that using an ensemble of projections reduces the search space by 60% without increasing the false negative identification rate.
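The filtering stage described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: templates are taken to be fixed-length lists of floats, a single window width is shared by all stages, and the names `project`, `random_lines` and `cascade_filter` are hypothetical. The fitness-based selection of lines mentioned in the related 2012 paper is not modelled.

```python
import random


def project(vec, line):
    """Dot product of a feature vector with a projection line (direction)."""
    return sum(v * w for v, w in zip(vec, line))


def random_lines(dim, count, seed=0):
    """Draw `count` random projection directions of dimension `dim`."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


def cascade_filter(probe, database, lines, window):
    """Return indices of database templates that survive every stage.

    At each stage the probe and the remaining candidates are projected onto
    one line, and candidates whose projection lies farther than `window`
    from the probe's projection are discarded.
    """
    candidates = list(range(len(database)))
    for line in lines:
        p = project(probe, line)
        candidates = [
            i for i in candidates
            if abs(project(database[i], line) - p) <= window
        ]
    return candidates


# Toy 2-D example with axis-aligned lines so the result is easy to check.
database = [[0.0, 0.0], [10.0, 10.0], [0.1, 0.0]]
survivors = cascade_filter([0.0, 0.0], database, [[1.0, 0.0], [0.0, 1.0]], 0.5)
```

Because each stage only shrinks the candidate list, the loop can be stopped after any stage, and the surviving indices are then passed to the explicit (expensive) matcher.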
A Semi-supervised SVM Framework for Character Recognition
AMIT ARORA,Anoop Namboodiri
International Conference on Document Analysis and Recognition, ICDAR, 2011
@inproceedings{bib_A_Se_2011, AUTHOR = {AMIT ARORA, Anoop Namboodiri}, TITLE = {A Semi-supervised SVM Framework for Character Recognition}, BOOKTITLE = {International Conference on Document Analysis and Recognition}. YEAR = {2011}}
In order to incorporate various writing styles or fonts in a character recognizer, it is critical that a large amount of labeled data is available, which is difficult to obtain. In this work, we present a semi-supervised SVM based framework that can incorporate the unlabeled data for improvement of recognition performance. Existing semi-supervised learning methods for SVMs work well only for two-class problems. We propose a method to extend this to large-class problems by incorporating a participation term into the optimization process. The proposed system uses a Decision Directed Acyclic Graph (DDAG) of SVM classifiers, which have proven to be very effective for such recognition problems. We present experimental results on three different digit datasets with varying complexity, as well as additional multi-class datasets from the UCI repository for comparison with existing approaches. In addition we show that approximate annotations at the word or sentence level can be used for evaluation as well as active learning to further improve the recognition results.
Fingerprint feature extraction from gray scale images by ridge tracing
Devansh Arpit,Anoop Namboodiri
International Joint Conference on Biometrics, IJCBio, 2011
@inproceedings{bib_Fing_2011, AUTHOR = {Devansh Arpit, Anoop Namboodiri}, TITLE = {Fingerprint feature extraction from gray scale images by ridge tracing}, BOOKTITLE = {International Joint Conference on Biometrics}. YEAR = {2011}}
This paper deals with extraction of fingerprint features directly from gray scale images by the method of ridge tracing. While doing so, we make substantial use of contextual information gathered during the tracing process. Narrow bandpass based filtering methods for fingerprint image enhancement are extremely robust as noisy regions do not affect the result of cleaner ones. However, these methods often generate artifacts whenever the underlying image does not fit the filter model, which may be due to the presence of noise and singularities. The proposed method allows us to use the contextual information to better handle such noisy regions. Moreover, the various parameters used in the algorithm have been made adaptive in order to circumvent human supervision. The experimental results from our algorithm have been compared with those from Gabor based filtering and feature extraction, as well as with the original ridge tracing work from Maio and Maltoni [11]. The results clearly indicate that the proposed approach makes ridge tracing more robust to noise and makes the extracted features more reliable.
Fingerprint Enhancement using Hierarchical Markov Random Fields
NAGAVIJAYA RAMA REDDY K,Anoop Namboodiri
International Joint Conference on Biometrics, IJCBio, 2011
@inproceedings{bib_Fing_2011, AUTHOR = {NAGAVIJAYA RAMA REDDY K, Anoop Namboodiri}, TITLE = {Fingerprint Enhancement using Hierarchical Markov Random Fields}, BOOKTITLE = {International Joint Conference on Biometrics}. YEAR = {2011}}
We propose a novel approach to enhance the fingerprint image and extract features such as directional fields, minutiae and singular points reliably using a Hierarchical Markov Random Field Model. Unlike traditional fingerprint enhancement techniques, we use previously learned prior patterns from a set of clean fingerprints to restore a noisy one. We are able to recover the ridge and valley structure from degraded and noisy fingerprint images by formulating it as a hierarchical interconnected MRF that processes the information at multiple resolutions. The top layer incorporates the compatibility between an observed degraded fingerprint patch and prior training patterns in addition to ridge continuity across neighboring patches. A second layer accounts for spatial smoothness of the orientation field and its discontinuity at the singularities. Further layers could be used for incorporating higher level priors such as the class of the fingerprint. The strength of the proposed approach lies in its flexibility to model possible variations in fingerprint images as patches and from its ability to incorporate contextual information at various resolutions. Experimental results (both quantitative and qualitative) clearly demonstrate the effectiveness of this approach.
Blind authentication: a secure crypto-biometric verification protocol
MANEESH UPMANYU,Anoop Namboodiri,Srinathan Kannan,Jawahar C V
IEEE Transactions on Information Forensics and Security, TIFS, 2010
@inproceedings{bib_Blin_2010, AUTHOR = {MANEESH UPMANYU, Anoop Namboodiri, Srinathan Kannan, Jawahar C V}, TITLE = {Blind authentication: a secure crypto-biometric verification protocol}, BOOKTITLE = {IEEE Transactions on Information Forensics and Security}. YEAR = {2010}}
Concerns on widespread use of biometric authentication systems are primarily centered around template security, revocability, and privacy. The use of cryptographic primitives to bolster the authentication process can alleviate some of these concerns as shown by biometric cryptosystems. In this paper, we propose a provably secure and blind biometric authentication protocol, which addresses the concerns of user’s privacy, template protection, and trust issues. The protocol is blind in the sense that it reveals only the identity, and no additional information about the user or the biometric to the authenticating server or vice-versa. As the protocol is based on asymmetric encryption of the biometric data, it captures the advantages of biometric authentication as well as the security of public key cryptography. The authentication protocol can run over public networks and provide nonrepudiable identity verification. The encryption also provides template protection, the ability to revoke enrolled templates, and alleviates the concerns on privacy in widespread use of biometrics. The proposed approach makes no restrictive assumptions on the biometric data and is hence applicable to multiple biometrics. Such a protocol has significant advantages over existing biometric cryptosystems, which use a biometric to secure a secret key, which in turn is used for authentication. We analyze the security of the protocol under various attack scenarios. Experimental results on four biometric datasets (face, iris, hand geometry, and fingerprint) show that carrying out the authentication in the encrypted domain does not affect the accuracy, while the encryption key acts as an additional layer of security.
Efficient privacy preserving k-means clustering
MANEESH UPMANYU,Anoop Namboodiri,Srinathan Kannan,Jawahar C V
Pacific Asia Workshop on Intelligence and Security Informatics, PAISI, 2010
@inproceedings{bib_Effi_2010, AUTHOR = {MANEESH UPMANYU, Anoop Namboodiri, Srinathan Kannan, Jawahar C V}, TITLE = {Efficient privacy preserving k-means clustering}, BOOKTITLE = {Pacific Asia Workshop on Intelligence and Security Informatics}. YEAR = {2010}}
This paper introduces an efficient privacy-preserving protocol for distributed K-means clustering over arbitrarily partitioned data, shared among N parties. Clustering is one of the fundamental algorithms used in the field of data mining. Advances in data acquisition methodologies have resulted in collection and storage of vast quantities of users’ personal data. For mutual benefit, organizations tend to share their data for analytical purposes, thus raising privacy concerns for the users. Over the years, numerous attempts have been made to introduce privacy and security at the expense of massive additional communication costs. The approaches suggested in the literature make use of cryptographic protocols such as Secure Multiparty Computation (SMC) and/or homomorphic encryption schemes like Paillier’s encryption. Methods using such schemes have proven communication overheads, and in practice are found to be slower by a factor of more than 10^6. In light of the practical limitations posed by privacy using the traditional approaches, we explore a paradigm shift to side-step the expensive protocols of SMC. In this work, we use the paradigm of secret sharing, which allows the data to be divided into multiple shares and processed separately at different servers. Using the paradigm of secret sharing allows us to design a provably secure, cloud computing based solution which has negligible communication overhead compared to SMC and is hence over a million times faster than similar SMC based protocols.
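The idea of dividing data into shares that are processed separately can be illustrated with a minimal sketch of additive secret sharing. This is not the protocol from the paper: the modulus `PRIME`, the helper names, and the choice of three servers are all assumptions made for illustration. It only shows how a sum (the quantity needed to update a K-means centroid) can be computed without any single server seeing a data point.

```python
import secrets

# Hypothetical modulus for this sketch; any prime larger than the sums involved works.
PRIME = 2**61 - 1


def share(value, n_servers):
    """Split an integer into n additive shares that sum to value modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_servers - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares):
    """Recombine shares; fewer than all of them reveal nothing about the value."""
    return sum(shares) % PRIME


def shared_sum(values, n_servers):
    """Each server accumulates only the shares it holds; the total is rebuilt at the end."""
    per_server = [0] * n_servers
    for v in values:
        for i, s in enumerate(share(v, n_servers)):
            per_server[i] = (per_server[i] + s) % PRIME
    return reconstruct(per_server)


# One coordinate of the points assigned to a cluster (illustrative values).
points = [12, 7, 30, 5]
total = shared_sum(points, n_servers=3)
centroid = total / len(points)
```

Addition commutes with sharing, so each server works on random-looking numbers and only the final recombination exposes the aggregate. Steps such as distance comparison need more machinery than this sketch shows.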
Video based palmprint recognition
CHHAYA METHANI,Anoop Namboodiri
International conference on Pattern Recognition, ICPR, 2010
@inproceedings{bib_Vide_2010, AUTHOR = {CHHAYA METHANI, Anoop Namboodiri}, TITLE = {Video based palmprint recognition}, BOOKTITLE = {International conference on Pattern Recognition}. YEAR = {2010}}
The use of a camera as a biometric sensor is desirable due to its ubiquity and low cost, especially for mobile devices. Palmprint is an effective modality in such cases due to its discrimination power, ease of presentation and the scale and size of texture for capture by commodity cameras. However, the unconstrained nature of pose and lighting introduces several challenges in the recognition process. Even minor changes in pose of the palm can induce significant changes in the visibility of the lines. We turn this property to our advantage by capturing a short video, where the natural palm motion induces minor pose variations, providing additional texture information. We propose a method to register multiple frames of the video without requiring correspondence, while being efficient. Experimental results on a set of 100 different palms show that the use of multiple frames reduces the error rate from 12.75% to 4.7%. We also propose a method for detection of poor quality samples due to specularities and motion blur, which further reduces the EER to 1.8%.
A hybrid model for recognition of online handwriting in Indian scripts
AMIT ARORA,Anoop Namboodiri
International Conference on Frontiers in Handwriting Recognition, ICFHR, 2010
@inproceedings{bib_A_hy_2010, AUTHOR = {AMIT ARORA, Anoop Namboodiri}, TITLE = {A hybrid model for recognition of online handwriting in Indian scripts}, BOOKTITLE = {International Conference on Frontiers in Handwriting Recognition}. YEAR = {2010}}
We present a complete online handwritten character recognition system for Indian languages that handles the ambiguities in segmentation as well as recognition of the strokes. The recognition is based on a generative model of handwriting formation, coupled with a discriminative model for classification of strokes. Such an approach can seamlessly integrate language and script information in the generative model and deal with similar strokes using the discriminative stroke classification model. The recognition is performed in a purely bottom-up fashion, starting with the strokes, and the ambiguities at each stage are preserved and transferred to the next stage for obtaining the most probable results at each stage. We also present the results of various preprocessing, feature selection and classification studies on a large data set collected from native language writers in two different Indian languages: Malayalam and Telugu. The system achieves a stroke level accuracy of 95.78% and 95.12% on Malayalam and Telugu data, respectively. The akshara level accuracy of the system is around 78% on a corpus of 60,492 words from 367 writers.
Image based PTM synthesis for realistic rendering of low resolution 3D models
NALLAGANCHU PRADEEP RAJIV,Anoop Namboodiri
Indian Conference on Computer Vision, Graphics and Image Processing, ICVGIP, 2010
@inproceedings{bib_Imag_2010, AUTHOR = {NALLAGANCHU PRADEEP RAJIV, Anoop Namboodiri}, TITLE = {Image based PTM synthesis for realistic rendering of low resolution 3D models}, BOOKTITLE = {Indian Conference on Computer Vision, Graphics and Image Processing}. YEAR = {2010}}
Capturing the shape and texture of large structures such as monuments and statues at very high resolution is extremely expensive, both in terms of time as well as storage space. In many cases the finer details are generated by surface properties of the material, and the appearance is statistically uniform. In this paper, we present an approach to add surface details to a coarse 3D model of an object based on two additional sources of information: a set of images of the object and a high resolution model of the material that the object is made of. The material model that we employ is the Polynomial Texture Map (PTM), which captures the appearance of a surface under various illumination conditions. We use the observed images of the object as constraints to synthesize texture samples for each triangle of the object under any given illumination. The primary challenge is to synthesize a polynomial model of the texture, where the constraints arise in the image domain. We use the knowledge of object illumination to map the texture models into image space and compute the optimal patch. The texture transfer then happens as a complete 3D texture model. We also consider the problems of pose, scale, reflectance and smoothness of surface while carrying out the texture transfer. We synthesize the texture of an object on a per-triangle basis while carrying out operations such as normalization and blending to take care of discontinuities at the edges.
Hierarchical Local Maps for Robust Approximate Nearest Neighbor Computation
PRATYUSH BHATT,Anoop Namboodiri
International Conference on Applied Pattern Recognition, ICAPR, 2009
@inproceedings{bib_Hier_2009, AUTHOR = {PRATYUSH BHATT, Anoop Namboodiri}, TITLE = {Hierarchical Local Maps for Robust Approximate Nearest Neighbor Computation}, BOOKTITLE = {International Conference on Applied Pattern Recognition}. YEAR = {2009}}
In this paper, we propose a novel method for fast nearest neighbor retrieval in non-Euclidean and non-metric spaces. We organize the data in a hierarchical fashion that preserves the local similarity structure. A method to find the approximate nearest neighbor of a query is proposed that drastically reduces the total number of explicit distance measures that need to be computed. The representation overcomes the restrictive assumptions in traditional manifold mappings, while enabling fast nearest neighbor search. Experimental results on the Unipen and CASIA Iris datasets clearly demonstrate the advantages of the approach and improvements over state-of-the-art algorithms. The algorithm can work in batch mode as well as in sequential mode and is highly scalable.
Pose invariant palmprint recognition
CHHAYA METHANI,Anoop Namboodiri
International conference on Biometrics, IJCB, 2009
@inproceedings{bib_Pose_2009, AUTHOR = {CHHAYA METHANI, Anoop Namboodiri}, TITLE = {Pose invariant palmprint recognition}, BOOKTITLE = {International conference on Biometrics}. YEAR = {2009}}
A palmprint based authentication system that can work with a multipurpose camera in uncontrolled circumstances, such as those mounted on a laptop, mobile device or those for surveillance, can dramatically increase the applicability of such a system. However, the performance of existing techniques for palmprint authentication falls considerably when the camera is not aligned with the surface of the palm. The problems arise primarily due to variations in appearance introduced by varying pose, but are compounded by specularity of the skin and blur due to motion and focus. In this paper, we propose a method to deal with variations in pose in unconstrained palmprint imaging. The method can robustly estimate and correct variations in pose, and compute a similarity measure between the corrected test image and a reference image. Experimental results on a set of 100 users’ palms captured at varying poses show a reduction in Equal Error Rate from 22.4% to 8.7%.
Efficient Biometric Verification in Encrypted Domain
MANEESH UPMANYU,Anoop Namboodiri,Srinathan Kannan,Jawahar C V
International conference on Biometrics, IJCB, 2009
@inproceedings{bib_Effi_2009, AUTHOR = {MANEESH UPMANYU, Anoop Namboodiri, Srinathan Kannan, Jawahar C V}, TITLE = {Efficient Biometric Verification in Encrypted Domain}, BOOKTITLE = {International conference on Biometrics}. YEAR = {2009}}
Biometric authentication over public networks leads to a variety of privacy issues that need to be addressed before it can become popular. The primary concerns are that the biometrics might reveal more information than the identity itself, as well as provide the ability to track users over an extended period of time. In this paper, we propose an authentication protocol that alleviates these concerns. The protocol takes care of user privacy, template protection and trust issues in biometric authentication systems. The protocol uses asymmetric encryption, and captures the advantages of biometric authentication. The protocol provides non-repudiable identity verification, while not revealing any additional information about the user to the server or vice versa. We show that the protocol is secure under various attacks. Experimental results indicate that the overall method is efficient enough to be used in practical scenarios.
Contextual restoration of severely degraded document images
JYOTIRMOY BANERJEE,Anoop Namboodiri,Jawahar C V
Computer Vision and Pattern Recognition, CVPR, 2009
@inproceedings{bib_Cont_2009, AUTHOR = {JYOTIRMOY BANERJEE, Anoop Namboodiri, Jawahar C V}, TITLE = {Contextual restoration of severely degraded document images}, BOOKTITLE = {Computer Vision and Pattern Recognition}. YEAR = {2009}}
We propose an approach to restore severely degraded document images using a probabilistic context model. Unlike traditional approaches that use previously learned prior models to restore an image, we are able to learn the text model from the degraded document itself, making the approach independent of script, font, style, etc. We model the contextual relationship using an MRF. The ability to work with larger patch sizes allows us to deal with severe degradations including cuts, blobs, merges and vandalized documents. Our approach can also integrate document restoration and super-resolution into a single framework, thus directly generating high quality images from degraded documents. Experimental results show significant improvement in image quality on document images collected from various sources including magazines and books, and comprehensively demonstrate the robustness and adaptability of the approach. It works well with document collections such as books, even with severe degradations, and hence is ideally suited for repositories such as digital libraries.
Retrieval of online handwriting by synthesis and matching
Jawahar C V,A BALA SUBRAMANIAN,MILLION MESHESHA,Anoop Namboodiri
Pattern Recognition, PR, 2009
@inproceedings{bib_Retr_2009, AUTHOR = {Jawahar C V, A BALA SUBRAMANIAN, MILLION MESHESHA, Anoop Namboodiri}, TITLE = {Retrieval of online handwriting by synthesis and matching}, BOOKTITLE = {Pattern Recognition}. YEAR = {2009}}
Search and retrieval is gaining importance in the ink domain due to the increase in the availability of online handwritten data. However, the problem is challenging due to variations in handwriting between various writers, digitizers and writing conditions. In this paper, we propose a retrieval mechanism for online handwriting, which can handle different writing styles, specifically for Indian languages. The proposed approach provides a keyboard-based search interface that enables searching handwritten data from any platform, in addition to pen-based and example-based queries. One of the major advantages of this framework is that information retrieval techniques such as ranking relevance, detecting stopwords and controlling word forms are extended to work with search and retrieval in the ink domain. The framework also allows cross-lingual document retrieval across Indian languages.
Learning and adaptation for improving handwritten character recognizers
NAVEEN CHANDRA TEWARI,Anoop Namboodiri
International Conference on Document Analysis and Recognition, ICDAR, 2009
@inproceedings{bib_Lear_2009, AUTHOR = {NAVEEN CHANDRA TEWARI, Anoop Namboodiri}, TITLE = {Learning and adaptation for improving handwritten character recognizers}, BOOKTITLE = {International Conference on Document Analysis and Recognition}. YEAR = {2009}}
Writer independent handwriting recognition systems are limited in their accuracy, primarily due to the large variations in writing styles of most characters. Samples from a single character class can be thought of as emanating from multiple sources, corresponding to each writing style. This also makes the inter-class boundaries complex and disconnected in the feature space. Multiple kernel methods have emerged as a potential framework to model such decision boundaries effectively, which can be coupled with maximal margin learning algorithms. We show that formulating the problem in the above framework improves the recognition accuracy. We also propose a mechanism to adapt the resulting classifier by modifying the weights of the support vectors as well as that of the individual kernels. Experimental results are presented on a data set of 16,000 alphabets collected from 470 writers using a digitizing tablet.
Efficient privacy preserving video surveillance
MANEESH UPMANYU,Anoop Namboodiri,Srinathan Kannan,Jawahar C V
International Conference on Computer Vision, ICCV, 2009
@inproceedings{bib_Effi_2009, AUTHOR = {MANEESH UPMANYU, Anoop Namboodiri, Srinathan Kannan, Jawahar C V}, TITLE = {Efficient privacy preserving video surveillance}, BOOKTITLE = {International Conference on Computer Vision}. YEAR = {2009}}
Widespread use of surveillance cameras in offices and other business establishments poses a significant threat to the privacy of the employees and visitors. The challenge of introducing privacy and security in such a practical surveillance system has been stifled by the enormous computational and communication overhead required by the solutions. In this paper, we propose an efficient framework to carry out privacy preserving surveillance. We split each frame into a set of random images. Each image by itself does not convey any meaningful information about the original frame, while collectively, they retain all the information. Our solution is derived from a secret sharing scheme based on the Chinese Remainder Theorem, suitably adapted to image data. Our method enables distributed secure processing and storage, while retaining the ability to reconstruct the original data in case of a legal requirement. The system installed in an office-like environment can effectively detect and track people, or solve similar surveillance tasks. Our proposed paradigm is highly efficient compared to Secure Multiparty Computation, making privacy preserving surveillance practical.
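The Chinese Remainder Theorem splitting mentioned above can be sketched as follows. This is a bare illustration of the reconstruction arithmetic only, not the scheme from the paper: the moduli in `MODULI` and all function names are hypothetical, and the randomization that makes each share uninformative (the "suitably adapted" part) is deliberately left out, so plain residues like these would still leak information.

```python
from math import prod

# Hypothetical pairwise-coprime moduli; their product (1001) exceeds the 8-bit pixel range.
MODULI = (7, 11, 13)


def split_pixel(p):
    """Represent one pixel value by its residue modulo each modulus."""
    return tuple(p % m for m in MODULI)


def reconstruct_pixel(residues):
    """Recover the pixel from its residues with the Chinese Remainder Theorem."""
    big_m = prod(MODULI)
    x = 0
    for r, m in zip(residues, MODULI):
        partial = big_m // m
        # pow(partial, -1, m) is the modular inverse of partial modulo m (Python 3.8+).
        x += r * partial * pow(partial, -1, m)
    return x % big_m


def split_frame(frame):
    """Turn a frame (list of pixel rows) into one share image per modulus."""
    return [[[p % m for p in row] for row in frame] for m in MODULI]


def reconstruct_frame(shares):
    """Combine all share images pixel by pixel to rebuild the original frame."""
    rows, cols = len(shares[0]), len(shares[0][0])
    return [
        [reconstruct_pixel(tuple(s[i][j] for s in shares)) for j in range(cols)]
        for i in range(rows)
    ]


frame = [[0, 255, 128], [17, 200, 64]]
shares = split_frame(frame)
restored = reconstruct_frame(shares)
```

Because residues add and multiply independently under each modulus, pixel-wise arithmetic can in principle be carried out on each share image separately and combined afterwards, as long as the true result stays below the product of the moduli.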
Frequency Domain Visual Servoing using Planar Contours
Visesh Chari,AVINASH SHARMA,Anoop Namboodiri,Jawahar C V
Indian Conference on Computer Vision, Graphics and Image Processing, ICVGIP, 2008
@inproceedings{bib_Freq_2008, AUTHOR = {Visesh Chari, AVINASH SHARMA, Anoop Namboodiri, Jawahar C V}, TITLE = {Frequency Domain Visual Servoing using Planar Contours}, BOOKTITLE = {Indian Conference on Computer Vision, Graphics and Image Processing}. YEAR = {2008}}
Fourier domain methods have had a long association with geometric vision. In this paper, we introduce Fourier domain methods into the field of visual servoing for the first time. We show how different properties of Fourier transforms may be used to address specific issues in traditional visual servoing methods, giving rise to algorithms that are more flexible. Specifically, we demonstrate how Fourier analysis may be used to obtain straight camera paths in the Cartesian space, do path following and correspondenceless visual servoing. Most importantly, by introducing Fourier techniques, we set a framework into which robust Fourier based geometry processing algorithms may be incorporated to address the various issues in servoing.
Robust image registration with illumination, blur and noise variations for super-resolution
HIMANSHU ARORA,Anoop Namboodiri,Jawahar C V
International Conference on Acoustics, Speech, and Signal Processing, ICASSP, 2008
@inproceedings{bib_Robu_2008, AUTHOR = {HIMANSHU ARORA, Anoop Namboodiri, Jawahar C V}, TITLE = {Robust image registration with illumination, blur and noise variations for super-resolution}, BOOKTITLE = {International Conference on Acoustics, Speech, and Signal Processing}. YEAR = {2008}}
Super-resolution reconstruction algorithms assume the availability of exact registration and blur parameters. Inaccurate estimation of these parameters adversely affects the quality of the reconstructed image. However, traditional approaches for image registration are either sensitive to image degradations such as variations in blur, illumination and noise, or are limited in the class of image transformations that can be estimated. We propose an accurate registration algorithm that uses the local phase information, which is robust to the above degradations. We derive the theoretical error rate of the estimates in presence of non-ideal band-pass behavior of the filter and show that the error converges to zero over iterations. We also show the invariance of local phase to a class of blur kernels. Experimental results on images taken under varying conditions clearly demonstrate the robustness of our approach.
Projected texture for hand geometry based authentication
AVINASH SHARMA,NISHANT SHOBHIT,Anoop Namboodiri
Computer Vision and Pattern Recognition Conference workshops, CVPR-W, 2008
@inproceedings{bib_Proj_2008, AUTHOR = {AVINASH SHARMA, NISHANT SHOBHIT, Anoop Namboodiri}, TITLE = {Projected texture for hand geometry based authentication}, BOOKTITLE = {Computer Vision and Pattern Recognition Conference workshops}. YEAR = {2008}}
We propose a novel approach to 3D hand geometry based person authentication using projected light patterns. Instead of explicitly computing a depth map of the palm for recognition, we capture the depth information in the deformations of a projected texture pattern, and use it directly for recognition. The deformed pattern is characterized using local texture measures, which can encode certain depth characteristics of the palm. An authentication system built using the proposed technique achieves an equal error rate of 0.84% on a dataset of 1341 samples collected from 149 users, as opposed to 4.03% using traditional 2D features on an identical dataset. The approach is robust as well as computationally efficient and could be applied to other 3D object recognition problems as well.
Text dependent writer verification using boosting
SACHIN GUPTA,Anoop Namboodiri
International Conference on Frontiers in Handwriting Recognition, ICFHR, 2008
@inproceedings{bib_Text_2008, AUTHOR = {SACHIN GUPTA, Anoop Namboodiri}, TITLE = {Text dependent writer verification using boosting}, BOOKTITLE = {International Conference on Frontiers in Handwriting Recognition}. YEAR = {2008}}
Text-dependent writer verification systems are preferred over text-independent systems due to the accuracy they achieve with a small amount of data. However, text-dependent systems are prone to forgery. This paper proposes a novel boosting based framework for writer-specific text generation to increase the accuracy and a method of text variation to make the system robust to forgery. The approach is able to achieve error rates of 5% with just 6 words as compared to random (11%) or most discriminative (22%) primitive selection methods on a dataset containing 20 writers. Boosting based text selection also provides the flexibility to incorporate text variation across multiple authentications, which in turn makes the system robust to forgery.
Projected texture for object classification
AVINASH SHARMA,Anoop Namboodiri
European Conference on Computer Vision, ECCV, 2008
@inproceedings{bib_Proj_2008, AUTHOR = {AVINASH SHARMA, Anoop Namboodiri}, TITLE = {Projected texture for object classification}, BOOKTITLE = {European Conference on Computer Vision}. YEAR = {2008}}
Algorithms for classification of 3D objects either recover the depth information lost during imaging using multiple images, structured lighting, image cues, etc. or work directly on the images for classification. While the latter class of algorithms are more efficient and robust in comparison, they are less accurate due to the lack of depth information. We propose the use of structured lighting patterns projected on the object, which get deformed according to the shape of the object. Since our goal is object classification and not shape recovery, we characterize the deformations using simple texture measures, thus avoiding the error prone and computationally expensive step of depth recovery. Moreover, since the deformations encode depth variations of the object, the 3D shape information is implicitly used for classification. We show that the information thus derived can significantly improve the accuracy of object classification algorithms, and derive the theoretical limits on height variations that can be captured by a particular projector-camera setup. A 3D texture classification algorithm derived from the proposed approach achieves a ten-fold reduction in error rate on a dataset of 30 classes, when compared to state-of-the-art image based approaches. We also demonstrate the effectiveness of the approach for a hand geometry based authentication system, which achieves a four-fold reduction in the equal error rate on a dataset containing 149 users.
Projected pixel localization and artifact removal in captured images
HIMANSHU ARORA,Anoop Namboodiri
IEEE Region 10 Conference, TENCON, 2008
@inproceedings{bib_Proj_2008, AUTHOR = {HIMANSHU ARORA, Anoop Namboodiri}, TITLE = {Projected pixel localization and artifact removal in captured images}, BOOKTITLE = {IEEE Region 10 Conference}. YEAR = {2008}}
Projector-Camera systems are extensively used for various applications in computer vision, immersive environments, visual servoing, etc. Due to gaps between neighboring pixels on the projector’s image plane and variations in scene depth, the image projected onto a scene shows pixelation and blurring artifacts. In certain vision and graphics applications, it is desirable that a high quality composition of the scene and the projected image, excluding the artifacts, is captured, while retaining the scene characteristics. Localization of projected pixels can also help in dense estimation of scene shape. In this paper, we address the problem of localizing each of the projected pixels from a captured scene and restoring the captured image so that the pixelation and blurring artifacts of the projector are removed. We improve the quality of the captured image further by virtualizing a high resolution projector, i.e., we modify the captured image as if the scene were illuminated with a high-resolution projector. We propose robust solutions to these problems and demonstrate their effectiveness on scenes of different complexities.
How much zoom is the right zoom from the perspective of Super-Resolution?
HIMANSHU ARORA,Anoop Namboodiri
Indian Conference on Computer Vision, Graphics and Image Processing, ICVGIP, 2008
@inproceedings{bib_How__2008, AUTHOR = {HIMANSHU ARORA, Anoop Namboodiri}, TITLE = {How much zoom is the right zoom from the perspective of Super-Resolution?}, BOOKTITLE = {Indian Conference on Computer Vision, Graphics and Image Processing}. YEAR = {2008}}
Constructing a high-resolution (HR) image from low resolution (LR) image(s) has been a very active research topic recently with focus shifting from multi-frames to learning based single-frame super-resolution (SR). Multi-frame SR algorithms attempt the exact reconstruction of reality, but are limited to small magnification factors. Learning based SR algorithms learn the correspondences between LR and HR patches. Accurate replacements or revealing the exact underlying information is not guaranteed in many scenarios. In this paper we propose an alternate solution. We propose to capture images at the right zoom such that they have just sufficient information so that further resolution enhancements can be easily achieved using any off-the-shelf single-frame SR algorithm. This is true under the assumption that such a zoom factor is not very high, which is true for most man-made structures. The low-resolution image is divided into small patches and ideal resolution is predicted for every patch. The contextual information is incorporated using a Markov Random Field based prior. Training data is generated from high-quality images and can use any single-frame SR algorithm. Several constraints are proposed to minimize the extent of zoom-in. We validate the proposed approach on synthetic data and real world images to show the robustness.
Object Category Recognition with Projected Texture
AVINASH SHARMA,Anoop Namboodiri
Indian Conference on Computer Vision, Graphics and Image Processing, ICVGIP, 2008
@inproceedings{bib_Obje_2008, AUTHOR = {AVINASH SHARMA, Anoop Namboodiri}, TITLE = {Object Category Recognition with Projected Texture}, BOOKTITLE = {Indian Conference on Computer Vision, Graphics and Image Processing}. YEAR = {2008}}
Recognition of object categories from their images is extremely challenging due to the large intra-class variations, and variations in pose, illumination and scale, in addition to lack of depth information of the object. Recovering the depth information from multiple images or from image cues such as variations in illumination or focus is both computationally intensive and error prone. In contrast, the appearance based approaches are more robust and computationally efficient. However, they lack the potential accuracy of 3D feature based approaches due to the lack of shape information. We propose the use of structured lighting patterns projected on the object, which get deformed according to the shape of the object, for recognition. Since our goal is object classification and not shape recovery, we characterize the deformations using simple texture measures, thus avoiding the depth recovery step. Moreover, the shape information present in the deformations is implicitly used for classification. We show that the information thus derived can significantly improve the accuracy of object category recognition from arbitrary-pose images.
Document structure and layout analysis
Anoop Namboodiri,Anil K. Jain
Digital Document Processing, DDP, 2007
@inproceedings{bib_Docu_2007, AUTHOR = {Anoop Namboodiri, Anil K. Jain}, TITLE = {Document structure and layout analysis}, BOOKTITLE = {Digital Document Processing}. YEAR = {2007}}
A document image is composed of a variety of physical entities or regions such as text blocks, lines, words, figures, tables, and background. We could also assign functional or logical labels such as sentences, titles, captions, author names, and addresses to some of these regions. The process of document structure and layout analysis tries to decompose a given document image into its component regions and understand their functional roles and relationships. The processing is carried out in multiple steps, such as preprocessing, page decomposition, structure understanding, etc. We will look into each of these steps in detail in the following sections. Document images are often generated from ph
Support Vector Machine based Hierarchical Classifiers for Large Class Problems
CH.TEJO KRISHNA,Anoop Namboodiri,Jawahar C V
International Conference on Applied Pattern Recognition, ICAPR, 2007
@inproceedings{bib_Supp_2007, AUTHOR = {CH.TEJO KRISHNA, Anoop Namboodiri, Jawahar C V}, TITLE = {Support Vector Machine based Hierarchical Classifiers for Large Class Problems}, BOOKTITLE = {International Conference on Applied Pattern Recognition}. YEAR = {2007}}
One of the prime challenges in designing a classifier for large-class problems such as Indian language OCRs is the presence of a large set of similar-looking characters. The nature of the character set introduces problems with accuracy and efficiency of the classifier. Hierarchical classifiers such as Binary Hierarchical Decision Trees (BHDTs) using SVMs as component classifiers have been effectively used to tackle such large-class classification problems. The accuracy and efficiency of a BHDT classifier will depend on: i) the accuracy of the component classifiers, ii) the separability of the clusters at each node in a hierarchical classifier, and iii) the balance of the BHDT. We propose methods to tackle each of the above problems in the case of binary character images. We present a new distance measure, which is intuitively suitable when Support Vector Machines are used as component classifiers. We also propose a novel method for balancing the BHDT to improve its efficiency, while maintaining the accuracy. Finally, we propose a method to generate overlapping partitions to improve the accuracy of BHDTs. Comparison of the method with other forms of classifier combination techniques such as 1vs1, 1vsRest and Decision Directed Acyclic Graphs shows that the proposed approach is highly efficient, while being comparable with the more expensive techniques in terms of accuracy. The experiments are focused on the problem of Indian language OCR, while the framework is usable for other problems as well.
Accurate image registration from local phase information
HIMANSHU ARORA,Anoop Namboodiri,Jawahar C V
National Conference on Communications, NCC, 2007
@inproceedings{bib_Accu_2007, AUTHOR = {HIMANSHU ARORA, Anoop Namboodiri, Jawahar C V}, TITLE = {Accurate image registration from local phase information}, BOOKTITLE = {National Conference on Communications}. YEAR = {2007}}
Accurate registration of images is essential for many computer vision algorithms for medical image analysis, super-resolution, and image mosaicing. Performance of traditional correspondence-based approaches is restricted by the reliability of the feature detector. Popular frequency domain approaches use the magnitude of global frequencies for registration, and are limited in the class of transformations that can be estimated. We propose the use of local phase information for accurate image registration as it is robust to noise and illumination conditions and the estimates are obtained at sub-pixel accuracy without any correspondence computation. We form an overdetermined system of equations from the phase differences to estimate the parameters of image registration. We demonstrate the effectiveness of the approach for affine transformation under Gaussian white noise and varying illumination conditions.
Repudiation detection in handwritten documents
SACHIN GUPTA,Anoop Namboodiri
International conference on Biometrics, IJCB, 2007
@inproceedings{bib_Repu_2007, AUTHOR = {SACHIN GUPTA and Anoop Namboodiri}, TITLE = {Repudiation detection in handwritten documents}, BOOKTITLE = {International conference on Biometrics}, YEAR = {2007}}
Forensic document verification presents a different and interesting set of challenges as opposed to traditional writer identification and verification tasks using natural handwriting. The handwritten data presented to a forensic examiner is often deliberately altered, in addition to being limited in quantity. Specifically, the alterations can be either forged, where one imitates another person’s handwriting; or repudiated, where one deliberately distorts his handwriting in order to avoid identification. In this paper, we present a framework to detect repudiation in forensic documents, where we only have one pair of documents to arrive at a decision. The approach generates a statistically significant confidence score from matching two documents, which can be used to screen the documents that are passed on to an expert examiner. The approach can be extended for detection of forgeries as well.
On using classical poetry structure for Indian language post-processing
Anoop Namboodiri,Narayanan P J,Jawahar C V
International Conference on Document Analysis and Recognition, ICDAR, 2007
@inproceedings{bib_On_u_2007, AUTHOR = {Anoop Namboodiri and Narayanan P J and Jawahar C V}, TITLE = {On using classical poetry structure for Indian language post-processing}, BOOKTITLE = {International Conference on Document Analysis and Recognition}, YEAR = {2007}}
Post-processors are critical to the performance of language recognizers like OCRs, speech recognizers, etc. Dictionary-based post-processing commonly employs either an algorithmic approach or a statistical approach. Other linguistic features are not exploited for this purpose. The language analysis is also largely limited to the prose form. This paper proposes a framework to use the rich metric and formal structure of classical poetic forms in Indian languages for post-processing a recognizer like an OCR engine. We show that the structure present in the form of the vrtta and prāsa can be efficiently used to disambiguate some cases that may be difficult for an OCR. The approach is efficient and complementary to other post-processing approaches, and can be used in conjunction with them.
Learning segmentation of documents with complex scripts
K S SESH KUMAR,Anoop Namboodiri,Jawahar C V
Indian Conference on Computer Vision, Graphics and Image Processing, ICVGIP, 2006
@inproceedings{bib_Lear_2006, AUTHOR = {K S SESH KUMAR and Anoop Namboodiri and Jawahar C V}, TITLE = {Learning segmentation of documents with complex scripts}, BOOKTITLE = {Indian Conference on Computer Vision, Graphics and Image Processing}, YEAR = {2006}}
Most of the state-of-the-art segmentation algorithms are designed to handle complex document layouts and backgrounds, while assuming a simple script structure such as in Roman script. They perform poorly when used with Indian languages, where the components are not strictly collinear. In this paper, we propose a document segmentation algorithm that can handle the complexity of Indian scripts in large document image collections. Segmentation is posed as a graph cut problem that incorporates a priori information from script structure in the objective function of the cut. We show that this information can be learned automatically and be adapted within a collection of documents (a book) and across collections to achieve accurate segmentation. We show the results on Indian language documents in Telugu script. The approach is also applicable to other languages with complex scripts such as Bangla, Kannada, Malayalam, and Urdu.
Model-based annotation of online handwritten datasets
ANAND KUMAR,A BALA SUBRAMANIAN,Anoop Namboodiri,Jawahar C V
International Conference on Frontiers in Handwriting Recognition, ICFHR, 2006
@inproceedings{bib_Mode_2006, AUTHOR = {ANAND KUMAR and A BALA SUBRAMANIAN and Anoop Namboodiri and Jawahar C V}, TITLE = {Model-based annotation of online handwritten datasets}, BOOKTITLE = {International Conference on Frontiers in Handwriting Recognition}, YEAR = {2006}}
Annotated datasets of handwriting are a prerequisite to attempt a variety of problems such as building recognizers, developing writer identification algorithms, etc. However, the annotation of large datasets is a tedious and expensive process, especially at the character or stroke level. In this paper, we propose a novel, automated method for annotation at the character level, given a parallel corpus of online handwritten data and the corresponding text. The method employs a model-based handwriting synthesis unit to map the two corpora to the same space, and the annotation is propagated to the word level and then to the individual characters using elastic matching. The initial results of annotation are used to improve the handwriting synthesis model for the user under consideration, which in turn refines the annotation. The method can handle errors in the handwriting such as spurious and missing strokes or characters. The output is stored in the UPXInkML format.
Text Independent Writer Identification from Online Handwriting
Anoop Namboodiri,SACHIN GUPTA
International Conference on Frontiers in Handwriting Recognition, ICFHR, 2006
@inproceedings{bib_Text_2006, AUTHOR = {Anoop Namboodiri and SACHIN GUPTA}, TITLE = {Text Independent Writer Identification from Online Handwriting}, BOOKTITLE = {International Conference on Frontiers in Handwriting Recognition}, YEAR = {2006}}
Automatic identification of the author of a document has a variety of applications for both online and offline handwritten data, such as facilitating the use of writer-dependent recognizers, verification of claimed identity for security, enabling personalized HCI, and countering repudiations for legal purposes. Most of the existing writer identification techniques require the data to be from a specific text or a recognizer to be available, which is not always feasible. Text-independent approaches often require a large amount of data to be confident of good results. In this work, we propose a text-independent writer identification framework that uses a specified set of primitives of online handwritten data to ascertain the identity of the writer. The framework allows us to learn the properties of the script and the writers simultaneously and hence can be used with multiple languages or scripts. We demonstrate the applicability of our framework by choosing shapes of curves as primitives and show results on five different scripts and on different data sets.
Document understanding system using stochastic context-free grammars
John C. Handley,Anoop Namboodiri,Richard Zanibbi
International Conference on Document Analysis and Recognition, ICDAR, 2005
@inproceedings{bib_Docu_2005, AUTHOR = {John C. Handley and Anoop Namboodiri and Richard Zanibbi}, TITLE = {Document understanding system using stochastic context-free grammars}, BOOKTITLE = {International Conference on Document Analysis and Recognition}, YEAR = {2005}}
We present a document understanding system in which the arrangement of lines of text and block separators within a document are modeled by stochastic context-free grammars. A grammar corresponds to a document genre; our system may be adapted to a new genre simply by replacing the input grammar. The system incorporates an optical character recognition system that outputs characters, their positions and font sizes. These features are combined to form a document representation of lines of text and separators. Lines of text are labeled as tokens using regular expression matching. The maximum likelihood parse of this stream of tokens and separators yields a functional labeling of the document lines. We describe business card and business letter applications.
Learning to segment document images
K S SESH KUMAR,Anoop Namboodiri,Jawahar C V
Conference on Pattern Recognition and Machine Intelligence, PReMI, 2005
@inproceedings{bib_Lear_2005, AUTHOR = {K S SESH KUMAR and Anoop Namboodiri and Jawahar C V}, TITLE = {Learning to segment document images}, BOOKTITLE = {Conference on Pattern Recognition and Machine Intelligence}, YEAR = {2005}}
A hierarchical framework for document segmentation is proposed as an optimization problem. The model incorporates the dependencies between various levels of the hierarchy, unlike traditional document segmentation algorithms. This framework is applied to learn the parameters of the document segmentation algorithm using optimization methods like gradient descent and Q-learning. The novelty of our approach lies in learning the segmentation parameters in the absence of ground truth.
Retrieval of On-line Hand-Drawn Sketches.
Anoop Namboodiri, Anil K. Jain
International conference on Pattern Recognition, ICPR, 2004
@inproceedings{bib_Retr_2004, AUTHOR = {Anoop Namboodiri and Anil K. Jain}, TITLE = {Retrieval of On-line Hand-Drawn Sketches.}, BOOKTITLE = {International conference on Pattern Recognition}, YEAR = {2004}}
Sketch matching algorithms are commonly used for indexing and retrieval of documents based on printed or hand-drawn sketches. One could use a hand-held computer to do sketch-based queries to a database containing hand-drawn and printed sketches. We present an on-line hand-drawn sketch matching algorithm based on a line-based representation of sketches. A distance measure is defined for comparing two sketches based on this representation. The algorithm is computationally efficient and achieves a recall rate of 88.44% at the same precision, when tested on a database of 150 sketches collected from 5 users.