@inproceedings{bib_On_A_2023, AUTHOR = {Arun Kumar Subramaniam and Anoop Namboodiri}, TITLE = {On Attribute Aware Open-set Face Verification}, BOOKTITLE = {International Conference on Computer Vision Theory and Applications}, YEAR = {2023}}
Deep learning on face recognition problems has shown extremely high accuracy owing to its ability to find strongly discriminating features. However, face images in the wild show variations in pose, lighting, expression, and the presence of facial attributes (for example, eyeglasses). We ask: why, then, are these variations not detected and used during the matching process? We demonstrate that this is indeed possible, restricting ourselves to facial attribute variation to prove the case in point. We show two ways of doing so. a) Using the face attribute labels as a form of prior, we bin the matching template pairs into three bins depending on whether each template of the matching pair possesses a given facial attribute or not. By operating on each bin and averaging the result, we improve on the EER of the SOTA by over 1% over a large set of matching pairs. b) We use the attribute labels and correlate them with each neuron of an embedding generated by a DNN with a SOTA architecture, pre-trained on a large face dataset and fine-tuned on face-attribute labels. We then suppress a set of maximally correlating neurons and perform matching after doing so. We demonstrate that this improves the EER by over 2%.
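A rough sketch of the binning scheme in (a), assuming one binary attribute label per template (all names hypothetical, not the paper's code):

```python
def attribute_bin(has_attr_a: bool, has_attr_b: bool) -> int:
    """Bin a matching template pair by how many of its two templates
    show the attribute (e.g. eyeglasses): 0 = neither, 1 = one, 2 = both."""
    return int(has_attr_a) + int(has_attr_b)

# Matching pairs are grouped so each bin can be thresholded and
# evaluated separately before averaging the per-bin results.
pairs = [(True, True), (True, False), (False, False), (False, True)]
bins = {0: [], 1: [], 2: []}
for a, b in pairs:
    bins[attribute_bin(a, b)].append((a, b))
```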
AdvGen: Physical Adversarial Attack on Face Presentation Attack Detection Systems
Sai Amrit Patnaik, Shivali Chansoriya, Anoop Namboodiri, Anil K. Jain
@inproceedings{bib_AdvG_2023, AUTHOR = {Sai Amrit Patnaik and Shivali Chansoriya and Anoop Namboodiri and Anil K. Jain}, TITLE = {AdvGen: Physical Adversarial Attack on Face Presentation Attack Detection Systems}, BOOKTITLE = {International Joint Conference on Biometrics}, YEAR = {2023}}
Evaluating the risk level of adversarial images is essential for safely deploying face authentication models in the real world. Popular approaches for physical-world attacks, such as print or replay attacks, suffer from limitations, such as the inclusion of physical and geometrical artifacts. Recently, adversarial attacks have gained traction; these try to digitally deceive the learning strategy of a recognition system using slight modifications to the captured image. While most previous research assumes that the adversarial image could be digitally fed into the authentication systems, this is not always the case for systems deployed in the real world. This paper demonstrates the vulnerability of face authentication systems to adversarial images in physical-world scenarios. We propose AdvGen, an automated Generative Adversarial Network, to simulate print and replay attacks and generate adversarial images that can fool state-of-the-art PADs in a physical domain attack setting. Using this attack strategy, the attack success rate goes up to 82.01%. We test AdvGen extensively on four datasets and ten state-of-the-art PADs. We also demonstrate the effectiveness of our attack by conducting experiments in a realistic, physical environment.
Finger-UNet: A U-Net Based Multi-Task Architecture for Deep Fingerprint Enhancement
@inproceedings{bib_Fing_2023, AUTHOR = {Gavas Ekta Balkrishna and Anoop Namboodiri}, TITLE = {Finger-UNet: A U-Net Based Multi-Task Architecture for Deep Fingerprint Enhancement}, BOOKTITLE = {International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications}, YEAR = {2023}}
For decades, fingerprint recognition has been prevalent in security, forensics, and other biometric applications. However, the availability of good-quality fingerprints is challenging, making recognition difficult. Fingerprint images might be degraded with a poor ridge structure and noisy or low-contrast backgrounds. Hence, fingerprint enhancement plays a vital role in the early stages of the fingerprint recognition/verification pipeline. In this paper, we investigate and improve upon the encoder-decoder style architecture and suggest intuitive modifications to U-Net to enhance low-quality fingerprints effectively. We investigate the use of the Discrete Wavelet Transform (DWT) for fingerprint enhancement and use a wavelet attention module instead of max pooling, which proves advantageous for our task. Moreover, we replace regular convolutions with depthwise separable convolutions, which significantly reduces the memory footprint of the model without degrading the performance. We also demonstrate that incorporating domain knowledge through a fingerprint minutiae prediction task can improve fingerprint reconstruction via multi-task learning. Furthermore, we integrate the orientation estimation task to propagate knowledge of ridge orientations and enhance the performance further. We present experimental results and evaluate our model on the FVC 2002 and NIST SD302 databases to show the effectiveness of our approach compared to previous works.
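For reference, the single-level 2-D Haar DWT that such a wavelet attention module would consume can be computed in a few lines (a plain-numpy sketch, not the paper's implementation; assumes even height and width):

```python
import numpy as np

def haar_dwt2(img: np.ndarray):
    """Single-level 2-D Haar DWT: returns the LL (approximation) and
    LH/HL/HH (detail) subbands, each at half the input resolution."""
    lo = (img[0::2] + img[1::2]) / 2.0   # vertical low-pass
    hi = (img[0::2] - img[1::2]) / 2.0   # vertical high-pass
    ll = (lo[:, 0::2] + lo[:, 1::2]) / 2.0
    lh = (lo[:, 0::2] - lo[:, 1::2]) / 2.0
    hl = (hi[:, 0::2] + hi[:, 1::2]) / 2.0
    hh = (hi[:, 0::2] - hi[:, 1::2]) / 2.0
    return ll, lh, hl, hh

patch = np.arange(16.0).reshape(4, 4)    # stand-in for a fingerprint patch
ll, lh, hl, hh = haar_dwt2(patch)
```

Unlike max pooling, the four subbands together retain all the input information, which is one motivation for using a wavelet decomposition in place of pooling.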
@inproceedings{bib_S-BA_2023, AUTHOR = {Praguna Manvi and Desai Achintya Manohar and Srinathan Kannan and Anoop Namboodiri}, TITLE = {S-BAN: Secure Biometric Authentication using Noise}, BOOKTITLE = {Indian Conference on Computer Vision, Graphics and Image Processing}, YEAR = {2023}}
A biometric signal's irrelevant or non-distinctive features can contain useful correlational properties that privacy-preserving verification schemes can exploit. While an efficient protocol for iris verification using noise has been presented, it is not applicable to other widely used modalities, namely face and fingerprint, since the methods of noise extraction and comparison are different. In this work, we design a verification protocol for secure dot product computation and also propose noise extraction mechanisms for the face and fingerprint modalities. We evaluate the performance of the protocol on the CFP, LFW, CelebA, FVC 2004 DB1A, DB2A, DB3A, and SOCOFing datasets. While the protocol exhibits a slight degradation in accuracy, it provides information-theoretic security with practical computational complexity.
Cycle Consistency Based Method for Learning Disentangled Representation for Stochastic Video Prediction
@inproceedings{bib_Cycl_2022, AUTHOR = {Ujjwal Tiwari and P ADITYA SREEKAR and Anoop Namboodiri}, TITLE = {Cycle Consistency Based Method for Learning Disentangled Representation for Stochastic Video Prediction}, BOOKTITLE = {International Conference on Image Analysis and Processing}, YEAR = {2022}}
Video frame prediction is an interesting computer vision problem of predicting the future frames of a video sequence from a given set of context frames. Video prediction models have found prospective applications in autonomous navigation, representation learning, and healthcare. However, predicting future frames is challenging due to the high dimensional and stochastic nature of video data. This work proposes a novel cycle consistency loss to disentangle video representation into a low dimensional time-dependent pose factor and a time-independent content latent factor in two different VAE-based video prediction models. The key motivation behind the cycle consistency loss is that future frame predictions are more plausible and realistic if they can reconstruct the previous frames. The proposed cycle consistency loss is also generic, since it can be applied to other VAE-based stochastic video prediction architectures with slight architectural modifications. We validate our disentanglement hypothesis and the quality of long-range predictions on standard synthetic and challenging real-world datasets such as Stochastic Moving MNIST and BAIR.
@inproceedings{bib_SIAN_2022, AUTHOR = {Praguna Manvi and ACHINTYA DESAI and Srinathan Kannan and Anoop Namboodiri}, TITLE = {SIAN: Secure Iris Authentication using Noise}, BOOKTITLE = {International Joint Conference on Biometrics}, YEAR = {2022}}
Biometric noise is often discarded in many biometric template protection systems. However, the noise ratio between two templates encodes specific correlational properties that template protection schemes can exploit. Biometric authentication usually occurs between mutually distrusting parties, which calls for privacy-preserving techniques. In this paper, we propose a novel biometric authentication protocol, SIAN (Secure Iris Authentication using Noise), adapting secure two-party computation and incorporating uncertainty constraints from biometric noise for security. We evaluate it on three iris datasets: MMU v1, Ubiris v1, and IITD v1, and observe a low EER degradation. The proposed protocol has information-theoretic security and low computational complexity, making it suitable for practical real-time applications.
@inproceedings{bib_Tran_2022, AUTHOR = {Saraansh Tandon and Anoop Namboodiri}, TITLE = {Transformer based Fingerprint Feature Extraction}, BOOKTITLE = {International Conference on Pattern Recognition}, YEAR = {2022}}
Fingerprint feature extraction is a task that is solved using either a global or a local representation. State-of-the-art global approaches use heavy deep learning models to process the full fingerprint image at once, which makes the corresponding approach memory intensive. On the other hand, local approaches involve minutiae based patch extraction, multiple feature extraction steps and an expensive matching stage, which make the corresponding approach time intensive. However, both these approaches provide useful and sometimes exclusive insights for solving the problem. Using both approaches together for extracting fingerprint representations is semantically useful but quite inefficient. Our convolutional transformer based approach with an in-built minutiae extractor provides a time and memory efficient solution to extract a global as well as a local representation of the fingerprint. The use of these …
SYSTEM AND METHOD FOR GENERATING DEWARPED IMAGE USING PROJECTION PATTERNS CAPTURED FROM OMNI-DIRECTIONAL STEREO CAMERAS
@inproceedings{bib_SYST_2022, AUTHOR = {Rajat Agarwal and Anoop Namboodiri and Parikshit Sakurikar}, TITLE = {SYSTEM AND METHOD FOR GENERATING DEWARPED IMAGE USING PROJECTION PATTERNS CAPTURED FROM OMNI-DIRECTIONAL STEREO CAMERAS}, BOOKTITLE = {United States Patent}, YEAR = {2022}}
A system for generating a high-resolution de-warped omni-directional stereo image from a captured omni-directional stereo image by correcting optical distortions using projection patterns is provided. The system includes a projection pattern capturing arrangement, a projector or a display, and a de-warping server. The projection pattern capturing arrangement includes one or more omni-directional cameras to capture projection patterns from the captured omni-directional stereo image from each omni-directional stereo camera. The projector or the display displays the projection patterns. The de-warping server obtains the projection patterns and processes them to generate a high-resolution de-warped omni-directional stereo image by correcting optical distortions in the captured omni-directional stereo image and mapping the captured omni-directional stereo image to the high-resolution de-warped omni-directional stereo image.
CLActive: Episodic Memories for Rapid Active Learning
Sidhant Subramanian, SRI AUROBINDO MUNAGALA, SHYAMGOPAL KARTHIK, Ameya Prabhu, Anoop Namboodiri
@inproceedings{bib_CLAC_2022, AUTHOR = {Sidhant Subramanian and SRI AUROBINDO MUNAGALA and SHYAMGOPAL KARTHIK and Ameya Prabhu and Anoop Namboodiri}, TITLE = {CLActive: Episodic Memories for Rapid Active Learning}, BOOKTITLE = {Conference on Lifelong Learning Agents, PMLR}, YEAR = {2022}}
Active Learning aims to alleviate labelling costs for large-scale datasets by selecting a subset of data to effectively train on. Deep Active Learning (DAL) techniques typically involve repeated training of a model for sample acquisition over the entire subset of labelled data available in each round. This can be prohibitively expensive to run in real-world scenarios with large and constantly growing data. Some work has been done to address this; notably, Selection-Via-Proxy (SVP) proposed the use of a separate, smaller "proxy" model for acquisition. We explore further optimizations to the standard DAL setup and propose CLActive: an optimization procedure that brings significant speedups, maintaining a constant training time for the selection model across rounds and retaining information from past rounds using Experience Replay. We demonstrate large improvements in total train-time compared to the fully-trained baselines and SVP. We achieve up to 89×, 7× and 61× speedups over the fully-trained baseline at 50% of dataset collection on the CIFAR, ImageNet and Amazon Review datasets, respectively, with little accuracy loss. We also show that CLActive is robust against catastrophic forgetting in a challenging class-incremental active-learning setting. Overall, we believe that CLActive can effectively enable rapid prototyping and deployment of deep AL algorithms in real-world use cases across a variety of settings.
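A minimal sketch of the replay idea, assuming a fixed-capacity buffer maintained by reservoir sampling (our illustration; the paper's buffer policy may differ):

```python
import random

class ReplayBuffer:
    """Fixed-size memory of past-round samples, maintained with
    reservoir sampling so every sample seen so far has equal
    probability of being retained."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = []
        self.seen = 0

    def add(self, item) -> None:
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
        else:
            # Replace a stored item with probability capacity / seen.
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = item

buffer = ReplayBuffer(capacity=3)
for sample_id in range(10):
    buffer.add(sample_id)
```

Rehearsing on such a buffer is what lets the selection model avoid full retraining each round while still seeing samples from earlier rounds.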
@inproceedings{bib_Syst_2022, AUTHOR = {RAJAT AGGARWAL and Anoop Namboodiri}, TITLE = {System and method for capturing omni-stereo videos using multi-sensors}, BOOKTITLE = {United States Patent}, YEAR = {2022}}
A method of calibrating cameras used to collect images to form an omni-stereo image is disclosed. The method may comprise determining intrinsic and extrinsic camera parameters for each of a plurality of left eye cameras and right eye cameras arranged along a viewing circle or ellipse and angled tangentially with respect to the viewing circle or ellipse; categorizing left-right pairs of the plurality of left eye cameras and the plurality of right eye cameras into at least a first category, a second category or a third category; aligning the left-right pairs of cameras that fall into the first category; aligning the left-right pairs of cameras that fall into the second category; and aligning the left-right pairs of cameras that fall into the third category by using extrinsic parameters of the left-right pairs that fall into the first category, and of the left-right pairs that fall into the second category.
Supervision by Landmarks: An Enhanced Facial De-occlusion Network for VR-based Applications
Surabhi Gupta, Jinka Sai Sagar, Avinash Sharma, Anoop Namboodiri
European Conference on Computer Vision Workshops, ECCV-W, 2022
@inproceedings{bib_Supe_2022, AUTHOR = {Surabhi Gupta and Jinka Sai Sagar and Avinash Sharma and Anoop Namboodiri}, TITLE = {Supervision by Landmarks: An Enhanced Facial De-occlusion Network for VR-based Applications}, BOOKTITLE = {European Conference on Computer Vision Workshops}, YEAR = {2022}}
The face possesses a rich spatial structure that can provide valuable cues to guide various face-related tasks. The eyes are considered an important sociovisual cue for effective communication and are an integral feature of facial expressions and interpersonal communication. However, virtual reality headsets occlude a significant portion of the face and restrict the visibility of certain facial features, particularly the eye region. Reproducing this region with realistic content and handling complex eye movements such as blinks is challenging. Previous facial inpainting methods are not capable of capturing subtle eye movements. In view of this, we propose a working solution to refine the reconstructions, particularly around the eye region, by leveraging the inherent eye structure. We introduce spatial supervision and a novel landmark predictor module to regularize per-frame reconstructions obtained from an existing image-based facial de-occlusion network. Experiments verify the usefulness of our approach in enhancing the quality of reconstructions to capture subtle eye movements.
One-Shot Sensor and Material Translator: A Bilinear Decomposer for Fingerprint Presentation Attack Generalization
Gowri Lekshmy, Anoop Namboodiri
International Joint Conference on Biometrics, IJCBio, 2022
@inproceedings{bib_One-_2022, AUTHOR = {Gowri Lekshmy and Anoop Namboodiri}, TITLE = {One-Shot Sensor and Material Translator: A Bilinear Decomposer for Fingerprint Presentation Attack Generalization}, BOOKTITLE = {International Joint Conference on Biometrics}, YEAR = {2022}}
Automatic fingerprint recognition systems are currently under constant threat from presentation attacks (PAs). Existing fingerprint presentation attack detection (FPAD) solutions improve cross-sensor and cross-material generalization by utilizing style-transfer-based augmentation wrappers over a two-class PAD classifier. These solutions synthesize data by learning the style as a single entity containing both sensor and material characteristics. However, these strategies necessitate learning the entire style upon adding a new sensor for an already known material, or vice versa. We propose a bilinear decomposition-based wrapper called OSMT to improve cross-sensor and cross-material FPAD. OSMT uses one PA fingerprint to learn the corresponding sensor and material representations by disentanglement. Our approach also reduces computational complexity by generating compact representations and utilizing fewer combinations of sensors and materials to produce several styles. We present the improvement in PAD performance using our technique on the publicly available LivDet datasets (2015, 2017, 2019 and 2021).
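The bilinear intuition can be illustrated with a rank-1 factorization: a style matrix splits into a sensor factor and a material factor whose outer product reconstructs it (a numpy sketch of the idea, not OSMT's learned disentanglement):

```python
import numpy as np

def decompose_style(style: np.ndarray):
    """Split a (rank-1) style matrix into sensor and material vectors
    such that style ~= outer(sensor, material), via the top SVD pair."""
    u, s, vt = np.linalg.svd(style, full_matrices=False)
    sensor = u[:, 0] * np.sqrt(s[0])
    material = vt[0] * np.sqrt(s[0])
    return sensor, material

# Compose a style from known factors, then recover them.
sensor = np.array([1.0, 2.0, 0.5])
material = np.array([3.0, 0.5])
style = np.outer(sensor, material)
sensor_hat, material_hat = decompose_style(style)
recon = np.outer(sensor_hat, material_hat)
```

Mixing a recovered sensor factor with a different material factor then yields a new style without relearning either, which is the generalization benefit the wrapper targets.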
Split and Knit: 3D Fingerprint Capture with a Single Camera
Apoorva Srivastava, Anoop Namboodiri
Indian Conference on Computer Vision, Graphics and Image Processing, ICVGIP, 2022
@inproceedings{bib_Spli_2022, AUTHOR = {Apoorva Srivastava and Anoop Namboodiri}, TITLE = {Split and Knit: 3D Fingerprint Capture with a Single Camera}, BOOKTITLE = {Indian Conference on Computer Vision, Graphics and Image Processing}, YEAR = {2022}}
3D fingerprint capture is less sensitive to skin moisture levels and avoids the skin deformation common in contact-based sensors, in addition to capturing depth information. Unfortunately, its adoption is limited due to high cost and system complexity. Photometric stereo provides an opportunity to build low-cost, simple sensors capable of high-quality 3D capture. However, it assumes that the surface being imaged is Lambertian (unlike our fingers). We introduce the Split and Knit algorithm (SnK), a 3D reconstruction pipeline based on photometric stereo for finger surfaces. It introduces an efficient way of estimating the direct illumination component, thus allowing a higher-quality reconstruction of the entire finger surface. The algorithm also introduces a novel method to obtain the overall finger shape under NIR illumination, all using a single camera. Finally, we combine the overall finger shape and the ridge-valley point cloud to obtain a 3D finger phalange. The high-quality 3D reconstruction also results in better matching accuracy of the captured fingerprints.
A Unified Model for Fingerprint Authentication and Presentation Attack Detection
Additya Popli, Saraansh Tandon, Joshua J. Engelsma, Naoyuki Onoe, Atsushi Okubo, Anoop Namboodiri
International Joint Conference on Biometrics, IJCBio, 2021
@inproceedings{bib_A_Un_2021, AUTHOR = {Additya Popli and Saraansh Tandon and Joshua J. Engelsma and Naoyuki Onoe and Atsushi Okubo and Anoop Namboodiri}, TITLE = {A Unified Model for Fingerprint Authentication and Presentation Attack Detection}, BOOKTITLE = {International Joint Conference on Biometrics}, YEAR = {2021}}
Typical fingerprint recognition systems comprise a spoof detection module and a subsequent recognition module, running one after the other. In this paper, we reformulate the workings of a typical fingerprint recognition system. In particular, we posit that spoof detection and fingerprint recognition are correlated tasks. Therefore, rather than performing the two tasks separately, we propose a joint model for spoof detection and matching that performs both tasks simultaneously without compromising the accuracy of either. We demonstrate the capability of our joint model to obtain an authentication accuracy (1:1 matching) of TAR = 100% @ FAR = 0.1% on the FVC 2006 DB2A dataset while achieving a spoof detection ACE of 1.44% on the LivDet 2015 dataset, in both cases maintaining the performance of stand-alone methods. In practice, this reduces the time and memory requirements of the fingerprint recognition system by 50% and 40%, respectively; a significant advantage for recognition systems running on resource-constrained devices and communication channels.
CineFilter: Unsupervised Filtering for Real Time Autonomous Camera Systems
ACHARY SUDHEER, Kommu Lakshmi Bhanu Moorthy, Ashar Javed, P Nikitha Shravan, Vineet Gandhi, Anoop Namboodiri
Eurographics Workshop on Intelligent Cinematography and Editing, WICED, 2020
@inproceedings{bib_Cine_2020, AUTHOR = {ACHARY SUDHEER and Kommu Lakshmi Bhanu Moorthy and Ashar Javed and P Nikitha Shravan and Vineet Gandhi and Anoop Namboodiri}, TITLE = {CineFilter: Unsupervised Filtering for Real Time Autonomous Camera Systems}, BOOKTITLE = {Eurographics Workshop on Intelligent Cinematography and Editing}, YEAR = {2020}}
Autonomous camera systems are often subjected to an optimization/filtering operation to smooth and stabilize rough trajectory estimates. Most common filtering techniques do reduce the irregularities in data; however, they fail to mimic the behavior of a human cameraman. Global filtering methods modeling human camera operators have been successful; however, they are limited to offline settings. In this paper, we propose two online filtering methods, called CineFilters, which produce smooth camera trajectories motivated by cinematographic principles. The first filter (CineConvex) uses a sliding-window-based convex optimization formulation, and the second (CineCNN) is a CNN-based encoder-decoder model. We evaluate the proposed filters in two different settings, namely a basketball dataset and a stage performance dataset. Our models outperform previous methods and baselines on both quantitative and qualitative metrics. The CineConvex and CineCNN filters operate at about 250fps and 1000fps, respectively, with a minor latency (half a second), making them apt for a variety of real-time applications.
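The flavor of the sliding-window convex formulation can be sketched on one window: fit the noisy trajectory while penalising second differences (acceleration), solved in closed form (a simplified illustration; the paper's actual objective and constraints differ):

```python
import numpy as np

def smooth_window(y: np.ndarray, lam: float = 10.0) -> np.ndarray:
    """Minimise ||x - y||^2 + lam * ||D2 x||^2 over one window, where
    D2 is the second-difference operator; the normal equations give x."""
    n = len(y)
    d2 = np.diff(np.eye(n), n=2, axis=0)          # (n-2, n) operator
    return np.linalg.solve(np.eye(n) + lam * d2.T @ d2, y)

noisy = np.array([0.0, 1.2, 1.9, 3.3, 3.8, 5.1])  # jittery pan trajectory
smoothed = smooth_window(noisy)
```

Sliding this window forward and keeping only the latest sample is what makes the filter online with a fixed latency.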
Region Pooling with Adaptive Feature Fusion for End-to-End Person Recognition
VIJAYA KUMAR R, Anoop Namboodiri, Jawahar C V
Winter Conference on Applications of Computer Vision, WACV, 2020
@inproceedings{bib_Regi_2020, AUTHOR = {VIJAYA KUMAR R and Anoop Namboodiri and Jawahar C V}, TITLE = {Region Pooling with Adaptive Feature Fusion for End-to-End Person Recognition}, BOOKTITLE = {Winter Conference on Applications of Computer Vision}, YEAR = {2020}}
Current approaches for person recognition train an ensemble of region-specific convolutional neural networks for representation learning, and then adopt naive fusion strategies to combine their features or predictions during testing. In this paper, we propose a unified end-to-end architecture that generates a complete person representation based on pooling and aggregation of features from multiple body regions. Our network takes a person image and the predetermined locations of body regions as input, and generates common feature maps that are shared across all the regions. Multiple features corresponding to different regions are then pooled and combined with an aggregation block, where the adaptive weights required for aggregation are obtained through an attention mechanism. Evaluations on three person recognition datasets (PIPA, Soccer and Hannah) show that a single model trained end-to-end is computationally faster, requires fewer parameters and achieves improved performance over separately trained models.
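At its core, the aggregation block reduces to a softmax-weighted sum over region features; a minimal numpy sketch (shapes and scores illustrative, the real weights come from a learned attention mechanism):

```python
import numpy as np

def aggregate_regions(features: np.ndarray, scores: np.ndarray) -> np.ndarray:
    """Fuse per-region feature vectors with adaptive weights obtained
    by a softmax over per-region attention scores."""
    w = np.exp(scores - scores.max())
    w /= w.sum()                         # weights over regions, sum to 1
    return (w[:, None] * features).sum(axis=0)

region_feats = np.array([[1.0, 0.0],     # head
                         [0.0, 1.0],     # upper body
                         [1.0, 1.0]])    # full body
scores = np.array([0.1, 0.1, 2.0])       # full body dominates here
person_repr = aggregate_regions(region_feats, scores)
```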
Understanding Dynamic Scenes using Graph Convolution Networks
Mylavarapu Venkata Sai Sravan, MAHTAB SANDHU, K Madhava Krishna, Balaraman Ravindran, Anoop Namboodiri
International Conference on Intelligent Robots and Systems, IROS, 2020
@inproceedings{bib_Unde_2020, AUTHOR = {Mylavarapu Venkata Sai Sravan and MAHTAB SANDHU and K Madhava Krishna and Balaraman Ravindran and Anoop Namboodiri}, TITLE = {Understanding Dynamic Scenes using Graph Convolution Networks}, BOOKTITLE = {International Conference on Intelligent Robots and Systems}, YEAR = {2020}}
We present a novel Multi-Relational Graph Convolutional Network (MRGCN) to model on-road vehicle behaviours from a sequence of temporally ordered frames grabbed by a moving monocular camera. The input to the MRGCN is a Multi-Relational Graph (MRG), where the nodes of the graph represent the active and passive participants/agents in the scene, while the bidirectional edges that connect every pair of nodes are encodings of their spatio-temporal relations. The bidirectional edges of the graph encode the temporal interactions between the agents that constitute the two nodes of the edge. The proposed method of obtaining this encoding is shown to be specifically suited for the problem at hand, as it outperforms more complex end-to-end learning methods that do not use such intermediate representations of evolved spatio-temporal relations between agent pairs. We show significant performance gains in the form of behaviour classification accuracy on a variety of datasets from different parts of the globe over prior methods, as well as seamless transfer without any resort to fine-tuning across multiple datasets. Such behaviour prediction methods find immediate relevance in a variety of navigation tasks such as behaviour planning and state estimation, as well as in applications relating to the detection of traffic violations in videos.
Towards Accurate Vehicle Behaviour Classification With Multi-Relational Graph Convolutional Networks
Mylavarapu Venkata Sai Sravan, MAHTAB SANDHU, Priyesh Vijayan, K Madhava Krishna, Balaraman Ravindran, Anoop Namboodiri
Intelligent Vehicles Symposium, IV, 2020
@inproceedings{bib_Towa_2020, AUTHOR = {Mylavarapu Venkata Sai Sravan and MAHTAB SANDHU and Priyesh Vijayan and K Madhava Krishna and Balaraman Ravindran and Anoop Namboodiri}, TITLE = {Towards Accurate Vehicle Behaviour Classification With Multi-Relational Graph Convolutional Networks}, BOOKTITLE = {Intelligent Vehicles Symposium}, YEAR = {2020}}
Understanding on-road vehicle behaviour from a temporal sequence of sensor data is gaining in popularity. In this paper, we propose a pipeline for understanding vehicle behaviour from a monocular image sequence or video. A monocular sequence, along with scene semantics, optical flow and object labels, is used to get spatial information about the object (vehicle) of interest and other objects (semantically contiguous sets of locations) in the scene. This spatial information is encoded by a Multi-Relational Graph Convolutional Network (MR-GCN), and a temporal sequence of such encodings is fed to a recurrent network to label vehicle behaviours. The proposed framework can classify a variety of vehicle behaviours to high fidelity on datasets that are diverse and include European, Chinese and Indian on-road scenes. The framework also provides for seamless transfer of models across datasets without entailing re-annotation, retraining or even fine-tuning. We show comparative performance gains over baseline spatio-temporal classifiers and detail a variety of ablations to showcase the efficacy of the framework.
Mutual Information Based Method for Unsupervised Disentanglement of Video Representation
P ADITYA SREEKAR, Ujjwal Tiwari, Anoop Namboodiri
International Conference on Pattern Recognition, ICPR, 2020
@inproceedings{bib_Mutu_2020, AUTHOR = {P ADITYA SREEKAR and Ujjwal Tiwari and Anoop Namboodiri}, TITLE = {Mutual Information Based Method for Unsupervised Disentanglement of Video Representation}, BOOKTITLE = {International Conference on Pattern Recognition}, YEAR = {2020}}
Video prediction is an interesting and challenging task of predicting future frames from a given set of context frames that belong to a video sequence. Video prediction models have found prospective applications in maneuver planning, healthcare, autonomous navigation and simulation. One of the major challenges in future frame generation is the high dimensional nature of visual data. In this work, we propose the Mutual Information Predictive Auto-Encoder (MIPAE) framework, which reduces the task of predicting high dimensional video frames by factorising video representations into content and low dimensional pose latent variables that are easy to predict. A standard LSTM network is used to predict these low dimensional pose representations. Content and the predicted pose representations are decoded to generate future frames. Our approach leverages the temporal structure of the latent generative factors of a video and a novel mutual information loss to learn disentangled video representations. We also propose a metric based on the mutual information gap (MIG) to quantitatively assess the effectiveness of disentanglement on the DSprites and MPI3D-real datasets. MIG scores corroborate the visual superiority of frames predicted by MIPAE. We also compare our method quantitatively using the evaluation metrics LPIPS, SSIM and PSNR.
Reducing the Variance of Variational Estimates of Mutual Information by Limiting the Critic's Hypothesis Space to RKHS
P ADITYA SREEKAR, Ujjwal Tiwari, Anoop Namboodiri
International Conference on Pattern Recognition, ICPR, 2020
@inproceedings{bib_Redu_2020, AUTHOR = {P ADITYA SREEKAR and Ujjwal Tiwari and Anoop Namboodiri}, TITLE = {Reducing the Variance of Variational Estimates of Mutual Information by Limiting the Critic's Hypothesis Space to RKHS}, BOOKTITLE = {International Conference on Pattern Recognition}, YEAR = {2020}}
Mutual information (MI) is an information-theoretic measure of dependency between two random variables. Several methods to estimate MI from samples of two random variables with unknown underlying probability distributions have been proposed in the literature. Recent methods realize parametric probability distributions, or a critic, as a neural network to approximate unknown density ratios. The approximated density ratios are used to estimate different variational lower bounds of MI. While these methods provide reliable estimation when the true MI is low, they produce high variance estimates in cases of high MI. We argue that the high variance characteristic is due to the uncontrolled complexity of the critic's hypothesis space. In support of this argument, we use the data-driven Rademacher complexity of the hypothesis space associated with the critic's architecture to analyse the generalization error bounds of variational lower bound estimates of MI. In the proposed work, we show that it is possible to negate the high variance characteristics of these estimators by constraining the critic's hypothesis space to a Reproducing Kernel Hilbert Space (RKHS), which corresponds to a kernel learned using Automated Spectral Kernel Learning (ASKL). By analysing the aforementioned generalization error bounds, we augment the overall optimisation objective with an effective regularisation term. We empirically demonstrate the efficacy of this regularization in enforcing a proper bias-variance tradeoff on four variational lower bounds, namely NWJ, MINE, JS and SMILE.
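To make the setup concrete, the Donsker-Varadhan lower bound underlying estimators like MINE is I(X;Y) >= E_p(x,y)[f] - log E_p(x)p(y)[e^f] for any critic f. A numpy sketch with a hand-picked quadratic critic standing in for the neural one (illustrative only, not the paper's estimator):

```python
import numpy as np

def dv_lower_bound(joint, marginal, critic):
    """Donsker-Varadhan bound: mean critic value on joint samples minus
    the log mean-exp of critic values on product-of-marginals samples."""
    t_joint = np.array([critic(x, y) for x, y in joint])
    t_marg = np.array([critic(x, y) for x, y in marginal])
    return t_joint.mean() - np.log(np.exp(t_marg).mean())

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = x + 0.1 * rng.normal(size=1000)               # strongly dependent pair
joint = list(zip(x, y))
marginal = list(zip(x, rng.permutation(y)))       # shuffling breaks the pairing
estimate = dv_lower_bound(joint, marginal, lambda a, b: 0.5 * a * b)
```

The variance issue the paper targets shows up when the critic is made arbitrarily flexible; constraining it (here trivially, to one fixed function; in the paper, to an RKHS) tames the log-mean-exp term.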
Region Pooling with Adaptive Feature Fusion for End-to-End Person Recognition
Guntireddy Vijay Kumar, Anoop Namboodiri, Jawahar C V
Winter Conference on Applications of Computer Vision, WACV, 2020
@inproceedings{bib_Regi_2020, AUTHOR = {Guntireddy Vijay Kumar and Anoop Namboodiri and Jawahar C V}, TITLE = {Region Pooling with Adaptive Feature Fusion for End-to-End Person Recognition}, BOOKTITLE = {Winter Conference on Applications of Computer Vision}, YEAR = {2020}}
Current approaches for person recognition train an ensemble of region-specific convolutional neural networks for representation learning, and then adopt naive fusion strategies to combine their features or predictions during testing. In this paper, we propose a unified end-to-end architecture that generates a complete person representation based on pooling and aggregation of features from multiple body regions. Our network takes a person image and the predetermined locations of body regions as input, and generates common feature maps that are shared across all the regions. Multiple features corresponding to different regions are then pooled and combined with an aggregation block, where the adaptive weights required for aggregation are obtained through an attention mechanism. Evaluations on three person recognition datasets - PIPA, Soccer and Hannah - show that a single model trained end-to-end is computationally faster, requires fewer parameters and achieves improved performance over separately trained models.
Universal Material Translator: Towards Spoof Fingerprint Generalization
ROHIT KUMAR GAJAWADA, Additya Popli, Tarang Chugh, Anoop Namboodiri, Anil K. Jain
International conference on Biometrics, IJCB, 2019
@inproceedings{bib_Univ_2019, AUTHOR = {ROHIT KUMAR GAJAWADA and Additya Popli and Tarang Chugh and Anoop Namboodiri and Anil K. Jain}, TITLE = {Universal Material Translator: Towards Spoof Fingerprint Generalization}, BOOKTITLE = {International conference on Biometrics}, YEAR = {2019}}
Spoof detectors are classifiers trained to distinguish spoof fingerprints from bonafide ones. However, state-of-the-art spoof detectors do not generalize well to unseen spoof materials. This study proposes a style-transfer-based augmentation wrapper that can be used with any existing spoof detector and can dynamically improve the robustness of the spoof detection system on spoof materials for which very little data is available. Our approach synthesizes new spoof images from a few spoof examples by transferring the style, or material properties, of the spoof examples to the content of bonafide fingerprints, generating a larger number of examples to train the classifier on. We demonstrate the effectiveness of our approach on materials in the publicly available LivDet 2015 dataset and show that it leads to robustness to fingerprint spoofs of the target material.
IDD: A dataset for exploring problems of autonomous navigation in unconstrained environments
Girish Varma, Anbumani Subramanian, Manmohan Chandraker, Anoop Namboodiri, Jawahar C V
Winter Conference on Applications of Computer Vision, WACV, 2019
@inproceedings{bib_IDD:_2019, AUTHOR = {Girish Varma and Anbumani Subramanian and Manmohan Chandraker and Anoop Namboodiri and Jawahar C V}, TITLE = {IDD: A dataset for exploring problems of autonomous navigation in unconstrained environments}, BOOKTITLE = {Winter Conference on Applications of Computer Vision}, YEAR = {2019}}
While several datasets for autonomous navigation have become available in recent years, they have tended to focus on structured driving environments. This usually corresponds to well-delineated infrastructure such as lanes, a small number of well-defined categories for traffic participants, low variation in object or background appearance and strong adherence to traffic rules. We propose IDD, a novel dataset for road scene understanding in unstructured environments where the above assumptions are largely not satisfied. It consists of 10,004 images, finely annotated with 34 classes, collected from 182 drive sequences on Indian roads. The label set is expanded in comparison to popular benchmarks such as Cityscapes to account for new classes. It also reflects label distributions of road scenes significantly different from existing datasets, with most classes displaying greater within-class diversity. Consistent with …
Cross-modal style transfer
SAHIL CHELARAMANI, Abhishek Jha, Anoop Namboodiri
International Conference on Image Processing, ICIP, 2018
@inproceedings{bib_Cros_2018, AUTHOR = {SAHIL CHELARAMANI and Abhishek Jha and Anoop Namboodiri}, TITLE = {Cross-modal style transfer}, BOOKTITLE = {International Conference on Image Processing}, YEAR = {2018}}
We, humans, have the ability to easily imagine scenes that depict sentences such as “Today is a beautiful sunny day” or “There is a Christmas feel, in the air”. While it is hard to precisely describe what one person may imagine, the essential high-level themes associated with such sentences largely remain the same. The ability to synthesize novel images that depict the feel of a sentence is very useful in a variety of applications such as education, advertisement, and entertainment. While existing papers tackle this problem given a style image, we aim to provide a far more intuitive and easy-to-use solution that synthesizes novel renditions of an existing image, conditioned on a given sentence. We present a method for cross-modal style transfer between an English sentence and an image, to produce a new image that imbibes the essential theme of the sentence. We do this by modifying the style transfer mechanism
Hybrid binary networks: optimizing for accuracy, efficiency and memory
PRABHU AMEYA PANDURANG, BATCHU VENKAT VISHAL, ROHIT KUMAR GAJAWADA, SRI AUROBINDO MUNAGALA, Anoop Namboodiri
Winter Conference on Applications of Computer Vision, WACV, 2018
@inproceedings{bib_Hybr_2018, AUTHOR = {PRABHU AMEYA PANDURANG and BATCHU VENKAT VISHAL and ROHIT KUMAR GAJAWADA and SRI AUROBINDO MUNAGALA and Anoop Namboodiri}, TITLE = {Hybrid binary networks: optimizing for accuracy, efficiency and memory}, BOOKTITLE = {Winter Conference on Applications of Computer Vision}, YEAR = {2018}}
Binarization is an extreme network compression approach that provides large computational speedups along with energy and memory savings, albeit at significant accuracy costs. We investigate the question of where to binarize inputs at layer-level granularity and show that selectively binarizing the inputs to specific layers in the network could lead to significant improvements in accuracy while preserving most of the advantages of binarization. We analyze the binarization tradeoff using a metric that jointly models the input binarization-error and computational cost and introduce an efficient algorithm to select layers whose inputs are to be binarized. Practical guidelines based on insights obtained from applying the algorithm to a variety of models are discussed. Experiments on the ImageNet dataset using AlexNet and ResNet-18 models show 3-4% improvements in accuracy over fully binarized networks with minimal
Distribution-aware binarization of neural networks for sketch recognition
PRABHU AMEYA PANDURANG, BATCHU VENKAT VISHAL, SRI AUROBINDO MUNAGALA, ROHIT KUMAR GAJAWADA, Anoop Namboodiri
Winter Conference on Applications of Computer Vision, WACV, 2018
@inproceedings{bib_Dist_2018, AUTHOR = {PRABHU AMEYA PANDURANG and BATCHU VENKAT VISHAL and SRI AUROBINDO MUNAGALA and ROHIT KUMAR GAJAWADA and Anoop Namboodiri}, TITLE = {Distribution-aware binarization of neural networks for sketch recognition}, BOOKTITLE = {Winter Conference on Applications of Computer Vision}, YEAR = {2018}}
Deep neural networks are highly effective at a range of computational tasks. However, they tend to be computationally expensive, especially in vision-related problems, and also have large memory requirements. One of the most effective methods to achieve significant improvements in computational/spatial efficiency is to binarize the weights and activations in a network. However, naive binarization results in accuracy drops when applied to networks for most tasks. In this work, we present a highly generalized, distribution-aware approach to binarizing deep networks that allows us to retain the advantages of a binarized network, while reducing accuracy drops. We also develop efficient implementations for our proposed approach across different architectures. We present a theoretical analysis of the technique to show the effective representational power of the resulting layers, and explore the forms of data they …
Deep expander networks: Efficient deep networks from graph theory
PRABHU AMEYA PANDURANG, Girish Varma, Anoop Namboodiri
European Conference on Computer Vision, ECCV, 2018
@inproceedings{bib_Deep_2018, AUTHOR = {PRABHU AMEYA PANDURANG and Girish Varma and Anoop Namboodiri}, TITLE = {Deep expander networks: Efficient deep networks from graph theory}, BOOKTITLE = {European Conference on Computer Vision}, YEAR = {2018}}
Efficient CNN designs like ResNets and DenseNets were proposed to improve accuracy vs efficiency trade-offs. They essentially increased the connectivity, allowing efficient information flow across layers. Inspired by these techniques, we propose to model connections between filters of a CNN using graphs which are simultaneously sparse and well connected. Sparsity results in efficiency, while well-connectedness can preserve the expressive power of the CNNs. We use a well-studied class of graphs from theoretical computer science, known as expander graphs, that satisfies these properties. Expander graphs are used to model connections between filters in CNNs to design networks called X-Nets. We present two guarantees on the connectivity of X-Nets: each node influences every node in a layer in logarithmic steps, and the number of paths between two sets of nodes is proportional to the product of their sizes. We also propose efficient training and inference algorithms, making it possible to train deeper and wider X-Nets effectively. Expander-based models give a 4% improvement in accuracy on MobileNet over grouped convolutions, a popular technique with the same sparsity but worse connectivity. X-Nets give better performance trade-offs than the original ResNet and DenseNet-BC architectures. We achieve model sizes comparable to state-of-the-art pruning techniques using our simple architecture design, without any pruning. We hope that this work motivates other approaches that utilize results from graph theory to develop efficient network architectures.
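The connectivity idea can be sketched with a toy mask. This is an illustration under the assumption that a random d-regular bipartite graph, which is an expander with high probability, stands in for whatever explicit construction the paper uses:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_bipartite_mask(n_out, n_in, d):
    """d-regular bipartite connectivity mask between two filter layers
    (random regular bipartite graphs are expanders with high probability)."""
    mask = np.zeros((n_out, n_in), dtype=bool)
    for o in range(n_out):
        mask[o, rng.choice(n_in, size=d, replace=False)] = True
    return mask

n, d = 64, 8
mask = random_bipartite_mask(n, n, d)
print("density vs dense layer:", mask.mean())   # 8/64 = 0.125

# "each node influences every node in logarithmic steps": stack two such
# layers and count how many inputs already reach output channel 0
two_hop = (mask.astype(int) @ mask.astype(int)) > 0
print("inputs reaching output 0 after 2 layers:", two_hop[0].sum())
```

With only 12.5% of the dense connections, each output's receptive set grows multiplicatively per layer, which is the sparsity-with-connectivity trade the paper exploits.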
Semi-supervised annotation of faces in image collection
VIJAYA KUMAR R, Anoop Namboodiri, Jawahar C V
Signal, Image and Video Processing, SIViP, 2018
@inproceedings{bib_Semi_2018, AUTHOR = {VIJAYA KUMAR R and Anoop Namboodiri and Jawahar C V}, TITLE = {Semi-supervised annotation of faces in image collection}, BOOKTITLE = {Signal, Image and Video Processing}, YEAR = {2018}}
Semi-Adversarial Networks: Convolutional Autoencoders for Imparting Privacy to Face Images
Vahid Mirjalili, Sebastian Raschka, Anoop Namboodiri, Arun Ross
International conference on Biometrics, IJCB, 2018
@inproceedings{bib_Semi_2018, AUTHOR = {Vahid Mirjalili and Sebastian Raschka and Anoop Namboodiri and Arun Ross}, TITLE = {Semi-Adversarial Networks: Convolutional Autoencoders for Imparting Privacy to Face Images}, BOOKTITLE = {International conference on Biometrics}, YEAR = {2018}}
In this paper, we design and evaluate a convolutional autoencoder that perturbs an input face image to impart privacy to a subject. Specifically, the proposed autoencoder transforms an input face image such that the transformed image can be successfully used for face recognition but not for gender classification. In order to train this autoencoder, we propose a novel training scheme, referred to as semi-adversarial training in this work. The training is facilitated by attaching a semi-adversarial module consisting of an auxiliary gender classifier and an auxiliary face matcher to the autoencoder. The objective function utilized for training this network has three terms: one to ensure that the perturbed image is a realistic face image; another to ensure that the gender attributes of the face are confounded; and a third to ensure that biometric recognition performance due to the perturbed image is not impacted. Extensive experiments confirm the efficacy of the proposed architecture in extending gender privacy to face images.
Pose-aware person recognition
VIJAYA KUMAR R, Anoop Namboodiri, Manohar Paluri, Jawahar C V
Computer Vision and Pattern Recognition, CVPR, 2017
@inproceedings{bib_Pose_2017, AUTHOR = {VIJAYA KUMAR R and Anoop Namboodiri and Manohar Paluri and Jawahar C V}, TITLE = {Pose-aware person recognition}, BOOKTITLE = {Computer Vision and Pattern Recognition}, YEAR = {2017}}
Person recognition methods that use multiple body regions have shown significant improvements over traditional face-based recognition. One of the primary challenges in full-body person recognition is the extreme variation in pose and viewpoint. In this work, (i) we present an approach that tackles pose variations utilizing multiple models that are trained on specific poses and combined using pose-aware weights during testing; (ii) for learning a person representation, we propose a network that jointly optimizes a single loss over multiple body regions; (iii) finally, we introduce new benchmarks to evaluate person recognition in diverse scenarios and show significant improvements over previously proposed approaches on all the benchmarks, including the photo album setting of PIPA.
Words speak for Actions: Using Text to find Video Highlights
KUDI SUKANYA HANMANTRAO, Anoop Namboodiri
Asian Conference on Pattern Recognition, ACPR, 2017
@inproceedings{bib_Word_2017, AUTHOR = {KUDI SUKANYA HANMANTRAO and Anoop Namboodiri}, TITLE = {Words speak for Actions: Using Text to find Video Highlights}, BOOKTITLE = {Asian Conference on Pattern Recognition}, YEAR = {2017}}
Video highlights are a selection of the most interesting parts of a video. The problem of highlight detection has been explored for video domains like egocentric, sports, movies, and surveillance videos. Existing methods are limited to finding visually important parts of the video but do not necessarily learn semantics. Moreover, the available benchmark datasets contain audio-muted, single-activity, short videos, which lack any context apart from a few keyframes that can be used to understand them. In this work, we explore highlight detection in the TV series domain, which features complex interactions with the surroundings. The existing methods would fare poorly in capturing the video semantics in such videos. To incorporate the importance of dialogues/audio, we propose using the descriptions of shots of the video as cues to learning visual importance. Note that while the audio information is used to determine visual importance during training, the highlight detection still works using only the visual information from videos. We use publicly available text ranking algorithms to rank the descriptions. The ranking scores are used to train a visual pairwise shot ranking model (VPSR) to find the highlights of the video. The results are reported on TV series videos of the Video Set dataset and a season of the Buffy the Vampire Slayer TV series.
Panoramic Stereo Videos with a Single Camera
RAJAT AGGARWAL, AMRISHA VOHRA, Anoop Namboodiri
Computer Vision and Pattern Recognition, CVPR, 2016
@inproceedings{bib_Pano_2016, AUTHOR = {RAJAT AGGARWAL and AMRISHA VOHRA and Anoop Namboodiri}, TITLE = {Panoramic Stereo Videos with a Single Camera}, BOOKTITLE = {Computer Vision and Pattern Recognition}, YEAR = {2016}}
We present a practical solution for generating 360° stereo panoramic videos using a single camera. Current approaches either use a moving camera that captures multiple images of a scene, which are then stitched together to form the final panorama, or use multiple cameras that are synchronized. A moving camera limits the solution to static scenes, while multi-camera solutions require dedicated calibrated setups. Our approach improves upon the existing solutions in two significant ways: It solves the problem using a single camera, thus minimizing the calibration problem and providing us the ability to convert any digital camera into a panoramic stereo capture device. It captures all the light rays required for stereo panoramas in a single frame using a compact custom designed mirror, thus making the design practical to manufacture and easier to use. We analyze several properties of the design as well as present panoramic stereo and depth estimation results.
Semantic Classification of Boundaries of an RGBD Image.
Anoop Namboodiri, Jawahar C V, Srikumar Ramalingam
British Machine Vision Conference, BMVC, 2015
@inproceedings{bib_Sema_2015, AUTHOR = {Anoop Namboodiri and Jawahar C V and Srikumar Ramalingam}, TITLE = {Semantic Classification of Boundaries of an RGBD Image.}, BOOKTITLE = {British Machine Vision Conference}, YEAR = {2015}}
The problem of labeling the edges present in a single color image as convex, concave, and occluding entities is one of the fundamental problems in computer vision. It has been shown that this information can contribute to segmentation, reconstruction and recognition problems. Recently, it has been shown that this classification is not straightforward even using RGBD data. This makes us wonder whether this apparently simple cue has more information than a depth map. In this paper, we propose a novel algorithm using a random forest for classifying edges into convex, concave and occluding entities. We release a data set with more than 500 RGBD images with pixel-wise ground-truth labels. Our method produces promising results and achieves an F-score of 0.84 on the data set.
Visual phrases for exemplar face detection
N VIJAY KUMAR, Anoop Namboodiri, Jawahar C V
International Conference on Computer Vision, ICCV, 2015
@inproceedings{bib_Visu_2015, AUTHOR = {N VIJAY KUMAR and Anoop Namboodiri and Jawahar C V}, TITLE = {Visual phrases for exemplar face detection}, BOOKTITLE = {International Conference on Computer Vision}, YEAR = {2015}}
Recently, exemplar-based approaches have been successfully applied for face detection in the wild. Contrary to traditional approaches that model face variations from a large and diverse set of training examples, exemplar-based approaches use a collection of discriminatively trained exemplars for detection. In this paradigm, each exemplar casts a vote using a retrieval framework and generalized Hough voting, to locate the faces in the target image. The advantage of this approach is that by having a large database that covers all possible variations, faces in challenging conditions can be detected without having to learn explicit models for different variations. Current schemes, however, make an assumption of independence between the visual words, ignoring their relations in the process. They also ignore the spatial consistency of the visual words. Consequently, every exemplar word contributes equally during voting regardless of its location. In this paper, we propose a novel approach that incorporates higher order information in the voting process. We discover visual phrases that contain semantically related visual words and exploit them for detection along with the visual words. For spatial consistency, we estimate the spatial distribution of visual words and phrases from the entire database and then weigh their occurrence in exemplars. This ensures that a visual word or a phrase in an exemplar makes a major contribution only if it occurs at its semantic location, thereby suppressing the noise significantly. We perform extensive experiments on standard FDDB, AFW and G-album datasets and show significant improvement over previous exemplar …
Laplacian pyramids for deep feature inversion
ANIKET SINGH, Anoop Namboodiri
Asian Conference on Pattern Recognition, ACPR, 2015
@inproceedings{bib_Lapl_2015, AUTHOR = {ANIKET SINGH and Anoop Namboodiri}, TITLE = {Laplacian pyramids for deep feature inversion}, BOOKTITLE = {Asian Conference on Pattern Recognition}, YEAR = {2015}}
Modern feature extraction pipelines, especially the ones using deep networks, involve an increasing variety of elements. With layered approaches heaping abstraction upon abstraction, it becomes difficult to understand what it is that these features are capturing. One appealing way of solving this puzzle is feature visualization, where features are mapped back to the image domain. Our work improves the generic approach of performing gradient descent (GD) in the image space to match a given set of features to achieve a visualization. Specifically, we note that coarse features of an image, like blobs, outlines etc., are useful by themselves for classification purposes. We develop an inversion scheme based on this idea by recovering coarse features of the image before finer details. This is done by modeling the image as the composition of a Laplacian pyramid. We show that by performing GD on the pyramid in a level-wise manner, we can recover meaningful images. Results are presented for inverting a shallow network, the densely calculated SIFT, as well as a deep network, Krizhevsky et al.'s ImageNet CNN (AlexNet).
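The pyramid decomposition underlying the level-wise scheme can be sketched as follows. This is a minimal version with average-pool downsampling and nearest-neighbour upsampling as stand-ins for the usual Gaussian filtering; the actual method runs gradient descent against network features, which is omitted here:

```python
import numpy as np

def down(img):
    # 2x average-pool (a cheap stand-in for Gaussian blur + subsample)
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def up(img):
    # nearest-neighbour 2x upsample
    return img.repeat(2, axis=0).repeat(2, axis=1)

def laplacian_pyramid(img, levels):
    pyr, cur = [], img
    for _ in range(levels):
        nxt = down(cur)
        pyr.append(cur - up(nxt))   # band-pass residual at this scale
        cur = nxt
    pyr.append(cur)                 # coarsest approximation
    return pyr

def reconstruct(pyr):
    cur = pyr[-1]
    for lap in reversed(pyr[:-1]):  # coarse-to-fine, as in level-wise GD
        cur = up(cur) + lap
    return cur

rng = np.random.default_rng(0)
img = rng.random((32, 32))
pyr = laplacian_pyramid(img, 3)
print("max reconstruction error:", np.abs(reconstruct(pyr) - img).max())
```

Because each level stores the residual against the upsampled coarser level, the decomposition is lossless (up to float rounding), so optimizing levels coarse-to-fine loses nothing of the image parameterization.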
Fingerprint enhancement using unsupervised hierarchical feature learning
MIHIR SAHASRABUDHE, Anoop Namboodiri
Indian Conference on Computer Vision, Graphics and Image Processing, ICVGIP, 2014
@inproceedings{bib_Fing_2014, AUTHOR = {MIHIR SAHASRABUDHE and Anoop Namboodiri}, TITLE = {Fingerprint enhancement using unsupervised hierarchical feature learning}, BOOKTITLE = {Indian Conference on Computer Vision, Graphics and Image Processing}, YEAR = {2014}}
We present an approach for learning low- and high-level fingerprint structures in an unsupervised manner, which we use for enhancement of fingerprint images and estimation of orientation fields, frequency images, and region masks. We incorporate the use of a convolutional deep belief network to learn features from greyscale, clean fingerprint images. We also show that the reconstruction performed by the learnt network works as a suitable enhancement of the fingerprint, and hierarchical probabilistic inference is able to estimate overall fingerprint structures as well. Our approach performs better than Gabor-based enhancement and short-time Fourier transform-assisted enhancement on images it was trained on. We further use information from the learnt features in the first layer, which are short, oriented ridge structures, to extract the orientation field, frequency image, and region mask of input fingerprints.
A Sketch-Based Approach To Video Retrieval Using Qualitative Features
KOUSTAV GHOSAL, Anoop Namboodiri
Indian Conference on Computer Vision, Graphics and Image Processing, ICVGIP, 2014
@inproceedings{bib_A_Sk_2014, AUTHOR = {KOUSTAV GHOSAL and Anoop Namboodiri}, TITLE = {A Sketch-Based Approach To Video Retrieval Using Qualitative Features}, BOOKTITLE = {Indian Conference on Computer Vision, Graphics and Image Processing}, YEAR = {2014}}
Motion trajectories extracted from certain videos contain sufficient spatio-temporal information which can be effectively used to characterize those videos. But the task of framing text-based queries for such videos in content-based video retrieval systems is very complicated. Sketch based query is an efficient tool to construct motion-based queries but perceptual differences like spatial and temporal variability pose serious challenges to query modelling. In this work we propose a new method of modelling sketch based queries which attempts to extract the qualitative features of motion by minimizing the perceptual variability. We also develop a multilevel filter for indexing a query, in which the search results are refined at each stage using a cumulative scoring mechanism. Finally, we show the effectiveness of our algorithm on a dataset of real pool videos and a synthetic dataset containing simulated videos having very complex motion trajectories.
One-Time Biometric Token based Authentication
KULKARNI ROHAN PRADEEP, Anoop Namboodiri
Indian Conference on Computer Vision, Graphics and Image Processing, ICVGIP, 2014
@inproceedings{bib_One-_2014, AUTHOR = {KULKARNI ROHAN PRADEEP and Anoop Namboodiri}, TITLE = {One-Time Biometric Token based Authentication}, BOOKTITLE = {Indian Conference on Computer Vision, Graphics and Image Processing}, YEAR = {2014}}
Widely used online commerce systems require a user to submit his sole banking credentials or credit card details to avail of desired services, thus involving high risk with untrusted service providers. Commonly used one-time-password-based systems provide additional transaction security, but are still incapable of differentiating between a genuine user trying to authenticate and an adversary with stolen credentials. This brings out a strong need for biometrics-based one-time password systems. In this paper we propose a one-time biometric token based authentication protocol that works within the framework of current online transaction schemes, allowing a user to carry out a financial transaction with a service provider that completes with an authorization from the bank. The proposed protocol is based on key-binding biometric cryptosystems and upholds the requirements of secure authentication, template protection and revocability, while providing privacy to an individual's biometrics and anonymity from the service provider. We demonstrate our system's security and performance using iris biometrics to authenticate individuals.
Providing Services on Demand By User Action Modeling on Smart Phones
KUMAR VISHAL, ROMIL BANSAL, Anoop Namboodiri, Jawahar C V
International Joint Conference on Pervasive and Ubiquitous Computing, UbiComp, 2014
@inproceedings{bib_Prov_2014, AUTHOR = {KUMAR VISHAL and ROMIL BANSAL and Anoop Namboodiri and Jawahar C V}, TITLE = {Providing Services on Demand By User Action Modeling on Smart Phones}, BOOKTITLE = {International Joint Conference on Pervasive and Ubiquitous Computing}, YEAR = {2014}}
Face recognition in videos by label propagation
N VIJAY KUMAR, Anoop Namboodiri, Jawahar C V
International conference on Pattern Recognition, ICPR, 2014
@inproceedings{bib_Face_2014, AUTHOR = {N VIJAY KUMAR and Anoop Namboodiri and Jawahar C V}, TITLE = {Face recognition in videos by label propagation}, BOOKTITLE = {International conference on Pattern Recognition}, YEAR = {2014}}
We consider the problem of automatic identification of faces in videos such as movies, given a dictionary of known faces from a public or an alternate database. This has applications in video indexing, content-based search, surveillance, and real-time recognition on wearable computers. We propose a two-stage approach for this problem. First, we recognize the faces in a video using a sparse representation framework based on l1-minimization and select a few key-frames based on a robust confidence measure. We then use transductive learning to propagate the labels from the key-frames to the remaining frames by incorporating constraints simultaneously in temporal and feature spaces. This is in contrast to some of the previous approaches where every test frame/track is identified independently, ignoring the correlation between the faces in video tracks. Having a few key-frames belonging to a few subjects for label propagation, rather than a large dictionary of actors, reduces the amount of confusion. We evaluate the performance of our algorithm on the Movie Trailer Face dataset and five movie clips, and achieve a significant improvement in labeling accuracy compared to previous approaches.
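The propagation step can be illustrated with a toy transductive label spread over a frame-similarity graph. The frames, single-dimensional features and weights below are invented for illustration; the paper's formulation also adds temporal constraints and picks key-frames via sparse-representation confidences:

```python
import numpy as np

# six frames as toy 1-D feature vectors; frames 0 and 3 are labeled
# "key-frames" for subjects 0 and 1 respectively
feats = np.array([[0.0], [0.1], [0.2], [1.0], [1.1], [0.9]])
labels = -np.ones(6, dtype=int)
labels[0], labels[3] = 0, 1

# affinity in feature space (temporal links could be added the same way)
W = np.exp(-((feats - feats.T) ** 2) / 0.05)
np.fill_diagonal(W, 0)
S = W / W.sum(axis=1, keepdims=True)    # row-normalized transition matrix

Y = np.zeros((6, 2))
Y[0, 0] = Y[3, 1] = 1                   # one-hot labels on key-frames only
F = Y.copy()
for _ in range(50):                     # F <- a*S@F + (1-a)*Y, a = 0.5
    F = 0.5 * S @ F + 0.5 * Y

print(F.argmax(axis=1))                 # [0 0 0 1 1 1]
```

Labels diffuse from the two key-frames to their feature-space neighbours, so unlabeled frames 1, 2 inherit subject 0 and frames 4, 5 inherit subject 1.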
Estimating Floor Regions in Cluttered Indoor Scenes from First Person Camera View
SANCHIT AGGARWAL, Anoop Namboodiri, Jawahar C V
International conference on Pattern Recognition, ICPR, 2014
@inproceedings{bib_Esti_2014, AUTHOR = {SANCHIT AGGARWAL and Anoop Namboodiri and Jawahar C V}, TITLE = {Estimating Floor Regions in Cluttered Indoor Scenes from First Person Camera View}, BOOKTITLE = {International conference on Pattern Recognition}, YEAR = {2014}}
The ability to detect floor regions from an image enables a variety of applications such as indoor scene understanding, mobility assessment, robot navigation, path planning and surveillance. In this work, we propose a framework for estimating floor regions in cluttered indoor environments. The problem of floor detection and segmentation is challenging in situations where floor and non-floor regions have similar appearances. It is even harder to segment floor regions when clutter, specular reflections, shadows and textured floors are present within the scene. Our framework utilizes a generic classifier trained from appearance cues as well as floor density estimates, both trained from a variety of indoor images. The results of the classifier are then adapted to a specific test image, where we integrate appearance, position and geometric cues in an iterative framework. A Markov Random Field framework is used to integrate the cues to segment floor regions. In contrast to previous settings that relied on optical flow, depth sensors or multiple images in a calibrated setup, our method can work on a single image. It is also more flexible, as we avoid assumptions like a Manhattan-world scene or restricting clutter only to wall-floor boundaries. Experimental results on the public MIT Scene dataset, as well as a more challenging dataset that we acquired, demonstrate the robustness and efficiency of our framework in the above-mentioned complex situations.
Learning minutiae neighborhoods: A new binary representation for matching fingerprints
AKHIL VIJ, Anoop Namboodiri
Computer Vision and Pattern Recognition Conference workshops, CVPR-W, 2014
@inproceedings{bib_Lear_2014, AUTHOR = {AKHIL VIJ and Anoop Namboodiri}, TITLE = {Learning minutiae neighborhoods: A new binary representation for matching fingerprints}, BOOKTITLE = {Computer Vision and Pattern Recognition Conference workshops}, YEAR = {2014}}
Representation of fingerprints is one of the key factors that limits the accuracy and efficiency of matching algorithms. Most popular methods represent each fingerprint as an unordered set of minutiae with variable cardinality, and the matching algorithms are left with the task of finding the best correspondence between the two sets of minutiae. While this makes the representation more flexible and matching more accurate, the task becomes computationally intensive. Fixed-length representations with aligned features are highly efficient to match. However, creating an aligned representation without the knowledge of the sample to which it is to be matched makes the problem of representation more complex. Some of the fixed-length representations only provide partial alignment, leaving the rest to the matching stage. In this paper, we propose a fixed-length representation for fingerprints that provides exact alignment between the features, thus enabling high-speed matching with minimal computational effort. The representation extends the idea of object representation using bag of words into a bag of minutiae neighborhoods. The representation is provably invariant to affine transformations (rotation, translation and uniform scaling), and is shown to be highly discriminative for the task of verification. Experimental results on FVC 2002 and 2004 datasets clearly show the superiority of the representation over competing methods. As the proposed representation can be computed from the standard minutiae templates, the method is applicable to existing datasets, where the original fingerprint images are not available.
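The invariance claim can be illustrated with a toy neighbourhood code. This is a hypothetical simplification using only sorted neighbour distances, which are invariant to rigid transforms; the paper's representation also encodes minutia angles and handles uniform scaling, which this sketch skips:

```python
import numpy as np

def neighborhood_code(pts, i, k=3):
    """Sorted distances from minutia i to its k nearest neighbours: a toy
    rigid-transform-invariant neighbourhood descriptor."""
    d = np.linalg.norm(pts - pts[i], axis=1)
    return np.sort(d)[1:k + 1]          # drop the zero self-distance

rng = np.random.default_rng(0)
pts = rng.random((20, 2)) * 100          # toy minutiae locations

# an arbitrary rigid transform: rotation by 36 degrees plus a translation
t = np.pi / 5
R = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
moved = pts @ R.T + np.array([12.0, -7.0])

same = all(np.allclose(neighborhood_code(pts, i), neighborhood_code(moved, i))
           for i in range(len(pts)))
print("codes unchanged under rigid motion:", same)   # True
```

Because each code depends only on local inter-minutia geometry, no global pre-alignment of the two fingerprints is needed before comparing codes, which is what enables the fast fixed-length matching described above.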
Shape reconstruction from single relief image
HARSHIT AGRAWAL, Anoop Namboodiri
Asian Conference on Pattern Recognition, ACPR, 2013
@inproceedings{bib_Shap_2013, AUTHOR = {HARSHIT AGRAWAL and Anoop Namboodiri}, TITLE = {Shape reconstruction from single relief image}, BOOKTITLE = {Asian Conference on Pattern Recognition}, YEAR = {2013}}
Reconstructing geometric models of relief carvings is of great importance in preserving cultural heritage digitally. In the case of reliefs, using laser scanners and structured lighting techniques is not always feasible, or is very expensive, given the uncontrolled environment. Single-image shape from shading is an under-constrained problem that tries to solve for the surface normals given the intensity image. Various constraints are used to make the problem tractable. To avoid uncontrolled lighting, we use a pair of images, with and without flash, and compute an image under a known illumination. This image is used as input to the shape reconstruction algorithms. We present techniques that try to reconstruct the shape from relief images using prior information learned from examples. We learn the variations in geometric shape corresponding to image appearances under different lighting conditions using sparse representations. Given a new image, we estimate the most appropriate shape that will result in the given appearance under the specified lighting conditions. We integrate the prior with the normals computed from the reflectance equation in a MAP framework. We test our approach on relief images and compare them with state-of-the-art shape-from-shading algorithms.
Learning fingerprint orientation fields using continuous restricted Boltzmann machines
MIHIR SAHASRABUDHE,Anoop Namboodiri
Asian Conference on Pattern Recognition, ACPR, 2013
@inproceedings{bib_Lear_2013, AUTHOR = {MIHIR SAHASRABUDHE, Anoop Namboodiri}, TITLE = {Learning fingerprint orientation fields using continuous restricted Boltzmann machines}, BOOKTITLE = {Asian Conference on Pattern Recognition}. YEAR = {2013}}
We aim to learn local orientation field patterns in fingerprints and correct distorted field patterns in noisy fingerprint images. This is formulated as a learning problem and achieved using two continuous restricted Boltzmann machines. The learnt orientation fields are then used in conjunction with traditional Gabor based algorithms for fingerprint enhancement. Orientation fields extracted by gradient-based methods are local, and do not consider neighboring orientations. If some amount of noise is present in a fingerprint, then these methods perform poorly when enhancing the image, affecting fingerprint matching. This paper presents a method to correct the resulting noisy regions over patches of the fingerprint by training two continuous restricted Boltzmann machines. The continuous RBMs are trained with clean fingerprint images and applied to overlapping patches of the input fingerprint. Experimental results show that one can successfully restore patches of noisy fingerprint images.
Secure hamming distance based biometric authentication
KULKARNI ROHAN PRADEEP,Anoop Namboodiri
International conference on Biometrics, IJCB, 2013
@inproceedings{bib_Secu_2013, AUTHOR = {KULKARNI ROHAN PRADEEP, Anoop Namboodiri}, TITLE = {Secure hamming distance based biometric authentication}, BOOKTITLE = {International conference on Biometrics}. YEAR = {2013}}
Concerns of privacy, template security and efficiency of biometric authentication systems have received considerable research attention in the recent past. Binary template representations have been shown to provide significant improvements in efficiency without compromising accuracy for a variety of biometric modalities including fingerprints, palmprints and iris. Most of the secure authentication protocols work with generic feature representations or propose new secure templates for specific modalities. In this work, we propose an authentication protocol that works with any binary biometric representation that employs hamming distance for comparison and converts it into a secure, privacy-preserving protocol that provides template protection. We also provide the ability to use masks while computing the hamming distance. The protocol retains the accuracy of the underlying representation, as demonstrated by our experiments.
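The comparison primitive the protocol secures — Hamming distance over the mutually valid bits of two binary templates, as used for iris codes — looks like this in the clear (an illustrative sketch only; the paper's contribution is evaluating this comparison without revealing the templates, which is not reproduced here):

```python
def masked_hamming(a: int, b: int, mask_a: int, mask_b: int) -> float:
    """Normalized Hamming distance between two binary templates held
    as integers, counting only bit positions that are valid (set to 1)
    in both masks."""
    valid = mask_a & mask_b
    disagreements = bin((a ^ b) & valid).count("1")
    n_valid = bin(valid).count("1")
    # No mutually valid bits: treat as maximally distant.
    return disagreements / n_valid if n_valid else 1.0

# Toy 8-bit templates: they differ in 1 of the 6 mutually valid bits.
d = masked_hamming(0b10110100, 0b10010110, 0b11111100, 0b11111111)
```

A genuine pair is accepted when this distance falls below a threshold; the secure protocol computes the same quantity on protected templates.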
Fingerprint Enhancement using Hierarchical Markov Random Fields
NAGAVIJAYA RAMA REDDY K,Anoop Namboodiri
International Joint Conference on Biometrics, IJCBio, 2011
@inproceedings{bib_Fing_2011, AUTHOR = {NAGAVIJAYA RAMA REDDY K, Anoop Namboodiri}, TITLE = {Fingerprint Enhancement using Hierarchical Markov Random Fields}, BOOKTITLE = {International Joint Conference on Biometrics}. YEAR = {2011}}
We propose a novel approach to enhance the fingerprint image and extract features such as directional fields, minutiae and singular points reliably using a Hierarchical Markov Random Field Model. Unlike traditional fingerprint enhancement techniques, we use previously learned prior patterns from a set of clean fingerprints to restore a noisy one. We are able to recover the ridge and valley structure from degraded and noisy fingerprint images by formulating it as a hierarchical interconnected MRF that processes the information at multiple resolutions. The top layer incorporates the compatibility between an observed degraded fingerprint patch and prior training patterns in addition to ridge continuity across neighboring patches. A second layer accounts for spatial smoothness of the orientation field and its discontinuity at the singularities. Further layers could be used for incorporating higher level priors such as the class of the fingerprint. The strength of the proposed approach lies in its flexibility to model possible variations in fingerprint images as patches and from its ability to incorporate contextual information at various resolutions. Experimental results (both quantitative and qualitative) clearly demonstrate the effectiveness of this approach.
Cascaded filtering for biometric identification using random projections
ATIF IQBAL,Anoop Namboodiri
Computer Vision and Pattern Recognition Conference workshops, CVPR-W, 2011
@inproceedings{bib_Casc_2011, AUTHOR = {ATIF IQBAL, Anoop Namboodiri}, TITLE = {Cascaded filtering for biometric identification using random projections}, BOOKTITLE = {Computer Vision and Pattern Recognition Conference workshops}. YEAR = {2011}}
Biometric identification often involves explicit comparison of a probe template against each template stored in a database. This approach becomes extremely time-consuming as the size of the database increases. Filtering approaches use a light-weight comparison to reduce the database to a smaller set of candidates for explicit comparison. However, most existing filtering schemes use specific features that are hand-crafted for the biometric trait at each stage of the filtering. In this work, we show that a cascade of simple linear projections on random lines can achieve significant levels of filtering. Each stage of filtering consists of projecting the probe onto a specific line and removal of database samples outside a window around the probe. The approach provides a way of automatic generation of filters and avoids the need to develop specific features for different biometric traits. The method also provides us with a variety of parameters such as the projection lines, the number and order of projections, and the window sizes to customize the filtering process to a specific application. Experimental results show that using an ensemble of projections reduces the search space by 60% without increasing the false negative identification rate.
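The cascade itself is simple to sketch. The following is an illustrative Python version under assumed details (Gaussian random lines, a fixed window per stage); the projection lines, window sizes, and number of stages are exactly the tunable parameters the abstract mentions.

```python
import random

def random_unit_vector(dim, rng):
    """A uniformly random direction, from normalized Gaussian draws."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = sum(x * x for x in v) ** 0.5
    return [x / norm for x in v]

def project(x, line):
    return sum(a * b for a, b in zip(x, line))

def cascaded_filter(probe, database, n_stages, window, seed=0):
    """At each stage, project onto a random line and keep only the
    database entries whose projection lies within `window` of the
    probe's projection; survivors go to the next stage."""
    rng = random.Random(seed)
    dim = len(probe)
    candidates = list(range(len(database)))
    for _ in range(n_stages):
        line = random_unit_vector(dim, rng)
        p = project(probe, line)
        candidates = [i for i in candidates
                      if abs(project(database[i], line) - p) <= window]
    return candidates
```

A genuine match survives every stage because its projection always lies close to the probe's (the window is chosen from intra-class variation), while unrelated templates are pruned with high probability at each stage.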
Video based palmprint recognition
CHHAYA METHANI,Anoop Namboodiri
International conference on Pattern Recognition, ICPR, 2010
@inproceedings{bib_Vide_2010, AUTHOR = {CHHAYA METHANI, Anoop Namboodiri}, TITLE = {Video based palmprint recognition}, BOOKTITLE = {International conference on Pattern Recognition}. YEAR = {2010}}
The use of a camera as a biometric sensor is desirable due to its ubiquity and low cost, especially for mobile devices. Palmprint is an effective modality in such cases due to its discrimination power, ease of presentation and the scale and size of texture for capture by commodity cameras. However, the unconstrained nature of pose and lighting introduces several challenges in the recognition process. Even minor changes in pose of the palm can induce significant changes in the visibility of the lines. We turn this property to our advantage by capturing a short video, where the natural palm motion induces minor pose variations, providing additional texture information. We propose a method to register multiple frames of the video without requiring correspondence, while being efficient. Experimental results on a set of 100 different palms show that the use of multiple frames reduces the error rate from 12.75% to 4.7%. We also propose a method for detection of poor quality samples due to specularities and motion blur, which further reduces the EER to 1.8%.
Efficient privacy preserving k-means clustering
MANEESH UPMANYU,Anoop Namboodiri,Srinathan Kannan,Jawahar C V
Pacific Asia Workshop on Intelligence and Security Informatics., PAISI, 2010
@inproceedings{bib_Effi_2010, AUTHOR = {MANEESH UPMANYU, Anoop Namboodiri, Srinathan Kannan, Jawahar C V}, TITLE = {Efficient privacy preserving k-means clustering}, BOOKTITLE = {Pacific Asia Workshop on Intelligence and Security Informatics.}. YEAR = {2010}}
This paper introduces an efficient privacy-preserving protocol for distributed K-means clustering over arbitrarily partitioned data, shared among N parties. Clustering is one of the fundamental algorithms used in the field of data mining. Advances in data acquisition methodologies have resulted in collection and storage of vast quantities of users' personal data. For mutual benefit, organizations tend to share their data for analytical purposes, thus raising privacy concerns for the users. Over the years, numerous attempts have been made to introduce privacy and security at the expense of massive additional communication costs. The approaches suggested in the literature make use of cryptographic protocols such as Secure Multiparty Computation (SMC) and/or homomorphic encryption schemes like Paillier's encryption. Methods using such schemes have proven communication overheads and, in practice, are found to be slower by a factor of more than 10^6. In light of the practical limitations posed by privacy using the traditional approaches, we explore a paradigm shift to side-step the expensive protocols of SMC. In this work, we use the paradigm of secret sharing, which allows the data to be divided into multiple shares and processed separately at different servers. Using the paradigm of secret sharing allows us to design a provably-secure, cloud computing based solution which has negligible communication overhead compared to SMC and is hence over a million times faster than similar SMC based protocols.
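The secret-sharing paradigm the paper builds on can be sketched with additive shares (a minimal illustration, not the paper's full protocol): each value is split into random shares that sum to it modulo a public prime, so servers can compute sums — the building block of a k-means centroid update — without any server seeing a raw data point. The modulus and party count below are arbitrary choices.

```python
import random

P = 2**31 - 1  # public prime modulus; all arithmetic is mod P

def share(value, n_parties, rng):
    """Split `value` into additive shares that sum to it mod P.
    Any subset of n_parties - 1 shares is uniformly random and
    reveals nothing about the value."""
    shares = [rng.randrange(P) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

def reconstruct(shares):
    return sum(shares) % P

# Each server sums the shares it holds; combining the per-server
# totals reconstructs the cluster sum (numerator of a centroid
# update) without exposing any individual point.
rng = random.Random(42)
points = [13, 7, 22]                        # one coordinate of a cluster
all_shares = [share(p, 3, rng) for p in points]
server_totals = [sum(col) % P for col in zip(*all_shares)]
cluster_sum = reconstruct(server_totals)
```

Because addition commutes with sharing, the servers never need to communicate intermediate values to each other, which is where the communication savings over SMC-style protocols come from.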
Blind authentication: a secure crypto-biometric verification protocol
MANEESH UPMANYU,Anoop Namboodiri,Srinathan Kannan,Jawahar C V
IEEE Transactions on Information Forensics and Security, TIFS, 2010
@inproceedings{bib_Blin_2010, AUTHOR = {MANEESH UPMANYU, Anoop Namboodiri, Srinathan Kannan, Jawahar C V}, TITLE = {Blind authentication: a secure crypto-biometric verification protocol}, BOOKTITLE = {IEEE Transactions on Information Forensics and Security}. YEAR = {2010}}
Concerns on widespread use of biometric authentication systems are primarily centered around template security, revocability, and privacy. The use of cryptographic primitives to bolster the authentication process can alleviate some of these concerns as shown by biometric cryptosystems. In this paper, we propose a provably secure and blind biometric authentication protocol, which addresses the concerns of user’s privacy, template protection, and trust issues. The protocol is blind in the sense that it reveals only the identity, and no additional information about the user or the biometric to the authenticating server or vice-versa. As the protocol is based on asymmetric encryption of the biometric data, it captures the advantages of biometric authentication as well as the security of public key cryptography. The authentication protocol can run over public networks and provide nonrepudiable identity verification. The encryption also provides template protection, the ability to revoke enrolled templates, and alleviates the concerns on privacy in widespread use of biometrics. The proposed approach makes no restrictive assumptions on the biometric data and is hence applicable to multiple biometrics. Such a protocol has significant advantages over existing biometric cryptosystems, which use a biometric to secure a secret key, which in turn is used for authentication. We analyze the security of the protocol under various attack scenarios. Experimental results on four biometric datasets (face, iris, hand geometry, and fingerprint) show that carrying out the authentication in the encrypted domain does not affect the accuracy, while the encryption key acts as an additional layer of security.
Efficient privacy preserving video surveillance
MANEESH UPMANYU,Anoop Namboodiri,Srinathan Kannan,Jawahar C V
International Conference on Computer Vision, ICCV, 2009
@inproceedings{bib_Effi_2009, AUTHOR = {MANEESH UPMANYU, Anoop Namboodiri, Srinathan Kannan, Jawahar C V}, TITLE = {Efficient privacy preserving video surveillance}, BOOKTITLE = {International Conference on Computer Vision}. YEAR = {2009}}
Widespread use of surveillance cameras in offices and other business establishments poses a significant threat to the privacy of the employees and visitors. The challenge of introducing privacy and security in such a practical surveillance system has been stifled by the enormous computational and communication overhead required by the solutions. In this paper, we propose an efficient framework to carry out privacy preserving surveillance. We split each frame into a set of random images. Each image by itself does not convey any meaningful information about the original frame, while collectively, they retain all the information. Our solution is derived from a secret sharing scheme based on the Chinese Remainder Theorem, suitably adapted to image data. Our method enables distributed secure processing and storage, while retaining the ability to reconstruct the original data in case of a legal requirement. The system installed in an office-like environment can effectively detect and track people, or solve similar surveillance tasks. Our proposed paradigm is highly efficient compared to Secure Multiparty Computation, making privacy preserving surveillance practical.
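The underlying primitive — splitting a pixel value into residues under a Chinese Remainder Theorem secret sharing scheme — can be sketched for a single pixel as follows. This is only a minimal illustration under assumed moduli; the paper adapts such a scheme to full image data so that each share image individually conveys little.

```python
from math import prod

def crt_split(pixel, moduli):
    """One residue per server. Assumes pairwise-coprime moduli whose
    product exceeds the pixel range, so reconstruction is unique."""
    return [pixel % m for m in moduli]

def crt_reconstruct(residues, moduli):
    """Standard CRT recombination from all residues."""
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)  # modular inverse (Python 3.8+)
    return x % M

moduli = (7, 11, 13)   # product 1001 > 255, so 8-bit pixels round-trip
shares = crt_split(200, moduli)
restored = crt_reconstruct(shares, moduli)
```

Each server stores and processes only its residue image; the original frame is recoverable only when the shares are brought together, e.g. under a legal requirement.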
Retrieval of online handwriting by synthesis and matching
Jawahar C V,A BALA SUBRAMANIAN,MILLION MESHESHA,Anoop Namboodiri
Pattern Recognition, PR, 2009
@inproceedings{bib_Retr_2009, AUTHOR = {Jawahar C V, A BALA SUBRAMANIAN, MILLION MESHESHA, Anoop Namboodiri}, TITLE = {Retrieval of online handwriting by synthesis and matching}, BOOKTITLE = {Pattern Recognition}. YEAR = {2009}}
Search and retrieval is gaining importance in the ink domain due to the increase in the availability of online handwritten data. However, the problem is challenging due to variations in handwriting between various writers, digitizers and writing conditions. In this paper, we propose a retrieval mechanism for online handwriting, which can handle different writing styles, specifically for Indian languages. The proposed approach provides a keyboard-based search interface that enables searching handwritten data from any platform, in addition to pen-based and example-based queries. One of the major advantages of this framework is that information retrieval techniques such as ranking relevance, detecting stopwords and controlling word forms are extended to work with search and retrieval in the ink domain. The framework also allows cross-lingual document retrieval across Indian languages.
Contextual restoration of severely degraded document images
JYOTIRMOY BANERJEE,Anoop Namboodiri,Jawahar C V
Computer Vision and Pattern Recognition, CVPR, 2009
@inproceedings{bib_Cont_2009, AUTHOR = {JYOTIRMOY BANERJEE, Anoop Namboodiri, Jawahar C V}, TITLE = {Contextual restoration of severely degraded document images}, BOOKTITLE = {Computer Vision and Pattern Recognition}. YEAR = {2009}}
We propose an approach to restore severely degraded document images using a probabilistic context model. Unlike traditional approaches that use previously learned prior models to restore an image, we are able to learn the text model from the degraded document itself, making the approach independent of script, font, style, etc. We model the contextual relationship using an MRF. The ability to work with larger patch sizes allows us to deal with severe degradations including cuts, blobs, merges and vandalized documents. Our approach can also integrate document restoration and super-resolution into a single framework, thus directly generating high quality images from degraded documents. Experimental results show significant improvement in image quality on document images collected from various sources including magazines and books, and comprehensively demonstrate the robustness and adaptability of the approach. It works well with document collections such as books, even with severe degradations, and hence is ideally suited for repositories such as digital libraries.
Efficient Biometric Verification in Encrypted Domain
MANEESH UPMANYU,Anoop Namboodiri,Srinathan Kannan,Jawahar C V
International conference on Biometrics, IJCB, 2009
@inproceedings{bib_Effi_2009, AUTHOR = {MANEESH UPMANYU, Anoop Namboodiri, Srinathan Kannan, Jawahar C V}, TITLE = {Efficient Biometric Verification in Encrypted Domain}, BOOKTITLE = {International conference on Biometrics}. YEAR = {2009}}
Biometric authentication over public networks leads to a variety of privacy issues that need to be addressed before it can become popular. The primary concerns are that the biometrics might reveal more information than the identity itself, as well as provide the ability to track users over an extended period of time. In this paper, we propose an authentication protocol that alleviates these concerns. The protocol takes care of user privacy, template protection and trust issues in biometric authentication systems. The protocol uses asymmetric encryption, and captures the advantages of biometric authentication. The protocol provides non-repudiable identity verification, while not revealing any additional information about the user to the server or vice versa. We show that the protocol is secure under various attacks. Experimental results indicate that the overall method is efficient enough to be used in practical scenarios.
Pose invariant palmprint recognition
CHHAYA METHANI,Anoop Namboodiri
International conference on Biometrics, IJCB, 2009
@inproceedings{bib_Pose_2009, AUTHOR = {CHHAYA METHANI, Anoop Namboodiri}, TITLE = {Pose invariant palmprint recognition}, BOOKTITLE = {International conference on Biometrics}. YEAR = {2009}}
A palmprint based authentication system that can work with a multipurpose camera in uncontrolled circumstances, such as those mounted on a laptop, mobile device or those for surveillance, can dramatically increase the applicability of such a system. However, the performance of existing techniques for palmprint authentication falls considerably when the camera is not aligned with the surface of the palm. The problems arise primarily due to variations in appearance introduced by varying pose, but are compounded by specularity of the skin and blur due to motion and focus. In this paper, we propose a method to deal with variations in pose in unconstrained palmprint imaging. The method can robustly estimate and correct variations in pose, and compute a similarity measure between the corrected test image and a reference image. Experimental results on a set of 100 users' palms captured at varying poses show a reduction in Equal Error Rate from 22.4% to 8.7%.
Projected texture for object classification
AVINASH SHARMA,Anoop Namboodiri
European Conference on Computer Vision, ECCV, 2008
@inproceedings{bib_Proj_2008, AUTHOR = {AVINASH SHARMA, Anoop Namboodiri}, TITLE = {Projected texture for object classification}, BOOKTITLE = {European Conference on Computer Vision}. YEAR = {2008}}
Algorithms for classification of 3D objects either recover the depth information lost during imaging using multiple images, structured lighting, image cues, etc., or work directly on the images for classification. While the latter class of algorithms is more efficient and robust in comparison, it is less accurate due to the lack of depth information. We propose the use of structured lighting patterns projected on the object, which get deformed according to the shape of the object. Since our goal is object classification and not shape recovery, we characterize the deformations using simple texture measures, thus avoiding the error prone and computationally expensive step of depth recovery. Moreover, since the deformations encode depth variations of the object, the 3D shape information is implicitly used for classification. We show that the information thus derived can significantly improve the accuracy of object classification algorithms, and derive the theoretical limits on height variations that can be captured by a particular projector-camera setup. A 3D texture classification algorithm derived from the proposed approach achieves a ten-fold reduction in error rate on a dataset of 30 classes, when compared to state-of-the-art image based approaches. We also demonstrate the effectiveness of the approach for a hand geometry based authentication system, which achieves a four-fold reduction in the equal error rate on a dataset containing 149 users.
Projected texture for hand geometry based authentication
AVINASH SHARMA,NISHANT SHOBHIT,Anoop Namboodiri
Computer Vision and Pattern Recognition Conference workshops, CVPR-W, 2008
@inproceedings{bib_Proj_2008, AUTHOR = {AVINASH SHARMA, NISHANT SHOBHIT, Anoop Namboodiri}, TITLE = {Projected texture for hand geometry based authentication}, BOOKTITLE = {Computer Vision and Pattern Recognition Conference workshops}. YEAR = {2008}}
We propose a novel approach to 3D hand geometry based person authentication using projected light patterns. Instead of explicitly computing a depth map of the palm for recognition, we capture the depth information in the deformations of a projected texture pattern, and use it directly for recognition. The deformed pattern is characterized using local texture measures, which encode certain depth characteristics of the palm. An authentication system built using the proposed technique achieves an equal error rate of 0.84% on a dataset of 1341 samples collected from 149 users, as opposed to 4.03% using traditional 2D features on an identical dataset. The approach is robust as well as computationally efficient and could be applied to other 3D object recognition problems as well.
Repudiation detection in handwritten documents
SACHIN GUPTA,Anoop Namboodiri
International conference on Biometrics, IJCB, 2007
@inproceedings{bib_Repu_2007, AUTHOR = {SACHIN GUPTA, Anoop Namboodiri}, TITLE = {Repudiation detection in handwritten documents}, BOOKTITLE = {International conference on Biometrics}. YEAR = {2007}}
Forensic document verification presents a different and interesting set of challenges as opposed to traditional writer identification and verification tasks using natural handwriting. The handwritten data presented to a forensic examiner is often deliberately altered, in addition to being limited in quantity. Specifically, the alterations can be either forged, where one imitates another person’s handwriting; or repudiated, where one deliberately distorts his handwriting in order to avoid identification. In this paper, we present a framework to detect repudiation in forensic documents, where we only have one pair of documents to arrive at a decision. The approach generates a statistically significant confidence score from matching two documents, which can be used to screen the documents that are passed on to an expert examiner. The approach can be extended for detection of forgeries as well.
Text Independent Writer Identification from Online Handwriting
Anoop Namboodiri,SACHIN GUPTA
International Conference on Frontiers in Handwriting Recognition, ICFHR, 2006
@inproceedings{bib_Text_2006, AUTHOR = {Anoop Namboodiri, SACHIN GUPTA}, TITLE = {Text Independent Writer Identification from Online Handwriting}, BOOKTITLE = {International Conference on Frontiers in Handwriting Recognition}. YEAR = {2006}}
Automatic identification of the author of a document has a variety of applications for both online and offline handwritten data, such as facilitating the use of writer-dependent recognizers, verification of claimed identity for security, enabling personalized HCI and countering repudiations for legal purposes. Most of the existing writer identification techniques require the data to be from a specific text or a recognizer to be available, which is not always feasible. Text-independent approaches often require large amounts of data to be confident of good results. In this work, we propose a text-independent writer identification framework that uses a specified set of primitives of online handwritten data to ascertain the identity of the writer. The framework allows us to learn the properties of the script and the writers simultaneously and hence can be used with multiple languages or scripts. We demonstrate the applicability of our framework by choosing shapes of curves as primitives and show results on five different scripts and on different data sets.