Best Paper Award for Using Deep Learning In Retrieval And Recognition of Handwritten Word Images

Put simply, Document Analysis is about extracting content from a scanned document image so that it is easily editable and searchable. While currently popular Optical Character Recognition software gives almost 100% accuracy on scanned, printed Latin text, the approach doesn’t work very well on scanned handwritten (HW) Latin text, due to factors such as variability in handwriting, page structure and so on.

Methods for indexing and retrieval of scanned handwritten documents are needed for various applications such as historical manuscripts, scientific notes, personal records, and even criminal records.

DAS 2018

At the recently concluded International Workshop on Document Analysis Systems (DAS) in Vienna, the IAPR Nakano Best Paper Award went to Praveen Krishnan, Kartik Dutta and Prof C.V. Jawahar for the paper titled, ‘Word Spotting and Recognition Using Deep Embeddings’.

DAS 2018 was sponsored by the International Association for Pattern Recognition (IAPR), and this year was its 13th edition. Very broadly speaking, the IAPR is an international association of non-profit, scientific or professional organizations concerned with pattern recognition, computer vision, and image processing. DAS in particular is a workshop focussing on system-level issues and approaches in document analysis and recognition, and is considered to be a premier event for those working in this field. The workshop comprised invited speaker presentations, oral, poster and tutorial sessions, as well as working group discussions.

The IAPR Nakano Award (the best paper award) was established in 1998, in memory of the late Prof. Yasuaki Nakano, the honorary chair of this workshop and the general chair of DAS 1998.

Praveen Krishnan, a seventh-year PhD student under Prof. C.V. Jawahar at the Centre for Visual Information Technology (CVIT), who represented the institute, says that he was fortunate to attend and present two papers at the workshop. “We got the opportunity to present on the first day itself (actually the first talk in the first session!)”, he says with pride.

The Award-Winning Work

There currently exist two approaches to make handwritten documents searchable:

Handwritten Word Recognition
Word Spotting: Similar to keyword spotting in speech processing, where specific spoken words are picked out of an utterance, word spotting in document image processing is the problem of finding all instances of a query word in a scanned document image, without fully recognizing the document's text.
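To make the idea of word spotting concrete, here is a minimal sketch of retrieval by similarity. It assumes every word image on a page has already been turned into a fixed-length feature vector by some model; the function names, the word identifiers, the toy three-dimensional vectors and the 0.9 threshold are all hypothetical and are not taken from the paper.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def spot_word(query_vec, page_words, threshold=0.9):
    """Return (similarity, word_id) pairs for word images resembling the query, best first."""
    hits = []
    for word_id, vec in page_words.items():
        sim = cosine_similarity(query_vec, vec)
        if sim >= threshold:
            hits.append((sim, word_id))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits

# Hypothetical feature vectors for three word images on a scanned page.
page_words = {
    "line1_word3": [0.9, 0.1, 0.0],
    "line2_word1": [0.0, 1.0, 0.2],
    "line4_word7": [0.8, 0.2, 0.1],
}
query = [1.0, 0.1, 0.0]  # hypothetical feature vector of the query word

matches = spot_word(query, page_words, threshold=0.9)
print([word_id for _, word_id in matches])
```

Note that nothing in this sketch reads the letters of any word: matching instances are found purely by comparing vectors, which is what distinguishes spotting from recognition.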

The researchers from the Centre for Visual Information Technology at IIIT-H applied word spotting to handwritten word images. According to Kartik Dutta, one of the co-authors of the paper, “In case of word spotting, the previous approaches either had a multiple stage pipeline or trained their networks to encode an image in a predetermined representation (PHOC, etc.). Whereas the proposed approach allows us to train an elegant, end to end network, without using any predetermined representation.” For handwritten word recognition, the group used a network similar to the one used for word spotting, but added recurrent layers so that it can predict the text present in any line or word image without being constrained to choose from a dictionary of words.
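What "not being constrained to a dictionary" can mean in practice is easiest to see with a small decoding sketch. The article does not say how the network's outputs are turned into text; the example below assumes a CTC-style greedy decoder, a common choice for recurrent recognizers, in which the network emits one score per character at each position along the image. The alphabet, the blank symbol and the scores are hypothetical.

```python
BLANK = "-"  # hypothetical "no character" symbol used by CTC-style decoders

def greedy_decode(frame_scores, alphabet):
    """Pick the best symbol per frame, collapse repeats, then drop blanks."""
    best = [
        alphabet[max(range(len(row)), key=row.__getitem__)]
        for row in frame_scores
    ]
    decoded = []
    previous = None
    for symbol in best:
        # Emit a symbol only when it differs from the previous frame and is not blank.
        if symbol != previous and symbol != BLANK:
            decoded.append(symbol)
        previous = symbol
    return "".join(decoded)

alphabet = [BLANK, "a", "c", "t"]

# Hypothetical per-frame scores, one row per horizontal position in the image.
frame_scores = [
    [0.1, 0.1, 0.7, 0.1],    # c
    [0.1, 0.1, 0.7, 0.1],    # c again, collapsed into the previous frame
    [0.6, 0.2, 0.1, 0.1],    # blank
    [0.1, 0.8, 0.05, 0.05],  # a
    [0.1, 0.1, 0.1, 0.7],    # t
    [0.1, 0.1, 0.1, 0.7],    # t again, collapsed
]

decoded = greedy_decode(frame_scores, alphabet)
print(decoded)
```

Because the output is assembled character by character, any string over the alphabet can be produced, including words that appear in no word list.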

The Second Paper

The other paper presented was titled “Offline Handwriting Recognition on Devanagari using a new Benchmark Dataset”. “The major contribution of this work is in bringing a new framework for generating annotated data for Indian scripts. We also benchmark this dataset along with the publicly available dataset with an improved version of text recognition system using a deep neural network,” explains Praveen.

Experiences and Insights

Explaining the proceedings at the workshop, Praveen says that other than the regular oral presentations and poster sessions, DAS has an interesting concept of “discussion forums”, where the researchers vote for a few upcoming or major research topics of broad interest to the community. Groups are then formed, and participants sit together and discuss various aspects of their chosen topic. “I was again lucky enough to be the part of the popular group formed for discussing ‘deep learning for document images analysis’ where I also took the role of becoming the scribe. Towards the end of workshop, I summarized the topic discussed and gave a presentation,” says Praveen.

Collaborations

Praveen credits Prof. C.V. Jawahar with being a major contributor to this field in India, having taken major roles in projects such as the Digital Library of India, the development of OCR systems for Indic scripts and, more recently, the IMPRINT initiative. In fact, this work is partly supported by the IMPRINT project, Govt. of India. IMPRINT (an acronym for IMpacting Research INnovation and Technology) is a first-of-its-kind MHRD-supported initiative in collaboration with premier tech institutes to address the major science and engineering challenges that India faces. It aims to adopt engineering and technology as the vehicle to address societal needs and to eventually achieve national prosperity.

Praveen Krishnan has been a TCS Research Scholar under their Fellowship Scheme since 2013. Launched in 2010, the TCS Research Scholarship Program invites applications from doctoral research scholars at 61 identified top-ranking institutes and universities in India. The aim is to provide a major fillip for quality research in areas such as computer sciences and engineering, information systems and technology, and software engineering. Notably, the scholarship comes with no strings attached. As per the website, “the RSP scholars do not have to join TCS or share the IP of their work in any way.”