Abstract
Handwritten Text Recognition (HTR) is a significant challenge in computer vision owing to factors such as individual writing styles, noise, blur, and other imperfections in the text. The challenge is exacerbated for languages written in Indian scripts, which are characterized by complex character structures, extensive character inventories, and specific cultural nuances. In this study, we address these challenges by focusing on handwritten text recognition for ten Indic languages: Hindi, Bengali, Telugu, Tamil, Gujarati, Gurumukhi, Oriya, Kannada, Malayalam, and Urdu. We aim to improve recognition accuracy by leveraging the Permuted Autoregressive Sequence Model (PARSeq), a transformer-based recognition model. Our results show that PARSeq outperforms existing approaches, achieving state-of-the-art performance on most of these languages. Additionally, we investigate transfer learning from printed text to handwritten text and find that it has the potential to improve recognition performance. The trained models and code are publicly available at https://github.com/LalithaEvani/Indic-HTR-CVIP-2024.