Abstract
In addition to innate human intelligence, access to extensive context and world knowledge is a crucial factor in comprehending natural language; it is what allows humans to understand words with multiple meanings smoothly and effortlessly. Although machines lack intrinsic intelligence, their capacity to learn language improves greatly with access to more data, which serves as valuable context. In Natural Language Processing (NLP), the task of identifying and attributing the correct sense of a word in a given context is called Word Sense Disambiguation (WSD). As a sub-task, WSD plays a crucial role in several NLP applications such as Machine Translation. Every language has a set of words with multiple senses, and Sanskrit, one of the ancient and classical languages of the Indian subcontinent, is no exception. Like many other languages with a rich literary tradition, Sanskrit features a multitude of polysemous words. However, the data available to train machine models on Sanskrit is considerably smaller than that available for European and a few other Indian languages. Consequently, disambiguating word senses in Sanskrit presents a highly complex challenge for machines, especially given the unique and rich nature of its literary language. The purpose of this paper is to delineate the areas where the infusion of additional data can enhance language learning, through a manual error-analysis taxonomy focused on the Bhagavadgītā. Our analysis examines the translation outputs produced by Google Translate, which is considered the state-of-the-art tool for handling Sanskrit and other languages with limited available resources.