IIITH

A sandhi splitter for Malayalam

Devadath V V, Litton J Kurisinkel, Dipti Mishra Sharma, Vasudeva Varma Kalidindi

International Conference on Natural Language Processing, ICON, 2014

Core Rank : - Google Rank :5

@inproceedings{bib_A_sa_2014, AUTHOR = {Devadath V V and Litton J Kurisinkel and Dipti Mishra Sharma and Vasudeva Varma Kalidindi}, TITLE = {A sandhi splitter for Malayalam}, BOOKTITLE = {International Conference on Natural Language Processing}, YEAR = {2014}}

Abstract

Sandhi splitting is the primary task in the computational processing of text in Sanskrit and Dravidian languages. In these languages, words can join together with morpho-phonemic changes at the point of joining. This phenomenon is known as sandhi. A sandhi splitter splits a string of conjoined words into individual words. Accurate sandhi splitting is crucial for text processing tasks such as POS tagging, topic modelling and document indexing. We tried different approaches to address the challenges of sandhi splitting in Malayalam and finally chose to exploit the phonological changes that take place in words when they join. This resulted in a hybrid method that statistically identifies the split points and then splits using predefined character-level linguistic rules. Currently, our system achieves an accuracy of 91.1%.
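The abstract does not spell out the algorithm, so the following is only a minimal sketch of the hybrid idea under stated assumptions: split points are scored with simple counts over a two-character context, and a small rule table restores the original characters at the chosen point. The rule table, the function names (`train`, `split_sandhi`), the threshold, and the toy data are all hypothetical. For readability the toy data uses simplified romanized Sanskrit-style vowel sandhi, not the paper's Malayalam rules or model.

```python
from collections import Counter

# Hypothetical character-level rules (not from the paper):
# merged character -> (last char of left word, first char of right word).
RULES = {"e": ("a", "i"), "o": ("a", "u")}


def train(examples):
    """Count how often each 2-character context marks a split point.

    `examples` is a list of (joined_string, split_index) pairs, where
    split_index is the position of the merged character, or None when
    the string contains no sandhi.
    """
    split_counts, total_counts = Counter(), Counter()
    for joined, split_index in examples:
        for i in range(1, len(joined) - 1):
            context = joined[i:i + 2]
            total_counts[context] += 1
            if i == split_index:
                split_counts[context] += 1
    return split_counts, total_counts


def split_probability(context, model):
    """Relative frequency with which `context` was a split point in training."""
    split_counts, total_counts = model
    if total_counts[context] == 0:
        return 0.0
    return split_counts[context] / total_counts[context]


def split_sandhi(joined, model, threshold=0.5):
    """Split at the most probable position above `threshold`, else return None."""
    best_index, best_prob = None, threshold
    for i in range(1, len(joined) - 1):
        if joined[i] not in RULES:
            continue  # no rule can undo a change at this character
        prob = split_probability(joined[i:i + 2], model)
        if prob > best_prob:
            best_index, best_prob = i, prob
    if best_index is None:
        return None
    left_end, right_start = RULES[joined[best_index]]
    return joined[:best_index] + left_end, right_start + joined[best_index + 1:]


# Toy training data: two joined forms with their split positions, one plain word.
model = train([("mahendra", 3), ("suryodaya", 4), ("deva", None)])
print(split_sandhi("narendra", model))  # ('nara', 'indra')
print(split_sandhi("deva", model))      # None
```

Keeping the statistical step separate from the rule table means the scoring model can be swapped for a stronger one without touching the linguistic rules.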

Identification of Karaka relations in an English sentence

Gorthi Sai Kiran, Ashish Palakurthi, Radhika Mamidi, Dipti Mishra Sharma

International Conference on Natural Language Processing, ICON, 2014

Core Rank : - Google Rank :5

@inproceedings{bib_Iden_2014, AUTHOR = {Gorthi Sai Kiran and Ashish Palakurthi and Radhika Mamidi and Dipti Mishra Sharma}, TITLE = {Identification of Karaka relations in an English sentence}, BOOKTITLE = {International Conference on Natural Language Processing}, YEAR = {2014}}

Abstract

In this paper, we explain the identification of karaka relations in an English sentence. We explain the genesis of the problem and present two different approaches, rule-based and statistical. We briefly describe the rule-based approach and focus more on the statistical approach. We process a sentence through various stages and extract features at each stage. We train on our data and identify karaka relations using Support Vector Machines (SVM). We also explain the impact of our work on Natural Language Interfaces for Database systems.

Animacy Annotation in the Hindi Treebank

Itisree Jena, Riyaz Ahmad Bhat, Sambhav Jain, Dipti Mishra Sharma

Linguistic Annotation Workshop, LAW, 2013

Core Rank : - Google Rank :-

@inproceedings{bib_Anim_2013, AUTHOR = {Itisree Jena and Riyaz Ahmad Bhat and Sambhav Jain and Dipti Mishra Sharma}, TITLE = {Animacy Annotation in the Hindi Treebank}, BOOKTITLE = {Linguistic Annotation Workshop}, YEAR = {2013}}

Abstract

In this paper, we discuss our efforts to annotate nominals in the Hindi Treebank with the semantic property of animacy. Although the treebank already encodes lexical information at a number of levels such as morph and part of speech, the addition of animacy information seems promising given its relevance to varied linguistic phenomena. The suggestion is based on the theoretical and computational analysis of the property of animacy in the context of anaphora resolution, syntactic parsing, verb classification and argument differentiation.

Linguistic Annotation Workshop & Interoperability with Discourse

Itisree Jena, Riyaz Ahmad Bhat, Sambhav Jain, Dipti Mishra Sharma

Linguistic Annotation Workshop, LAW, 2013

Core Rank : - Google Rank :-

@inproceedings{bib_Ling_2013, AUTHOR = {Itisree Jena and Riyaz Ahmad Bhat and Sambhav Jain and Dipti Mishra Sharma}, TITLE = {Linguistic Annotation Workshop \& Interoperability with Discourse}, BOOKTITLE = {Linguistic Annotation Workshop}, YEAR = {2013}}