Abstract
“Natural language processing is the “big data” problem for the future of computing. While a huge amount of work has been done in this area, which we call ‘modeling and parameterizing computer systems,’ a fundamentally different approach is needed. We want to deploy virtual languages, and human computational agents. We have no formal computational theory that describes the problems that will be addressed by these. Most existing work is
preliminary, although some work has been done and some actual software exists.” Can you guess the author who penned the above quote? You will be surprised to learn that it is not a famous celebrity. No, it is not even an overworked computer scientist. In fact, the entire quote was generated automatically by OpenAI’s latest language model, GPT-2, simply by supplying the short prompt “Natural Language Processing is”. Surprising, isn’t it? Indeed, that is how powerful language models can be!
Language models are at the heart of Natural Language Processing (NLP). They help in extracting regularities from natural language, which aids in solving problems in applications such as automatic speech recognition, document classification, sentiment analysis, summarization, et cetera. With the advent of deep learning techniques, aided by large-scale and efficient computational resources, we seek to evaluate various applications of fine-tuning language models that can guide us to a better and more nuanced understanding of natural language. In this thesis, I attempt to tackle some original problems that language models can help solve. Primary among these is the style transfer task, wherein a piece of text is to be rewritten stylistically in the fashion of a given target author. Due to the unavailability of parallel data for this task, it is not possible to leverage supervised systems such as
sequence-to-sequence models, which rely entirely on paired instances of training data. This is where we bring in the concept of language-model fine-tuning, through which we build a novel stylistic language model, StyleLM. StyleLM can be used to rewrite any given piece of text into the style of a target author, and is also scalable to multiple authors. We demonstrate the ability of StyleLM to rewrite text into the style of a given target author across three different domains: online reviews, fiction books, and encyclopaedic pages. Our evaluation framework shows that StyleLM achieves high alignment with the target authors across all three levels (surface, lexical, and syntactic) of the stylistic spectrum. At the same time, StyleLM achieves performance comparable to the supervised baseline in both qualitative and quantitative comparisons. We also seek to evaluate the power of language models on content understanding tasks, both at
the syntactic and semantic levels of linguistic discourse. For syntactic understanding, we dive into search queries and evaluate whether pretraining and fine-tuning large language models can effectively discern well-formed search queries from ill-formed ones. To understand the semantic spectrum, we use data from happy moments posted by users online to determine whether we can detect the agency (is the author in control?) and social (is anyone other than the author involved?) characteristics of these happy moments. Across both tasks, we achieve strong performance: the system for understanding happy moments was the runner-up at the shared task of an affective content analysis workshop held at AAAI-19. We also outperform the state-of-the-art on the syntactic discernment of search queries.