Abstract
Research in NLP has seen increasing attention toward narrative understanding over the past decade.
Narratives have a broad range of applications in domains such as economics, political science,
and literature. Understanding narratives is critical to performing well on discourse-level tasks such as
summarization, question answering, and multi-hop reasoning.
In this thesis, we explore a framework for understanding narratives. Narratives are broken down into
two fundamental parts: events and characters. The task of understanding narratives is then posed as the
task of understanding the interplay and relations between these two constituents. We focus on two major
relations: how characters are related to other characters, i.e. character-character relations, and how
events are related to other events, i.e. event-event relations.
To model character-character relations, we use the concept of character arcs, a popular literary device
that shows how a character changes over time. We build MARCUS, an automated pipeline
to generate and visualize these character arcs and character relations given a novel. We take two
famous literary works, “Harry Potter” and “The Lord of the Rings”, and analyze the character relations
generated by MARCUS. We evaluate the quality of these arcs and relations through both quantitative
and qualitative methods and show the effectiveness of the arcs created through MARCUS.
For event-event relations, we focus on the task of identifying temporal relations between event
pairs in a narrative. Narratives, by their very nature, are discourse-level phenomena. Yet most
of the current work on identifying event temporal relations focuses on local event pairs, i.e. event
pairs found close together, typically within adjacent sentences. We thus build DELTA, a discourse-level
event temporal relation dataset, to facilitate document-level event timeline generation. In DELTA,
we introduce the concept of multiple timelines, distinguishing between the real timeline, in which
events actually occurred, and hypothetical timelines, whose events may not have actually
happened. We also develop a new user-friendly annotation tool that not only makes timeline
annotation more efficient but also helps visualize and understand the resulting timeline. We
train strong baseline models based on RoBERTa to predict discourse-level event temporal relations. In
addition, we qualitatively analyze the timelines generated from our dataset and compare them
against the timelines generated from existing datasets.