ACAD: Automatic Coherence Analysis of Dutch

The goal of ACAD was to develop an environment in which computationally naive discourse analysts can carry out an automatic analysis of causal coherence in discourse.

  • Wilbert Spooren
  • Linguistics
Resource types
  • Tools

The research question of the project is: to what extent do findings on subjectivity from small-scale analyses of causal coherence in different genres hold for large datasets?

Grammatically parsed text message with the Dutch coherence marker ‘Want’.

Coherence markers such as ‘want’ and ‘omdat’ differ in their degree of subjectivity. A discourse analyst wants to investigate the environment of these markers, to see whether the environment of subjective connectives like ‘want’ contains more subjective words than that of relatively objective connectives like ‘omdat’ and ‘doordat’. The ACAD tool allows the researcher to search a large number of corpora, some already available in CLARIAH (such as SoNaR) and some newly added (such as a corpus of Dutch WhatsApp messages). The core of the project is a search interface, CESAR (Corpus Editor for Syntactically Annotated Resources), which lets the user formulate advanced search queries without advanced programming skills. CESAR makes use of the annotations available in the corpora (POS tagging, lemmatization, grammatical parse) and offers many options to control the output. In principle, the search interface can be extended to other languages and other types of research questions.
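To make the kind of comparison described above concrete, here is a minimal Python sketch. It is not CESAR's query mechanism or the project's actual method. It assumes plain whitespace-tokenized sentences rather than parsed corpora, and both the word list `SUBJECTIVE_WORDS` and the function `subjectivity_by_connective` are hypothetical names introduced only for illustration. For each connective it computes the share of subjective words within a fixed window of neighbouring tokens.

```python
from collections import defaultdict

# Hypothetical toy lexicon of subjective Dutch words (an assumption for
# illustration; not a resource from the ACAD project).
SUBJECTIVE_WORDS = {"vind", "denk", "mooi", "leuk", "stom"}

# The connectives named in the text.
CONNECTIVES = {"want", "omdat", "doordat"}


def subjectivity_by_connective(sentences, window=5):
    """Return, per connective, the share of subjective words among the
    tokens within `window` positions to its left and right."""
    counts = defaultdict(lambda: [0, 0])  # connective -> [subjective, total]
    for sentence in sentences:
        tokens = [t.lower() for t in sentence.split()]
        for i, token in enumerate(tokens):
            if token not in CONNECTIVES:
                continue
            start = max(0, i - window)
            # Context excludes the connective itself.
            context = tokens[start:i] + tokens[i + 1:i + 1 + window]
            counts[token][0] += sum(1 for w in context if w in SUBJECTIVE_WORDS)
            counts[token][1] += len(context)
    return {c: subj / total for c, (subj, total) in counts.items() if total}


# Two invented example sentences, not taken from any corpus.
toy_corpus = [
    "ik blijf thuis want ik vind het weer stom",
    "de weg is nat omdat het heeft geregend",
]
ratios = subjectivity_by_connective(toy_corpus)
print(ratios)
```

A real analysis would draw on the lemma and POS annotations in the corpora instead of raw word forms, and would need a validated subjectivity lexicon and far more data before any difference between connectives could be interpreted.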