ACAD: Automatic Coherence Analysis of Dutch
The goal of ACAD was to develop an environment in which computationally naive discourse analysts can carry out an automatic analysis of causal coherence in discourse.
The research question of the project is: To what extent do findings on subjectivity from small-scale causal coherence analyses in different genres hold for large datasets?
Dutch coherence markers such as want and omdat (both meaning "because") differ in their degree of subjectivity. A discourse analyst wants to investigate the environment of these markers, to see whether the environment of a subjective connective like want contains more subjective words than that of relatively objective connectives like omdat and doordat. The ACAD tool allows the researcher to search through a large number of corpora: some already available in CLARIAH, like SoNaR, and some newly added, like a corpus of Dutch WhatsApp messages. The core of the project is a search interface, CESAR (Corpus Editor for Syntactically Annotated Resources). CESAR allows the user to formulate advanced search queries without advanced programming skills. It makes use of the annotations available in the corpora (POS tagging, lemmatization, grammatical parse) and offers many options to control the output. In principle, the search interface can be extended to other languages and other types of research questions.
DIGIFIL: Digital Film Listings
DIGIFIL aims to digitise the Dutch Filmladders and contextual information about the wider movie landscape as reported in historical newspape...
NEWSGAC: News Genres Transparent Automatic Genre Classification
How genres in newspapers and television news can be detected automatically using machine learning in a transparent manner, to capture the sh...
HUMIGEC: Human capital, immigration and the early modern Dutch economy
What was the contribution of migrant workers to the 18th-century Dutch economy? We reconstructed the careers of native and migrant sailors w...