1 March 2021

OpenSoNaR Tutorial a big success

CLARIAH organized a virtual OpenSoNaR Tutorial on October 9, 2020. OpenSoNaR is a web application for searching the SoNaR corpus and the CGN corpus.


SoNaR contains more than 500 million word occurrences from written Dutch sources from the Netherlands and Flanders. The Spoken Dutch Corpus (Corpus Gesproken Nederlands, CGN) is a collection of 900 hours of transcribed speech, with approximately 9 million word occurrences, from speakers from the Netherlands and Flanders.

Interest in the tutorial was overwhelming: there were more than 30 participants. The tutorial had to be held online due to the Covid-19 pandemic. This of course had major disadvantages, but also advantages. The online format made it possible for people from all over the world to take part, and we saw participants not only from the Netherlands, but also from Belgium, and even from Curaçao, Sweden, Germany, Canada and South Africa.

After an introductory presentation on OpenSoNaR by Jan Odijk (UU), the four different interfaces that OpenSoNaR offers were introduced one by one, together with the possibilities for analysis of the search results. Carole Tiberius (INT) introduced the Simple interface that allows you to search for words and sequences of words, the Extended interface that allows you to search for words based on their lemma (the form of the word you find in a dictionary) and part of speech, and the Advanced interface that provides a graphical interface for creating complex regular expression searches.

Finally, Jan Odijk introduced the Expert interface, which makes searches in the Corpus Query Processor Language possible. After each interface was introduced, the participants were given the opportunity to do exercises with it. A team of seven experts was on hand to discuss participants' questions and concerns in separate "break-out rooms", an opportunity that was put to good use.

Thank you for the informative and interesting tutorial. As a lateral-entry teacher of (corpus) linguistics, I learned a lot from it.

- Gonneke Groenen (North-West University, South Africa)

The participants were enthusiastic about the tutorial, as witnessed by the spontaneous reactions (see box) and the anonymous reactions in the evaluation survey (e.g. "It was better and more useful than I expected. I learned a lot. The examples were very useful."), but they also pointed out areas for improvement, such as allowing more time for the exercises and for breaks.

The tutorial ended with a session highlighting which other applications for corpus-based research CLARIAH offers. We plan to organize tutorials for some of those other applications soon as well. Interest in this can be reported via

Thank you very much for Friday's training. It was very interesting and I plan to use SoNaR for my dissertation.

- Nathanaël Stilmant (Université de Mons, Belgium)

About the OpenSoNaR Tutorial

The presentations, the exercises and a link to the solutions of the exercises are all available via this link. The tutorial was recorded, and the recordings are available here (in Dutch).

Link to OpenSoNaR: 

OpenSoNaR requires you to log in. You can do this with the account of your own higher education institution, if you are affiliated with one. Otherwise, you will need to apply for an account with CLARIN ERIC.

About the teachers

The teachers were Jan Odijk (Utrecht University), Jesse de Does, Katrien Depuydt, Kris Heylen, Jan Niestadt, Carole Tiberius, and Vincent Vandeghinste (all Dutch Language Institute, INT). They all have extensive experience working with OpenSoNaR and have taught it in other contexts before, and some of them have contributed to its development.