SERPENS: SEaRch PEst and Nuisance Species
Contextual search and analysis of pest and nuisance species through time in the KB newspaper collection, particularly focused on the perception of Mustelid species like polecats, martens and stoats.
Historical newspapers are a fascinating source of information for historical ecologists studying interactions between humans and animals through time and space. Digitized newspaper archives are particularly interesting to analyze because of their breadth, depth and easy access. However, the size and occasional noisiness of such archives also bring difficulties, as manual analysis remains cumbersome and laborious.
In SERPENS, we experimented with automating query expansion and the categorization of newspaper articles that reflect the perception of alleged pest and nuisance animal species, using digitized newspapers from a subset of the KB newspaper collection (1800-1940). We focused in particular on the perception of mustelid species like polecats, martens and stoats. For animal taxonomy we made use of ATHENA; for query expansion we used lexicons; for categorization of newspaper articles we trained a Support Vector Machine model.
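As an illustration of what lexicon-based query expansion can look like, here is a minimal sketch in Python. It is not the SERPENS implementation: the `LEXICON` entries, the function names `expand_query` and `to_or_query`, and the generic quoted OR syntax are all assumptions made for this example, and the Dutch terms are illustrative placeholders rather than the project's actual lexicon.

```python
# Hypothetical lexicon: each species label maps to illustrative search terms.
# These entries are placeholders for the example, not the SERPENS lexicon.
LEXICON = {
    "polecat": ["bunzing", "bunsing"],
    "marten": ["marter", "steenmarter", "boommarter"],
    "stoat": ["hermelijn"],
}


def expand_query(species, lexicon=LEXICON):
    """Return the deduplicated, sorted search terms for the given species.

    A species that is missing from the lexicon falls back to its own
    lowercased name, so the query never silently drops a search target.
    """
    terms = set()
    for name in species:
        key = name.lower()
        terms.update(lexicon.get(key, [key]))
    return sorted(terms)


def to_or_query(terms):
    """Join terms into a generic quoted OR query string.

    The syntax is an assumption; a real archive interface may differ.
    """
    return " OR ".join(f'"{term}"' for term in terms)


if __name__ == "__main__":
    print(to_or_query(expand_query(["polecat", "stoat"])))
```

Keeping the lexicon as plain data separate from the expansion logic means that spelling variants can be added without touching the code, which matters for a noisy historical corpus.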
Our results indicate that, with a rather limited number of training examples, we can distinguish newspaper articles that are about animal species from those that are not (~92% accuracy), and can distinguish between subcategories of such articles (e.g., articles about material damage caused by pest species, non-material damage, pest control and hunting; ~84% accuracy). Automated procedures like this can greatly enhance the usability of large digitized collections, not only for historical ecology but also for other fields in the natural sciences and humanities.
Project info
Researchers
Assistant Professor of History of Ecology, Radboud University