Education
2015-2019 PhD, University of Groningen. Supervised by Gertjan van Noord and Malvina Nissim: Normalization and Parsing Algorithms for Uncertain Input
2014-2015 Information Science Master, University of Groningen. Supervised by Johan Bos and Gosse Bouma: Automatic estimation of semantic relatedness for sentences using machine learning.
2010-2014 Information Science Bachelor, University of Groningen. Supervised by Johan Bos: Een automatisch recommender system voor Nederlandstalig nieuws op basis van Twitter.
Other academic jobs
- 2021-2024 Assistant Professor at the IT University of Copenhagen
- 2019-2021 Postdoc at the IT University of Copenhagen
- 2018-2019 Part-time teacher at the University of Groningen
- 2013-2015 Java programmer for the incpar project
- 2014 Student assistant for a web-programming course
Real jobs
- Aant Betsema Azn (Bookkeeper, website, programmer, IT)
- Jumbo de Greiden (Supermarket): 4 years!
- Friesland Campina (Cheese Factory)
- Wieger Ketellapper (Cookie Factory)
Prizes
- Best paper award WNUT 2022
- Outstanding paper award EACL 2021
- Outstanding reviewer ACL 2021
- Participant in the Guinness record for most people (361) walking 1,000 meters barefoot on ice (2013)
Reviewer
- AACL 2022
- EMNLP 2022
- CoNLL 2022 (Area Chair)
- ARR 2022
- CoNLL 2021 (Area Chair)
- EMNLP 2021
- WNUT 2021
- ARR 2021
- ACL 2021
- NoDaLiDa 2021
- NAACL 2021
- EACL 2021
- PEOPLES 2020
- COLING 2020
- WNUT 2020
- EMNLP 2020
- AACL-IJCNLP 2020
- ACL 2020
- LREC 2020
- CLIN Journal 2020
- NoDaLiDa 2019
- EMNLP 2019
- SemEval 2019
- ACL 2019
- NAACL 2019
- EMNLP 2018
- WNUT 2018
- SemEval 2015
Supervisor
Master Theses (2019):
- Cross-domain Dialogue Act Classification for Social Media Data
- A two step training approach for semi-supervised language identification in code-switched data
- Kelly Dekker: Distant Supervision for Lexical Normalization: Methods to Automatically Generate Training Data.
- Ian Matroos: Distant Supervision for Lexical Normalization.
- Wessel Reijngoud: Automatic Classification of Normalisation Replacement Categories: Within Corpus and Cross Corpus.
- Youri Schuur: Normalization for Dutch for improved POS tagging [bib | pdf]
Bachelor theses:
- A picture says more than a thousand words - or does it?
- Literal Japanese to English Translation (2020)
- Automatic code-switching detection in Dutch-Frisian language (2018)
- Objectivity in Dutch newspapers’ headlines; What is the level of objectivity and subjectivity in Dutch newspapers’ headlines? (2018)
- Predicting Age using Code-switching (2018)
- Semi-supervised language identification in code-switched data: K-means clustering versus a probabilistic approach (2018)
- The Effectiveness of a Semi-Supervised Approach to Automatic Code Switch Detection (2018)
Other supervised projects that led to a publication:
- Martijn Bartelds and Wietse de Vries. Improving Cross-domain Authorship Attribution by Combining Lexical and Syntactic Features Notebook for PAN at CLEF 2019.
- Bastian Bergsma, Arjan van Eerden and Patrick Plattje. Bhojpuri PoS-tagger: a bi-LSTM tagger with Proper Noun Information (2019).
- Rianne Bos, Kelly Dekker and Harmjan Setz. Rob’s Angels: Embedding and Clustering for Cross-Genre Gender Prediction
- Lennart Faber, Ian Matroos, Leon Melein and Wessel Reijngoud. wUGs: Co-Training vs. Simple SVM Comparing Two Approaches for Cross-Genre Gender Prediction
- Aria Nourbakhsh, Frida Vermeer and Gijs Wiltvank. sthruggle at SemEval-2019 Task 5: An Ensemble Approach to Hate Speech Detection.
- Mike Zhang, Roy David, Leon Graumans and Gerben Timmerman. Grunn2019 at SemEval-2019 Task 5: Shared Task on Multilingual Detection of Hate
- Remko Boschker. CLIN 27 Shared Task - Translating Historical Dutch Text
Presentations
Below I list all presentations which were not the result of a paper.
- Invited talk at the University of Groningen (26-06-2020)
- Interview de Taalstaat (Dutch): link
- ROBustness workshop: Analysis of Effect of Normalization Types on Dependency Parsing & 100% Independent Universal Dependencies Annotation [slides]
- CMC and Social Media Corpora, Antwerpen (2018): Lexical Normalization for Dutch Social Media Texts [abstract | poster]
- Beroependag ASCI (2018): Hoe is het om te promoveren in computationele taalkunde? [slides]
- CLIN28: Lexical Normalization for Neural Network Parsing [abstract | slides | code]
- CLS Language and Social Media day, Nijmegen (2017): Parser Adaptation for Social Media by Integrating Normalization [abstract | poster]
- Reading Group presentation 3-3-2017: Parser Adaptation for Social Media by Integrating Normalization [slides]
- CLIN27: Towards Domain Adaptation for Dutch Social Media Text Through Normalization [abstract | paper | slides | code]
- CLIN26: GroRef: Rule-Based Coreference Resolution for Dutch [abstract | slides | code]
- Kroegcollege ASCI (2015): Het parsen van tweets [slides]
- CLIN25: A New Automatic Spelling Correction Model Aimed at Improving Parsability [abstract | slides | code]
Teaching
- Advanced NLP (2022)
- 2nd year project in NLP and deep learning (2021) (together with Barbara Plank)
- Supervising research project of three students (2020)
- Supervised one bachelor thesis (2020)
- 2nd year project (2020) (together with Barbara Plank)
- Computationele Grammatica (2018-2019)
- Shared Task course (2018-2019)
- First supervisor for 5 Bachelor theses (2018-2019)
- First supervisor for 3 Master theses (2018-2019)
- Organizing the Reading Group of Informatiekunde, which includes an MA course that helps students prepare for their master's thesis (2015-2018)
- Computationele Grammatica (2015-2016)
- Natuurlijke taalverwerking 1 (2014-2015) (together with Malvina Nissim)
Organizer
Grants
- Extension of PhD project for 8 months, from the Nuance Foundation
- Grant for hosting a visitor (Ahmet Ustun)