A corpus of 2,400 tweets, collected for research into the marking of Dutch kinship terms. We searched Twitter for 24 Dutch kinship terms and selected the first 100 positive hits for each term. A hit was considered positive when it included the kinship term and the term was pre-modified by (1) a possessive pronoun, (2) a definite article, or (3) a zero marker. A hit was excluded when the kinship term was used with a different meaning or when it was followed by a post-modifier. All tweets are provided in .txt format.
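The three pre-modifier categories can be illustrated with a rough sketch. This is not the procedure used to build the corpus; the word lists, the function name `classify_premodifier`, and the label strings are all assumptions made for illustration. The sketch looks only at the word directly before the kinship term, so it cannot detect post-modifiers or a different meaning of the term, and its "zero" label is a crude fallback that still needs manual checking.

```python
import re

# Assumed word lists for illustration; the corpus annotation may use others.
# Note that "je", "zijn" and "haar" are ambiguous in Dutch, so hits need review.
POSSESSIVES = {"mijn", "m'n", "mn", "je", "jouw", "zijn", "z'n",
               "haar", "ons", "onze", "jullie", "hun", "uw"}
DEFINITE_ARTICLES = {"de", "het"}
INDEFINITE_ARTICLES = {"een"}


def classify_premodifier(tweet, term):
    """Label the word directly before the first occurrence of `term`.

    Returns "possessive", "definite", "zero", "other", or None when the
    term does not occur in the tweet at all.
    """
    tokens = re.findall(r"[\w']+", tweet.lower())
    if term not in tokens:
        return None
    index = tokens.index(term)
    previous = tokens[index - 1] if index > 0 else ""
    if previous in POSSESSIVES:
        return "possessive"
    if previous in DEFINITE_ARTICLES:
        return "definite"
    if previous in INDEFINITE_ARTICLES:
        return "other"  # pre-modified, but not one of the three categories
    # Fallback: no recognised pre-modifier; adjectives are not detected here.
    return "zero"


if __name__ == "__main__":
    for text in ["Mijn moeder is jarig", "De vader komt morgen", "Moeder belt zo"]:
        term = "moeder" if "oeder" in text else "vader"
        print(text, "->", classify_premodifier(text, term))
```

A sketch like this could at most pre-sort candidate hits; deciding whether a hit is truly positive still requires reading the tweet.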
Along with the corpus itself, we include an Excel file with all annotations, including the classification of the kinship terms, the pre-modifiers, and data on the gender of the authors.