Sunday, June 8, 2014

Compiling corpora

So there's this German news corpus, retrieved online between 1996 and 2000, that I intend to use for some of my NLP work, and it occurred to me that I could build a similar corpus (well, the monolingual side of it, anyway) by doing my own periodic retrievals.

To that end, here are the RSS feed pages for the Süddeutsche Zeitung, the Népszabadság Online, and the Népszava (published in New York for Hungarian-Americans).
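For what it's worth, here's a rough sketch of what one of those periodic retrievals might look like in Python, using only the standard library. It assumes the feeds are plain RSS 2.0 (with `<item>` elements carrying `<title>`, `<link>`, and `<description>`), which I haven't checked for each paper. The function names and the JSON-lines store are my own inventions for illustration, and a feed only gives headlines and summaries, so fetching the full article text would be a separate step.

```python
import json
import urllib.request
import xml.etree.ElementTree as ET
from pathlib import Path


def parse_rss_items(xml_data):
    """Extract title, link, and description from each <item> of an RSS 2.0 feed.

    Accepts str or bytes; with bytes, the parser honors the encoding
    declared in the XML header.
    """
    root = ET.fromstring(xml_data)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "description": (item.findtext("description") or "").strip(),
        })
    return items


def append_new_items(items, store_path):
    """Append items whose link is not yet in the JSON-lines store.

    Returns the number of items added, so repeated retrievals of the
    same feed do not duplicate entries.
    """
    store = Path(store_path)
    seen = set()
    if store.exists():
        with store.open(encoding="utf-8") as f:
            seen = {json.loads(line)["link"] for line in f if line.strip()}
    added = 0
    with store.open("a", encoding="utf-8") as f:
        for item in items:
            if item["link"] and item["link"] not in seen:
                f.write(json.dumps(item, ensure_ascii=False) + "\n")
                seen.add(item["link"])
                added += 1
    return added


def retrieve(feed_url, store_path):
    """Fetch one feed (feed_url is whatever URL you supply) and store new items."""
    with urllib.request.urlopen(feed_url, timeout=30) as resp:
        xml_data = resp.read()  # keep bytes so the declared encoding is respected
    return append_new_items(parse_rss_items(xml_data), store_path)
```

Calling `retrieve(url, "corpus.jsonl")` once per feed from a daily cron job (or similar scheduler) would be one way to do the "periodic" part; the actual feed URLs would have to be filled in by hand.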
