JOS corpora of Slovene
- Host: Jožef Stefan Institute
- Other institutions involved: Faculty of Arts, University of Ljubljana
- URL: http://nl.ijs.si/jos/index-en.html
- Main language: Slovene
General description: The JOS project developed annotated corpora of Slovene and associated resources to facilitate the development of Human Language Technologies for the Slovene language. The main results are the JOS morphosyntactic specifications (the tagset definition), two annotated corpora, and two Web services. The resources are available under Creative Commons licences.
Implementation description: The corpora and morphosyntactic specifications are encoded in TEI P5 using the additional modules for corpora, linking, analysis and iso-fs, plus a few local extensions.
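As a rough illustration of what reading such TEI P5 data can look like, the following Python sketch extracts word tokens with their lemmas and morphosyntactic tags from a small inline sample. The sample document, the use of the lemma and ana attributes on w elements, the tag values, and the function name read_tokens are all assumptions made for this example; they are not copied from the JOS corpora, whose local extensions may encode this information differently.

```python
import xml.etree.ElementTree as ET

# The TEI P5 namespace; every TEI element lives in it.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample in generic TEI P5 style. The attribute layout and
# tag values are illustrative assumptions, not actual JOS corpus content.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body><p>
    <s xml:id="s1">
      <w xml:id="s1.t1" lemma="ta" ana="#Pd-nsn">To</w>
      <w xml:id="s1.t2" lemma="biti" ana="#Va-r3s-n">je</w>
      <w xml:id="s1.t3" lemma="korpus" ana="#Ncmsn">korpus</w>
      <pc xml:id="s1.t4">.</pc>
    </s>
  </p></body></text>
</TEI>"""


def read_tokens(tei_xml):
    """Return (form, lemma, tag) triples for every <w> inside an <s>."""
    root = ET.fromstring(tei_xml)
    rows = []
    for sentence in root.iterfind(".//tei:s", TEI_NS):
        for word in sentence.iterfind("tei:w", TEI_NS):
            # Strip the leading '#' of the pointer to the tag definition.
            tag = word.get("ana", "").lstrip("#")
            rows.append((word.text, word.get("lemma"), tag))
    return rows


if __name__ == "__main__":
    for form, lemma, tag in read_tokens(SAMPLE):
        print(form, lemma, tag, sep="\t")
```

Punctuation (pc elements) is skipped here; a real reader would first check the distributed files for the element and attribute names actually used.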
Related resources: Links to papers describing the corpora are given at http://nl.ijs.si/jos/index-en.html#bib
Copyright information: The corpora are distributed under the Creative Commons Attribution-NonCommercial licence.
Contact:
Tomaž Erjavec
Department of Knowledge Technologies
Jožef Stefan Institute
Jamova cesta 39
1000 Ljubljana
Slovenia
Email: tomaz.erjavec@ijs.si