CLARIN to IMDI, DATE: 2009-08-13
Balkan-English corpus
3 000 000 tokens per language, aligned bilingually. Annotation formats: alignment in TMX; structural markup in XCES; morphosyntactic annotation (Bulgarian only) in XCES, using the MTE tagset. Deadline for producing the full corpus, including the other Balkan languages (Macedonian, Turkish, Serbian, Romanian, Croatian): June 2007.
ISO639-3:bul
Bulgarian
Unknown
Unknown
Unknown
ISO639-3:eng
English
Unknown
Unknown
Unknown
Unknown
Greek
Unknown
Unknown
Unknown
Bulgaria
Aligned Corpus
Written Corpus
Elena Paskaleva (Stelios Piperidis for the Greek material)
413
1046