CLARIN to IMDI, DATE: 2009-08-13
Prague Dependency Treebank 2.0 (PDT 2.0)
The Prague Dependency Treebank 2.0 (PDT 2.0) contains a large amount of Czech text with complex and interlinked morphological (2 million words), syntactic (1.5 million words), and complex semantic (0.8 million words) annotation; in addition, certain properties of sentence information structure and coreference relations are annotated at the semantic level.
ISO639-3:ces
Czech
Unknown
Unknown
Unknown
Czech Republic
Treebank
PML (XML format)
2000/2006
Charles University in Prague, Institute of Formal and Applied Linguistics
Jan Hajic
http://ufal.mff.cuni.cz/pdt2.0/
http://ufal.mff.cuni.cz/pdt2.0/doc/pdt-guide/en/html/ch03.html
http://ufal.mff.cuni.cz/pdt2.0/doc/pdt-guide/en/html/ch06.html
1122
2217
http://ufal.mff.cuni.cz/pdt2.0/doc/pdt-guide/en/html/ch07.html
There are two ways to get PDT 2.0. The standard way is to order the full PDT 2.0 distribution through the Linguistic Data Consortium at http://www.ldc.upenn.edu; during the ordering process, you will be redirected to the form-based License web page, which you must fill in to complete your order.
The other option is to browse parts of PDT 2.0 directly on our web pages at http://ufal.mff.cuni.cz/pdt2.0; the site is an exact copy of the distribution provided by LDC, except that only a small sample of the annotated data is included. You can browse before or after filling in the registration form based on the License at http://ufal.mff.cuni.cz/corp-lic/pdt20-reg.html, but you are not allowed to use anything that you have downloaded (tools, sample data, etc.) without filling in the form. In other words, the license is not valid until you register.