CoNLL-U Tools Documentation


CoNLL-U Tools is a Python toolkit for working with CoNLL-U files, Universal Dependencies treebanks, and other annotated corpora. It provides utilities for format conversion, validation, evaluation, pattern matching, and morphological normalization, and supports both the CoNLL-U and brat standoff formats.

Features

  • Format Conversion: Bidirectional conversion between brat standoff and CoNLL-U formats

  • Validation: Check CoNLL-U files for format compliance and annotation guideline adherence

  • Evaluation: Score parser outputs against gold-standard files with comprehensive metrics

  • Pattern Matching: Find tokens and sentences matching complex linguistic criteria

  • Morphological Utilities: Normalize features and convert between tagsets (Perseus, ITTB, PROIEL, LLCT)

  • Extensible: Add custom tagset converters and feature mappings
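
To make the data these features operate on concrete, here is a minimal sketch of reading one CoNLL-U sentence and matching tokens by part of speech and morphological feature. It uses only the Python standard library and does not use or reflect the CoNLL-U Tools API; the names `Token`, `parse_feats`, `parse_sentence`, and `find_tokens` are hypothetical, and the sample annotation is illustrative rather than taken from a treebank.

```python
from typing import NamedTuple

# Hypothetical helper type (not part of CoNLL-U Tools): one field per
# column of the 10-column CoNLL-U token line.
class Token(NamedTuple):
    id: str
    form: str
    lemma: str
    upos: str
    xpos: str
    feats: str
    head: str
    deprel: str
    deps: str
    misc: str


def parse_feats(feats: str) -> dict:
    """Turn 'Case=Nom|Number=Sing' into a dict; '_' means no features."""
    if feats == "_":
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def parse_sentence(block: str) -> list:
    """Parse the token lines of one sentence, skipping '#' comment lines."""
    tokens = []
    for line in block.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 tab-separated columns, got {len(cols)}")
        tokens.append(Token(*cols))
    return tokens


def find_tokens(tokens, upos=None, **feats):
    """Return tokens whose UPOS and features match all given criteria."""
    matches = []
    for tok in tokens:
        if upos is not None and tok.upos != upos:
            continue
        tok_feats = parse_feats(tok.feats)
        if all(tok_feats.get(name) == value for name, value in feats.items()):
            matches.append(tok)
    return matches


# Illustrative sample sentence (annotation is an assumption for this sketch).
SAMPLE = "\n".join([
    "# text = Gallia est divisa",
    "1\tGallia\tGallia\tPROPN\t_\tCase=Nom|Gender=Fem|Number=Sing\t3\tnsubj:pass\t_\t_",
    "2\test\tsum\tAUX\t_\tMood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin\t3\taux:pass\t_\t_",
    "3\tdivisa\tdivido\tVERB\t_\tCase=Nom|Gender=Fem|Number=Sing|VerbForm=Part\t0\troot\t_\t_",
])

sentence = parse_sentence(SAMPLE)
nominatives = [tok.form for tok in find_tokens(sentence, Case="Nom")]
proper_nouns = [tok.form for tok in find_tokens(sentence, upos="PROPN", Case="Nom")]
```

Here `nominatives` collects every token carrying `Case=Nom`, while `proper_nouns` narrows the match to tokens that are also tagged `PROPN`. Consult the toolkit's own API reference for its actual parsing and pattern-matching interfaces.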

Indices and tables