Constants Module
The conllu_tools.constants module provides reference data for Universal Dependencies
annotations, including valid tags, relations, features, and mappings between different
annotation schemes.
These constants are used internally by the validation and utility functions, but are also available for direct use in custom processing pipelines.
Relation Categories
- conllu_tools.constants.CONTENT_DEPRELS: set[str]
Set of content (non-functional) dependency relations.
Content relations attach content words that contribute semantic meaning.
CoNLL-U Token Structure
XPOS Mappings
These mappings are used for converting between morphological features and Perseus-format XPOS tags.
- conllu_tools.constants.UPOS_TO_PERSEUS: dict[str, str]
Mapping from UPOS tags to Perseus XPOS first-position codes.
UPOS_TO_PERSEUS = {
    'ADJ': 'a',
    'ADP': 'r',
    'ADV': 'd',
    'AUX': 'v',
    'CCONJ': 'c',
    'DET': 'p',
    'NOUN': 'n',
    'NUM': 'm',
    'PART': 't',
    'PRON': 'p',
    'PROPN': 'n',
    'PUNCT': 'u',
    'SCONJ': 'c',
    'VERB': 'v',
    'X': '-',
}
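As a minimal sketch of how this mapping might be used in a custom pipeline, the snippet below looks up the first-position code for a few UPOS tags. The helper name perseus_pos_code and its '-' fallback for unknown tags are assumptions made for illustration and are not part of the library; the dictionary is copied from the listing above so that the snippet runs on its own.

```python
# Copied from the listing above so the snippet is self-contained; in a real
# pipeline you would instead use:
#   from conllu_tools.constants import UPOS_TO_PERSEUS
UPOS_TO_PERSEUS = {
    'ADJ': 'a', 'ADP': 'r', 'ADV': 'd', 'AUX': 'v', 'CCONJ': 'c',
    'DET': 'p', 'NOUN': 'n', 'NUM': 'm', 'PART': 't', 'PRON': 'p',
    'PROPN': 'n', 'PUNCT': 'u', 'SCONJ': 'c', 'VERB': 'v', 'X': '-',
}


def perseus_pos_code(upos: str, default: str = '-') -> str:
    """Hypothetical helper: first-position Perseus code for a UPOS tag.

    Tags missing from the mapping fall back to `default` (an assumption).
    """
    return UPOS_TO_PERSEUS.get(upos, default)


# 'FOO' is a deliberately unknown tag used to show the fallback.
codes = [perseus_pos_code(tag) for tag in ['NOUN', 'VERB', 'AUX', 'FOO']]
print(codes)
```

Note that the mapping is many-to-one (for example, both 'VERB' and 'AUX' map to 'v'), so it cannot be inverted to recover the UPOS tag from the Perseus code alone.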
- conllu_tools.constants.FEATS_TO_XPOS: dict[tuple[str, str], tuple[int, str]]
Mapping from (feature, value) pairs to (position, character) in Perseus XPOS.
See Utils for the complete mapping table.
- conllu_tools.constants.XPOS_TO_FEATS: dict[tuple[int, str], tuple[str, str]]
Inverse mapping from (position, character) to (feature, value).
This is the inverse of FEATS_TO_XPOS.
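The relationship between the two mappings can be sketched as follows. The three entries in SAMPLE_FEATS_TO_XPOS are illustrative stand-ins chosen for this example, not the library's actual table, and the helper apply_feats is hypothetical. The sketch also assumes that positions are 1-indexed within a 9-character tag and that '-' marks an unset position.

```python
# Illustrative sample entries only (assumed for this sketch); the complete
# table is FEATS_TO_XPOS in conllu_tools.constants.
SAMPLE_FEATS_TO_XPOS = {
    ('Number', 'Sing'): (3, 's'),
    ('Gender', 'Fem'): (7, 'f'),
    ('Case', 'Nom'): (8, 'n'),
}

# Inverting a one-to-one dict: swap each key/value pair.
SAMPLE_XPOS_TO_FEATS = {value: key for key, value in SAMPLE_FEATS_TO_XPOS.items()}


def apply_feats(first_char: str, feats: dict[str, str],
                table: dict[tuple[str, str], tuple[int, str]]) -> str:
    """Hypothetical helper: fill a 9-character tag from (feature, value) pairs.

    Assumes 1-indexed positions and '-' for unset positions.
    """
    chars = [first_char] + ['-'] * 8
    for pair in feats.items():  # each item is a (feature, value) tuple
        if pair in table:
            position, char = table[pair]
            chars[position - 1] = char
    return ''.join(chars)


tag = apply_feats('n', {'Number': 'Sing', 'Gender': 'Fem', 'Case': 'Nom'},
                  SAMPLE_FEATS_TO_XPOS)
print(tag)                             # 9-character tag built from the sample table
print(SAMPLE_XPOS_TO_FEATS[(8, 'n')])  # looks up the feature for position 8, 'n'
```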
- conllu_tools.constants.VALIDITY_BY_POS: dict[int, str]
Defines which XPOS positions are valid for which POS categories.
Used by validate_xpos() to check position validity.
VALIDITY_BY_POS = {
    2: 'v',      # Position 2 (person): only valid for verbs
    3: 'nvapm',  # Position 3 (number): nouns, verbs, adj, pron, num
    4: 'v',      # Position 4 (tense): only verbs
    5: 'v',      # Position 5 (mood/verbform): only verbs
    6: 'v',      # Position 6 (voice): only verbs
    7: 'nvapm',  # Position 7 (gender): nouns, verbs, adj, pron, num
    8: 'nvapm',  # Position 8 (case): nouns, verbs, adj, pron, num
    9: 'a',      # Position 9 (degree): only adjectives
}
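To illustrate how a table like this can drive a position check, here is a minimal sketch. It is not the implementation of validate_xpos(), whose signature and behavior are not shown here; the helper invalid_positions is hypothetical. It assumes a 9-character tag whose first character is the POS code, 1-indexed positions, and '-' for unset positions.

```python
# Copied from the listing above so the snippet is self-contained.
VALIDITY_BY_POS = {
    2: 'v', 3: 'nvapm', 4: 'v', 5: 'v', 6: 'v',
    7: 'nvapm', 8: 'nvapm', 9: 'a',
}


def invalid_positions(xpos: str) -> list[int]:
    """Hypothetical helper: positions that are set but not allowed for the POS.

    Assumes a 9-character tag, with the POS code in position 1 and '-'
    marking an unset position.
    """
    if len(xpos) != 9:
        raise ValueError(f'expected a 9-character tag, got {len(xpos)}')
    pos_char = xpos[0]
    bad = []
    for position, allowed in VALIDITY_BY_POS.items():
        char = xpos[position - 1]  # convert 1-indexed position to 0-indexed
        if char != '-' and pos_char not in allowed:
            bad.append(position)
    return bad


ok = invalid_positions('n-s---fn-')   # noun with number, gender, case set
bad = invalid_positions('n1s---fnc')  # noun with person and degree also set
print(ok, bad)
```

A checker built this way only reports whether a position may be filled for a given POS; it does not verify that the character in that position is itself a known value.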
Treebank Concordances
These dictionaries define mappings for converting XPOS formats from specific Latin treebanks to the Perseus standard format.