Evaluation Module
The conllu_tools.evaluation module provides tools for evaluating annotations in CoNLL-U format,
including computing precision, recall, and F1 scores for various annotation layers.
The evaluation framework is based on the official CoNLL shared task evaluation scripts
and supports all standard UD evaluation metrics.
Main Classes
Evaluator
class conllu_tools.evaluation.evaluator.ConlluEvaluator(*, eval_deprels=True, treebank_type='0')
Bases: WordProcessingMixin, TreeValidationMixin
Evaluator for Universal Dependencies CoNLL-U files.
__init__(*, eval_deprels=True, treebank_type='0')
Initialize the evaluator.
- Parameters:
eval_deprels (bool) – Whether to evaluate dependency relations
treebank_type (str) – String of digits indicating which enhancement types to disable (e.g., '12' disables types 1 and 2)
evaluate_files(gold_path, system_path)
Evaluate system file against gold file.
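- Parameters:
gold_path – Path to the gold-standard CoNLL-U file
system_path – Path to the system output CoNLL-U file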
- Return type:
dict[str, Score]
- Returns:
Dictionary of metric names to Score objects
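A minimal usage sketch built only from the signatures documented above. The `format_scores` and `evaluate_and_report` helpers are hypothetical names introduced for illustration, and the percentage layout is an arbitrary choice. The metric names in the returned dictionary are not listed in this reference, so the sketch iterates over whatever keys come back. It also assumes plain string paths are accepted, which this page does not confirm.

```python
def format_scores(scores):
    """Render a metric -> Score mapping as aligned text rows.

    Only the documented `precision`, `recall`, and `f1` properties are read.
    """
    rows = [f"{'Metric':<12}{'P':>8}{'R':>8}{'F1':>8}"]
    for name in sorted(scores):
        score = scores[name]
        rows.append(
            f"{name:<12}"
            f"{100 * score.precision:>8.2f}"
            f"{100 * score.recall:>8.2f}"
            f"{100 * score.f1:>8.2f}"
        )
    return "\n".join(rows)


def evaluate_and_report(gold_path, system_path):
    """Evaluate two CoNLL-U files and return a printable report."""
    # Imported lazily so format_scores stays usable without the package.
    from conllu_tools.evaluation.evaluator import ConlluEvaluator

    evaluator = ConlluEvaluator(eval_deprels=True, treebank_type="0")
    scores = evaluator.evaluate_files(gold_path, system_path)
    return format_scores(scores)
```

With the package installed, `print(evaluate_and_report("gold.conllu", "system.conllu"))` would print one row per metric; both file names are placeholders.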
Score
class conllu_tools.evaluation.base.Score(gold_total, system_total, correct, aligned_total=None)
Bases: object
Represents evaluation scores for a particular metric.
gold_total: int | None
system_total: int | None
correct: int | None
aligned_total: int | None = None
property precision: float
Calculate precision.
property recall: float
Calculate recall.
property f1: float
Calculate F1 score.
property aligned_accuracy: float | None
Calculate aligned accuracy.
__init__(gold_total, system_total, correct, aligned_total=None)
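The property descriptions above do not spell out the formulas. The sketch below shows how the four values are conventionally derived from the counts in CoNLL-style evaluation. `ScoreSketch` is a hypothetical stand-in, not the library's `Score` class. The zero-denominator handling and the use of plain `int` fields (the real fields also allow `None`) are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScoreSketch:
    """Stand-in for Score; NOT the library's implementation."""

    gold_total: int
    system_total: int
    correct: int
    aligned_total: Optional[int] = None

    @property
    def precision(self) -> float:
        # Share of system items that are correct; 0.0 if nothing was predicted.
        return self.correct / self.system_total if self.system_total else 0.0

    @property
    def recall(self) -> float:
        # Share of gold items that were recovered; 0.0 if the gold set is empty.
        return self.correct / self.gold_total if self.gold_total else 0.0

    @property
    def f1(self) -> float:
        # Harmonic mean of precision and recall, written in terms of counts.
        denominator = self.system_total + self.gold_total
        return 2 * self.correct / denominator if denominator else 0.0

    @property
    def aligned_accuracy(self) -> Optional[float]:
        # Only defined for metrics that are measured over aligned words.
        if self.aligned_total is None:
            return None
        return self.correct / self.aligned_total if self.aligned_total else 0.0
```

For example, `ScoreSketch(gold_total=10, system_total=8, correct=6)` gives a precision of 0.75 and a recall of 0.6.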
Supporting Classes
These classes are used internally by the evaluator but may be useful for advanced use cases.
UDWord
class conllu_tools.evaluation.base.UDWord(span, token, is_multiword, enhanced_deps=None, functional_children=None)
Bases: object
Represents a word with its span and CoNLL-U token.
span: UDSpan
token: Token
is_multiword: bool
enhanced_deps: list[tuple[int | UDWord, list[str]]] | None = None
functional_children: list[UDWord] | None = None
__hash__()
Make UDWord hashable for use in dictionaries.
- Return type:
int
__init__(span, token, is_multiword, enhanced_deps=None, functional_children=None)
UDSpan
class conllu_tools.evaluation.base.UDSpan(start, end)
Bases: object
Represents a span (start and end position) in the character array.
start: int
end: int
__init__(start, end)
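To make "span in the character array" concrete: in the CoNLL 2018 evaluation script, word forms are concatenated with whitespace removed, and each word is assigned the half-open range of characters it occupies. The sketch below assumes this module follows the same convention, which this page does not state. `SpanSketch` and `build_spans` are hypothetical names, not part of the library.

```python
from typing import NamedTuple


class SpanSketch(NamedTuple):
    start: int  # index of the word's first character
    end: int    # one past the word's last character


def build_spans(forms):
    """Assign each word form its character range in the concatenated text."""
    spans = []
    offset = 0
    for form in forms:
        # Whitespace inside a form is dropped, so it never occupies a position.
        length = len("".join(form.split()))
        spans.append(SpanSketch(offset, offset + length))
        offset += length
    return spans
```

Because whitespace is ignored, gold and system files that tokenize the same text differently still index into the same character sequence, which is what makes their spans comparable.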
Alignment
class conllu_tools.evaluation.base.Alignment(gold_words, system_words)
Bases: object
Represents the alignment between gold and system words.
__init__(gold_words, system_words)
Initialize alignment.
- Parameters:
gold_words – Words from the gold-standard file
system_words – Words from the system output file
append_aligned_words(gold_word, system_word)
Add an aligned word pair.
- Parameters:
gold_word – Gold-standard word of the aligned pair
system_word – System word of the aligned pair
- Return type:
None
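This reference does not describe how aligned pairs are found. As a simplified illustration, the sketch below pairs words whose character spans are identical, using a two-pointer walk over both word lists. This is an assumption about the general approach rather than the library's code, and it ignores multiword tokens, which need extra handling. `align_by_span` is a hypothetical helper operating on plain `(start, end)` tuples.

```python
def align_by_span(gold_spans, system_spans):
    """Return (gold_index, system_index) pairs for words with identical spans.

    Both inputs are lists of (start, end) tuples sorted by position.
    """
    aligned = []
    gi = si = 0
    while gi < len(gold_spans) and si < len(system_spans):
        gold, system = gold_spans[gi], system_spans[si]
        if gold == system:
            # Same characters covered on both sides: the words align.
            aligned.append((gi, si))
            gi += 1
            si += 1
        elif gold[0] <= system[0]:
            # The gold word starts earlier (or ties), so it has no partner.
            gi += 1
        else:
            # The system word starts earlier, so it has no partner.
            si += 1
    return aligned
```

In the library, each matching pair would presumably be recorded through `append_aligned_words(gold_word, system_word)`; words that never match remain unaligned and count against recall or precision.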
AlignmentWord
class conllu_tools.evaluation.base.AlignmentWord(gold_word, system_word)
Bases: object
Represents an aligned pair of gold and system words.
gold_word: UDWord
system_word: UDWord
__init__(gold_word, system_word)
Exceptions
exception conllu_tools.evaluation.base.UDError
Bases: Exception
Raised when there is an error in the UD data or evaluation process.