The Spanish Evaluation Benchmark


EvalES evaluates how well a system performs on Spanish-language tasks. The results for the different models are shown below.

What is EvalES?

The EvalES benchmark consists of seven tasks: Named Entity Recognition and Classification (CoNLL-NERC), Part-of-Speech Tagging (UD-POS), Text Classification (MLDoc), Paraphrase Identification (PAWS-X), Semantic Textual Similarity (STS), Question Answering (SQAC), and Textual Entailment (XNLI).
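
For readers who want to refer to the tasks programmatically, the list above can be written as a simple lookup table. This is only an illustrative sketch: the names `EVALES_TASKS` and `dataset_for` are hypothetical and not part of any official EvalES tooling, and the strings are just the task and dataset labels as they appear in the text.

```python
# Hypothetical lookup table: task name -> dataset label, copied from the task list.
# Not an official EvalES API; the identifiers here are assumptions for illustration.
EVALES_TASKS = {
    "Named Entity Recognition and Classification": "CoNLL-NERC",
    "Part-of-Speech Tagging": "UD-POS",
    "Text Classification": "MLDoc",
    "Paraphrase Identification": "PAWS-X",
    "Semantic Textual Similarity": "STS",
    "Question Answering": "SQAC",
    "Textual Entailment": "XNLI",
}


def dataset_for(task: str) -> str:
    """Return the dataset label for a task name; raises KeyError if the task is unknown."""
    return EVALES_TASKS[task]


if __name__ == "__main__":
    # Print each task alongside its dataset label.
    for task, dataset in EVALES_TASKS.items():
        print(f"{task}: {dataset}")
```

Keeping the mapping in one place makes it easy to iterate over all tasks when collecting per-task results.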

