Tools and Taggers

These are some of the taggers and tools available on the CORAL-Lab computers. Some of them can also be installed on your home computer.

Note: We are still working on installing some of them. We hope to have everything working by the end of this semester (Fall 2022).


Taggers


TreeTagger

The TreeTagger is a tool for annotating text with part-of-speech and lemma information. It was developed by Helmut Schmid in the TC project at the Institute for Computational Linguistics of the University of Stuttgart. The TreeTagger has been successfully used to tag German, English, French, Italian, Danish, Swedish, Norwegian, Dutch, Spanish, Bulgarian, Russian, Portuguese, Galician, Greek, Chinese, Swahili, Slovak, Slovenian, Latin, Estonian, Polish, Persian, Romanian, Czech, Coptic and Old French texts. It can be adapted to other languages if a lexicon and a manually tagged training corpus are available.
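To make "part-of-speech and lemma information" concrete, here is a minimal Python sketch that reads TreeTagger-style output, assuming the tagger was run so that each line holds a token, its tag, and its lemma separated by tabs. The sample text, the `TaggedToken` type, and the `parse_treetagger_output` function are illustrative names made up for this page; they are not part of TreeTagger itself.

```python
from typing import List, NamedTuple


class TaggedToken(NamedTuple):
    """One line of tagger output: surface form, POS tag, lemma."""
    token: str
    pos: str
    lemma: str


def parse_treetagger_output(text: str) -> List[TaggedToken]:
    """Parse tab-separated token/POS/lemma lines (assumed format)."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # skip lines without exactly three columns
        rows.append(TaggedToken(*parts))
    return rows


# Hand-written sample for illustration only, not real tagger output.
sample = "The\tDT\tthe\ncats\tNNS\tcat\nsleep\tVVP\tsleep\n.\tSENT\t.\n"

tokens = parse_treetagger_output(sample)
lemmas = [t.lemma for t in tokens]
```

Once the output is in this form, `lemmas` or the `pos` fields can be counted or filtered with ordinary Python tools.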


CLAWS

CLAWS (the Constituent Likelihood Automatic Word-tagging System) is part-of-speech (POS) tagging software for English texts. It has been continuously developed since the early 1980s.


Biber Tagger

The Biber Tagger is one of the most comprehensive taggers for English. It annotates more than 150 linguistic features.


Stanford CoreNLP

CoreNLP enables users to derive linguistic annotations for text, including token and sentence boundaries, parts of speech, named entities, numeric and time values, dependency and constituency parses, coreference, sentiment, quote attributions, and relations.


The Multidimensional Analysis Tagger (MAT)

The Multidimensional Analysis Tagger (MAT) is a program that replicates the tagger used in Biber's (1988) Variation across Speech and Writing for the multidimensional functional analysis of English texts. It is generally applied in studies of text-type or genre variation. The program can generate a grammatically annotated version of the selected corpus as well as the statistics needed to perform a text-type or genre analysis. It plots the input text or corpus on Biber’s (1988) Dimensions and determines its closest text type, as proposed in Biber's (1989) A Typology of English Texts.


TagAnt

TagAnt is a freeware, multi-language tagging tool built on top of the spaCy natural language processing (NLP) framework.

Stats Tools