Tools and Taggers

These are some of the taggers and tools available on the CORAL-Lab computers. Some of them can also be installed on your home computer.

Note: We are still working on installing some of them. We hope to have everything working by the end of this semester (Fall 2022).

Taggers

TreeTagger

The TreeTagger is a tool for annotating text with part-of-speech and lemma information. It was developed by Helmut Schmid in the TC project at the Institute for Computational Linguistics of the University of Stuttgart. The TreeTagger has been successfully used to tag German, English, French, Italian, Danish, Swedish, Norwegian, Dutch, Spanish, Bulgarian, Russian, Portuguese, Galician, Greek, Chinese, Swahili, Slovak, Slovenian, Latin, Estonian, Polish, Persian, Romanian, Czech, Coptic and Old French texts. It can be adapted to other languages if a lexicon and a manually tagged training corpus are available.
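As a rough illustration of working with this kind of annotation, the sketch below reads tagger output into Python. It assumes the common one-token-per-line layout with tab-separated word, part-of-speech tag and lemma columns; the exact columns depend on the options the tagger is run with. The names `Token` and `parse_tagged_lines` are hypothetical helpers, and the sample text is hand-written for illustration, not real tagger output.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    """One annotated token: surface form, POS tag, lemma."""
    word: str
    pos: str
    lemma: str


def parse_tagged_lines(text: str) -> List[Token]:
    """Parse tab-separated 'word<TAB>tag<TAB>lemma' lines into Token tuples.

    Lines that do not have exactly three tab-separated fields
    (blank lines, markup lines, etc.) are skipped.
    """
    tokens = []
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue
        tokens.append(Token(*parts))
    return tokens


# Hand-written sample in the assumed layout; the tags are illustrative only.
sample = "The\tDT\tthe\ncats\tNNS\tcat\nsleep\tVVP\tsleep\n.\tSENT\t.\n"

tokens = parse_tagged_lines(sample)
lemmas = [t.lemma for t in tokens]
```

Once the output is in this form, lemma or tag frequencies can be counted with `collections.Counter`, for example `Counter(t.pos for t in tokens)`.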

CLAWS

CLAWS (the Constituent Likelihood Automatic Word-tagging System) is part-of-speech (POS) tagging software for English texts. It has been continuously developed since the early 1980s.

Biber Tagger

The Biber Tagger is one of the most comprehensive taggers for English. It annotates more than 150 linguistic features.

Stanford Core NLP

CoreNLP enables users to derive linguistic annotations for text, including token and sentence boundaries, parts of speech, named entities, numeric and time values, dependency and constituency parses, coreference, sentiment, quote attributions, and relations.

The Multidimensional Analysis Tagger (MAT)

The Multidimensional Analysis Tagger (MAT) is a program that replicates the tagger from Biber's (1988) Variation across Speech and Writing for the multidimensional functional analysis of English texts, generally applied in studies of text type or genre variation. The program can generate a grammatically annotated version of the selected corpus, as well as the statistics needed to perform a text-type or genre analysis. It plots the input text or corpus on Biber’s (1988) Dimensions and determines its closest text type, as proposed in Biber's (1989) A Typology of English Texts.

TagAnt

TagAnt is a freeware, multi-language tagging tool built on top of the spaCy natural language processing (NLP) framework.

Tools

AntConc - http://www.laurenceanthony.net/software/antconc/
AntCorGen - https://www.laurenceanthony.net/software/antcorgen/
AntGram - http://www.laurenceanthony.net/software/antgram/
BootCat - http://bootcat.dipintra.it/?section=home
CasualConc - https://sites.google.com/site/casualconc/
Corpkit - http://interrogator.github.io/corpkit/
Coquery - https://www.coquery.org/
Hyper Collocation - https://hypcol.marutank.net/
kfNgram - http://www.kwicfinder.com/kfNgram/kfNgramHelp.html
LancsBox - http://corpora.lancs.ac.uk/lancsbox/
MonoConcEsy - https://monoconc.com/
MultiLingProfiler - https://www.multilingprofiler.net/
New Word Level Checker - https://nwlc.pythonanywhere.com/?fbclid=IwAR32cBtp0ERCslk0lX_uYMINh9RR2DF-pgJvjgiBaxCNpdv8u86tWUWrsVA
NoSketch Engine - https://nlp.fi.muni.cz/trac/noske
ProtAnt - http://www.laurenceanthony.net/software/protant/
Tools for Corpus Linguistics - https://corpus-analysis.com/?fbclid=IwAR0EdH0rQWtGscbVzRsT0oTjZI_If8-y6DZ7LiH7oxpFSoGwydzoqZC9Z4E
UAM Corpus Tool - http://www.corpustool.com/
VARD 2 - https://ucrel.lancs.ac.uk/vard/about/
Voyant Tools - https://voyant-tools.org/
WordSmith Tools - https://lexically.net/wordsmith/index.html

Stats Tools

Lancaster Stats Tools - http://corpora.lancs.ac.uk/stats/toolbox.php
Langtest - https://langtest.jp/
Statistics for Corpus Linguistics - https://corplingstats.wordpress.com/

Corpus Tools for Language Learners

SKELL - https://skell.sketchengine.eu/#home?lang=en
FLAX Library - http://flax.nzdl.org/greenstone3/flax
Just the Word - http://www.just-the-word.com/
Linggle - https://linggle.com/
Netspeak - https://netspeak.org/
