Tools and Taggers
These are some of the taggers and tools available on the CORAL-Lab computers. Some of them can also be installed on your home computer.
Note: We are still installing some of them and hope to have everything working by the end of this semester (Fall 2022).
The TreeTagger is a tool for annotating text with part-of-speech and lemma information. It was developed by Helmut Schmid in the TC project at the Institute for Computational Linguistics of the University of Stuttgart. The TreeTagger has been successfully used to tag German, English, French, Italian, Danish, Swedish, Norwegian, Dutch, Spanish, Bulgarian, Russian, Portuguese, Galician, Greek, Chinese, Swahili, Slovak, Slovenian, Latin, Estonian, Polish, Persian, Romanian, Czech, Coptic and Old French texts, and it can be adapted to other languages if a lexicon and a manually tagged training corpus are available.
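To give a sense of what working with tagged text looks like, here is a minimal Python sketch that reads tab-separated lines of token, part-of-speech tag and lemma, which is the layout TreeTagger produces when run with its -token and -lemma options. The sample lines, the TaggedToken class and the parse_tagged function are made up for illustration and are not part of TreeTagger itself; real output depends on the language and parameter file you use.

```python
from typing import List, NamedTuple


class TaggedToken(NamedTuple):
    token: str
    pos: str
    lemma: str


def parse_tagged(text: str) -> List[TaggedToken]:
    """Parse tab-separated token/POS/lemma lines into TaggedToken records."""
    tokens = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # skip lines that do not have exactly three columns
        tokens.append(TaggedToken(*parts))
    return tokens


# Illustrative sample only (assumed layout: token <TAB> tag <TAB> lemma).
SAMPLE = "The\tDT\tthe\ncats\tNNS\tcat\nsleep\tVVP\tsleep\n.\tSENT\t.\n"

if __name__ == "__main__":
    for tok in parse_tagged(SAMPLE):
        print(f"{tok.token:<8}{tok.pos:<6}{tok.lemma}")
```

Once the output is in this form, counting tags or lemmas is a matter of looping over the records, for example with collections.Counter.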
CLAWS is part-of-speech (POS) tagging software for English text. CLAWS (the Constituent Likelihood Automatic Word-tagging System) has been continuously developed since the early 1980s.
The Biber Tagger is one of the most comprehensive taggers for English. It annotates more than 150 linguistic features.
CoreNLP enables users to derive linguistic annotations for text, including token and sentence boundaries, parts of speech, named entities, numeric and time values, dependency and constituency parses, coreference, sentiment, quote attributions, and relations.
The Multidimensional Analysis Tagger (MAT)
The Multidimensional Analysis Tagger (MAT) is a program that replicates Biber's (1988) Variation across Speech and Writing tagger for the multidimensional functional analysis of English texts, generally applied in studies of text type or genre variation. The program can generate a grammatically annotated version of the selected corpus as well as the statistics needed to perform a text-type or genre analysis. It plots the input text or corpus on Biber's (1988) Dimensions and determines its closest text type, as proposed in Biber (1989), A Typology of English Texts.
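As a rough illustration of the arithmetic behind a dimension score in Biber-style multidimensional analysis, the Python sketch below standardizes each feature frequency against a reference mean and standard deviation, then adds the z-scores of the positively loaded features and subtracts those of the negatively loaded ones. This is a simplification and not MAT's own code: the function names, the three features, and every number are hypothetical values chosen only to make the calculation easy to follow.

```python
def z_score(value, mean, sd):
    """Standardize a frequency against a reference mean and standard deviation."""
    return (value - mean) / sd


def dimension_score(freqs, reference, positive, negative):
    """Sum z-scores of positive features and subtract those of negative features."""
    z = {f: z_score(freqs[f], *reference[f]) for f in positive + negative}
    return sum(z[f] for f in positive) - sum(z[f] for f in negative)


# Hypothetical normalized frequencies for one text (not real data).
freqs = {"private_verbs": 3.0, "contractions": 2.0, "nouns": 20.0}

# Hypothetical reference (mean, standard deviation) per feature (not real data).
reference = {
    "private_verbs": (2.0, 0.5),
    "contractions": (1.0, 1.0),
    "nouns": (18.0, 4.0),
}

score = dimension_score(
    freqs,
    reference,
    positive=["private_verbs", "contractions"],
    negative=["nouns"],
)

if __name__ == "__main__":
    print(score)
```

With these invented values the z-scores are 2.0, 1.0 and 0.5, so the score is 2.0 + 1.0 - 0.5 = 2.5. MAT performs the tagging, counting and standardization for you, so a calculation like this is only needed if you want to check or reproduce its figures by hand.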
TagAnt is a freeware, multi-language tagging tool built on top of the spaCy natural language processing (NLP) framework.
AntCorGen - https://www.laurenceanthony.net/software/antcorgen/
BootCat - http://bootcat.dipintra.it/?section=home
CasualConc - https://sites.google.com/site/casualconc/
Corpkit - http://interrogator.github.io/corpkit/
Coquery - https://www.coquery.org/
Hyper Collocation - https://hypcol.marutank.net/
kfNgram - http://www.kwicfinder.com/kfNgram/kfNgramHelp.html
LancsBox - http://corpora.lancs.ac.uk/lancsbox/
MonoConcEsy - https://monoconc.com/
MultiLingProfiler - https://www.multilingprofiler.net/
New Word Level Checker - https://nwlc.pythonanywhere.com/
NoSketch Engine - https://nlp.fi.muni.cz/trac/noske
Tools for Corpus Linguistics - https://corpus-analysis.com/
UAM Corpus Tool - http://www.corpustool.com/
Voyant Tools - https://voyant-tools.org/
WordSmith Tools - https://lexically.net/wordsmith/index.html
Lancaster Stats Tools - http://corpora.lancs.ac.uk/stats/toolbox.php
Langtest - https://langtest.jp/
Statistics for Corpus Linguistics - https://corplingstats.wordpress.com/
Corpus Tools for Language Learners
FLAX Library - http://flax.nzdl.org/greenstone3/flax
Just the Word - http://www.just-the-word.com/
Linggle - https://linggle.com/
Netspeak - https://netspeak.org/