LanguageTool: Python Grammar and Spell Checker - Prediction Tricks (2023)

LanguageTool

LanguageTool is an open-source grammar tool, also known as the spell checker for OpenOffice. It lets you detect grammar and spelling mistakes from a Python script or from a command-line interface. We will work with the language_tool_python Python package, which can be installed with the command pip install language-tool-python. By default, language_tool_python downloads a LanguageTool server (a .jar file) and runs it in the background to detect grammar errors locally. LanguageTool also offers a public HTTP proofreading API, which the package supports as well, but there is a limit on the number of calls.
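As a rough sketch of the two modes described above, the helper below returns either a local checker or one backed by the public API. The helper name make_tool and its use_public_api flag are made up for illustration; LanguageTool and LanguageToolPublicAPI are the classes the package exposes. The import is deferred so that nothing is downloaded until the helper is actually called.

```python
def make_tool(language="en-US", use_public_api=False):
    """Return a LanguageTool checker (hypothetical helper, not part of the package)."""
    # Imported lazily: requires `pip install language-tool-python`,
    # and the local mode also needs Java to run the downloaded server.
    import language_tool_python

    if use_public_api:
        # Sends the text to the public LanguageTool service (rate limited).
        return language_tool_python.LanguageToolPublicAPI(language)
    # Downloads the LanguageTool server on first use and runs it locally.
    return language_tool_python.LanguageTool(language)


# Typical use (commented out so that pasting the snippet has no side effects):
# tool = make_tool()
# print(len(tool.check("This are a test.")))
# tool.close()
```

The local mode keeps the text on your machine, while the public API avoids the Java dependency at the cost of the call limit.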

LanguageTool in Python

We will give you a practical example of how to identify and correct your grammatical errors. We work with the following text:

LanguageTool offers spell and grammar checking. Just paste your text here and click the "Check Text" button. Click the colored phrases for details on potential errors. or use this text too see an few of of the problems that LanguageTool can detecd. What do you thinks of grammar checkers? Please not that they are not perfect. Style issues get a blue marker: It's 5 P.M. in the afternoon. The weather was nice on Thursday, 27 June 2017

The text contains several deliberate grammar and spelling mistakes. Let's see how we can detect them using Python:

    import language_tool_python

    tool = language_tool_python.LanguageTool('en-US')

    text = """LanguageTool offers spell and grammar checking. Just paste your text here and click the "Check Text" button. Click the colored phrases for details on potential errors. or use this text too see an few of of the problems that LanguageTool can detecd. What do you thinks of grammar checkers? Please not that they are not perfect. Style issues get a blue marker: It's 5 P.M. in the afternoon. The weather was nice on Thursday, 27 June 2017"""

    # get the matches
    matches = tool.check(text)
    matches

    [Match({'ruleId': 'UPPERCASE_SENTENCE_START', 'message': 'This sentence does not start with an uppercase letter', 'replacements': ['Or'], 'context': '...hrases for details on potential errors. or use this text too see an few of of the ...', 'offset': 168, 'errorLength': 2, 'category': 'CASING', 'ruleIssueType': 'typographical'}),
     Match({'ruleId': 'TOO_TO', 'message': 'Did you mean "to see"?', 'replacements': ['to see'], 'context': '...s on potential errors. or use this text too see an few of of the problems that Language...', 'offset': 185, 'errorLength': 7, 'category': 'CONFUSED_WORDS', 'ruleIssueType': 'misspelling'}),
     Match({'ruleId': 'EN_A_VS_AN', 'message': 'Use "a" instead of \'an\' if the following word doesn\'t start with a vowel sound, e.g. \'a sentence\', \'a university\'', 'replacements': ['a'], 'context': '...ential errors. or use this text too see an few of of the problems that LanguageToo...', 'offset': 193, 'errorLength': 2, 'category': 'MISC', 'ruleIssueType': 'misspelling'}),
     Match({'ruleId': 'ENGLISH_WORD_REPEAT_RULE', 'message': 'Possible typo: you repeated a word', 'replacements': ['of'], 'context': '...errors. or use this text too see an few of of the problems that LanguageTool can dete...', 'offset': 200, 'errorLength': 5, 'category': 'MISC', 'ruleIssueType': 'duplication'}),
     Match({'ruleId': 'MORFOLOGIK_RULE_EN_US', 'message': 'Possible spelling mistake found.', 'replacements': ['detect'], 'context': '...f of the problems that LanguageTool can detecd. What do you thinks of grammar checkers...', 'offset': 241, 'errorLength': 6, 'category': 'TYPOS', 'ruleIssueType': 'misspelling'}),
     Match({'ruleId': 'DO_VBZ', 'message': 'After the auxiliary verb \'do\', use the base form of a verb. Did you mean "think"?', 'replacements': ['think'], 'context': '...at LanguageTool can detecd. What do you thinks of grammar checkers? Please not that th...', 'offset': 261, 'errorLength': 6, 'category': 'GRAMMAR', 'ruleIssueType': 'grammar'}),
     Match({'ruleId': 'PLEASE_NOT_THAT', 'message': 'Did you mean "note"?', 'replacements': ['note'], 'context': '... you thinks of grammar checkers? Please not that they are not perfect. Style issues...', 'offset': 296, 'errorLength': 3, 'category': 'TYPOS', 'ruleIssueType': 'misspelling'}),
     Match({'ruleId': 'PM_IN_THE_EVENING', 'message': 'This is redundant. Consider using "P.M."', 'replacements': ['P.M.'], 'context': "... Style issues get a blue marker: It's 5 P.M. in the afternoon. The weather was nice on Thursday, 27 Ju...", 'offset': 366, 'errorLength': 22, 'category': 'REDUNDANCY', 'ruleIssueType': 'style'}),
     Match({'ruleId': 'DATE_WEEKDAY', 'message': 'The date 27 June 2017 is not a Thursday, but a Tuesday.', 'replacements': [], 'context': '... the afternoon. The weather was nice on Thursday, 27 June 2017', 'offset': 413, 'errorLength': 22, 'category': 'SEMANTICS', 'ruleIssueType': 'inconsistency'})]

As we can see, each match is a detailed dictionary showing the ruleId, the message, and so on. For a detailed explanation of each rule ID, see the LanguageTool Community. It is interesting to see the match for the date, whose message reads: "The date 27 June 2017 is not a Thursday, but a Tuesday." In this case there is no suggested replacement, because the tool cannot guess what the author meant by this date 🙂
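Because some matches, like the date one, come without a suggestion, it can help to separate them before correcting anything. The sketch below uses a stand-in class, StubMatch, that only mimics the attribute names of a real match (ruleId, offset, errorLength, replacements); both the class and the function name split_by_fixability are assumptions made for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StubMatch:
    """Stand-in with the same attribute names as a LanguageTool match (demo only)."""
    ruleId: str
    offset: int
    errorLength: int
    replacements: list = field(default_factory=list)


def split_by_fixability(matches):
    """Return (fixable, unfixable): matches with and without suggested replacements."""
    fixable = [m for m in matches if m.replacements]
    unfixable = [m for m in matches if not m.replacements]
    return fixable, unfixable


# Two of the matches from the output above, rebuilt by hand.
sample = [
    StubMatch("UPPERCASE_SENTENCE_START", 168, 2, ["Or"]),
    StubMatch("DATE_WEEKDAY", 413, 22, []),
]
fixable, unfixable = split_by_fixability(sample)
print([m.ruleId for m in fixable])
print([m.ruleId for m in unfixable])
```

With real results, the same function should work on the list returned by tool.check(text), since it only reads the replacements attribute.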

Now that we've spotted the errors, we can correct them.

    my_mistakes = []
    my_corrections = []
    start_positions = []
    end_positions = []

    # collect every match that comes with at least one suggestion
    for rules in matches:
        if len(rules.replacements) > 0:
            start_positions.append(rules.offset)
            end_positions.append(rules.errorLength + rules.offset)
            my_mistakes.append(text[rules.offset:rules.errorLength + rules.offset])
            my_corrections.append(rules.replacements[0])

    my_new_text = list(text)

    # put the correction in the first character slot and blank out the rest of the span
    for m in range(len(start_positions)):
        my_new_text[start_positions[m]] = my_corrections[m]
        for i in range(start_positions[m] + 1, end_positions[m]):
            my_new_text[i] = ""

    my_new_text = "".join(my_new_text)
    my_new_text

And we get:

LanguageTool offers spell and grammar checking. Just paste your text here and click the "Check Text" button. Click the colored phrases for details on potential errors. Or use this text to see a few of the problems that LanguageTool can detect. What do you think of grammar checkers? Please note that they are not perfect. Style issues get a blue marker: It's 5 P.M. The weather was nice on Thursday, 27 June 2017
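The same correction step can also be sketched with string slices instead of a character list. The function below, apply_edits, is a hypothetical helper that takes (offset, length, replacement) tuples and applies them from right to left, so that earlier offsets stay valid while the string changes length. The fragment and its offsets were counted by hand for this demo and are not output from LanguageTool.

```python
def apply_edits(text, edits):
    """Apply (offset, length, replacement) edits to text.

    Edits are applied from the rightmost offset to the leftmost so that
    the remaining offsets are not shifted by earlier replacements.
    Assumes the edits do not overlap.
    """
    for offset, length, replacement in sorted(edits, reverse=True):
        text = text[:offset] + replacement + text[offset + length:]
    return text


# Hypothetical fragment with hand-counted offsets.
fragment = "or use this text too see an few of of the problems"
edits = [
    (0, 2, "Or"),        # lowercase sentence start
    (17, 7, "to see"),   # "too see"
    (25, 2, "a"),        # "an few"
    (32, 5, "of"),       # repeated "of of"
]
fixed = apply_edits(fragment, edits)
print(fixed)
```

With real matches, the tuples could be built as (m.offset, m.errorLength, m.replacements[0]) for every match that has at least one replacement.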

Spelling and grammatical errors

Let's take a look at the mistakes we detected and the corresponding corrections, paired up from my_mistakes and my_corrections:

    [('or', 'Or'), ('too see', 'to see'), ('an', 'a'), ('of of', 'of'), ('detecd', 'detect'), ('thinks', 'think'), ('not', 'note'), ('P.M. in the afternoon.', 'P.M.')]

Detailed example

Now let's walk through a detailed example with a single short sentence and look at everything LanguageTool returns for it. Our sentence:

Your the best but their are allso  good !

    text = "Your the best but their are allso  good !"
    matches = tool.check(text)
    len(matches)
    # 4

LanguageTool found 4 issues. We can look at each of them in turn. Let's start with the first one:

    matches[0]


And we get:

    Match({'ruleId': 'YOUR_YOU_RE', 'message': 'Did you mean "You\'re"?', 'replacements': ["You're"], 'context': 'Your the best but their are allso  good !', 'offset': 0, 'errorLength': 4, 'category': 'TYPOS', 'ruleIssueType': 'misspelling'})

As we can see, the match contains the ruleId; the message for the end user, which here is 'Did you mean "You're"?'; the suggested replacements; the context, which is the input text around the issue; the offset, which is the position where the issue starts; the errorLength, which is the number of characters the issue covers (4 in our case); the category of the error, which is 'TYPOS' in our case; and the ruleIssueType, which is 'misspelling'.
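To make offset and errorLength more concrete, here is a small sketch that prints a row of carets under the flagged span. The function name mark_error is made up for this example, and it assumes the sample sentence reads exactly "Your the best but their are allso  good !" (with two spaces before "good").

```python
def mark_error(text, offset, error_length):
    """Return the text plus a second line with carets under the flagged span."""
    marker = " " * offset + "^" * error_length
    return text + "\n" + marker


sentence = "Your the best but their are allso  good !"

# Offset 18 and length 5 are the values reported for the THEIR_IS match.
print(mark_error(sentence, 18, 5))
```

With a real match m, the call would be mark_error(text, m.offset, m.errorLength).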

Each element of a language_tool_python.match.Match object can be accessed by writing the object, a dot, and the attribute name. Let's say we want the replacements:

    matches[0].replacements
    # ["You're"]

Let's take a look at the other issues that LanguageTool detected. The second issue was "their", which was corrected to "there":

    matches[1]


And we get:

    Match({'ruleId': 'THEIR_IS', 'message': 'Did you mean "there"?', 'replacements': ['there'], 'context': 'Your the best but their are allso  good !', 'offset': 18, 'errorLength': 5, 'category': 'CONFUSED_WORDS', 'ruleIssueType': 'misspelling'})

The third issue was "allso", which was corrected to "also":

    matches[2]


And we get:

    Match({'ruleId': 'MORFOLOGIK_RULE_EN_US', 'message': 'Possible spelling mistake found.', 'replacements': ['also', 'all so'], 'context': 'Your the best but their are allso  good !', 'offset': 28, 'errorLength': 5, 'category': 'TYPOS', 'ruleIssueType': 'misspelling'})

Finally, the last issue was the double space, which was corrected to a single space:

    matches[3]


And we get:

    Match({'ruleId': 'WHITESPACE_RULE', 'message': 'Possible typo: you repeated a whitespace', 'replacements': [' '], 'context': 'Your the best but their are allso  good !', 'offset': 33, 'errorLength': 2, 'category': 'TYPOGRAPHY', 'ruleIssueType': 'whitespace'})
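When there are more than a handful of matches, a quick tally by category gives a useful overview. The sketch below counts the categories of the four matches shown above; the list is typed in by hand so that the snippet runs without a LanguageTool server.

```python
from collections import Counter

# Categories of the four matches discussed above, in order.
categories = ["TYPOS", "CONFUSED_WORDS", "TYPOS", "TYPOGRAPHY"]

counts = Counter(categories)
for category, n in counts.most_common():
    print(f"{category}: {n}")
```

With real results, the list could be replaced by a generator such as (m.category for m in matches).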

Automatically apply suggestions to text

We can automatically apply suggestions to text like this:

    import language_tool_python

    tool = language_tool_python.LanguageTool('en-US')

    text = 'A sentence with a error in the Hitchhiker’s Guide tot he Galaxy'
    tool.correct(text)

    'A sentence with an error in the Hitchhiker’s Guide to the Galaxy'


If we want a free Python tool that checks grammar and supports more than 20 languages, then LanguageTool is a good choice. Of course, no tool is perfect, and we cannot rely on grammar and spell checkers alone, but it is certainly something we can use in most NLP projects and tasks.


More data science hacks?

You can follow us on Medium for more data science hacks.


Article information

Author: Van Hayes

Last Updated: 15/09/2023
