Python Excel Text Analyzer

A complete interactive script for grammar and spelling correction

Key Highlights

  • Interactive Correction: The script identifies potential grammar and spelling mistakes and lets you confirm or override the corrections.
  • Context-Awareness: language_tool_python checks each sentence as a whole, so its grammar suggestions take the surrounding words into account.
  • Excel Integration: The tool reads the source Excel file with openpyxl, applies corrections, and writes the output to a new Excel file.

Detailed Script Overview and Implementation

This Python script is designed to scan through an Excel file and analyze each text cell for potential grammar or spelling mistakes. The script leverages the openpyxl library to read and write Excel files, language_tool_python for grammar and punctuation checking, and pyspellchecker for identifying spelling errors.

Step-by-step Breakdown

1. Required Libraries and Setup

The script requires three main libraries:

  • openpyxl: To load and manipulate Excel spreadsheets.
  • language_tool_python: To detect grammar mistakes with contextual analysis.
  • pyspellchecker (imported as spellchecker): To detect and suggest potential corrections for misspelled words.

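If you have not already installed these dependencies, install them with pip. By default language_tool_python runs LanguageTool locally, which also requires a Java runtime on your machine: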

# Install necessary packages
pip install openpyxl language_tool_python pyspellchecker

2. Loading and Reading the Excel File

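The script opens the Excel file with openpyxl, selects the active worksheet, and iterates through every row and column. Each cell that holds a string is analyzed; numbers, dates, and empty cells are skipped. If your text lives in a single column, the loop can easily be restricted to that column.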
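
The traversal described above can be sketched as a small generator. This is a minimal sketch rather than the script's own code: iter_text_cells is a hypothetical helper name, and it assumes a worksheet object that exposes openpyxl's iter_rows() method and cells with value, row, and column attributes.

```python
def iter_text_cells(sheet):
    """Yield (row, column, text) for every non-empty string cell.

    `sheet` is assumed to be an openpyxl worksheet; any object exposing
    iter_rows() with cells that have .value, .row and .column also works.
    """
    for row in sheet.iter_rows():
        for cell in row:
            value = cell.value
            # Skip numbers, dates, empty cells and whitespace-only strings
            if isinstance(value, str) and value.strip():
                yield cell.row, cell.column, value
```

With openpyxl installed, passing load_workbook("input.xlsx").active to this helper should visit the text cells of the active sheet in row order. To look at the first column only, the call inside the helper could become sheet.iter_rows(min_col=1, max_col=1).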

3. Checking for Grammar and Spelling Issues

For each cell containing text, the following operations are carried out:

  • Grammar Check: The program uses the language_tool_python library to analyze the text and return a list of potential grammar issues, each with a message, suggested replacements, and the offset and length of the flagged span.
  • Spelling Check: It uses pyspellchecker to find words that are not in its dictionary and looks up the most likely correction for each. A word with no close candidate may come back without a suggestion.
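
To show how an offset and an error length pin down the flagged span, here is a minimal sketch in plain Python. The helpers apply_replacement and apply_all are hypothetical and not part of language_tool_python; the offsets and lengths are assumed to be character positions in the original string, as in the error table the script prints.

```python
def apply_replacement(text, offset, length, replacement):
    """Return text with the span [offset, offset + length) replaced."""
    if offset < 0 or length < 0 or offset + length > len(text):
        raise ValueError("span lies outside the text")
    return text[:offset] + replacement + text[offset + length:]


def apply_all(text, edits):
    """Apply several (offset, length, replacement) edits.

    Edits are applied from right to left so that earlier offsets stay
    valid when a later span changes length. Overlapping edits are not handled.
    """
    for offset, length, replacement in sorted(edits, reverse=True):
        text = apply_replacement(text, offset, length, replacement)
    return text


sentence = "This is teh best exampel."
fixed = apply_all(sentence, [(8, 3, "the"), (17, 7, "example")])
print(fixed)  # This is the best example.
```

Working from the end of the string backwards is the design choice that matters here: replacing "teh" first would be harmless in this case, but any replacement that changes the text length would shift every offset after it.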

4. Interactive User Interface for Corrections

When errors are found, the script displays the error details (such as offset, length, and suggestions) so you can decide whether the text is indeed flawed.

If you confirm the issues by entering "yes," the script shows its suggested correction. You can accept it with "yes," enter "no" to be prompted for your own version, or type your own corrected text directly at the prompt.

After you confirm the final correction, the script saves the updated text for that cell. This process is repeated for each text entry scanned from the Excel file.
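
The accept-or-override flow can be isolated in a small function, which also makes it testable without a keyboard. This is a sketch under assumptions: resolve_correction and its ask parameter are hypothetical names, and the prompt wording is illustrative only.

```python
def resolve_correction(original, suggested, ask=input):
    """Return the text to keep after asking the user about a suggestion.

    `ask` is any callable that takes a prompt and returns the reply.
    It defaults to input() and can be swapped for a stub in tests.
    """
    reply = ask("Accept '{}'? (yes/no) or type your own: ".format(suggested)).strip()
    choice = reply.lower()
    if choice in ("yes", "y"):
        return suggested
    if choice in ("no", "n", ""):
        # Declined or blank: ask once for a manual version, else keep the original
        manual = ask("Enter your own correction (blank keeps the original): ").strip()
        return manual or original
    # Anything else is treated as the user's own correction, case preserved
    return reply
```

Comparing a lowercased copy of the reply while returning the untouched reply keeps the capitalization of a hand-typed correction intact.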

5. Writing the Corrected Data to a New Excel File

After processing the entire worksheet, the script saves the workbook, including all corrected text, under the output file name you entered at startup. Choose a name different from the input file so that the original is left untouched.
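
If you want the output name derived from the input name with a suffix, so that the source file is never overwritten, one option is a helper like the following. The name corrected_path and the "_corrected" suffix are assumptions for illustration; the script as listed asks for the output name instead.

```python
from pathlib import Path


def corrected_path(input_file, suffix="_corrected"):
    """Build an output path next to input_file.

    Example: report.xlsx -> report_corrected.xlsx
    """
    path = Path(input_file)
    # Insert the suffix between the file stem and its extension
    return path.with_name(path.stem + suffix + path.suffix)
```

The result could then be passed to the save step as str(corrected_path(input_path)).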

Complete Python Script

import re

from openpyxl import load_workbook
import language_tool_python
from spellchecker import SpellChecker

# Matches whole words made of letters, keeping internal apostrophes (e.g. "don't")
WORD_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*")


def correct_text(text, tool, spell):
    """
    Process text to check for grammar and spelling errors,
    then return the corrected text based on user feedback.
    """
    # Check for grammar mistakes using language_tool_python
    grammar_matches = tool.check(text)
    corrections_made = text

    if grammar_matches:
        # Display grammar errors with details
        print("\nDetected grammar issues:")
        print("{:<10} {:<15} {:<50}".format("Offset", "Error Length", "Message"))
        for match in grammar_matches:
            # Attribute names can differ between releases of language_tool_python;
            # check your installed version if these raise AttributeError
            print("{:<10} {:<15} {:<50}".format(match.offset, match.errorLength, match.message))

        # Let the user decide on grammar corrections
        decision = input("\nAre the above grammar issues valid? (yes/no): ").strip().lower()
        if decision == "yes":
            # Use the tool's suggestion to correct grammar
            suggested = tool.correct(text)
            print("\nSuggested grammar correction:")
            print(suggested)
            # Keep the raw reply so a custom correction retains its capitalization
            user_input = input("Accept the suggested correction? (yes/no) or enter your own: ").strip()
            if user_input.lower() == "yes":
                corrections_made = suggested
            elif user_input.lower() == "no":
                corrections_made = input("Enter your own grammar correction: ").strip()
            elif user_input:
                # A custom correction was typed directly at the prompt
                corrections_made = user_input

    # Check spelling on the text as it stands after the grammar step.
    # A regex tokenizer keeps punctuation from being treated as part of a word.
    words = WORD_RE.findall(corrections_made)
    misspelled = sorted(spell.unknown(words))  # unknown() returns lowercased words

    if misspelled:
        print("\nDetected misspelled words:")
        print("{:<20} {:<20}".format("Misspelled Word", "Suggested Correction"))
        corrected_words = {}
        for word in misspelled:
            # correction() can return None when no candidate is found
            suggested_word = spell.correction(word)
            print("{:<20} {:<20}".format(word, suggested_word or "(no suggestion)"))
            if suggested_word and suggested_word != word:
                corrected_words[word] = suggested_word

        decision = input("\nAre the spelling issues valid? (yes/no): ").strip().lower()
        if decision == "yes":
            # Replace whole words only, ignoring case, so that substrings of
            # other words are left alone (suggestions are lowercase)
            for wrong, suggestion in corrected_words.items():
                pattern = r"\b{}\b".format(re.escape(wrong))
                corrections_made = re.sub(pattern, lambda _m, s=suggestion: s,
                                          corrections_made, flags=re.IGNORECASE)
            print("\nPost-spelling correction result:")
            print(corrections_made)
            user_final = input("Confirm final corrected sentence? (yes/no): ").strip().lower()
            if user_final != "yes":
                corrections_made = input("Enter your own final correction: ").strip()
        else:
            # If the user rejects the spelling corrections, allow manual entry;
            # a blank reply keeps the current text unchanged
            manual = input("Enter your full corrected sentence (blank to keep the text): ").strip()
            if manual:
                corrections_made = manual

    return corrections_made

def analyze_excel(input_file, output_file):
    # Load the workbook and select the active worksheet
    wb = load_workbook(input_file)
    sheet = wb.active

    # Instantiate the language and spelling tools
    tool = language_tool_python.LanguageTool('en-US')
    spell = SpellChecker()

    try:
        # Iterate through each row and cell in the worksheet
        for row in sheet.iter_rows():
            for cell in row:
                # Only non-empty string cells are checked; numbers and dates are skipped
                if isinstance(cell.value, str) and cell.value.strip():
                    print("\nAnalyzing row {}, column {}:".format(cell.row, cell.column))
                    print("Original Text: {}".format(cell.value))
                    # Analyze and correct the cell text interactively
                    corrected = correct_text(cell.value, tool, spell)
                    print("Final corrected text: {}".format(corrected))
                    cell.value = corrected
    finally:
        # Shut down the local LanguageTool server started by language_tool_python
        tool.close()

    # Save the corrected workbook to a new Excel file
    wb.save(output_file)
    print("\nCorrected Excel file saved as: {}".format(output_file))

if __name__ == "__main__":
    print("=== Excel Text Analyzer ===")
    input_path = input("Enter the path of the Excel file to analyze (e.g., input.xlsx): ").strip()
    output_path = input("Enter the desired output file name (e.g., corrected.xlsx): ").strip()
    analyze_excel(input_path, output_path)
  

Table Overview of Error Details

The table below summarizes the error details provided by the script:

Offset      Error Length    Error Message
e.g., 5     e.g., 2         e.g., Spelling mistake: 'teh' - did you mean 'the'?

Last updated March 11, 2025