A tool for checking the spelling, grammar, and overall correctness of Tamil text using multiple AI models.
- Multiple model support:
  - Rule-based checking
  - Deep Learning analysis (using Indic-BERT)
  - Statistical analysis
  - Google Gemma integration
- Real-time error detection
- Spelling and grammar correction
- Confidence scores for each model
- User-friendly Streamlit interface
- Clone the repository:
git clone https://github.com/arsath-eng/Tamil-spell-checker.git
cd Tamil-spell-checker
- Install dependencies:
pip install -r requirements.txt
- Set up environment variables:
cp .env.example .env
# Edit .env with your GROQ_API_KEY
- Start the Streamlit application:
streamlit run main.py
- Access the web interface at http://localhost:8501
- Choose your input method:
  - Enter custom text
  - Use example text
- Select checking options:
  - Spelling check
  - Grammar check
- Click "Check Text" to analyze
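The `.env` file created during setup holds the `GROQ_API_KEY` used by the Gemma integration. As a minimal sketch (not the project's actual code), the key could be validated at startup as shown here. The helper name `require_api_key` is hypothetical, and the snippet assumes the key is already present in the process environment, for example after python-dotenv's `load_dotenv()` has read `.env`.

```python
import os


def require_api_key(env=os.environ, name="GROQ_API_KEY"):
    """Return the API key from `env`, or raise a clear error if it is missing.

    `require_api_key` is a hypothetical helper, not part of the repository.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Copy .env.example to .env and add your key."
        )
    return key
```

Failing early with a readable message avoids a less obvious authentication error later, when the Gemma model is first called.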
- Rule-based model:
  - Uses a predefined Tamil dictionary and grammar rules
  - Fast and deterministic results
  - Best for basic spelling and grammar checks
- Deep learning model:
  - Powered by AI4Bharat's Indic-BERT
  - Handles complex language patterns
  - Suitable for nuanced grammar analysis
- Statistical model:
  - Uses TF-IDF and Naive Bayes classification
  - Trained on correct Tamil text samples
  - Good for identifying unusual patterns
- Google Gemma model:
  - Uses the Gemma 2B model through the Groq API
  - Provides detailed language insights
  - Offers correction suggestions
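To illustrate the dictionary lookup behind the rule-based model, here is a minimal sketch using only the Python standard library. It is not the repository's implementation: the function name `check_spelling`, the similarity cutoff, and the three-word list are assumptions made for the example, and suggestions come from `difflib`'s generic string similarity rather than any Tamil-specific rule. Words are split on whitespace, so punctuation is not handled.

```python
import difflib
import unicodedata


def nfc(text):
    """Normalize to NFC so visually identical Tamil strings compare equal."""
    return unicodedata.normalize("NFC", text)


def check_spelling(sentence, dictionary, cutoff=0.6):
    """Return (word, suggestion) pairs for words missing from `dictionary`.

    `suggestion` is the closest dictionary word by difflib's similarity
    ratio, or None if no word reaches `cutoff`.
    """
    known = sorted({nfc(w) for w in dictionary})
    known_set = set(known)
    errors = []
    for word in nfc(sentence).split():
        if word in known_set:
            continue
        matches = difflib.get_close_matches(word, known, n=1, cutoff=cutoff)
        errors.append((word, matches[0] if matches else None))
    return errors


# Tiny illustrative word list (an assumption for this example); a real list
# would be loaded from a file such as data/tamil_words.txt.
words = ["நான்", "பள்ளிக்கு", "செல்கிறேன்"]
print(check_spelling("நான் பள்ளிக்கு சல்கிறேன்", words))
# → [('சல்கிறேன்', 'செல்கிறேன்')]
```

A lookup like this is deterministic and fast, but it can only flag words that are absent from the list. That limitation is one reason to pair it with the other models.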
- Basic Spelling Check:
Input: நான் பள்ளிக்கு சல்கிறேன்
Output: Spelling error detected in "சல்கிறேன்" - Suggested: "செல்கிறேன்"
- Grammar Check:
Input: நான் பள்ளிக்கு செல்கிறார்கள்
Output: Grammar error - Subject-verb agreement mismatch
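The grammar example above can be reproduced with a very small rule, sketched here under strong assumptions: the subject is the first word, the verb is the last word, and only four pronouns are covered. The names `SAMPLE_CONJUGATIONS` and `check_agreement` are hypothetical and are not taken from the repository. The expected endings are derived from present-tense forms of the verb செல் ("to go").

```python
import unicodedata


def nfc(text):
    """Normalize to NFC so equivalent Tamil strings compare equal."""
    return unicodedata.normalize("NFC", text)


# Present-tense forms of செல் ("to go") for a few subject pronouns.
# Deliberately incomplete and illustrative only.
STEM = nfc("செல்கிற")
SAMPLE_CONJUGATIONS = {
    "நான்": "செல்கிறேன்",        # I go
    "அவன்": "செல்கிறான்",        # he goes
    "அவள்": "செல்கிறாள்",        # she goes
    "அவர்கள்": "செல்கிறார்கள்",  # they go
}

# Map each pronoun to the person/number ending that follows the stem.
EXPECTED_ENDING = {
    nfc(subject): nfc(verb)[len(STEM):]
    for subject, verb in SAMPLE_CONJUGATIONS.items()
}


def check_agreement(sentence):
    """Return an error message on a subject-verb mismatch, else None.

    Assumes subject-first, verb-last word order; unknown subjects are skipped.
    """
    words = nfc(sentence).split()
    if len(words) < 2:
        return None
    subject, verb = words[0], words[-1]
    expected = EXPECTED_ENDING.get(subject)
    if expected is None or verb.endswith(expected):
        return None
    return "Grammar error - Subject-verb agreement mismatch"


print(check_agreement("நான் பள்ளிக்கு செல்கிறார்கள்"))
print(check_agreement("நான் பள்ளிக்கு செல்கிறேன்"))
```

The first call reports a mismatch because நான் expects a verb with the first-person singular ending, while the second sentence passes. Real Tamil grammar checking needs far more than suffix matching, including tense, dropped subjects, and freer word order.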
Tamil-spell-checker/
├── main.py
├── requirements.txt
├── models/
│ ├── __init__.py
│ ├── rule_based_model.py
│ ├── deep_learning_model.py
│ ├── statistical_model.py
│ └── google_gemma_model.py
├── data/
│ └── tamil_words.txt
└── .env
Main dependencies include:
- streamlit >= 1.24.0
- indic-nlp-library >= 0.91
- transformers >= 4.30.2
- torch >= 2.2.0
- tensorflow >= 2.13.0
- scikit-learn >= 1.2.2
- Groq API client
- python-dotenv
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
- AI4Bharat for the Indic-BERT model
- Indic NLP Library contributors
- Groq for the Gemma model API
- Tamil language experts who helped validate the rule sets