🔗 LinkAce Link Classifier

AI-powered automatic link classification for LinkAce using Ollama

Automatically classify links from a LinkAce input list into appropriate classification lists using AI-powered content analysis. The classifier uses Ollama for intelligent link classification with confidence scoring to ensure accurate categorization.

✨ Features

  • 🤖 AI-Powered Classification: Uses Ollama server for intelligent link analysis
  • 🎯 Confidence Scoring: Only moves links with high confidence scores (configurable threshold)
  • 🔄 LinkAce Integration: Seamlessly integrates with LinkAce API v2.1+
  • 🧪 Dry Run Mode: Test classifications without making actual changes
  • ⚙️ Flexible Configuration: CLI arguments, config files, and environment variables
  • 📊 Comprehensive Logging: Detailed progress tracking and classification reporting
  • 💾 Export Results: Save classification results to CSV or JSON formats
  • 🛡️ Error Handling: Robust error handling with automatic retry and rate limiting

🚀 Quick Start

Prerequisites

  • Python 3.8+
  • LinkAce instance with API access
  • Ollama server running locally or remotely

Installation

Option 1: Install from source

  1. Clone the repository:

    git clone https://github.com/alx/linkace-classifier.git
    cd linkace-classifier
  2. Install the package:

    pip install .

Option 2: Development installation

    git clone https://github.com/alx/linkace-classifier.git
    cd linkace-classifier
    pip install -e .
Set up Ollama (required for either option):

    # Install Ollama (see https://ollama.ai/)
    curl -fsSL https://ollama.ai/install.sh | sh

    # Pull a model
    ollama pull llama3.2

    # Start the server
    ollama serve

Basic Usage

linkace-classifier \
  --api-url https://your-linkace.com/api/v2 \
  --token YOUR_API_TOKEN \
  --input-list 12 \
  --classify-lists 1,2,3,4,5

Test with Dry Run

linkace-classifier \
  --api-url https://your-linkace.com/api/v2 \
  --token YOUR_API_TOKEN \
  --input-list 12 \
  --classify-lists 1,2,3,4,5 \
  --dry-run

📖 How It Works

  1. 📥 Load Input List: Fetches all links from the specified input list
  2. 📚 Load Classification Context: Retrieves links from classification lists for AI context
  3. 🤖 AI Classification: For each input link:
    • Analyzes link content, title, and metadata
    • Compares against existing links in classification lists
    • Generates confidence scores for each potential classification
  4. 🎯 Threshold Filtering: Only processes classifications above confidence threshold (default: 0.8)
  5. 🔄 Link Movement: Removes links from input list and adds to appropriate classification lists
  6. 📊 Results: Provides detailed summary of classifications and movements
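The threshold-filtering step can be sketched in Python. The function name, score shape, and return value below are illustrative only, not the package's actual API:

```python
# Illustrative sketch of threshold filtering; names and shapes are
# hypothetical, not the linkace-classifier package's real API.

CONFIDENCE_THRESHOLD = 0.8  # default threshold from the docs


def filter_classifications(scores, threshold=CONFIDENCE_THRESHOLD):
    """Keep only lists whose confidence meets the threshold and
    pick the single best (list_id, confidence) pair for the link.

    scores: dict mapping classification list ID -> confidence in [0, 1].
    Returns None when no list qualifies, so the link stays in the
    input list.
    """
    eligible = [(list_id, c) for list_id, c in scores.items() if c >= threshold]
    if not eligible:
        return None
    return max(eligible, key=lambda pair: pair[1])


# Example: AI confidence scores for one link against five lists
scores = {1: 0.92, 2: 0.55, 3: 0.81, 4: 0.40, 5: 0.10}
best = filter_classifications(scores)  # list 1 wins with 0.92
```

With the default 0.8 threshold, lists 1 and 3 qualify and the highest-scoring list is chosen; a link with no score above the threshold is left untouched.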

โš™๏ธ Configuration

Command Line Arguments

Argument                Description                                           Required
--api-url               LinkAce API base URL                                  Yes
--token                 LinkAce API token                                     Yes
--input-list            Input list ID to classify links from                  Yes
--classify-lists        Comma-separated classification list IDs               Yes
--config                Configuration file path                               No
--ollama-url            Ollama server URL (default: http://localhost:11434)  No
--ollama-model          Ollama model to use (default: llama3.2)               No
--confidence-threshold  Confidence threshold (default: 0.8)                   No
--dry-run               Run in dry-run mode                                   No
--verbose               Enable verbose output                                 No
--output-file           Output file for results (CSV or JSON)                No

Configuration File

Create a configs/config.json file:

{
  "linkace_api_url": "https://your-linkace.com/api/v2",
  "linkace_api_token": "your-api-token",
  "input_list_id": 12,
  "classify_list_ids": [1, 2, 3, 4, 5],
  "ollama_url": "http://localhost:11434",
  "ollama_model": "llama3.2",
  "confidence_threshold": 0.8,
  "dry_run": false,
  "verbose": false
}

Generate a sample configuration:

python src/linkace_classifier/core/config.py

Environment Variables

export LINKACE_API_URL="https://your-linkace.com/api/v2"
export LINKACE_API_TOKEN="your-api-token"
export INPUT_LIST_ID=12
export CLASSIFY_LIST_IDS="1,2,3,4,5"
export OLLAMA_URL="http://localhost:11434"
export CONFIDENCE_THRESHOLD=0.8
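A plausible way the three configuration sources could be layered is CLI arguments over environment variables over the config file over built-in defaults; the exact precedence is an assumption here, not documented behavior:

```python
# Sketch of config-source layering (assumed precedence:
# CLI > environment > config file > defaults).
import os

DEFAULTS = {
    "ollama_url": "http://localhost:11434",
    "ollama_model": "llama3.2",
    "confidence_threshold": 0.8,
}

# Map environment variable names to config keys
ENV_MAP = {
    "OLLAMA_URL": "ollama_url",
    "CONFIDENCE_THRESHOLD": "confidence_threshold",
}


def resolve_config(cli_args, file_config):
    """Merge config sources; later layers override earlier ones."""
    merged = dict(DEFAULTS)
    merged.update(file_config)
    for env_key, conf_key in ENV_MAP.items():
        if env_key in os.environ:
            merged[conf_key] = os.environ[env_key]
    # CLI flags that were actually passed (non-None) win last
    merged.update({k: v for k, v in cli_args.items() if v is not None})
    return merged
```

For example, a `confidence_threshold` of 0.7 in the config file would override the 0.8 default but lose to an explicit `--confidence-threshold` flag.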

🔧 LinkAce API Integration

Required API Endpoints

The classifier uses these LinkAce API v2.1+ endpoints:

  • GET /lists/{id}/links - Retrieve all links from a specific list
  • GET /links/{id} - Get detailed information about individual links
  • PUT /links/{id} - Update link list assignments
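List endpoints return paginated responses, so the client must drain every page. A minimal sketch of that loop follows; the page-fetching callable is injected to keep the logic testable, and the `data`/`next_page_url` field names assume Laravel-style pagination (LinkAce is a Laravel app), which you should verify against your instance:

```python
# Sketch of draining a paginated LinkAce list endpoint.
# Response field names ("data", "next_page_url") are assumptions
# based on Laravel-style pagination.

def fetch_all_links(get_page):
    """Collect links across all pages of a list endpoint.

    get_page(page_number) should return a dict shaped like
    {"data": [link, ...], "next_page_url": url_or_None}.
    """
    links = []
    page = 1
    while True:
        body = get_page(page)
        links.extend(body.get("data", []))
        if not body.get("next_page_url"):
            break  # last page reached
        page += 1
    return links
```

In the real client, `get_page` would issue `GET /lists/{id}/links?page=N` with the API token in an `Authorization: Bearer` header.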

API Token Setup

  1. Log into your LinkAce instance
  2. Go to User Settings → API Tokens
  3. Create a new token with appropriate permissions
  4. Use the token in your configuration

🧪 Testing

Run the comprehensive test suite:

python tests/test_core.py

Run the demo with existing CSV data:

python scripts/demo_classifier.py

📊 Example Output

[2024-01-15 10:30:00] INFO: Starting LinkAce Link Classifier
✅ LinkAce API connection successful
✅ Ollama server connection successful
[2024-01-15 10:30:01] INFO: Loaded 25 links from input list
[2024-01-15 10:30:02] INFO: Loaded 150 total links from 5 classification lists
Progress: |██████████████████████████████████████████████████| 100.0% (25/25)

============================================================
📊 CLASSIFICATION SUMMARY
============================================================
Total links processed: 25
Links classified: 18
Links not classified: 7
Classification rate: 72.0%

Classifications by list:
  List 1: 8 links
  List 2: 5 links  
  List 3: 3 links
  List 4: 2 links

Confidence statistics:
  Average: 0.847
  Range: 0.801 - 0.923
============================================================
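The figures in a summary like the one above follow directly from per-link results; a sketch, with the result tuple shape assumed rather than taken from the package:

```python
# Sketch of computing summary statistics from per-link results.
# Each result is assumed to be (list_id_or_None, confidence_or_None).

def summarize(results):
    """Aggregate classification results into summary statistics."""
    classified = [(lid, c) for lid, c in results if lid is not None]
    confidences = [c for _, c in classified]

    per_list = {}
    for lid, _ in classified:
        per_list[lid] = per_list.get(lid, 0) + 1

    return {
        "total": len(results),
        "classified": len(classified),
        "rate_pct": round(100.0 * len(classified) / len(results), 1) if results else 0.0,
        "per_list": per_list,
        "avg_confidence": round(sum(confidences) / len(confidences), 3) if confidences else None,
        "confidence_range": (min(confidences), max(confidences)) if confidences else None,
    }
```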

🔧 Advanced Usage

Custom Ollama Models

linkace-classifier \
  --ollama-model llama3.1:70b \
  --ollama-url http://localhost:11434 \
  [other options]

Batch Processing with Custom Threshold

linkace-classifier \
  --confidence-threshold 0.7 \
  --output-file results.csv \
  --verbose \
  [other options]

Configuration File Usage

linkace-classifier --config configs/config.json

HTTP API Server

Start the HTTP API server:

linkace-classifier-server --config configs/config.json --host 0.0.0.0 --port 8080

Make classification requests:

curl -X POST http://localhost:8080/classify \
  -H "Content-Type: application/json" \
  -d '{"url": "https://github.com/user/repo"}'

๐Ÿ›ก๏ธ Security Considerations

  • API Token Security: Tokens are never logged or exposed in output
  • Input Validation: All inputs are validated and sanitized
  • Rate Limiting: Built-in rate limiting prevents API abuse
  • Safe Defaults: Conservative defaults for all operations
  • Dry Run Testing: Always test with --dry-run before production use

🚀 Performance & Scalability

  • Batch Processing: Efficiently handles large link collections
  • Pagination Support: Automatically handles paginated API responses
  • Memory Efficient: Processes links in batches to manage memory usage
  • Rate Limiting: Configurable delays between API calls
  • Progress Tracking: Real-time progress indicators for long-running operations
  • Resumable Operations: Graceful handling of interruptions

🤖 Supported Ollama Models

The classifier works with any Ollama model, but these are recommended:

Model          Speed      Accuracy   Use Case
llama3.2       Fast       Good       Default choice
llama3.1:70b   Slow       Excellent  High-accuracy needs
codellama:13b  Medium     Good       Technical links
mistral:7b     Very Fast  Fair       Quick processing

๐Ÿ› Troubleshooting

Common Issues

โŒ LinkAce API Connection Failed

โŒ LinkAce API connection failed: 404 Client Error
  • Verify your LinkAce URL and API token
  • Ensure API token has necessary permissions
  • Check LinkAce instance is running and accessible

โŒ Ollama Connection Failed

โŒ Ollama server connection failed
  • Start Ollama server: ollama serve
  • Verify server URL and port
  • Check model availability: ollama list

โš ๏ธ No Classifications Above Threshold

โš ๏ธ No classifications above threshold
  • Lower confidence threshold: --confidence-threshold 0.7
  • Ensure classification lists have sufficient context links
  • Verify input links are accessible and have content

🔄 Rate Limiting Issues

429 Too Many Requests
  • Increase rate limit delay in configuration
  • Use smaller batch sizes
  • Check LinkAce instance rate limits
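Retrying with exponential backoff is the standard remedy for 429 responses. A sketch follows; the real client's retry logic may differ, and `RateLimitError` is a stand-in for however the client surfaces HTTP 429:

```python
# Sketch of exponential backoff on rate-limit errors.
# RateLimitError is a hypothetical stand-in for an HTTP 429 response.
import time


class RateLimitError(Exception):
    """Raised when the API responds with HTTP 429 Too Many Requests."""


def call_with_backoff(request_fn, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call request_fn, doubling the delay after each rate-limit error."""
    for attempt in range(max_retries):
        try:
            return request_fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # give up after the final attempt
            sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```

The `sleep` parameter is injected so the waiting behavior can be tested without real delays.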

Debug Mode

Enable detailed logging:

linkace-classifier --verbose [other options]

๐Ÿ“ Project Structure

linkace-classifier/
├── README.md                    # Project documentation
├── LICENSE                      # License file
├── requirements.txt             # Python dependencies
├── setup.py                     # Package setup
├── pyproject.toml               # Modern Python packaging
├── Dockerfile                   # Container setup
├── docker-compose.yml           # Container orchestration
├── src/
│   └── linkace_classifier/      # Main package
│       ├── __init__.py          # Package initialization
│       ├── core/                # Configuration, utilities, classifier
│       ├── api/                 # LinkAce & Ollama clients
│       ├── http/                # Flask server
│       ├── cli/                 # Command-line interfaces
│       ├── services/            # Classification service
│       └── validation/          # URL validation
├── tests/                       # Test files
├── configs/                     # Configuration files
├── scripts/                     # Demo and legacy scripts
├── docs/                        # Documentation
└── examples/                    # Usage examples

๐Ÿค Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Development Setup

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature-name
  3. Make your changes
  4. Add tests for new functionality
  5. Ensure all tests pass: python tests/test_core.py
  6. Submit a pull request

Reporting Issues

Please use GitHub Issues to report bugs or request features.

๐Ÿ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

๐Ÿ™ Acknowledgments

  • LinkAce - The excellent bookmark manager this tool integrates with
  • Ollama - The AI inference engine powering intelligent classification

🌟 Support

  • Documentation: Check this README and inline code documentation
  • Issues: Report bugs via GitHub Issues
  • Discussions: Join conversations in GitHub Discussions

Made with ❤️ for the LinkAce community