/kistmath-ai

Kistmat-AI is an advanced machine learning model designed to solve a wide range of mathematical problems, from elementary arithmetic to university-level calculus. This project demonstrates the application of curriculum learning in AI, allowing the model to progressively tackle more complex mathematical concepts.

Primary LanguagePythonMIT LicenseMIT

Kistmath-AI

Kistmat-AI: A Machine Learning Model for Mathematical Problem Solving

Project Description

Kistmat-AI is an advanced machine learning model designed to solve a wide range of mathematical problems, from elementary arithmetic to university-level calculus. This project demonstrates the application of curriculum learning in AI, allowing the model to progressively tackle more complex mathematical concepts.

Key Features

  • Curriculum Learning: The model starts with basic arithmetic and gradually progresses to advanced calculus, mimicking human learning patterns.
  • Multi-stage Learning: Includes stages from elementary school math to university-level calculus.
  • Dynamic Difficulty Adjustment: Automatically adjusts problem difficulty based on the model's performance.
  • External Memory Mechanism: Implements an advanced external memory system to enhance the model's ability to retain and apply learned concepts. This includes:
    • Conceptual Memory
    • Short-term Memory
    • Long-term Memory
    • Inference Memory
    • Integrated Memory System
  • Symbolic Reasoning: Incorporates a symbolic reasoning component to handle abstract mathematical concepts.
  • Parallel Processing: Utilizes parallel processing for efficient training and reinforcement learning.
  • Real-time Visualization: Provides real-time plotting of learning curves and performance metrics.

Usage Examples

Example 1: Basic Arithmetic Problem

from Kistmath_AI.models.kistmat_ai import Kistmat_AI
from Kistmath_AI.utils.tokenization import tokenize_problem

# Initialize the model
model = Kistmat_AI(input_shape=(100,), output_shape=(1,))

# Tokenize a basic arithmetic problem
problem = "2 + 2"
tokenized_problem = tokenize_problem(problem, vocab_size=1000, max_length=100)

# Predict the solution
solution = model(tokenized_problem)
print(f"Problem: {problem}, Solution: {solution}")

Example 2: Advanced Calculus Problem

from Kistmath_AI.models.kistmat_ai import Kistmat_AI
from Kistmath_AI.utils.tokenization import tokenize_calculus_problem

# Initialize the model
model = Kistmat_AI(input_shape=(100,), output_shape=(1,))

# Tokenize a calculus problem
problem = "integrate x^2 from 0 to 1"
tokenized_problem = tokenize_calculus_problem(problem, max_terms=10)

# Predict the solution
solution = model(tokenized_problem)
print(f"Problem: {problem}, Solution: {solution}")

Technical Details

  • Framework: TensorFlow and Keras
  • Architecture: Custom LSTM-based neural network with attention mechanisms
  • Additional Libraries: NumPy, SymPy, Scikit-learn, Matplotlib

Project Structure and Design Principles

  • Modular Design: Functions are defined outside the main function to promote reusability and modularity.
  • Readability and Maintainability: Code is structured for easy reading and maintenance, with each function being independently understandable and modifiable.
  • Global Scope: Module-level function definitions ensure accessibility throughout the code.
  • Logical Organization: Functions are grouped by purpose (e.g., utility functions, training functions, visualization functions).
  • Test-Friendly: Independent functions facilitate unit testing.
  • Flexibility: The structure allows for easy execution of specific parts of the code independently.
  • Component Division: The code is divided into multiple files, improving organization and maintainability.

Project Structure

KistMath_AI/
│
├── config/
│   ├── __init__.py
│   ├── settings.py
│   └── tensorflow_config.py
│
├── models/
│   ├── __init__.py
│   ├── kistmat_ai.py
│   ├── symbolic_reasoning.py
│   ├── external_memory.py
│   └── external_memory/
│       ├── __init__.py
│       ├── base.py
│       ├── btree.py
│       ├── conceptual_memory.py
│       ├── external_memory.py
│       ├── formulative_memory.py
│       ├── inference_memory.py
│       ├── integrated_memory_system.py
│       ├── long_term_memory.py
│       └── short_term_memory.py
│
├── training/
│   ├── __init__.py
│   ├── curriculum_learning.py
│   ├── parallel_training.py
│   └── training_pipeline.py
│
├── utils/
│   ├── __init__.py
│   ├── data_generation.py
│   ├── data_preparation.py
│   ├── evaluation.py
│   ├── logging.py
│   └── tokenization.py
│
├── visualization/
│   ├── __init__.py
│   ├── plotting.py
│   ├── real_time_plotter.py
│   └── visualization_process.py
│
├── test/
│   ├── __init__.py
│   ├── test_a.py
│   ├── test_curriculum_learning_and_data_generation.py
│   ├── test_kistmat_ai.py
│   ├── test_symbolic_reasoning.py
│   ├── test_utils.py
│   └── external_memory_tests/
│       ├── test_btree.py
│       ├── test_conceptual_memory.py
│       ├── test_external_memory.py
│       ├── test_formulative_memory.py
│       ├── test_inference_memory.py
│       ├── test_integrated_memory_system.py
│       ├── test_long_term_memory.py
│       └── test_short_term_memory.py
│
├── main.py
├── README.md
└── requirements.txt

Computational Complexity

High Complexity Areas:

  • Neural network model (Kistmat-AI)
  • Parallel processing implementation
  • Symbolic reasoning component

Medium Complexity Areas:

  • Curriculum learning process
  • Data generation and preprocessing

Computationally Intensive Operations:

  • Model training and inference
  • Parallel reinforcement learning
  • Real-time visualization for large datasets

Current Limitations and Areas for Improvement

  • Model Architecture: The current architecture may be overly complex for simpler mathematical tasks.
  • Memory Usage: The external memory mechanism could be optimized for more efficient use of computational resources.
  • Scalability: The current implementation may face challenges with extremely large datasets or very complex mathematical problems.
  • Visualization: Real-time plotting may become a bottleneck with very large datasets.

Current Priorities and Planned Improvements

Enhancement of Reasoning and Concept Understanding

  • Implement advanced natural language processing techniques for better comprehension of mathematical problems.
  • Develop a more robust symbolic reasoning module to handle abstract mathematical concepts.

Optimization of Data Loading and Problem Generation

  • Implement a DynamicProblemGenerator to generate problems on-demand, improving memory efficiency.
  • Create create_dynamic_dataset to integrate dynamic problem generation with tf.data.
  • Modify smooth_curriculum_learning to use dynamic datasets, allowing real-time adjustments of difficulty.

Improvement of Memory Usage (High Priority)

  • Optimize the external memory mechanism for more efficient use of computational resources.
  • Implement pruning and quantization techniques to reduce model size without significantly sacrificing performance.
  • Develop a more sophisticated memory management system to handle complex mathematical concepts efficiently.

Training Methodology Enhancement (High Priority)

  • Implement advanced curriculum learning strategies with dynamic difficulty adjustment.
  • Develop a hybrid training approach combining supervised learning with reinforcement learning for problem-solving strategies.
  • Introduce meta-learning techniques to improve the model's ability to learn new mathematical concepts quickly.

Model Usage on CPU and GPU

High Priority: System Resource Utilization

  • This model is designed to fully utilize system resources, whether on CPU or GPU. Depending on the available hardware, the model can efficiently operate on both types of processors.

GPU Detection and Usage:

  • If a GPU is detected, the model will automatically utilize the GPU for compute-intensive operations.
  • The implementation is optimized to take full advantage of the GPU's capabilities, ensuring fast and efficient execution.

CPU Usage:

  • In the absence of a GPU, the model will run on the CPU, using all available cores to maximize efficiency.
  • The configuration allows the model to fully leverage the CPU's processing power, distributing tasks optimally.

Compatibility between CPU and GPU:

  • The code is designed to be fully compatible with both types of hardware. This means that if the model is developed or trained on a CPU, it can be transferred and executed on a GPU without significant modifications, and vice versa.
  • This flexibility ensures that the model can be deployed in a variety of environments, adapting to the available hardware capabilities.

Code Modularization and Maintenance Improvement (High Priority)

  • Refactor the codebase into smaller, more manageable components.
  • Create separate modules for problem generation, model architecture, training loops, and evaluation metrics.
  • Implement a plugin architecture to allow easy addition of new mathematical concepts and problem types.

Expansion to Visual Tasks

  • Implement a Convolutional Neural Network (CNN) for processing mathematical image tasks.
  • Develop methods to extract and analyze activations from intermediate CNN layers.
  • Create a sparse autoencoder to decompose activations and identify visual patterns in mathematical notations.

Advanced Pattern Recognition

  • Implement visual attention techniques to identify key elements in visually presented mathematical problems.
  • Develop a mathematical symbol recognition system to interpret handwritten equations.

Model Behavior Manipulation

  • Experiment with artificial modification of activations to alter model behavior in problem-solving.
  • Develop methods to control model perception by manipulating specific components.

Enhanced Visualization

  • Create advanced techniques to visualize learned concepts across different mathematical domains.
  • Implement tools for visualizing "polysemantic neurons" in mathematical contexts.

Interpretability Enhancements

  • Develop interpretable regularization techniques.
  • Implement mechanisms to track neuron evolution during training.
  • Create tools for gradient analysis to better understand feature importance in problem-solving.

Robustness Testing

  • Develop a suite of tests to evaluate model robustness against various types of manipulations.

Technical Considerations

  • Current CPU Focus: The code is currently optimized to work primarily on CPU, with future plans for GPU optimization.
  • Reproducibility: Random seeds will be implemented to ensure reproducibility of generated problems.
  • Custom Callbacks: Custom callbacks will be developed to handle real-time visualization and other specific functionalities.
  • Dynamic Evaluation: Model readiness evaluation will be adapted to work with dynamic datasets.
  • Optimized Parallelism: Parallel training will be adjusted to make the most of available computational resources.

Practical Applications

Explore applications in:

  • Visual signal detection
  • Robustness against adversarial attacks
  • Image editing in mathematical contexts

Ethical Considerations

  • Establish comprehensive evaluation metrics.
  • Conduct thorough analysis of the ethical implications of model manipulations and interpretations.

Needed Improvements

1. Documentation and Comments

  • Add docstrings to classes and methods
  • Include inline comments for complex logic
  • Explain algorithms and important decisions

2. Unit Tests

  • Create a tests/ directory
  • Implement tests for key functions and methods using pytest or unittest
  • Cover edge cases and typical scenarios

3. Refactoring and Modularization

  • Separate code into logical modules (e.g., model.py, data_generation.py, training.py)
  • Improve code organization and separation of concerns

4. Error Handling

  • Implement exception handling in functions prone to errors
  • Use try-except blocks and raise custom exceptions when necessary

5. Code Optimization

  • Review and optimize performance-critical sections
  • Consider using vectorized operations where applicable

Usage

To install the required dependencies, run:

pip install -r requirements.txt

This will install all necessary packages, including:

  • tensorflow==2.16.1
  • tensorflow-model-optimization==0.8.0
  • scikit-learn==1.3.2
  • tf-keras==2.16.0
  • keras-tuner==1.4.7
  • keras==3.3.3
  • torch
  • [Any additional dependencies listed in requirements.txt]

Contributing

Contributions to improve Kistmat-AI are welcome. Please follow these steps to contribute:

  1. Fork the repository
  2. Create a new branch (git checkout -b feature-branch)
  3. Make your changes and commit them (git commit -am 'Add some feature')
  4. Push to the branch (git push origin feature-branch)
  5. Create a new Pull Request

Please ensure your code adheres to the project's coding standards and include appropriate tests for new features.

Testing

To run the tests, including the new external memory tests, use the following command:

python -m pytest

This will run all tests, including those in the test/external_memory_tests/ directory.

Configuration

The project now includes configuration files in the config/ directory:

  • settings.py: Contains general project settings.
  • tensorflow_config.py: Specific configurations for TensorFlow.

Make sure to review and adjust these configuration files as needed before running the project.