Portfolio is an experimental ML model serving system that demonstrates caching and memory management patterns using a Least Recently Used (LRU) eviction strategy.
- LRU-based model caching with configurable memory limits
- Support for PyTorch and TensorFlow models
- RESTful API with FastAPI
- Configurable model loading/unloading
- Basic metrics and monitoring
- Memory usage tracking
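The LRU caching pattern listed above can be sketched in a few lines. This is a minimal illustration of memory-bounded LRU eviction, not Portfolio's actual implementation; the class and method names are hypothetical.

```python
from collections import OrderedDict

class LRUModelCache:
    """Illustrative memory-bounded LRU cache (hypothetical API,
    not Portfolio's real classes)."""

    def __init__(self, max_memory_bytes):
        self.max_memory = max_memory_bytes
        self.used = 0
        self._entries = OrderedDict()  # name -> (model, size_bytes)

    def put(self, name, model, size_bytes):
        if name in self._entries:
            self.used -= self._entries.pop(name)[1]
        self._entries[name] = (model, size_bytes)
        self.used += size_bytes
        # Evict least-recently-used entries until back under budget,
        # but never evict the entry we just inserted.
        while self.used > self.max_memory and len(self._entries) > 1:
            _, (_, evicted_size) = self._entries.popitem(last=False)
            self.used -= evicted_size

    def get(self, name):
        if name not in self._entries:
            return None
        self._entries.move_to_end(name)  # mark as most recently used
        return self._entries[name][0]
```

`OrderedDict` keeps insertion order, so the front of the dict is always the least recently used entry; a `get` simply moves the key to the back.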
- Python 3.9+
- pip
- virtualenv (recommended)
```bash
# Create and activate virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Install in development mode
pip install -e .
```
```bash
# Configure environment
export PORTFOLIO_ENV=development
export PORTFOLIO_CONFIG_PATH=config/development/config.yaml

# Run server
python -m portfolio.main
```
```bash
# Create example model
python examples/create_model.py

# Test inference
python examples/test_inference.py

# Check system status
python examples/test_system_status.py
```
```bash
# Run all tests
pytest tests/

# Run specific test categories
pytest tests/unit/
pytest tests/integration/

# Run with coverage
pytest --cov=src tests/
```
```bash
# Format code
make format

# Run linters
make lint

# Run all checks
make check
```
API documentation is available at http://localhost:8000/docs while the server is running.
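Once the server is up, you can exercise it from Python. The route and payload below are purely illustrative assumptions, not Portfolio's documented API; check http://localhost:8000/docs for the actual endpoints and schemas.

```python
import json
import urllib.request

# Hypothetical inference route and request body -- consult /docs
# for the real ones exposed by your Portfolio deployment.
payload = json.dumps({"inputs": [[1.0, 2.0, 3.0]]}).encode()
req = urllib.request.Request(
    "http://localhost:8000/models/example/predict",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)
# With the server running, send it with:
#   urllib.request.urlopen(req).read()
```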
For detailed implementation documentation, see the docs/ directory.
See config/development/config.yaml for available configuration options:

```yaml
models:
  model_name:
    path: "models/model.pt"
    type: "pytorch"
    memory_estimate: "1GB"
    preload: true

cache:
  max_memory: "8GB"
  soft_limit: "6GB"
  ttl: 3600
```
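Size values such as "8GB" or "1GB" above have to be converted to bytes somewhere. As a sketch only (Portfolio's real parsing may differ, and `parse_memory` is a hypothetical helper name), one way to handle such strings:

```python
import re

# Binary units, assuming "GB" means GiB-style powers of 1024.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}

def parse_memory(value: str) -> int:
    """Convert a size string such as '6GB' or '512MB' to bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*([KMG]?B)",
                         value.strip(), re.IGNORECASE)
    if not match:
        raise ValueError(f"unrecognized memory size: {value!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit.upper()])
```

Parsing both `max_memory` and `soft_limit` this way lets the cache compare its running byte total directly against the configured limits.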
- Fork the repository
- Create your feature branch
- Write tests for new functionality
- Ensure all tests pass
- Submit a pull request
MIT