Python toolkit for standardized model hosting container implementations with Amazon SageMaker integration.
This repository provides a Python toolkit that integrates TensorRT-LLM and vLLM with the Amazon SageMaker hosting platform for efficient model deployment and inference.
ModelHostingContainerStandards/
├── python/  # Python implementation
│   ├── model_hosting_container_standards/  # Main Python package
│   │   ├── __init__.py
│   │   ├── config.py
│   │   ├── logging_config.py
│   │   ├── utils.py
│   │   ├── common/  # Common utilities
│   │   │   ├── fastapi/  # FastAPI integration
│   │   │   ├── custom_code_ref_resolver/  # Dynamic code loading
│   │   │   └── handler/  # Handler specifications
│   │   └── sagemaker/  # SageMaker integration
│   │       └── lora/  # LoRA adapter support
│   ├── tests/  # Package tests
│   ├── examples/  # Python examples and demos
│   ├── pyproject.toml  # Python project configuration
│   ├── Makefile  # Build automation
│   └── README.md  # Python-specific documentation
├── docs/  # Documentation
├── examples/  # Top-level examples
├── .github/  # GitHub templates and workflows
├── Config  # Shared configuration files
└── README.md  # This file
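
The tree labels `custom_code_ref_resolver/` as the place for dynamic code loading. As a rough illustration of that general idea only, the sketch below resolves a `module:attribute` string to a callable using the standard library. The function name `resolve_code_ref` and the `module:attribute` string format are assumptions made for this example; they are not taken from the package and should not be read as its API.

```python
import importlib
from typing import Callable


def resolve_code_ref(ref: str) -> Callable:
    """Resolve a 'package.module:attribute' string to a callable.

    Hypothetical helper for illustration; the real resolver in
    custom_code_ref_resolver/ may use a different format and interface.
    """
    module_name, sep, attr_path = ref.partition(":")
    if not sep or not module_name or not attr_path:
        raise ValueError(f"expected 'module:attribute', got {ref!r}")

    # Import the module at call time rather than at program start.
    obj = importlib.import_module(module_name)

    # Walk dotted attribute paths such as 'JSONEncoder.encode'.
    for part in attr_path.split("."):
        obj = getattr(obj, part)

    if not callable(obj):
        raise TypeError(f"{ref!r} does not resolve to a callable")
    return obj


if __name__ == "__main__":
    # Load json.dumps by name and call it.
    dumps = resolve_code_ref("json:dumps")
    print(dumps({"ok": True}))
```

Loading a handler by name like this lets a container pick up user-supplied code from configuration without importing it statically; see the Python README for how the package itself handles this.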
cd python
poetry install
poetry shell

See the Python README for detailed usage instructions, examples, and development workflow.
When contributing to this repository:
- Place Python-specific code in the python/ directory
- Follow the established patterns for project structure
- Include tests for new functionality
- Update documentation as needed
- Run pre-commit hooks to ensure code quality
See CONTRIBUTING for more information.
This project is licensed under the Apache-2.0 License.