Core functionality for machine learning provenance tracking with C2PA (Coalition for Content Provenance and Authenticity) support.
Atlas Common provides essential building blocks for creating content authenticity systems that track the provenance of machine learning models, datasets, and related assets throughout their lifecycle.
- 🔐 Cryptographic Hashing: SHA-256/384/512 with constant-time comparison
- 📋 C2PA Metadata: Types and utilities for C2PA manifest management
- 💾 Storage Abstractions: Backend-agnostic storage interfaces
- ✅ Validation: Validation for manifests, URNs, and hashes
- 🛡️ Secure File Operations: Protection against symlink and hardlink attacks
- ⚡ Async Support: Optional async/await for storage operations
Add this to your Cargo.toml:
[dependencies]
atlas-common = "0.1.0"
- hash (default): Cryptographic hash functions
- c2pa (default): C2PA manifest and asset types
- storage: Storage backend abstractions
- validation: Validation utilities
- file-utils: Secure file operation utilities
- async: Async support for storage operations
- full: Enable all features
To use specific features:
[dependencies]
atlas-common = { version = "0.1.0", features = ["full"] }
use atlas_common::hash::{calculate_hash, calculate_hash_with_algorithm, verify_hash, HashAlgorithm};
// Calculate hash with default algorithm (SHA-384)
let data = b"important data";
let hash = calculate_hash(data);
// Verify hash
assert!(verify_hash(data, &hash));
// Use specific algorithm
let sha256_hash = calculate_hash_with_algorithm(data, &HashAlgorithm::Sha256);
// Hardware-optimized hashing for large data
let optimized_hash = data.hash_optimized(HashAlgorithm::Sha384);
Atlas Common includes hardware-optimized hashing implementations that automatically detect and utilize available CPU features:
- Intel Xeon: SHA-NI extensions and AVX-512 parallel processing
- Apple Silicon: ARM crypto extensions
- Multi-core systems: Parallel processing for large datasets
Optimizations are automatically selected at runtime based on available hardware and data size. Use the hash_optimized() methods or the BatchHasher for optimal performance with large files or multiple inputs.
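As a rough illustration of runtime selection, the sketch below picks a hashing strategy from CPU features, core count, and input size using only the standard library. The `HashStrategy` enum, the 8 MiB threshold, and all function names are assumptions made for this example; they are not taken from the crate, whose internal selection logic may differ.

```rust
use std::thread;

/// Hypothetical strategies, named for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum HashStrategy {
    Portable,
    HardwareSha,
    Parallel,
}

/// Assumed size above which splitting work across cores is worthwhile.
const PARALLEL_THRESHOLD: usize = 8 * 1024 * 1024;

/// Reports whether the CPU advertises SHA instructions (x86 variant).
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn has_sha_extensions() -> bool {
    std::arch::is_x86_feature_detected!("sha")
}

/// Reports whether the CPU advertises SHA instructions (AArch64 variant).
#[cfg(target_arch = "aarch64")]
fn has_sha_extensions() -> bool {
    std::arch::is_aarch64_feature_detected!("sha2")
}

/// Fallback for architectures without a detection macro.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64", target_arch = "aarch64")))]
fn has_sha_extensions() -> bool {
    false
}

/// Pure decision function, kept separate so it can be tested deterministically.
fn choose_strategy(data_len: usize, cores: usize, has_sha: bool) -> HashStrategy {
    if data_len >= PARALLEL_THRESHOLD && cores > 1 {
        HashStrategy::Parallel
    } else if has_sha {
        HashStrategy::HardwareSha
    } else {
        HashStrategy::Portable
    }
}

/// Queries the running machine and delegates to `choose_strategy`.
fn detect_strategy(data_len: usize) -> HashStrategy {
    let cores = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    choose_strategy(data_len, cores, has_sha_extensions())
}
```

Calling `detect_strategy(data.len())` once per input keeps the detection cost negligible next to the hashing itself.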
use atlas_common::c2pa::{ManifestId, ManifestMetadata, ManifestType, DateTimeWrapper};
use atlas_common::hash::calculate_hash;
// Create a manifest ID
let manifest_id = ManifestId::new();
println!("URN: {}", manifest_id.as_urn());
// Create manifest metadata
let metadata = ManifestMetadata {
id: manifest_id.as_urn().to_string(),
name: "GPT-2 Fine-tuned Model".to_string(),
manifest_type: ManifestType::Model,
created_at: DateTimeWrapper::now_utc().to_rfc3339(),
hash: Some(calculate_hash(b"model data")),
size: Some(1024 * 1024 * 50), // 50 MB
version: Some("1.0.0".to_string()),
};
use atlas_common::c2pa::{determine_asset_type, AssetKind};
use std::path::Path;
let model_path = Path::new("model.onnx");
let asset_type = determine_asset_type(model_path, AssetKind::Model)?;
// Returns AssetType::ModelOnnx
use atlas_common::file::{safe_create_file, safe_open_file};
use std::io::{Read, Write};
use std::path::Path;
// Safely create a file (blocks symlink attacks)
let mut file = safe_create_file(Path::new("output.txt"), false)?;
file.write_all(b"secure data")?;
// Safely read a file
let mut file = safe_open_file(Path::new("input.txt"), false)?;
let mut contents = String::new();
file.read_to_string(&mut contents)?;
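To make the symlink and hardlink protection more concrete, here is a minimal standard-library sketch of the kind of check such a helper might perform before opening a file. The function name `open_refusing_symlinks` is hypothetical, and reading the boolean as "allow symlinks" is an assumption about what the second argument in the calls above means. This check-then-open approach also leaves a small race window, so a hardened implementation would likely rely on OS-level flags instead.

```rust
use std::fs::{self, File, OpenOptions};
use std::io;
use std::path::Path;

/// Opens `path` for reading, refusing symlinks unless `allow_symlinks` is true.
/// Hypothetical helper; not the crate's implementation.
fn open_refusing_symlinks(path: &Path, allow_symlinks: bool) -> io::Result<File> {
    // symlink_metadata inspects the link itself rather than its target.
    let meta = fs::symlink_metadata(path)?;
    if meta.file_type().is_symlink() && !allow_symlinks {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "refusing to follow symlink",
        ));
    }

    // On Unix, a link count above 1 means another name refers to the same inode.
    #[cfg(unix)]
    {
        use std::os::unix::fs::MetadataExt;
        if meta.is_file() && meta.nlink() > 1 {
            return Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "refusing hard-linked file",
            ));
        }
    }

    OpenOptions::new().read(true).open(path)
}
```

The sketch returns `io::ErrorKind::InvalidInput` for rejected paths so callers can tell a policy refusal apart from a missing file.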
use atlas_common::validation::{validate_manifest_id, ensure_c2pa_urn};
// Validate a manifest ID
validate_manifest_id("urn:c2pa:123e4567-e89b-12d3-a456-426614174000")?;
// Ensure proper URN format
let urn = ensure_c2pa_urn("my-custom-id");
assert!(urn.starts_with("urn:c2pa:"));
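The following sketch shows one way the two operations above could behave, using only the standard library. It assumes that a valid manifest ID is exactly `urn:c2pa:` followed by a textual UUID and that "ensuring" a URN means adding the prefix when it is missing. The function names are hypothetical and the crate's real rules may be broader.

```rust
const C2PA_PREFIX: &str = "urn:c2pa:";

/// Adds the `urn:c2pa:` prefix unless `id` already carries it.
fn ensure_prefix(id: &str) -> String {
    if id.starts_with(C2PA_PREFIX) {
        id.to_string()
    } else {
        format!("{}{}", C2PA_PREFIX, id)
    }
}

/// Checks the 8-4-4-4-12 hexadecimal layout of a textual UUID.
fn is_uuid(s: &str) -> bool {
    let expected = [8usize, 4, 4, 4, 12];
    let groups: Vec<&str> = s.split('-').collect();
    groups.len() == expected.len()
        && groups
            .iter()
            .zip(expected.iter())
            .all(|(g, len)| g.len() == *len && g.bytes().all(|b| b.is_ascii_hexdigit()))
}

/// Accepts only `urn:c2pa:<uuid>` (assumed rule for this sketch).
fn is_valid_manifest_urn(urn: &str) -> bool {
    urn.strip_prefix(C2PA_PREFIX).map_or(false, is_uuid)
}
```

Under these assumptions, an ID such as `my-custom-id` gains the prefix but would still fail strict validation, which is why the two operations are kept separate.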
use atlas_common::hash::{HashBuilder, HashAlgorithm};
let mut builder = HashBuilder::new(HashAlgorithm::Sha256);
builder.update(b"chunk1");
builder.update(b"chunk2");
builder.update(b"chunk3");
let hash = builder.finalize();
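The incremental pattern matters most when the data does not fit in memory. The sketch below illustrates it with a hand-written FNV-1a hasher, which is a non-cryptographic stand-in chosen only so the example runs without external crates; it is not what the crate uses. `StreamingFnv` and `hash_reader` are hypothetical names.

```rust
use std::io::{self, Read};

/// Non-cryptographic FNV-1a (64-bit) used purely as a stand-in digest.
struct StreamingFnv {
    state: u64,
}

impl StreamingFnv {
    fn new() -> Self {
        Self { state: 0xcbf29ce484222325 } // FNV-1a 64-bit offset basis
    }

    /// Folds one chunk into the running state.
    fn update(&mut self, chunk: &[u8]) {
        for &byte in chunk {
            self.state ^= u64::from(byte);
            self.state = self.state.wrapping_mul(0x100000001b3); // FNV prime
        }
    }

    /// Consumes the hasher and returns the digest as lowercase hex.
    fn finalize(self) -> String {
        format!("{:016x}", self.state)
    }
}

/// Hashes any reader in fixed-size chunks, so memory use stays constant.
fn hash_reader<R: Read>(mut reader: R) -> io::Result<String> {
    let mut hasher = StreamingFnv::new();
    let mut buf = [0u8; 4096];
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break; // end of input
        }
        hasher.update(&buf[..n]);
    }
    Ok(hasher.finalize())
}
```

Feeding the same bytes in one call or in several chunks yields the same digest, which is the property an incremental builder is expected to preserve.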
use atlas_common::hash::{Hasher, HashAlgorithm};
let text = "Hello, World!";
let hash = text.hash(HashAlgorithm::Sha512);
let bytes = b"raw bytes";
let hash2 = bytes.hash_default(); // Uses SHA-384
use atlas_common::storage::{StorageConfig, StorageType};
let config = StorageConfig {
storage_type: StorageType::S3,
url: Some("s3://my-bucket/manifests".to_string()),
..Default::default()
};
The repository includes several examples demonstrating various features:
- basic_hashing - Hash operations and verification
- c2pa_manifest - Working with C2PA manifests
- full_example - Complete demonstration of all features
Run examples with:
cargo run --example basic_hashing --features hash
cargo run --example c2pa_manifest --features c2pa
cargo run --example full_example --features full
Performance benchmarks are available for hash operations:
cargo bench --features hash
- Constant-time comparison: Hash verification uses constant-time comparison to prevent timing attacks
- Path validation: File operations validate paths to prevent symlink and hardlink attacks
- Input validation: All inputs are validated to prevent injection attacks
- Secure defaults: SHA-384 is the default hash algorithm for optimal security/performance balance
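For readers unfamiliar with constant-time comparison, the idea can be sketched in a few lines of standard-library Rust: every byte pair is examined regardless of where the first mismatch occurs, so timing does not reveal how much of a digest matched. `constant_time_eq` is a hypothetical name, and this is not the crate's implementation. An optimizing compiler is also free to transform such loops, so production code usually relies on a vetted crate such as `subtle`.

```rust
/// Compares two byte slices without returning early on the first mismatch.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    // Digest lengths are public, so an early exit on length leaks nothing secret.
    if a.len() != b.len() {
        return false;
    }
    // Accumulate the XOR of every byte pair; any difference sets at least one bit.
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}
```

A plain `==` on slices may stop at the first differing byte, which is the behavior this pattern is meant to avoid.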
Supported model formats:
- TensorFlow: .pb, .savedmodel, .tf
- PyTorch: .pt, .pth, .pytorch
- ONNX: .onnx
- OpenVINO: .bin, .xml
- Keras/HDF5: .h5, .keras, .hdf5
Supported dataset formats:
- Tabular: .csv, .tsv, .txt
- JSON: .json, .jsonl
- Big Data: .parquet, .orc, .avro
- TensorFlow: .tfrecord, .tfrec
- NumPy: .npy, .npz
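The extension lists above can be read as a lookup table. Because TensorFlow appears in both lists, and extensions such as .bin or .txt are generic, a lookup presumably needs to know whether the caller expects a model or a dataset. The sketch below encodes the lists that way. The `Kind` enum, the `format_family` function, and the case-insensitive matching are assumptions for illustration and do not reproduce the crate's `determine_asset_type`.

```rust
use std::path::Path;

/// Hypothetical stand-in for the asset kind a caller expects.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Model,
    Dataset,
}

/// Maps a file extension to the format family named in the lists above.
fn format_family(path: &Path, kind: Kind) -> Option<&'static str> {
    // Lowercase the extension so "MODEL.ONNX" matches too (an assumption).
    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
    let family = match (kind, ext.as_str()) {
        (Kind::Model, "pb" | "savedmodel" | "tf") => "TensorFlow",
        (Kind::Model, "pt" | "pth" | "pytorch") => "PyTorch",
        (Kind::Model, "onnx") => "ONNX",
        (Kind::Model, "bin" | "xml") => "OpenVINO",
        (Kind::Model, "h5" | "keras" | "hdf5") => "Keras/HDF5",
        (Kind::Dataset, "csv" | "tsv" | "txt") => "Tabular",
        (Kind::Dataset, "json" | "jsonl") => "JSON",
        (Kind::Dataset, "parquet" | "orc" | "avro") => "Big Data",
        (Kind::Dataset, "tfrecord" | "tfrec") => "TensorFlow",
        (Kind::Dataset, "npy" | "npz") => "NumPy",
        _ => return None, // unknown extension, or wrong kind for this extension
    };
    Some(family)
}
```

Returning `None` for a known extension paired with the wrong kind keeps a dataset file from being silently recorded as a model.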
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request