Joader

Primary language: Rust. License: MIT.

About The Project

Joader is a dataloader that serves multiple deep learning training jobs. Its goal is to avoid redundant data preparation work across jobs and boost training efficiency.
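To make the idea concrete, here is a toy sketch of why sharing a loader across jobs can save work. It is an illustration only, not Joader's actual implementation: `SharedLoader`, `expensive_prepare`, and `run_jobs` are hypothetical names, and the in-memory cache stands in for whatever sharing strategy the real server uses.

```python
from typing import Callable, Dict, List


class SharedLoader:
    """Toy in-process stand-in for a shared data-loading service (hypothetical)."""

    def __init__(self, prepare: Callable[[int], bytes]) -> None:
        self._prepare = prepare
        self._cache: Dict[int, bytes] = {}
        self.prepare_calls = 0  # counts how often the expensive step really runs

    def get(self, key: int) -> bytes:
        # Prepare each item once; later requests from any job reuse the result.
        if key not in self._cache:
            self.prepare_calls += 1
            self._cache[key] = self._prepare(key)
        return self._cache[key]


def expensive_prepare(key: int) -> bytes:
    # Stand-in for reading, decoding, and augmenting one sample.
    return f"sample-{key}".encode()


def run_jobs(num_jobs: int, keys: List[int]) -> int:
    """Simulate several jobs reading the same keys through one shared loader."""
    loader = SharedLoader(expensive_prepare)
    for _ in range(num_jobs):
        for k in keys:
            loader.get(k)
    return loader.prepare_calls


if __name__ == "__main__":
    # Three jobs over four keys: prints the number of preparations performed.
    print(run_jobs(num_jobs=3, keys=[0, 1, 2, 3]))
```

In this toy setup, three jobs with independent loaders would each prepare every key, while the shared loader prepares each key only once.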

Built With

Getting Started

To get a local copy up and running, follow these steps.

Prerequisites

Install packages for server

Install packages for client

pip install -r requirements.txt 

Installation

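Build the server in release mode. The quick start below expects the binary under server/target/release, so run this from the server directory.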

  cargo build --release

Quick start

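  1. Run the server
./server/target/release/joader
  2. Create a dataset with some keys and conditions
import grpc
from dataset.dataset import Dataset as JDataset, DatasetType
# name, location, and keys are placeholders for your dataset name, LMDB path, and item keys
channel = grpc.insecure_channel('127.0.0.1:4321')
ds = JDataset(name=name, location=location, ty=DatasetType.LMDB)
for k in keys:
    ds.add_item([str(k).encode()])  # each item is a list of byte-string keys
ds.create(channel)
channel.close()
  3. Register the job for loading data and read data
# Job comes from the Joader client package; see client/test for the import
job = Job.new(dataset_name, name='job', ip='127.0.0.1:4321')
for _ in range(dataset_len):
    data = job.next()  # assumes next is a method that returns the next sample
  4. Train the model with PyTorch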
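The last step is left open, so here is a minimal sketch of how samples pulled from a job could be grouped into batches for a training loop. It assumes the job exposes a `next()` method that returns one sample per call. `FakeJob`, `iter_job`, and `batched` are hypothetical helpers, not part of the Joader client, and `FakeJob` exists only so the sketch runs without a server.

```python
from typing import Any, Iterator, List


class FakeJob:
    """Hypothetical stand-in for a Joader job so the sketch runs without a server."""

    def __init__(self, items: List[Any]) -> None:
        self._items = iter(items)

    def next(self) -> Any:
        return next(self._items)


def iter_job(job: Any, length: int) -> Iterator[Any]:
    """Yield `length` samples by calling job.next() repeatedly."""
    for _ in range(length):
        yield job.next()


def batched(samples: Iterator[Any], batch_size: int) -> Iterator[List[Any]]:
    """Group a stream of samples into lists of at most `batch_size` items."""
    batch: List[Any] = []
    for sample in samples:
        batch.append(sample)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch


if __name__ == "__main__":
    job = FakeJob(list(range(5)))
    for batch in batched(iter_job(job, 5), batch_size=2):
        # In a real loop, convert the batch to tensors and run a training step here.
        print(batch)
```

With a real job, replace `FakeJob` with the object returned by `Job.new(...)`. How each sample is decoded into tensors depends on what the dataset stores, which this sketch does not assume.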

More examples are in client/test, and unit tests are in each source file under server.

License

Distributed under the MIT License. See LICENSE.txt for more information.