remote-dataset

A PyTorch dataset that lets you iterate over data stored on a remote machine without first copying all of it to the local machine.

Dataset class to access remote data

Installation

To install the latest version from PyPI, use:

$ pip install remtorch

Example of usage

import io
from PIL import Image
import numpy as np
import torch
from remtorch import RemoteDataset


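class ImageDataset(RemoteDataset):
    def prepare_item(self, item):
        # item is the raw content of one remote file; wrap it in a
        # file-like buffer so PIL can decode it
        buf = io.BytesIO(item)
        img = Image.open(buf)
        # convert the decoded image to a NumPy array
        return np.array(img)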

batchsize = 32  # example value; pick one that suits your data

ds = ImageDataset(
    'servername',
    'username',
    'password',
    '/path/to/files',
    batchsize
)

dl = torch.utils.data.DataLoader(ds, batchsize)
for img in dl:
    pass  # process the batch here
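
The decoding step inside prepare_item can be tried locally, without a server. The sketch below assumes, as the example above suggests, that each item arrives as the encoded bytes of one image file. The helper names decode_image_bytes and encode_test_image are hypothetical and are not part of remtorch; the second one only builds a small in-memory PNG to stand in for a remote file.

```python
import io

import numpy as np
from PIL import Image


def decode_image_bytes(raw: bytes) -> np.ndarray:
    """Decode encoded image bytes (PNG, JPEG, ...) into a NumPy array."""
    with Image.open(io.BytesIO(raw)) as img:
        # np.array triggers the actual decode and copies the pixels out
        return np.array(img)


def encode_test_image(width: int = 4, height: int = 3) -> bytes:
    """Build a small in-memory RGB PNG (hypothetical stand-in for a remote file)."""
    pixels = np.zeros((height, width, 3), dtype=np.uint8)
    pixels[..., 0] = 255  # set the red channel of every pixel
    buf = io.BytesIO()
    Image.fromarray(pixels).save(buf, format="PNG")
    return buf.getvalue()


arr = decode_image_bytes(encode_test_image())
print(arr.shape, arr.dtype)
```

Keeping the decoding in a plain function like this makes it easy to test on its own; a prepare_item override could then simply return decode_image_bytes(item).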