aaron-xichen/pytorch-playground

Encounter "Memory Error" when converting imagenet dataset

jeff830107 opened this issue · 4 comments

Hi,
When I tried to use the AlexNet model, I first followed your instructions to download val224_compressed.pkl and ran `python convert.py`.
But the conversion always fails with the error message "Memory Error".
I am curious how to deal with this issue, since I think the memory of the machine I used (64 GB) should be big enough.
Thanks!

I ran into the same problem as well: the 224×224 file was dumped okay at 7.5 GB, but the 299×299 pkl file ended up empty (0 B).

I saw the same issue. I separated the 224 and 299 dump processing loops and cleared variables that were no longer needed. It still died in dump_pickle, which must be making yet another copy.
So I looked around and found that joblib (the library scikit-learn uses) has a joblib.dump that can replace pkl.dump in dump_pickle, and it doesn't use as much memory while writing out the files; a sketch of that swap is below.
I think you'll still need to separate the 224 and 299 processing, as mine ran out of 32 GB of memory while doing a transpose: too many copies of the same data in flight. With joblib, memory use peaks around 27 GB with no error. This could probably use a database instead of holding all the image data in one dict.
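
For reference, here is a minimal sketch of that swap, assuming a dump_pickle-style helper with the same (obj, path) calling convention as the one in utee/misc (the exact signature there is my assumption):

```python
import time
import joblib

# Sketch of a dump_pickle replacement that writes with joblib instead of pickle.
def dump_pickle(obj, path):
    print('Dumping object to {}'.format(path))
    t0 = time.time()
    # joblib writes numpy array buffers to disk directly, so peak memory
    # stays much lower than with pkl.dump on large arrays.
    joblib.dump(obj, path)
    print('=> Done ({:.4f} s)'.format(time.time() - t0))
```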

Same problem on a 24 GB RAM Windows PC with Python 3.6.6 and torch 1.1.0.

===

I got the conversion done with a modified convert.py, thanks to @jnorwood.

This new convert.py takes about 16 GB of memory.

code

```python
import os
import numpy as np
import tqdm
from utee import misc
import argparse
import cv2
import joblib

imagenet_urls = [
    'http://ml.cs.tsinghua.edu.cn/~chenxi/dataset/val224_compressed.pkl'
]
parser = argparse.ArgumentParser(description='Extract the ILSVRC2012 val dataset')
parser.add_argument('--in_file', default='val224_compressed.pkl', help='input file path')
parser.add_argument('--out_root', default='/tmp/public_dataset/pytorch/imagenet-data/', help='output file path')
args = parser.parse_args()

d = misc.load_pickle(args.in_file)
assert len(d['data']) == 50000, len(d['data'])
assert len(d['target']) == 50000, len(d['target'])

# convert val224.pkl
data = []
for img, target in tqdm.tqdm(zip(d['data'], d['target']), total=50000):
    img224 = misc.str2img(img)
    data.append(img224)
data_dict = dict(
    data=np.array(data).transpose(0, 3, 1, 2),
    target=d['target']
)
if not os.path.exists(args.out_root):
    os.makedirs(args.out_root)
# use joblib instead of misc.dump_pickle to reduce peak memory while writing
# misc.dump_pickle(data_dict, os.path.join(args.out_root, 'val224.pkl'))
joblib.dump(data_dict, os.path.join(args.out_root, 'val224.pkl'))
data_dict.clear()
data.clear()
print('val224.pkl done.')

# convert val299.pkl
data = []
for img, target in tqdm.tqdm(zip(d['data'], d['target']), total=50000):
    img224 = misc.str2img(img)
    img299 = cv2.resize(img224, (299, 299))
    data.append(img299)
data_dict = dict(
    data=np.array(data).transpose(0, 3, 1, 2),
    target=d['target']
)
if not os.path.exists(args.out_root):
    os.makedirs(args.out_root)
# misc.dump_pickle(data_dict, os.path.join(args.out_root, 'val299.pkl'))
joblib.dump(data_dict, os.path.join(args.out_root, 'val299.pkl'))
data_dict.clear()
data.clear()
print('val299.pkl done.')
```

result

```
Loading pickle object from val224_compressed.pkl
=> Done (1.0991 s)
100%|██████████| 50000/50000 [01:02<00:00, 798.99it/s]
val299.pkl done.
```
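
One note, in case your downstream loader still uses misc.load_pickle: as far as I know, files written by joblib.dump are best read back with joblib.load rather than plain pickle, since joblib stores the numpy arrays in its own format. A quick sanity check looks like this (the path is just the script's default out_root):

```python
import joblib

# Load a dict written by joblib.dump and sanity-check the array shape.
d = joblib.load('/tmp/public_dataset/pytorch/imagenet-data/val224.pkl')
print(d['data'].shape)   # expected: (50000, 3, 224, 224)
print(len(d['target']))  # expected: 50000
```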

thanks @jnorwood, fixed, please check