GLAMOR-USC/teach_tatc

Data preprocessing takes up very large disk space

UeFan opened this issue · 1 comment

UeFan commented

Hi, as reported in #1, I am trying to do the data preprocessing from scratch. However, I find that the preprocessing also generates very large lmdb files, each about 700 GB. The reason is this line of code:

lmdb_feats = lmdb.open(str(output_path), 700 * 1024**3, writemap=True)
which creates an lmdb file of size 700 * 1024**3 bytes (700 GiB). Is creating a file this large intentional?
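For context, the second positional argument to py-lmdb's `lmdb.open` is `map_size`, which is an upper bound on how large the database may grow, not the amount of data written. With `writemap=True`, the file's *apparent* size equals `map_size`, but on filesystems that support sparse files the space actually allocated on disk stays small until data is written. The sketch below (not from this repo; it uses a plain sparse file for illustration) shows the difference between apparent and allocated size:

```python
# Illustrative sketch: a file can report a huge size while occupying
# almost no disk, which is what happens with lmdb's writemap on
# sparse-file-capable filesystems.
import os
import tempfile

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.truncate(1024**3)  # 1 GiB apparent size; no blocks actually written
    path = f.name

st = os.stat(path)
apparent = st.st_size            # reported size: 1 GiB
allocated = st.st_blocks * 512   # bytes actually occupied on disk
print(f"apparent={apparent}, allocated={allocated}")
os.unlink(path)
```

You can check this on the generated lmdb files with `du -h --apparent-size` versus `du -h`; if the allocated size is much smaller, the 700 GB is only the reserved map size. If disk space is genuinely consumed (e.g. the filesystem does not support sparse files), lowering `map_size` to a value comfortably above the expected dataset size should be safe, since lmdb raises `MapFullError` rather than corrupting data when the map is exceeded.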

Thanks

Please refer to #2