mswellhao/PacSum

How can I train my own model on my own data?

I have been trying to build the h5py files (train, test, validation) and the vocab file, but the code doesn't work. I also don't know what the "chunked" path is used for.

Can you be specific about the error you are getting?

The 'chunked' directory contains many files, each storing a small set of training samples.

You can see what's in there for yourself with this small script:

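import json

import h5py

filename = "../data/NYT/nyt_chunked/1.train.h5df"

# Open one chunk read-only and load every entry stored under its first key.
with h5py.File(filename, "r") as f:
    a_group_key = list(f.keys())[0]
    data = list(f[a_group_key])

# The entries are JSON-encoded; decode the first one to inspect its fields.
res = json.loads(data[0])
print(res)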
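
If you want to check every top-level key in a chunk rather than only the first, a rough sketch like the one below might help. It assumes each entry is a JSON string or UTF-8 bytes, as the script above suggests. The names `decode_sample` and `summarize`, the key `"dataset"`, and the field `"article"` are all hypothetical and not taken from the PacSum code. A plain dict stands in for the open file so the snippet runs without any data.

```python
import json


def decode_sample(raw):
    """Decode one stored entry, assumed to be a JSON string or UTF-8 bytes."""
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8")
    return json.loads(raw)


def summarize(container):
    """Map each top-level key to (entry count, first decoded entry or None)."""
    summary = {}
    for key in container.keys():
        entries = list(container[key])
        first = decode_sample(entries[0]) if entries else None
        summary[key] = (len(entries), first)
    return summary


# Stand-in for an open chunk file: key -> list of JSON entries.
# The key and field names here are made up for illustration only.
fake_file = {"dataset": [b'{"article": ["a sentence"]}', '{"article": []}']}
print(summarize(fake_file))
```

With a real chunk, passing the open `h5py.File` object (`f` in the script above) to `summarize` should work the same way, as long as each key points to a dataset of JSON entries. Comparing the output for one of the provided chunks with the output for your own files is probably the quickest way to spot a format mismatch.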