lonePatient/Bert-Multi-Label-Text-Classification

Can custom datasets only use pkl? I remember using csv before, but recently I get a prompt telling me to use pkl

hdyzhuxun opened this issue · 1 comments

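Where do I change the code so I can train on a dataset in csv format? I have searched for a long time and cannot find anything to change.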
def read_data(cls, input_file, quotechar=None):
    """Reads a tab separated value file."""
    if 'pkl' in str(input_file):  # change pkl to csv??
        lines = load_pickle(input_file)
    else:
        lines = input_file
    return lines

In run_bert.py:
`def run_train(args):
    # --------- data
    processor = BertProcessor(vocab_path=config['bert_vocab_path'], do_lower_case=args.do_lower_case)
    label_list = processor.get_labels()
    label2id = {label: i for i, label in enumerate(label_list)}
    id2label = {i: label for i, label in enumerate(label_list)}

    train_data = processor.get_train(config['data_dir'] / f"{args.data_name}.train.csv")
    train_examples = processor.create_examples(lines=train_data,
                                               example_type='train',
                                               cached_examples_file=config['data_dir'] / f"cached_train_examples_{args.arch}")`

Could someone clear this up for me?

I guess that if you run `python run_bert.py --do_data`, your .csv files will be automatically converted to .pkl files? You can refer to the code in task_data.py.
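
If that step does not fit your data, here is a minimal sketch of doing the csv to pkl conversion by hand. It is not the code from task_data.py. The function name `csv_to_pkl`, the column names `id` and `text`, and the idea that every other column is a 0/1 label are all assumptions for illustration, so check what format `create_examples` actually expects before relying on it.

```python
import csv
import pickle


def csv_to_pkl(csv_path, pkl_path, text_col="text", id_col="id"):
    """Convert a CSV file into a pickled list of (text, labels) pairs.

    Hypothetical helper: assumes a header row, a text column named
    `text_col`, an optional id column named `id_col`, and that every
    remaining column holds a 0/1 label.
    """
    rows = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        # Every column that is not the id or the text is treated as a label.
        label_cols = [c for c in (reader.fieldnames or [])
                      if c not in (id_col, text_col)]
        for row in reader:
            labels = [int(row[c]) for c in label_cols]
            rows.append((row[text_col], labels))
    # Serialize the whole list so it can be read back with pickle.load.
    with open(pkl_path, "wb") as f:
        pickle.dump(rows, f)
    return rows
```

You would call it as something like `csv_to_pkl("my.train.csv", "my.train.pkl")` and then point the training step at the resulting .pkl file; the file names here are placeholders.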