Can custom datasets only use pkl? I remember using csv before, but recently I get a prompt telling me to use pkl #68

Open
hdyzhuxun opened this issue May 19, 2021 · 1 comment

hdyzhuxun commented May 19, 2021

Where do I change the code so that I can train with a CSV dataset? I have searched for a long time and could not find anything to change.
def read_data(cls, input_file, quotechar=None):
    """Reads a tab separated value file."""
    if 'pkl' in str(input_file):  # change pkl to csv??
        lines = load_pickle(input_file)
    else:
        lines = input_file
    return lines

In run_bert.py:
`def run_train(args):
    # --------- data
    processor = BertProcessor(vocab_path=config['bert_vocab_path'], do_lower_case=args.do_lower_case)
    label_list = processor.get_labels()
    label2id = {label: i for i, label in enumerate(label_list)}
    id2label = {i: label for i, label in enumerate(label_list)}

    train_data = processor.get_train(config['data_dir'] / f"{args.data_name}.train.csv")
    train_examples = processor.create_examples(lines=train_data,
                                               example_type='train',
                                               cached_examples_file=config[
                                                   'data_dir'] / f"cached_train_examples_{args.arch}")`

Could you help clear this up?

@0ddAstronaut

I guess that if you run the command `python run_bert.py --do_data`, your .csv files will be automatically converted to .pkl files. You can refer to the code in task_data.py.
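If the conversion works the way the reply guesses, a CSV-to-pkl step might look roughly like the sketch below. This is not the code from task_data.py, which is not shown in this thread. The function names `csv_to_pkl` and `read_pkl`, the column names `id` and `comment_text`, and the `(text, labels)` row layout are all assumptions made for illustration; check task_data.py for the format the project actually expects.

```python
import csv
import pickle


def csv_to_pkl(csv_path, pkl_path, text_column="comment_text", id_column="id"):
    """Convert a CSV file into a pickled list of (text, labels) pairs.

    Assumed layout (hypothetical, not taken from the repository): a header
    row, one text column, an optional id column, and every remaining
    column holding an integer 0/1 label.
    """
    rows = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        # Treat every column that is not the id or the text as a label.
        label_columns = [
            name for name in (reader.fieldnames or [])
            if name not in (text_column, id_column)
        ]
        for record in reader:
            labels = [int(record[name]) for name in label_columns]
            rows.append((record[text_column], labels))
    with open(pkl_path, "wb") as f:
        pickle.dump(rows, f)
    return rows


def read_pkl(pkl_path):
    """Load the pickled rows back (a stand-in for the project's load_pickle)."""
    with open(pkl_path, "rb") as f:
        return pickle.load(f)
```

Under these assumptions, calling `csv_to_pkl("my.train.csv", "my.train.pkl")` once would produce a file whose name contains `pkl`, which is what the `'pkl' in str(input_file)` check in `read_data` looks for.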
