You must follow the issue template and provide as much information as possible. Otherwise, this issue will be ignored and closed.
Check List
Thanks for considering opening an issue. Before you submit your issue, please confirm these boxes are checked.
You can post pictures, but if specific text or code is required to reproduce the issue, please provide the text in a plain text format for easy copy/paste.
[√] I have searched in existing issues but did not find the same one.
The error message is as follows:

      File "/venv/lib/python3.6/site-packages/kashgari/tasks/labeling/abc_model.py", line 177, in fit
        fit_kwargs=fit_kwargs)
      File "/venv/lib/python3.6/site-packages/kashgari/tasks/labeling/abc_model.py", line 208, in fit_generator
        self.build_model_generator([g for g in [train_sample_gen, valid_sample_gen] if g])
      File "/venv/lib/python3.6/site-packages/kashgari/tasks/labeling/abc_model.py", line 85, in build_model_generator
        self.text_processor.build_vocab_generator(generators)
      File "/venv/lib/python3.6/site-packages/kashgari/processors/sequence_processor.py", line 84, in build_vocab_generator
        count = token2count.get(token, 0)
    TypeError: unhashable type: 'list'

Where the error occurs in kashgari's build_vocab_generator():

    def build_vocab_generator(self,
                              generators: List[CorpusGenerator]) -> None:
        if not self.vocab2idx:
            vocab2idx = self._initial_vocab_dic
            token2count: Dict[str, int] = {}
            for gen in generators:
                for sentence, label in tqdm.tqdm(gen, desc="Preparing text vocab dict"):
                    if self.build_vocab_from_labels:
                        target = label
                    else:
                        target = sentence
                    for token in target:  # My input is a nested list, so each token here is itself a list, which raises the error.
                        count = token2count.get(token, 0)
                        token2count[token] = count + 1
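The failure in the counting loop can be reproduced without Kashgari. The following is a minimal sketch, not library code: `count_tokens` is a hypothetical stand-in for the inner loop quoted above, and the sample sentences and tags are made up. It shows that the loop works when each sample is a flat list of strings and fails as soon as each sample holds several parallel feature lists, because a `list` cannot be used as a dictionary key.

```python
from typing import Dict, Sequence


def count_tokens(samples: Sequence[Sequence]) -> Dict[str, int]:
    """Count tokens the same way the quoted loop does (hypothetical helper)."""
    token2count: Dict[str, int] = {}
    for target in samples:
        for token in target:
            # dict.get() hashes its key, so token must be hashable.
            count = token2count.get(token, 0)
            token2count[token] = count + 1
    return token2count


# One feature per sample: every token is a str, so counting works.
flat_samples = [["I", "like", "tea"], ["I", "like", "code"]]
flat_counts = count_tokens(flat_samples)

# Three features per sample (words, POS tags, NER tags; made-up data):
# iterating over a sample now yields whole lists instead of tokens.
nested_samples = [[["I", "like", "tea"], ["PRP", "VBP", "NN"], ["O", "O", "O"]]]
try:
    count_tokens(nested_samples)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(flat_counts)
print(error_message)
```

The exact wording of the exception text can vary between Python versions, but it is a `TypeError` about an unhashable `list` in each case.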
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Environment
Issue Description
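I defined a custom model that takes several kinds of features as input (for example words, part-of-speech tags, and named-entity types). The word features come from BertEmbedding, the other features are initialized with BareEmbedding, and the embeddings are then concatenated to form the model input. The model definition itself is fine, and I registered it in the corresponding tasks/labeling/__init__.py so it can be called. The error occurs at fit time.

An excerpt of the custom model code follows (parameter definitions omitted). This is a sequence labeling task:

    def __init__(self,
                 embedding: ABCEmbedding = None,
                 posembedding: ABCEmbedding = None,
                 nerembedding: ABCEmbedding = None,
                 **kwargs
                 ):
        super(BiLSTM_TEST_Model, self).__init__()
        self.embedding = embedding
        self.posembedding = posembedding
        self.nerembedding = nerembedding

    def build_model_arc(self) -> None:
        output_dim = self.label_processor.vocab_size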
An excerpt of the code I use to train on my data:

    def trainFunction(.....):
        bert_embed = BertEmbedding('./Data/路径', sequence_length=maxlength)  # 路径 = "path" (placeholder)
        pos_embed = BareEmbedding(embedding_size=32)
        ner_embed = BareEmbedding(embedding_size=32)
Reproduce
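Tracing with the debugger showed that the x_train I pass to fit contains three inputs, so the x that CorpusGenerator yields in generators is likewise three nested lists, and build_vocab_generator then raises the error.
Do I need to redefine build_vocab_generator?
Beyond that, my model needs three embedding models as input. Does that mean I also have to define three sets of self.vocab2idx/idx2vocab? Is there anything else I need to redefine? I got lost stepping through the debugger T_T
Please help!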
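As a way to picture the second question, here is a sketch only: if every sample carries three parallel feature sequences, each feature can get its own token-to-index mapping. The snippet is plain Python and does not use Kashgari's processor classes. `build_vocabs`, the `[PAD]`/`[UNK]` tokens, and the sample data are all hypothetical, and nothing in this issue confirms that Kashgari can be wired up this way.

```python
from collections import Counter
from typing import Dict, List, Sequence

# Assumed special tokens; the library's own initial vocabulary may differ.
PAD, UNK = "[PAD]", "[UNK]"


def build_vocabs(samples: Sequence[Sequence[Sequence[str]]],
                 n_features: int) -> List[Dict[str, int]]:
    """Build one token -> index dict per feature stream (hypothetical helper)."""
    counters = [Counter() for _ in range(n_features)]
    for sample in samples:
        if len(sample) != n_features:
            raise ValueError("each sample must hold one sequence per feature")
        # Pair each feature's counter with that feature's token sequence.
        for counter, sequence in zip(counters, sample):
            counter.update(sequence)

    vocabs = []
    for counter in counters:
        vocab = {PAD: 0, UNK: 1}
        # most_common() orders by descending count; ties keep first-seen order.
        for token, _ in counter.most_common():
            vocab[token] = len(vocab)
        vocabs.append(vocab)
    return vocabs


# Made-up samples: [words, POS tags, NER tags] per sentence.
samples = [
    [["I", "like", "tea"], ["PRP", "VBP", "NN"], ["O", "O", "O"]],
    [["I", "like", "Bonn"], ["PRP", "VBP", "NNP"], ["O", "O", "B-LOC"]],
]
word_vocab, pos_vocab, ner_vocab = build_vocabs(samples, n_features=3)
print(ner_vocab)
```

Keeping the mappings separate means a POS tag and a word that happen to share a spelling never collide, at the cost of tracking one vocabulary per input.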