
Loading a user dictionary has no effect, and entities are not recognized #45

Open
ShuGao0810 opened this issue Sep 29, 2018 · 3 comments
ShuGao0810 commented Sep 29, 2018

Hi, while using foolnltk I found that loading a user dictionary has no effect, and I'm not sure what is causing it. Details below:
Environment: win10 + python3.6

fool.analysis('阿里收购饿了么')
Returns: ([[('阿里', 'nz'), ('收购', 'v'), ('饿', 'v'), ('了', 'y'), ('么', 'y')]], [[(0, 3, 'company', '阿里')]])

User dictionary format:
饿了么 10

fool.load_userdict(path)
fool.analysis('阿里收购饿了么')
Returns: ([[('阿里', 'nz'), ('收购', 'v'), ('饿', 'v'), ('了', 'y'), ('么', 'y')]], [[(0, 3, 'company', '阿里')]])

Loading the user dictionary seems to have no effect? During segmentation "饿了么" is still split apart, and entity recognition doesn't pick it up either.

@rockyzhengwu
Owner

@ShuGao0810 Thanks for the feedback. At the moment the user dictionary does take effect during segmentation, but analysis does not support it yet. I'll fix that later.

@xrzlizheng

How can I load a jieba-format dictionary?
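One possible approach, sketched below: jieba dictionary lines have the shape `word freq [pos]` (frequency and POS tag are optional), while the foolnltk dictionary shown above uses `word weight`, so a small conversion step could bridge the two. The function name `convert_jieba_dict` and the default weight of 10 are illustrative assumptions, not part of either library's API.

```python
def convert_jieba_dict(jieba_lines):
    """Convert jieba-format dictionary lines ("word freq [pos]")
    into foolnltk-format lines ("word weight").

    Keeps the word and its frequency (reused as the weight) and
    drops the POS tag. Names and defaults here are illustrative.
    """
    fool_lines = []
    for line in jieba_lines:
        parts = line.strip().split()
        if not parts:
            continue  # skip blank lines
        word = parts[0]
        # jieba allows the frequency to be omitted; fall back to a weight of 10
        weight = parts[1] if len(parts) > 1 and parts[1].isdigit() else "10"
        fool_lines.append(f"{word} {weight}")
    return fool_lines

# A jieba entry like "饿了么 10 nz" becomes "饿了么 10"
print(convert_jieba_dict(["饿了么 10 nz", "阿里巴巴 3", "拉勾 nz"]))
# → ['饿了么 10', '阿里巴巴 3', '拉勾 10']
```

The converted lines could then be written to a file and passed to fool.load_userdict as usual.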

@yu45020

yu45020 commented Dec 14, 2018

@ShuGao0810
A workaround that may help: modify __init__.py, mirroring in ner what cut already does.

(Changing it like this doesn't quite seem to work, though ><)

def ner(text, ignore=False):
    text = _check_input(text, ignore)
    if not text:
        return [[]]
    res = LEXICAL_ANALYSER.ner(text)
-    return res
+    new_words = []
+    if _DICTIONARY.sizes != 0:
+        for sent, words in zip(text, res):
+            words = _mearge_user_words(sent, words)
+            new_words.append(words)
+    else:
+        new_words = res
+    return new_words


def analysis(text, ignore=False):
    text = _check_input(text, ignore)
    if not text:
        return [[]], [[]]
-    res = LEXICAL_ANALYSER.analysis(text)
-    return res
+    word_inf = pos_cut(text)
+    ners = ner(text)
+    return word_inf, ners
a = ['阿里收购饿了么']
fool.load_userdict('foolnltk_userdict.txt')
# fool.delete_userdict()
print(fool.cut(a))
[['阿里', '收购', '饿了么']]

print(fool.analysis(a))
([[('阿里', 'nz'), ('收购', 'v'), ('饿了么', 'nz')]], [['阿里收购', '饿了么']])

@rockyzhengwu
This is probably a typo in __init__.py:

_mearge_user_words -- should be --> _merge_user_words
