from flashtext import KeywordProcessor

# text = "@苍月轶 再次核实:骆然5月8日持24小时核酸从宜昌回蓉,到成都24小时内核酸一次,9号回泸定,24小时内又做一次核酸,均阴性,健康码绿码。宜昌不是 AB区域。"
text = "成都到北京高铁3小时,郑州到成都2小时"
print(text)
kp = KeywordProcessor()
kp.add_keyword("到成都", ("成都", "ab"))
kp.add_keyword("宜昌", ("宜昌", "ab"))
print(len(kp))
print(kp)
word_index = kp.extract_keywords(text, span_info=True)
print(word_index)
for item in word_index:
    print(text[item[1]:item[2]])
print('finished')
from flashtext import KeywordProcessor

text = "成都到北京高铁3小时,郑州到成都2小时"
kp = KeywordProcessor()
kp.add_keyword("到成都", ("成都", "ab"))
kp.add_keyword("宜昌", ("宜昌", "ab"))
print(len(kp))
keywords_found = kp.extract_keywords(text, span_info=True)
for item in keywords_found:
    print(item)

Output:
2
(('成都', 'ab'), 13, 15)
Reference: https://blog.csdn.net/chen10314/article/details/122048726
This is still not a good solution, because many special characters appear in our keywords, such as ( ), [ ], and so on.
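For checking the spans reported in this thread, a plain substring search can serve as a baseline. The sketch below does not use flashtext at all; `find_spans` is a hypothetical helper written only for illustration, and it does literal matching with no word-boundary logic, so it is not a drop-in replacement for `extract_keywords`.

```python
def find_spans(text, keywords):
    """Return (keyword, start, end) for every literal occurrence.

    Hypothetical helper, not part of flashtext. `end` is exclusive,
    so text[start:end] == keyword for every returned tuple.
    """
    spans = []
    for kw in keywords:
        if not kw:
            continue  # skip empty keywords, which would match everywhere
        start = text.find(kw)
        while start != -1:
            spans.append((kw, start, start + len(kw)))
            # Continue searching one character later to catch overlaps.
            start = text.find(kw, start + 1)
    # Sort by position so results read left to right.
    return sorted(spans, key=lambda s: (s[1], s[2]))


text = "成都到北京高铁3小时,郑州到成都2小时"
spans = find_spans(text, ["到成都", "宜昌"])
print(spans)  # [('到成都', 13, 16)]
for kw, start, end in spans:
    print(text[start:end])  # 到成都
```

With this baseline, the three-character keyword "到成都" ends at offset 16 when offsets are read as a Python slice, whereas the span posted above ends at 15. Because `str.find` matches literally, keywords containing characters such as ( ) or [ ] need no escaping here.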