MemoryWithRag is given a local LLM model, but still raises AssertionError: DASHSCOPE_API_KEY should be set in environ. #520
Comments
The default embedding model of MemoryWithRag also calls the DashScope API, which is what causes this problem. Downloading and using a local embedding model should in theory work out of the box, but it has not been tested yet; we will provide it after testing. We use the DashScope API here because local models support only limited concurrency, and slow responses had previously caused timeouts. The other two MemoryWithXxx classes download and use open-source embedding models.
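As a stopgap, one possible workaround is to override the embedding model at the llama-index level before constructing the memory. This is an untested sketch: it assumes MemoryWithRag builds its index through llama-index and honors the global Settings object, and the model name below is only an example of a locally downloadable embedding.

import os
from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Assumption: MemoryWithRag goes through llama-index and respects Settings.
# Point the global embedding at a locally downloaded HuggingFace model.
Settings.embed_model = HuggingFaceEmbedding(model_name='BAAI/bge-large-zh-v1.5')

# A dummy key only silences the environment assertion; any code path that
# still calls the DashScope API would fail at request time.
os.environ.setdefault('DASHSCOPE_API_KEY', 'dummy-local-only')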
When will local embedding models be supported? Honestly, I can understand your intent to lean toward ModelScope as much as possible, but in many cases, constrained by environment and policy, only local deployment is allowed, with no access to external networks.
The problem I'm facing now is that in the MemoryWithRetrievalKnowledge series of examples there is a dependency between the LLM and the embedding model. Some tests showed: qwen-max + damo/nlp_gte_sentence-embedding_chinese-base works; qwen-max + Xorbits/bge-large-zh-v1.5 errors; SiliconFlow's qwen-7b + damo/nlp_gte_sentence-embedding_chinese-base errors. I suspect Ollama cannot work well either, and it appears the MemoryWithRetrievalKnowledge tool is heavily coupled internally. By contrast, the llmaindex_rag example passes this combination test easily (see the sketch below). If the maintainers have time, please help figure out from which angle this problem should be approached.
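For comparison, here is roughly what the standalone llama-index pipeline the commenter mentions looks like with a local LLM and a local embedding model. The model names below are illustrative assumptions, not taken from the repository; the file path is reused from the reproduction script.

from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.ollama import Ollama
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Local LLM served by Ollama plus a local embedding model; no external API keys.
Settings.llm = Ollama(model='qwen2', request_timeout=120.0)
Settings.embed_model = HuggingFaceEmbedding(model_name='BAAI/bge-large-zh-v1.5')

documents = SimpleDirectoryReader(input_files=['tests/samples/常见QA.pdf']).load_data()
index = VectorStoreIndex.from_documents(documents)
print(index.as_query_engine().query('文档中提到了哪些常见问题?'))

Because the LLM and the embedding model are configured independently here, swapping either one does not break the other, which is what the comment above contrasts against the coupled MemoryWithRetrievalKnowledge examples.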
Initial Checks
What happened + What you expected to happen
Versions / Dependencies
Latest version
Reproduction script
from modelscope_agent.memory import MemoryWithRag

llm_config = {
    'model': 'qwen2',
    'model_server': 'ollama',
}
function_list = []
memory = MemoryWithRag(urls=['tests/samples/常见QA.pdf'], function_list=function_list,
                       llm=llm_config, use_knowledge_cache=False)
Issue Severity
High: It blocks me from completing my task.