[Bug]: getting 'hnswlib.Index' has no attribute 'file_handle_count' error when using PersistentClient #931

Closed
malaccan opened this issue Aug 4, 2023 · 9 comments
Labels
bug Something isn't working

@malaccan

malaccan commented Aug 4, 2023

What happened?

I am using the latest chromadb (v0.4.4) and get an error when running the following standard example code:

import chromadb
client = chromadb.PersistentClient(path="./db")

throws an error:
AttributeError: type object 'hnswlib.Index' has no attribute 'file_handle_count'

looking at the stack trace, it seems like there is a bug in:
chromadb/segment/impl/manager/local.py:73

  71 segment_limit = (
  72     self._max_file_handles
  ---> 73     // PersistentLocalHnswSegment.get_file_handle_count()

I suspect the intention was to comment out line 73? (// is the comment syntax in Java; Python uses #.)

I manually changed this to # in my local copy of local.py, and it works.
Hope this helps.

Thanks

Versions

Chromadb 0.4.4
Chroma-hnswlib 0.7.2
Python 3.10.0
MacOS 13.4.1, Ubuntu 22.04.2

Relevant log output

File ~/anaconda3/envs/textgen/lib/python3.10/site-packages/langchain/vectorstores/chroma.py:603, in Chroma.from_documents(cls, documents, embedding, ids, collection_name, persist_directory, client_settings, client, collection_metadata, **kwargs)
    601 texts = [doc.page_content for doc in documents]
    602 metadatas = [doc.metadata for doc in documents]
--> 603 return cls.from_texts(
    604     texts=texts,
    605     embedding=embedding,
    606     metadatas=metadatas,
    607     ids=ids,
    608     collection_name=collection_name,
    609     persist_directory=persist_directory,
    610     client_settings=client_settings,
    611     client=client,
    612     collection_metadata=collection_metadata,
    613     **kwargs,
    614 )

File ~/anaconda3/envs/textgen/lib/python3.10/site-packages/langchain/vectorstores/chroma.py:558, in Chroma.from_texts(cls, texts, embedding, metadatas, ids, collection_name, persist_directory, client_settings, client, collection_metadata, **kwargs)
    525 @classmethod
    526 def from_texts(
    527     cls: Type[Chroma],
   (...)
    537     **kwargs: Any,
    538 ) -> Chroma:
    539     """Create a Chroma vectorstore from a raw documents.
    540 
    541     If a persist_directory is specified, the collection will be persisted there.
   (...)
    556         Chroma: Chroma vectorstore.
    557     """
--> 558     chroma_collection = cls(
    559         collection_name=collection_name,
    560         embedding_function=embedding,
    561         persist_directory=persist_directory,
    562         client_settings=client_settings,
    563         client=client,
    564         collection_metadata=collection_metadata,
    565         **kwargs,
    566     )
    567     chroma_collection.add_texts(texts=texts, metadatas=metadatas, ids=ids)
    568     return chroma_collection

File ~/anaconda3/envs/textgen/lib/python3.10/site-packages/langchain/vectorstores/chroma.py:120, in Chroma.__init__(self, collection_name, embedding_function, persist_directory, client_settings, collection_metadata, client, relevance_score_fn)
    118         _client_settings = chromadb.config.Settings()
    119     self._client_settings = _client_settings
--> 120     self._client = chromadb.Client(_client_settings)
    121     self._persist_directory = (
    122         _client_settings.persist_directory or persist_directory
    123     )
    125 self._embedding_function = embedding_function

File ~/anaconda3/envs/textgen/lib/python3.10/site-packages/chromadb/__init__.py:110, in Client(settings)
    107 system = System(settings)
    109 telemetry_client = system.instance(Telemetry)
--> 110 api = system.instance(API)
    112 system.start()
    114 # Submit event for client start

File ~/anaconda3/envs/textgen/lib/python3.10/site-packages/chromadb/config.py:195, in System.instance(self, type)
    192     type = get_class(fqn, type)
    194 if type not in self._instances:
--> 195     impl = type(self)
    196     self._instances[type] = impl
    197     if self._running:

File ~/anaconda3/envs/textgen/lib/python3.10/site-packages/chromadb/api/segment.py:82, in SegmentAPI.__init__(self, system)
     80 self._settings = system.settings
     81 self._sysdb = self.require(SysDB)
---> 82 self._manager = self.require(SegmentManager)
     83 self._telemetry_client = self.require(Telemetry)
     84 self._producer = self.require(Producer)

File ~/anaconda3/envs/textgen/lib/python3.10/site-packages/chromadb/config.py:134, in Component.require(self, type)
    131 def require(self, type: Type[T]) -> T:
    132     """Get a Component instance of the given type, and register as a dependency of
    133     that instance."""
--> 134     inst = self._system.instance(type)
    135     self._dependencies.add(inst)
    136     return inst

File ~/anaconda3/envs/textgen/lib/python3.10/site-packages/chromadb/config.py:195, in System.instance(self, type)
    192     type = get_class(fqn, type)
    194 if type not in self._instances:
--> 195     impl = type(self)
    196     self._instances[type] = impl
    197     if self._running:

File ~/anaconda3/envs/textgen/lib/python3.10/site-packages/chromadb/segment/impl/manager/local.py:73, in LocalSegmentManager.__init__(self, system)
     69 else:
     70     self._max_file_handles = ctypes.windll.msvcrt._getmaxstdio()  # type: ignore
     71 segment_limit = (
     72     self._max_file_handles
---> 73     // PersistentLocalHnswSegment.get_file_handle_count()
     74 )
     75 self._vector_instances_file_handle_cache = LRUCache(
     76     segment_limit, callback=lambda _, v: v.close_persistent_index()
     77 )

File ~/anaconda3/envs/textgen/lib/python3.10/site-packages/chromadb/segment/impl/vector/local_persistent_hnsw.py:398, in PersistentLocalHnswSegment.get_file_handle_count()
    395 @staticmethod
    396 def get_file_handle_count() -> int:
    397     """Return how many file handles are used by the index"""
--> 398     hnswlib_count = hnswlib.Index.file_handle_count
    399     hnswlib_count = cast(int, hnswlib_count)
    400     # One extra for the metadata file

AttributeError: type object 'hnswlib.Index' has no attribute 'file_handle_count'

malaccan added the bug (Something isn't working) label on Aug 4, 2023

@HammadB
Collaborator

HammadB commented Aug 4, 2023

In Python, // is floor division, not a comment: https://www.freecodecamp.org/news/what-does-double-slash-mean-in-python/. That line should not be commented out.

I suspect something is wrong with your environment. Can you try deleting your env and reinstalling the deps?
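
To illustrate the point about // with a minimal sketch (the numbers here are made up for the example and are not Chroma's real values): inside parentheses a Python expression can continue across lines, so a line that starts with // is still the floor-division operator. Swapping it for # does not raise an error, but it silently drops the division.

```python
# Hypothetical values, chosen only for illustration.
max_file_handles = 256
handles_per_segment = 5

# The parentheses let the expression span several lines, so the line
# beginning with // is floor division, not a comment.
segment_limit = (
    max_file_handles
    // handles_per_segment
)

# Replacing // with # turns the divisor into a comment. The code still
# runs, but the result is simply max_file_handles.
segment_limit_if_commented = (
    max_file_handles
    # // handles_per_segment
)

print(segment_limit)               # 51
print(segment_limit_if_commented)  # 256
```

So "it works" after changing // to # only means the failing call is no longer executed; the computed limit is no longer the intended one.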

@malaccan
Author

malaccan commented Aug 4, 2023

Thanks for checking. It looks like it was my env. I checked again, and after removing my existing hnswlib module (but keeping the chroma-hnswlib module), it works fine.
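
One way to see which situation an environment is in is to check whether the hnswlib module that Python actually imports exposes the attribute named in the traceback. This is only a sketch: check_hnswlib is a hypothetical helper name, and the only fact taken from this thread is that chromadb reads hnswlib.Index.file_handle_count.

```python
import importlib


def check_hnswlib(module_name: str = "hnswlib") -> str:
    """Report whether the importable module looks usable by chromadb.

    Hypothetical helper for illustration; it is not part of chromadb.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return "not installed"

    # chromadb's traceback shows it reading Index.file_handle_count.
    index_cls = getattr(module, "Index", None)
    if index_cls is not None and hasattr(index_cls, "file_handle_count"):
        return "ok"
    return "missing file_handle_count"


print(check_hnswlib())
```

A result of "missing file_handle_count" would match the AttributeError reported above, and suggests the imported module is not the one chromadb expects.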

@HammadB
Collaborator

HammadB commented Aug 4, 2023

Ah - glad you figured it out! Closing this out.

@batman-do

@malaccan @HammadB Why is it that after I remove the hnswlib library and run again, I get "no module hnswlib"? How do I fix that?

@macksjlazarus

@malaccan @HammadB Why is it that after I remove the hnswlib library and run again, I get "no module hnswlib"? How do I fix that?

Having the same issue, did you ever figure it out?

@joshua1996

@malaccan @HammadB Why is it that after I remove the hnswlib library and run again, I get "no module hnswlib"? How do I fix that?

me too

@macksjlazarus

@malaccan @HammadB Why is it that after I remove the hnswlib library and run again, I get "no module hnswlib"? How do I fix that?

me too

I was able to fix this by commenting out the offending line in chromadb's local.py file, where the error is being thrown.

@HammadB
Collaborator

HammadB commented Dec 4, 2023

Are you installing both hnswlib and chroma-hnswlib? That will cause issues, as mentioned above. Please run pip freeze and check.
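
The same check can be done from Python instead of reading pip freeze output by eye. A minimal sketch using the standard library's importlib.metadata; installed_versions is a hypothetical helper name, and the two distribution names are the ones mentioned in this thread.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


report = installed_versions(["hnswlib", "chroma-hnswlib"])
print(report)

if report["hnswlib"] and report["chroma-hnswlib"]:
    print("Both hnswlib and chroma-hnswlib are installed; this is the conflict described above.")
```

If both show a version, uninstalling the plain hnswlib package is what resolved it for the original reporter. For the "no module hnswlib" follow-up, one plausible explanation (not confirmed in this thread) is that both distributions install the same hnswlib module, so removing one also removes the shared files; reinstalling chroma-hnswlib afterwards may restore it.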

@NeonBohdan

NeonBohdan commented Dec 17, 2023

Other people also get this error
And here is the fix
zylon-ai/private-gpt#1012 (comment)

Wrong comment symbol used
// instead of #

akri2021 added a commit to akri2021/chroma that referenced this issue Nov 1, 2024
There seems to be a trivial bug with a wrong comment sign from another programming language. 
I stumbled upon it when testing nano-graphrag with Kotaemon (https://github.com/Cinnamon/kotaemon)
Other mentions of this bug:
Cinnamon/kotaemon#440
chroma-core#931