Dense Vector Field Support #356

Closed
YakPort opened this issue Jul 6, 2021 · 3 comments
YakPort commented Jul 6, 2021

I note the closed issue
https://github.com/elastic/elasticsearch-dsl-py/issues/1278

I created a DenseVector field as described there.
My model has a method called get_embedding that calculates the embedding.

I am trying to create a Field that accesses that attribute on my model and stores the calculated embedding, similar to:

class DenseVector(DEDField, Field):
    name = 'dense_vector'

    def __init__(self):
        dims = 1024
        super(DenseVector, self).__init__(dims=dims)

In documents.py:

@registry.register_document
class ItemDocument(Document):
    title_vector = DenseVector(attr='get_embedding')

    class Index:
        name = "products"
        settings = {
            "number_of_shards": 1,
            "number_of_replicas": 0,
            "analysis": {"analyzer": {"standard": {"type": "standard"}}},
        }

    class Django:
        model = Item

I am getting TypeError: __init__() got an unexpected keyword argument 'attr'
Where is attr being initialised, or am I going down the wrong path?

Many thanks

saadmk11 (Contributor) commented Jul 7, 2021

You need to accept the attr kwarg in the __init__() method of the DenseVector field and pass it on, so that DEDField can process it:

def __init__(self, attr=None, **kwargs):
    dims = 1024
    super(DenseVector, self).__init__(attr=attr, dims=dims, **kwargs)
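The difference between the two versions can be reproduced without Django or Elasticsearch. The sketch below is only an illustration: StubField and StubDEDField are hypothetical stand-ins for Field and DEDField (not the real classes), and they assume only that DEDField takes attr as a keyword argument and forwards everything else.

```python
class StubField:
    """Hypothetical stand-in for Field: stores mapping parameters."""

    def __init__(self, **params):
        self.params = params


class StubDEDField(StubField):
    """Hypothetical stand-in for DEDField: consumes attr, forwards the rest."""

    def __init__(self, attr=None, **kwargs):
        super().__init__(**kwargs)
        self.attr = attr


class BrokenDenseVector(StubDEDField):
    name = "dense_vector"

    def __init__(self):
        # Accepts no keyword arguments, so attr=... is rejected at call time.
        super().__init__(dims=1024)


class FixedDenseVector(StubDEDField):
    name = "dense_vector"

    def __init__(self, attr=None, **kwargs):
        # attr is accepted here and handed on to the parent class.
        super().__init__(attr=attr, dims=1024, **kwargs)


try:
    BrokenDenseVector(attr="get_embedding")
    broken_error = None
except TypeError as exc:
    broken_error = str(exc)

field = FixedDenseVector(attr="get_embedding")
print(broken_error)
print(field.attr, field.params)
```

The first class fails as soon as the document declares the field with attr; the second one keeps attr and still sets dims.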

YakPort closed this as completed Jul 7, 2021

BoPeng commented Oct 18, 2024

It looks like elasticsearch_dsl now has support for DenseVector:

https://github.com/elastic/elasticsearch-dsl-py/blob/579f57205c395e17024d9ae827cbf6fd626969c4/elasticsearch_dsl/field.py#L392-L397

The DenseVector field is defined there as a multi-valued Float field, with no dims parameter. Maybe dense_vector does not require a fixed number of dimensions?

Then, in django_elasticsearch_dsl, the proper way to define a field might be

class DenseVectorField(DEDField, DenseVector):
    pass

although I still need to explore how to use DenseVectorField for embedding search.

Tayyab-R commented

Custom DenseVector Field

I followed every step mentioned in this thread:

class DenseVector(DEDField, Field):
    """
    Custom field from me for DenseVector in django-elasticsearch-dsl
    """

    name = 'dense_vector'

    def __init__(self, attr=None, **kwargs):
        dims = 1024
        super(DenseVector, self).__init__(attr=attr, dims=dims, **kwargs)

@registry.register_document
class BlogDocument(Document):
    tags_embeddings = DenseVector(dims=384)

but I got this error:
super(DenseVector, self).__init__(attr=attr, dims=dims, **kwargs)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: django_elasticsearch_dsl.fields.DEDField.__init__() got multiple values for keyword argument 'dims'

What am I doing wrong here?
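
Note: the traceback matches what plain Python does when a keyword is supplied twice. DenseVector(dims=384) puts dims into kwargs, and __init__ then passes its own hard-coded dims=1024 alongside **kwargs. A minimal sketch of that behaviour and one possible fix follows; StubDEDField is a hypothetical stand-in for DEDField, and making dims a named parameter with a default of 1024 is an assumption about the desired behaviour, not something confirmed in this thread.

```python
class StubDEDField:
    """Hypothetical stand-in for DEDField: stores attr and mapping params."""

    def __init__(self, attr=None, **kwargs):
        self.attr = attr
        self.params = kwargs


class HardCodedDims(StubDEDField):
    def __init__(self, attr=None, **kwargs):
        dims = 1024
        # A caller-supplied dims is still inside kwargs,
        # so the parent would receive dims twice.
        super().__init__(attr=attr, dims=dims, **kwargs)


class ConfigurableDims(StubDEDField):
    def __init__(self, attr=None, dims=1024, **kwargs):
        # dims is a named parameter, so it never ends up in kwargs.
        super().__init__(attr=attr, dims=dims, **kwargs)


try:
    HardCodedDims(dims=384)
    duplicate_error = None
except TypeError as exc:
    duplicate_error = str(exc)

field = ConfigurableDims(dims=384)
default_field = ConfigurableDims(attr="get_embedding")
print(duplicate_error)
print(field.params, default_field.params)
```

With the second form, DenseVector(dims=384) overrides the default, and omitting dims falls back to 1024.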
