fix(deps): Update getmeili/meilisearch Docker tag to v1.12.1 #20069

cq-bot · 2025-01-07T13:46:09Z

This PR contains the following updates:

Package	Update	Change
getmeili/meilisearch	minor	`v1.1.0` -> `v1.12.1`

Warning

Some dependencies could not be looked up. Check the Dependency Dashboard for more information.

Release Notes

meilisearch/meilisearch (getmeili/meilisearch)

`v1.12.1`

Compare Source

Fixes

Fixed a bug in the engine where adding an empty payload made the batch fail.
Fixed by @irevoire in https://github.com/meilisearch/meilisearch/pull/5192

Full Changelog: meilisearch/meilisearch@v1.12.0...v1.12.1

`v1.12.0`: 🦗

Compare Source

Meilisearch v1.12 introduces significant indexing speed improvements, almost halving the time required to index large datasets. This release also introduces new settings to customize and potentially further increase indexing speed.

🧰 All official Meilisearch integrations (including SDKs, clients, and other tools) are compatible with this Meilisearch release. Integration deployment happens between 4 and 48 hours after a new version becomes available.

Some SDKs might not include all new features. Consult the project repository for detailed information. Is a feature you need missing from your chosen SDK? Create an issue letting us know you need it, or, for open-source karma points, open a PR implementing it (we'll love you for that ❤️).

New features and updates 🔥

Improve indexing speed

Indexing time is improved across the board!

Performance is maintained or better on smaller machines
On bigger machines with multiple cores and good IO, Meilisearch v1.12 is much faster than Meilisearch v1.11
- More than twice as fast for raw document insertion tasks.
- More than 4x as fast for incrementally updating documents in a large database.
- Embeddings generation is also up to 1.5x faster for some workloads.

The new indexer also makes task cancellation faster.

Done by @dureuill, @ManyTheFish, and @Kerollmops in #4900.

New index settings: use `facetSearch` and `prefixSearch` to improve indexing speed

v1.12 introduces two new index settings: facetSearch and prefixSearch.

Both settings allow you to skip parts of the indexing process. This leads to significant improvements to indexing speed, but may negatively impact search experience in some use cases.

Done by @ManyTheFish in #5091

`facetSearch`

Use this setting to toggle facet search:

curl \
  -X PUT 'http://localhost:7700/indexes/books/settings/facet-search' \
  -H 'Content-Type: application/json' \
  --data-binary 'true'

The default value for facetSearch is true. When set to false, this setting disables facet search for all filterable attributes in an index.

`prefixSearch`

Use this setting to configure the ability to search a word by prefix on an index:

curl \
  -X PUT 'http://localhost:7700/indexes/books/settings/prefix-search' \
  -H 'Content-Type: application/json' \
  --data-binary '"disabled"'

prefixSearch accepts one of the following values:

"indexingTime": enables prefix processing during indexing. This is the default Meilisearch behavior
"disabled": deactivates prefix search completely

Disabling prefix search means the query he will no longer match the word hello. This may significantly impact search result relevancy, but speeds up the indexing process.
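As a rough illustration of the two settings calls above, the following Python sketch builds the same PUT requests without sending them. The helper name `build_setting_request` and the `BASE_URL` constant are hypothetical; only the routes and values come from the release notes. Note that the body is JSON-encoded, so the string value is sent as `"disabled"` with its quotes.

```python
import json
import urllib.request

# Assumption: a local instance, as in the curl examples above.
BASE_URL = "http://localhost:7700"


def build_setting_request(index_uid, setting, value):
    """Build (without sending) a PUT request for one index sub-setting.

    Hypothetical helper: it only mirrors the curl calls shown above.
    """
    return urllib.request.Request(
        url=f"{BASE_URL}/indexes/{index_uid}/settings/{setting}",
        data=json.dumps(value).encode("utf-8"),  # the body must be valid JSON
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Disable facet search and prefix search on the "books" index.
facet_req = build_setting_request("books", "facet-search", False)
prefix_req = build_setting_request("books", "prefix-search", "disabled")

# urllib.request.urlopen(facet_req) would actually send the request;
# it is left out so the sketch runs without a server.
```

Both settings trade search features for indexing speed, so it is probably worth trying them on a copy of the index first.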

New API route: `/batches`

The new /batches endpoint allows you to query information about task batches.

GET /batches returns a list of batch objects:

curl -X GET 'http://localhost:7700/batches'

This endpoint accepts the same parameters as the GET /tasks route, allowing you to narrow down which batches you want to see. Parameters used with GET /batches apply to the tasks, not the batches themselves. For example, GET /batches?uid=0 returns batches containing tasks with a taskUid of 0, not batches with a batchUid of 0.

You may also query GET /batches/:uid to retrieve information about a single batch object:

curl -X GET 'http://localhost:7700/batches/BATCH_UID'

/batches/:uid does not accept any parameters.
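The filtering rule described above (parameters select tasks, and the response lists the batches containing them) can be sketched locally as follows. This is only a model of the semantics: the `taskUids` field and both helper names are hypothetical and exist purely for the illustration.

```python
from urllib.parse import urlencode


def batches_url(base_url, **task_filters):
    """Build a GET /batches URL; the filters are the same ones GET /tasks accepts."""
    query = urlencode(task_filters)
    return f"{base_url}/batches" + (f"?{query}" if query else "")


def batches_containing_tasks(batches, task_uids):
    """Keep batches that contain at least one of the requested task uids.

    Assumption: each batch carries a made-up "taskUids" list so the
    task-level filtering can be simulated without a server.
    """
    wanted = set(task_uids)
    return [batch for batch in batches if wanted & set(batch["taskUids"])]


sample_batches = [
    {"uid": 0, "taskUids": [3, 4]},  # batch 0 does not contain task 0
    {"uid": 7, "taskUids": [0, 1]},  # batch 7 does
]

url = batches_url("http://localhost:7700", uid=0)
matching = batches_containing_tasks(sample_batches, [0])
```

Here `uid=0` selects batch 7, the one that contains task 0, and not batch 0.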

Batch objects contain the following fields:

{
  "uid": 160,
  "progress": {
    "steps": [
      {
        "currentStep": "processing tasks",
        "finished": 0,
        "total": 2
      },
      {
        "currentStep": "indexing",
        "finished": 2,
        "total": 3
      },
      {
        "currentStep": "extracting words",
        "finished": 3,
        "total": 13
      },
      {
        "currentStep": "document",
        "finished": 12300,
        "total": 19546
      }
    ],
    "percentage": 37.986263
  },
  "details": {
    "receivedDocuments": 19547,
    "indexedDocuments": null
  },
  "stats": {
    "totalNbTasks": 1,
    "status": {
      "processing": 1
    },
    "types": {
      "documentAdditionOrUpdate": 1
    },
    "indexUids": {
      "mieli": 1
    }
  },
  "duration": null,
  "startedAt": "2024-12-12T09:44:34.124726733Z",
  "finishedAt": null
}
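One way to read the progress object, inferred from the sample above rather than from any documented behavior, is that each step is nested inside the previous one. Folding the steps from the innermost outward reproduces the sample percentage to within rounding. The function name `nested_percentage` is hypothetical.

```python
def nested_percentage(steps):
    """Fold nested steps into a single percentage.

    Assumption (inferred from the sample batch object): the fractional
    progress of an inner step counts as a partial unit of its parent step.
    """
    fraction = 0.0
    for step in reversed(steps):  # start from the innermost step
        fraction = (step["finished"] + fraction) / step["total"]
    return 100.0 * fraction


sample_steps = [
    {"currentStep": "processing tasks", "finished": 0, "total": 2},
    {"currentStep": "indexing", "finished": 2, "total": 3},
    {"currentStep": "extracting words", "finished": 3, "total": 13},
    {"currentStep": "document", "finished": 12300, "total": 19546},
]

percentage = nested_percentage(sample_steps)  # close to the sample's 37.986263
```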

Additionally, task objects now include a new field, batchUid. Use this field together with /batches/:uid to retrieve data on a specific batch.

{
  "uid": 154,
  "batchUid": 142,
  "indexUid": "movies_test2",
  "status": "succeeded",
  "type": "documentAdditionOrUpdate",
  "canceledBy": null,
  "details": {
    "receivedDocuments": 1,
    "indexedDocuments": 1
  },
  "error": null,
  "duration": "PT0.027766819S",
  "enqueuedAt": "2024-12-02T14:07:34.974430765Z",
  "startedAt": "2024-12-02T14:07:34.99021667Z",
  "finishedAt": "2024-12-02T14:07:35.017983489Z"
}

Done by @irevoire in #5060, #5070, #5080

Other improvements

New query parameter for GET /tasks: reverse. If reverse is set to true, tasks will be returned in reversed order, from oldest to newest tasks. Done by @irevoire in #5048
Phrase searches with showMatchesPosition set to true give a single location for the whole phrase, by @flevi29 in #4928
New Prometheus metrics by @PedroTurik in #5044
When a query finds matching terms in document fields with array values, Meilisearch now includes an indices field to _matchesPosition specifying which array elements contain the matches by @LukasKalbertodt in #5005
⚠️ Breaking vectorStore change: field distribution no longer contains _vectors. Its value used to be incorrect, and there is no current use case for the fixed, most likely empty, value. Done as part of #4900
Improve error message by adding index name in #5056 by @airycanon

Fixes 🐞

Return appropriate error when primary key is greater than 512 bytes, by @flevi29 in #4930
Fix issue where numbers were segmented in different ways depending on tokenizer, by @dqkqd in https://github.com/meilisearch/charabia/pull/311
Fix pagination when embedding fails by @dureuill in https://github.com/meilisearch/meilisearch/pull/5063
Fix issue causing Meilisearch to ignore stop words in some cases by @ManyTheFish in #5062
Fix phrase search with attributesToSearchOn in #5062 by @ManyTheFish

Misc

Dependencies updates
- Update benchmarks to match the new crates subfolder by @Kerollmops in #5021
- Fix the benchmarks by @irevoire in #5037
- Bump Swatinem/rust-cache from 2.7.1 to 2.7.5 in #5030
- Update charabia v0.9.2 by @ManyTheFish in #5098
- Update mini-dashboard to v0.2.16 version by @curquiza in #5102
CIs and tests
- Improve performance of delete_index.rs by @DerTimonius in #4963
- Improve performance of create_index.rs by @DerTimonius in #4962
- Improve performance of get_documents.rs by @PedroTurik in #5025
- Improve performance of formatted.rs by @PedroTurik in #5043
- Fix the path used in the flaky tests CI by @Kerollmops in #5049
Misc
- Rollback the Meilisearch Kawaii logo by @Kerollmops in #5017
- Add image source label to Dockerfile by @wuast94 in #4990
- Hide code complexity into a subfolder by @Kerollmops in #5016
- Internal tool: implement offline upgrade from v1.10 to v1.11 by @irevoire in #5034
- Internal tool: implement offline upgrade from v1.11 to v1.12 by @ManyTheFish in #5146
- Meilisearch is now able to retrieve Katakana words from a Hiragana query by @tats-u in https://github.com/meilisearch/charabia/pull/312
- Improve error handling when writing into LMDB by @Kerollmops in https://github.com/meilisearch/meilisearch/pull/5089

❤️ Thanks again to our external contributors:

`v1.11.3`: 🐿️

Compare Source

What's Changed

For REST/OpenAI/ollama autoembedders users: Retry if deserialization of remote response failed by @dureuill in https://github.com/meilisearch/meilisearch/pull/5058

Full Changelog: meilisearch/meilisearch@v1.11.2...v1.11.3

Meilisearch v1.11 introduces AI-powered search performance improvements thanks to binary quantization and various usage changes, all of which are steps towards a future stabilization of the feature. We have also improved federated search usage following user feedback.

New features and updates 🔥

Experimental - AI-powered search improvements

This release is Meilisearch's first step towards stabilizing AI-powered search and introduces a few breaking changes to its API. Consult the PRD for full usage details.

Done by @dureuill in #4906, #4920, #4892, and #4938.

⚠️ Breaking changes

When performing AI-powered searches, hybrid.embedder is now a mandatory parameter in GET and POST /indexes/{:indexUid}/search
As a consequence, it is now mandatory to pass hybrid even for pure semantic searches
embedder is now a mandatory parameter in GET and POST /indexes/{:indexUid}/similar
Meilisearch now ignores semanticRatio and performs a pure semantic search for queries that include vector but not q
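A minimal sketch of what these breaking changes mean for a client that builds search payloads. The helper names and the embedder name `default` are hypothetical; only the field names (`hybrid`, `embedder`, `semanticRatio`, `vector`, `q`) come from the notes above.

```python
def build_ai_search_payload(embedder, q=None, vector=None, semantic_ratio=None):
    """Build a search payload that always carries hybrid.embedder."""
    if not embedder:
        raise ValueError("hybrid.embedder is mandatory for AI-powered searches")
    hybrid = {"embedder": embedder}
    if semantic_ratio is not None:
        hybrid["semanticRatio"] = semantic_ratio
    payload = {"hybrid": hybrid}
    if q is not None:
        payload["q"] = q
    if vector is not None:
        payload["vector"] = list(vector)
    return payload


def is_pure_semantic(payload):
    """Per the notes, vector without q is treated as a pure semantic search."""
    return "vector" in payload and "q" not in payload


# "default" is a made-up embedder name.
semantic_only = build_ai_search_payload("default", vector=[0.1, 0.2, 0.3])
hybrid_search = build_ai_search_payload("default", q="batman", semantic_ratio=0.5)
```

In the first payload `semanticRatio` would be ignored even if provided, because the request has `vector` but no `q`.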

Addition & improvements

The default model for OpenAI is now text-embedding-3-small instead of text-embedding-ada-002
This release introduces a new embedder option: documentTemplateMaxBytes. Meilisearch will truncate a document's template text when it goes over the specified limit
Fields in documentTemplate include a new field.is_searchable property. The default document template now filters out both empty fields and fields not in the searchable attributes list:

v1.11:

{% for field in fields %}
  {% if field.is_searchable and not field.value == nil %}
    {{ field.name }}: {{ field.value }}\n
  {% endif %}
{% endfor %}

v1.10:

{% for field in fields %}
  {{ field.name }}: {{ field.value }}\n
{% endfor %}

Embedders using the v1.10 document template will continue working as before. The new default document template will only work with newly created embedders.
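To make the difference between the two default templates concrete, here is a plain-Python imitation of their filtering logic. It is not a Liquid renderer, and whitespace handling in the real templates may differ; the function names and the sample fields are hypothetical.

```python
def render_v1_11(fields):
    """Mimic the v1.11 default: skip non-searchable fields and nil values."""
    return "\n".join(
        f"{field['name']}: {field['value']}"
        for field in fields
        if field["is_searchable"] and field["value"] is not None
    )


def render_v1_10(fields):
    """Mimic the v1.10 default: every field is rendered, empty or not."""
    return "\n".join(
        f"{field['name']}: {'' if field['value'] is None else field['value']}"
        for field in fields
    )


# Made-up document fields for the illustration.
sample_fields = [
    {"name": "title", "value": "Batman", "is_searchable": True},
    {"name": "poster", "value": "poster.png", "is_searchable": False},
    {"name": "overview", "value": None, "is_searchable": True},
]

new_text = render_v1_11(sample_fields)
old_text = render_v1_10(sample_fields)
```

With these sample fields the v1.11 logic keeps only the title line, while the v1.10 logic emits all three.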

Vector database indexing performance improvements

v1.11 introduces a new embedder option, binaryQuantized:

curl \
  -X PATCH 'http://localhost:7700/indexes/movies/settings' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "embedders": {
      "image2text": {
        "binaryQuantized": true
      }
    }
  }'

Enable binary quantization to convert embeddings of floating point numbers into embeddings of boolean values. This will negatively impact the relevancy of AI-powered searches but significantly improve performance in large collections with more than 100 dimensions.

In our benchmarks, this reduced the size of the database by a factor of 10 and divided the indexing time by a factor of 6 with little impact on search times.

Warning
Enabling this feature will update all of your vectors to contain only 1s or -1s, significantly impacting relevancy.

You cannot revert this option once you enable it. Before setting binaryQuantized to true, Meilisearch recommends testing it in a smaller or duplicate index in a development environment.
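The relevancy cost of binary quantization can be seen in a tiny sketch. This assumes the simplest possible scheme, thresholding each component at zero, which the release notes do not specify; the function name is hypothetical.

```python
def binary_quantize(embedding):
    """Map each float component to 1 or -1.

    Assumption: a simple sign threshold at zero. The actual quantization
    used by the engine may differ.
    """
    return [1 if component > 0 else -1 for component in embedding]


# Two clearly different embeddings...
a = [0.12, -0.80, 0.03]
b = [0.90, -0.01, 0.50]

# ...collapse to the same quantized vector, so they can no longer be told apart.
qa = binary_quantize(a)
qb = binary_quantize(b)
```

Each component then needs one bit instead of a full float, which is consistent with the large reduction in database size reported above.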

Done by @irevoire in #4941.

Federated search improvements

Facet distribution and stats for federated searches

This release adds two new federated search options, facetsByIndex and mergeFacets. These allow you to request a federated search for facet distributions and stats data.

Facet information by index

To obtain facet distribution and stats for each separate index, use facetsByIndex when querying the POST /multi-search endpoint:

POST /multi-search
{
  "federation": {
    "limit": 20,
    "offset": 0,
    "facetsByIndex": {
      "movies": ["title", "id"],
      "comics": ["title"]
    }
  },
  "queries": [
    {
      "q": "Batman",
      "indexUid": "movies"
    },
    {
      "q": "Batman",
      "indexUid": "comics"
    }
  ]
}

The multi-search response will include a new field, facetsByIndex, with facet data separated per index:

{
  "hits": […],
  …
  "facetsByIndex": {
      "movies": {
        "distribution": {
          "title": {
            "Batman returns": 1
          },
          "id": {
            "42": 1
          }
        },
        "stats": {
          "id": {
            "min": 42,
            "max": 42
          }
        }
      },
     …
  }
}

Merged facet information

To obtain facet distribution and stats for all indexes merged into a single result, use both facetsByIndex and mergeFacets when querying the POST /multi-search endpoint:

POST /multi-search
{
  "federation": {
    "limit": 20,
    "offset": 0,
    "facetsByIndex": {
      "movies": ["title", "id"],
      "comics": ["title"]
    },
    "mergeFacets": {
      "maxValuesPerFacet": 10
    }
  },
  "queries": [
    {
      "q": "Batman",
      "indexUid": "movies"
    },
    {
      "q": "Batman",
      "indexUid": "comics"
    }
  ]
}

The response includes two new fields, facetDistribution and facetStats:

{
  "hits": […],
  …
  "facetDistribution": {
    "title": {
      "Batman returns": 1,
      "Batman: the killing joke":
    },
    "id": {
      "42": 1
    }
  },
  "facetStats": {
    "id": {
      "min": 42,
      "max": 42
    }
  }
}
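A rough model of how per-index facet distributions could be folded into a single one. It assumes that counts for the same value are summed and that values are sorted by name before being truncated to `maxValuesPerFacet`; neither detail is stated in the notes. The function name and all counts in the sample are made up.

```python
from collections import Counter


def merge_facet_distributions(facets_by_index, max_values_per_facet):
    """Merge per-index facet distributions into one.

    Assumptions: counts are summed across indexes, values are sorted by
    name, and each facet keeps at most max_values_per_facet values.
    """
    merged = {}
    for per_index in facets_by_index.values():
        for facet, counts in per_index["distribution"].items():
            merged.setdefault(facet, Counter()).update(counts)
    return {
        facet: dict(sorted(counts.items())[:max_values_per_facet])
        for facet, counts in merged.items()
    }


# Hypothetical per-index data shaped like the facetsByIndex response.
sample = {
    "movies": {"distribution": {"title": {"Batman returns": 1}, "id": {"42": 1}}},
    "comics": {"distribution": {"title": {"Batman returns": 2, "Dark Knight": 1}}},
}

merged = merge_facet_distributions(sample, max_values_per_facet=10)
```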

Done by @dureuill in #4929.

Experimental — New `STARTS WITH` filter operator

Enable the containsFilter experimental feature to use the STARTS WITH filter operator:

curl \
  -X PATCH 'http://localhost:7700/experimental-features/' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "containsFilter": true
  }'

Use the STARTS WITH operator when filtering:

curl \
  -X POST http://localhost:7700/indexes/movies/search \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "filter": "hero STARTS WITH spider"
  }'

🗣️ This is an experimental feature, and we need your help to improve it! Share your thoughts and feedback on this GitHub discussion.

Done by @Kerollmops in #4939.

Other improvements

Language support and localizedAttributes settings by @ManyTheFish in #4937
- Add ISO-639-1 variants
- Convert ISO-639-1 into ISO-639-3
Add a German language tokenizer by @luflow in meilisearch/charabia#303 and in #4945
Improve Turkish language support by @tkhshtsh0917 in meilisearch/charabia#305 and in #4957
Upgrade "batch failed" log to error level in #4955 by @dureuill.
Update the search UI: remove the forced capitalized fields, by @curquiza in #4993

Fixes 🐞

⚠️ When using federated search, query.facets was silently ignored at the query level, but should not have been. It now returns the appropriate error. Use federation.facetsByIndex instead if you want facets to be applied during federated search.
Prometheus /metrics returns the route pattern instead of the real route when returning the HTTP requests total, by @irevoire in #4839
Truncate values at the end of a list of facet values when the number of facet values is larger than maxValuesPerFacet. For example, setting maxValuesPerFacet to 2 could result in ["blue", "red", "yellow"] being truncated to ["blue", "yellow"] instead of ["blue", "red"]. By @dureuill in #4929
Improve the task cancellation when vectors are used, by @irevoire in #4971
Swedish support: the characters å, ä, ö are no longer normalized to a and o. By @ManyTheFish in #4945
Update rhai to fix an internal error when updating documents with a function (experimental) by @irevoire in #4960
Fix the bad experimental search queue size by @irevoire in #4992
Do not send empty edit document by function by @irevoire in #5001
Display vectors when no custom vectors were ever provided by @dureuill in #5008

Misc

Dependencies updates
- Security dependency upgrade: bump quinn-proto from 0.11.3 to 0.11.8 by @dependabot in #4911
CIs and tests
- Make the tests run faster by @irevoire in #4808
Documentation
- Fix broken links in README by @iornstein in #4943
Misc
- Allow Meilitool to upgrade from v1.9 to v1.10 without a dump in some conditions, by @dureuill in #4912
- Fix bench by adding embedder by @dureuill in #4954
- Revamp analytics by @irevoire in #5011

❤️ Thanks again to our external contributors:

Meilisearch: @iornstein.
Charabia: @luflow, @tkhshtsh0917.

`v1.10.3`: 🦩

Compare Source

Search improvements

This PR lets you configure two behaviors of the engine through experimental CLI flags:

The number of searches Meilisearch can process concurrently per core, with the --experimental-nb-searches-per-core CLI flag
The number of seconds after which Meilisearch considers a search irrelevant and drops it without processing it, with the --experimental-drop-search-after CLI flag

Done by @irevoire in https://github.com/meilisearch/meilisearch/pull/5000

Full Changelog: meilisearch/meilisearch@v1.10.2...v1.10.3

`v1.10.2`: 🦩

Compare Source

Fixes 🦋

Activate the Swedish tokenization Pipeline

The Swedish tokenization pipeline was deactivated in previous versions. It is now activated when you specify the index language in the settings:

PATCH `/indexes/:index-name/settings`

{
  "localizedAttributes": [ { "locales": ["swe"], "attributePatterns": ["*"] } ]
}

related PR: #4949

`v1.10.1`: 🦩

Compare Source

Fixes 🦋

Better search handling under heavy loads

The following PRs should make Meilisearch behave better under heavy load:

Only spawn one search queue in actix-web by @irevoire in https://github.com/meilisearch/meilisearch/pull/4893
Make sure the index scheduler never stops running by @irevoire in https://github.com/meilisearch/meilisearch/pull/4896
Explicitly drop the search permits by @irevoire in https://github.com/meilisearch/meilisearch/pull/4898
Stop trying to process searches after one minute by @irevoire in https://github.com/meilisearch/meilisearch/pull/4899

Speed improvement 🐎

Document deletions can now be autobatched with document deletions by filter, which should unclog the task queue for people who use these two operations heavily.
Meilisearch still cannot autobatch document deletions by filter with document additions, though.

Autobatch document deletion by filter by @irevoire in https://github.com/meilisearch/meilisearch/pull/4901
Do not fail the whole batch when a single document deletion by filter fails by @irevoire in https://github.com/meilisearch/meilisearch/pull/4905

Full Changelog: meilisearch/meilisearch@v1.10.0...v1.10.1

`v1.10.0`: 🦩

Compare Source

Meilisearch v1.10 introduces federated search. This innovative feature allows you to receive a single list of results for multi-search requests. v1.10 also includes a setting to manually define which language or languages are present in your documents, and two new experimental features: the CONTAINS filter operator and the ability to update a subset of your dataset with a function.

New features and updates 🔥

Federated search

Use the new federation setting of the /multi-search route to return a single search result object:

curl \
  -X POST 'http://localhost:7700/multi-search' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "federation": {
      "offset": 5,
      "limit": 10
    },
    "queries": [
      {
        "q": "Batman",
        "indexUid": "movies"
      },
      {
        "q": "Batman",
        "indexUid": "comics"
      }
    ]
  }'

Response:

{
  "hits": [
    {
      "id": 42,
      "title": "Batman returns",
      "overview": "..",
      "_federation": {
        "indexUid": "movies",
        "queriesPosition": 0
      }
    },
    {
      "comicsId": "batman-killing-joke",
      "description": "..",
      "title": "Batman: the killing joke",
      "_federation": {
        "indexUid": "comics",
        "queriesPosition": 1
      }
    },
    …
  ],
  "processingTimeMs": 0,
  "limit": 20,
  "offset": 0,
  "estimatedTotalHits": 2,
  "semanticHitCount": 0
}

When performing a federated search, Meilisearch merges the results coming from different sources in descending ranking score order.

If federation is empty ({}), Meilisearch sets offset and limit to 0 and 20 respectively.

If federation is null or missing, multi-search returns one list of search result objects for each index.
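The merging rules above can be sketched in a few lines of Python. This is a simplified model under stated assumptions: each hit is assumed to carry a `_rankingScore`, the non-federated response shape is reduced to a bare `results` list, and `federated_merge` is a hypothetical name.

```python
DEFAULT_OFFSET, DEFAULT_LIMIT = 0, 20


def federated_merge(per_query_hits, federation):
    """per_query_hits: list of (indexUid, hits) in the order of the queries array."""
    if federation is None:
        # No federation: one result list per query, as in a regular multi-search.
        return {"results": [{"indexUid": uid, "hits": hits} for uid, hits in per_query_hits]}
    offset = federation.get("offset", DEFAULT_OFFSET)
    limit = federation.get("limit", DEFAULT_LIMIT)
    merged = [
        {**hit, "_federation": {"indexUid": uid, "queriesPosition": position}}
        for position, (uid, hits) in enumerate(per_query_hits)
        for hit in hits
    ]
    # Descending ranking score; Python's sort is stable, so ties keep query order.
    merged.sort(key=lambda hit: hit["_rankingScore"], reverse=True)
    return {"hits": merged[offset:offset + limit], "offset": offset, "limit": limit}


# Made-up hits and scores for the illustration.
sample = [
    ("movies", [{"id": 42, "_rankingScore": 0.8}]),
    ("comics", [{"comicsId": "batman-killing-joke", "_rankingScore": 0.9}]),
]

federated = federated_merge(sample, {})      # empty object: offset 0, limit 20
separate = federated_merge(sample, None)     # null: one list per query
```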

Federated results relevancy

When performing federated searches, use federationOptions in the request's queries array to configure the relevancy and the weight of each index:

curl \
 -X POST 'http://localhost:7700/multi-search' \
 -H 'Content-Type: application/json' \
 --data-binary '{
  "federation": {},
  "queries": [
    {
      "q": "apple red",
      "indexUid": "fruits",
      "filter": "BOOSTED = true",
      "_showRankingScore": true,
      "federationOptions": {
        "weight": 3.0
      }
    },
    {
      "q": "apple red",
      "indexUid": "fruits",
      "_showRankingScore": true
    }
  ]
}'

federationOptions must be an object. It supports a single field, weight, which must be a positive floating-point number:

if weight < 1.0, results from this index are less likely to appear in the results
if weight > 1.0, results from this index are more likely to appear in the results
if not specified, weight defaults to 1.0
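A small sketch of how a weight might influence ordering, assuming the weight simply multiplies each hit's ranking score before results are merged. The notes only say results become more or less likely to appear, so the multiplication is an assumption; the function name and scores are made up.

```python
def weighted_score(ranking_score, weight=1.0):
    """Scale a ranking score by a federation weight.

    Assumption: the weight is a plain multiplier on the ranking score.
    """
    if weight <= 0:
        raise ValueError("weight must be a positive floating-point number")
    return ranking_score * weight


# Hypothetical scores: a boosted result with a lower raw score...
boosted = weighted_score(0.5, weight=3.0)
# ...and a regular result using the default weight of 1.0.
regular = weighted_score(0.9)

ordered = sorted(
    [("boosted", boosted), ("regular", regular)],
    key=lambda pair: pair[1],
    reverse=True,
)
```

Under this assumption the boosted query's hit ranks first despite its lower raw score.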

📖 Consult the usage page for more information about the merge algorithm.

Done by @dureuill in #4769.

Experimental: `CONTAINS` filter operator

Enable the containsFilter experimental feature to use the CONTAINS filter operator:

curl \
  -X PATCH 'http://localhost:7700/experimental-features/' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "containsFilter": true
  }'

CONTAINS filters results containing partial matches to the specified string, similar to a SQL LIKE:

curl \
  -X POST http://localhost:7700/indexes/movies/search \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "q": "super hero",
    "filter": "synopsis CONTAINS spider"
  }'

🗣️ This is an experimental feature, and we need your help to improve it! Share your thoughts and feedback on this GitHub discussion.

Done by @irevoire in #4804.

Language settings

Use the new localizedAttributes index setting and the locales search parameter to explicitly set the languages used in document fields and the search query itself. This is particularly useful for <=v1.9 users who have to occasionally resort to alternative Meilisearch images due to language auto-detect issues in Swedish and Japanese datasets.

Done by @ManyTheFish in #4819.

Set language during indexing with `localizedAttributes`

Use the newly introduced localizedAttributes setting to explicitly declare which languages correspond to which document fields:

curl \
  -X PATCH 'http://localhost:7700/indexes/movies/settings' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "localizedAttributes": [
      {"locales": ["jpn"], "attributePatterns": ["*_ja"]},
      {"locales": ["eng"], "attributePatterns": ["*_en"]},
      {"locales": ["cmn"], "attributePatterns": ["*_zh"]},
      {"locales": ["fra", "ita"], "attributePatterns": ["latin.*"]},
      {"locales": [], "attributePatterns": ["*"]}
    ]
  }'

locales is a list of ISO-639-3 language codes to assign to a pattern. The currently supported languages are: epo, eng, rus, cmn, spa, por, ita, ben, fra, deu, ukr, kat, ara, hin, jpn, heb, yid, pol, amh, jav, kor, nob, dan, swe, fin, tur, nld, hun, ces, ell, bul, bel, mar, kan, ron, slv, hrv, srp, mkd, lit, lav, est, tam, vie, urd, tha, guj, uzb, pan, aze, ind, tel, pes, mal, ori, mya, nep, sin, khm, tuk, aka, zul, sna, afr, lat, slk, cat, tgl, hye.

attributePatterns is a list of patterns, each of which can start or end with a * to match one or several attributes.

If an attribute matches several rules, only the first rule in the list will be applied. If the locales list is empty, then Meilisearch is allowed to auto-detect any language in the matching attributes.

These rules are applied to the searchableAttributes, the filterableAttributes, and the sortableAttributes.
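The first-match rule described above can be modeled with a short Python sketch. It uses `fnmatch.fnmatchcase`, which accepts more wildcards than the leading or trailing `*` described here, so it is only an approximation of the real matching. The function name `locales_for` is hypothetical.

```python
from fnmatch import fnmatchcase


def locales_for(attribute, rules):
    """Return the locales of the first rule whose pattern matches the attribute.

    An empty list means the language is auto-detected for that attribute.
    """
    for rule in rules:
        if any(fnmatchcase(attribute, pattern) for pattern in rule["attributePatterns"]):
            return rule["locales"]
    return []  # no rule matched: fall back to auto-detection


# Same rules as in the settings example above.
rules = [
    {"locales": ["jpn"], "attributePatterns": ["*_ja"]},
    {"locales": ["eng"], "attributePatterns": ["*_en"]},
    {"locales": ["cmn"], "attributePatterns": ["*_zh"]},
    {"locales": ["fra", "ita"], "attributePatterns": ["latin.*"]},
    {"locales": [], "attributePatterns": ["*"]},
]

title_ja = locales_for("title_ja", rules)
latin_name = locales_for("latin.name", rules)
overview = locales_for("overview", rules)
```

Because only the first matching rule applies, the catch-all `*` rule has to come last.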

Set language at search time with `locales`

The /search route accepts a new parameter, locales. Use it to define the language used in the current query:

curl \
  -X POST http://localhost:7700/indexes/movies/search \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "q": "進撃の巨人",
    "locales": ["jpn"]
  }'

The locales parameter overrides any locales set in the index settings.

Experimental: Edit documents with a Rhai function

Use a Rhai function to edit documents in your database directly from Meilisearch:

First, activate the experimental feature:

curl \
  -X PATCH 'http://localhost:7700/experimental-features/' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "editDocumentsByFunction": true
  }'

Then query the /documents/edit route with the editing function:

curl http://localhost:7700/indexes/movies/documents/edit \
  -H 'content-type: application/json' \
  -d '{
   "function": "doc.title = `✨ ${doc.title.to_upper()} ✨`",
   "filter": "id > 3000"
  }'

/documents/edit accepts three parameters in its payload: function, filter, and context.

function must be a string with a Rhai function. filter must be a filter expression. context must be an object with data you want to make available for the editing function.

📖 More information here.

🗣️ This is an experimental feature and we need your help to improve it! Share your thoughts and feedback on this GitHub discussion.

Done by @Kerollmops in #4626.

Experimental AI-powered search: quality of life improvements

For the purpose of future stabilization of the feature, we are applying changes and quality-of-life improvements.

Done by @dureuill in #4801, #4815, #4818, #4822.

⚠️ Breaking changes: Changing the parameters of the REST API

The old parameters of the REST API are too numerous and confusing.

Removed parameters: query, inputField, inputType, pathToEmbeddings and embeddingObject.
Replaced by:

request: A JSON value that represents the request made by Meilisearch to the remote embedder. The text to embed must be replaced by the placeholder value "{{text}}".
response: A JSON value that represents a fragment of the response made by the remote embedder to Meilisearch. The embedding must be replaced by the placeholder value "{{embedding}}".

After (v1.10) and before (v1.9):

// v1.10 version ✅
{
  "source": "rest",
  "url": "https://localhost:10006",
  "request": {
    "model": "minillm",
    "prompt": "{{text}}"
  },
  "response": {
    "embedding": "{{embedding}}"
  }
}

// v1.9 version ❌
{
  "source": "rest",
  "url": "https://localhost:10006",
  "query": {
    "model": "minillm",
  },
  "inputField": ["prompt"],
  "inputType": "text",
  "embeddingObject": ["embedding"]
}
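For simple configurations like the one above, the rewrite from the v1.9 shape to the v1.10 shape can be sketched mechanically. This is a hypothetical helper, not a supported migration tool: it ignores inputType and pathToEmbeddings and only handles the case where inputField and embeddingObject are plain key paths.

```python
import copy

REMOVED_KEYS = {"query", "inputField", "inputType", "pathToEmbeddings", "embeddingObject"}


def set_path(target, path, value):
    """Set target[path[0]][path[1]]... = value, creating dicts as needed."""
    for key in path[:-1]:
        target = target.setdefault(key, {})
    target[path[-1]] = value


def migrate_rest_embedder(old):
    """Rewrite a simple v1.9 REST embedder config into the v1.10 shape.

    Assumption: only covers the simplest case shown above.
    """
    new = {k: copy.deepcopy(v) for k, v in old.items() if k not in REMOVED_KEYS}
    request = copy.deepcopy(old.get("query", {}))
    set_path(request, old["inputField"], "{{text}}")
    response = {}
    set_path(response, old["embeddingObject"], "{{embedding}}")
    new["request"] = request
    new["response"] = response
    return new


v1_9 = {
    "source": "rest",
    "url": "https://localhost:10006",
    "query": {"model": "minillm"},
    "inputField": ["prompt"],
    "inputType": "text",
    "embeddingObject": ["embedding"],
}

v1_10 = migrate_rest_embedder(v1_9)
```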

Caution
This is a breaking change to the configuration of REST embedders.
Importing a dump containing a REST embedder configuration will fail in v1.10 with an error: "Error: unknown field query, expected one of source, model, revision, apiKey, dimensions, documentTemplate, url, request, response, distribution at line 1 column 752".

Upgrade procedure:

Remove embedders with source "rest"
Update your Meilisearch Cloud project or self-hosted Meilisearch instance as usual

Add custom headers to REST embedders

When the source of an embedder is set to rest, you may include an optional headers parameter. Use this to configure custom headers you want Meilisearch to include in the requests it sends to the embedder.

Embedding requests sent from Meilisearch to a remote REST embedder always contain two headers:

Authorization: Bearer <apiKey> (only if apiKey was provided)
Content-Type: application/json

When provided, headers should be a JSON object whose keys represent the name of additional headers to send in requests, and the values represent the value of these additional headers.

If headers is missing or null for a rest embedder, only Authorization and Content-Type are sent, as described above.

If headers contains Authorization and Content-Type, the declared values will override the ones that are sent by default.

Using the headers parameter for any other source besides rest results in an invalid_settings_embedder error.
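The header rules above amount to "defaults first, then user overrides". A minimal sketch follows, assuming header names are compared exactly as written (real HTTP header names are case-insensitive) and using a hypothetical function name.

```python
def effective_headers(source, api_key=None, headers=None):
    """Compute the headers sent to a remote embedder, per the rules above.

    Assumption: override matching is done on exact key spelling.
    """
    if headers is not None and source != "rest":
        # The notes name this error code for non-rest sources.
        raise ValueError("invalid_settings_embedder: headers is only valid for rest")
    result = {"Content-Type": "application/json"}
    if api_key is not None:
        result["Authorization"] = f"Bearer {api_key}"
    result.update(headers or {})  # declared values override the defaults
    return result


defaults_only = effective_headers("rest", api_key="SECRET")
overridden = effective_headers(
    "rest",
    api_key="SECRET",
    headers={"Authorization": "Basic abc", "X-Trace": "1"},  # made-up values
)
```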

Other quality-of-life improvements

📖 More details here

Add url parameter to the OpenAI embedder. url should be a URL to the embedding endpoint (including the v1/embeddings part) from OpenAI. If url is missing or null for an openAi embedder, the default OpenAI embedding route will be used (https://api.openai.com/v1/embeddings).
dimensions is now available as an optional parameter for ollama embedders. Previously it was only available for rest, openAi and userProvided embedders.
Previously _vectors.embedder was omitted for documents without at least one embedding for embedder. This was inconsistent and prevented the user from checking the value of regenerate.
When a request to a REST embedder fails, the duration of the exponential backoff is now randomized up to twice its base duration
Truncate rather than embed by chunk when OpenAI embeddings are bigger than the max number of tokens
Improve error message when indexing documents and embeddings are missing for a user-provided embedder
Improve error message when a model configuration cannot be loaded and its "architectures" field does not contain "BertModel"
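The randomized backoff mentioned in the list above could look roughly like the sketch below. The base delay, the doubling per attempt, and the uniform draw between the base and twice the base are all assumptions; the notes only state that the duration is randomized up to twice its base duration.

```python
import random


def backoff_delay(attempt, base_seconds=0.1, rng=random):
    """Return a randomized delay for the given retry attempt (0-based).

    Assumptions: the base delay doubles with each attempt, and the final
    delay is drawn uniformly between the base and twice the base.
    """
    base = base_seconds * (2 ** attempt)
    return rng.uniform(base, 2 * base)


# A seeded generator keeps the illustration reproducible.
seeded = random.Random(0)
delays = [backoff_delay(attempt, rng=seeded) for attempt in range(5)]
```

Randomizing the delay spreads out retries from many concurrent requests so they do not all hit the embedder at the same moment.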

⚠️ Important change regarding the minimal Ubuntu version compatible with Meilisearch

Because the GitHub Actions runner now enforces the usage of a Node version that is not compatible with Ubuntu 18.04 anymore, we had to upgrade the minimal Ubuntu version compatible with Meilisearch. Indeed, we use these GitHub actions to build and provide our binaries.

Meilisearch is now only compatible with Ubuntu 20.04 and later, and no longer with Ubuntu 18.04.

Done by @curquiza in #4783.

Other improvements

Search speed optimization: implement intersection at the end of the search pipeline by @Kerollmops in #4717
Indexing speed optimization: stop opening indexes to only check if they exist by @Karribalu in #4787
Improve tenant token error messages by @irevoire in #4724
Add null byte as hard context separator by @LukasKalbertodt in meilisearch/charabia#295
Adds all math symbols to the default separator list by @phillitrOSU in meilisearch/charabia#301
Errors emitted at the main level of the Meilisearch binary are now logged with level ERROR by @dureuill in #4835

Fixes 🐞

Fix invalid primary key for big numbers by @JWSong in #4725
Fix wrong HTTP status and confusing error message on wrong payload by @Karribalu in #4716
Fix the missing geo distance when one or both of the lat/lng are string by @irevoire in #4731
Fix errors related to OffsetDateTime: use a fixed date format regardless of features by @dureuill in #4850
Fix filter that doesn't return valid documents by @dureuill in #4864 & #4858

Misc

Dependencies updates
- Update most of the dependencies by @irevoire in #4786
- Update yaup by @irevoire in #4703
- Bump docker/build-push-action from 5 to 6 by @dependabot in #4758
- Bump zerovec from 0.10.1 to 0.10.4 by @dependabot in #4785
- Update rustls as much as possible by @irevoire in #4806
CIs and tests
- Fix CI with Rust v1.79 by @dureuill in #4723
- Fix flaky test by @irevoire in #4730
- Specify the rust toolchain by @irevoire in #4706
- Add vX Docker tag when publishing Docker image by @curquiza in #4761
- Add search benchmarks by @dureuill in #4762
- Add tests on the rest embedder by @irevoire and @dureuill in #4755
- Add OpenAI tests by @dureuill in #4846
Documentation
- Add june 11th webinar banner by @Strift in #4691
- Revert "Add june 11th webinar banner" by @curquiza in #4705
- Update the README to link more demos by @Kerollmops in #4711
- Update README.md by @Strift in #4721
- Change the Meilisearch logo to the kawaii version by @Kerollmops in #4778
Misc
- New workload to ignore the initial compression phase by @Kerollmops in #4773
- Rename the sortable into the filterable movies workload by @Kerollmops in #4774
- Correct apk usages in Dockerfile by @PeterDaveHello in #4781
- Make milli use edition 2021 by @hanbings in #4770
- Allow MEILI_NO_VERGEN env var to skip vergen by @dureuill in #4812

❤️ Thanks again to our external contributors:

Meilisearch: @Karribalu, @hanbings, @junhochoi, @JWSong, @PeterDaveHello.
Charabia: @LukasKalbertodt, @phillitrOSU.

`v1.9.1`: 🦎

Compare Source

Fixes 🪲

Return an empty list of embeddings for embedders that have no document for an embedder. by @dureuill in https://github.com/meilisearch/meilisearch/pull/4889

This fixes an issue where dumps created for indexes with:

A user-provided embedder
At least one document that opts out of vectors for that user-provided embedder

would fail to import correctly.

Upgrade path to v1.10.0 🚀

If you are a Cloud user affected by the above issue, please contact customer support so we can perform the upgrade for you.

If you are an OSS user affected by the above, perform the following operations:

Upgrade from v1.9.0 to v1.9.1 without using a dump
Upgrade to v1.10.0 using a dump created from v1.9.1
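
As a rough illustration of the second step, the sketch below builds the HTTP request that asks a running v1.9.1 instance to create a dump. It assumes the standard `POST /dumps` route and bearer-token authentication; the base URL, the `MASTER_KEY` placeholder, and the helper name `build_dump_request` are made up for this example.

```python
import urllib.request


def build_dump_request(base_url, api_key=None):
    """Build (but do not send) a POST /dumps request that starts dump creation."""
    headers = {}
    if api_key is not None:
        # Placeholder key; only needed when the instance is protected.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        url=f"{base_url}/dumps",
        headers=headers,
        method="POST",
    )


# Run this against the instance that has already been upgraded to v1.9.1.
dump_request = build_dump_request("http://localhost:7700", api_key="MASTER_KEY")
# To actually send it: urllib.request.urlopen(dump_request)
```

The resulting dump file would then be imported when starting v1.10.0, typically with the `--import-dump` launch option.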

Full Changelog

`v1.9.0`: 🦎

Compare Source

Meilisearch v1.9 includes performance improvements for hybrid search and the addition/updating of settings. This version benefits from multiple requested features, such as the new frequency matching strategy and the ability to retrieve similar documents.

🧰 All official Meilisearch integrations (including SDKs, clients, and other tools) are compatible with this Meilisearch release. Integration deployment happens between 4 to 48 hours after a new version becomes available.

Some SDKs might not include all new features. Consult the project repository for detailed information. Is a feature you need missing from your chosen SDK? Create an issue letting us know you need it, or, for open-source karma points, open a PR implementing it (we'll love you for that ❤️).

New features and updates 🔥

Hybrid search updates

This release introduces multiple hybrid search updates.

Done by @dureuill and @irevoire in #4633 and #4649

⚠️ Breaking change: Empty `_vectors.embedder` arrays

Empty _vectors.embedder arrays are now interpreted as having no vector embedding.

Before v1.9, Meilisearch interpreted these as a single embedding of dimension 0. This change follows user feedback that the previous behavior was unexpected and unhelpful.

⚠️ Breaking change: `_vectors` field no longer present in search results

When the experimental vectorStore feature is enabled, Meilisearch no longer includes _vectors in returned search results by default. This will considerably improve performance.

Use the new retrieveVectors search parameter to display the _vectors field:

curl \
  -X POST 'http://localhost:7700/indexes/INDEX_NAME/search' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "q": "SEARCH QUERY",
    "retrieveVectors": true
  }'
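
The same request can be sketched in Python with only the standard library. This is a minimal example that builds the request without sending it; the helper name `build_search_request`, the index name, and the query are illustrative assumptions.

```python
import json
import urllib.request


def build_search_request(base_url, index_uid, query, retrieve_vectors=False):
    """Build (but do not send) a POST /indexes/{index_uid}/search request."""
    payload = {"q": query}
    if retrieve_vectors:
        # Opt back in to the `_vectors` field, which is omitted by default.
        payload["retrieveVectors"] = True
    return urllib.request.Request(
        url=f"{base_url}/indexes/{index_uid}/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request(
    "http://localhost:7700", "movies", "batman", retrieve_vectors=True
)
# To actually send it: urllib.request.urlopen(request)
```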

⚠️ Breaking change: Meilisearch no longer preserves the exact representation of embeddings appearing in `_vectors`

To save storage and run faster, Meilisearch no longer stores your vectors "as-is". Meilisearch now returns floats in a canonicalized representation rather than the user-provided representation.

For example, 3 may be represented as 3.0.
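
One practical consequence is that client code comparing uploaded and returned embeddings should compare numbers, not JSON text. The sketch below shows one way to do that; the helper name `same_embedding` and the tolerance value are arbitrary choices for illustration, not something the release notes specify.

```python
import json
import math


def same_embedding(sent, returned, rel_tol=1e-6):
    """Compare two embeddings numerically instead of by their JSON text."""
    return len(sent) == len(returned) and all(
        math.isclose(a, b, rel_tol=rel_tol) for a, b in zip(sent, returned)
    )


sent = json.loads("[3, 0.5]")        # what the client uploaded
returned = json.loads("[3.0, 0.5]")  # a canonicalized form of the same values

# The serialized text differs, but the values are numerically equal.
text_matches = json.dumps(sent) == json.dumps(returned)
values_match = same_embedding(sent, returned)
```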

Document `_vectors` accepts object values

The document _vectors field now accepts objects in addition to embedding arrays:

{
  "id": 42,
  "_vectors": {
    "default": [0.1, 0.2],
    "text": {
      "embeddings": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
      "regenerate": false
    },
    "translation": {
      "embeddings": [0.1, 0.2, 0.3, 0.4],
      "regenerate": true
    }
  }
}

Each embedder object inside _vectors may contain two fields: embeddings and regenerate.

If present, embeddings will replace this document's embeddings.

regenerate must be either true or false. If regenerate: true, Meilisearch will overwrite the document embeddings each time the document is updated in the future. If regenerate: false, Meilisearch will keep the last provided or generated embeddings even if the document is updated in the future.

This change allows importing embeddings to autoembedders as a one-shot process, by setting them as regenerate: true. This change also ensures embeddings are not regenerated when importing a dump created with Meilisearch v1.9.

Meilisearch v1.9.0 also improves performance when indexing and using hybrid search, avoiding useless operations and optimizing the important ones.
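
The embeddings and regenerate fields described above can be assembled on the client side as in the following sketch. The helper name `user_provided_vectors` is hypothetical, and the embedder names and values are copied from the JSON example in this section.

```python
def user_provided_vectors(embeddings, regenerate):
    """Return the object form of a `_vectors` entry for one embedder."""
    if not isinstance(regenerate, bool):
        raise TypeError("regenerate must be either True or False")
    return {"embeddings": embeddings, "regenerate": regenerate}


document = {
    "id": 42,
    "_vectors": {
        # Keep these embeddings even if the document is updated later.
        "text": user_provided_vectors(
            [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], regenerate=False
        ),
        # One-shot import: these embeddings are overwritten on future updates.
        "translation": user_provided_vectors(
            [0.1, 0.2, 0.3, 0.4], regenerate=True
        ),
    },
}
```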

New feature: Ranking score threshold

Use rankingScoreThreshold to exclude search results with low ranking scores:

curl \
 -X POST 'http://localhost:7700/indexes/movies/search' \
 -H 'Content-Type: application/json' \
 --data-binary '{
    "q": "Badman dark returns 1",
    "showRankingScore": true,
    "limit": 5,
    "rankingScoreThreshold": 0.2
 }'

Meilisearch does not return any documents below the configured threshold. Excluded results do not count towards estimatedTotalHits, totalHits, and facet distribution.

⚠️ For performance reasons, if the number of documents above rankingScoreThreshold is higher than limit, Meilisearch does not evaluate the ranking score of the remaining documents. Results ranking below the threshold are not immediately removed from the set of candidates. In this case, Meilisearch may overestimate the count of estimatedTotalHits, totalHits and facet distribution.
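
To make the threshold semantics concrete, the sketch below builds the request body and models the filter locally on a few made-up hits. The helper names, the sample hits, and the default `limit` are assumptions for illustration; the local filter only mirrors the documented rule that documents below the threshold are not returned, not the engine's actual implementation.

```python
def search_payload(query, threshold, limit=20):
    """Body for POST /indexes/{index_uid}/search with a score threshold."""
    return {
        "q": query,
        "showRankingScore": True,  # expose `_rankingScore` on each hit
        "limit": limit,
        "rankingScoreThreshold": threshold,
    }


def keep_above_threshold(hits, threshold):
    """Local model of the filter: drop hits scoring below the threshold."""
    return [hit for hit in hits if hit["_rankingScore"] >= threshold]


# Made-up hits, shaped like results returned with showRankingScore enabled.
hits = [
    {"id": 1, "_rankingScore": 0.91},
    {"id": 2, "_rankingScore": 0.35},
    {"id": 3, "_rankingScore": 0.12},
]
kept = keep_above_threshold(hits, 0.2)
```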

Done by @dureuill in #4666

New feature: Get similar documents endpoint

This release introduces a new AI-powered search feature allowing you to send a document to Meilisearch and receive a list of similar documents in return.

Use the /indexes/{indexUid}/similar endpoint to query Meilisearch for related documents:

curl \
  -X POST 'http://localhost:7700/indexes/INDEX_NAME/similar' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "id": "23",
    "offset": 0,
    "limit": 2,
    "filter": "release_date > 1521763199",
    "embedder": "default",
    "attributesToRetrieve": [],
    "showRankingScore": false,
    "showRankingScoreDetails": false
  }'

id: string indicating the document needing similar results, required
offset: number of results to skip when paginating, optional, defaults to 0
limit: number of results to display, optional, defaults to 20
filter: string with a filter expression Meilisearch should apply to the results, optional, defaults to null
embedder: string indicating the embedder Meilisearch should use to retrieve similar documents, optional, defaults to "default"
attributesToRetrieve: array of strings ind
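
The parameters whose defaults are fully listed above can be captured in a small request builder, sketched here with the standard library only. The helper name `build_similar_request`, the base URL, and the index name are assumptions; parameters whose description is cut off above are deliberately left out.

```python
import json
import urllib.request


def build_similar_request(base_url, index_uid, document_id, offset=0,
                          limit=20, filter_expr=None, embedder="default"):
    """Build (but do not send) a POST /indexes/{index_uid}/similar request."""
    payload = {
        "id": document_id,       # required: the document needing similar results
        "offset": offset,        # optional, defaults to 0
        "limit": limit,          # optional, defaults to 20
        "filter": filter_expr,   # optional, defaults to null
        "embedder": embedder,    # optional, defaults to "default"
    }
    return urllib.request.Request(
        url=f"{base_url}/indexes/{index_uid}/similar",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


similar_request = build_similar_request(
    "http://localhost:7700", "movies", "23",
    limit=2, filter_expr="release_date > 1521763199",
)
# To actually send it: urllib.request.urlopen(similar_request)
```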

Configuration

📅 Schedule: Branch creation - "before 4am on the first day of the month" (UTC), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.

If you want to rebase/retry this PR, check this box

This PR has been generated by Renovate Bot.

cq-bot requested review from a team and murarustefaan January 7, 2025 13:46

cq-bot added the automerge and area/plugin/destination/meilisearch labels Jan 7, 2025

kodiakhq bot approved these changes Jan 7, 2025

fix(deps): Update getmeili/meilisearch Docker tag to v1.12.1

27d7c13

cq-bot force-pushed the renovate/getmeili-meilisearch-1.x branch from d8ad73d to 27d7c13 Compare January 7, 2025 14:06

erezrokah approved these changes Jan 7, 2025

erezrokah changed the title from "fix(deps): Update getmeili/meilisearch Docker tag to v1.12.1" to "chore(deps): Update getmeili/meilisearch Docker tag to v1.12.1" Jan 7, 2025

Merge branch 'main' into renovate/getmeili-meilisearch-1.x

aeb96b3

kodiakhq bot merged commit 0205de5 into main Jan 7, 2025
12 checks passed

kodiakhq bot deleted the renovate/getmeili-meilisearch-1.x branch January 7, 2025 17:27

cq-bot changed the title from "chore(deps): Update getmeili/meilisearch Docker tag to v1.12.1" to "fix(deps): Update getmeili/meilisearch Docker tag to v1.12.1" Jan 7, 2025
