What happened?

I want to use Cohere-embed-v3-multilingual from Azure AI Foundry, and have configured the proxy config accordingly. (When I append /models to the URL, the error instead says that the model cannot be found.)

LiteLLM works great with my other Azure AI Foundry chat-completion models, but when I use the embedding model I receive the error: AzureException - The embeddings operation does not work with the specified model, Cohere-embed-v3-multilingual.

According to the documentation, "LiteLLM supports all models on Azure AI Studio" (now Azure AI Foundry).
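The config block itself did not survive in this report. As a point of reference only, a LiteLLM proxy config for an Azure AI Foundry embedding deployment usually has roughly this shape (the key reference is a placeholder, not a value from the report; note the log below shows the `azure/` prefix in use, whereas LiteLLM's docs route non-OpenAI Azure AI Foundry models through the `azure_ai/` prefix — shown here as an assumption to check, not a confirmed fix):

```yaml
# Sketch of a LiteLLM proxy config.yaml entry for an Azure AI Foundry
# embedding model. api_key is a placeholder; azure_ai/ is the provider
# prefix LiteLLM documents for non-OpenAI Azure AI Foundry models.
model_list:
  - model_name: azure-Cohere-embed-v3-multilingual
    litellm_params:
      model: azure_ai/Cohere-embed-v3-multilingual
      api_base: https://CustomAzureURLEndpointBaseURL.services.ai.azure.com
      api_key: os.environ/AZURE_AI_API_KEY
```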
Relevant log output
12:03:48 - LiteLLM Proxy:DEBUG: proxy_server.py:4155 - An error occurred: litellm.BadRequestError: AzureException - The embeddings operation does not work with the specified model, Cohere-embed-v3-multilingual. Please choose different model and try again. You can learn more about which models can be used with each operation here: https://go.microsoft.com/fwlink/?linkid=2197993..
Received Model Group=azure-Cohere-embed-v3-multilingual
Available Model Group Fallbacks=None
LiteLLM Retried: 1 times, LiteLLM Max Retries: 2
Model: azure/Cohere-embed-v3-multilingual
API Base: `https://CustomAzureURLEndpointBaseURL.services.ai.azure.com`
model_group: `azure-Cohere-embed-v3-multilingual`
deployment: `azure/Cohere-embed-v3-multilingual`
Debug this by setting `--debug`, e.g. `litellm --model gpt-3.5-turbo --debug`

12:03:48 - LiteLLM Proxy:ERROR: proxy_server.py:4160 - litellm.proxy.proxy_server.embeddings(): Exception occured - litellm.BadRequestError: AzureException - The embeddings operation does not work with the specified model, Cohere-embed-v3-multilingual. Please choose different model and try again. You can learn more about which models can be used with each operation here: https://go.microsoft.com/fwlink/?linkid=2197993..
Received Model Group=azure-Cohere-embed-v3-multilingual
Available Model Group Fallbacks=None
LiteLLM Retried: 1 times, LiteLLM Max Retries: 2

Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/main.py", line 3200, in aembedding
    response = await init_response  # type: ignore
  File "/usr/lib/python3.13/site-packages/litellm/llms/azure/azure.py", line 859, in aembedding
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/llms/azure/azure.py", line 831, in aembedding
    raw_response = await openai_aclient.embeddings.with_raw_response.create(
        **data, timeout=timeout
    )
  File "/usr/lib/python3.13/site-packages/openai/_legacy_response.py", line 381, in wrapped
    return cast(LegacyAPIResponse[R], await func(*args, **kwargs))
  File "/usr/lib/python3.13/site-packages/openai/resources/embeddings.py", line 238, in create
    return await self._post(
        ...<10 lines>...
    )
  File "/usr/lib/python3.13/site-packages/openai/_base_client.py", line 1849, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
  File "/usr/lib/python3.13/site-packages/openai/_base_client.py", line 1543, in request
    return await self._request(
        ...<5 lines>...
    )
  File "/usr/lib/python3.13/site-packages/openai/_base_client.py", line 1644, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'code': 'OperationNotSupported', 'message': 'The embeddings operation does not work with the specified model, Cohere-embed-v3-multilingual. Please choose different model and try again. You can learn more about which models can be used with each operation here: https://go.microsoft.com/fwlink/?linkid=2197993.'}}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/litellm/proxy/proxy_server.py", line 4112, in embeddings
    responses = await llm_responses
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 2397, in aembedding
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 2386, in aembedding
    response = await self.async_function_with_fallbacks(**kwargs)
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 3079, in async_function_with_fallbacks
    raise original_exception
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 2893, in async_function_with_fallbacks
    response = await self.async_function_with_retries(*args, **kwargs)
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 3269, in async_function_with_retries
    raise original_exception
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 3162, in async_function_with_retries
    response = await self.make_call(original_function, *args, **kwargs)
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 3278, in make_call
    response = await response
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 2466, in _aembedding
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/router.py", line 2453, in _aembedding
    response = await response
  File "/usr/lib/python3.13/site-packages/litellm/utils.py", line 1397, in wrapper_async
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/utils.py", line 1256, in wrapper_async
    result = await original_function(*args, **kwargs)
  File "/usr/lib/python3.13/site-packages/litellm/main.py", line 3216, in aembedding
    raise exception_type(
        model=model,
        ...<3 lines>...
        extra_kwargs=kwargs,
    )
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 2210, in exception_type
    raise e
  File "/usr/lib/python3.13/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 1988, in exception_type
    raise BadRequestError(
        ...<5 lines>...
    )
litellm.exceptions.BadRequestError: litellm.BadRequestError: AzureException - The embeddings operation does not work with the specified model, Cohere-embed-v3-multilingual. Please choose different model and try again. You can learn more about which models can be used with each operation here: https://go.microsoft.com/fwlink/?linkid=2197993..
Received Model Group=azure-Cohere-embed-v3-multilingual
Available Model Group Fallbacks=None
LiteLLM Retried: 1 times, LiteLLM Max Retries: 2

INFO: 172.17.0.1:48964 - "POST /embeddings HTTP/1.1" 400 Bad Request
Are you a ML Ops Team?
No
What LiteLLM version are you on ?
v1.62.1
Twitter / LinkedIn details
No response