Error: could not create backend -> jinaai/jina-reranker-v1-turbo-en #579


Open
4 tasks
CoolFish88 opened this issue Apr 10, 2025 · 10 comments · May be fixed by #582
Comments


CoolFish88 commented Apr 10, 2025

System Info

Hello,

When deploying jinaai/jina-reranker-v1-turbo-en to a SageMaker endpoint using model artifacts stored in S3, the following error was raised:

Could not start Candle backend: Could not start backend: classifier model type is not supported for Jina
Error: Could not create backend
Caused by: Could not start backend: Could not start a suitable backend

This issue differs from the one reported in #556 for jinaai/jina-embeddings-v2-small-en: the backend error has a different source. If "_name_or_path": "jinaai/jina-bert-implementation" is missing from config.json, the error from #556 emerges instead.
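For reference, the two config.json fields discussed above look like this (a trimmed, illustrative fragment, not the model's full config file):

```json
{
  "_name_or_path": "jinaai/jina-bert-implementation",
  "architectures": ["JinaBertForSequenceClassification"]
}
```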

Cloudwatch logs:

{
"level": "INFO",
"message": "Starting FlashJinaBert model on Cuda(CudaDevice(DeviceId(1)))",
"target": "text_embeddings_backend_candle",
"filename": "backends/candle/src/lib.rs",
"line_number": 296
}
{
"level": "ERROR",
"message": "Could not start Candle backend: Could not start backend: classifier model type is not supported for Jina",
"target": "text_embeddings_backend",
"filename": "backends/src/lib.rs",
"line_number": 388
}

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

Model artifacts

  • Model files fetched using CrossEncoder from sentence-transformers
  • Packaged as .tar.gz and uploaded to S3
model_id = "jinaai/jina-reranker-v1-turbo-en"
model = CrossEncoder(model_id, trust_remote_code=True)
model.save_pretrained(local_path)
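A minimal sketch of the packaging step that follows (the function name and archive layout are assumptions; SageMaker expects the model files at the root of the tar.gz):

```python
import tarfile
from pathlib import Path

def package_model(local_path: str, archive_path: str) -> str:
    """Bundle a saved model directory into a .tar.gz with files at the archive root."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for item in Path(local_path).iterdir():
            # arcname=item.name keeps entries at the root instead of nesting
            # them under the local directory name
            tar.add(item, arcname=item.name)
    return archive_path

# The archive can then be uploaded to S3, e.g. with boto3:
# boto3.client("s3").upload_file(archive_path, "my-bucket", "models/model.tar.gz")
```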

Config.json

TEI version:

  • Built TEI 1.6.1 and 1.7 compatible with SageMaker and A10G GPUs (both versions exhibit the above error when deployed)

Expected behavior

Model deployed successfully with TEI


CoolFish88 commented Apr 10, 2025

The error persists under the following scenarios involving model artifacts stored in S3 that were:

  • Fetched using CrossEncoder and stored locally using save_pretrained()
  • Downloaded manually from the Hub

For each of these cases, the auto_map entry in config.json was altered to point to (1) artifacts from jinaai or (2) artifacts stored in the same model archive (after adding them, e.g. modeling_bert.py, configuration_bert.py)

The above scenarios incorporate the modifications listed here: https://huggingface.co/jinaai/jina-reranker-v1-turbo-en/discussions/13

@CoolFish88
Author

@alvarobartt , @Narsil

I am not familiar with Rust, but I looked a bit into router/src/lib.rs:

  • The function get_backend_model_type returns text_embeddings_backend::ModelType::Classifier, since config.json lists an architecture ending in "Classification" ("architectures": ["JinaBertForSequenceClassification"])
  • In the run function, the Backend::new() call uses backend_model_type (which is Classifier) rather than the converted model_type (which would be a Reranker)

It seems the conversion from Classifier to Reranker needs to happen in get_backend_model_type(), before the backend is initialized, rather than after. With the current structure, even though the router code correctly determines the model should be a reranker, it still passes it to the backend as a classifier.
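The control flow described above can be sketched in Python (an illustrative model of the router's dispatch, not the actual Rust code; the single-label check in the fixed variant is an assumption about how a reranker could be distinguished from a general classifier):

```python
from enum import Enum

class ModelType(Enum):
    EMBEDDING = "embedding"
    CLASSIFIER = "classifier"
    RERANKER = "reranker"

def get_backend_model_type(architectures):
    """Current behavior: any architecture ending in 'Classification' is mapped
    to CLASSIFIER, even for reranker models like JinaBertForSequenceClassification."""
    if any(arch.endswith("Classification") for arch in architectures):
        return ModelType.CLASSIFIER
    return ModelType.EMBEDDING

def get_backend_model_type_fixed(architectures, id2label):
    """Proposed shape of the fix: convert CLASSIFIER to RERANKER before the
    backend is created (here, heuristically, for single-label classifiers)."""
    model_type = get_backend_model_type(architectures)
    if model_type is ModelType.CLASSIFIER and len(id2label) == 1:
        return ModelType.RERANKER
    return model_type
```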

I hope this is useful and provides a starting point towards a solution

@ruxandraburtica

Facing the same issue when attempting to deploy jina reranker models on GPU devices

@ionutcatalinsandu

Hello,
I followed this thread: #554 to deploy the jinaai/jina-reranker-v1-turbo-en model but encountered the "classifier" error reported here.

@alvarobartt
Member

Thank you all for reporting, I'll have a look into it in the coming days hoping to push a patch soon! 🤗

@CoolFish88
Author

Thank you @alvarobartt for looking into this.

Btw, are there any TEI environment variables for tuning the performance of the containers (e.g. batch size)?

@alvarobartt
Member

Hey @CoolFish88, I just got back and was about to look into this. Before going on, I can confirm that the fix in https://huggingface.co/jinaai/jina-reranker-v1-turbo-en/discussions/13 works fine with ONNX (on CPU), which is the issue I originally fixed. I'll investigate further on both MPS and CUDA devices, as we may need to refactor the current JinaBert implementation to support re-rankers there. I can also confirm that setting the model type to "classifier" is correct, since the call to rerank runs the predictions through the standard "predict" method 🤗

@alvarobartt alvarobartt linked a pull request Apr 14, 2025 that will close this issue
@alvarobartt
Member

@CoolFish88 I've created a draft for it at #582, in case you want to give it an early look!

@alvarobartt alvarobartt self-assigned this Apr 14, 2025
@CoolFish88
Author

@alvarobartt,
Thank you for addressing this use case so quickly. Looking forward to the merge!

@CoolFish88
Author

@alvarobartt
I got the draft and validated that it works! Great job!

Awaiting the reviewers' approval of the merge
