[FEATURE] Add support for `HF Nvidia NIM API` on `InferenceEndpointsLLM` #947

plaguss · 2024-09-04T13:52:26Z

Is your feature request related to a problem? Please describe.
Add support for the serverless Nvidia NIM API.

Describe the solution you'd like
As suggested, it will require the following:

The new NIM API requires a specific `base_url` to be passed:

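from huggingface_hub import InferenceClient

# Point the client at the NIM integration endpoint instead of the default API.
client = InferenceClient(
    base_url="https://huggingface.co/api/integrations/dgx/v1",
    api_key="MY_FINEGRAINED_TOKEN",
)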

It then requires a model ID to be passed in the `model` argument of `chat.completions.create`:

output = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Count to 10"},
    ],
    stream=True,
    max_tokens=1024,
)
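
For reference, the two steps above could be wrapped roughly as follows. This is only a sketch, not how `InferenceEndpointsLLM` is or will be implemented: the helper names `build_chat_request`, `collect_stream` and `run_nim_chat` are hypothetical, and it assumes the streamed chunks expose `choices[0].delta.content` as in the OpenAI-style chat completion output of `huggingface_hub`.

```python
from typing import Any, Dict, Iterable

# Base URL taken from the snippet above.
NIM_BASE_URL = "https://huggingface.co/api/integrations/dgx/v1"


def build_chat_request(
    model_id: str,
    user_prompt: str,
    system_prompt: str = "You are a helpful assistant.",
    max_tokens: int = 1024,
    stream: bool = True,
) -> Dict[str, Any]:
    """Hypothetical helper: build the kwargs for `chat.completions.create`."""
    return {
        "model": model_id,  # the NIM API needs the model id on every request
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        "stream": stream,
        "max_tokens": max_tokens,
    }


def collect_stream(chunks: Iterable[Any]) -> str:
    """Hypothetical helper: join the text deltas of a streamed response."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # skip chunks that carry no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # the delta content may be None or empty
            parts.append(delta)
    return "".join(parts)


def run_nim_chat(api_key: str, model_id: str, prompt: str) -> str:
    """Send one chat request to the NIM endpoint and return the full text."""
    # Imported lazily so the helpers above work without huggingface_hub installed.
    from huggingface_hub import InferenceClient

    client = InferenceClient(base_url=NIM_BASE_URL, api_key=api_key)
    stream = client.chat.completions.create(**build_chat_request(model_id, prompt))
    return collect_stream(stream)
```

Calling `run_nim_chat("MY_FINEGRAINED_TOKEN", "meta-llama/Meta-Llama-3-8B-Instruct", "Count to 10")` would perform the same request as the snippet above and return the concatenated text.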

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
PR for reference: huggingface/huggingface_hub#2482

MoritzLaurer · 2024-09-04T15:52:08Z

Thanks! Note that the API is paid and only works with a fine-grained token from an Enterprise Hub org (see details here).

plaguss added the enhancement New feature or request label Sep 4, 2024

plaguss added this to the 1.4.0 milestone Sep 4, 2024

plaguss self-assigned this Sep 4, 2024

gabrielmbmb modified the milestones: 1.4.0, 1.5.0 Oct 8, 2024

gabrielmbmb removed this from the 1.5.0 milestone Jan 16, 2025
