[FEATURE] Add support for HF Nvidia NIM API on InferenceEndpointsLLM #947

Open
plaguss opened this issue Sep 4, 2024 · 1 comment

@plaguss
Contributor

plaguss commented Sep 4, 2024

Is your feature request related to a problem? Please describe.
Add support for serverless Nvidia NIM API.

Describe the solution you'd like
As suggested, it will require the following:

The new NIM API requires a specific base_url to be passed:

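from huggingface_hub import InferenceClient

# The NIM API is reached through a dedicated base URL.
client = InferenceClient(
    base_url="https://huggingface.co/api/integrations/dgx/v1",
    api_key="MY_FINEGRAINED_TOKEN",
)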

It then requires a model id to be passed in the model argument of client.chat.completions.create:

output = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Count to 10"},
    ],
    stream=True,
    max_tokens=1024,
)
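The two requirements above could be kept apart inside an integration roughly as follows. This is a minimal sketch, not distilabel's implementation: `nim_client_kwargs`, `nim_chat_kwargs`, and `collect_stream` are hypothetical helpers, and the chunk shape (`choices[0].delta.content`) is assumed to follow the OpenAI-style streaming format returned when `stream=True`. The check at the end uses stand-in objects, so nothing is sent over the network.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable, List

# Base URL taken from the issue text above.
NIM_BASE_URL = "https://huggingface.co/api/integrations/dgx/v1"


def nim_client_kwargs(api_key: str, base_url: str = NIM_BASE_URL) -> Dict[str, Any]:
    """Keyword arguments for InferenceClient(...) (hypothetical helper)."""
    if not api_key:
        raise ValueError("The NIM API needs an api_key.")
    return {"base_url": base_url, "api_key": api_key}


def nim_chat_kwargs(
    model_id: str,
    messages: List[Dict[str, str]],
    max_tokens: int = 1024,
    stream: bool = True,
) -> Dict[str, Any]:
    """Keyword arguments for client.chat.completions.create(...) (hypothetical helper)."""
    if not model_id:
        raise ValueError("The NIM API needs a model id in the `model` argument.")
    return {
        "model": model_id,
        "messages": messages,
        "max_tokens": max_tokens,
        "stream": stream,
    }


def collect_stream(chunks: Iterable[Any]) -> str:
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        content = chunk.choices[0].delta.content
        if content:  # some chunks carry no text (assumed: None or empty)
            parts.append(content)
    return "".join(parts)


# Offline check with stand-in chunks shaped like the assumed stream format.
fake_chunks = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])
    for text in ("1, ", "2, ", None, "3")
]
streamed_text = collect_stream(fake_chunks)
```

With a real client these would be used as `InferenceClient(**nim_client_kwargs(token))` and `collect_stream(client.chat.completions.create(**nim_chat_kwargs(model_id, messages)))`.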


Additional context
PR for reference: huggingface/huggingface_hub#2482

@plaguss plaguss added the enhancement New feature or request label Sep 4, 2024
@plaguss plaguss added this to the 1.4.0 milestone Sep 4, 2024
@plaguss plaguss self-assigned this Sep 4, 2024
@MoritzLaurer

Thanks! Note that the API is paid and only works with a fine-grained token from an Enterprise Hub org (see details here).
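
Since the endpoint only accepts a fine-grained token, one option is to read it from the environment instead of hard-coding it. A minimal sketch, assuming the token is exported as `HF_TOKEN`; the variable name and the `resolve_nim_token` helper are assumptions, not part of this issue. It can only check that a token is present, not that it belongs to an Enterprise Hub org.

```python
import os
from typing import Mapping, Optional


def resolve_nim_token(
    env: Optional[Mapping[str, str]] = None, var: str = "HF_TOKEN"
) -> str:
    """Return the token stored in `var`, failing early if it is missing (hypothetical helper)."""
    env = os.environ if env is None else env
    token = env.get(var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set {var} to a fine-grained token from an Enterprise Hub org."
        )
    return token
```

The returned value would then be passed as `api_key` when building the client.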

@gabrielmbmb gabrielmbmb modified the milestones: 1.4.0, 1.5.0 Oct 8, 2024
@gabrielmbmb gabrielmbmb removed this from the 1.5.0 milestone Jan 16, 2025