HTTP Client configuration for models and vector stores #512
Comments
Hello, I was available to contribute too, but so far I have had little luck getting feedback from the project owners.
There is a lot to unpack here, so let's start small and work our way to more features. At the lowest level, in some cases we use our own handwritten client to talk with a model; `OpenAiApi` is a perfect example. If a user is operating at this level, there are a few things that can be done.
For other models, for example AzureOpenAI or Google Vertex, we use client libraries provided by Microsoft and Google, so we can't use the approach above. We can however at a high level, the Potentially we can still have a logging advisor, but it would serve a different purpose, and is likely still a useful addition. On another topic, retry: this could potentially move out of the Thoughts?
I like the idea of creating advisors for logging purposes 👍 However, when thinking about retry logic... Currently, we handle two ways of calling models:
I imagine the retry logic should be the same across all models. Tying it to the Therefore, I suggest introducing a new retry layer, or even more broadly, a resilience layer (starting with retry support but with the potential to add new features in the future). There could also be several other layers for customizing the HTTP client and so on, as @ThomasVitale mentioned.
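To make the idea of a model-agnostic retry layer a bit more concrete, here is a minimal sketch that wraps any model call in a retry loop with exponential backoff. The class `RetrySketch` and the method `withRetry` are hypothetical names used only for illustration; they are not part of Spring AI, and a real resilience layer would more likely build on Spring Retry or Resilience4j than on hand-written code like this.

```java
import java.time.Duration;
import java.util.function.Supplier;

// Hypothetical helper (not a Spring AI API): retries a call with exponential backoff.
final class RetrySketch {

    private RetrySketch() {
    }

    // Invokes `call` up to `maxAttempts` times, doubling the wait after each failure.
    static <T> T withRetry(Supplier<T> call, int maxAttempts, Duration initialBackoff) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMillis = initialBackoff.toMillis();
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt == maxAttempts) {
                    break; // no more attempts left, rethrow below
                }
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting to retry", ie);
                }
                delayMillis *= 2;
            }
        }
        throw last;
    }
}
```

Because the wrapper only sees a `Supplier`, the same logic would apply whether the call underneath goes through a handwritten `*Api` class or a vendor SDK, which is the point of keeping retry separate from the HTTP client.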
@markpollack @piotrooo thank you both for sharing your thoughts! I see two types of logs that can be useful in an application using Spring AI. My original intent with this issue was to cover the first type.
What do you think?
I thought about some customizers for SDK clients, but I'm not really convinced by this approach. However, I think this is probably how I want to customize e.g., Azure
Right now,
Enhancement Description
Each model integration is composed of two aspects: an `*Api` class calling the model provider over HTTP, and a `*Client` class encapsulating the LLM specific aspects.

Each `*Client` class is highly customizable based on nice interfaces, making it possible to overwrite many different options. It would be nice to provide similar flexibility for each `*Api` class as well. In particular, it would be useful to be able to configure options related to the HTTP Client.

Examples of aspects that would need to be configured:

- `SslBundle` to connect with on-prem model providers using custom CA certificates;

Furthermore, there might be additional needs for configuring resilience patterns:
More settings that are currently part of the model connection configuration (and that still relate to the HTTP interaction) would also need to be customisable for enterprise use cases in production (e.g. multi-user or even multi-tenant applications). For example, when using OpenAI, the following might need to change per request/session.
All the above is focused on the HTTP interactions with model providers, but the same would be useful for vector stores.
Possible Solutions
Drawing from the nice abstractions designed to customize the model integrations and ultimately implementing the `ModelOptions` interface, it could be an idea to define a dedicated abstraction to pass HTTP client customizations to an `*Api` class (something like `HttpClientConfig`), which might also be exposed via configuration properties (under `spring.ai.<model>.client.*`).

For the more specific resilience configurations (like retries and fallbacks), an annotation-driven approach might be more suitable. Resilience4j might provide a way to achieve this, since I don't think Spring supports the Fault Tolerance Microprofile spec.
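As a rough illustration of the `HttpClientConfig` idea, the sketch below shows a small, client-agnostic settings holder that is applied to an HTTP client builder. Only the name `HttpClientConfig` comes from the proposal; the fields (`connectTimeout`, `sslContext`) and the `applyTo` method are assumptions made for this example. It targets the JDK's `java.net.http.HttpClient` purely to stay self-contained; in Spring AI the same settings would presumably be applied to `RestClient.Builder` and `WebClient.Builder` instead.

```java
import java.net.http.HttpClient;
import java.time.Duration;
import javax.net.ssl.SSLContext;

// Hypothetical abstraction: the name is from the proposal, the shape is an assumption.
final class HttpClientConfig {

    private final Duration connectTimeout; // null means "keep the client default"
    private final SSLContext sslContext;   // e.g. built from a custom CA; null means default

    HttpClientConfig(Duration connectTimeout, SSLContext sslContext) {
        this.connectTimeout = connectTimeout;
        this.sslContext = sslContext;
    }

    // Applies only the settings that were explicitly provided, leaving the rest untouched.
    HttpClient.Builder applyTo(HttpClient.Builder builder) {
        if (connectTimeout != null) {
            builder.connectTimeout(connectTimeout);
        }
        if (sslContext != null) {
            builder.sslContext(sslContext);
        }
        return builder;
    }
}
```

A caller would then write something like `new HttpClientConfig(Duration.ofSeconds(5), null).applyTo(HttpClient.newBuilder()).build()`. Keeping the settings in one object is what would let a single "use a custom CA" or "set timeouts" configuration be reused for both blocking and streaming clients.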
A partial alternative solution would be for developers to define a custom `RestClient.Builder` or `WebClient.Builder` and pass that to each `*Api` class, but it would result in a lot of extra configuration and reduce the convenience of the autoconfiguration. Also, it would tie a generic configuration like "enable logs" or "use a custom CA" to the specific client used, resulting in duplication when both blocking and streaming interactions are used in the same application.

I'm available to contribute and help solve this issue.
Related Issues