WIP: AWS Bedrock support, Google Gemini function calling, file support, and more #320
Conversation
And I just saw #312
Oh wow, that's a ton - thanks for bringing that forward! 🙏 I think three things would be important here:
Cheers
Hey @chr-hertel, yeah, these are things I had to do in my projects that got scattered all over the place, so I'm integrating all of them into the architecture of this project. Now, answering your questions:
@chr-hertel I think it's now stable enough for me to start writing tests for it, and for us to decide how I should split this PR. Thoughts?
Hey, i hope i got it right, but i'd propose to split it into 4 different PRs:
and of course
so, best to start with 1-3? wasn't able to progress with #301 this week - and weekend looks busy as well
So, the idea is that to keep the message history correct and track all tool calls and usage tokens, I need access to the intermediary messages. The chain processor was hiding all these intermediary messages and only calling the output processor at the end, with the last response that wasn't a tool call. We could maybe change it to some kind of "intermediary processor".
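For illustration, a minimal sketch of what such a processor could look like - the interface name and signatures here are hypothetical, not the project's current API:

```php
<?php

// Hypothetical sketch only - the interface and accessors are illustrative,
// not part of llm-chain today. The idea: the chain invokes this on every
// intermediary response (each tool-call round trip), not just the final one,
// so callers can record messages and token usage as they happen.
interface IntermediaryOutputProcessorInterface
{
    public function processIntermediary(object $input, object $response): void;
}

// Example consumer: accumulate token usage across all tool-call iterations.
final class UsageAccumulator implements IntermediaryOutputProcessorInterface
{
    private int $totalTokens = 0;

    public function processIntermediary(object $input, object $response): void
    {
        // Assumes the response exposes usage metadata; this accessor is made up.
        $this->totalTokens += $response->usage->totalTokens ?? 0;
    }

    public function totalTokens(): int
    {
        return $this->totalTokens;
    }
}
```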
We are achieving similar results. I focused initially on Nova and Titan embedding models and used the raw HTTP client. Both of us are using external packages, though: I need the AWS SDK for the credentials signature, and the other PR is using an external implementation for calling the service itself and transforming the response (kind of what I did initially, the first time I interacted with Bedrock and php-llm).

IMHO, it's better to use the official SDK only for the signature and rely on plain HTTP requests for the calls themselves, so you can keep the transport "uniform" - but that's just my thought.

P.S.: Agreed with the Gemini and Cerebras splitting. Also, there's an OpenAI-compatible bridge in this PR, don't know if you noticed.
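For context, a rough sketch of what that split looks like - the SDK contributes only the SigV4 signer, while the request itself stays a plain HTTP call (region, model id, and payload are placeholders):

```php
<?php

// Sketch of signing a hand-built Bedrock HTTP request with the SDK's signer.
// Assumes aws/aws-sdk-php and a PSR-7 implementation (e.g. guzzlehttp/psr7);
// region, model id, and payload below are placeholders.
use Aws\Credentials\CredentialProvider;
use Aws\Signature\SignatureV4;
use GuzzleHttp\Psr7\Request;

// Resolve credentials via the default chain (env vars, profile, IMDS, ...).
$credentials = CredentialProvider::defaultProvider()()->wait();

$signer = new SignatureV4('bedrock', 'us-east-1');

$request = new Request(
    'POST',
    'https://bedrock-runtime.us-east-1.amazonaws.com/model/amazon.titan-text-express-v1/invoke',
    ['Content-Type' => 'application/json'],
    json_encode(['inputText' => 'Hello Bedrock'])
);

// The signed PSR-7 request can then be sent with any HTTP client,
// keeping the transport layer uniform with the other bridges.
$signed = $signer->signRequest($request, $credentials);
```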
uh, interesting - i like that 👍
I def agree on that uniform argument, but another thought came to my mind - maybe you can confirm or challenge that: it is quite likely that LLM Chain gets integrated into an existing application that already has its own AWS infrastructure layer, and with that infrastructure layer the decision to use the SDK or the async lib was already made.
@chr-hertel The thing is, the SDK just allows you to invoke the model - AWS calls them "Foundation Models" - so everything else is on you: tool calling management, message bags, and everything. I mean, I only used the SDK because I didn't want to handle AWS signatures.
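To make that concrete, a minimal sketch of the SDK surface in question - it hands back raw, model-specific JSON and nothing else (model id and payload shape are placeholders):

```php
<?php

// Sketch: the official SDK's Bedrock runtime surface is essentially
// invokeModel. Message history, tool-call loops, and response parsing
// all remain the caller's job. Model id and payload are placeholders.
use Aws\BedrockRuntime\BedrockRuntimeClient;

$client = new BedrockRuntimeClient([
    'region'  => 'us-east-1',
    'version' => 'latest',
]);

$result = $client->invokeModel([
    'modelId'     => 'amazon.titan-text-express-v1',
    'contentType' => 'application/json',
    'body'        => json_encode(['inputText' => 'Hello Bedrock']),
]);

// The response body is a raw stream of model-specific JSON.
$payload = json_decode((string) $result['body'], true);
```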
gave it another thought and would go with both AWS implementations - one relying on that async package and one on the sdk - like you did. still makes sense to me - would need docs and reasonable naming tho
We usually use async-aws since it's smaller than the official SDK. The official SDK has more services supported, though. I also find working with async-aws much easier. At first I thought the way the signer is used in this PR would hinder role-based authentication and require an access key. But now I think the default CredentialProvider will probably walk the authentication chain the same way a regular service client would, and thus eliminate the need for handling access keys in an AWS infrastructure. Can you confirm this, @tryvin? Maybe in the chain bundle we can set a Bedrock flavor and use the correct library based on that configuration.
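A hypothetical sketch of what that flavor switch could look like in the bundle configuration - the keys are invented for illustration and don't exist in the bundle today:

```php
<?php

// Hypothetical bundle config sketch - the 'bedrock' keys are invented for
// illustration; they do not exist in llm-chain-bundle today.
use Symfony\Component\DependencyInjection\Loader\Configurator\ContainerConfigurator;

return static function (ContainerConfigurator $container): void {
    $container->extension('llm_chain', [
        'platforms' => [
            'bedrock' => [
                // 'sdk' would wire the official aws/aws-sdk-php implementation,
                // 'async' the async-aws based one.
                'flavor' => 'async',
                'region' => 'eu-central-1',
            ],
        ],
    ]);
};
```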
@bjalt The way the PR is currently set up, it would support session tokens (Lambdas and IAM Anywhere), as well as the usual static KEY/SECRET. Still, there's also the possibility to add a new type ( There's also another option, which is supporting both
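For reference, a small sketch of the credential shapes mentioned above - all values are placeholders:

```php
<?php

// Sketch of the credential shapes discussed above; values are placeholders.
use Aws\Credentials\Credentials;

// Static key/secret, optionally with a session token (Lambdas, assumed roles):
$credentials = new Credentials('AKIA...', 'secret-key', 'optional-session-token');

// Alternatively, the default provider chain shown earlier resolves credentials
// from env vars, shared config, or instance metadata, so an AWS-hosted setup
// never has to handle access keys explicitly.
```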
Explanation
Hey guys, I'm finally adding what I have been working on in other projects and bringing it up to date with the new Metadata and Output processor for token counts.
This also adds new providers, and an "OpenAI generic" provider is coming too.
I created the PR so you're aware of what I'm doing, to prevent duplicate work, and of course so you can start reviewing while I finish the tests and more.