You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
we need a simple agentic framework that can handle tool calls, run them, and then provide the output back in the chat for follow-up responses, especially focusing on Batch generations to maximize GPU utility.
For the initial version, we'll use a Transformers model, and later on, we'll switch to a VLLM version for better efficiency.
I’ve been avoiding Langchain and smolagents since they rely heavily on pre-written prompts, which overcomplicates and limits training, plus they aren’t designed for batch generation.
If there’s a way to achieve this with existing libraries like Langchain or smolagents, I’d love to hear your thoughts!
The text was updated successfully, but these errors were encountered:
we need a simple agentic framework that can handle tool calls, run them, and then provide the output back in the chat for follow-up responses, especially focusing on Batch generations to maximize GPU utility.
For the initial version, we'll use a Transformers model, and later on, we'll switch to a VLLM version for better efficiency.
I’ve been avoiding Langchain and smolagents since they rely heavily on pre-written prompts, which overcomplicates and limits training, plus they aren’t designed for batch generation.
If there’s a way to achieve this with existing libraries like Langchain or smolagents, I’d love to hear your thoughts!
The text was updated successfully, but these errors were encountered: