Conversation

dansingerman
Contributor

What this does

When used within our app, streaming error responses were raising an unhandled exception instead of being processed properly:

```
worker      | D, [2025-07-03T18:49:52.221013 #81269] DEBUG -- RubyLLM: Received chunk: event: error
worker      | data: {"type":"error","error":{"details":null,"type":"overloaded_error","message":"Overloaded"}               }
worker      | 
worker      | 
worker      | 2025-07-03 18:49:52.233610 E [81269:sidekiq.default/processor chat_agent.rb:42] {jid: 7382519287f08cfa7cd1e4e4, queue: default} Rails -- Error in ChatAgent#send_with_streaming: NoMethodError - undefined method `merge' for nil:NilClass
worker      | 
worker      |       error_response = env.merge(body: JSON.parse(error_data), status: status)
worker      |                           ^^^^^^
worker      | 2025-07-03 18:49:52.233852 E [81269:sidekiq.default/processor chat_agent.rb:43] {jid: 7382519287f08cfa7cd1e4e4, queue: default} Rails -- Backtrace: /Users/dansingerman/.rbenv/versions/3.1.6/lib/ruby/gems/3.1.0/gems/ruby_llm-1.3.1/lib/ruby_llm/streaming.rb:91:in `handle_error_chunk'
worker      | /Users/dansingerman/.rbenv/versions/3.1.6/lib/ruby/gems/3.1.0/gems/ruby_llm-1.3.1/lib/ruby_llm/streaming.rb:62:in `process_stream_chunk'
worker      | /Users/dansingerman/.rbenv/versions/3.1.6/lib/ruby/gems/3.1.0/gems/ruby_llm-1.3.1/lib/ruby_llm/streaming.rb:70:in `block in legacy_stream_processor'
worker      | /Users/dansingerman/.rbenv/versions/3.1.6/lib/ruby/gems/3.1.0/gems/faraday-net_http-1.0.1/lib/faraday/adapter/net_http.rb:113:in `block in perform_request'
worker      | /Users/dansingerman/.rbenv/versions/3.1.6/lib/ruby/gems/3.1.0/gems/net-protocol-0.2.2/lib/net/protocol.rb:535:in `call_block'
worker      | /Users/dansingerman/.rbenv/versions/3.1.6/lib/ruby/gems/3.1.0/gems/net-protocol-0.2.2/lib/net/protocol.rb:526:in `<<'
worker      | /Users/dansingerman/.rbenv/versions/3.1.6/lib/ruby/gems/3.1.0/gems/net-protocol-0.2.2/lib/net/protocol.rb
```

It looks like the introduction of support for Faraday V1 (#173) introduced this error, as the error handling relies on an `env` that is no longer passed. This should provide a fix for both V1 and V2.
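In essence, the error chunk carries everything needed in its raw SSE payload, so it can be handled without any dependency on Faraday's `env` hash. A rough illustrative sketch of that idea (the method name `parse_error_chunk` is hypothetical, not the merged code):

```ruby
require 'json'

# Sketch: extract the error payload from a raw SSE error chunk directly,
# rather than merging it into a Faraday env hash (which the Faraday V1
# code path never passes, hence the NoMethodError on nil).
def parse_error_chunk(chunk)
  data_line = chunk.lines.find { |line| line.start_with?('data:') }
  return nil unless data_line

  JSON.parse(data_line.sub(/\Adata:\s*/, ''))
rescue JSON::ParserError
  nil
end

chunk = <<~SSE
  event: error
  data: {"type":"error","error":{"details":null,"type":"overloaded_error","message":"Overloaded"}}
SSE

payload = parse_error_chunk(chunk)
puts payload['error']['message'] # => Overloaded
```

With the payload parsed this way, both Faraday adapters can raise the appropriate RubyLLM error from the JSON body alone.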

One thing to note: I had to manually construct the VCR cassettes; I'm not sure of a better way to test an intermittent error response.

I have also only written the tests against `anthropic/claude-3-5-haiku-20241022`. It's possible that other models with a different error format are still not handled properly, but even in that case they won't error for the reasons fixed here.

Type of change

- [x] Bug fix
- [ ] New feature
- [ ] Breaking change
- [ ] Documentation
- [ ] Performance improvement

Scope check

- [x] I read the [Contributing Guide](https://github.com/crmne/ruby_llm/blob/main/CONTRIBUTING.md)
- [x] This aligns with RubyLLM's focus on **LLM communication**
- [x] This isn't application-specific logic that belongs in user code
- [x] This benefits most users, not just my specific use case

Quality check

- [x] I ran `overcommit --install` and all hooks pass
- [x] I tested my changes thoroughly
- [x] I updated documentation if needed
- [x] I didn't modify auto-generated files manually (`models.json`, `aliases.json`)

API changes

- [ ] Breaking change
- [ ] New public methods/classes
- [ ] Changed method signatures
- [x] No API changes

Related issues

codecov bot commented Jul 16, 2025

Codecov Report

❌ Patch coverage is 66.66667% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.20%. Comparing base (a9a1446) to head (ad9104b).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| lib/ruby_llm/streaming.rb | 63.63% | 4 Missing ⚠️ |
| lib/ruby_llm/providers/openai/streaming.rb | 71.42% | 2 Missing ⚠️ |
Additional details and impacted files
```
@@            Coverage Diff             @@
##             main     #273      +/-   ##
==========================================
+ Coverage   86.64%   87.20%   +0.56%     
==========================================
  Files          79       79              
  Lines        3129     3142      +13     
  Branches      613      621       +8     
==========================================
+ Hits         2711     2740      +29     
+ Misses        418      402      -16     
```

Owner

These changes to the spec file are quite different from our current style for spec files. We also assume that cassettes can be removed, and in fact `rake vcr:record[anthropic]` would remove yours. I'd suggest mocking the messages coming back from the API instead.

It would also be great to test with other providers too.

Contributor Author


OK, I've removed the VCR cassettes and replaced them with stubbed requests.

I've also added coverage for the other providers (except Bedrock).

I've verified the error format used in the mocks against the Anthropic and OpenAI APIs, so I'm reasonably confident in that mocking. I don't have access to Bedrock right now, and its error handling seems somewhat different as far as I can tell, so I haven't covered it in the tests yet. That would ideally be done by someone with Bedrock access.
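The shape of such a stubbed stream can be exercised entirely offline. A toy, framework-free sketch of the idea (the real specs stub HTTP requests instead; `process_stream` and `StreamError` are made-up names for illustration, and the SSE bodies mirror the verified Anthropic error format):

```ruby
require 'json'

# Raised when the stubbed stream delivers an SSE error event.
class StreamError < StandardError; end

# Toy stream processor: yields parsed data for normal chunks and raises
# for error events, mimicking the intermittent "overloaded" response.
def process_stream(chunks)
  chunks.each do |chunk|
    event = chunk[/^event:\s*(\S+)/, 1]
    data  = JSON.parse(chunk[/^data:\s*(.+)$/, 1])
    raise StreamError, data.dig('error', 'message') if event == 'error'

    yield data
  end
end

# A stubbed response body: one good delta, then the provider overloads.
stubbed_body = [
  "event: content_block_delta\ndata: {\"delta\":{\"text\":\"Hi\"}}\n\n",
  "event: error\ndata: {\"type\":\"error\",\"error\":{\"type\":\"overloaded_error\",\"message\":\"Overloaded\"}}\n\n"
]

begin
  process_stream(stubbed_body) { |d| print d.dig('delta', 'text') }
rescue StreamError => e
  puts " -- stream failed: #{e.message}"
end
# prints: Hi -- stream failed: Overloaded
```

A spec can then assert both that the deltas before the failure were delivered and that the error surfaces as a typed exception rather than a `NoMethodError`.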

@crmne crmne added the bug Something isn't working label Jul 16, 2025
@dansingerman dansingerman requested a review from crmne July 17, 2025 11:52
@finbarr
Contributor

finbarr commented Jul 24, 2025

I just noticed this PR. I have something similar here that takes a slightly different approach: #292

Definitely needs fixing one way or another.

@crmne crmne closed this Jul 30, 2025
@crmne crmne reopened this Jul 30, 2025
@crmne
Copy link
Owner

crmne commented Jul 30, 2025

After further review, I think this implementation is actually the right approach. Let me give this another test run and we can get it merged.

@crmne crmne left a comment (Owner)


LGTM

@crmne crmne merged commit 735e36b into crmne:main Jul 30, 2025
14 checks passed
tpaulshippy pushed a commit to tpaulshippy/ruby_llm that referenced this pull request Aug 3, 2025
…y V1 and V2 (crmne#273)

---------

Co-authored-by: Carmine Paolino <carmine@paolino.me>