Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Proxy: compresssion support #1246

Open
wants to merge 20 commits into
base: branch-hackathon
Choose a base branch
from

Conversation

wprzytula
Copy link
Collaborator

@wprzytula wprzytula commented Feb 14, 2025

Motivation

During hackathon, there emerged a need to test compressed communication through proxy. However, proxy did not yet support compression.

What's done

scylla-cql compression utilities

Exposed some scylla-cql compression utilities

These are now used by the proxy.

New impls

  • impl FromStr for Compression
  • impl DeserializableRequest for Startup

The impls are used by the proxy to read the negotiated compression from STARTUP frame.

Minor proxy fixes / features

impl (Partial)Eq for {Request,Response}Frame

It's convenient to be able to compare the whole frame at once.

Fixed some debug prints

Before, they printed the shard number next to the driver, not the node, which was confusing.

Stopped abusing COMPRESSED frame flag in tests randomly

Some tests just set random CQL flags and assert that they are simply passed through the proxy. It used to work, but as now we make the proxy clever enough to interpret at least one of them, namely the compression flag, we needed to stop abusing it blindly.

Combat timing-based flakiness

The test proxy_processes_requests_concurrently showed to be flaky on my machine. As its timeout there was arbitrarily tight, I increased it.

Compression abstractions

CompressionReader and CompressionWriter work in tandem to propagate the compression that the driver negociated with the node (in STARTUP frame) across the proxy connection workers.

  • CompressionWriter is used by the request_processor to set the compression globally for the whole bunch of proxy workers for that connection once a STARTUP message is intercepted.
  • CompressionReader is used by all workers that perform ser/de ({sender_to,receiver_from}_{driver,cluster}) to (de)compress frames as they come. Proxy always stores the frames and feeds them into feedback channels in the decompressed form, so that it is for example possible to use Conditions based on uncompressed body contents.

Unit test for compression mechanisms in the proxy

Outline of the test:

  1. "driver" sends an, uncompressed, e.g., QUERY frame, feedback returns its uncompressed body, and "node" receives the uncompressed frame.
  2. "node" responds with an uncompressed RESULT frame, feedback returns its uncompressed body, and "driver" receives the uncompressed frame.
  3. "driver" sends an uncompressed STARTUP frame, feedback returns its uncompressed body, and "node" receives the uncompressed frame.
  4. "driver" sends a compressed, e.g., QUERY frame, feedback returns its uncompressed body, and "node" receives the compressed frame.
  5. "node" responds with a compressed RESULT frame, feedback returns its uncompressed body, and "driver" receives the compressed frame.

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass test.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have provided docstrings for the public items that I want to introduce.
  • [ ] I have adjusted the documentation in ./docs/source/.
  • [ ] I added appropriate Fixes: annotations to PR description.

@wprzytula wprzytula self-assigned this Feb 14, 2025
Copy link

github-actions bot commented Feb 14, 2025

cargo semver-checks found no API-breaking changes in this PR! 🎉🥳
Checked commit: ee4f8e1

@wprzytula wprzytula force-pushed the proxy-compresssion-support branch from f8ebaac to a745540 Compare February 14, 2025 09:09
It was already such for scylla crate, so this is not a big change.
Those will be used by the proxy.
This will also be used by the proxy.
Again, proxy will take use of this.
Now they can be called in the static context, too.
It's convenient to be able to compare the whole frame at once.
It was surely `pub` by mistake. `ResponseFrame::write` was `pub(crate)`.
Before, they printed the shard number next to the driver, not the node,
which was confusing.
Some tests just set random CQL flags and assert that they are simply
passed through the proxy. It used to work, but as now we make the proxy
clever enough to interpret at least one of them, namely the compression
flag, we need to stop abusing it blindly.

The compression flag is the bit 0x1, so we simply must not use any odd
numbers as flags.
After introducing changes connected to compression, the test
`proxy_processes_requests_concurrently` started to be flaky on my
machine. As its timeout was arbitrarily tight, I increased it.

As a side note, the particular timeout is actually redundant.
We could fully rely on the ntest::timeout.
For now, its only variant is FrameHeaderParseError, whose usage in
proxy is replaced by ReadFrameError. In next commits, another variant
will be added for (de)compression errors.
`CompressionReader` and `CompressionWriter` work in tandem to propagate
the compression that the driver negociated with the node (in STARTUP
frame) across the proxy connection workers.

- `CompressionWriter` will be used by the `request_processor` to set the
  compression globally for the whole bunch of proxy workers for that
  connection once a STARTUP message is intercepted.
- `CompressionReader` will be used by all workers that perform ser/de
  (`{sender_to,receiver_from}_{driver,cluster}`) to (de)compress frames
  as they come. Proxy will always store the frames and feed them into
  feedback channels in the decompressed form, so that it is for example
  possible to use `Condition`s based on uncompressed body contents.

This commit introduces the bare minimum of those abstractions and
passes them to all relevant workers. For now, no compression is ever
set.
`request_processor` worker now intercepts STARTUP messages and sets
compression through its `CompressionWriter` according to the (optional)
`COMPRESSION` key in a STARTUP frame.

The set compression is still not used anywhere.
CompressionReader is now passed down to functions that read frames.
In the next commits, it will be actually used to decompress frames
if applicable.
This commit finally makes proxy decompress frames, if they have the
compression flag set AND the proxy already intercepted a STARTUP frame.
CompressionReader is now passed down to functions that write frames.
In the next commits, it will be actually used to compress frames
if applicable.
This commit finally makes proxy compress frames, if they have the
compression flag set AND the proxy already intercepted a STARTUP frame.
This effectively makes the proxy re-compress frames that it decompressed
when it intercepted them.
A unit test is written that tests compression mechanisms in the proxy.

Outline of the test:
1. "driver" sends an, uncompressed, e.g., QUERY frame, feedback returns
   its uncompressed body, and "node" receives the uncompressed frame.
2. "node" responds with an uncompressed RESULT frame, feedback returns
   its uncompressed body, and "driver" receives the uncompressed frame.
3. "driver" sends an uncompressed STARTUP frame, feedback returns its
   uncompressed body, and "node" receives the uncompressed frame.
4. "driver" sends a compressed, e.g., QUERY frame, feedback returns its
   uncompressed body, and "node" receives the compressed frame.
5. "node" responds with a compressed RESULT frame, feedback returns its
   uncompressed body, and "driver" receives the compressed frame.
@wprzytula wprzytula force-pushed the proxy-compresssion-support branch from 12443ad to ee4f8e1 Compare February 14, 2025 09:46
@Lorak-mmk
Copy link
Collaborator

Is that useful for @fruch 's compression tests?

@fruch
Copy link

fruch commented Feb 14, 2025

Is that useful for @fruch 's compression tests?

In my tests I wasn't reading the content, just using the total sizes of frames

I can help as extra validation that it's compressed, and which algorithm was used

@wprzytula
Copy link
Collaborator Author

Is that useful for @fruch 's compression tests?

In my tests I wasn't reading the content, just using the total sizes of frames

I can help as extra validation that it's compressed, and which algorithm was used

Yes: once proxy detects COMPRESSION key in STARTUP frame, it starts to interpret further frames having compression flag set as compressed, and decompresses them automatically using the negotiated compression. So it proxy fails to decompress a frame with an error, then we now that the frame was not compressed even though it should, or it was ill-compressed (perhaps with a wrong compression method).

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants