performance problem met in range request with 304 update scenario, any suggestions ? #12047

Open
wangchunpeng opened this issue Feb 20, 2025 · 4 comments · May be fixed by #12092

@wangchunpeng

In ATS, we use the Range_request plugin and the Slice plugin to handle large-file requests. With the fragment size set to 1 MB and the slice range also set to 1 MB, each HTTP sub-request generates only a single document (containing the doc, header_info, and body data). When a file expires, an IMS (If-Modified-Since) request is made to the origin server; if the server responds with 304 (Not Modified), ATS generates a separate document to store the updated header information. This results in many small files, each only a few kilobytes in size. A subsequent request then triggers two read I/O operations, which reduces performance. Are there any optimization suggestions for this scenario?
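For context, a setup along these lines would look roughly like the sketch below. This is only an illustration of the configuration described above: the hostnames are placeholders, it assumes the "Range_request plugin" refers to cache_range_requests.so, and the values shown are the standard Slice plugin option plus the cache fragment-size setting in records.config.

```
# remap.config (hostnames are placeholders): Slice in front of cache_range_requests,
# splitting client requests into 1 MB blocks
map http://cdn.example.com/ http://origin.example.com/ @plugin=slice.so @pparam=--blockbytes=1m @plugin=cache_range_requests.so

# records.config: cache fragment size aligned with the 1 MB slice size
CONFIG proxy.config.cache.target_fragment_size INT 1048576
```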

@wangchunpeng (Author)

What we want is to place the ~8 KB header-update doc together with the main doc, or on SSD, as much as possible, which would reduce the number of HDD read I/Os.

@moonchen (Contributor) commented Mar 4, 2025

Hi @wangchunpeng,

As I understand it, ATS's behavior when handling a 304 Not Modified response from the origin is mostly the same whether using the Slice plugin or not. In both cases:

  1. A new metadata record is created for a "header update," only one per URL regardless of slicing.
  2. A subsequent client request requires two I/O reads per slice: one for the metadata and one for the body data.

Are you seeing I/O overhead that is different for slicing, specifically when handling 304?

@traeak (Contributor) commented Mar 4, 2025

I think what's happening is that each slice of that asset is having to handle and store the 304 response. In the general case I'm not sure how to get around that since individual slices of the same asset name might have different content from multiple different versions of the named asset.

A possible solution:

Use a reference slice. That should always have the most up-to-date ETag/Last-Modified "variant identifier" (or whatever it's called).

Pass this reference slice's ETag/Last-Modified "variant identifier" along to subsequent calls of the CRR plugin. If the CRR plugin encounters a STALE in-cache result (from the cache read hook) but the variant identifier matches, switch the status to FRESH and continue (see the regex_revalidate plugin for a rough example). This should stop the CRR plugin from going to the parent, and should also mean no variant header gets written to cache. I think.
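To make the idea concrete, here is a minimal, hypothetical sketch of the STALE-to-FRESH flip using the public TS plugin API (this is not the CRR plugin itself). It assumes the reference slice's variant identifier is forwarded in a made-up X-Ref-Slice-Etag request header, and it skips most error handling.

```cpp
// stale_to_fresh_sketch.cc -- illustration only, not the CRR plugin.
// On CACHE_LOOKUP_COMPLETE, if the cached slice is STALE but its ETag matches
// the reference slice's identifier (passed in a hypothetical request header),
// flip the lookup status to FRESH so no IMS goes upstream and no header-update
// doc gets written.
#include <ts/ts.h>
#include <cstring>

static const char REF_ETAG_HDR[] = "X-Ref-Slice-Etag"; // made-up header name

static int
cache_lookup_handler(TSCont /* contp */, TSEvent event, void *edata)
{
  TSHttpTxn txnp = static_cast<TSHttpTxn>(edata);
  int status     = 0;

  if (event == TS_EVENT_HTTP_CACHE_LOOKUP_COMPLETE &&
      TSHttpTxnCacheLookupStatusGet(txnp, &status) == TS_SUCCESS &&
      status == TS_CACHE_LOOKUP_HIT_STALE) {
    TSMBuffer req_buf, resp_buf;
    TSMLoc req_hdr, resp_hdr;

    if (TSHttpTxnClientReqGet(txnp, &req_buf, &req_hdr) == TS_SUCCESS &&
        TSHttpTxnCachedRespGet(txnp, &resp_buf, &resp_hdr) == TS_SUCCESS) {
      TSMLoc ref_field  = TSMimeHdrFieldFind(req_buf, req_hdr, REF_ETAG_HDR, -1);
      TSMLoc etag_field = TSMimeHdrFieldFind(resp_buf, resp_hdr, TS_MIME_FIELD_ETAG, TS_MIME_LEN_ETAG);

      if (ref_field != TS_NULL_MLOC && etag_field != TS_NULL_MLOC) {
        int ref_len = 0, etag_len = 0;
        const char *ref  = TSMimeHdrFieldValueStringGet(req_buf, req_hdr, ref_field, -1, &ref_len);
        const char *etag = TSMimeHdrFieldValueStringGet(resp_buf, resp_hdr, etag_field, -1, &etag_len);

        // Variant identifiers match: treat the stale hit as fresh.
        if (ref && etag && ref_len == etag_len && std::memcmp(ref, etag, ref_len) == 0) {
          TSHttpTxnCacheLookupStatusSet(txnp, TS_CACHE_LOOKUP_HIT_FRESH);
        }
      }
      if (ref_field != TS_NULL_MLOC) {
        TSHandleMLocRelease(req_buf, req_hdr, ref_field);
      }
      if (etag_field != TS_NULL_MLOC) {
        TSHandleMLocRelease(resp_buf, resp_hdr, etag_field);
      }
      TSHandleMLocRelease(req_buf, TS_NULL_MLOC, req_hdr);
      TSHandleMLocRelease(resp_buf, TS_NULL_MLOC, resp_hdr);
    }
  }

  TSHttpTxnReenable(txnp, TS_EVENT_HTTP_CONTINUE);
  return 0;
}

void
TSPluginInit(int /* argc */, const char ** /* argv */)
{
  TSPluginRegistrationInfo info = {"stale_to_fresh_sketch", "example", "dev@example.com"};
  TSPluginRegister(&info);
  TSHttpHookAdd(TS_HTTP_CACHE_LOOKUP_COMPLETE_HOOK, TSContCreate(cache_lookup_handler, nullptr));
}
```

A plugin like this is only half of the picture: something (most likely the Slice plugin itself) would still need to fetch the reference slice first and inject its identifier into the block sub-requests.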

@wangchunpeng (Author)

> Hi @wangchunpeng,
>
> As I understand it, ATS's behavior when handling a 304 Not Modified response from the origin is mostly the same whether using the Slice plugin or not. In both cases:
>
>   1. A new metadata record is created for a "header update," only one per URL regardless of slicing.
>   2. A subsequent client request requires two I/O reads per slice: one for the metadata and one for the body data.
>
> Are you seeing I/O overhead that is different for slicing, specifically when handling 304?

The version we use is 9.3.2. For the very first request, each slice sub-request is stored using only one doc, which is written to the Vol when the request is closed.
