indexDocument not handled when bucket contains > 1k objects #118

Closed
chiefy opened this issue Jan 4, 2021 · 6 comments
chiefy commented Jan 4, 2021

Describe the bug
indexDocument doesn't work when the folder contains a large number of files listed before index.html.

To Reproduce
Steps to reproduce the behavior:

  1. Create an S3 bucket with hundreds of files: mostly numbered filenames, plus one index.html.
  2. Create a target config pointing to that bucket.
  3. The page displays the generated directory index of the bucket up to about page 3-4 of 5, but never the actual index.html.

Expected behavior
Proxy serves index.html


Version and platform:

  • Platform: Linux
  • Arch: x86_64
  • Version: latest Docker image
@chiefy chiefy added the bug Something isn't working label Jan 4, 2021

chiefy commented Jan 4, 2021

Looking some more into this: our bucket has ~1.4k objects, and it looks like the S3 client's list call returns at most 1,000 keys per request by default (MaxKeys).
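To make the failure mode concrete, here is a minimal sketch (illustration only, not s3-proxy code, using a hypothetical in-memory bucket): S3's ListObjectsV2 returns keys in lexicographic order, at most 1,000 per request, so a single un-paginated call never sees a key that sorts after the first page.

```python
# Illustration (not s3-proxy code): S3's ListObjectsV2 returns at most 1,000
# keys per request, in lexicographic order, so a single call misses any key
# that sorts after the first page.
MAX_KEYS = 1000  # S3's default and maximum page size for ListObjectsV2

def list_objects_page(keys, start=0, max_keys=MAX_KEYS):
    """Return one page of keys and the offset of the next page (None if done),
    mimicking a paginated list call against a hypothetical bucket."""
    page = keys[start:start + max_keys]
    next_start = start + max_keys if start + max_keys < len(keys) else None
    return page, next_start

# Hypothetical bucket: 1,400 numbered files plus index.html, in sorted order.
bucket = sorted([f"{i:04d}.txt" for i in range(1400)] + ["index.html"])

first_page, token = list_objects_page(bucket)
print("index.html" in first_page)  # False: it sorts after the first 1,000 keys

# Following the continuation offset page by page does find it, at the cost
# of listing every object in the bucket.
found, token = False, 0
while token is not None:
    page, token = list_objects_page(bucket, token)
    found = found or "index.html" in page
print(found)  # True
```

This is exactly the trade-off mentioned below: looping over all pages finds everything but costs a list request per 1,000 objects and holds the whole listing in memory.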

@oxyno-zeta oxyno-zeta added the good first issue Good for newcomers label Jan 4, 2021
@oxyno-zeta oxyno-zeta self-assigned this Jan 4, 2021

oxyno-zeta commented Jan 4, 2021

Hello @chiefy !

Thanks for this issue and all the information you've given me. You are right. I must admit I didn't think this project would be used with such huge buckets 😄 .

I will work on a quick fix within the next few days (I have one in mind; it needs testing). I have probably also found a way to serve more results, by letting people configure the maximum number of results they want (to avoid looping over all objects and eating all the memory!).

I will keep you posted.

Regards,

Oxyno-zeta


chiefy commented Jan 5, 2021

@oxyno-zeta thanks. As it turns out, our CI is publishing artifacts to S3 without cleaning the bucket beforehand, so I'd say it's an edge case.

@chiefy chiefy changed the title indexDocument not handled when bucket contains hundreds of files indexDocument not handled when bucket contains > 1k objects Jan 5, 2021
oxyno-zeta added a commit that referenced this issue Jan 5, 2021
Do a head request on index document to avoid skipping it on large
buckets and do not use the listed files for that anymore.

Fix issue #118
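The commit message above describes the approach: instead of scanning the (paginated) listing for the index document, ask S3 for it directly with a HEAD request, which costs one constant request no matter how many objects the bucket holds. A sketch of that logic, with the S3 HeadObject call abstracted behind an injected function (`resolve_index`, `fake_head`, and `NotFound` are hypothetical names for illustration, not the project's actual identifiers):

```python
# Sketch of the fix's idea (assumed logic, not the actual s3-proxy code):
# probe the index document directly instead of searching the listing for it.

class NotFound(Exception):
    """Raised by the HEAD call when the key does not exist."""

def resolve_index(s3_head, prefix, index_document="index.html"):
    """Return the index document's key if it exists under `prefix`, else None.

    `s3_head` stands in for an S3 HeadObject call (e.g. boto3's
    `client.head_object`); it must raise NotFound for missing keys.
    """
    key = prefix.rstrip("/") + "/" + index_document if prefix else index_document
    try:
        s3_head(key)  # one constant-cost request, regardless of bucket size
        return key
    except NotFound:
        return None

# Fake HEAD backed by an in-memory key set, for illustration.
objects = {f"{i:04d}.txt" for i in range(1400)} | {"index.html"}

def fake_head(key):
    if key not in objects:
        raise NotFound(key)

print(resolve_index(fake_head, ""))         # index.html
print(resolve_index(fake_head, "missing"))  # None
```

The design point is that the HEAD probe is immune to pagination: it succeeds or 404s in one round trip, so a bucket with 1.4k (or 1.4M) objects behaves the same as one with ten.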
oxyno-zeta commented

This should be fixed with the 3.1.1 release 🎉

oxyno-zeta commented

Hello, can you confirm whether the issue is fixed? Just so I know if I need to think about another solution / another fix 😅 . Thanks!

oxyno-zeta commented

This should be OK, and you can now manage the pagination if needed. I'm closing this :)
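For readers landing here later, a sketch of what "manage the pagination" could look like in a target's bucket configuration. The `s3ListMaxKeys` key name below is an assumption from memory, as are the target and bucket names; verify the exact field against the s3-proxy configuration documentation.

```yaml
targets:
  - name: my-target            # hypothetical target name
    bucket:
      name: my-bucket          # hypothetical bucket name
      region: us-east-1
      # Assumed option name: raise the listing limit above S3's per-request
      # default of 1,000 keys so large directory listings are complete
      # (at the cost of more requests/memory per listing).
      s3ListMaxKeys: 2000
```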

iskandar pushed a commit to iskandar/s3-proxy that referenced this issue Feb 24, 2021
Do a head request on index document to avoid skipping it on large
buckets and do not use the listed files for that anymore.

Fix issue oxyno-zeta#118