-
Notifications
You must be signed in to change notification settings - Fork 9.1k
HADOOP-19576: Disable Purging Pending MPUs Before Directory Purge #7722
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
base: trunk
Are you sure you want to change the base?
Conversation
💔 -1 overall
This message was automatically generated. |
@steveloughran - Could you please review the changes ? |
cn you point to some docs about s3express and MPUs? we had lots of pain related to directories not existing but still being found in list calls, and had to make changes across the code to cope with it. Are these now superfluous? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 pending you declaring what the test cli was, and what you tested against. s3 express, presumably
@steveloughran - I have added that details in PR description Test with us-east-1 with S3 express store bucket. The following tests were failing with and without the change ITestTreewalkProblems.testDistCp:317->lambda$testDistCp$3:318 |
https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-express-differences.html |
ITestTreewalkProblems looks related; the others shouldn't be happening at all, though the endpoint one may be from a locked down test machine. the ITestS3APutIfMatchAndIfNoneMatch &c tests are new and from conditional writes. If you test with an s3 standard bucket, do you see these? |
these are the problems that the object delete stuff is trying to address...if it is now consistent with s3 standard then we can turn it off. otherwise we have a problem: turning off that purge on delete behaviour introduces its own problems which code is doing the deletion? Can it use the bulk delete API to explitictly delete files? there's no extra work there other than the building up of the bulk delete POSTs and a bit of rate throttling |
On standard bucket all these tests passed. On S3 express buckets these tests are failing with and without this change, |
|
If bucket policies are configured properly - i don't this causing issues.
Deletion happens here - https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/DeleteOperation.java#L268 |
hhm, yes, I just got the conditional one to fail for me, because
I must have merged that test after all the SDK update pain and not tested it against an S3 Express bucket before. Looks like S3 Express isn't returning the 412 error code expected. @ahmarsuhail: is this expected, an SDK quirk or are we doing something wrong code wise? (update: most of my failures come from using fips endpoint flag, analytics stream and incomplete role permissions on aws console + software creation). But the conditional stuff new and caused by a different error on conditional writes
|
@steveloughran - https://aws.amazon.com/about-aws/whats-new/2025/06/amazon-s3-express-one-zone-atomic-renaming-objects-api/ S3 express supports atomic renames now - Need to see how efficient our MagicCommitter is as compared FileOutputCommitter now |
Let me rerun |
@shameersss1 will create a JIRA for rename support. An S3 team will commit the patch to support it I think, but it needs an SDK upgrade, one of us will have to volunteer to do that, think it will have to be me.. @steveloughran which test is not returning the 412? |
@steveloughran - The following is the failure `[ERROR] ITestTreewalkProblems.testDistCpNoIterator: Expecting: not sure if my setup was strong or is it expected |
![]() This looks like it is expecting different class of assertion failure error. @steveloughran - Are you seeing the same ? |
Description of PR
Pending MPUs are aborted by default for S3 express store. This leads to job failure for use cases where the directory needs be purged before the final job commit, Hence disabling the pending MPUs purging for all types of buckets.
How was this patch tested?
Test with
us-east-1
with S3 express store bucket. The following tests were failing with and without the changeFor code changes:
LICENSE
,LICENSE-binary
,NOTICE-binary
files?