HADOOP-19576: Disable Purging Pending MPUs Before Directory Purge #7722

Open: shameersss1 wants to merge 1 commit into trunk

shameersss1
Contributor

Description of PR

Pending MPUs are aborted by default for S3 Express store buckets when a directory is deleted. This causes job failures in use cases where a directory needs to be purged before the final job commit. Hence this change disables purging of pending MPUs for all bucket types.
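
As a rough illustration of the default being changed, here is a minimal sketch in plain Java. It is not the S3A code: the option name `fs.s3a.directory.operations.purge.uploads` is an assumption (it is not quoted anywhere in this thread), a `Map` stands in for Hadoop's `Configuration`, and the class and method names are hypothetical.

```java
import java.util.Collections;
import java.util.Map;

/** Toy model of how the purge-on-directory-delete flag might be resolved. */
final class PurgeUploadsDefault {

  // Assumed option name; not stated in this PR thread.
  static final String PURGE_UPLOADS = "fs.s3a.directory.operations.purge.uploads";

  /** Before the change, as described above: an unset option follows the store type. */
  static boolean resolveBefore(Map<String, String> conf, boolean s3ExpressStore) {
    String value = conf.get(PURGE_UPLOADS);
    return value != null ? Boolean.parseBoolean(value) : s3ExpressStore;
  }

  /** After the change: an unset option means "do not purge" for every bucket type. */
  static boolean resolveAfter(Map<String, String> conf) {
    String value = conf.get(PURGE_UPLOADS);
    return value != null && Boolean.parseBoolean(value);
  }

  public static void main(String[] args) {
    Map<String, String> unset = Collections.emptyMap();
    Map<String, String> optIn = Collections.singletonMap(PURGE_UPLOADS, "true");
    System.out.println("before, S3 Express, unset: " + resolveBefore(unset, true));
    System.out.println("after, any bucket, unset:  " + resolveAfter(unset));
    System.out.println("after, explicit opt-in:    " + resolveAfter(optIn));
  }
}
```

Under this model, a job that still wants pending uploads purged on directory delete would have to opt in explicitly.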

How was this patch tested?

Tested in us-east-1 against an S3 Express store bucket. The following tests fail both with and without the change:

ITestTreewalkProblems.testDistCp:317->lambda$testDistCp$3:318 
ITestTreewalkProblems.testDistCpNoIterator:340->lambda$testDistCpNoIterator$4:341 [Exit code of distcp -update -delete -direct 
ITestCustomSigner.testCustomSignerAndInitializer
ITestS3AContractAnalyticsStreamVectoredRead.testVectoredReadAfterNormalRead
ITestS3AEndpointRegion.testCentralEndpointAndNullRegionFipsWithCRUD:510 » AWSUnsupportedFeature
ITestS3AEndpointRegion.testCentralEndpointAndNullRegionWithCRUD:501->assertOpsUsingNewFs:548 » UnknownHost
ITestS3AEndpointRegion.testWithCrossRegionAccess:395 » UnknownHost getFileStat...
ITestS3AEndpointRegion.testWithOutCrossRegionAccess:374->lambda$testWithOutCrossRegionAccess$2:376 » UnknownHost
ITestConnectionTimeouts.testObjectUploadTimeouts:265 » AWSBadRequest Writing O...
ITestS3APutIfMatchAndIfNoneMatch.testIfMatchTwoMultipartUploadsRaceConditionOneClosesFirst:551 » AWSS3IO
ITestS3APutIfMatchAndIfNoneMatch.testIfNoneMatchConflictOnMultipartUpload:321->lambda$testIfNoneMatchConflictOnMultipartUpload$2:322->createFileWithFlags:176 » O
ITestS3APutIfMatchAndIfNoneMatch.testIfNoneMatchMultipartUploadWithRaceCondition:349 » AWSS3IO
ITestS3APutIfMatchAndIfNoneMatch.testIfNoneMatchTwoConcurrentMultipartUploads:372 » AWSS3

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 21m 30s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 🆗 | markdownlint | 0m 0s | | markdownlint was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 ❌ | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | | _ trunk Compile Tests _ |
| +1 💚 | mvninstall | 41m 9s | | trunk passed |
| +1 💚 | compile | 0m 47s | | trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 |
| +1 💚 | compile | 0m 36s | | trunk passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09 |
| +1 💚 | checkstyle | 0m 33s | | trunk passed |
| +1 💚 | mvnsite | 0m 42s | | trunk passed |
| +1 💚 | javadoc | 0m 42s | | trunk passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 |
| +1 💚 | javadoc | 0m 34s | | trunk passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09 |
| +1 💚 | spotbugs | 1m 11s | | trunk passed |
| +1 💚 | shadedclient | 39m 43s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 0m 31s | | the patch passed |
| +1 💚 | compile | 0m 39s | | the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 |
| +1 💚 | javac | 0m 39s | | the patch passed |
| +1 💚 | compile | 0m 27s | | the patch passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09 |
| +1 💚 | javac | 0m 27s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 0m 22s | | the patch passed |
| +1 💚 | mvnsite | 0m 33s | | the patch passed |
| +1 💚 | javadoc | 0m 30s | | the patch passed with JDK Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 |
| +1 💚 | javadoc | 0m 26s | | the patch passed with JDK Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09 |
| +1 💚 | spotbugs | 1m 7s | | the patch passed |
| +1 💚 | shadedclient | 39m 7s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 3m 29s | | hadoop-aws in the patch passed. |
| +1 💚 | asflicense | 0m 39s | | The patch does not generate ASF License warnings. |
| | | 156m 55s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.50 ServerAPI=1.50 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7722/1/artifact/out/Dockerfile |
| GITHUB PR | #7722 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint |
| uname | Linux 723ee363572c 5.15.0-139-generic #149-Ubuntu SMP Fri Apr 11 22:06:13 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / ba9df1c |
| Default Java | Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.27+6-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_452-8u452-gaus1-0ubuntu120.04-b09 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7722/1/testReport/ |
| Max. process+thread count | 577 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7722/1/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

@shameersss1
Contributor Author

@steveloughran - Could you please review the changes?

@steveloughran
Contributor

can you point to some docs about s3express and MPUs?

we had lots of pain related to directories not existing but still being found in list calls, and had to make changes across the code to cope with it. Are these now superfluous?

Contributor

@steveloughran steveloughran left a comment

+1 pending you declaring what the test cli was, and what you tested against. s3 express, presumably

@shameersss1
Contributor Author

@steveloughran - I have added those details to the PR description:

Tested in us-east-1 against an S3 Express store bucket. The following tests fail both with and without the change:

ITestTreewalkProblems.testDistCp:317->lambda$testDistCp$3:318
ITestTreewalkProblems.testDistCpNoIterator:340->lambda$testDistCpNoIterator$4:341 [Exit code of distcp -update -delete -direct
ITestCustomSigner.testCustomSignerAndInitializer
ITestS3AContractAnalyticsStreamVectoredRead.testVectoredReadAfterNormalRead
ITestS3AEndpointRegion.testCentralEndpointAndNullRegionFipsWithCRUD:510 » AWSUnsupportedFeature
ITestS3AEndpointRegion.testCentralEndpointAndNullRegionWithCRUD:501->assertOpsUsingNewFs:548 » UnknownHost
ITestS3AEndpointRegion.testWithCrossRegionAccess:395 » UnknownHost getFileStat...
ITestS3AEndpointRegion.testWithOutCrossRegionAccess:374->lambda$testWithOutCrossRegionAccess$2:376 » UnknownHost
ITestConnectionTimeouts.testObjectUploadTimeouts:265 » AWSBadRequest Writing O...
ITestS3APutIfMatchAndIfNoneMatch.testIfMatchTwoMultipartUploadsRaceConditionOneClosesFirst:551 » AWSS3IO
ITestS3APutIfMatchAndIfNoneMatch.testIfNoneMatchConflictOnMultipartUpload:321->lambda$testIfNoneMatchConflictOnMultipartUpload$2:322->createFileWithFlags:176 » O
ITestS3APutIfMatchAndIfNoneMatch.testIfNoneMatchMultipartUploadWithRaceCondition:349 » AWSS3IO
ITestS3APutIfMatchAndIfNoneMatch.testIfNoneMatchTwoConcurrentMultipartUploads:372 » AWSS3

@shameersss1
Contributor Author

> can you point to some docs about s3express and MPUs?

https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-express-differences.html

@steveloughran
Contributor

ITestTreewalkProblems looks related; the others shouldn't be happening at all, though the endpoint one may be from a locked down test machine.

the ITestS3APutIfMatchAndIfNoneMatch &c tests are new and from conditional writes.

If you test with an s3 standard bucket, do you see these?

@steveloughran
Contributor

  1. what is the current s3 express lifecycle support? Is there some now?
  2. do object prefixes still appear in LIST calls when there are pending uploads?

these are the problems that the object delete stuff is trying to address...if it is now consistent with s3 standard then we can turn it off. otherwise we have a problem: turning off that purge on delete behaviour introduces its own problems

which code is doing the deletion? Can it use the bulk delete API to explicitly delete files? there's no extra work there other than the building up of the bulk delete POSTs and a bit of rate throttling
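
To give a feel for how small the "building up of the bulk delete POSTs" part is, here is a minimal, self-contained sketch in plain Java that splits a list of keys into pages. It assumes the S3 limit of 1000 keys per multi-object delete request; the class and method names are hypothetical and this is not the S3A bulk delete API. Issuing the requests and throttling them are left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Splits object keys into pages small enough for one bulk delete request each. */
final class BulkDeletePages {

  /** S3 accepts at most 1000 keys in a single multi-object delete request. */
  static final int MAX_KEYS_PER_REQUEST = 1000;

  /**
   * Partition keys into consecutive pages of at most pageSize entries.
   * Every key appears in exactly one page, in its original order.
   */
  static List<List<String>> paginate(List<String> keys, int pageSize) {
    if (pageSize <= 0 || pageSize > MAX_KEYS_PER_REQUEST) {
      throw new IllegalArgumentException("page size out of range: " + pageSize);
    }
    List<List<String>> pages = new ArrayList<>();
    for (int start = 0; start < keys.size(); start += pageSize) {
      int end = Math.min(start + pageSize, keys.size());
      // Copy the sublist so each page is independent of the source list.
      pages.add(new ArrayList<>(keys.subList(start, end)));
    }
    return pages;
  }

  public static void main(String[] args) {
    List<String> keys = new ArrayList<>();
    for (int i = 0; i < 2500; i++) {
      keys.add("job/output/part-" + i);
    }
    for (List<String> page : paginate(keys, MAX_KEYS_PER_REQUEST)) {
      // A real caller would send one delete request per page here.
      System.out.println("would delete " + page.size() + " keys");
    }
  }
}
```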

@shameersss1
Contributor Author

> ITestTreewalkProblems looks related; the others shouldn't be happening at all, though the endpoint one may be from a locked down test machine.
>
> the ITestS3APutIfMatchAndIfNoneMatch &c tests are new and from conditional writes.
>
> If you test with an s3 standard bucket, do you see these?

On a standard bucket all these tests passed. On S3 Express buckets these tests fail both with and without this change.

@shameersss1
Contributor Author

shameersss1 commented Jun 17, 2025

> 1. what is the current s3 express lifecycle support? Is there some now?
> 2. do object prefixes still appear in LIST calls when there are pending uploads?
>
> these are the problems that the object delete stuff is trying to address...if it is now consistent with s3 standard then we can turn it off. otherwise we have a problem: turning off that purge on delete behaviour introduces its own problems
>
> which code is doing the deletion? Can it use the bulk delete API to explicitly delete files? there's no extra work there other than the building up of the bulk delete POSTs and a bit of rate throttling

  1. Bucket policy is the same as for standard buckets. Dangling MPUs can be deleted by an AbortIncompleteMultipartUpload lifecycle rule.

  2. Yes. The path of a pending MPU becomes visible; the file itself becomes visible only after the MPU completes. For example, say we are writing s3:///path/key/file.txt to an S3 Express bucket using MPU. Once the MPU is initiated, the path s3:///path/key/ becomes visible and can be listed, but file.txt won't be visible.
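
The behaviour in point 2 can be sketched with a small in-memory model. This is only an illustration of the listing semantics as described in this thread, not S3 or S3A code; the class, its methods, and the `pendingUploadsVisible` switch are all hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

/** Toy object store: shows how a pending upload can surface a parent prefix in listings. */
final class PendingUploadListingModel {
  private final Set<String> objects = new TreeSet<>();
  private final Set<String> pendingUploads = new TreeSet<>();
  // true models the S3 Express behaviour described above; false models a standard bucket.
  private final boolean pendingUploadsVisible;

  PendingUploadListingModel(boolean pendingUploadsVisible) {
    this.pendingUploadsVisible = pendingUploadsVisible;
  }

  void initiateUpload(String key) { pendingUploads.add(key); }

  void abortUpload(String key) { pendingUploads.remove(key); }

  void completeUpload(String key) {
    if (pendingUploads.remove(key)) {
      objects.add(key);
    }
  }

  /** Lists immediate children of a prefix ending in '/': file names and "dir/" prefixes. */
  Set<String> list(String prefix) {
    Set<String> out = new TreeSet<>();
    for (String key : objects) {
      addChild(out, prefix, key, true);
    }
    if (pendingUploadsVisible) {
      for (String key : pendingUploads) {
        addChild(out, prefix, key, false);
      }
    }
    return out;
  }

  private static void addChild(Set<String> out, String prefix, String key, boolean completed) {
    if (!key.startsWith(prefix)) {
      return;
    }
    String rest = key.substring(prefix.length());
    int slash = rest.indexOf('/');
    if (slash >= 0) {
      out.add(rest.substring(0, slash + 1)); // intermediate "directory" prefix
    } else if (completed && !rest.isEmpty()) {
      out.add(rest);                         // the file itself, only once completed
    }
  }

  public static void main(String[] args) {
    PendingUploadListingModel express = new PendingUploadListingModel(true);
    express.initiateUpload("path/key/file.txt");
    System.out.println(express.list("path/"));     // the "key/" prefix is listed
    System.out.println(express.list("path/key/")); // file.txt is not listed yet
    express.abortUpload("path/key/file.txt");
    System.out.println(express.list("path/"));     // prefix gone once the upload is aborted
  }
}
```

In this model, aborting the pending upload is what makes the phantom prefix disappear, which is the effect the purge on directory delete was meant to achieve.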

@shameersss1
Contributor Author

@steveloughran

> these are the problems that the object delete stuff is trying to address...if it is now consistent with s3 standard then we can turn it off. otherwise we have a problem: turning off that purge on delete behaviour introduces its own problems

If bucket policies are configured properly, I don't see this causing issues.

> which code is doing the deletion? Can it use the bulk delete API to explicitly delete files? there's no extra work there other than the building up of the bulk delete POSTs and a bit of rate throttling

Deletion happens here - https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/DeleteOperation.java#L268

@steveloughran
Contributor

hhm, yes, I just got the conditional one to fail for me, because

Caused by: software.amazon.awssdk.services.s3.model.S3Exception: At least one of the pre-conditions you specified did not hold (Service: S3, Status Code: 200, Request ID: 019bc747f80001978438fae6050994f1a175e4fc, Extended Request ID: Oax6srKk)

I must have merged that test after all the SDK update pain and not tested it against an S3 Express bucket before. Looks like S3 Express isn't returning the 412 error code expected. @ahmarsuhail: is this expected, an SDK quirk or are we doing something wrong code wise?
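
One possible way to cope with this, sketched in plain Java: treat a response as a precondition failure if either the HTTP status is 412 or the error details say so, even when the status is 200 as in the exception above. This is only an illustration; the class and method are hypothetical, the `PreconditionFailed` error code is an assumption about what the service reports alongside that message, and whether S3A should do anything like this is exactly the open question.

```java
/** Hypothetical classifier for conditional-write failures; not S3A or AWS SDK code. */
final class PreconditionFailures {

  static final int SC_PRECONDITION_FAILED = 412;

  // Assumed S3 error code accompanying the message in the stack trace above.
  static final String PRECONDITION_FAILED_CODE = "PreconditionFailed";

  // Fragment of the message text taken from the exception quoted in this thread.
  static final String MESSAGE_FRAGMENT = "pre-conditions you specified did not hold";

  /**
   * Returns true if the failure looks like an unmet If-Match / If-None-Match condition,
   * whether it arrived as an HTTP 412 or as an error reported with another status.
   */
  static boolean isPreconditionFailure(int statusCode, String errorCode, String message) {
    if (statusCode == SC_PRECONDITION_FAILED) {
      return true;
    }
    if (PRECONDITION_FAILED_CODE.equals(errorCode)) {
      return true;
    }
    return message != null && message.contains(MESSAGE_FRAGMENT);
  }

  public static void main(String[] args) {
    String msg = "At least one of the pre-conditions you specified did not hold";
    System.out.println(isPreconditionFailure(200, null, msg));
    System.out.println(isPreconditionFailure(412, null, null));
    System.out.println(isPreconditionFailure(200, null, "OK"));
  }
}
```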

(update: most of my failures come from using fips endpoint flag, analytics stream and incomplete role permissions on aws console + software creation). But the conditional stuff is new and caused by a different error on conditional writes

[ERROR]   ITestS3AContractAnalyticsStreamVectoredRead>AbstractContractVectoredReadTest.testVectoredReadAfterNormalRead:490 » IO                                                                                                                       
[ERROR]   ITestS3AContractAnalyticsStreamVectoredRead>AbstractContractVectoredReadTest.testVectoredReadAfterNormalRead:490 » IO                                                                                                                       
[ERROR]   ITestS3AEndpointRegion.testCentralEndpointAndNullRegionFipsWithCRUD:510 » AWSUnsupportedFeature
[ERROR]   ITestS3AEndpointRegion.testCentralEndpointAndNullRegionWithCRUD:501->assertOpsUsingNewFs:548 » UnknownHost
[ERROR]   ITestS3AEndpointRegion.testWithCrossRegionAccess:395 » UnknownHost getFileStat...
[ERROR]   ITestS3AEndpointRegion.testWithOutCrossRegionAccess:374->lambda$testWithOutCrossRegionAccess$2:376 » UnknownHost
[ERROR]   ITestS3AFailureHandling.testMultiObjectDeleteLargeNumKeys » OutOfMemory Java h...
[ERROR]   ITestAssumeRole.testBulkDeleteOnReadOnlyAccess:715->executeBulkDeleteOnReadOnlyFiles:740 » AccessDenied
[ERROR]   ITestAssumeRole.testBulkDeleteWithReadWriteAccess:721->executeBulkDeleteOnSomeReadOnlyFiles:782 » AccessDenied
[ERROR]   ITestAssumeRole.testPartialDelete:703->executePartialDelete:850 » AccessDenied
[ERROR]   ITestAssumeRole.testPartialDeleteSingleDelete:709->executePartialDelete:850 » AccessDenied
[ERROR]   ITestAssumeRole.testReadOnlyOperations:470 » AccessDenied s3a://stevel--usw2-a...
[ERROR]   ITestAssumeRole.testRestrictedCommitActions:633 » AccessDenied s3a://stevel--u...
[ERROR]   ITestAssumeRole.testRestrictedWriteSubdir:520 » AccessDenied s3a://stevel--usw...
[ERROR]   ITestAssumedRoleCommitOperations>ITestCommitOperations.testAbortNonexistentDir:235 » AccessDenied
[ERROR]   ITestAssumedRoleCommitOperations>ITestCommitOperations.testBaseRelativePath:283 » AccessDenied
[ERROR]   ITestAssumedRoleCommitOperations>ITestCommitOperations.testBulkCommitFiles:603 » AccessDenied
[ERROR]   ITestAssumedRoleCommitOperations>ITestCommitOperations.testCommitEmptyFile:218->ITestCommitOperations.createCommitAndVerify:347 » AccessDenied                                                                                              
[ERROR]   ITestAssumedRoleCommitOperations>ITestCommitOperations.testCommitSmallFile:224->ITestCommitOperations.createCommitAndVerify:347 » AccessDenied                                                                                              
[ERROR]   ITestAssumedRoleCommitOperations>ITestCommitOperations.testCreateAbortEmptyFile:173 » AccessDenied
[ERROR]   ITestAssumedRoleCommitOperations>ITestCommitOperations.testFailuresInAbortListing:567 » AccessDenied
[ERROR]   ITestAssumedRoleCommitOperations>ITestCommitOperations.testMarkerFileRename:305 » AccessDenied
[ERROR]   ITestAssumedRoleCommitOperations>ITestCommitOperations.testRevertCommit:545 » AccessDenied
[ERROR]   ITestAssumedRoleCommitOperations>ITestCommitOperations.testRevertMissingCommit:557 » AccessDenied
[ERROR]   ITestAssumedRoleCommitOperations>ITestCommitOperations.testUploadEmptyFile:473 » AccessDenied
[ERROR]   ITestAssumedRoleCommitOperations>ITestCommitOperations.testUploadSmallFile:504 » AccessDenied
[ERROR]   ITestAssumedRoleCommitOperations>ITestCommitOperations.testWriteNormalStream:586 » AccessDenied
[ERROR]   ITestCustomSigner.testCustomSignerAndInitializer:137->runStoreOperationsAndVerify:157->lambda$runStoreOperationsAndVerify$0:162 » AWSBadRequest                                                                                             
[ERROR]   ITestCustomSigner.testCustomSignerAndInitializer:137->runStoreOperationsAndVerify:157->lambda$runStoreOperationsAndVerify$0:162 » AWSBadRequest                                                                                             
[ERROR]   ITestRestrictedReadAccess.testNoReadAccess:211->checkBasicFileOperations:283 » AccessDenied
[ERROR]   ITestRoleDelegationInFilesystem>ITestSessionDelegationInFilesystem.testDelegatedFileSystem:405->ITestSessionDelegationInFilesystem.executeDelegatedFSOperations:453 » AccessDenied                                                          
[ERROR]   ITestConnectionTimeouts.testObjectUploadTimeouts:265 » AWSBadRequest Writing O...
[ERROR]   ITestPartialRenamesDeletes.testCopyDirFailsToReadOnlyDir:519 » AccessDenied s3...
[ERROR]   ITestPartialRenamesDeletes.testCopyDirFailsToReadOnlyDir:519 » AccessDenied s3...
[ERROR]   ITestPartialRenamesDeletes.testPartialDirDelete[bulk-delete=false] » AWSBadRequest
[ERROR]   ITestPartialRenamesDeletes.testPartialDirDelete:607 » AccessDenied s3a://steve...
[ERROR]   ITestPartialRenamesDeletes.testPartialEmptyDirDelete:572 » AccessDenied s3a://...
[ERROR]   ITestPartialRenamesDeletes.testPartialEmptyDirDelete:572 » AccessDenied s3a://...
[ERROR]   ITestPartialRenamesDeletes.testRenameDirFailsInDelete:457 » AccessDenied s3a:/...
[ERROR]   ITestPartialRenamesDeletes.testRenameDirFailsInDelete:457 » AccessDenied s3a:/...
[ERROR]   ITestPartialRenamesDeletes.testRenameFileFailsNoWrite:503 » AccessDenied s3a:/...
[ERROR]   ITestPartialRenamesDeletes.testRenameFileFailsNoWrite:503 » AccessDenied s3a:/...
[ERROR]   ITestPartialRenamesDeletes.testRenameParentPathNotWriteable:393 » AccessDenied
[ERROR]   ITestPartialRenamesDeletes.testRenameParentPathNotWriteable:393 » AccessDenied
[ERROR]   ITestPartialRenamesDeletes.testRenamePermissionRequirements:756 » AccessDenied
[ERROR]   ITestPartialRenamesDeletes.testRenamePermissionRequirements:756 » AccessDenied
[ERROR]   ITestPartialRenamesDeletes.testRenameSingleFileFailsInDelete:416 » AccessDenied
[ERROR]   ITestPartialRenamesDeletes.testRenameSingleFileFailsInDelete:416 » AccessDenied
[ERROR]   ITestS3APutIfMatchAndIfNoneMatch.testIfMatchTwoMultipartUploadsRaceConditionOneClosesFirst:551 » AWSS3IO
[ERROR]   ITestS3APutIfMatchAndIfNoneMatch.testIfNoneMatchConflictOnMultipartUpload:321->lambda$testIfNoneMatchConflictOnMultipartUpload$2:322->createFileWithFlags:176 » AWSS3IO                                                                     
[ERROR]   ITestS3APutIfMatchAndIfNoneMatch.testIfNoneMatchMultipartUploadWithRaceCondition:349 » AWSS3IO
[ERROR]   ITestS3APutIfMatchAndIfNoneMatch.testIfNoneMatchTwoConcurrentMultipartUploads:372 » AWSS3IO
[INFO] 
[ERROR] Tests run: 1460, Failures: 0, Errors: 52, Skipped: 20

@shameersss1
Contributor Author

@steveloughran - https://aws.amazon.com/about-aws/whats-new/2025/06/amazon-s3-express-one-zone-atomic-renaming-objects-api/

S3 Express supports atomic renames now. We need to see how efficient our MagicCommitter is compared to FileOutputCommitter now.

@shameersss1
Contributor Author

Let me rerun ITestTreewalkProblems since you have not noticed this failure in your runs.

@ahmarsuhail
Contributor

@shameersss1 will create a JIRA for rename support. I think an S3 team will commit the patch to support it, but it needs an SDK upgrade. One of us will have to volunteer to do that; I think it will have to be me.

@steveloughran which test is not returning the 412? ITestS3APutIfMatchAndIfNoneMatch?

@shameersss1
Contributor Author

@steveloughran - ITestTreewalkProblems fails in my setup (even without this commit).

The following is the failure

`[ERROR] ITestTreewalkProblems.testDistCpNoIterator:

Expecting:
<-999>
to be equal to:
<0>
but was not.
`

Not sure whether my setup is wrong or this is expected.

@shameersss1
Contributor Author

[Screenshot, 2025-06-20 11:58 AM]

This looks like the test is expecting a different class of assertion failure.

@steveloughran - Are you seeing the same?
