HADOOP-19485. S3A: upgrade AWS SDK #7479

Open
wants to merge 5 commits into
base: trunk

steveloughran
Contributor

@steveloughran steveloughran commented Mar 7, 2025

Upgrade to 2.30.27

How was this patch tested?

Regression testing in progress

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@steveloughran
Contributor Author

testing in progress; also writing a new, expanded and very strict doc on qualifying a release, based on the experience of recent upgrades.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 0s Docker mode activated.
-1 ❌ patch 0m 16s #7479 does not apply to trunk. Rebase required? Wrong Branch? See https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute for help.
Subsystem Report/Notes
GITHUB PR #7479
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7479/1/console
versions git=2.34.1
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran
Contributor Author

Test failures with 2.30.27

The issue tracked in HADOOP-19272 (#7048), "S3A: AWS SDK 2.25.53 warnings logged about transfer manager not using CRT client", has been fixed in the SDK, so the test which expects the warning text to be logged now fails:

java.lang.AssertionError: 
[LOG output does not contain the forbidden text. Has the SDK been fixed?] 
Expecting:
 <"">
to contain:
 <"The provided S3AsyncClient is an instance of MultipartS3AsyncClient"> 
        at org.apache.hadoop.fs.s3a.impl.ITestAwsSdkWorkarounds.testNoisyLogging(ITestAwsSdkWorkarounds.java:100)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
        at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
        at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
        at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.lang.Thread.run(Thread.java:750)

this test can be culled

@steveloughran steveloughran force-pushed the s3/HADOOP-19485-SDK-upgrade branch from 8639127 to 9720b02 Compare March 7, 2025 10:59
@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 0s Docker mode activated.
-1 ❌ patch 0m 16s #7479 does not apply to trunk. Rebase required? Wrong Branch? See https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute for help.
Subsystem Report/Notes
GITHUB PR #7479
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7479/2/console
versions git=2.34.1
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran
Contributor Author

steveloughran commented Mar 7, 2025

ITestS3AEndpointRegion failures are probably caused by aws/aws-sdk-java-v2#5562 or a related change. Key point: the execution context attribute AwsExecutionAttribute.ENDPOINT_OVERRIDDEN is now false when the endpoint is not overridden, rather than unset/null.
This is a test-only regression.
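
A test that has to pass against both the old and the new behaviour can treat "unset" and "false" as the same state. The following is a minimal sketch of that idea, not the change made in this PR: the `Map` and the string key are hypothetical stand-ins for the SDK's execution attributes, and only the null-safe `Boolean.TRUE.equals(...)` comparison is the point.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: treat an unset attribute and an explicit false identically. */
final class EndpointOverrideCheck {

  /** Hypothetical stand-in key for AwsExecutionAttribute.ENDPOINT_OVERRIDDEN. */
  static final String ENDPOINT_OVERRIDDEN = "EndpointOverridden";

  /**
   * Returns true only when the attribute is present and TRUE.
   * Boolean.TRUE.equals(null) is false, so null (older behaviour) and
   * FALSE (newer behaviour) are both reported as "not overridden".
   */
  static boolean isEndpointOverridden(Map<String, Boolean> attributes) {
    return Boolean.TRUE.equals(attributes.get(ENDPOINT_OVERRIDDEN));
  }

  public static void main(String[] args) {
    Map<String, Boolean> attrs = new HashMap<>();
    System.out.println(isEndpointOverridden(attrs)); // attribute unset
    attrs.put(ENDPOINT_OVERRIDDEN, Boolean.FALSE);
    System.out.println(isEndpointOverridden(attrs)); // attribute explicitly false
    attrs.put(ENDPOINT_OVERRIDDEN, Boolean.TRUE);
    System.out.println(isEndpointOverridden(attrs)); // attribute true
  }
}
```

An assertion written as "attribute is null" breaks on the new SDK, while one written as "attribute is not TRUE" holds for both.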

@steveloughran
Contributor Author

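Testing with third-party stores shows the Content-MD5 problem is present: the store rejects the request below with "Missing required header for this request: Content-MD5". It also shows that no SDK testing is ever performed against third-party stores, which is something to consider.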

       org.apache.hadoop.fs.s3a.AWSBadRequestException: Remove S3 Files on s3a://dellecs/job-00-fork-0003/test/test: software.amazon.awssdk.services.s3.model.InvalidRequestException: Missing required header for this request: Content-MD5 (Service: S3, Status Code: 400, Request ID: 0c07c879:1953935cbee:1a45b:1d85, Extended Request ID: 85e1d41b57b608d4e58222b552dea52902e93b05a12f63f54730ae77769df8d1) (SDK Attempt Count: 1):InvalidRequest: Missing required header for this request: Content-MD5 (Service: S3, Status Code: 400, Request ID: 0c07c879:1953935cbee:1a45b:1d85, Extended Request ID: 85e1d41b57b608d4e58222b552dea52902e93b05a12f63f54730ae77769df8d1) (SDK Attempt Count: 1)
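
For reference, the value the store is asking for is the base64 encoding of the raw 128-bit MD5 digest of the request body (RFC 1864). The sketch below computes that value with the JDK alone. The class name `ContentMd5` is hypothetical, and this is not a description of how the SDK or S3A builds or attaches the header.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

/** Sketch: compute a Content-MD5 header value for a request body. */
final class ContentMd5 {

  /** Base64 of the raw MD5 digest (not of its hex string). */
  static String of(byte[] body) {
    try {
      MessageDigest md5 = MessageDigest.getInstance("MD5");
      return Base64.getEncoder().encodeToString(md5.digest(body));
    } catch (NoSuchAlgorithmException e) {
      // MD5 is a required algorithm on every Java platform.
      throw new IllegalStateException("MD5 digest not available", e);
    }
  }

  public static void main(String[] args) {
    // Illustrative payload only; a real bulk delete sends an XML body.
    byte[] body = "<Delete></Delete>".getBytes(StandardCharsets.UTF_8);
    System.out.println("Content-MD5: " + of(body));
  }
}
```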

Upgrade to 2.30.27

Change-Id: Ic0652dc95c619559c45c9f0a153813b73a076d13

AwsSdkWorkarounds no longer needs to cut back on transfer manager logging
(HADOOP-19272).

Remove log downgrade and change assertion to expect nothing to be logged.

Change-Id: I5edcf674c1eede8327538979ddab2fe98d2e53e2

Change in state of AwsExecutionAttribute.ENDPOINT_OVERRIDDEN
attribute requires test tuning to match.

Change-Id: I80050ce9ffffa6b4f1b05dd16e83b18d2ce63678

Refresh IAM credentials a hard coded 60s before the session credentials
fully expire.

Change-Id: I2a61584cc99d761cc4b9af6a669224f309425088

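
The "refresh 60s before expiry" rule in the last commit message can be illustrated as a simple time comparison. This is a sketch under assumptions, not the code in the commit: the class and method names are hypothetical, and only the 60-second margin comes from the commit message.

```java
import java.time.Duration;
import java.time.Instant;

/** Sketch: decide whether session credentials are due for a refresh. */
final class CredentialRefreshPolicy {

  /** Hard-coded margin taken from the commit message. */
  static final Duration REFRESH_MARGIN = Duration.ofSeconds(60);

  /**
   * Returns true once "now" has reached (expiry - margin), so the
   * refresh happens while the old credentials are still valid.
   */
  static boolean needsRefresh(Instant expiry, Instant now) {
    return !now.isBefore(expiry.minus(REFRESH_MARGIN));
  }

  public static void main(String[] args) {
    Instant expiry = Instant.now().plusSeconds(300);
    System.out.println(needsRefresh(expiry, Instant.now()));          // well before the margin
    System.out.println(needsRefresh(expiry, expiry.minusSeconds(30))); // inside the margin
  }
}
```

Refreshing ahead of the expiry time leaves room for clock skew and for requests already in flight when the credentials would otherwise lapse.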
@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 0s Docker mode activated.
-1 ❌ patch 0m 15s #7479 does not apply to trunk. Rebase required? Wrong Branch? See https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute for help.
Subsystem Report/Notes
GITHUB PR #7479
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7479/3/console
versions git=2.34.1
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Switch is in client; commented out in test log properties;
covered in troubleshooting doc

Change-Id: If70447d8eb3d3d0e03db5c169cd1aabf844931bd
@steveloughran steveloughran force-pushed the s3/HADOOP-19485-SDK-upgrade branch from b8314e3 to 288fd09 Compare March 7, 2025 19:38
@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 0s Docker mode activated.
-1 ❌ patch 0m 17s #7479 does not apply to trunk. Rebase required? Wrong Branch? See https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute for help.
Subsystem Report/Notes
GITHUB PR #7479
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7479/4/console
versions git=2.34.1
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.


2 participants