-
Notifications
You must be signed in to change notification settings - Fork 9k
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
HADOOP-18910: [ABFS] Adding Support for MD5 Hash based integrity verification of the request content during transport #6069
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good code!
Some concerns over offsets; More tests around readAhead would be awesome.
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java
Outdated
Show resolved
Hide resolved
💔 -1 overall
This message was automatically generated. |
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java
Outdated
Show resolved
Hide resolved
...op-azure/src/main/java/org/apache/hadoop/fs/azurebfs/constants/FileSystemConfigurations.java
Show resolved
Hide resolved
...doop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemChecksum.java
Outdated
Show resolved
Hide resolved
...doop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemChecksum.java
Show resolved
Hide resolved
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java
Outdated
Show resolved
Hide resolved
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
5ff8d5b
to
03a7453
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great tests addition!
Thanks for taking all the suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reviewed
|
||
AbfsHttpHeader rangeHeader = new AbfsHttpHeader(RANGE, | ||
String.format("bytes=%d-%d", position, position + bufferLength - 1)); | ||
requestHeaders.add(rangeHeader); | ||
addEncryptionKeyRequestHeaders(path, requestHeaders, false, | ||
contextEncryptionAdapter, tracingContext); | ||
requestHeaders.add(new AbfsHttpHeader(RANGE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this going in here when like 1085 sets this header too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to be outdated.
Caused by merge conflicts but it was fixed in the latest commit: 590a003
More merge conflicts need to be resolved now.
Will take them up.
:::: AGGREGATED TEST RESULT :::: HNS-OAuth[INFO] Results: HNS-SharedKey[INFO] Results: NonHNS-SharedKey[INFO] Results: AppendBlob-HNS-OAuth[INFO] Results: Time taken: 25 mins 26 secs. |
🎊 +1 overall
This message was automatically generated. |
@steveloughran @mukund-thakur Thanks a lot. |
? abfsRestOperationException.getErrorCode().getErrorCode() | ||
: AzureServiceErrorCode.UNKNOWN.getErrorCode(), | ||
abfsRestOperationException != null | ||
? abfsRestOperationException.toString() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need full toString here or just ex.getMessage()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
toString() will be better it has more information like the URL which was hit.
// | ||
// | ||
//<<<<<<< HEAD | ||
//private AzureBlobFileSystem getFileSystem(boolean optimizeFooterRead, | ||
// int fileSize) throws IOException { | ||
//final AzureBlobFileSystem fs = getFileSystem(); | ||
// getAbfsStore(fs).getAbfsConfiguration() | ||
// .setOptimizeFooterRead(optimizeFooterRead); | ||
// getAbfsStore(fs).getAbfsConfiguration() | ||
// .setIsChecksumValidationEnabled(true); | ||
// if (fileSize <= getAbfsStore(fs).getAbfsConfiguration() | ||
// .getReadBufferSize()) { | ||
// ======= |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cut this
please fix this checkstyle error. |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM +1
Seeing these failures in branch-3.4 after backporting this and #5881. These failures are happening even without these changes. @anujmodi2021 Can you figure out what other commits are missing in branch-3.4 or these are genuine failures?
[ERROR] ITestGetNameSpaceEnabled.testGetIsNamespaceEnabledWhenConfigIsFalse:98->unsetAndAssert:109 [getIsNamespaceEnabled should return the value configured for fs.azure.test.namespace.enabled] expected:<[fals]e> but was:<[tru]e> |
Thanks for pointing this out. I am not able to point out any changes missing in branch3.4 Let me check from my end what is the issue. Will take this up and create a PR for 3.4 with the fix. |
can you please create a backport PR on branch-3.4 and run the tests? |
Sure, Mukund. Will create one. Regarding, the failures you indicated above. I do not see these tests failing for me on either trunk or branch-3.4 (as of 26th Feb) |
:::: AGGREGATED TEST RESULT On branch-3.4:::: HNS-OAuth[INFO] Results: HNS-SharedKey[INFO] Results: NonHNS-SharedKey[INFO] Results: AppendBlob-HNS-OAuth[INFO] Results: Time taken: 21 mins 25 secs. |
Will run the test suite again on the backport PR to 3.4 as well... |
This is my auth-keys for a HNS account. and all of the above 3 tests fails from me even on trunk.
|
You also need to add following test configuration to specify the account type you are using
|
Yes after adding this, ITestGetNameSpaceEnabled succeed. |
Hi @mukund-thakur Having said that, we recommend that developers making changes in driver, should follow the configuration template we have designed so that they do not end up skipping a lot of tests. Recommendation is to follow the template and set all the configs mentioned there and then use our test scripts to run the whole test suite. That will ensure easy test runs with all the combinations of account types and auth types.. We will also make sure to update the documentations regarding ABFS Testing with this recommendation. Thanks for pointing this out. |
…fication of the request content during transport (apache#6069) Contributed By: Anuj Modi
@mukund-thakur Created a common PR for both commits as they tend to have conflicts |
Have you created the Jira's for these test issues? Found one more today https://issues.apache.org/jira/browse/HADOOP-19106 |
think its time for the default for namespace.enabled to become true? I'd support that |
…fication of the request content during transport (#6069) Contributed By: Anuj Modi
Working on all the test fixes. Will create a common PR for all these related Jira. Thanks for reporting and patience. |
Description of PR
Jira Ticket: https://issues.apache.org/jira/browse/HADOOP-18910
Azure Storage Supports Content-MD5 Request Headers in Both Read and Append APIs.
Read File Doc
Append File Doc
This change is to make client-side changes to support them. In Read request, we will send the appropriate header in response to which server will return the MD5 Hash of the data it sends back. On Client we will tally this with the MD5 hash computed from the data received.
In Append request, we will compute the MD5 Hash of the data that we are sending to the server and specify that in appropriate header. Server on finding that header will tally this with the MD5 hash it will compute on the data received.
This whole Checksum Validation Support is guarded behind a config, Config is by default disabled because with the use of "https" integrity of data is preserved anyways. This is introduced as an additional data integrity check which will have a performance impact as well.
Users can decide if they want to enable this or not by setting the following config to "true" or "false" respectively. Config: "fs.azure.enable.checksum.validation"
How was this patch tested?
New tests were added and existing tests were run
:::: AGGREGATED TEST RESULT ::::
HNS-OAuth
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[ERROR] TestAbfsClientThrottlingAnalyzer.testManySuccessAndErrorsAndWaiting:181->fuzzyValidate:64 The actual value 12 is not within the expected range: [5.60, 8.40].
[INFO]
[ERROR] Tests run: 141, Failures: 2, Errors: 0, Skipped: 5
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR] ITestAzureBlobFileSystemLease.testAcquireRetry:344->lambda$testAcquireRetry$6:345 » TestTimedOut
[INFO]
[ERROR] Tests run: 593, Failures: 0, Errors: 1, Skipped: 54
[INFO] Results:
[INFO]
[WARNING] Tests run: 339, Failures: 0, Errors: 0, Skipped: 41
HNS-SharedKey
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[ERROR] TestAbfsClientThrottlingAnalyzer.testManySuccessAndErrorsAndWaiting:181->fuzzyValidate:64 The actual value 9 is not within the expected range: [5.60, 8.40].
[INFO]
[ERROR] Tests run: 141, Failures: 2, Errors: 0, Skipped: 5
[INFO] Results:
[INFO]
[WARNING] Tests run: 593, Failures: 0, Errors: 0, Skipped: 54
[INFO] Results:
[INFO]
[WARNING] Tests run: 339, Failures: 0, Errors: 0, Skipped: 41
NonHNS-SharedKey
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[INFO]
[ERROR] Tests run: 141, Failures: 1, Errors: 0, Skipped: 11
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR] ITestAzureBlobFileSystemCheckAccess.testCheckAccessForAccountWithoutNS:181->lambda$testCheckAccessForAccountWithoutNS$0:184 » AbfsRestOperation
[INFO]
[ERROR] Tests run: 593, Failures: 0, Errors: 1, Skipped: 277
[INFO] Results:
[INFO]
[WARNING] Tests run: 339, Failures: 0, Errors: 0, Skipped: 44
AppendBlob-HNS-OAuth
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[ERROR] TestAbfsClientThrottlingAnalyzer.testManySuccessAndErrorsAndWaiting:181->fuzzyValidate:64 The actual value 11 is not within the expected range: [5.60, 8.40].
[INFO]
[ERROR] Tests run: 141, Failures: 2, Errors: 0, Skipped: 5
[INFO] Results:
[INFO]
[WARNING] Tests run: 593, Failures: 0, Errors: 0, Skipped: 54
[INFO] Results:
[INFO]
[WARNING] Tests run: 339, Failures: 0, Errors: 0, Skipped: 41
Time taken: 82 mins 12 secs.
For code changes:
LICENSE
,LICENSE-binary
,NOTICE-binary
files?