Add datasegment copier interface and s3 impl #17430

Draft
wants to merge 2 commits into master

Conversation

Contributor

@jtuglu-netflix jtuglu-netflix commented Oct 28, 2024

This PR adds a DataSegmentCopier interface and a corresponding S3DataSegmentCopier implementation. The goal is to give users who want to copy data segments between clusters a supported way to do so. These classes are used in a CLI tool for copying datasources between clusters, similar to the older, now-deprecated migration tool; we plan to release that tool as open source soon as well.

Description

Currently, Druid only provides a way to move a data segment from one deep storage location to another, which deletes it from the source. This PR adds the option to copy instead, and refactors the code common to S3DataSegmentMover and S3DataSegmentCopier into a shared S3DataSegmentTransferUtility.
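
The copy-versus-move distinction described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `SegmentTransferSketch` and its methods are hypothetical names, and an in-memory map stands in for S3 deep storage so the example runs without AWS. The only difference between the two operations is whether the source object is deleted after the shared transfer step.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a shared transfer utility. An in-memory map
// plays the role of deep storage; none of these names come from the PR.
final class SegmentTransferSketch
{
  private SegmentTransferSketch()
  {
  }

  // Shared step: write the object under the target key and leave the source untouched.
  static void transfer(Map<String, byte[]> storage, String sourceKey, String targetKey)
  {
    byte[] data = storage.get(sourceKey);
    if (data == null) {
      throw new IllegalStateException("Missing source object: " + sourceKey);
    }
    if (!sourceKey.equals(targetKey)) {
      storage.put(targetKey, data.clone());
    }
  }

  // Copy: the source object survives the transfer.
  static void copy(Map<String, byte[]> storage, String sourceKey, String targetKey)
  {
    transfer(storage, sourceKey, targetKey);
  }

  // Move: same transfer, followed by deleting the source object.
  static void move(Map<String, byte[]> storage, String sourceKey, String targetKey)
  {
    transfer(storage, sourceKey, targetKey);
    if (!sourceKey.equals(targetKey)) {
      storage.remove(sourceKey);
    }
  }

  public static void main(String[] args)
  {
    Map<String, byte[]> storage = new HashMap<>();
    storage.put("clusterA/segment", new byte[]{1, 2, 3});

    copy(storage, "clusterA/segment", "clusterB/segment");
    System.out.println("source present after copy: " + storage.containsKey("clusterA/segment"));

    move(storage, "clusterA/segment", "clusterC/segment");
    System.out.println("source present after move: " + storage.containsKey("clusterA/segment"));
  }
}
```

Keeping the delete outside the shared step is one way to let a mover and a copier reuse the same transfer logic.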

Release note


Key changed/added classes in this PR
  • extensions-core/s3-extensions/src/main/java/org/apache/druid/storage/s3/S3DataSegmentCopier.java
  • extensions-core/s3-extensions/src/main/java/org/apache/druid/storage/s3/S3DataSegmentMover.java
  • extensions-core/s3-extensions/src/main/java/org/apache/druid/storage/s3/S3DataSegmentTransferUtility.java
  • extensions-core/s3-extensions/src/test/java/org/apache/druid/storage/s3/S3DataSegmentCopierTest.java
  • processing/src/main/java/org/apache/druid/segment/loading/DataSegmentCopier.java

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious to an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@jtuglu-netflix jtuglu-netflix force-pushed the add-datasegment-copier-interface-and-s3-impl branch from a2534fb to 49c7941 Compare October 28, 2024 22:14
@jtuglu-netflix jtuglu-netflix marked this pull request as ready for review October 28, 2024 22:14
@kfaraz
Contributor

kfaraz commented Oct 29, 2024

Thanks for the PR, @jtuglu-netflix!
Could you share some details on how you plan to use this feature?

);
}
catch (Exception e) {
Throwables.propagateIfInstanceOf(e, AmazonServiceException.class);

Check notice (Code scanning / CodeQL): Deprecated method or constructor invocation

Invoking Throwables.propagateIfInstanceOf should be avoided because it has been deprecated.
}
catch (Exception e) {
Throwables.propagateIfInstanceOf(e, AmazonServiceException.class);
Throwables.propagateIfInstanceOf(e, SegmentLoadingException.class);

Check notice (Code scanning / CodeQL): Deprecated method or constructor invocation

Invoking Throwables.propagateIfInstanceOf should be avoided because it has been deprecated.
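
One way to address the two notices above is to rethrow the specific exception types explicitly. Guava's deprecation note points to Throwables.throwIfInstanceOf as the replacement, provided the Guava version on the classpath includes it. The sketch below shows the same behavior in plain Java instead, so it runs without dependencies. `ServiceException` and `LoadingException` are hypothetical stand-ins for AmazonServiceException and SegmentLoadingException, and `runTransfer` is an illustrative name, not a method from this PR.

```java
// Plain-Java sketch of "rethrow if instance of", without the deprecated Guava call.
final class RethrowSketch
{
  private RethrowSketch()
  {
  }

  // Hypothetical stand-in for AmazonServiceException (unchecked).
  static class ServiceException extends RuntimeException
  {
    ServiceException(String message)
    {
      super(message);
    }
  }

  // Hypothetical stand-in for SegmentLoadingException (checked).
  static class LoadingException extends Exception
  {
    LoadingException(String message)
    {
      super(message);
    }
  }

  interface Action
  {
    void run() throws Exception;
  }

  static void runTransfer(Action action) throws LoadingException
  {
    try {
      action.run();
    }
    catch (Exception e) {
      // Rethrow the types callers are expected to handle, unchanged.
      if (e instanceof ServiceException) {
        throw (ServiceException) e;
      }
      if (e instanceof LoadingException) {
        throw (LoadingException) e;
      }
      // Anything else is wrapped so the original cause is preserved.
      throw new RuntimeException(e);
    }
  }
}
```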

private MockAmazonS3Client()
{
super(new AmazonS3Client(), new NoopServerSideEncryption());

Check notice (Code scanning / CodeQL): Deprecated method or constructor invocation (test)

Invoking AmazonS3Client.AmazonS3Client should be avoided because it has been deprecated.
@jtuglu-netflix jtuglu-netflix force-pushed the add-datasegment-copier-interface-and-s3-impl branch from af56f75 to ec80d2f Compare October 30, 2024 16:02
2 participants