Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

NCT upload script #1

Merged
merged 22 commits into from
Oct 20, 2022
Merged

NCT upload script #1

merged 22 commits into from
Oct 20, 2022

Conversation

mephenor
Copy link
Member

@mephenor mephenor commented Oct 17, 2022

Added script to encrypt and upload to S3 storage.
File secret and necessary data is stored locally in a owner read-only file.

Co-authored-by: KerstenBreuer <kersten-breuer@outlook.com>
Co-authored-by: Moritz Hahn <Moritz.Hahn@uni-tuebingen.de>

Do we need a test against a localstack Testcontainer for this?
Retries are handled through a request session object.
Multipart uploads are cleaned up, but if something happens during/after dowanload, object stays in storage.
Do we need to clean this up as well?

Copy link
Contributor

@KerstenBreuer KerstenBreuer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sieht gut aus, aber ein local test mittel localstack, den man dann auch im CI laufen lassen kann, wäre schon gut.

Bezüglich dem cleanup bei Fehlern. Es wäre in der Tat ganz gut, wenn die bisher hochgeladenen Files dann gelöscht werden. Sonst haben wir ganz viele Dateileichen, die wir später nichtmer zuordenen können. Kann ja teil der Exception handling logic werden.

mephenor and others added 3 commits October 17, 2022 15:18
Co-authored-by: Kersten Breuer <kersten-breuer@outlook.com>
Copy link

@MoritzHahn1337 MoritzHahn1337 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Found a typo
    Regarding what Kersten mentioned:
  • We should always use S3-Testcontainers when using S3-Capabilities
  • I feel like always using pycurl-requests for consistency, but we can debate this

In total, I think the script could be refactored for readability before we give it to a wider audience - as long as we only use it for the nct master dataset, we should be fine.

mephenor and others added 5 commits October 18, 2022 09:27
Co-authored-by: Moritz Hahn <75744178+MoritzHahn1337@users.noreply.github.com>
Happy path test with output cleanup check
Now back to global SESSION object
mephenor and others added 2 commits October 19, 2022 15:57
Co-authored-by: Kersten Breuer <kersten-breuer@outlook.com>
@mephenor mephenor merged commit 0a83697 into main Oct 20, 2022
@mephenor mephenor deleted the feature/nct_upload_GDEV-1297 branch October 20, 2022 09:14
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants