Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Upload downloaded files to S3 after every block so they can be used for subsequent blocks #1801

Merged
merged 1 commit into from
Feb 20, 2025

Conversation

wintonzheng
Copy link
Contributor

@wintonzheng wintonzheng commented Feb 20, 2025

Important

Reorder file upload logic in agent.py and adjust get_downloaded_files calls in block.py to ensure S3 uploads after every block execution.

  • Behavior:
    • Reorder conditional block in clean_up_task in agent.py to upload files to S3 before checking if task is part of a workflow run.
    • Set workflow_run_id to None in get_downloaded_files calls in block.py to ensure files are uploaded to S3 after every block execution.
  • Misc:
    • Minor reordering of code in agent.py for logical consistency.

This description was created by Ellipsis for fce6a35. It will automatically update as commits are pushed.

<!-- ELLIPSIS_HIDDEN -->

> [!IMPORTANT]
> Upload downloaded files to S3 after every block execution and adjust file retrieval logic in `block.py` and `agent.py`.
>
>   - **Behavior**:
>     - Upload downloaded files to S3 after every block execution in `block.py`.
>     - Ensure `workflow_run_id` is not passed when retrieving downloaded files in `block.py` and `agent.py`.
>   - **Code Reordering**:
>     - Reorder code in `clean_up_task()` in `agent.py` for clarity.
>   - **Misc**:
>     - Minor logging adjustments in `agent.py`.
>
> <sup>This description was created by </sup>[<img alt="Ellipsis" src="https://img.shields.io/badge/Ellipsis-blue?color=175173">](https://www.ellipsis.dev?ref=Skyvern-AI%2Fskyvern-cloud&utm_source=github&utm_medium=referral)<sup> for afcb86b440d9b73c39c280dca1dd7066bcbeec8f. It will automatically update as commits are pushed.</sup>

<!-- ELLIPSIS_HIDDEN -->
@wintonzheng wintonzheng force-pushed the suchintan.fix-save-file-logic-cloud branch from 85a8153 to fce6a35 Compare February 20, 2025 15:34
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Reviewed everything up to fce6a35 in 1 minute and 16 seconds

More details
  • Looked at 60 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 14 drafted comments based on config settings.
1. skyvern/forge/sdk/workflow/models/block.py:1316
  • Draft comment:
    In _upload_file_to_s3, os.unlink(file_path) is called in finally. Consider handling potential errors if file deletion fails.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50%
    None
2. skyvern/forge/sdk/workflow/models/block.py:1453
  • Draft comment:
    UploadToS3Block: When listing directory files, the code raises an error if file count exceeds MAX_UPLOAD_FILE_COUNT. Ensure this limit is intentional and documented.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50%
    None
3. skyvern/forge/sdk/workflow/models/block.py:1522
  • Draft comment:
    SendEmailBlock: Validate that file attachments from parameters are securely handled and paths are sanitized to prevent path injection risks.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50%
    None
4. skyvern/forge/sdk/workflow/models/block.py:1152
  • Draft comment:
    CodeBlock uses exec() with parameter_values. Confirm this is intentional and that the execution context is strictly controlled.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50%
    None
5. skyvern/forge/sdk/workflow/models/block.py:1964
  • Draft comment:
    PDFParserBlock: Accumulate extracted text by concatenating each page's text. Consider performance implications for large PDFs.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50%
    None
6. skyvern/forge/sdk/workflow/models/block.py:1620
  • Draft comment:
    FileParserBlock: When writing downloaded bytes to a temporary file, ensure proper file handling. Verify that create_named_temporary_file returns a file object with a 'write' method.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50%
    None
7. skyvern/forge/sdk/workflow/models/block.py:1740
  • Draft comment:
    SendEmailBlock: When constructing the email, confirm that sensitive SMTP credentials aren’t logged or exposed in error messages.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50%
    None
8. skyvern/forge/sdk/workflow/models/block.py:1124
  • Draft comment:
    Disabled CodeBlock raises DisabledBlockExecutionError. If not intended for production, consider removing or better documenting this dead code.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
9. skyvern/forge/sdk/workflow/models/block.py:1618
  • Draft comment:
    In _download_from_s3, the temporary file is written to but not explicitly closed. Consider using a context manager or closing the file to ensure resources are properly released.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
10. skyvern/forge/sdk/workflow/models/block.py:1445
  • Draft comment:
    In UploadToS3Block, a FileNotFoundError is raised immediately when the path does not exist. Consider handling missing or mis-parameterized paths more gracefully.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
11. skyvern/forge/sdk/workflow/models/block.py:1964
  • Draft comment:
    For PDFParserBlock, consider accumulating page texts in a list and then joining them, instead of using string concatenation in a loop, to improve performance for large PDFs.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
12. skyvern/forge/sdk/workflow/models/block.py:1319
  • Draft comment:
    In _upload_file_to_s3, if os.unlink(file_path) fails, it will be silent. Consider logging any errors during temporary file cleanup to aid debugging.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
13. skyvern/forge/agent.py:1024
  • Draft comment:
    Typo spotted: In the log message, "Continue excuting." should be corrected to "Continue executing."
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
14. skyvern/forge/agent.py:418
  • Draft comment:
    Typo spotted: The log message reads "Detecting incompleted download file, skip the rename". Consider changing "incompleted" to "incomplete" for correct wording.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_y8F0MDgz8aTvyuHj


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on fce6a35 in 2 minutes and 12 seconds

More details
  • Looked at 60 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 13 drafted comments based on config settings.
1. skyvern/forge/sdk/workflow/models/block.py:1534
  • Draft comment:
    In _decrypt_smtp_parameters, smtp_port remains a string. Consider converting to int before passing to smtplib.SMTP.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. skyvern/forge/sdk/workflow/models/block.py:1316
  • Draft comment:
    In _upload_file_to_s3, os.unlink(file_path) in finally block might throw if the file does not exist. Consider checking for its existence or handling potential exceptions.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
3. skyvern/forge/sdk/workflow/models/block.py:1617
  • Draft comment:
    In _download_from_s3 within SendEmailBlock, ensure that the temporary file is properly closed after writing to it.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
4. skyvern/forge/sdk/workflow/models/block.py:1315
  • Draft comment:
    Ensure that 'create_named_temporary_file' returns a file opened in binary mode – otherwise file_path.write(downloaded_bytes) may fail.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
5. skyvern/forge/sdk/workflow/models/block.py:1390
  • Draft comment:
    Consider using pathlib methods (e.g. Path.exists()) for file path checks and path joining to improve clarity in UploadToS3Block.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50%
    None
6. skyvern/forge/sdk/workflow/models/block.py:842
  • Draft comment:
    Rendering a loop variable as JSON via a Jinja template may be fragile; add error handling around json.loads to catch formatting issues.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
7. skyvern/forge/sdk/workflow/models/block.py:1616
  • Draft comment:
    Verify that the temporary file in _download_from_s3 is opened for binary writing and consider closing it explicitly after writing.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
8. skyvern/forge/sdk/workflow/models/block.py:1600
  • Draft comment:
    When checking file paths in _get_file_paths, ensure that URL/S3 URIs are correctly distinguished from local files to avoid false negatives.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50%
    None
9. skyvern/forge/agent.py:1837
  • Draft comment:
    Setting workflow_run_id to None in get_downloaded_files and save_downloaded_files calls affects sharing downloaded files across blocks. Please verify this aligns with the intended behavior.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
10. skyvern/forge/sdk/workflow/models/block.py:1700
  • Draft comment:
    Avoid catching generic exceptions in block execution methods; consider catching specific exceptions to improve debuggability.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
11. skyvern/forge/agent.py:1357
  • Draft comment:
    Minor typo: In the comment starting with "# HACK: try scrape_website three time to handle screenshot timeout", consider changing 'three time' to 'three times' for correct grammar.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
12. skyvern/forge/agent.py:465
  • Draft comment:
    Minor issue: In the TODO comment that reads "keep the task object uptodate at all times so that clean_up_task can just use it", consider updating 'uptodate' to 'up-to-date' for clarity.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
13. skyvern/forge/sdk/workflow/models/block.py:765
  • Draft comment:
    Typographical error: 'last_ouput' is misspelled. It should be corrected to 'last_output' for clarity.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_ZtydFnR52QOGGIeZ


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@suchintan suchintan merged commit 30d2bbd into main Feb 20, 2025
7 checks passed
@suchintan suchintan deleted the suchintan.fix-save-file-logic-cloud branch February 20, 2025 16:30
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants