feat: make DataSet initialization similar to pandas' #4864

Merged: 4 commits, Nov 27, 2024

Conversation

@ogabrielluiz ogabrielluiz (Contributor) commented Nov 26, 2024

This pull request introduces significant enhancements to the DataSet class in the langflow schema, including new methods for initialization and row addition, as well as comprehensive testing for these features. Below are the most important changes:

Enhancements to DataSet class:

  • Initialization Enhancements:

    • Added support for several input formats during initialization, including lists of Data objects, dictionaries, dictionaries of lists, and pandas DataFrames.
    • Introduced a custom __init__ method that handles these formats and converts the data appropriately.
  • Row Addition Methods:

    • Added add_row method to allow adding a single row to the dataset, supporting both Data objects and dictionaries.
    • Added add_rows method to allow adding multiple rows to the dataset, supporting lists of Data objects or dictionaries.
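
The behavior described above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the code from this PR: `Data` here is a hypothetical stand-in that only assumes a `data` dict attribute, `MiniDataSet` is an invented name, the pandas DataFrame input path is left out, and treating a single plain dictionary as one row is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class Data:
    """Hypothetical stand-in for a record wrapper; only a `data` dict is assumed."""

    data: dict[str, Any] = field(default_factory=dict)


class MiniDataSet:
    """Toy row container (illustrative only, not the DataSet class from this PR)."""

    def __init__(self, data: Any = None) -> None:
        self.rows: list[dict[str, Any]] = []
        if not data:  # None, {} or [] all yield an empty dataset
            return
        if isinstance(data, dict):
            if all(isinstance(v, list) for v in data.values()):
                # Dictionary of lists: each key is a column, each list its values.
                if len({len(v) for v in data.values()}) > 1:
                    raise ValueError("all columns must have the same length")
                keys = list(data)
                for values in zip(*data.values()):
                    self.rows.append(dict(zip(keys, values)))
            else:
                # Assumption: a plain dictionary is treated as a single row.
                self.add_row(data)
        elif isinstance(data, list):
            # List of Data objects and/or dictionaries.
            self.add_rows(data)
        else:
            raise TypeError(f"unsupported input type: {type(data).__name__}")

    def add_row(self, row: Data | dict[str, Any]) -> None:
        """Append one row, accepting either a Data object or a dictionary."""
        if isinstance(row, Data):
            self.rows.append(dict(row.data))
        elif isinstance(row, dict):
            self.rows.append(dict(row))
        else:
            raise TypeError(f"unsupported row type: {type(row).__name__}")

    def add_rows(self, rows: list[Data | dict[str, Any]]) -> None:
        """Append several rows by delegating to add_row."""
        for row in rows:
            self.add_row(row)


ds = MiniDataSet({"name": ["a", "b"], "age": [1, 2]})
ds.add_row(Data({"name": "c", "age": 3}))
ds.add_rows([{"name": "d", "age": 4}, Data({"name": "e", "age": 5})])
```

The sketch mutates the container in place for brevity; whether the real `add_row` and `add_rows` mutate or return a new object is not stated in this description.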

Testing Enhancements:

  • New Test Cases:
    • Added multiple test cases to verify the functionality of the new initialization methods with different data formats.
    • Added test cases for the add_row and add_rows methods to ensure correct behavior when adding rows with both dictionaries and Data objects.

Import and Export Enhancements:

  • Module Import Adjustments:

    • Updated __init__.py to include DataSet in the module exports, ensuring it is accessible when the module is imported.
  • Type Casting:

    • Added import for cast from typing to facilitate type casting within the DataSet class methods.
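
For readers unfamiliar with `typing.cast`, a small illustration follows. It is not taken from the PR; `first_row` is a hypothetical helper. The point is that `cast` only informs static type checkers and returns its argument unchanged at runtime, so it performs no validation or conversion.

```python
from __future__ import annotations

from typing import Any, cast


def first_row(rows: list[Any]) -> dict[str, Any]:
    """Hypothetical helper: tell the type checker the first element is a dict."""
    # cast() performs no runtime check; it simply returns rows[0] as-is.
    return cast(dict[str, Any], rows[0])


row = first_row([{"a": 1}])

# No exception is raised here: the value is still a str at runtime.
not_a_dict = cast(dict[str, Any], "oops")
```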

@dosubot dosubot bot added the size:L (This PR changes 100-499 lines, ignoring generated files.) and enhancement (New feature or request) labels Nov 26, 2024
@github-actions github-actions bot added and removed the enhancement (New feature or request) label Nov 26, 2024

codspeed-hq bot commented Nov 26, 2024

CodSpeed Performance Report

Merging #4864 will degrade performance by 28.65%

Comparing improve-ds-dx (15fc22e) with main (159f6e5)

Summary

❌ 3 regressions
✅ 12 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| Benchmark | main | improve-ds-dx | Change |
|---|---|---|---|
| test_successful_run_with_input_type_any | 234.8 ms | 324.6 ms | -27.67% |
| test_successful_run_with_input_type_text | 231 ms | 323.8 ms | -28.65% |
| test_successful_run_with_output_type_debug | 222.6 ms | 253.1 ms | -12.03% |
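
The Change column appears consistent with `main / branch - 1` rather than `(branch - main) / main`. This is inferred from the numbers in the report, not from CodSpeed's documentation, and the function name below is invented for illustration. A quick check under that assumption:

```python
def relative_change(main_ms: float, branch_ms: float) -> float:
    """Assumed formula: percentage change expressed as main / branch - 1."""
    return (main_ms / branch_ms - 1) * 100


# (main ms, branch ms, reported change %) copied from the report above.
benchmarks = {
    "test_successful_run_with_input_type_any": (234.8, 324.6, -27.67),
    "test_successful_run_with_input_type_text": (231.0, 323.8, -28.65),
    "test_successful_run_with_output_type_debug": (222.6, 253.1, -12.03),
}

for name, (main_ms, branch_ms, reported) in benchmarks.items():
    computed = relative_change(main_ms, branch_ms)
    # Small differences are expected because the displayed timings are rounded.
    print(f"{name}: computed {computed:.2f}%, reported {reported:.2f}%")
```

Under this reading, a -27.67% change means the first benchmark takes about 38% longer on the branch (324.6 ms against 234.8 ms).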

@dosubot dosubot bot added the lgtm (This PR has been approved by a maintainer) label Nov 26, 2024
… better data handling

- Added custom constructor to support various input formats including lists of Data objects, dictionaries, and existing DataFrames.
- Introduced methods `add_row` and `add_rows` for adding single or multiple rows to the DataSet.
- Updated docstrings and examples for clarity and usability.
- Ensured compatibility with pandas DataFrame operations while preserving Data object structures.
@ogabrielluiz ogabrielluiz enabled auto-merge (squash) November 26, 2024 23:55
@ogabrielluiz ogabrielluiz changed the title from "feat: add easier initialization to DataSet" to "feat: make DataSet initialization similar to pandas'" Nov 26, 2024
@ogabrielluiz ogabrielluiz merged commit 7e88a47 into main Nov 27, 2024
21 checks passed
@ogabrielluiz ogabrielluiz deleted the improve-ds-dx branch November 27, 2024 00:01