Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Add exclude_signals to select_rows, and include_signals to export methods. #1139

Merged
merged 3 commits into from
Jan 26, 2024

Conversation

nsthorat
Copy link
Contributor

The motivation here is for export methods to quickly allow disabling signals for exporting.

  • select_rows now has exclude_signals, which automatically disables signals.
  • to_* now have include_signals, which by default removes signals. If you set this to true, you will get signals on the other side.

Also add a cluster sampling notebook that shows this workflow.

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@@ -2640,7 +2666,7 @@ def _column_to_duckdb_paths(
is_petal = schema.get_field(path).dtype is not None

# NOTE: The order of this array matters as we check the source and map manifests for fields
# before reading signal manifests, via source_or_map_has_path.
# before reading signal manifests, via source_or_map_has_path.s
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

length_signal = LengthSignal()
dataset.compute_signal(length_signal, 'text')

result = dataset.select_rows(combine_columns=True, exclude_signals=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a test where you select explicitly just a single sub-field of a signal (via columns, but exclude_signals=True)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@nsthorat nsthorat merged commit c7e7e28 into main Jan 26, 2024
4 checks passed
@nsthorat nsthorat deleted the nik-clustnote branch January 26, 2024 22:09
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants