Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Fix the output type of Padding op #145

Merged
merged 1 commit into from
May 10, 2023
Merged

Conversation

sararb
Copy link
Contributor

@sararb sararb commented May 10, 2023

Goals ⚽

  • The Padding operation currently converts the values of all input columns to float64. However, we want to maintain the original dtypes (essential for embedding lookup of categorical features, for example).
  • The existing padding operation does not take the padding_value into consideration, as it always uses zero-padding.

Implementation Details 🚧

  • Updated the padding function pad_put_zeros to preserve the original column type.
  • Modified pad_put_zeros to consider the padding_value
  • Revised the docstrings for the Padding class.

Testing Details 🔍

  • Added a unit test for the data loader with ragged embeddings and the transform: Padding() >> Embeddings().

@sararb sararb added the enhancement New feature or request label May 10, 2023
@sararb sararb added this to the Merlin 23.05 milestone May 10, 2023
@sararb sararb self-assigned this May 10, 2023
@sararb sararb changed the title [DRAFT] Fix the output type of Padding op Fix the output type of Padding op May 10, 2023
@karlhigley karlhigley merged commit a2cc7ee into main May 10, 2023
Copy link
Contributor

@karlhigley karlhigley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix!

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants