[INF] Merlin commons - Shapes #813

Closed
6 of 7 tasks
Tracked by #776
viswa-nvidia opened this issue Feb 15, 2023 · 1 comment

viswa-nvidia commented Feb 15, 2023

  • Use values/offsets everywhere (and tuples nowhere)
  • Change dataloader output from 2-d scalars to 1-d (pulling that code from Merlin Models, Systems)
  • Capture shapes everywhere #823
  • Confusing list terminology (and possibly misusage of one or the other)
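
To make the first two bullets concrete, here is a minimal pure-Python sketch of a values/offsets encoding for a list column and of flattening a 2-d scalar column to 1-d. The helper names, the `__values`/`__offsets` key suffixes, and the column names are assumptions made for illustration; this is not the dataloader's actual code.

```python
def to_values_offsets(rows):
    """Encode a list of variable-length rows as flat values plus offsets."""
    values, offsets = [], [0]
    for row in rows:
        values.extend(row)
        offsets.append(len(values))  # offsets[i + 1] marks the end of row i
    return values, offsets


def from_values_offsets(values, offsets):
    """Rebuild the rows from the flat values/offsets encoding."""
    return [values[start:end] for start, end in zip(offsets[:-1], offsets[1:])]


def squeeze_scalar_column(column):
    """Turn a 2-d scalar column such as [[31], [45]] into a 1-d one: [31, 45]."""
    flat = []
    for row in column:
        if len(row) != 1:
            raise ValueError("expected exactly one value per row")
        flat.append(row[0])
    return flat


rows = [[1, 2], [], [3, 4, 5]]
values, offsets = to_values_offsets(rows)

# Hypothetical batch layout: two flat entries per list column instead of a tuple.
batch = {
    "item_ids__values": values,
    "item_ids__offsets": offsets,
    "user_age": squeeze_scalar_column([[31], [45], [27]]),
}
```

With this layout every entry in the batch is a single flat sequence, so downstream code does not have to special-case tuple-valued entries.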


@oliverholworthy
Member

I've been looking at the last point, *Confusing list terminology (and possibly misusage of one or the other)*, from the dataloader perspective, which is where most of this confusing terminology shows up.

The dataloader currently uses the dataset schema's value_count (now set by the shape) to determine the output type. This caused problems in #819: setting the shape (value_count) changes the output type from a ragged representation (a tuple of values and row_lengths) to a sparse tensor, which is unsupported as an input type to the model. We can work around this by removing the shape (value_count) from the schema of the dataset when using the dataloader from Merlin Models.
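
For readers unfamiliar with the two representations, the sketch below shows the same list feature as a ragged pair (values and row_lengths) and as a COO-style sparse triple (indices, values, dense shape). It is plain Python written for illustration; the function name and the `num_cols` parameter are assumptions, and it does not reproduce the dataloader's conversion code.

```python
def ragged_to_coo(values, row_lengths, num_cols):
    """Convert (values, row_lengths) into COO-style (indices, values, dense_shape)."""
    if sum(row_lengths) != len(values):
        raise ValueError("row_lengths must sum to len(values)")
    indices = []
    for row, length in enumerate(row_lengths):
        if length > num_cols:
            raise ValueError("row is longer than the fixed number of columns")
        # Each value in the row gets an explicit (row, column) coordinate.
        indices.extend((row, col) for col in range(length))
    dense_shape = (len(row_lengths), num_cols)
    return indices, list(values), dense_shape


# Ragged form: three rows of lengths 2, 0 and 3.
ragged_values = [1, 2, 3, 4, 5]
row_lengths = [2, 0, 3]

# Sparse form: the same data, but now tied to a fixed dense shape.
indices, sparse_values, dense_shape = ragged_to_coo(ragged_values, row_lengths, num_cols=3)
```

The sparse form needs a fixed second dimension, which is roughly why a shape or value_count in the schema can end up selecting it; a model that expects the ragged pair cannot consume the triple directly.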

The more general fix being worked toward is to remove the relationship between the schema shape and the output representation of list features, leaving it to operators to transform the representation if required.
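
One way to picture this decoupling: the loader always emits the ragged form, and a separate, optional step converts it when a model needs fixed-width input. The sketch below is a guess at the shape of such a step, not an existing Merlin operator; `pad_ragged`, `load_batch`, and the batch keys are hypothetical names.

```python
def pad_ragged(values, offsets, max_len, pad_value=0):
    """Pad or truncate each ragged row to max_len, returning a dense 2-d list."""
    dense = []
    for start, end in zip(offsets[:-1], offsets[1:]):
        row = list(values[start:end][:max_len])          # truncate long rows
        row.extend([pad_value] * (max_len - len(row)))   # pad short rows
        dense.append(row)
    return dense


def load_batch():
    """Stand-in for a dataloader that always returns the ragged representation."""
    return {
        "item_ids__values": [1, 2, 3, 4, 5],
        "item_ids__offsets": [0, 2, 2, 5],
    }


batch = load_batch()

# The conversion is an explicit, opt-in step rather than a side effect of the schema shape.
dense_item_ids = pad_ragged(
    batch["item_ids__values"], batch["item_ids__offsets"], max_len=2
)
```

Under this arrangement, setting a shape in the schema would describe the data without changing what the loader returns.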

This would also bring us closer to removing the wrapper classes in Transformers4Rec and Merlin Models that work around the interface to the dataloader and change the schema with confusing terminology in the names of arguments.


6 participants