[INF] Merlin commons - Shapes #813

viswa-nvidia · 2023-02-15T17:15:15Z

Use values/offsets everywhere (and tuples nowhere)
Change dataloader output from 2-d scalars to 1-d (pulling that code from Merlin Models, Systems)
Capture shapes everywhere #823
Confusing list terminology (and possible misuse of one term or the other)

Tasks

Change dataloader to output 1D tensors for scalar features dataloader#105
Change Merlin Models library to expect scalar features to be 1D and list features to be a dict instead of a tuple #819
Change T4Rec library to expect scalar features to be 1D and list features to be a dict instead of a tuple #820
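
As a rough illustration of what these tasks ask for, here is a minimal NumPy sketch. It is not code from the dataloader, Merlin Models, or T4Rec. The helper names, the example feature names, and the `__values` / `__offsets` key suffixes are assumptions made for the example only.

```python
import numpy as np


def squeeze_scalar(column: np.ndarray) -> np.ndarray:
    """Turn a (batch, 1) scalar column into shape (batch,); leave other shapes alone."""
    if column.ndim == 2 and column.shape[1] == 1:
        return column.reshape(-1)
    return column


def list_tuple_to_dict(name, feature):
    """Convert a (values, row_lengths) tuple into a dict of values and offsets.

    The "__values" / "__offsets" suffixes are illustrative assumptions.
    """
    values, row_lengths = feature
    row_lengths = np.asarray(row_lengths).reshape(-1)
    # Offsets are the running total of row lengths, with a leading zero.
    offsets = np.concatenate([[0], np.cumsum(row_lengths)])
    return {
        f"{name}__values": np.asarray(values).reshape(-1),
        f"{name}__offsets": offsets,
    }


# Hypothetical batch: one 2-D scalar feature and one list feature as a tuple.
batch = {
    "user_age": np.array([[23], [41], [35]]),
    "item_ids": (np.array([10, 11, 12, 13, 14, 15]), np.array([2, 1, 3])),
}

converted = {}
for name, feature in batch.items():
    if isinstance(feature, tuple):
        converted.update(list_tuple_to_dict(name, feature))
    else:
        converted[name] = squeeze_scalar(feature)
```

After conversion every entry is a plain 1D array, and row `i` of the list feature is `values[offsets[i]:offsets[i + 1]]`.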

oliverholworthy · 2023-02-28T17:48:47Z

I've been looking at the last point, *Confusing list terminology (and possible misuse of one term or the other)*, from the dataloader perspective, which is where most of this confusing terminology shows up.

The dataloader currently uses the dataset schema value_count (now set by the shape) to determine the output type. This has caused problems in #819 because setting the shape (value_count) changes the output type from a ragged representation (a tuple of values and row_lengths) to a sparse tensor, which is unsupported as an input type to the model. We can work around this by removing the shape (value_count) from the schema of the dataset when using the dataloader from Merlin Models.
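
To make the difference between the two output types concrete, here is a small NumPy-only sketch of the same list column in both forms. It does not use the dataloader or any framework tensor type. The COO-style layout (indices, values, dense shape) is the usual way a sparse tensor is described, and taking the dense shape from the longest row is an assumption for this example.

```python
import numpy as np

# Hypothetical list feature with three rows of different lengths.
rows = [[10, 11], [12], [13, 14, 15]]

# Ragged representation: flat values plus the number of elements in each row.
values = np.array([v for row in rows for v in row])
row_lengths = np.array([len(row) for row in rows])

# COO-style sparse representation: one (row, col) index per value
# plus the shape of the dense matrix the values would fill.
row_ids = np.repeat(np.arange(len(rows)), row_lengths)
row_starts = np.cumsum(row_lengths) - row_lengths
col_ids = np.arange(len(values)) - np.repeat(row_starts, row_lengths)
indices = np.stack([row_ids, col_ids], axis=1)
dense_shape = (len(rows), int(row_lengths.max()))
```

Both forms carry the same values, but a model input layer written for the `(values, row_lengths)` tuple cannot consume `(indices, values, dense_shape)` without a conversion step.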

The more general fix being worked toward is to remove the relationship between the schema shape and the output representation of list features, leaving it to operators to transform the representation if required.
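
As a sketch of what such a transform could look like, assume the dataloader always emits flat values and offsets. The function below is hypothetical (it is not an existing Merlin operator) and pads or truncates each row to a fixed length only when a caller asks for a dense output.

```python
import numpy as np


def pad_to_dense(values, offsets, max_length, pad_value=0):
    """Hypothetical transform: ragged (values, offsets) -> dense (rows, max_length).

    Rows shorter than max_length are right-padded with pad_value,
    and rows longer than max_length are truncated.
    """
    values = np.asarray(values)
    offsets = np.asarray(offsets)
    row_lengths = np.diff(offsets)
    dense = np.full((len(row_lengths), max_length), pad_value, dtype=values.dtype)
    for i, length in enumerate(row_lengths):
        n = min(int(length), max_length)
        start = int(offsets[i])
        dense[i, :n] = values[start:start + n]
    return dense


padded = pad_to_dense([10, 11, 12, 13, 14, 15], [0, 2, 3, 6], max_length=4)
truncated = pad_to_dense([10, 11, 12, 13, 14, 15], [0, 2, 3, 6], max_length=2)
```

With this split, the schema shape only describes the data, and the choice of ragged or dense output is made explicitly by whoever applies the transform.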

This would also allow us to be closer to removing the wrapper classes in Transformers4Rec and Merlin Models that work around the interface to the dataloader and change the schema with confusing terminology in the names of arguments.

viswa-nvidia mentioned this issue Feb 15, 2023

[INF] Merlin Commons #776

Open

11 tasks

viswa-nvidia assigned karlhigley, rnyak, bschifferer and gabrielspmoreira Feb 15, 2023

viswa-nvidia added this to the Merlin 23.03 milestone Feb 28, 2023

oliverholworthy mentioned this issue Mar 14, 2023

Remove sparse tensor output type for list features NVIDIA-Merlin/dataloader#103

Merged

karlhigley closed this as completed Mar 22, 2023
