Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Maintain lookup table value column order #521

Conversation

stevebachmeier
Copy link
Contributor

Maintain LookupTable value_columns order

Description

Changes and notes

Testing

added a new test and it passes

@property
def standard_lookup_tables(self) -> List[str]:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it didn't seem to be using this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember what this was for but Rajan and I had tried to add some feature like this but I think this is deprecated now.

@stevebachmeier stevebachmeier marked this pull request as ready for review October 29, 2024 23:49
@stevebachmeier stevebachmeier force-pushed the sbachmei/mic-5205/bugfix-lookup-table-non-deterministic-list branch from e8d49c1 to 0f74c92 Compare October 29, 2024 23:52
@@ -610,31 +610,38 @@ def build_lookup_table(
if isinstance(data, list):
return builder.lookup.build_table(data, value_columns=list(value_columns))
if isinstance(data, pd.DataFrame):
all_columns = set(data.columns)
duplicated_columns = set(data.columns[data.columns.duplicated()])
if duplicated_columns:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

col
for col in all_columns
if col not in value_columns and col not in bin_edge_columns
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't lists not ordered? So if weh ave a different seed they could be in a different order?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also say make all this logic for getting the different columns a helper function/

Copy link
Contributor Author

@stevebachmeier stevebachmeier Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lists are ordered. Since all_columns is a list (ordered) of the data.columns (ordered), and we're building this up using list comprehension (col for col in all_columns...), it will build up the list in the same order as all_columns

key_columns: Union[List[str], Tuple[str, ...]],
parameter_columns: Union[List[str], Tuple[str, ...]],
value_columns: Union[List[str], Tuple[str, ...]],
key_columns: Sequence[str],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused how this is a Sequence if you originally defined it as [] but I haven't really used Sequence.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ultimately it can be any sequence (i.e. list or tuple). But it's kinda weird b/c as you said we instantiate it specifically as a list. But we were already saying Union[list, tuple] anyway! So I dunno - I think we're just a little inconsistent here in general

@property
def standard_lookup_tables(self) -> List[str]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember what this was for but Rajan and I had tried to add some feature like this but I think this is deprecated now.

@stevebachmeier stevebachmeier requested a review from albrja October 30, 2024 18:27
)

return builder.lookup.build_table(data)

def _get_columns(
self, value_columns: Optional[Sequence[str]], data: float | pd.DataFrame
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't use Optional

@stevebachmeier stevebachmeier merged commit 11b19a8 into main Oct 31, 2024
6 checks passed
@stevebachmeier stevebachmeier deleted the sbachmei/mic-5205/bugfix-lookup-table-non-deterministic-list branch October 31, 2024 23:14
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants