metaselection needs error handling, or screening beforehand #139

Closed
ardunn opened this issue Dec 5, 2018 · 0 comments
ardunn commented Dec 5, 2018

metaselector fails when it is passed a bad composition object. For example, try running metaselector on the mp_all dataset. It gives the following error:

Traceback (most recent call last):
  File "/global/scratch/ardunn/python/lib/python3.7/site-packages/fireworks/core/rocket.py", line 262, in run
    m_action = t.run_task(my_spec)
  File "/global/scratch/ardunn/codes/hmprivate/hmprivate/automatminer/benchmarking/tasks.py", line 72, in run_task
    predicted_test_df = pipe.benchmark(df, target, test_spec=0.2)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/utils/package_tools.py", line 76, in wrapper
    result = func(*args, **kwargs)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/pipeline.py", line 229, in benchmark
    df = self.autofeaturizer.fit_transform(df, target)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/utils/package_tools.py", line 76, in wrapper
    result = func(*args, **kwargs)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/featurization/core.py", line 321, in fit_transform
    return self.fit(df, target).transform(df, target, tidy_column=False)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/utils/package_tools.py", line 76, in wrapper
    result = func(*args, **kwargs)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/featurization/core.py", line 259, in fit
    self._customize_featurizers(df)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/featurization/core.py", line 364, in _customize_featurizers
    auto_exclude = self.metaselector.auto_excludes(df)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/featurization/metaselection/core.py", line 172, in auto_excludes
    self.dataset_mfs = dataset_metafeatures(df, **mfs_kwargs)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/featurization/metaselection/core.py", line 45, in dataset_metafeatures
    if mfs_func is not None else {})
  File "/global/scratch/ardunn/codes/automatminer/automatminer/featurization/metaselection/core.py", line 63, in _composition_metafeatures
    mfs[mf] = mf_class.calc(df[composition_col])
  File "/global/scratch/ardunn/codes/automatminer/automatminer/featurization/metaselection/metafeatures.py", line 171, in calc
    stats = composition_stats(X)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/featurization/metaselection/metafeatures.py", line 137, in composition_stats
    return _composition_stats(tuple(X.values))
  File "/global/scratch/ardunn/codes/automatminer/automatminer/featurization/metaselection/metafeatures.py", line 154, in _composition_stats
    stats = composition_statistics(X)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/featurization/metaselection/utils.py", line 23, in composition_statistics
    stats[idx] = _composition_summary(composition)
  File "/global/scratch/ardunn/codes/automatminer/automatminer/featurization/metaselection/utils.py", line 81, in _composition_summary
    c = Composition(composition)
  File "/global/scratch/ardunn/codes/pymatgen/pymatgen/core/composition.py", line 134, in __init__
    elmap = dict(*args, **kwargs)
TypeError: 'float' object is not iterable

This is basically saying it tried to do Composition(1.24) or something similar (i.e., the original df had a bad value in the composition column). So we need to figure out a good way to do error handling here, or to screen the data beforehand, to make sure metaselection is robust.
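
One possible shape for the "screening beforehand" option, as a rough sketch rather than a proposed patch: try to parse every entry up front, keep the positions that parse, and record why the others were rejected. The names `screen_compositions` and `parse` are hypothetical and do not exist in automatminer. The default `parse=dict` is only a stand-in that fails on a bare float the same way `elmap = dict(*args, **kwargs)` does in the traceback; in practice one would presumably pass pymatgen's `Composition` instead.

```python
from typing import Any, Callable, Dict, List, Tuple


def screen_compositions(
    values: List[Any],
    parse: Callable[[Any], Any] = dict,
) -> Tuple[List[int], Dict[int, str]]:
    """Split row positions into parseable and unparseable entries.

    `parse` is a hypothetical hook for the real constructor; the default
    `dict` is an assumption used only so the sketch runs without pymatgen.
    """
    good: List[int] = []
    bad: Dict[int, str] = {}
    for i, value in enumerate(values):
        try:
            parse(value)
        except (TypeError, ValueError) as exc:
            # Record why the row was rejected instead of aborting the run.
            bad[i] = f"{type(exc).__name__}: {exc}"
        else:
            good.append(i)
    return good, bad


# Two valid-looking entries and two stray floats, as in a dirty df column.
rows = [{"Fe": 2, "O": 3}, 1.24, {"Na": 1, "Cl": 1}, float("nan")]
good, bad = screen_compositions(rows)
print(good)
print(bad)
```

The caller could then drop or log the rejected rows before computing metafeatures. Whether to drop silently, warn, or raise a clearer error is the design question this issue leaves open.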
