Cmiles options #535

jthorton · 2020-03-04T11:20:07Z

Fix Expand to_smiles API #534
Add tests
Update docstrings/documentation, if applicable
Update changelog

This PR will expand the to_smiles API options to allow the creation of cmiles identifiers.

codecov-io · 2020-03-04T11:54:55Z

Codecov Report

Merging #535 into master will increase coverage by 0.01%.
The diff coverage is 96.87%.

jthorton · 2020-03-09T17:37:30Z

To generate the unique index for a molecule that will be submitted for torsion driving, I also need a way to generate a mapped SMILES for just the atoms involved in the torsion, as the current method only generates the mapped SMILES for the whole molecule.
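
As a rough illustration of what mapping only the torsion atoms could look like, the sketch below builds a dictionary from atom index to map index for the four atoms of a torsion. The helper name `torsion_atom_map`, the example indices, and the dictionary shape (atom index as key, map index as value) are assumptions made for illustration and are not taken from the PR diff.

```python
from typing import Dict, Sequence


def torsion_atom_map(torsion: Sequence[int], start: int = 1) -> Dict[int, int]:
    """Hypothetical helper: map each torsion atom index to a map index.

    Atoms that are not part of the torsion are left out of the dictionary,
    so only the listed atoms would carry a map number in the SMILES.
    """
    if len(set(torsion)) != len(torsion):
        raise ValueError("torsion atom indices must be unique")
    return {atom_index: start + position
            for position, atom_index in enumerate(torsion)}


# Illustrative torsion over atoms 4, 0, 1 and 2 of some molecule.
atom_map = torsion_atom_map((4, 0, 1, 2))
print(atom_map)

# Assumption based on the review discussion: the dictionary is stored in the
# molecule's properties and picked up when a mapped SMILES is requested, e.g.
#   molecule.properties['atom_map'] = atom_map
#   molecule.to_smiles(mapped=True)
```

The `start` argument is there only because the thread leaves open whether map indices begin at 0 or 1.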

jthorton · 2020-03-12T12:02:00Z

In testing with the automated QCArchive submission pipeline, I found that canonical ordering would often fail on molecules with undefined stereochemistry, even after the user had allowed undefined stereo, so I have fixed this issue as well.

I have also removed the automatic ordering of a molecule in the to_qcschema method, as it made it difficult to generate the cmiles information in the correct order. The method now returns a schema representation of exactly the molecule it is called on. To get the molecule in a canonical order for the schema, the canonical ordering method must be called first, as in the following example, which will be part of the automated submission pipeline:

from openforcefield.topology import Molecule

ethanol = Molecule.from_smiles('CCO')
ethanol.generate_conformers()

# put the atoms into canonical order
ordered_ethanol = ethanol.canonical_order_atoms()

# now make the schema and the cmiles tags for this ordering
schema = ordered_ethanol.to_qcschema()
mapped_hydrogen_smiles = ordered_ethanol.to_smiles(mapped=True)

j-wags

Looks pretty good. I requested a few changes, mostly spelling fixes plus some larger requests. Overall I'm excited to bring this functionality in.

I think we're doing the right thing with the atom mapping -- We support it for the general case (all atoms mapped), and have semi-hidden functionality for our "expert case" while we work on our final spec for atom maps. I'd just add the "hidden functionality" to the to_smiles docstrings for now (basically, a quick primer on how offmol.properties['atom_map'] is interpreted if present).

And, of course, update the release notes :-)

openforcefield/tests/test_molecule.py

j-wags · 2020-03-19T19:59:01Z

+
+        if toolkit_class.is_available():
+            toolkit = toolkit_class()
+            mol = Molecule.from_smiles(data['molecule_input'], toolkit_registry=toolkit)


(Blocking) Generating the molecules from SMILES gives this test two ways to fail. Either:

1. The test actually fails for some reason that we're testing for

2. RDKitToolkitWrapper.from_smiles indexes molecules differently one day, and we have to do a lot of work overhauling all of the tests that include it

I'd like to reduce the number of tests that could hit option 2. So, I know it's a pain, but could we make the molecules for this test using the Molecule API, to ensure they use a fixed indexing system?

I understand what you mean, but I am not sure that would fix the problem: if the canonicalisation method changes, all tests will fail regardless of input ordering and the expected results will have to be updated. I think this is always going to be a problem when explicitly testing the output SMILES strings. With a little more work I think I can be clever about the testing and get around this. I'll draft it up and see what you think.

I like your new solution. API-produced molecules are ugly to implement, but will save us a ton of time in the long run.

j-wags

Looks great, Josh. I like the changes you made, and think this is good to merge. The only thing I might still change is to make the manually provided atom map always start indexing at 1. That said, I haven't thought about this as much as you have, so feel free to go forward as-is if there's an underlying reason that it might start at 0.

jthorton · 2020-03-31T08:36:02Z

Thanks Jeff, I'll get it merged in. Yeah, I know what you mean, that is an awkward point. For now the maps can start from 0 or 1, but if we finish the spec and say 1 only, we can make that change. My other thought is consistency with the atom map returned from Molecule.are_isomorphic, which starts from 0, so if we do change this we should change all of these functions to make sure they take the same types of atom maps.
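
To make the 0-versus-1 question concrete, here is a minimal sketch of how a map that starts at 0 could be shifted to start at 1 if the spec ends up requiring that. The helper `to_one_based` is hypothetical and is not part of the toolkit; it only assumes that an atom map is a dictionary from atom index to map index.

```python
from typing import Dict


def to_one_based(atom_map: Dict[int, int]) -> Dict[int, int]:
    """Hypothetical helper: return a copy whose map indices start at 1.

    A map whose smallest map index is 0 is shifted up by one; a map that
    already starts at 1 (or higher) is returned unchanged.
    """
    if not atom_map:
        return {}
    lowest = min(atom_map.values())
    if lowest < 0:
        raise ValueError("map indices must be non-negative")
    shift = 1 if lowest == 0 else 0
    return {atom: index + shift for atom, index in atom_map.items()}


zero_based = {0: 0, 1: 1, 2: 2}  # 0-based, like the are_isomorphic map described above
one_based = {0: 1, 1: 2, 2: 3}   # 1-based, as suggested in the review

print(to_one_based(zero_based))
print(to_one_based(one_based))
```

Accepting both conventions and normalising in one place would keep callers consistent, at the cost of not being able to tell a deliberate 0-based map from a mistake.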

joshhorton added 6 commits March 2, 2020 17:33

smiles api changes

af71f7f

Merge branch 'master' of https://github.com/openforcefield/openforcef…

d1bf825

…ield into more_smiles_options Conflicts: openforcefield/topology/molecule.py openforcefield/utils/toolkits.py

cache change

5848cff

suggested changes added.

920a94b

Merge branch 'master' of https://github.com/openforcefield/openforcef…

1dfd3de

…ield into more_smiles_options

changed isometric to not require explicit stereochemistry.

0a5febb

added the ability to use atom_maps to create smiles mapped to only sp…

dc7e124

…ecific atoms.

jthorton requested a review from j-wags March 11, 2020 20:57

fixed undefined stereo error when putting a molecule into canonical o…

7eae8f0

…rder. Also removed the ordering operation from the to_qcschema method.

j-wags requested changes Mar 20, 2020

JoshHorton added 2 commits March 21, 2020 15:45

Updated the release history, changed tests to not use explict smiles …

95c2483

…matching, added atom map description to doc strings.

Merge branch 'master' of https://github.com/openforcefield/openforcef…

4fc67d1

…ield into more_smiles_options

jthorton requested a review from j-wags March 26, 2020 18:23

j-wags approved these changes Mar 30, 2020

Merge branch 'master' of https://github.com/openforcefield/openforcef…

7d0f92a

…ield into more_smiles_options

jthorton merged commit af7c2f8 into master Mar 31, 2020

jthorton deleted the more_smiles_options branch March 31, 2020 16:05
