Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

refactor to rdFingerprintGenerator #226

Merged
merged 6 commits into from
Jun 10, 2024
Merged

refactor to rdFingerprintGenerator #226

merged 6 commits into from
Jun 10, 2024

Conversation

zhu0619
Copy link
Contributor

@zhu0619 zhu0619 commented May 28, 2024

Changelogs

This PR addresses the fingerprint deprecation warnings, see issue #225.

  • Refactoring fingerprint atom pair, rdkit, morgan, topological, and their count fingerprints to FingerprintGenerator.
  • ecfp-count now use the ECFP-6, to stay consistent.
  • added new decorator function dm.no_rdkit_log which disable RDKit log for a whole function.
  • set minimum rdkit version to avoid pandas based error for this version of datamol.

Checklist:

  • Was this PR discussed in an issue? It is recommended to first discuss a new feature into a GitHub issue before opening a PR.
  • Add tests to cover the fixed bug(s) or the new introduced feature(s) (if appropriate).
  • Update the API documentation is a new function is added, or an existing one is deleted.
  • Write concise and explanatory changelogs below.
  • If possible, assign one of the following labels to the PR: feature, fix or test (or ask a maintainer to do it for you).

The rdFingerprintGenerator.GetCountFingerprint can be used to compute the count based fingerprint.
The argument countSimulationindicates whether to compute the count fingerprint by bites simulation for computing efficiency. See more in this blog.
I assume in datamol, we want to compute the exact count-based fingerprint. Therefore, countSimulation should be set to False.

@zhu0619 zhu0619 added the enhancement New feature or request label May 28, 2024
@zhu0619 zhu0619 requested a review from maclandrol May 28, 2024 18:05
@zhu0619 zhu0619 marked this pull request as draft May 28, 2024 18:06
@zhu0619 zhu0619 linked an issue May 28, 2024 that may be closed by this pull request
Copy link
Member

@maclandrol maclandrol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's still draft, I can wait.

datamol/fp.py Show resolved Hide resolved
datamol/fp.py Show resolved Hide resolved
@zhu0619
Copy link
Contributor Author

zhu0619 commented May 28, 2024

Since it's still draft, I can wait.

The tests are failing, probably due to the different set arguments in the fingerprint generators.
@maclandrol I need your help with revising default parameters.
And I think the values in the tests for fingerprints need to be updated.

@zhu0619 zhu0619 marked this pull request as ready for review May 28, 2024 19:30
@maclandrol
Copy link
Member

Since it's still draft, I can wait.

The tests are failing, probably due to the different set arguments in the fingerprint generators. @maclandrol I need your help with revising default parameters. And I think the values in the tests for fingerprints need to be updated.

Ok, will do.

@maclandrol
Copy link
Member

@zhu0619, the default parameters were wrong. They are fixed now.

Still have to address this: danielfrg/mkdocs-jupyter#200

Ideally in the next release, we should also have a minimum version for rdkit. As the pandas error still exists.

@maclandrol
Copy link
Member

I also replaced seaborn by pure matplotllib in one of the notebook. Was getting an error and not sure how.

Feel free to merge anytime.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix deprecation warning for dm.to_fp
2 participants