Errors in the encoding of some bits #70

Open
salvogalati opened this issue Nov 9, 2022 · 1 comment

Comments

@salvogalati

Hi,
I was checking the code behind the PubChem fingerprint generation.
I compared fingerprints calculated with your code against those calculated with PyFingerprint, which uses the CDK library, and noticed some differences.
The first point concerns the bits in the range 0-98: SMARTS patterns are not used there, so when carbons are counted, for example, only aliphatic carbons are considered, because the corresponding key is C. Aromatic carbons are missed.
As a result, both the counts and the encoded bits are incorrect.
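A minimal sketch of this first point, not the project's actual code: it assumes atoms are tallied by their SMILES symbol, where aromatic atoms are written in lowercase. The names `atom_symbols` and `count_elements` are hypothetical, and the tokenizer only covers common SMILES and skips anything it cannot parse (such as `[*]`).

```python
import re
from collections import Counter

# One atom token in a SMILES string: a bracket atom, a two-letter
# organic-subset halogen, or a one-letter aliphatic/aromatic symbol.
ATOM_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]")
# Inside brackets: optional isotope digits, then the element symbol.
BRACKET_SYMBOL = re.compile(r"\[\d*(se|as|[bcnops]|[A-Z][a-z]?)")


def atom_symbols(smiles):
    """Return the atom symbols as written in the SMILES (case preserved)."""
    symbols = []
    for token in ATOM_TOKEN.findall(smiles):
        if token.startswith("["):
            match = BRACKET_SYMBOL.match(token)
            if match is None:  # e.g. wildcard atoms; skipped in this sketch
                continue
            token = match.group(1)
        symbols.append(token)
    return symbols


def count_elements(smiles, merge_aromatic=True):
    """Count atoms per element.

    With merge_aromatic=False the key "C" only sees aliphatic carbons,
    which reproduces the undercount described above.
    """
    symbols = atom_symbols(smiles)
    if merge_aromatic:
        # "c" -> "C", "se" -> "Se"; "Cl" and "Br" are unchanged.
        symbols = [s.capitalize() for s in symbols]
    return Counter(symbols)


toluene = "Cc1ccccc1"
print(count_elements(toluene, merge_aromatic=False)["C"])  # 1
print(count_elements(toluene)["C"])                        # 7
```

For toluene, a threshold such as ">= 4 C" would be missed with the case-sensitive key and met once aromatic carbons are folded in. If SMARTS were used instead, the pattern `[#6]` matches both aliphatic and aromatic carbon, whereas `C` matches aliphatic carbon only.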
The second point concerns the bits in the range 115-231: here two conditions must be met at the same time, one on the ring size and one on the ring composition. For example, bits 116 and 117 are defined as ">= 1 saturated or aromatic carbon-only ring size 3" and ">= 1 saturated or aromatic nitrogen-containing ring size 3", respectively. A cyclopropane ring should therefore set bit 116 but not bit 117. With your code, however, both bits are set.
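The expected behaviour for these two bits can be sketched as follows. This is an illustration under stated assumptions, not the project's or CDK's implementation: ring perception is assumed to have happened elsewhere, and the `Ring` class and both helper functions are hypothetical names. Only the two bit definitions quoted above are encoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ring:
    """Hypothetical, already-perceived ring."""
    elements: tuple              # element symbols of the ring atoms
    saturated_or_aromatic: bool  # False for unsaturated non-aromatic rings


def has_carbon_only_ring(rings, size):
    """>= 1 saturated or aromatic carbon-only ring of the given size."""
    return any(
        len(r.elements) == size
        and r.saturated_or_aromatic
        and all(e == "C" for e in r.elements)
        for r in rings
    )


def has_nitrogen_ring(rings, size):
    """>= 1 saturated or aromatic nitrogen-containing ring of the given size."""
    return any(
        len(r.elements) == size
        and r.saturated_or_aromatic
        and "N" in r.elements
        for r in rings
    )


def size3_bits(rings):
    # Both the size condition and the composition condition must hold.
    return {
        116: has_carbon_only_ring(rings, 3),
        117: has_nitrogen_ring(rings, 3),
    }


cyclopropane = [Ring(("C", "C", "C"), True)]
aziridine = [Ring(("C", "C", "N"), True)]

print(size3_bits(cyclopropane))  # {116: True, 117: False}
print(size3_bits(aziridine))     # {116: False, 117: True}
```

Checking only the ring size, without the composition test, would set both bits for cyclopropane, which matches the symptom reported here.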

I hope the bugs I reported can be corrected; otherwise, I would be glad to have an explanation of my mistake.

Thank you for your help,
Salvatore

@nbehrnd

nbehrnd commented Nov 9, 2022 via email
