openbabel processing error with crossdocked data #52

Open
SanazKaz opened this issue Feb 18, 2025 · 1 comment

Hi all,

I'm currently trying to pre-process the data using process_crossdock.py.

I receive this output: #failed: 89: 100%|██████████| 100000/100000 [29:33<00:00, 56.39it/s]

I have two questions:

  1. Is it expected to have 89 failed files? I thought all 100,000 would process without errors.

  2. While the script was processing the files for training, I received the error below.

  • There are also quite a few Open Babel warnings while the files are being processed.

*** Open Babel Warning in PerceiveBondOrders
Failed to kekulize aromatic bonds in OBMol::PerceiveBondOrders (title is /tmp/tmpn058zqxm)

[16:10:15] Unexpected error hit on line 60
[16:10:15] ERROR: moving to the beginning of the next molecule

7352/7365 successful: 7%|▋ | 7365/99911 [00:20<04:11, 368.12it/s]
Traceback (most recent call last):
  File "/data/stat-cadd/wolf7055/diffsbdd-ppo/process_crossdock.py", line 422, in <module>
    train_smiles = compute_smiles(lig_coords, lig_one_hot, lig_mask)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/stat-cadd/wolf7055/diffsbdd-ppo/process_crossdock.py", line 138, in compute_smiles
    mol = build_molecule(pos, atom_type, dataset_info)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/stat-cadd/wolf7055/diffsbdd-ppo/analysis/molecule_builder.py", line 154, in build_molecule
    mol = make_mol_openbabel(positions, atom_types,
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/stat-cadd/wolf7055/diffsbdd-ppo/analysis/molecule_builder.py", line 90, in make_mol_openbabel
    for atom in tmp_mol.GetAtoms():
                ^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'GetAtoms'
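
The AttributeError above means that `tmp_mol` was `None` when `make_mol_openbabel` tried to iterate over its atoms. RDKit's file readers return `None` rather than raising when a molecule cannot be parsed, so a single unparseable ligand is presumably enough to stop the whole run. Below is a minimal sketch of a caller-side guard that counts such failures and moves on. `collect_smiles` and `toy_builder` are hypothetical names standing in for the loop around `build_molecule`; they are not taken from the repository.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def collect_smiles(
    items: Iterable[str],
    build: Callable[[str], Optional[str]],
) -> Tuple[List[str], int]:
    """Apply `build` to every item, skipping and counting failures.

    A failure is either a `None` result or an AttributeError raised
    inside `build` (the symptom seen in the traceback).
    """
    smiles: List[str] = []
    n_failed = 0
    for item in items:
        try:
            result = build(item)
        except AttributeError:
            # e.g. something inside the builder called a method on None
            result = None
        if result is None:
            n_failed += 1
            continue
        smiles.append(result)
    return smiles, n_failed


def toy_builder(item: str) -> Optional[str]:
    """Hypothetical stand-in for a molecule builder (assumption, not real API)."""
    if item == "unparseable":
        return None  # mimics a reader that returns None on bad input
    if item == "crashes":
        getattr(None, "GetAtoms")  # raises AttributeError, as in the traceback
    return item.upper()
```

Calling `collect_smiles(["cco", "unparseable", "crashes"], toy_builder)` keeps the one valid entry and reports two failures, so the run continues instead of aborting on the first bad molecule.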


Thanks in advance!


SanazKaz commented Feb 20, 2025

The error comes from an incorrect dict in process_crossdock.py; see below for the fix.

  • This still leaves 89 failed molecules due to sanitisation, but at least the script now runs.
def process_ligand_and_pocket(...):
    ...
        try:
            pocket_one_hot = []
            for a in full_atoms:
                if a in atom_dict:  # previously: amino_acid_dict
                    # one-hot row with a 1 at this element's index
                    atom = np.eye(1, len(atom_dict),
                                  atom_dict[a.capitalize()]).squeeze()
                elif a != 'H':
                    # unknown non-hydrogen element: the offset lies past
                    # the last column, so the row is all zeros
                    atom = np.eye(1, len(atom_dict),
                                  len(atom_dict)).squeeze()
                pocket_one_hot.append(atom)
            pocket_one_hot = np.stack(pocket_one_hot)
        except KeyError as e:
            raise KeyError(
                f'{e} not in atom dict ({pdbfile})')
        pocket_data = {
            'pocket_coords': full_coords,
            'pocket_one_hot': pocket_one_hot,
            'pocket_ids': pocket_ids
        }
    return ligand_data, pocket_data
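
To illustrate what the encoding in the fix does, here is a pure-Python sketch of the same idea. The vocabulary in `ATOM_DICT` is hypothetical (the real `atom_dict` comes from the project's constants), and `one_hot` is a stand-in for `np.eye(1, size, index).squeeze()`. The sketch differs from the posted code in two deliberate ways, both assumptions on my part: it normalises the element name before the membership test, and it skips hydrogens explicitly, returning the kept indices so that the coordinates can be filtered the same way.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical element vocabulary, for illustration only.
ATOM_DICT: Dict[str, int] = {"C": 0, "N": 1, "O": 2, "S": 3}


def one_hot(index: int, size: int) -> List[float]:
    """Row of length `size` with a 1.0 at `index`; all zeros if out of range."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def encode_pocket(
    elements: Sequence[str],
    atom_dict: Dict[str, int],
) -> Tuple[List[List[float]], List[int]]:
    """Encode element symbols as one-hot rows.

    Returns the rows and the indices of the atoms that were kept,
    so coordinates can be subset consistently.
    """
    rows: List[List[float]] = []
    kept: List[int] = []
    for i, element in enumerate(elements):
        key = element.capitalize()  # "CL" -> "Cl", "o" -> "O"
        if key == "H":
            continue  # hydrogens are dropped in this sketch
        # Unknown elements map to index len(atom_dict), i.e. an all-zero row.
        index = atom_dict.get(key, len(atom_dict))
        rows.append(one_hot(index, len(atom_dict)))
        kept.append(i)
    return rows, kept
```

Whichever dict is used for the lookup also fixes the width of every row, which is one reason a mismatched dict in this loop can produce wrong encodings rather than an obvious error.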
