Index is out of bound during the all_by_all_pairwise_similarity #28

Open
souzadevinicius opened this issue Feb 3, 2023 · 0 comments
Labels
bug Something isn't working

Component

GrapeImplementation.all_by_all_pairwise_similarity

Description

During the GrapeImplementation.all_by_all_pairwise_similarity method call, I got an index out of bounds exception:

IndexError                                Traceback (most recent call last)
Cell In [64], line 1
----> 1 tp = oi.all_by_all_pairwise_similarity(oba_list, vt_list)

File ~/.pyenv/versions/3.10.8/lib/python3.10/site-packages/oakx_grape/grape_implementation.py:402, in GrapeImplementation.all_by_all_pairwise_similarity(self, subjects, objects, predicates)
    398     raise ValueError("For now can only use hardcoded ensmallen predicates")
    400 resnik_model = self._make_grape_resnik_model()
--> 402 sim = resnik_model.get_similarities_from_bipartite_graph_node_names(
    403     source_node_names=subjects,
    404     destination_node_names=objects,
    405     return_similarities_dataframe=True,
    406     return_node_names=True,
    407 )
    409 pairs = iter(self._df_to_pairwise_similarity(sim))
    411 return pairs

File ~/.pyenv/versions/3.10.8/lib/python3.10/site-packages/embiggen/similarities/dag_resnik.py:145, in DAGResnik.get_similarities_from_bipartite_graph_node_names(self, source_node_names, destination_node_names, minimum_similarity, return_similarities_dataframe, return_node_names)
    120 def get_similarities_from_bipartite_graph_node_names(
    121     self,
    122     source_node_names: List[str],
   (...)
    126     return_node_names: bool = False
...
     81     ),
     82     "resnik_score": similarities
     83 })

To Reproduce

Steps to reproduce the behavior:

First, I merged two ontologies into one, then built two term lists by subsetting the merged terms on their prefixes. The first list contains OBA terms and the second contains VT terms.

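import pandas as pd
from oaklib.selector import get_implementation_from_shorthand

oi = get_implementation_from_shorthand("grape:sqlite:../tmp/oba-vt.owl")
# e.g. ['OBA:1000035', 'OBA:1000045', 'OBA:0000003', 'OBA:0000005', 'OBA:0000006']
oba_terms = pd.read_csv('../tmp/oba_terms.txt', header=None)
# e.g. ['VT:0000181', 'VT:0000362', 'VT:0000717', 'VT:0000813', 'VT:0001097']
vt_terms = pd.read_csv('../tmp/vt_terms.txt', header=None)
# Assumed conversion (not shown in the original report): first column to a plain list
oba_list = oba_terms[0].tolist()
vt_list = vt_terms[0].tolist()
tp = oi.all_by_all_pairwise_similarity(oba_list, vt_list)  # raises IndexError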

Expected behavior

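When I pass the same list for both parameters of GrapeImplementation.all_by_all_pairwise_similarity, everything works fine. I expected the call with two different lists (OBA terms and VT terms) to return pairs in the same way instead of raising.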

tp = oi.all_by_all_pairwise_similarity(oba_list, oba_list)

for t in tp:
    print(t.ancestor_information_content)
10.202258110046387
5.5109100341796875
0.0001483669620938599
0.0001483669620938599
0.11778302490711212
5.5109100341796875
10.202258110046387
0.0001483669620938599
0.0001483669620938599
0.11778302490711212
10.202258110046387
5.587137222290039
5.587137222290039
10.202258110046387
0.0001483669620938599
0.0001483669620938599
10.202258110046387
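
Since the same-list call succeeds, one possible stopgap is to run the all-by-all computation over the union of both lists and keep only the pairs whose subject comes from the first list and whose object comes from the second. The sketch below is untested against oakx-grape: `cross_pairs_via_union` is a hypothetical helper, it assumes each returned pair exposes `subject_id` and `object_id`, and nothing in this report confirms that a call over the merged list avoids the IndexError.

```python
from typing import Callable, Iterable, Iterator, List


def cross_pairs_via_union(
    all_by_all: Callable[[List[str], List[str]], Iterable],
    subjects: List[str],
    objects: List[str],
) -> Iterator:
    """Hypothetical workaround: compare the union against itself, then
    keep only pairs that go from `subjects` to `objects`."""
    # dict.fromkeys removes duplicates while preserving the input order.
    union = list(dict.fromkeys([*subjects, *objects]))
    subject_set, object_set = set(subjects), set(objects)
    for pair in all_by_all(union, union):
        # Assumption: pairs carry `subject_id` and `object_id` attributes.
        if pair.subject_id in subject_set and pair.object_id in object_set:
            yield pair
```

With the names from the steps above, this would be called as `cross_pairs_via_union(oi.all_by_all_pairwise_similarity, oba_list, vt_list)`. It computes more pairs than needed, so it only makes sense for small lists like the ones shown here.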

Additional context

Library versions:
oaklib 0.1.70
oakx-grape 0.1.2
embiggen 0.11.39
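
For anyone trying to reproduce this, the installed versions can be collected from the same environment with the standard library. This is a small sketch; `installed_versions` is a hypothetical helper name, and the distribution names are the ones listed above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, List, Optional


def installed_versions(names: List[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    found: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(["oaklib", "oakx-grape", "embiggen"]).items():
        print(name, ver)
```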

@souzadevinicius souzadevinicius added the bug Something isn't working label Feb 3, 2023