Block Filtering and Block Purging after Vector Based Blocking #12
Replies: 3 comments
Hello, the proper way to use Vector Based Blocking is presented here: https://pyjedai.readthedocs.io/en/latest/tutorials/pyTorchWorkflow.html
Vector Based Blocking generates a dictionary of ids that correspond to candidate matches. At the end of vector based blocking you therefore get either this dictionary or a graph like the one produced by entity matching. FAISS also returns distance/similarity scores, so a separate entity matching step is not needed. Check out the tutorial, and if you have any questions, I'm happy to help.
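To illustrate what "a dictionary of ids that correspond to candidate matches" with similarity scores can look like, here is a minimal pure-Python sketch. It is not the pyJedAI or FAISS API: the function names (`cosine`, `top_k_candidates`), the toy vectors, and the brute-force search are all assumptions made for illustration, whereas a real workflow would use learned embeddings and an approximate nearest-neighbour index.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_candidates(left, right, k=1):
    """Map each left id to its k most similar right ids, with scores.

    Hypothetical helper: brute-force search stands in for an ANN index.
    """
    candidates = {}
    for left_id, left_vec in left.items():
        scored = [(right_id, cosine(left_vec, right_vec))
                  for right_id, right_vec in right.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        candidates[left_id] = scored[:k]
    return candidates


# Toy embeddings (made up) for two small datasets.
left = {"a1": [1.0, 0.0], "a2": [0.0, 1.0]}
right = {"b1": [0.9, 0.1], "b2": [0.1, 0.9]}

candidates = top_k_candidates(left, right, k=1)
for left_id, neighbours in candidates.items():
    print(left_id, neighbours)
```

Because every candidate pair already carries a score, the result can be passed directly to a clustering step.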
Hi Nikoletos,

Code:
from pyjedai.vector_based_blocking import EmbeddingsNNBlockBuilding
blocks, g = emb.build_blocks(data,
from pyjedai.clustering import ConnectedComponentsClustering, UniqueMappingClustering

Results:
Method name: Embeddings-NN Block Building
Method name: Unique Mapping Clustering
What I suggest you do is start experimenting with:
and then with the clustering:
or you can even check the Optuna tutorial here: https://pyjedai.readthedocs.io/en/latest/tutorials/Optuna.html
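As a rough picture of what the clustering step does with scored candidate pairs, here is a small sketch of greedy one-to-one matching in the spirit of Unique Mapping Clustering. It is not pyJedAI's implementation: the function name `unique_mapping`, the `threshold` parameter, and the example scores are assumptions for illustration only.

```python
def unique_mapping(scored_pairs, threshold=0.5):
    """Greedy one-to-one matching over (left_id, right_id, score) triples.

    Pairs are visited from highest to lowest score. A pair is kept only if
    its score reaches the threshold and neither id has been matched yet.
    """
    matched_left, matched_right, matches = set(), set(), []
    ordered = sorted(scored_pairs, key=lambda pair: pair[2], reverse=True)
    for left_id, right_id, score in ordered:
        if score < threshold:
            break  # all remaining pairs score lower
        if left_id in matched_left or right_id in matched_right:
            continue  # one of the two entities is already matched
        matched_left.add(left_id)
        matched_right.add(right_id)
        matches.append((left_id, right_id))
    return matches


# Made-up scores, e.g. similarities returned by the blocking step.
pairs = [
    ("a1", "b1", 0.95),
    ("a1", "b2", 0.80),
    ("a2", "b2", 0.70),
    ("a3", "b1", 0.40),
]

matches = unique_mapping(pairs, threshold=0.5)
print(matches)  # [('a1', 'b1'), ('a2', 'b2')]
```

In this sketch, raising the threshold keeps fewer but more confident matches, and lowering it keeps more. That is the kind of trade-off worth exploring when experimenting with the clustering settings.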