
feat(heuristics): add SimilarProjectAnalyzer to detect structural similarity across packages from same maintainer #1089


Open
wants to merge 1 commit into
base: main

AmineRaouane
Member

Summary

This PR adds a new heuristic analyzer, SimilarProjectAnalyzer. It checks whether a PyPI package has the same file and folder structure as other packages maintained by the same user. A maintainer who publishes several packages with an identical layout may be reusing one template to distribute malicious code under different names, so a match is treated as a warning sign.

Description of changes

  • Created a new analyzer: SimilarProjectAnalyzer.
  • The analyzer fetches the list of maintainers of the target package and retrieves other packages published by those maintainers.
  • For each package, it computes a normalized structure hash from its sdist tarball and compares it to the structure hash of the target package.
  • If any match is found, the heuristic fails, flagging potential structural duplication.
  • Added this analyzer to the heuristics.py registry.
  • Modified detect_malicious_metadata_check.py to include and utilize the new heuristic.
  • Added test cases to validate the functionality and edge cases of the analyzer.
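
The structure-hash comparison described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the function names (`structure_hash`, `find_structural_matches`, `build_sdist`) are hypothetical, and the normalization rule (drop the top-level `<name>-<version>/` directory, then hash the sorted relative paths with SHA-256) is an assumption, since the description does not say how the hash is normalized. Fetching maintainers and downloading sdists is left out.

```python
import hashlib
import io
import tarfile


def structure_hash(sdist_bytes: bytes) -> str:
    """Hash the set of member paths in an sdist tarball (assumed normalization)."""
    paths = set()
    with tarfile.open(fileobj=io.BytesIO(sdist_bytes), mode="r:*") as archive:
        for member in archive.getmembers():
            # Drop the leading "<name>-<version>/" component so that packages
            # with different names can still produce the same hash.
            parts = member.name.strip("/").split("/")[1:]
            if parts:
                paths.add("/".join(parts))
    digest = hashlib.sha256()
    for path in sorted(paths):  # sort so archive order does not matter
        digest.update(path.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def find_structural_matches(target: bytes, others: dict[str, bytes]) -> list[str]:
    """Return the names of packages whose structure hash equals the target's."""
    target_hash = structure_hash(target)
    return [name for name, data in others.items() if structure_hash(data) == target_hash]


def build_sdist(root: str, files: list[str]) -> bytes:
    """Build a tiny in-memory .tar.gz with empty files, for demonstration only."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for relative_path in files:
            info = tarfile.TarInfo(name=f"{root}/{relative_path}")
            info.size = 0
            archive.addfile(info)
    return buffer.getvalue()


if __name__ == "__main__":
    target = build_sdist("pkg_a-1.0", ["setup.py", "pkg/__init__.py"])
    same_layout = build_sdist("pkg_b-2.3", ["pkg/__init__.py", "setup.py"])
    other_layout = build_sdist("pkg_c-0.1", ["setup.py", "other/__init__.py"])
    matches = find_structural_matches(
        target, {"pkg_b": same_layout, "pkg_c": other_layout}
    )
    # Under the rule in the description, a non-empty list means the heuristic fails.
    print(matches)
```

Hashing only the paths keeps the check cheap but ignores file contents, so two packages with the same layout and entirely different code would still match; whether that trade-off suits the heuristic is a design decision for the analyzer itself.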

Related issues

None

  • I have reviewed the contribution guide.
  • My PR title and commits follow the Conventional Commits convention.
  • My commits include the "Signed-off-by" line.
  • I have signed my commits following the instructions provided by GitHub. Note that we run GitHub's commit verification tool to check the commit signatures. A green verified label should appear next to all of your commits on GitHub.
  • I have updated the relevant documentation, if applicable.
  • I have tested my changes and verified they work as expected.

…ilarity across packages from same maintainer

Signed-off-by: Amine <amine.raouane@enim.ac.ma>
The oracle-contributor-agreement bot added the "OCA Verified" label (All contributors have signed the Oracle Contributor Agreement.) on May 20, 2025.