feat(security): Add package name typosquatting detection #1059
base: main
Thank you for your pull request and welcome to our community! To contribute, please sign the Oracle Contributor Agreement (OCA).
To sign the OCA, please create an Oracle account and sign the OCA in Oracle's Contributor Agreement Application. When signing the OCA, please provide your GitHub username. After signing the OCA and getting an OCA approval from Oracle, this PR will be automatically updated. If you are an Oracle employee, please make sure that you are a member of the main Oracle GitHub organization, and your membership in this organization is public.
@AmineRaouane Please add unit tests following the instructions here. Take a look at the unit tests for other malware heuristics at For small and standalone functions, you can add test cases to the docstring itself. You can find an example here.
Would it be possible to make the path to the custom file list of packages configurable through
src/macaron/malware_analyzer/pypi_heuristics/metadata/typosquatting_presence.py
src/macaron/slsa_analyzer/checks/detect_malicious_metadata_check.py
0a8ddbf to 80afd9e
@@ -181,3 +181,4 @@ docs/_build
bin/
requirements.txt
.macaron_env_file
**/.DS_Store
This does not cover subdirectories such as tests, parsers, etc.
I changed it to .DS_Store
Actually, I was wrong above. **/.DS_Store should be sufficient; you just need to remove all the matching files that were already added.
@@ -29,7 +29,7 @@
 # heuristic, a false negative has been introduced. Note that if the unit test were allowed to access the OSV
 # knowledge base, it would report the package as malware. However, we intentionally block unit tests
 # from reaching the network.
-("pkg:pypi/zlibxjson", CheckResultType.PASSED),
+("pkg:pypi/zlibxjson", CheckResultType.UNKNOWN),
If this is an expected change, the comment above should be updated to reflect the new situation.
I can't push the code because that test fails
Adds a new security analysis feature to detect potential typosquatting in package names. Compares the package name against a list of popular packages using the Jaro-Winkler similarity algorithm. Packages exceeding a configurable threshold are flagged. Includes a default popular package list and an option for a custom list via configuration. Signed-off-by: Amine <amine.raouane@enim.ac.ma>
Added unit tests for typosquatting detection. Analyzer variables, including the file path, are now loaded from defaults.ini. Raised heuristic confidence level from medium to high. BREAKING CHANGE: Analyzer config must now be defined in defaults.ini. Signed-off-by: Amine <amine.raouane@enim.ac.ma>
tuple[HeuristicResult, dict[str, JsonType]]:
    The result and related information collected during the analysis.
"""
if not self.popular_packages:
So now with the change for filling in self.popular_packages in the __init__ method, I see this scenario skips the heuristic when there are no popular packages to check against. I think this is okay and a good use of SKIP here, but in the detail info returned I don't think I'd call the entry key "error"; probably "warning", similar to how you log a warning.
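For illustration, the suggestion above could be sketched roughly as follows. This is a minimal, standalone sketch and not the PR's actual code: the HeuristicResult enum here is a local stand-in, and the analyze function, its signature, and the message text are hypothetical.

```python
import logging
from enum import Enum

logger = logging.getLogger(__name__)


class HeuristicResult(Enum):
    """Local stand-in for the project's result type (assumption)."""

    PASS = "PASS"
    FAIL = "FAIL"
    SKIP = "SKIP"


def analyze(popular_packages: list[str]) -> tuple[HeuristicResult, dict[str, str]]:
    """Skip the heuristic when there is nothing to compare against."""
    if not popular_packages:
        message = "No popular packages available; skipping typosquatting check."
        # Log at warning level and use the matching "warning" key in the
        # detail info, rather than "error".
        logger.warning(message)
        return HeuristicResult.SKIP, {"warning": message}
    # Placeholder: the real similarity comparison would go here.
    return HeuristicResult.PASS, {}
```

Using the same word for the log level and the detail key keeps the skipped case from reading like a failure to anyone consuming the result.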
Implement typosquatting detection for package names during analysis. Compares package names against a list of popular packages using the Jaro-Winkler similarity algorithm. Packages exceeding a defined threshold of similarity to a popular package are flagged.
Summary
Adds typosquatting detection for package names during analysis using Jaro-Winkler similarity.
Description of changes
This PR introduces a new security analysis feature to detect potential typosquatting in package names. The implementation compares the name of a package being analyzed against a list of popular package names. By default, it uses a predefined list stored in a dedicated file, but it also offers an option to use a custom list provided via a configuration path.
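As a rough sketch of the default-versus-custom list behaviour described above, assuming a hypothetical [heuristic.pypi] section with a hypothetical popular_packages_path option, and assuming the list file holds one package name per line (none of these details are confirmed by this PR text):

```python
import configparser
from pathlib import Path


def resolve_package_list_path(
    config: configparser.ConfigParser, default_path: Path
) -> Path:
    """Return the configured custom list path, or the default if none is set."""
    # Section and option names are illustrative assumptions.
    custom = config.get(
        "heuristic.pypi", "popular_packages_path", fallback=""
    ).strip()
    return Path(custom) if custom else default_path


def load_package_names(path: Path) -> list[str]:
    """Read one package name per line, skipping blank lines and '#' comments."""
    names: list[str] = []
    for line in path.read_text(encoding="utf-8").splitlines():
        name = line.strip().lower()
        if name and not name.startswith("#"):
            names.append(name)
    return names
```

An empty or missing option falls back to the bundled default, so existing configurations would keep working without changes.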
The comparison utilizes the Jaro-Winkler similarity algorithm to calculate a similarity score between the package name and each name in the popular packages list. If the calculated similarity score exceeds a configurable threshold, the package is flagged as a potential typosquat.
This feature helps identify malicious packages attempting to mimic legitimate, popular ones through slight variations in spelling, thus enhancing the security posture of the project by warning users about such risks.
The changes include:
Related issues
Checklist
verified label should appear next to all of your commits on GitHub.