Description
What's the problem this feature will solve?
Prevent malicious packages from being published under names that are near-misspellings of well-known packages (typosquatting).
Describe the solution you'd like
I'd like to propose an algorithm that blocks malicious packages whose names are similar to those of well-known packages from being published.
Recently there were articles about 12 malicious packages found on PyPI. Several of them had names very close to Django, and as an avid Django user, this got my attention.
The algorithm could combine Levenshtein distance between package names with other input features, such as the number of similar file names and the number of similar lines of code compared to the legitimate package with the similar name. If there is a close resemblance, the package could either be held back from publication until a human reviews it, or be blocked permanently.
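To make the name-distance part concrete, here is a minimal sketch of what such a check might look like. It is only an illustration, not a proposal for the actual implementation: the `POPULAR_PACKAGES` list, the `normalize` and `needs_review` helpers, and the `max_distance=2` threshold are all assumptions of mine, and the file-name and code-similarity features are left out.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


# Hypothetical list; a real check would need a curated set of popular names.
POPULAR_PACKAGES = ["django", "requests", "numpy"]


def normalize(name: str) -> str:
    """Lowercase and drop separators so 'Dj-ango' and 'django' compare equal."""
    return name.lower().replace("-", "").replace("_", "").replace(".", "")


def needs_review(candidate, popular=POPULAR_PACKAGES, max_distance=2):
    """Return the popular names that `candidate` is close to but not equal to."""
    cand = normalize(candidate)
    hits = []
    for known in popular:
        distance = levenshtein(cand, normalize(known))
        if 0 < distance <= max_distance:  # assumed threshold, would need tuning
            hits.append(known)
    return hits


# Illustrative misspellings of "django"
print(needs_review("diango"))  # one substitution away
print(needs_review("djago"))   # one deletion away
print(needs_review("flask"))   # not close to anything in the list
```

A non-empty result would only mean "hold for review". The other features mentioned above (similar file names, similar lines of code) would be needed to keep false positives manageable, since many legitimate packages have short, similar names.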
The algorithm could also be a lot more sophisticated, along the lines of Android's approach, which uses machine learning to detect malicious apps and, I believe, measures over 700 features.
I am just proposing something of this nature in case it hasn't already been proposed.
Additional context
Here is the article I am referencing:
https://www.zdnet.com/article/twelve-malicious-python-libraries-found-and-removed-from-pypi/
Thanks,
Aaron