Detect packages being published with typo'ish names

**What's the problem this feature will solve?**
Prevent malicious packages from being published under names that are near-typos of well-known package names.

**Describe the solution you'd like**
I'd like to propose an algorithm that blocks malicious packages from being published when their names closely resemble those of well-known packages.

Recent articles reported that 12 malicious packages were found on PyPI and removed. Several of them had names very close to Django, and as an avid Django user, this got my attention.

One approach is to combine [Levenshtein distance](https://en.wikipedia.org/wiki/Levenshtein_distance) with other signals, such as the number of similar file names and the number of similar lines of code, compared against legitimate packages with a similar name. If the resemblance is close, the package could be held back from publication until a human reviews it, or blocked permanently.
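To make the name-distance part concrete, here is a rough sketch of what I have in mind. It is only an illustration, not a proposal for the actual implementation: the `POPULAR` list, the `similar_popular_names` helper, and the `max_distance` threshold are all made up for this example, and the other signals (file names, code lines) are left out.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes, substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def normalize(name: str) -> str:
    """Lowercase and collapse runs of '-', '_', '.' (PEP 503-style)."""
    return re.sub(r"[-_.]+", "-", name).lower()


# Hypothetical list; a real check would need a maintained set of popular names.
POPULAR = ["django", "requests", "numpy"]


def similar_popular_names(candidate, popular=POPULAR, max_distance=1):
    """Return popular names within max_distance edits of candidate.

    An exact match after normalization is skipped, since that is the
    same project name rather than a look-alike.
    """
    cand = normalize(candidate)
    hits = []
    for name in popular:
        known = normalize(name)
        if cand != known and levenshtein(cand, known) <= max_distance:
            hits.append(name)
    return hits


if __name__ == "__main__":
    for upload in ["djago", "diango", "flask", "Django"]:
        print(upload, similar_popular_names(upload))
```

Note that plain Levenshtein distance counts a swap of two adjacent letters as two edits, so a name like `reqeusts` is only caught with `max_distance=2`. A threshold that loose would probably flag many legitimate short names, which is one reason the distance would need to be combined with the other signals.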

The algorithm could also be much more sophisticated, for example something like Android's approach, which uses machine learning to detect malicious apps and, I believe, measures over 700 features.

I am just proposing something of this nature if it hasn't already been proposed.

**Additional context**
Here is the article I am referring to:

https://www.zdnet.com/article/twelve-malicious-python-libraries-found-and-removed-from-pypi/

Thanks,
Aaron

Detect packages being published with typo'ish names #4998

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Detect packages being published with typo'ish names #4998

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions