Improve singular method #493

Open
wants to merge 2 commits into
base: master
Conversation

slawekjaranowski
Member

No description provided.

Member

@gnodet gnodet left a comment

Here's a proposal:

    private static final Map<String, String> PLURAL_EXCEPTIONS = new HashMap<>();

    static {
        // Irregular plurals
        PLURAL_EXCEPTIONS.put("men", "man");
        PLURAL_EXCEPTIONS.put("women", "woman");
        PLURAL_EXCEPTIONS.put("children", "child");
        PLURAL_EXCEPTIONS.put("mice", "mouse");
        PLURAL_EXCEPTIONS.put("people", "person");
        PLURAL_EXCEPTIONS.put("teeth", "tooth");
        PLURAL_EXCEPTIONS.put("feet", "foot");
        PLURAL_EXCEPTIONS.put("geese", "goose");

        // Invariant plurals
        PLURAL_EXCEPTIONS.put("series", "series");
        PLURAL_EXCEPTIONS.put("species", "species");
        PLURAL_EXCEPTIONS.put("sheep", "sheep");
        PLURAL_EXCEPTIONS.put("fish", "fish");
        PLURAL_EXCEPTIONS.put("deer", "deer");
        PLURAL_EXCEPTIONS.put("aircraft", "aircraft");

        // Special "oes" exceptions
        PLURAL_EXCEPTIONS.put("heroes", "hero");
        PLURAL_EXCEPTIONS.put("potatoes", "potato");
        PLURAL_EXCEPTIONS.put("tomatoes", "tomato");
        PLURAL_EXCEPTIONS.put("echoes", "echo");
        PLURAL_EXCEPTIONS.put("vetoes", "veto");
        PLURAL_EXCEPTIONS.put("torpedoes", "torpedo");
        PLURAL_EXCEPTIONS.put("cargoes", "cargo");
        PLURAL_EXCEPTIONS.put("haloes", "halo");
        PLURAL_EXCEPTIONS.put("mosquitoes", "mosquito");
        PLURAL_EXCEPTIONS.put("buffaloes", "buffalo");
    }

    public static String singular(String plural) {
        if (plural == null || plural.isEmpty()) return plural;

        String lower = plural.toLowerCase();

        if (PLURAL_EXCEPTIONS.containsKey(lower)) {
            return PLURAL_EXCEPTIONS.get(lower);
        }

        // Suffix-based rules
        if (lower.endsWith("ies") && plural.length() > 3) {
            return plural.substring(0, plural.length() - 3) + "y";
        }
        if (lower.endsWith("ves")) {
            return plural.substring(0, plural.length() - 3) + "f";
        }
        if (lower.endsWith("zzes")) {
            return plural.substring(0, plural.length() - 2);
        }
        if (lower.endsWith("sses")) {
            return plural.substring(0, plural.length() - 2);
        }
        if (lower.endsWith("ches") || lower.endsWith("shes")) {
            return plural.substring(0, plural.length() - 2);
        }
        if (lower.endsWith("xes")) {
            return plural.substring(0, plural.length() - 2);
        }
        if (lower.endsWith("oes")) {
            return plural.substring(0, plural.length() - 1);
        }
        if (lower.endsWith("s") && plural.length() > 1) {
            return plural.substring(0, plural.length() - 1);
        }

        return plural;
    }

With a more complete set of test pairs:

    // Known exceptions
    "men", "man",
    "women", "woman",
    "children", "child",
    "mice", "mouse",
    "people", "person",
    "teeth", "tooth",
    "feet", "foot",
    "geese", "goose",

    "series", "series",
    "species", "species",
    "sheep", "sheep",
    "fish", "fish",
    "deer", "deer",
    "aircraft", "aircraft",

    "heroes", "hero",
    "potatoes", "potato",
    "tomatoes", "tomato",
    "echoes", "echo",
    "vetoes", "veto",
    "torpedoes", "torpedo",
    "cargoes", "cargo",
    "haloes", "halo",
    "mosquitoes", "mosquito",
    "buffaloes", "buffalo",

    // Regular plural forms with suffixes
    "voes", "voe",
    "hoes", "hoe",
    "canoes", "canoe",
    "toes", "toe",
    "foes", "foe",
    "oboes", "oboe",
    "noes", "no",
    "boxes", "box",
    "wishes", "wish",
    "dishes", "dish",
    "brushes", "brush",
    "classes", "class",
    "buzzes", "buzz",
    "cars", "car",
    "dogs", "dog",
    "cats", "cat",
    "horses", "horse",

    // Some test cases with different rules
    "wolves", "wolf",
    "knives", "knife",
    "leaves", "leaf",
    "wives", "wife",
    "lives", "life",
    "babies", "baby",
    "parties", "party",
    "cities", "city",
    "buses", "bus",
    "boxes", "box",
    "churches", "church",
    "matches", "match",
    "watches", "watch",
    "riches", "rich",
    "dresses", "dress",
    "crosses", "cross",

    // More edge cases
    "heroes", "hero",
    "vetoes", "veto",
    "torpedoes", "torpedo",
    "tomatoes", "tomato",
    "potatoes", "potato",
    "echoes", "echo",
    "mosquitoes", "mosquito",
    "buffaloes", "buffalo",
    "volcanoes", "volcano",
    "goes", "go"
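
A flat list of pairs like the one above can be checked mechanically. The following is only a minimal sketch: `PairCheck` and `failures` are hypothetical names, not part of this PR, and the function under test is passed in so the helper does not assume any particular implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical helper (not from the PR): runs a singularizer over a flat
// {plural, expectedSingular, plural, expectedSingular, ...} array.
final class PairCheck {

    static List<String> failures(UnaryOperator<String> singular, String... pairs) {
        if (pairs.length % 2 != 0) {
            throw new IllegalArgumentException("pairs must have an even number of entries");
        }
        List<String> failures = new ArrayList<>();
        for (int i = 0; i < pairs.length; i += 2) {
            String plural = pairs[i];
            String expected = pairs[i + 1];
            String actual = singular.apply(plural);
            if (!expected.equals(actual)) {
                // Collect every mismatch instead of stopping at the first one.
                failures.add(plural + " -> " + actual + " (expected " + expected + ")");
            }
        }
        return failures;
    }
}
```

Collecting all mismatches rather than failing fast makes it easier to see which suffix rules still need exceptions. Tracing the proposal by hand suggests, for instance, that `buses` would fall through to the plain `s` rule and come out as `buse`.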


static {
PLURAL_EXCEPTION.put("children", "child");
PLURAL_EXCEPTION.put("licenses", "license");
Member

Why is that one an exception?
The plural just adds an 's'

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it ends with "es".

"repositories, repository",
"roles, role",
"rushes, rush",
"series, series"
Member

We can add a few more, here's a more extensive list:

    "women", "woman",
    "men", "man",
    "children", "child",
    "mice", "mouse",
    "people", "person",
    "series", "series",
    "species", "species",
    "roses", "rose",
    "fezzes", "fez",
    "kisses", "kiss",
    "buses", "bus",
    "glasses", "glass",
    "heroes", "hero",
    "potatoes", "potato",
    "tomatoes", "tomato",
    "echoes", "echo",
    "torpedoes", "torpedo",
    "vetoes", "veto",
    "cargoes", "cargo",
    "haloes", "halo",
    "mosquitoes", "mosquito",
    "babies", "baby",
    "wolves", "wolf",
    "knives", "knife",
    "leaves", "leaf",
    "wives", "wife",
    "lives", "life",
    "boxes", "box",
    "wishes", "wish",
    "dishes", "dish",
    "churches", "church",
    "brushes", "brush",
    "classes", "class",
    "buzzes", "buzz",
    "cars", "car",
    "dogs", "dog",
    "voes", "voe",
    "does", "doe",
    "hoes", "hoe",
    "canoes", "canoe"

} else if (name.endsWith("xes")) {
} else if (name.endsWith("zzes")) {
return name.substring(0, name.length() - 3);
} else if (name.endsWith("ches") || name.endsWith("xes") || name.endsWith("ses") || name.endsWith("oes")
Member

The "oes" rule is a bit more complicated.
Such a word usually loses only the final "s", but there are exceptions that lose "es", such as heroes, potatoes, tomatoes, echoes, torpedoes, vetoes, cargoes, haloes, and mosquitoes.
I think the rule should be to drop the final "s", with exceptions such as:

    "heroes", "hero",
    "potatoes", "potato",
    "tomatoes", "tomato",
    "echoes", "echo",
    "vetoes", "veto",
    "torpedoes", "torpedo",
    "cargoes", "cargo",
    "haloes", "halo",
    "mosquitoes", "mosquito",
    "buffaloes", "buffalo"
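
The rule described above could look roughly like this. It is only a sketch: `OesRule` and `singularOes` are hypothetical names, and the exception set contains just the ten words listed in this comment.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch (not the PR's code): "oes" plurals drop only the final
// "s" unless the word is a known "o" + "es" plural.
final class OesRule {

    // Plurals formed by adding "es" to a singular ending in "o"
    // (the list from the comment above; certainly not exhaustive).
    private static final Set<String> O_PLUS_ES = new HashSet<>(Arrays.asList(
            "heroes", "potatoes", "tomatoes", "echoes", "vetoes",
            "torpedoes", "cargoes", "haloes", "mosquitoes", "buffaloes"));

    static String singularOes(String plural) {
        String lower = plural.toLowerCase(Locale.ROOT);
        if (!lower.endsWith("oes")) {
            return plural; // not handled by this rule
        }
        // Known exceptions lose "es"; everything else loses just "s".
        int drop = O_PLUS_ES.contains(lower) ? 2 : 1;
        // Cut from the original string so the caller's casing is preserved.
        return plural.substring(0, plural.length() - drop);
    }
}
```

With only these ten entries, words such as `goes`, `noes`, and `volcanoes` would still come out as `goe`, `noe`, and `volcanoe`, so they would need entries of their own.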

@slawekjaranowski
Member Author

The rule:

        if (lower.endsWith("ves")) {
            return name.substring(0, name.length() - 3) + "f";
        }

does not work for:

"knives, knife"
"wives, wife"
"lives, life",
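
One possible way to handle these three words, shown only as a sketch with hypothetical names (`VesRule`, `singularVes`), is to keep a small set of plurals whose singular ends in "fe" and fall back to "f" otherwise:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch (not the PR's code): "ves" plurals map to "f" by
// default and to "fe" for a short list of known words.
final class VesRule {

    // Plurals whose singular ends in "fe" (only the three cases named above).
    private static final Set<String> FE_WORDS =
            new HashSet<>(Arrays.asList("knives", "wives", "lives"));

    static String singularVes(String plural) {
        String lower = plural.toLowerCase(Locale.ROOT);
        if (!lower.endsWith("ves")) {
            return plural; // not handled by this rule
        }
        String stem = plural.substring(0, plural.length() - 3);
        return stem + (FE_WORDS.contains(lower) ? "fe" : "f");
    }
}
```

This still gets words like `gloves` wrong (it yields `glof`), which is one more reason the suffix rules alone cannot cover every case.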

@gnodet
Member

gnodet commented May 14, 2025

> The rule:
>
>         if (lower.endsWith("ves")) {
>             return name.substring(0, name.length() - 3) + "f";
>         }
>
> does not work for:
>
> "knives, knife"
> "wives, wife"
> "lives, life",

Those are actually incorrect plurals. The correct ones end with "ves": knives, wives and lives.

@slawekjaranowski
Member Author

>> The rule:
>>
>>         if (lower.endsWith("ves")) {
>>             return name.substring(0, name.length() - 3) + "f";
>>         }
>>
>> does not work for:
>>
>> "knives, knife"
>> "wives, wife"
>> "lives, life",
>
> Those are actually incorrect plurals. The correct ones end with "ves": knives, wives and lives.

I see "knives" spelled the same way in my example and in yours 😄

@slawekjaranowski
Member Author

@gnodet, based on your proposal I have pushed another fix.

I'm afraid it will be difficult to support all cases, so I added a parameter to the Mojo that lets a project add its own special exceptions.
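
To illustrate the idea only: the sketch below merges project-supplied exceptions over a default map before any suffix rule runs. The class name `SingularLookup`, its constructor, and the simplified fallback are assumptions made for this example; the actual Mojo parameter and its name are not shown in this thread.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: project-supplied exceptions are layered over defaults.
final class SingularLookup {

    private final Map<String, String> exceptions = new HashMap<>();

    // Keys in both maps are assumed to be lower-case plurals.
    SingularLookup(Map<String, String> defaults, Map<String, String> projectExceptions) {
        exceptions.putAll(defaults);
        if (projectExceptions != null) {
            exceptions.putAll(projectExceptions); // project entries override defaults
        }
    }

    String singular(String plural) {
        if (plural == null || plural.isEmpty()) {
            return plural;
        }
        String known = exceptions.get(plural.toLowerCase(Locale.ROOT));
        if (known != null) {
            return known;
        }
        // Simplified stand-in for the real suffix rules: strip a trailing "s".
        if (plural.length() > 1 && plural.endsWith("s")) {
            return plural.substring(0, plural.length() - 1);
        }
        return plural;
    }
}
```

Letting project entries win over the defaults means a project can also correct a default it disagrees with, not just add missing words.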

@slawekjaranowski slawekjaranowski requested a review from gnodet May 14, 2025 20:51
@hboutemy
Member

Defining what "improving" means would be useful: as it stands, it's just a vague personal judgement.

But I read the content, and it is about being able to add exceptions and having a default list of the classic ones.

Question: is there an official list somewhere?
I did a search and found https://www.thoughtco.com/irregular-plural-nouns-in-english-1692634 and many others.
It would be nice to keep a pointer to the reference list used as the default.

@hboutemy hboutemy added enhancement and removed bug labels May 17, 2025
@slawekjaranowski
Member Author

When I started working on this, I hoped it would be only a simple improvement.

But the problem turned out to be more complicated. I added a list of some irregular forms from @gnodet's comments; I don't know of a source for it.

Finally, I added a parameter that allows providing more special cases, as it is difficult to discover all irregular nouns.

Of course, we can adjust the title of the change to show exactly what was done.

3 participants