I love this package and noticed the absence of Albanian stopwords. I am eager to help by providing them, and I've gathered them from two sources:
https://en.wikipedia.org/wiki/Albanian_morphology
https://huggingface.co/datasets/Kushtrim/Kosovo-Parliament-Transcriptions
The first link covers the fundamental stopwords. The second is an extensive dataset of Kosovo Parliament speeches, which makes it possible to extract further common stopwords by frequency.
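To illustrate how candidates could be pulled from the transcripts, here is a minimal sketch that counts word frequencies. The `top_tokens` helper and the sample sentences are hypothetical; loading the actual dataset is left out because its column names are not stated here, and the resulting list would still need manual review before going into a stopword list.

```python
import re
from collections import Counter

def top_tokens(texts, n=50):
    """Return the n most frequent lowercase word tokens across texts."""
    counts = Counter()
    for text in texts:
        # [^\W\d_]+ matches runs of Unicode letters, so "ë" and "ç" are kept.
        counts.update(re.findall(r"[^\W\d_]+", text.lower()))
    return [word for word, _ in counts.most_common(n)]

# Hypothetical sample standing in for the parliament transcripts.
sample = ["Ne jemi këtu dhe ne punojmë.", "Dhe ata janë këtu."]
candidates = top_tokens(sample, n=3)
```

Frequent tokens are only candidates: high-frequency content words specific to parliamentary speech would have to be filtered out by hand.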
In the upcoming pull request, I intend to include each stopword both with and without the special Albanian characters (ë, ç), since these are often replaced by plain letters in practice. For instance, both "tanë" and "tane" will be included.
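As a rough sketch of how the two spellings could be generated rather than typed by hand, the snippet below maps ë and ç to their plain-letter fallbacks. The names `ASCII_FALLBACK` and `with_ascii_variants` are hypothetical and not part of this package, and the sketch assumes the words use precomposed characters (NFC).

```python
# Hypothetical mapping from Albanian special characters to plain letters.
ASCII_FALLBACK = str.maketrans({"ë": "e", "Ë": "E", "ç": "c", "Ç": "C"})

def with_ascii_variants(stopwords):
    """Return the stopwords plus their plain-letter spellings, deduplicated."""
    variants = set()
    for word in stopwords:
        variants.add(word)                            # original spelling
        variants.add(word.translate(ASCII_FALLBACK))  # e.g. "tanë" -> "tane"
    return sorted(variants)

expanded = with_ascii_variants(["tanë", "që", "dhe", "çdo"])
```

Words without special characters, such as "dhe", appear only once because the set removes the duplicate.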