- ParsBERT-NER - A model fine-tuned from ParsBERT (a monolingual Persian language model) on the PEYMA, ARMAN, and combined PEYMA+ARMAN datasets.
- ALBERT-NER - A model fine-tuned from the ALBERT language model on the PEYMA and ARMAN datasets.
- DeepSentiPers
- ParsBERT Digikala OpenData
- ParsBERT SnappFood
- ParsBERT DeepSentiPers Multi
- ParsBERT DeepSentiPers Binary
- ALBERT Digikala OpenData
- ALBERT SnappFood
- ALBERT DeepSentiPers Multi
- ALBERT DeepSentiPers Binary
- mT5 trained on ParsiNLU-ABSA
- BERT2BERT - BERT2BERT is the first pre-trained summarization model trained on Wiki Summary based on ParsBERT.
- Farsi Poem word2vec model - A word2vec model developed from a corpus of 48 Persian poets. The corpus consists of 1,216,286 mesras of Farsi poems and 8,102,119 words, of which 148,588 are unique.
- Sentence Transformers - ST is a collection of vector representations for sentences and paragraphs (also known as sentence embeddings). ST models are based on transformer networks such as ParsBERT and ALBERT (soon). They are tuned on Textual Thematic Similarity datasets so that sentences with similar meanings are close in vector space.
- ParsBERT: Transformer-based Model for Persian Language Understanding - A monolingual language model for Persian based on Google’s BERT architecture. It is pre-trained on a large Persian corpus of more than 2M documents covering various writing styles and subjects (e.g., scientific, novels, news). A large subset of this corpus was crawled manually.
- ALBERT: A Lite BERT for Self-supervised Learning of Language Representations for the Persian Language - The first attempt at ALBERT for the Persian language. The model was trained based on Google's ALBERT BASE Version 2.0 on more than 3.9M documents, 73M sentences, and 1.3B words covering various writing styles and subjects (e.g., scientific, novels, news), in the same way as ParsBERT.
- g2p_fa - A Persian grapheme-to-phoneme model using an LSTM, implemented in PyTorch.
- Persian_g2p - A seq-to-seq model for Persian (Farsi) Grapheme To Phoneme mapping.
- G2P - Attention Based Grapheme To Phoneme
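
The Sentence Transformers and word2vec entries above both rely on the idea that related items sit close together in vector space. As a rough illustration of what "close" usually means in practice, the sketch below ranks candidates by cosine similarity. The vectors and the names `toy_embeddings`, `cosine_similarity`, and `most_similar` are made up for this example; they are not outputs or APIs of any model listed here, and real embeddings have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def most_similar(query, candidates):
    """Return the key of the candidate vector most similar to the query."""
    return max(candidates, key=lambda k: cosine_similarity(query, candidates[k]))


# Hypothetical 3-dimensional embeddings, chosen by hand for illustration only.
toy_embeddings = {
    "sentence_a": [0.9, 0.1, 0.0],
    "sentence_b": [0.8, 0.2, 0.1],  # points in nearly the same direction as sentence_a
    "sentence_c": [0.0, 0.1, 0.9],  # points in a very different direction
}

query = toy_embeddings["sentence_a"]
others = {k: v for k, v in toy_embeddings.items() if k != "sentence_a"}
closest = most_similar(query, others)
print(closest)
```

With a real model, the hand-written vectors would be replaced by the embeddings it produces; the ranking step stays the same.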