A POC repository to get some ideas out of my head. Some of the work that will be included in this repository:
- Important word extraction
- Identify important word segments from a sentence
- Tokenization and Part-of-Speech (POS) tagging with spaCy
- Identify clauses and verbs
- NER tagger
- Generate questions from text blobs
- Maybe deep learning approach?
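As a rough illustration of the tokenization, POS tagging, and verb identification items above, here is a minimal sketch. The helper names (`tag_tokens`, `verbs`, `demo`) are hypothetical and are not taken from this repository's code. It only assumes spaCy's standard token attributes (`text`, `pos_`, `lemma_`, `is_space`) and that a model such as en_core_web_sm has been downloaded.

```python
from typing import Iterable, List, Tuple


def tag_tokens(doc: Iterable) -> List[Tuple[str, str]]:
    """Return (text, coarse POS tag) pairs, skipping whitespace tokens."""
    return [(tok.text, tok.pos_) for tok in doc if not tok.is_space]


def verbs(doc: Iterable) -> List[str]:
    """Return the lemmas of tokens tagged as VERB or AUX."""
    return [tok.lemma_ for tok in doc if tok.pos_ in {"VERB", "AUX"}]


def demo(text: str = "The cat chased the mouse across the garden.",
         model: str = "en_core_web_sm") -> None:
    """Hypothetical entry point: tag a sentence and print the results."""
    import spacy  # imported lazily so the helpers above work without spaCy

    nlp = spacy.load(model)  # assumes the model has already been downloaded
    doc = nlp(text)
    print(tag_tokens(doc))
    print(verbs(doc))
```

Calling `demo()` after installing a model prints the tagged tokens and the verb lemmas. The exact tags depend on the model chosen.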
Detailed instructions will be included when the work is done.
bash download_importance.sh
pip3 install -r requirements.txt
Pick either en_core_web_sm or en_core_web_trf for the named entity recognition task.
python -m spacy download en_core_web_sm
python -m spacy download en_core_web_trf
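If it is unclear which of the two models is installed, one option is to try them in order and use the first that loads. The sketch below is an assumption about how that could be done, not how importance.py actually does it. `load_first_available` and `extract_entities` are hypothetical names. It relies on `spacy.load` raising `OSError` when a model package cannot be found. Other failures, such as en_core_web_trf being present without its transformer dependencies, are not handled here.

```python
from typing import Any, Callable, List, Sequence, Tuple


def load_first_available(names: Sequence[str],
                         loader: Callable[[str], Any]) -> Tuple[str, Any]:
    """Try each model name in order; return (name, pipeline) for the first that loads."""
    for name in names:
        try:
            return name, loader(name)
        except OSError:
            # spacy.load raises OSError when the model package is not installed
            continue
    raise OSError(f"None of these models could be loaded: {list(names)}")


def extract_entities(
    text: str,
    names: Sequence[str] = ("en_core_web_trf", "en_core_web_sm"),
) -> Tuple[str, List[Tuple[str, str]]]:
    """Hypothetical helper: return the model used and (entity text, label) pairs."""
    import spacy  # imported lazily so load_first_available can be tested without spaCy

    name, nlp = load_first_available(names, spacy.load)
    doc = nlp(text)
    return name, [(ent.text, ent.label_) for ent in doc.ents]
```

Listing en_core_web_trf first prefers the transformer model, which is generally more accurate but slower. Swap the order to prefer the smaller model.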
python3 importance.py