Sharing a detailed list of NLP topics to help you build in-depth knowledge:
- Text cleaning: regular expressions (RegEx) and the NLTK and spaCy toolkits.
- Feature extraction: tokenization, bag of words, TF-IDF, word embeddings.
- Basic NLP tasks: sentiment analysis, text classification, topic modelling, named entity recognition. (These can be solved with basic NLP and ML knowledge alone.)
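To make the cleaning and feature-extraction steps above concrete, here is a minimal sketch using only the Python standard library. The function names (`clean`, `tokenize`, `bag_of_words`, `tf_idf`) and the sample sentences are made up for illustration, and the TF-IDF formula is the plain unsmoothed textbook variant; libraries such as scikit-learn apply smoothing and normalization, so their numbers will differ.

```python
import math
import re
from collections import Counter


def clean(text):
    """Lowercase, drop URLs and non-letter characters, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"[^a-z\s]", " ", text)      # keep only letters and spaces
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Whitespace tokenization on the cleaned text."""
    return clean(text).split()


def bag_of_words(tokens):
    """Map each token to the number of times it occurs."""
    return Counter(tokens)


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / document length
    idf = log(N / document frequency)   (no smoothing; illustrative only)
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1      # avoid division by zero on empty docs
        weights.append(
            {t: (c / total) * math.log(n_docs / doc_freq[t]) for t, c in counts.items()}
        )
    return weights


# Hypothetical sample documents, invented for this sketch.
docs = [
    "The movie was great!!! https://example.com",
    "The movie was terrible...",
    "Great acting, great plot.",
]
weights = tf_idf(docs)
```

In the second document, a rare word such as "terrible" gets a higher weight than a common word such as "the", which is the whole point of TF-IDF over raw counts.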
To dive deeper into NLP:
I think you should study embeddings in depth: word2vec, GloVe, etc.
Then cover CNN, RNN, LSTM, and GRU models. Pay particular attention to Seq2Seq (encoder-decoder) models, since the latest large language models depend heavily on the Seq2Seq mechanism.
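One quick way to build intuition for embeddings is to compare word vectors with cosine similarity. The sketch below uses three hand-written 3-dimensional vectors; they are invented for illustration and are not real word2vec or GloVe output, which typically has 50 to 300 dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors, made up for this example (not trained embeddings).
toy_embeddings = {
    "king": [0.90, 0.80, 0.10],
    "queen": [0.85, 0.82, 0.15],
    "banana": [0.10, 0.05, 0.95],
}

sim_king_queen = cosine_similarity(toy_embeddings["king"], toy_embeddings["queen"])
sim_king_banana = cosine_similarity(toy_embeddings["king"], toy_embeddings["banana"])
```

With trained embeddings the idea is the same: related words end up with vectors that point in similar directions, so their cosine similarity is closer to 1.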
- Introduction to GenAI
- Attention mechanism: self-attention, multi-head attention
- Transformer Model
- Variants of Transformer models: BERT (encoder only), GPT (decoder only), Flan-T5 (encoder-decoder)
- ChatGPT: GPT, reinforcement learning, reward model.
- Training LLMs: transfer learning, pre-training, and fine-tuning for specific tasks
- Inference: prompt engineering, in-context learning
- Model Evaluation and Benchmarking
- Parameter-efficient fine-tuning (quite effective when compute is limited)
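Since self-attention is the core of everything in the list above, here is a minimal sketch of single-head scaled dot-product self-attention in plain Python. The helper names, the tiny 3-token input, and the identity projection matrices are assumptions chosen to keep the arithmetic easy to follow; real models use learned weights, many heads, and a tensor library.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def self_attention(x, w_q, w_k, w_v):
    """Single-head attention: softmax(Q K^T / sqrt(d_k)) V."""
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    v = matmul(x, w_v)
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    # Scale each score, then normalize every row into attention weights.
    attn = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(attn, v), attn


# Three tokens with 2-dimensional embeddings (toy values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Identity projections, so Q = K = V = x (an illustrative simplification).
identity = [[1.0, 0.0], [0.0, 1.0]]
output, attn_weights = self_attention(x, identity, identity, identity)
```

Each row of `attn_weights` sums to 1 and says how much one token attends to every token in the sequence. Multi-head attention runs several such heads with different projections and concatenates their outputs.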
1. In-depth text classification - Kaggle competition: Feedback Prize - Evaluating Student Writing
2. Twitter scraping:
- Kaggle Notebook: https://www.kaggle.com/shwetagargade/tweeter-scrapping
3. Fun with NLP APIs:
- Kaggle Notebook: https://www.kaggle.com/shwetagargade/fun-nlp-with-api-s
4. Learn NLP Libraries:
- Kaggle Notebook: https://www.kaggle.com/shwetagargade/learn-nlp-libraries
5. Recommendation Systems: