Below are some text-processing steps worth considering before an NLP task. It is not necessary to do all of them; which ones you apply depends on the task at hand (see the sketch after this list):
- String Manipulation using Regex
- Tokenization
- Stemming & Lemmatization
- Removing Stopwords
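For example, a quick preprocessing pass over one sentence could look like the following sketch. NLTK is my assumption here; the list above names the steps but no particular library.

```python
# A minimal preprocessing sketch, assuming the NLTK library.
import re

import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer

# One-time downloads of the NLTK resources used below.
nltk.download("punkt")
nltk.download("stopwords")
nltk.download("wordnet")

text = "The striped bats were hanging on their feet, eating 3 bugs!"

# 1. String manipulation with regex: lowercase and strip non-letters.
cleaned = re.sub(r"[^a-z\s]", "", text.lower())

# 2. Tokenization: split the cleaned string into word tokens.
tokens = nltk.word_tokenize(cleaned)

# 3. Stopword removal: drop very common words like "the" and "on".
stop_words = set(stopwords.words("english"))
tokens = [t for t in tokens if t not in stop_words]

# 4a. Stemming: crude suffix chopping (e.g. eating -> eat).
stemmer = PorterStemmer()
stems = [stemmer.stem(t) for t in tokens]

# 4b. Lemmatization: dictionary-based base forms (e.g. feet -> foot).
lemmatizer = WordNetLemmatizer()
lemmas = [lemmatizer.lemmatize(t) for t in tokens]

print(stems)   # e.g. ['stripe', 'bat', 'hang', 'feet', 'eat', 'bug']
print(lemmas)  # e.g. ['striped', 'bat', 'hanging', 'foot', 'eating', 'bug']
```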
We need to represent language mathematically, i.e. given a corpus, you need to convert it into numerical form. This mathematical representation is called an embedding (or context) and the process is called representation learning. Why do this? Because computers understand only numbers, not text. We can do this in several ways:
- Via Sentence Embedding
- Via Word Embedding
- Via Character Embedding
- Via Subword Embedding (everyone uses this; see the sketch below)
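To make the subword idea concrete, here is a minimal sketch. The Hugging Face `transformers` library and the `bert-base-uncased` checkpoint are my assumptions; the text above names no specific tooling.

```python
# Subword tokenization sketch, assuming the Hugging Face `transformers`
# library and the `bert-base-uncased` checkpoint (WordPiece vocabulary).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# A rare word is split into frequent subword pieces, so the vocabulary
# stays small while no word is ever truly out-of-vocabulary.
print(tokenizer.tokenize("embeddings"))
# e.g. ['em', '##bed', '##ding', '##s']

# Each piece maps to an integer id: this is the numerical form that
# the model's embedding layer actually consumes.
print(tokenizer.encode("embeddings", add_special_tokens=False))
```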
- With foundation models that can perform multiple tasks, you just need to do prompting to solve a single downstream task (see the sketch below).
- But prompting often does not work well; this is called the HALLUCINATION PROBLEM. The model sometimes gives wrong answers to prompted questions (in cases where such a task was not covered during the training of the multitask foundation model).
- To solve this hallucination problem you can fine-tune the foundation models for specific tasks. More about this here
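For instance, solving a translation task purely by prompting might look like the sketch below. The Hugging Face `transformers` library and the `google/flan-t5-small` checkpoint are my assumptions; the notes above do not name any specific model or library.

```python
# Zero-shot prompting sketch, assuming the Hugging Face `transformers`
# library and the instruction-tuned `google/flan-t5-small` checkpoint.
from transformers import pipeline

generator = pipeline("text2text-generation", model="google/flan-t5-small")

# No task-specific training happens here: the downstream task
# (translation) is specified entirely inside the prompt.
prompt = "Translate English to German: How old are you?"
print(generator(prompt)[0]["generated_text"])
# e.g. 'Wie alt sind Sie?'
```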
- Once OpenAI made ChatGPT, they found that if asked about harmful activities like ‘tell me techniques to make rat poison at home’, it would answer such questions too!! If provoked, it would also use curse words / …. Hence it was lacking HUMAN ETHICS and, in the wrong hands, could lead to bigger concerns. So researchers wanted to ALIGN the LLM outputs with human preferences.
- This was called the PREFERENCE PROBLEM
- Methods to solve the preference problem are called preference alignment. There are two ways to do so:
- Fine-tuning the LLM with human preferences using Reinforcement Learning – the RLHF Algorithm
- Fine-tuning the LLM with human preferences using Supervised Learning – the DPO Algorithm (a loss sketch follows this list)
- More information available here
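To make the DPO side concrete, below is a minimal sketch of the DPO loss in PyTorch. The framework choice and all variable names are my assumptions; the summed log-probabilities of the chosen and rejected answers under the policy and under a frozen reference model are assumed to be computed elsewhere.

```python
# Minimal DPO loss sketch in PyTorch (framework is my assumption).
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp: torch.Tensor,
             policy_rejected_logp: torch.Tensor,
             ref_chosen_logp: torch.Tensor,
             ref_rejected_logp: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # Implicit rewards: how far the policy has drifted from the
    # frozen reference model on each answer.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # Push the chosen answer's reward above the rejected one's with a
    # sigmoid loss: plain supervised learning, no reward model or RL loop.
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()

# Toy usage with made-up log-probabilities for a batch of two pairs.
loss = dpo_loss(torch.tensor([-12.0, -9.5]),
                torch.tensor([-14.0, -9.0]),
                torch.tensor([-12.5, -9.8]),
                torch.tensor([-13.5, -9.2]))
print(loss)  # scalar; lower means the policy prefers the chosen answers
```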