This task involves automating the 'precheck' stage, which currently relies on a human triager to validate whether the student model already knows the information a user is trying to teach it.
Similar to the steps in a standard RAG workflow, the sentences could be converted into vectors using embeddings and then compared using metrics like cosine similarity. Based on these scores, the 'precheck' stage can be marked as either a 'success' (✅) or a 'failure' (❎).
This can be included with the precheck call to the @instructlab-bot GH bot.
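A minimal sketch of what this could look like, assuming `sentence-transformers` as the embedding library; the model name, the threshold value, and the mapping from similarity score to precheck outcome are placeholders for illustration, not a decided design:

```python
# Sketch of an embedding-based precheck: compare the answer the user wants to
# teach against the student model's current answer via cosine similarity.
# Assumptions (not from this issue): the "all-MiniLM-L6-v2" model and a 0.85
# threshold are illustrative only and would need tuning.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")

def precheck_similarity(taught_answer: str, student_answer: str) -> float:
    """Return the cosine similarity between the taught and student answers."""
    taught_vec, student_vec = embedder.encode(
        [taught_answer, student_answer], convert_to_tensor=True
    )
    return util.cos_sim(taught_vec, student_vec).item()

def precheck_already_knows(taught_answer: str, student_answer: str,
                           threshold: float = 0.85) -> bool:
    """High similarity suggests the student model already knows the information;
    how that maps to ✅ / ❎ is a policy decision left to the bot."""
    return precheck_similarity(taught_answer, student_answer) >= threshold
```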
Let me know if I am off the mark here, but this seems similar to the work I'm doing in #356; it's just that the suggested implementation is different. This issue suggests using a RAG-style vector + embedding similarity score, whereas my implementation uses a one-shot model evaluation via a natural-language prompt (with the teacher model used for precheck and the trained model used for evaluation). cc @mingxzhao, thoughts on this evaluation method?
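For comparison, roughly what the one-shot prompt approach looks like; this is only a hedged sketch, not the actual #356 code, and the prompt wording, the OpenAI-compatible client, and the `teacher-model` name are all assumptions:

```python
# Sketch of a one-shot "judge" precheck: ask the teacher model whether the
# student's answer already conveys the information being taught.
from openai import OpenAI

client = OpenAI()  # assumes an OpenAI-compatible endpoint serving the teacher model

JUDGE_PROMPT = """You are grading a student model.
Expected answer: {expected}
Student answer: {student}
Does the student answer convey the same information as the expected answer?
Reply with exactly one word: YES or NO."""

def one_shot_precheck(expected: str, student: str,
                      model: str = "teacher-model") -> bool:
    """Return True if the judge model says the student already knows the answer."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{
            "role": "user",
            "content": JUDGE_PROMPT.format(expected=expected, student=student),
        }],
        temperature=0,
    )
    return resp.choices[0].message.content.strip().upper().startswith("YES")
```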