Chain of Verification to reduce hallucinations
There are multiple ways to reduce hallucinations in LLMs, operating at different levels.
A new course launched for interview preparation
We have launched a new course series, “Interview Questions and Answers on Large Language Models (LLMs)”.
This program is designed to bridge the job-readiness gap in the global AI industry. It includes 100+ questions and answers of the kind asked at FAANG and Fortune 500 companies, plus 100+ self-assessment questions.
The course offers regular updates, self-assessment questions, community support, and a comprehensive curriculum covering everything from Prompt Engineering and LLM basics to Supervised Fine-Tuning (SFT), Deployment, Hallucination, Evaluation, and Agents.
Detailed curriculum (Get 50% off using coupon code MED50, valid for the first 10 users)
Free self-assessment on LLMs (30 MCQs in 30 mins)
Using Prompt Engineering:
Refer to our past blogs on how you can reduce hallucinations with prompting:
Understand Temperature parameter:
Tree of Thoughts prompting:
6 ways to reduce hallucinations:
Using the Model:
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models [we will cover this in a future blog]
Self-check methods like Chain of Verification (CoVe)
CoVe is a self-checking mechanism: by planning verification questions and answering them independently, the model can correct its own mistakes and produce more accurate and reliable responses.
The research paper shows that CoVe helps reduce hallucinations across various tasks, including list-based questions, closed-book QA, and long-form text generation. It leverages the model’s ability to reason about and fact-check its own responses to improve correctness. The paper also discusses the challenges posed by exposure bias, especially in long-form tasks, and how CoVe addresses them by improving the verification of facts.
Overall, CoVe is presented as an approach that encourages language models to think critically about their responses, fact-check themselves, and ultimately produce more accurate and reliable information. It shows promising results in reducing the generation of incorrect information, which is a significant concern in natural language processing and AI.
4-Step Process
- Generate Baseline Response: Given a query, generate the response using the LLM.
- Plan Verifications: Given both query and baseline response, generate a list of verification questions that could help to self-analyze if there are any mistakes in the original response.
- Execute Verifications: Answer each verification question in turn, and check each answer against the original response for inconsistencies or mistakes.
- Generate Final Verified Response: Given the discovered inconsistencies (if any), generate a revised response that incorporates the verification results (a code skeleton of the full loop follows this list).
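To make the flow concrete, here is a minimal Python skeleton of the whole loop. The paper defines prompt templates rather than code, so every function name here is our own illustrative helper; each one is sketched under the corresponding step below.

```python
def chain_of_verification(query: str) -> str:
    """Illustrative CoVe loop; the four helpers are sketched in the step sections below."""
    baseline = generate_baseline(query)                        # Step 1
    questions = plan_verifications(query, baseline)            # Step 2
    answers = execute_verifications(questions)                 # Step 3 (factored variant)
    return generate_final_response(query, baseline, answers)   # Step 4
```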
Step 1 # Generate Baseline Response
Given a query, the language model first generates an initial response without any special techniques. This baseline response may contain hallucinations and errors, which the later steps aim to detect and correct.
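A sketch of Step 1, assuming a generic `llm(prompt)` helper that wraps whatever chat or completion API you use; the prompt wording is ours, not the paper’s exact template.

```python
def llm(prompt: str) -> str:
    """Placeholder for a call to your LLM of choice (e.g. a chat-completion API)."""
    raise NotImplementedError

def generate_baseline(query: str) -> str:
    # Step 1: answer the query directly, with no special prompting.
    # This draft may contain hallucinations; the later steps try to catch them.
    return llm(f"Answer the following question in a short paragraph.\n\nQuestion: {query}")
```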
Step 2 # Verification Questions
- Purpose: After generating the baseline response, the goal is to identify and correct potential errors or hallucinations in the response.
- Method: The model is prompted to generate a series of verification questions based on both the query and the baseline response. These questions are designed to test the factual accuracy of the original response (a prompt sketch follows this list).
- Example: If the baseline response contains the statement “The Mexican–American War was an armed conflict between the United States and Mexico from 1846 to 1848,” a verification question might be, “When did the Mexican–American War start and end?”
- Flexibility: Verification questions can be generated in any form, and they don’t need to closely match the original text.
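Continuing the sketch above, Step 2 asks the model to write its own fact-checking questions from the query and the draft answer (again, the prompt wording is illustrative):

```python
def plan_verifications(query: str, baseline: str) -> list[str]:
    # Step 2: generate verification questions that probe the facts in the draft.
    plan = llm(
        "Here is a question and a draft answer. Write verification questions, one per "
        "line, that a fact-checker could use to test each claim in the draft.\n\n"
        f"Question: {query}\nDraft answer: {baseline}"
    )
    # For the example above, a planned question might be:
    # "When did the Mexican-American War start and end?"
    return [line.strip() for line in plan.splitlines() if line.strip()]
```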
Step 3 # Answering Verification Questions
- Purpose: This step involves answering the verification questions generated in the previous step to assess if there are any hallucinations or inconsistencies in the baseline response.
- Method: There are several variants of verification execution, including “Joint”, “2-Step”, “Factored”, and “Factor+Revise” (the Factored variant is sketched in code after this list):
- Joint: Verification questions and their answers are produced in a single LLM prompt.
- 2-Step: The planning and execution of verification questions are separated into two steps with separate LLM prompts.
- Factored: All verification questions are answered independently as separate prompts, without including the original baseline response.
- Factor+Revise: After answering verification questions, an additional step explicitly cross-checks if the answers indicate inconsistencies with the original response.
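Here is a sketch of the Factored variant, where each question is answered in its own prompt and the baseline response is never shown, so its mistakes cannot leak into the verification answers. The Joint variant would instead pack planning and answering into a single prompt.

```python
def execute_verifications(questions: list[str]) -> list[tuple[str, str]]:
    # Step 3 ("Factored"): answer every verification question independently,
    # without including the baseline response in the prompt.
    return [(q, llm(f"Answer concisely and factually: {q}")) for q in questions]
```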
Step 4 # Final Verified Response
- Purpose: In this final step, an improved response is generated, taking into account the results of the verification process.
- Method: A few-shot prompt is used where the context includes all the previous steps, including the baseline response and verification question-answer pairs. If the “Factor+Revise” approach was used, the output of the cross-check inconsistency detection is provided as well.
The goal is to generate a response that corrects any errors or inconsistencies found during the verification process.
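And a sketch of Step 4: the revision prompt sees the query, the draft answer, and the verification question-answer pairs, and is asked to produce a corrected response.

```python
def generate_final_response(query: str, baseline: str, answers: list[tuple[str, str]]) -> str:
    # Step 4: revise the draft so it agrees with the verified facts.
    qa_pairs = "\n".join(f"Q: {q}\nA: {a}" for q, a in answers)
    return llm(
        "Rewrite the draft answer below, correcting any claim that the "
        "verification answers contradict.\n\n"
        f"Question: {query}\nDraft answer: {baseline}\nVerification Q&A:\n{qa_pairs}"
    )
```

Calling `chain_of_verification("Tell me about the Mexican-American War.")` would then run all four steps end to end.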
All Steps
To summarize, here is a visual overview of all the steps.
Ready to level up your AI knowledge? Don’t forget to like, share, and subscribe to our channel for more exciting content on mastering Large Language Models like ChatGPT!
🔗 Connect with us:
YouTube
Medium
https://www.linkedin.com/company/mastering-llm-large-language-model
Stay tuned for more AI adventures! 🚀✨