---
type: rule
tips: ""
title: Do you handle AI Hallucinations the right way?
seoDescription: AI hallucinations are inevitable, but with the right techniques, you can minimize their occurrence and impact.
uri: avoid-ai-hallucinations
authors:
  - title: Eddie Kranz
    url: https://www.ssw.com.au/people/eddie-kranz/
related:
  - rules-to-better-chatgpt-prompt-engineering
created: 2025-03-17T14:57:00.000Z
guid: e4e963e4-1568-4e47-b184-d2e96bc0f124
---
AI is a powerful tool. However, it sometimes simply makes things up, aka hallucinates. While hallucinating in your spare time might be pretty cool, it is very bad for business!

AI hallucinations are inevitable, but with the right techniques, you can minimize their occurrence and impact. Learn how SSW tackles this challenge using proven methods like clean data tagging, multi-step prompting, and validation workflows.

<!--endintro-->
| 20 | + |
**Let's face it. AI will always hallucinate.**

AI models like GPT-4 are powerful but imperfect. They generate plausible-sounding but incorrect or nonsensical outputs (hallucinations) due to training limitations, ambiguous prompts, or flawed data retrieval. While you can’t eliminate hallucinations entirely, you can **reduce their frequency** and **mitigate risks**.

---
| 26 | + |
## Use Clean, Tagged Data for RAG

❌ Bad: Untagged data in a RAG system

```python
documents = ["Sales grew 10% in 2023", "Server downtime: 5hrs in Q2"]
```
| 34 | + |
::: greybox
**Query:** "What was the server uptime in Q2?"
**Hallucination:** "Server uptime was 95%." (Wrong: No uptime data exists!)
:::
::: bad
Figure: Bad example - Untagged, messy data leads to garbage outputs
:::
| 42 | + |
✅ Good: Properly tagged data

```python
documents = [
    {"text": "Sales grew 10% in 2023", "tags": ["finance", "sales"]},
    {"text": "Server downtime: 5hrs in Q2", "tags": ["IT", "downtime"]}
]
```
| 51 | + |
::: greybox
**Query:** "What was the server uptime in Q2?"
**Output:** "No uptime data found. Available data: 5hrs downtime." ✅
:::
::: good
Figure: Good example - Properly tagged data reduces the risk of incorrect retrieval
:::

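The tagging approach above can be sketched as a filter step that runs before retrieval. This is a minimal illustration, not a full RAG pipeline: the `retrieve` helper is hypothetical, and a real system would combine tag filtering with semantic search.

```python
# Minimal sketch: filter tagged documents before retrieval so the model
# only ever sees records relevant to the query's topic.
documents = [
    {"text": "Sales grew 10% in 2023", "tags": ["finance", "sales"]},
    {"text": "Server downtime: 5hrs in Q2", "tags": ["IT", "downtime"]},
]

def retrieve(query_tags, docs):
    """Return only documents whose tags overlap the query's tags."""
    return [d["text"] for d in docs if set(d["tags"]) & set(query_tags)]

# A question about server uptime maps to IT-related tags, so the model
# receives only the downtime record - and can honestly report that no
# uptime data exists instead of inventing a number.
context = retrieve(["IT", "uptime"], documents)
print(context)  # ['Server downtime: 5hrs in Q2']
```

Because the model never sees the finance record for an IT question, it has less irrelevant material to hallucinate from.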
## Break Workflows into Multi-Step Prompts

Use a **chain-of-thought** approach to split tasks into smaller, validated steps.

::: greybox
**User:** "Write a blog about quantum computing benefits for SMEs."
**AI:** (Hallucinates fictional case studies and stats)
:::
::: bad
Figure: Bad example - A single-step prompt invites hallucinations
:::

::: greybox
**User:** "Generate a blog draft about quantum computing for SMEs."\
"Verify all claims in this draft against trusted sources."\
"Compare the final draft to the original query. Did you answer the question?"
:::
::: good
Figure: Good example - Multi-step validation reduces errors
:::

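The three prompts above can be wired together in code. A minimal sketch, assuming an `ask` function that stands in for your real LLM call (OpenAI, Azure OpenAI, or any other client):

```python
# Minimal sketch of a multi-step ("chain-of-thought") workflow.
# `ask` is a placeholder for your real LLM call - inject whichever
# client you actually use.
def write_verified_blog(topic, ask):
    # Step 1: draft
    draft = ask(f"Generate a blog draft about {topic}.")
    # Step 2: verify claims against trusted sources
    checked = ask(f"Verify all claims in this draft against trusted sources:\n\n{draft}")
    # Step 3: validate the result against the original request
    return ask(
        f"Compare the draft below to the original request '{topic}'. "
        f"Did it answer the question? Revise if needed.\n\n{checked}"
    )

# Demo with a stub model, so the flow is visible without an API key
calls = []
def stub_ask(prompt):
    calls.append(prompt)
    return f"[step {len(calls)} output]"

result = write_verified_blog("quantum computing benefits for SMEs", stub_ask)
print(len(calls))  # 3 - three small, checkable steps instead of one big prompt
```

Each intermediate output can be logged or reviewed, which is what makes the hallucinations catchable.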
## Force the AI to Justify Its Reasoning

Always prompt the AI to **cite sources** and **flag uncertainty**.

::: greybox
**User:** "Why should SMEs adopt quantum computing?"
**AI:** "It boosts efficiency by 200%." (Source? None!)
:::
::: bad
Figure: Bad example - No justification = unchecked errors
:::

::: greybox
**System Prompt:** "Answer the question and cite sources. If uncertain, say 'I don’t know'."
**User:** "Why should SMEs adopt quantum computing?"
**AI:** "Quantum computing can optimize logistics (Source: IBM, 2023). However, adoption costs may be prohibitive for SMEs."
:::
::: good
Figure: Good example - Require citations and self-reflection
:::

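In code, this is just a system message placed ahead of the user's question. A small sketch using the chat-style message list common to most chat-completion APIs (adapt the shape to your client):

```python
# Sketch of a citation-forcing system prompt in a chat-style message list.
def build_messages(question):
    return [
        {
            "role": "system",
            "content": (
                "Answer the question and cite sources. "
                "If uncertain, say 'I don't know'."
            ),
        },
        {"role": "user", "content": question},
    ]

messages = build_messages("Why should SMEs adopt quantum computing?")
```

Keeping the instruction in the system role means it applies to every turn of the conversation, not just the first answer.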
## Validate Outputs Against the Original Question

Use a **validation layer** to ensure outputs align with the original query.

::: greybox
**User:** "How does Azure Kubernetes Service (AKS) simplify deployment?"
**AI:** Explains Kubernetes basics (ignores AKS specifics).
:::
::: bad
Figure: Bad example - No final check = off-topic answers
:::

::: greybox
**System Prompt:** "Compare your answer to the user’s question. Did you address AKS?"
**AI:** "Revised answer: AKS simplifies deployment by integrating with Azure DevOps and..." ✅
:::
::: good
Figure: Good example - Add a final validation step
:::

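A validation layer like this can be one extra round trip. A hedged sketch, where `ask` again stands in for your real LLM call and the YES/NO check is a deliberately simple stand-in for whatever verdict format you use:

```python
# Sketch of a validation layer: after answering, the model (or a second,
# cheaper model) checks the answer against the original question.
# `ask` is a placeholder for your real LLM call.
def validate_answer(question, answer, ask):
    verdict = ask(
        f"Question: {question}\nAnswer: {answer}\n"
        "Does the answer directly address the question? Reply YES or NO."
    )
    if verdict.strip().upper().startswith("NO"):
        # Off-topic answer detected - request a revision
        return ask(f"Revise the answer so it directly addresses: {question}\n{answer}")
    return answer
```

Running the check as a separate prompt keeps it cheap, and the YES/NO verdict gives you a hook for logging how often answers drift off-topic.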
## Other techniques to minimize hallucinations

* **Lower temperature settings**: Reduce creativity (e.g., `temperature=0.3`) for factual tasks
* **Human-in-the-loop**: Flag low-confidence responses for manual review
* **Predefined constraints**: Example: "Do not speculate beyond the provided data"

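These techniques can live together in one request. A hedged sketch: the `model`/`temperature`/`messages` field names follow common chat-completion APIs, and the `needs_human_review` helper with its 0.7 threshold is an illustrative assumption, not a standard:

```python
# Hedged sketch combining the techniques above in a single request config.
factual_request = {
    "model": "gpt-4",
    "temperature": 0.3,  # low temperature = less creative, more deterministic
    "messages": [
        # Predefined constraint baked into the system role
        {"role": "system", "content": "Do not speculate beyond the provided data."},
        {"role": "user", "content": "Summarize Q2 server downtime."},
    ],
}

def needs_human_review(confidence, threshold=0.7):
    """Human-in-the-loop: flag low-confidence responses for manual review."""
    return confidence < threshold
```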
---

AI hallucinations are unavoidable, but SSW’s proven techniques, like clean data tagging, multi-step validation, and forced justification, can keep them in check. By designing workflows that **anticipate errors** and **validate outputs**, you turn a risky limitation into a manageable challenge.

Always assume hallucinations **will** happen, so build systems to catch them!