393 changes: 393 additions & 0 deletions docs/notebooks/TruthTorchLM_demo.ipynb
@@ -0,0 +1,393 @@
{
"cells": [
{
"metadata": {},
"cell_type": "markdown",
"source": [
"# TruthTorchLM — Truthfulness Checks and Long‑form Generation Demo\n",
"\n",
"This notebook demonstrates two things with the `TruthTorchLM` package:\n",
"\n",
"1. Multiple‑LLM truthfulness checking for factual claims.\n",
"2. Long‑form generation with a truth value/score.\n",
"\n",
"Notes:\n",
    "- This notebook assumes the package is already installed. If not, uncomment the install cell below.\n",
    "- You will likely need API keys for whichever LLM backends the library uses (e.g., OpenAI or Anthropic). Set them as environment variables before running.\n",
    "- Function names in the package may differ slightly between versions; the notebook uses light introspection to locate functions that exist under closely related names.\n"
],
"id": "2ba1f40179361b70"
},
{
"metadata": {},
"cell_type": "code",
"source": [
"# If the package isn't installed in your environment, uncomment and run:\n",
"#!pip install -U TruthTorchLM\n"
],
"id": "4f2eefa20af5c2c1",
"outputs": [],
"execution_count": null
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Imports and package discovery\n",
"id": "799d20520d95d57d"
},
{
"metadata": {},
"cell_type": "code",
"outputs": [],
"execution_count": null,
"source": [
"import importlib, importlib.util, inspect, os, re, types, json, textwrap\n",
"\n",
"# Keep discovery simple and targeted to the intended package names\n",
"pkg_candidates = [\n",
" 'TruthTorchLM',\n",
" 'truthtorchlm',\n",
"]\n",
"\n",
"ttlm = None\n",
"PKG_NAME = None\n",
"for name in pkg_candidates:\n",
" try:\n",
" if importlib.util.find_spec(name) is not None:\n",
" ttlm = importlib.import_module(name)\n",
" PKG_NAME = name\n",
" break\n",
" except Exception:\n",
" pass\n",
"\n",
"if ttlm is None:\n",
" raise ImportError(\n",
" \"TruthTorchLM package not found in this kernel.\\n\"\n",
" \"Try installing it in this exact kernel/env:\\n\"\n",
" \" pip install -U TruthTorchLM\\n\"\n",
" \"Or directly from GitHub if needed:\\n\"\n",
" \" pip install -U git+https://github.com/Ybakman/TruthTorchLM\\n\"\n",
" \"Also ensure your Jupyter kernel uses the same virtualenv where you installed it.\"\n",
" )\n",
"\n",
"print(f'Using package: {PKG_NAME}')\n",
"attrs = [a for a in dir(ttlm) if not a.startswith('_')]\n",
"print('Available attributes (filtered):')\n",
"print([a for a in attrs if re.search(r'(truth|long|gen|check|verify)', a, re.I)])\n",
"\n",
    "# Report the package version if it is exposed\n",
    "version = getattr(ttlm, '__version__', None)\n",
    "print('TruthTorchLM version:', version or 'unknown')\n"
],
"id": "9b83b23715b4f0c1"
},
{
"metadata": {},
"cell_type": "markdown",
"source": [
"## Configure LLM providers (environment variables)\n",
"\n",
"Set API keys as environment variables before running, for example:\n",
"\n",
"- `OPENAI_API_KEY`\n",
"- `ANTHROPIC_API_KEY`\n",
"- `GOOGLE_API_KEY` (Gemini)\n",
"\n",
"Only the providers supported by `TruthTorchLM` are needed. If the library exposes its own client configuration, feel free to adapt this cell accordingly.\n"
],
"id": "769347ee58028580"
},
{
"metadata": {},
"cell_type": "code",
"outputs": [],
"execution_count": null,
"source": [
"providers_present = {\n",
" 'OPENAI_API_KEY': bool(os.getenv('OPENAI_API_KEY')),\n",
" 'ANTHROPIC_API_KEY': bool(os.getenv('ANTHROPIC_API_KEY')),\n",
" 'GOOGLE_API_KEY': bool(os.getenv('GOOGLE_API_KEY')),\n",
"}\n",
"print('Provider keys present:', providers_present)\n",
"CAN_CALL = any(providers_present.values())\n",
"\n",
"if not CAN_CALL:\n",
" print(textwrap.dedent('''\n",
" No provider API keys detected. The demo will skip network calls.\n",
" To enable full execution, set one or more of the following environment variables and re-run this notebook:\n",
" - OPENAI_API_KEY\n",
" - ANTHROPIC_API_KEY\n",
" - GOOGLE_API_KEY\n",
" '''))\n"
],
"id": "250f9758df58ea4c"
},
{
"metadata": {},
"cell_type": "markdown",
"source": [
"## Helper: locate TruthTorchLM functions by name\n",
"\n",
"The notebook tries to find closely named functions if the exact names differ.\n"
],
"id": "83f73cf983b9e16a"
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-10-29T05:46:13.341191Z",
"start_time": "2025-10-29T05:46:13.337833Z"
}
},
"cell_type": "code",
"source": [
"from typing import Callable, Optional, Tuple, Iterable\n",
"\n",
"def find_callable(module, candidates: Iterable[str]) -> Tuple[Optional[str], Optional[Callable]]:\n",
" # Exact match first\n",
" for name in candidates:\n",
" fn = getattr(module, name, None)\n",
" if callable(fn):\n",
" return name, fn\n",
" # Fuzzy search fallbacks\n",
" lowered = [c.lower() for c in candidates]\n",
" for a in dir(module):\n",
" al = a.lower()\n",
" if any(c in al for c in lowered):\n",
" obj = getattr(module, a)\n",
" if callable(obj):\n",
" return a, obj\n",
" return None, None\n",
"\n",
"def describe_signature(fn):\n",
" try:\n",
" return str(inspect.signature(fn))\n",
" except Exception:\n",
" return '(unknown signature)'\n"
],
"id": "a0ea179f462330e3",
"outputs": [],
"execution_count": 6
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## 1) Multiple‑LLM truthfulness check\n",
"id": "c224e957383e16f4"
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-10-29T05:47:17.402385Z",
"start_time": "2025-10-29T05:47:17.394053Z"
}
},
"cell_type": "code",
"source": [
"# Candidate function names for multi‑LLM truthfulness checking\n",
"multi_truth_candidates = [\n",
" 'multiple_llm_truthfulness_check',\n",
" 'multi_llm_truthfulness_check',\n",
" 'multiple_llm_based_truthfulness_check',\n",
" 'truthfulness_check_multiple_llm',\n",
" 'truthfulness_across_llms',\n",
" 'check_truthfulness_multi_llm',\n",
" 'check_claim_truth_multi_llm',\n",
" 'check_truthfulness_across_models',\n",
"]\n",
"\n",
"multi_truth_name, multi_truth_fn = find_callable(ttlm, multi_truth_candidates)\n",
"if not multi_truth_fn:\n",
" print('Could not find a multi‑LLM truthfulness function by the expected names.')\n",
" print('Available attributes that look relevant:')\n",
" possible = [a for a in dir(ttlm) if re.search(r'truth|check|verify', a, re.I)]\n",
" print(possible)\n",
"else:\n",
" print(f'Using function: {multi_truth_name}{describe_signature(multi_truth_fn)}')\n",
"\n",
"# Example claims to evaluate\n",
"claims = [\n",
" 'The Eiffel Tower is in Berlin.', # false\n",
" 'Water boils at 100 degrees Celsius at standard atmospheric pressure.', # true\n",
"]\n",
"\n",
"# Example list of model names to consult (adjust to what's supported in your environment)\n",
"llm_models = [\n",
" 'gpt-4o-mini',\n",
" 'gpt-4o',\n",
"]\n",
"\n",
"results_multi = []\n",
"if multi_truth_fn and CAN_CALL:\n",
" for claim in claims:\n",
" # Try to adaptively call the function based on its signature\n",
" sig = None\n",
" try:\n",
" sig = inspect.signature(multi_truth_fn)\n",
" except Exception:\n",
" pass\n",
" kwargs = {}\n",
" if sig:\n",
" for pname in sig.parameters:\n",
" p = pname.lower()\n",
" if 'claim' in p or 'statement' in p or 'text' in p or 'prompt' in p or 'question' in p:\n",
" kwargs[pname] = claim\n",
" elif 'llm' in p or 'model' in p:\n",
" kwargs[pname] = llm_models\n",
" elif 'provider' in p:\n",
" # provide empty or inferred providers list if needed\n",
" kwargs[pname] = None\n",
" try:\n",
" res = multi_truth_fn(**kwargs) if kwargs else multi_truth_fn(claim)\n",
" results_multi.append({'claim': claim, 'result': res})\n",
" except TypeError as te:\n",
" print('Signature mismatch while calling', multi_truth_name, 'with kwargs', kwargs)\n",
" print(te)\n",
" except Exception as e:\n",
" print('Error calling', multi_truth_name, '->', e)\n",
"else:\n",
" if not CAN_CALL:\n",
" print('Skipping multi‑LLM truthfulness checks due to missing provider API keys.')\n"
],
"id": "5fdd093a4471f8b3",
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Could not find a multi‑LLM truthfulness function by the expected names.\n",
"Available attributes that look relevant:\n",
"['GOOGLE_CHECK_QUERY_SYSTEM_PROMPT', 'GOOGLE_CHECK_QUERY_USER_PROMPT', 'GOOGLE_CHECK_VERIFICATION_SYSTEM_PROMPT', 'GOOGLE_CHECK_VERIFICATION_USER_PROMPT', 'TruthMethod', 'calibrate_truth_method', 'evaluate_truth_method', 'generate_with_truth_value', 'truth_methods']\n",
"Skipping multi‑LLM truthfulness checks due to missing provider API keys.\n"
]
}
],
"execution_count": 7
},
{
"metadata": {},
"cell_type": "markdown",
"source": [
"## 2) Long‑form generation with truth value/score\n",
"\n",
"Some variants name this function `long_form_generation_with_truth_value`. The notebook will look for that or similarly named functions.\n"
],
"id": "25d221230be9e555"
},
{
"metadata": {},
"cell_type": "code",
"source": [
"long_form_candidates = [\n",
" 'long_form_generation_with_truth_value',\n",
" 'long_form_generation_with_truthfulness',\n",
" 'generate_long_form_with_truth',\n",
" 'long_form_generate_with_truth_value',\n",
" 'longform_generation_with_truth_value',\n",
"]\n",
"\n",
"long_form_name, long_form_fn = find_callable(ttlm, long_form_candidates)\n",
"if not long_form_fn:\n",
" print('Could not find the long‑form generation with truth value function by the expected names.')\n",
" print('Available attributes that look relevant:')\n",
" possible = [a for a in dir(ttlm) if re.search(r'long|truth|gen|compose|article', a, re.I)]\n",
" print(possible)\n",
"else:\n",
" print(f'Using function: {long_form_name}{describe_signature(long_form_fn)}')\n",
"\n",
"prompt = (\n",
" 'Explain the history of the Eiffel Tower in about 150-250 words, and include factual references. '\n",
" 'Return both the generated text and a truthfulness score if supported.'\n",
")\n",
"\n",
"long_form_output = None\n",
"if long_form_fn and CAN_CALL:\n",
" sig = None\n",
" try:\n",
" sig = inspect.signature(long_form_fn)\n",
" except Exception:\n",
" pass\n",
" kwargs = {}\n",
" if sig:\n",
" for pname in sig.parameters:\n",
" p = pname.lower()\n",
" if 'prompt' in p or 'question' in p or 'topic' in p or 'query' in p or 'instruction' in p or 'text' in p:\n",
" kwargs[pname] = prompt\n",
" elif 'llm' in p or 'model' in p:\n",
" kwargs[pname] = 'gpt-4o-mini'\n",
" elif 'max_tokens' in p or 'length' in p or 'words' in p:\n",
" kwargs[pname] = 300\n",
" elif 'return' in p and 'score' in p:\n",
" kwargs[pname] = True\n",
" try:\n",
" long_form_output = long_form_fn(**kwargs) if kwargs else long_form_fn(prompt)\n",
" except TypeError as te:\n",
" print('Signature mismatch while calling', long_form_name, 'with kwargs', kwargs)\n",
" print(te)\n",
" except Exception as e:\n",
" print('Error calling', long_form_name, '->', e)\n",
"else:\n",
" if not CAN_CALL:\n",
" print('Skipping long‑form generation due to missing provider API keys.')\n"
],
"id": "7c7a436a12a76820",
"outputs": [],
"execution_count": null
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Pretty‑print results (if structures are returned)\n",
"id": "8f61c2c27b7ebe0a"
},
{
"metadata": {},
"cell_type": "code",
"source": [
"\n",
"def to_jsonable(obj):\n",
" try:\n",
" json.dumps(obj)\n",
" return obj\n",
" except Exception:\n",
" if isinstance(obj, dict):\n",
" return {k: to_jsonable(v) for k, v in obj.items()}\n",
" elif isinstance(obj, (list, tuple, set)):\n",
" return [to_jsonable(x) for x in obj]\n",
" else:\n",
" return str(obj)\n",
"\n",
"if 'results_multi' in globals() and results_multi:\n",
" print('Multi‑LLM truthfulness results:')\n",
" print(json.dumps(to_jsonable(results_multi), indent=2))\n",
"\n",
"if 'long_form_output' in globals() and long_form_output is not None:\n",
" print('\\nLong‑form generation output:')\n",
" print(json.dumps(to_jsonable(long_form_output), indent=2))\n"
],
"id": "e551b5a75e0d622f",
"outputs": [],
"execution_count": null
},
{
"metadata": {},
"cell_type": "markdown",
"source": [
"## Tips\n",
"- If your package exposes classes instead of top‑level functions (e.g., a `TruthTorch` or `Verifier` class), instantiate it and re‑run the introspection for methods on that instance.\n",
"- If a function requires specific provider objects rather than model names, consult the library README and pass those objects instead of strings.\n",
"- For reproducibility, consider setting a seed if the library offers one.\n"
],
"id": "1204cc2c1024c02b"
}
],
"metadata": {
"kernelspec": {
"name": "python3",
"language": "python",
"display_name": "Python 3 (ipykernel)"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
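The notebook's core trick is calling a discovered function whose exact signature is unknown: it inspects parameter names and binds arguments by keyword heuristics. That logic is easy to exercise outside Jupyter. The sketch below mirrors the notebook's heuristic with an illustrative stand-in function (`fake_check` and its signature are assumptions for the demo, not part of TruthTorchLM's API):

```python
import inspect
from typing import Any, Callable, Dict


def adaptive_call(fn: Callable, claim: str, models) -> Any:
    """Bind arguments by matching parameter-name keywords, mirroring the
    notebook's heuristic for calling truthfulness functions whose exact
    signatures are unknown ahead of time."""
    try:
        sig = inspect.signature(fn)
    except (TypeError, ValueError):
        return fn(claim)  # opaque callable: fall back to a positional call
    kwargs: Dict[str, Any] = {}
    for pname in sig.parameters:
        p = pname.lower()
        if any(k in p for k in ('claim', 'statement', 'text', 'prompt', 'question')):
            kwargs[pname] = claim
        elif 'llm' in p or 'model' in p:
            kwargs[pname] = models
    return fn(**kwargs) if kwargs else fn(claim)


# Illustrative stand-in for a truthfulness function with a compatible signature
def fake_check(statement, models):
    return {'statement': statement, 'models': models, 'score': 0.5}


result = adaptive_call(fake_check, 'Water boils at 100 C at 1 atm.', ['gpt-4o-mini'])
print(result['statement'])  # → Water boils at 100 C at 1 atm.
```

Because the binding is purely name-based, a renamed parameter (say, `assertion` instead of `claim`) silently falls through to the positional fallback, which is why the notebook also prints the matched signature before calling.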