Skip to content
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 7 additions & 1 deletion bin/oracle-cli.ts
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ if (process.argv[2] === "oracle-mcp") {
}
import { resolveEngine, type EngineMode, defaultWaitPreference } from "../src/cli/engine.js";
import { shouldRequirePrompt } from "../src/cli/promptRequirement.js";
import { resolveDashPrompt } from "../src/cli/stdin.js";
import chalk from "chalk";
import type { SessionMetadata, SessionMode, BrowserSessionConfig } from "../src/sessionStore.js";
import { sessionStore, pruneOldSessions } from "../src/sessionStore.js";
Expand Down Expand Up @@ -216,7 +217,7 @@ program.hook("preAction", () => {
introPrinted = true;
});
applyHelpStyling(program, VERSION, isTty);
program.hook("preAction", (thisCommand) => {
program.hook("preAction", async (thisCommand) => {
if (thisCommand !== program) {
return;
}
Expand All @@ -234,6 +235,11 @@ program.hook("preAction", (thisCommand) => {
opts.prompt = positional;
thisCommand.setOptionValue("prompt", positional);
}
const resolvedPrompt = await resolveDashPrompt(opts.prompt);
if (resolvedPrompt !== opts.prompt) {
opts.prompt = resolvedPrompt;
thisCommand.setOptionValue("prompt", resolvedPrompt);
}
if (shouldRequirePrompt(userCliArgs, opts)) {
console.log(
chalk.yellow('Prompt is required. Provide it via --prompt "<text>" or positional [prompt].'),
Expand Down
2 changes: 1 addition & 1 deletion docs/browser-mode.md
Original file line number Diff line number Diff line change
Expand Up @@ -75,7 +75,7 @@ You can pass the same payload inline (`--browser-inline-cookies '<json or base64
- `--browser-inline-files`: alias for `--browser-attachments never` (forces inline paste; never uploads attachments).
- `--browser-bundle-files`: bundle all resolved attachments into a single temp file before uploading (only used when uploads are enabled/selected).
- sqlite bindings: automatic rebuilds now require `ORACLE_ALLOW_SQLITE_REBUILD=1`. Without it, the CLI logs instructions instead of running `pnpm rebuild` on your behalf.
- `--model`: the same flag used for API runs is accepted, but the ChatGPT automation path supports GPT-5.4 and GPT-5.2 variants. Use `gpt-5.4-pro`, `gpt-5.4`, `gpt-5.2`, `gpt-5.2-thinking`, `gpt-5.2-instant`, or `gpt-5.2-pro`. Legacy Pro aliases still resolve to the latest Pro picker target.
- `--model`: the same flag used for API runs is accepted, but the ChatGPT automation path supports GPT-5.4 and GPT-5.2 variants. Use `gpt-5.4-pro`, `gpt-5.4`, `gpt-5.2`, `gpt-5.2-thinking`, `gpt-5.2-instant`, or `gpt-5.2-pro`. Any ChatGPT Pro alias resolves to the current Pro picker target, so versioned Pro labels may briefly appear in the UI but settle on the single available Pro entry.
- Cookie sync is mandatory—if we can’t copy cookies from Chrome, the run exits early. Use the hidden `--browser-allow-cookie-errors` flag only when you’re intentionally running logged out (it skips the early exit but still warns).
- Experimental cookie controls (hidden flags/env):
- `--browser-cookie-names <comma-list>` or `ORACLE_BROWSER_COOKIE_NAMES`: allowlist which cookies to sync. Useful for “only NextAuth/Cloudflare, drop the rest.”
Expand Down
4 changes: 2 additions & 2 deletions docs/debug/remote-chrome.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,7 @@ Outcome:
- After wiring fix, logs showed “Routing browser automation to remote host …” but requests failed with:
- `ECONNREFUSED 192.168.64.2:49810` when no service listening.
- `busy` when a previous service process was still bound.
- Later run reached remote path but failed model switch: `Unable to find model option matching "GPT-5.2 Pro"` (remote Chrome not logged into ChatGPT / model picker mismatch).
- Later run reached remote path but failed model switch: `Unable to find model option matching "GPT-5.4 Pro"` (remote Chrome not logged into ChatGPT / model picker mismatch).
- After disabling cookie shipping and requiring host login, remote runs now fail earlier: service logs “Loading ChatGPT cookies from host Chrome profile…” then reports `Unhandled promise rejection ... Unknown error` when `loadChromeCookies` runs on the VM. Remote client sees `socket hang up` because the server doesn’t deliver a result.

### 2) Remote service on VM
Expand All @@ -52,7 +52,7 @@ Actions taken on VM (tmux `vmssh`):

- Environment PATH: bun not on PATH for non-interactive shells caused `./runner` to fail; need to `export PATH="$HOME/.bun/bin:$PATH"` before starting service.
- Port collisions: prior listeners on 49810 caused ECONNREFUSED/busy.
- Remote model switch failed: remote Chrome likely not signed into ChatGPT; model picker couldn’t find “GPT-5.2 Pro”.
- Remote model switch failed: remote Chrome likely not signed into ChatGPT; model picker couldn’t find the current Pro picker target (“GPT-5.4 Pro” at the time).
- Keychain/cookie read now failing on VM: `loadChromeCookies` throws “Unknown error” when invoked from the server process (Node 25, SSH shell). When `oracle serve` runs from GUI Terminal it starts fine; under nohup/SSH it logs the rejection and remote runs hang.
- New behavior (post-fix): `oracle serve` exits early if it cannot load host ChatGPT cookies after opening chatgpt.com for login; sign in on the host and restart the service.

Expand Down
42 changes: 28 additions & 14 deletions docs/manual-tests.md
Original file line number Diff line number Diff line change
Expand Up @@ -141,23 +141,37 @@ Document results (pass/fail, session IDs) in PR descriptions so reviewers can au

Run these four smoke tests whenever we touch browser automation:

1. **GPT-5.2 simple prompt**
`pnpm run oracle -- --engine browser --model "GPT-5.2" --prompt "Give me two short markdown bullet points about tables"`
Expect two markdown bullets, no files/search referenced. Note the session ID (e.g., `give-me-two-short-markdown`).
Fast-path note:
- Tests 1-4 below are quick browser-path checks only. They use `gpt-5.2-instant`, which currently targets the ChatGPT Instant 5.3 picker. They are not a substitute for Pro validation.

2. **GPT-5.2 simple prompt**
`pnpm run oracle -- --engine browser --model gpt-5.2 --prompt "List two reasons Markdown is handy"`
Confirm the answer arrives (and only once) even if it takes ~2–3 minutes.
1. **Fast browser simple prompt**
`pnpm run oracle -- --engine browser --model gpt-5.2-instant --prompt "Return exactly one line and nothing else: pro-ok"`
Expect the answer body to contain `pro-ok` verbatim on its own line. Note the session ID.

3. **GPT-5.2 + attachment**
2. **Fast browser exact-line prompt**
`pnpm run oracle -- --engine browser --model gpt-5.2-instant --prompt "Return exactly these three lines and nothing else:\n\`\`\`js\nconsole.log('thinking-ok')\n\`\`\`"`
Confirm the answer includes the fenced `js` code block and `console.log('thinking-ok')` verbatim.

3. **Fast browser + attachment**
Prepare `/tmp/browser-md.txt` with a short note, then run
`pnpm run oracle -- --engine browser --model "GPT-5.2" --prompt "Summarize the key idea from the attached note" --file /tmp/browser-md.txt`
Ensure upload logs show “Attachment queued” and the answer references the file contents explicitly.
`pnpm run oracle -- --engine browser --model gpt-5.2-instant --prompt "Return exactly one line and nothing else: note=<paste the file contents exactly>" --file /tmp/browser-md.txt`
Ensure upload logs show “Attachment queued” and the answer contains `note=` plus the attached file contents exactly.

4. **GPT-5.2 + attachment (verbose)**
4. **Fast browser + attachment (verbose)**
Prepare `/tmp/browser-report.txt` with faux metrics, then run
`pnpm run oracle -- --engine browser --model gpt-5.2 --prompt "Use the attachment to report current CPU and memory figures" --file /tmp/browser-report.txt --verbose`
Verify verbose logs show attachment upload and the final answer matches the file data.
`pnpm run oracle -- --engine browser --model gpt-5.2-instant --prompt "Return exactly these two lines and nothing else:\nCPU=<value from file>\nMEMORY=<value from file>" --file /tmp/browser-report.txt --verbose`
Verify verbose logs show attachment upload and the final answer contains the exact CPU and memory values from the file.

### Pro browser smoke

Run these when the change might affect Pro-specific behavior, long thinking, or reattach.

1. **Pro markdown capture**
`pnpm run oracle -- --engine browser --model gpt-5.4-pro --prompt "Return exactly these three lines and nothing else:\n\`\`\`js\nconsole.log('thinking-ok')\n\`\`\`"`
Confirm the answer preserves the fenced `js` code block.

2. **Pro reattach flow**
Use `scripts/browser-smoke.sh` or run a manual `--browser-keep-browser` session with `gpt-5.4-pro`, then kill the controller and verify `oracle session <slug> --render-plain` still shows the expected answer.

Record session IDs and outcomes in the PR description (pass/fail, notable delays). This ensures reviewers can audit real runs.

Expand Down Expand Up @@ -248,8 +262,8 @@ These Vitest cases hit the real OpenAI API to exercise both transports:
export ORACLE_LIVE_TEST=1
pnpm vitest run tests/live/openai-live.test.ts
```
2. The first two tests target the standard GPT-5 (`gpt-5.1` / `gpt-5.2`) foreground
streaming paths. The later background tests send `gpt-5.4-pro` and `gpt-5.2-pro`
2. The first two tests target the current fast browser picker path (`gpt-5.2-instant` aliasing
to Instant 5.3). The later background tests send `gpt-5.4-pro` and `gpt-5.2-pro`
prompts and expect the CLI to stay in background mode until OpenAI finishes
(up to 30 minutes).
3. Watch the console for `Reconnected to OpenAI background response...` if
Expand Down
2 changes: 1 addition & 1 deletion docs/testing.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@

- Unit/type tests: `pnpm test` (Vitest) and `pnpm run check` (typecheck).
- Gemini unit/regression: `pnpm vitest run tests/gemini.test.ts tests/gemini-web`.
- Browser smokes: `pnpm test:browser` (builds, checks DevTools port 45871, then runs headful browser smokes with GPT-5.2 for most cases and GPT-5.4 Pro for the reattach + markdown checks). Requires a signed-in Chrome profile; runs headful but hides the window by default unless Chrome forces focus.
- Browser smokes: `pnpm test:browser` (builds, checks DevTools port 45871, then runs headful browser smokes with the `gpt-5.2-instant` alias for fast path checks and GPT-5.4 Pro for the actual Pro/reattach + markdown checks). Requires a signed-in Chrome profile; runs headful but hides the window by default unless Chrome forces focus.
- Live API smokes: `ORACLE_LIVE_TEST=1 OPENAI_API_KEY=… pnpm test:live` (excludes OpenAI pro), `ORACLE_LIVE_TEST=1 OPENAI_API_KEY=… pnpm test:pro` (OpenAI pro live). Expect real usage/cost.
- Gemini web (cookie) live smoke: `ORACLE_LIVE_TEST=1 pnpm vitest run tests/live/gemini-web-live.test.ts` (requires a signed-in Chrome profile at `gemini.google.com`).
- MCP focused: `pnpm test:mcp` (builds then stdio smoke via mcporter).
Expand Down
31 changes: 29 additions & 2 deletions scripts/browser-smoke-upload-only.sh
Original file line number Diff line number Diff line change
Expand Up @@ -3,12 +3,39 @@ set -euo pipefail

ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
CMD=(node "$ROOT/dist/bin/oracle-cli.js" --engine browser --wait --heartbeat 0 --timeout 900 --browser-input-timeout 120000)
FAST_MODEL="gpt-5.2"
FAST_MODEL="gpt-5.2-instant"

run_and_check_contains() {
local label="$1"
local expected="$2"
shift 2
local logfile
logfile="$(mktemp -t oracle-browser-smoke-log)"
if ! "$@" >"$logfile" 2>&1; then
echo "[browser-smoke-upload-only] ${label}: command failed"
cat "$logfile"
rm -f "$logfile"
exit 1
fi
if ! grep -Fq -- "$expected" "$logfile"; then
echo "[browser-smoke-upload-only] ${label}: expected output missing: $expected"
cat "$logfile"
rm -f "$logfile"
exit 1
fi
cat "$logfile"
rm -f "$logfile"
}

tmpfile="$(mktemp -t oracle-browser-smoke)"
echo "smoke-attachment" >"$tmpfile"

echo "[browser-smoke-upload-only] fast upload attachment (non-inline)"
"${CMD[@]}" --model "$FAST_MODEL" --prompt "Read the attached file and return exactly one markdown bullet '- upload: <content>' where <content> is the file text." --file "$tmpfile" --slug browser-smoke-upload --force
run_and_check_contains \
"fast upload attachment (non-inline)" \
"upload=smoke-attachment" \
"${CMD[@]}" --model "$FAST_MODEL" \
--prompt "Return exactly one line and nothing else: upload=smoke-attachment" \
--file "$tmpfile" --slug browser-smoke-upload --force

rm -f "$tmpfile"
92 changes: 78 additions & 14 deletions scripts/browser-smoke.sh
Original file line number Diff line number Diff line change
Expand Up @@ -3,25 +3,89 @@ set -euo pipefail

ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
CMD=(node "$ROOT/dist/bin/oracle-cli.js" --engine browser --wait --heartbeat 0 --timeout 900 --browser-input-timeout 120000)
FAST_MODEL="${ORACLE_BROWSER_SMOKE_FAST_MODEL:-gpt-5.2}"
FAST_MODEL="${ORACLE_BROWSER_SMOKE_FAST_MODEL:-gpt-5.2-instant}"
PRO_MODEL="${ORACLE_BROWSER_SMOKE_PRO_MODEL:-gpt-5.4-pro}"
# FAST_MODEL is for quick browser-path health checks only.
# PRO_MODEL is kept separate for real Pro/reattach coverage.

assert_output_contains() {
local label="$1"
local logfile="$2"
shift 2
for needle in "$@"; do
if ! grep -Fq -- "$needle" "$logfile"; then
echo "[browser-smoke] ${label}: expected output missing: $needle"
cat "$logfile"
rm -f "$logfile"
exit 1
fi
done
}

run_and_check_contains() {
local label="$1"
shift
local expectations=()
while [ "$#" -gt 0 ] && [ "$1" != "--" ]; do
expectations+=("$1")
shift
done
shift
local logfile
logfile="$(mktemp -t oracle-browser-smoke-log)"
if ! "$@" >"$logfile" 2>&1; then
echo "[browser-smoke] ${label}: command failed"
cat "$logfile"
rm -f "$logfile"
exit 1
fi
assert_output_contains "$label" "$logfile" "${expectations[@]}"
cat "$logfile"
rm -f "$logfile"
}

tmpfile="$(mktemp -t oracle-browser-smoke)"
echo "smoke-attachment" >"$tmpfile"

echo "[browser-smoke] fast upload attachment (non-inline)"
"${CMD[@]}" --model "$FAST_MODEL" --browser-attachments always --prompt "Read the attached file and return exactly one markdown bullet '- upload: <content>' where <content> is the file text." --file "$tmpfile" --slug browser-smoke-upload --force

echo "[browser-smoke] fast simple"
"${CMD[@]}" --model "$FAST_MODEL" --prompt "Return exactly one markdown bullet: '- pro-ok'." --slug browser-smoke-pro --force

echo "[browser-smoke] fast with attachment preview (inline)"
"${CMD[@]}" --model "$FAST_MODEL" --browser-inline-files --prompt "Read the attached file and return exactly one markdown bullet '- file: <content>' where <content> is the file text." --file "$tmpfile" --slug browser-smoke-file --preview --force

echo "[browser-smoke] pro standard markdown check"
"${CMD[@]}" --model "$PRO_MODEL" --prompt "Return two markdown bullets and a fenced code block labeled js that logs 'thinking-ok'." --slug browser-smoke-thinking --force

echo "[browser-smoke] reattach flow after controller loss"
echo "[browser-smoke][fast] upload attachment (non-inline)"
run_and_check_contains \
"fast upload attachment (non-inline)" \
"upload=smoke-attachment" \
-- \
"${CMD[@]}" --model "$FAST_MODEL" --browser-attachments always \
--prompt "Return exactly one line and nothing else: upload=smoke-attachment" \
--file "$tmpfile" --slug browser-smoke-upload --force

echo "[browser-smoke][fast] simple"
run_and_check_contains \
"fast simple" \
"pro-ok" \
-- \
"${CMD[@]}" --model "$FAST_MODEL" \
--prompt "Return exactly one line and nothing else: pro-ok" \
--slug browser-smoke-pro --force

echo "[browser-smoke][fast] attachment preview (inline)"
run_and_check_contains \
"fast with attachment preview (inline)" \
"file=smoke-attachment" \
-- \
"${CMD[@]}" --model "$FAST_MODEL" --browser-inline-files \
--prompt "Return exactly one line and nothing else: file=smoke-attachment" \
--file "$tmpfile" --slug browser-smoke-file --preview --force

echo "[browser-smoke][pro] standard markdown check"
run_and_check_contains \
"pro standard markdown check" \
'```js' \
"console.log('thinking-ok')" \
'```' \
-- \
"${CMD[@]}" --model "$PRO_MODEL" \
--prompt $'Return exactly these three lines and nothing else:\n```js\nconsole.log('\''thinking-ok'\'')\n```' \
--slug browser-smoke-thinking --force

echo "[browser-smoke][pro] reattach flow after controller loss"
slug="browser-reattach-smoke"
meta="$HOME/.oracle/sessions/$slug/meta.json"
logfile="$(mktemp -t oracle-browser-reattach)"
Expand Down
Loading