SPIKE-06 Native Codex CLI Baseline
This experiment note records the standalone native Codex CLI baseline before orchestration comparisons were run.
Experiment ID
SPIKE-06
Question
Does the standalone native Codex CLI at /opt/homebrew/bin/codex support the baseline headless and kernel-mediated flows cleanly enough to proceed to the larger prompt-dir comparison?
Setup
Native CLI:
- executable: /opt/homebrew/bin/codex
- version: codex-cli 0.120.0
Baseline checks:
- codex --version
- codex --help
- codex exec --help
- direct codex exec --json --ephemeral -p collab ACK prompt
- tnh-conductor --help
- tnh-conductor run --help
- focused runner/conductor test set
- no-edit tnh-conductor ACK workflow using the native Codex executable
Primary artifacts:
- tmp/codex-native-version.txt
- tmp/codex-native-help.txt
- tmp/codex-native-exec-help.txt
- tmp/codex-native-ack-stdout.jsonl
- tmp/codex-native-ack-stderr.log
- tmp/tnh-conductor-help.txt
- tmp/tnh-conductor-run-help.txt
- tmp/codex-native-runner-tests-after-response-path-fix.log
- tmp/codex-native-kernel-summary-after-response-path-fix.json
- .tnh-conductor/runs/20260415T210728Z/
Result
The standalone native Codex CLI is usable for the next prompt-dir comparison after two runner fixes.
Confirmed behavior:
- codex --version, codex --help, and codex exec --help exited cleanly with empty stderr.
- direct headless codex exec returned valid JSONL and the expected final message ACK_NATIVE_CODEX_BASELINE.
- direct headless codex exec still emitted plugin-manifest warnings to stderr from the user Codex home, so stdout/stderr split remains required.
- poetry run tnh-conductor --help and poetry run tnh-conductor run --help work cleanly.
- poetry run python -m tnh_scholar.cli_tools.tnh_conductor.tnh_conductor --help is not a useful help path in this environment; it exits with only a runpy warning.
- the focused runner/conductor tests passed after fixes: 22 passed in 5.72s.
- the no-edit kernel run completed through tnh-conductor using /opt/homebrew/bin/codex.
- the final kernel ACK run returned ACK_NATIVE_CODEX_KERNEL_BASELINE.
- the final kernel ACK run left the managed worktree clean.
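The stdout/stderr split and the JSONL validity check from the list above can be sketched in Python roughly as follows. This is a minimal illustration, not the project's runner: run_split and parse_jsonl are hypothetical helper names, and the demo uses a stand-in child process rather than the real Codex binary. No assumption is made about the shape of the JSON events; each stdout line is only required to parse as JSON.

```python
import json
import subprocess
import sys


def run_split(argv):
    """Run a command and capture stdout and stderr as separate strings."""
    proc = subprocess.run(argv, capture_output=True, text=True, check=False)
    return proc.returncode, proc.stdout, proc.stderr


def parse_jsonl(text):
    """Parse every non-empty line as JSON; raises ValueError on a bad line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    # Stand-in child: one JSON line on stdout, one warning on stderr.
    # For a real check, swap in the native CLI argv from the baseline list.
    demo = [
        sys.executable,
        "-c",
        "import sys; print('{\"ok\": true}'); print('warning: demo', file=sys.stderr)",
    ]
    code, out, err = run_split(demo)
    events = parse_jsonl(out)  # would fail if stderr noise leaked into stdout
    print(code, len(events), err.strip())
```

Keeping the two streams apart means warnings on stderr cannot corrupt the JSONL parse, which is the reason the split stays required.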
Issues Found And Fixed
Forced Model Was Too Specific
The maintained Codex runner forced -m gpt-5.2-codex, which failed under the native CLI with ChatGPT account auth.
The runner now omits -m by default so Codex uses repo-local CLI configuration. Explicit model override remains supported for configured callers.
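The default-versus-override behavior described above might look like the following sketch. The function name build_codex_argv is hypothetical, the flag set mirrors the direct baseline check rather than the maintained runner's actual argv, and passing the prompt as the final positional argument is an assumption.

```python
def build_codex_argv(prompt, executable="/opt/homebrew/bin/codex", model=None):
    """Build a `codex exec` argv; `-m` is added only when a model is given."""
    argv = [executable, "exec", "--json", "--ephemeral"]
    if model is not None:
        # Explicit override for configured callers.
        argv += ["-m", model]
    # Assumption: the prompt is passed as the last positional argument.
    argv.append(prompt)
    return argv
```

With model left as None, the argv carries no -m flag, so model selection falls through to the CLI's own configuration.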
Final Response Capture Dirtied Worktrees
The maintained runner previously wrote codex-last-message.txt inside the managed worktree, which made a no-edit run appear dirty.
The runner now captures --output-last-message in a temporary path outside the worktree and persists the final response through the normal run-artifact path.
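One way to keep the last-message file out of the worktree is sketched below. It is an illustration under stated assumptions, not the runner's actual code: capture_final_response is a hypothetical name, and run_codex is a hypothetical callable that receives the path to pass to --output-last-message and performs the invocation. The file name final_response.txt matches the artifact listed in this note.

```python
import shutil
import tempfile
from pathlib import Path


def capture_final_response(run_codex, artifact_dir):
    """Run Codex with a last-message path outside the worktree, then persist it.

    `run_codex` is a hypothetical callable taking the path that should be
    passed to --output-last-message.
    """
    artifact_dir = Path(artifact_dir)
    artifact_dir.mkdir(parents=True, exist_ok=True)
    # The temporary directory lives under the system temp location,
    # so nothing is written into the managed worktree.
    with tempfile.TemporaryDirectory(prefix="codex-last-message-") as tmp:
        last_message = Path(tmp) / "codex-last-message.txt"
        run_codex(last_message)
        final_path = artifact_dir / "final_response.txt"
        shutil.copyfile(last_message, final_path)
    return final_path
```

The temporary directory is removed when the with block exits, so only the copy under the run-artifact directory survives.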
Useful Artifacts
Most useful artifacts:
- .tnh-conductor/runs/20260415T210728Z/artifacts/ack/runner_metadata.json
- .tnh-conductor/runs/20260415T210728Z/artifacts/ack/transcript.ndjson
- .tnh-conductor/runs/20260415T210728Z/artifacts/ack/final_response.txt
- .tnh-conductor/runs/20260415T210728Z/artifacts/ack/workspace_status.json
The final workspace status shows that the managed worktree was left clean.
Next Action
Proceed to the prompt-dir comparison with these constraints:
- use /opt/homebrew/bin/codex or $(command -v codex) after confirming it resolves to the native CLI
- use poetry run tnh-conductor, not the module form
- keep stdout/stderr split for all direct Codex calls
- inspect managed worktree cleanliness as part of the kernel arm review
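Two of the constraints above, confirming which codex executable resolves and checking worktree cleanliness, could be scripted along these lines. The helper names are hypothetical. shutil.which only searches PATH, so it approximates command -v without seeing shell aliases or functions, and treating the managed worktree as a git checkout is an assumption.

```python
import shutil
import subprocess


def resolve_codex(expected="/opt/homebrew/bin/codex"):
    """Return the codex path found on PATH and whether it equals `expected`."""
    found = shutil.which("codex")  # PATH lookup, similar in spirit to `command -v`
    return found, found == expected


def is_clean(porcelain_output):
    """True when `git status --porcelain` output lists no changes."""
    return porcelain_output.strip() == ""


def worktree_is_clean(worktree):
    """Run `git status --porcelain` in `worktree` (assumed to be a git checkout)."""
    proc = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=worktree,
        capture_output=True,
        text=True,
        check=True,
    )
    return is_clean(proc.stdout)
```

An untracked file such as codex-last-message.txt would appear in the porcelain output and make is_clean return False, which is the failure mode the response-path fix addressed.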