SPIKE-02 Execution Context Comparison
SPIKE-02 compares the practical headless execution contexts already exercised during the OA01.x communication experiments.
Experiment ID
SPIKE-02
Question
Which execution context is the most reliable and least noisy for headless Codex use?
Setup
Contexts compared:
- direct user shell
- agent-launched shell from repo root
- agent-launched shell from repo subdirectory
- agent-launched shell outside the repo
- repo-local wrapper path
Primary artifacts:
- tmp/codex-user-stdout.jsonl
- tmp/codex-user-stderr.log
- tmp/codex-matrix-root-stdout.jsonl
- tmp/codex-matrix-root-stderr.log
- tmp/codex-matrix-subdir-stdout.jsonl
- tmp/codex-matrix-subdir-stderr.log
- tmp/codex-matrix-outside-stdout.jsonl
- tmp/codex-matrix-outside-stderr.log
- tmp/codex-script-stdout.jsonl
- tmp/codex-script-stderr.log
Result
Best practical context:
- direct user shell
Best machine-oriented context available from the live agent environment:
- wrapper or direct agent-launched path with stdout/stderr separation
Observed ranking, best first:
- direct user shell
- wrapper-assisted agent launch
- direct agent-launched shell from repo root
- direct agent-launched shell from repo subdirectory
- direct agent-launched shell outside repo
Key observations:
- the direct user shell produced clean JSON on stdout and empty stderr
- the agent-launched contexts all remained workable, but they carried noisy state-db and plugin warnings
- repo-root versus subdirectory versus outside-repo changed details, but not the main conclusion
- the wrapper improved normalization and capture, but it did not recreate user-shell cleanliness
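The cleanliness comparison above can be made repeatable with a small check over each context's captured output. The following is a minimal sketch, assuming that "clean" means every non-blank stdout line parses as JSON and stderr holds no non-blank lines. The names summarize_capture and summarize_files are hypothetical helpers, not part of the experiment tooling.

```python
import json
from pathlib import Path


def summarize_capture(stdout_text: str, stderr_text: str) -> dict:
    """Count JSON vs. non-JSON stdout lines and non-blank stderr lines."""
    json_lines = 0
    non_json_lines = 0
    for line in stdout_text.splitlines():
        if not line.strip():
            continue  # blank lines are ignored rather than counted as noise
        try:
            json.loads(line)
            json_lines += 1
        except json.JSONDecodeError:
            non_json_lines += 1
    stderr_lines = sum(1 for line in stderr_text.splitlines() if line.strip())
    return {
        "json_lines": json_lines,
        "non_json_lines": non_json_lines,
        "stderr_lines": stderr_lines,
        # Assumed definition of "clean": pure JSONL on stdout, silent stderr.
        "clean": non_json_lines == 0 and stderr_lines == 0,
    }


def summarize_files(stdout_path: str, stderr_path: str) -> dict:
    """Apply the same summary to a pair of captured artifact files."""
    return summarize_capture(
        Path(stdout_path).read_text(encoding="utf-8"),
        Path(stderr_path).read_text(encoding="utf-8"),
    )
```

Calling summarize_files on each stdout/stderr artifact pair would give one comparable record per context. The sketch only counts lines and does not classify the warnings themselves.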
Useful Artifacts
- tmp/codex-user-stderr.log is the clearest evidence that user-shell launch is materially cleaner
- tmp/codex-matrix-root-stderr.log, tmp/codex-matrix-subdir-stderr.log, and tmp/codex-matrix-outside-stderr.log show the recurring warning pattern in tool-launched contexts
- tmp/codex-script-stdout.jsonl and tmp/codex-script-stderr.log show the wrapper's value as a stable capture surface
Next Action
Treat direct user-shell launch as the highest-value comparison path when testing supervisory behavior.
For automation from the live agent environment, continue using:
- explicit stdout/stderr separation
- the small wrapper when capture normalization helps
- narrow tasks rather than broad autonomy experiments
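Explicit stdout/stderr separation can be sketched as follows. This is an illustration only, not the repo-local wrapper. The function name run_separated and its parameters are assumptions, and the command to run is passed in by the caller because the exact Codex invocation is not recorded here.

```python
import subprocess
import sys
from pathlib import Path


def run_separated(cmd, stdout_path, stderr_path, cwd=None):
    """Run cmd with stdout and stderr written to separate files.

    Returns the process exit code. The two streams are never merged,
    so warnings on stderr cannot corrupt JSON lines on stdout.
    """
    out_p, err_p = Path(stdout_path), Path(stderr_path)
    out_p.parent.mkdir(parents=True, exist_ok=True)
    err_p.parent.mkdir(parents=True, exist_ok=True)
    with out_p.open("wb") as out, err_p.open("wb") as err:
        completed = subprocess.run(
            cmd, stdout=out, stderr=err, cwd=cwd, check=False
        )
    return completed.returncode


if __name__ == "__main__":
    # Stand-in command (hypothetical): emits one JSON line and one warning.
    demo = [
        sys.executable,
        "-c",
        "import sys; print('{\"ok\": true}'); print('warn', file=sys.stderr)",
    ]
    code = run_separated(demo, "tmp/demo-stdout.jsonl", "tmp/demo-stderr.log")
    print("exit code:", code)
```

The cwd parameter is there so the same call can be repeated from the repo root, a subdirectory, or outside the repo. The files are opened in binary mode so the child's output is stored byte for byte.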