Introduction
VexCoder is a local coding assistant you run as a binary. It connects to the model endpoint you configure, supports interactive CLI sessions and non-interactive batch runs, and keeps its setup lightweight enough to build from source on macOS, Linux, and Windows.
This book focuses on the public user surface:
- building the binary
- understanding the current runtime, application, and transport layout
- creating a workspace with vex init
- configuring the model endpoint and token
- using the current CLI flags and interactive commands
The shortest path to a running session is in Quick Start. For the current code layout, see Architecture Overview.
Architecture Overview
VexCoder currently has two operator-facing surfaces in the source tree:
- the interactive CLI UI started by src/bin/vex.rs
- the non-interactive batch runner in src/batch_mode.rs
Most interactive application coordination is rooted at src/app.rs and its
split submodules under src/app/ (for example commands/,
slash_commands.rs, and layout.rs). The runtime core is found under
src/runtime/, including context assembly, the edit loop, command execution,
validation, and task state.
Current code layout
- src/bin/vex.rs parses CLI arguments, loads config, and routes startup into the interactive UI, batch mode, export, compatibility helpers, and other CLI paths.
- src/app.rs is the interactive application module root. The full-screen TUI command surface is found across src/app.rs, src/app/commands/, src/app/slash_commands.rs, and related helper modules under src/app/.
- src/ui/render/ owns the ratatui-native task-surface renderer. It renders the task surface through render_task_layout() using ratatui Frame widgets, with one compact status row above the transcript body and composer. The rendering and infrastructure stack includes:
  - unicode-width and unicode-segmentation for grapheme-aware display width calculations
  - textwrap for paragraph wrapping
  - ansi-to-tui for converting raw ANSI escape sequences into ratatui Span/Line structures
  - arboard for programmatic clipboard access via the /copy slash command
  - pulldown-cmark and syntect for the shared markdown rendering helpers, with inline markdown styling active today and fenced-code highlighting handled inside the shared conversion module
  - ratatui-macros for line and span construction helpers
  - similar for unified diff rendering in edit previews (generic diff algorithm for inline structured diffs)
  - color-eyre for structured panic hooks and pretty backtraces
  - dirs for cross-platform XDG/config directory resolution
  - tracing-appender for daily-rotated file logging when RUST_LOG is set
  - indicatif and console for progress spinners in headless batch mode
  - ignore and globset for gitignore-compliant workspace traversal and glob matching
  - pathdiff and dunce for cross-platform relative path computation
  - chrono for ISO 8601 timestamps in SSE event streams and all internal timestamp generation
  - base64 for binary content encoding in exports
  - indexmap for ordered insertion-preserving maps used in streaming tool-call accumulation (DerivedTurnState.pending_tool_calls), ensuring tool calls are serialized in the order they were opened
  - tower-http for the TraceLayer::new_for_http() middleware wired into build_http_router(), providing structured request/response tracing for authorized and unauthorized HTTP requests

  crossterm is configured with bracketed-paste (prevents input corruption on multi-line pastes; active in src/terminal.rs) and event-stream (async terminal event integration). ratatui enables unstable-rendered-line-info, unstable-widget-ref, and unstable-backend-writer for scroll-offset tracking, efficient widget updates, and backend-writer parity.

  Tool calls, waiting-state telemetry, and assistant responses stream into transcript paragraphs on the shared body instead of a dedicated visible timeline strip. Short transcript bodies now start directly below the status row and grow downward until the body fills; only then does the live bottom-follow window scroll older rows upward.

  The fullscreen composer auto-fits against the current display row and column budget, keeps wrapped /command, @path, and pasted prompt text editable in place, and turns @path suggestions into a repo-wide interactive picker: Up/Down traverse ranked matches across the full workspace tree, Enter inserts the selected workspace-relative path, and Esc dismisses the picker so the raw mention token can still be submitted unchanged. The picker keeps a bounded ranked candidate set per keystroke so large workspaces do not pay a full-tree sort cost on every input edit. Free-form slash commands such as /edit, /plan, and /review consume those selected @path mentions as inline context before the model turn starts, while /explain treats @path as the requested file target. /edit and /fix also seed task-scoped edit grants (write-file, apply-patch, run-command) so the mutation workflow remains active after the slash command starts without downgrading broader session grants. Outside picker mode, the composer still supports visual-row Up/Down/Home/End navigation instead of forcing the operator out of task mode, while CLI selection and copy gestures stay with the CLI because the UI does not enable mouse capture.

  While timeline follow mode is active, the output pane stays on the accumulated transcript so each new server response appends to the existing scrollback instead of replacing it. Manual timeline navigation can still switch that pane into per-step detail, and Alt+End returns the surface to live follow mode without restoring a dedicated activity strip.

  The expand_rows_for_display helper in src/ui/render/transcript.rs splits embedded newlines before word-wrapping each sub-line, so server responses containing literal \n sequences render as separate visual rows. is_structural_transcript_row recognises bullet list items (-, *) and numbered list items (1., 2)) as structural, passing them through the render path without word-wrap reflow. Scroll-offset clamping in apply_output_scroll_action and preserve_transcript_scroll_on_growth uses the expanded (word-wrapped) row count instead of the raw output row count, so the viewport range matches the render path and all rows are reachable.
- src/app/model_update.rs pushes a verb-first one-liner into the transcript as each tool result arrives (e.g. "Searched …", "Read …", "Edited …") so the operator sees immediate progress instead of a blank screen while the model produces its response text. Consecutive completed read-only tools (codebase_search, read_file, search, search_files, search_content, find_files, list_files, list_dir, list_directory, glob_files, git_status, git_diff, git_log, git_show) now fold into a single [tool] paragraph regardless of tool name, keeping the transcript compact during multi-tool exploration sequences. Pending and completed edit_file rows also keep the structured multiline diff preview instead of collapsing the change into one JSON line, which preserves per-hunk evidence and add/remove color feedback in both renderers.
- src/batch_mode.rs runs the same runtime headlessly for vex exec and writes JSONL or text output.
- src/runtime/ contains the reusable runtime machinery: context assembly, the edit loop, command and sandbox plumbing, project instructions, task state, and validation. The Phase 1 ADR-038 split adds src/runtime/context_cache.rs for bounded in-memory file-rollup reuse and src/runtime/git_rollup.rs for opt-in git status/diff capture, so automatic turn assembly no longer has to pay synchronous git overhead by default.
- src/state/conversation/ owns the conversation loop safeguards that sit above raw tool execution. Alongside the existing read-only and mutating-tool guards, it now short-circuits malformed read_file calls with missing paths and asks for a concrete file target or a repo-overview flow (list_files / codebase_search) instead of replaying the same raw tool error, including mixed parallel read-only rounds where a good list_files call and a malformed read_file arrive together. Write guards enforce VEX_DIFF_PREFERRED_ABOVE_LINES (warning) and VEX_WRITE_FILE_MAX_LINES (rejection) thresholds, steering the model toward apply_patch or edit_file for large files. Conversation history older than VEX_HISTORY_KEEP_TURNS turns (default 10) is condensed: tool results keep their first 5 lines plus a line-count indicator to stay within the context budget.
- src/server/ owns the ADR-026 transport plumbing: HTTP routing and auth middleware (http.rs), SSE response framing (sse.rs), Unix socket binding (socket.rs), request handlers (handlers/mod.rs, handlers/session.rs), TLS helpers and config resolution (util.rs). Transport code reaches the runtime only through facade entrypoints in src/app/.
- src/local_api.rs contains the LocalApiMode (RuntimeMode) and LocalApiFrontend (FrontendAdapter) that bridge the local API surface to the runtime engine. The local API surface is transcript-first: live assistant text is normalized into final_text transcript blocks so downstream consumers can render one enriched stream instead of stitching together separate assistant delta/message events.
- src/tools/search.rs implements the codebase_search tool using a Tree-sitter-based structural index for Rust source files. The index extracts functions, structs, enums, impls, traits, modules, constants, and type aliases, and ranks results by exact name match, substring match, parent-scope match, and content keyword match.
- src/tools/semantic.rs manages the optional semantic vector index persisted at .vex/index/. When VEX_EMBEDDING_PROVIDER is configured, chunks are embedded at logical boundaries and results are reranked by cosine similarity merged with structural scores.
- src/tools/embed.rs provides the embedding client for the /v1/embeddings-compatible endpoint used by semantic search.
- src/tools/workspace_explore.rs provides the list_dir and glob_files tools for workspace exploration. Both are workspace-confined, .gitignore-aware, and bounded to prevent unbounded output.
- src/tools/workspace_ignore.rs implements WorkspaceIgnore on top of the ignore crate's gitignore matcher so that search_files, list_dir, glob_files, and find_files all skip ignored paths with gitignore-compatible directory semantics.
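The history condensing rule described for src/state/conversation/ (tool results older than the keep window retain their first 5 lines plus a line-count indicator) can be sketched as follows. This is a minimal std-only illustration, not the in-tree code: the function names, the turn representation (a Vec of tool-result strings per turn), and the indicator wording are all hypothetical.

```rust
/// Leading lines kept from an older tool result (the value 5 comes from the text above).
const KEEP_LINES: usize = 5;

/// Condense one tool result: keep the first KEEP_LINES lines and append a
/// line-count indicator. The indicator wording is a hypothetical choice.
fn condense_tool_result(output: &str) -> String {
    let total = output.lines().count();
    if total <= KEEP_LINES {
        return output.to_string();
    }
    let mut kept: Vec<String> = output
        .lines()
        .take(KEEP_LINES)
        .map(str::to_string)
        .collect();
    kept.push(format!("[... {} more lines omitted ...]", total - KEEP_LINES));
    kept.join("\n")
}

/// Condense every tool result in turns older than the last `keep_turns`
/// turns; the most recent turns are left untouched.
fn condense_history(turns: &mut [Vec<String>], keep_turns: usize) {
    let cutoff = turns.len().saturating_sub(keep_turns);
    for turn in &mut turns[..cutoff] {
        for result in turn.iter_mut() {
            *result = condense_tool_result(result);
        }
    }
}
```

With keep_turns set to 10 (the documented default for VEX_HISTORY_KEEP_TURNS), only turns older than the last ten would be rewritten.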
Streaming protocol coverage
The shared SSE parser in src/api/stream.rs and the normalized type surface in
src/types/api_types.rs preserve documented streaming values from both
messages-v1 and chat-compat backends.
- heartbeats and structured stream errors
- text, input-json, thinking, and signature deltas
- citations, server-tool blocks, and web-search tool results
- normalized usage totals plus cache, geography, and token-detail metadata
- chat-compat chunk metadata such as service tier, system fingerprint, refusal text, logprobs, choice indexes, and tool-call type
Not every metadata field is rendered in the interactive transcript today, but the parser keeps those values in the normalized event surface instead of dropping them during protocol conversion.
A StreamTextNormaliser layer at the forward_conversation_update boundary
intercepts embedded tool call markup (XML-like tags from local inference
servers) and converts them into structured [tool]/[detail] transcript
lines before they reach the TUI. This prevents raw SSE event data from leaking
to the display and ensures all tool invocations render as paragraph blocks in
the scrolling transcript pane. The local API handoff in
src/runtime/json_handoff.rs and src/local_api.rs preserves those transcript
rows plus transcript block start/delta/complete updates as canonical
RuntimeEnvelope JSON events, so downstream clients can stay transcript-first
over SSE without reparsing a flattened assistant text stream. The
normaliser buffers chunk-split <tool_call>, <function=...>, and
<parameter=...> fragments until they are complete enough to classify,
so transcript-first consumers follow the backend's JSON delta stream
without showing raw wrapper or partial tag text when the server breaks
markup across arbitrary chunk boundaries.
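The buffering behaviour described above can be illustrated with a small std-only sketch. It assumes a simplified rule (hold back everything from the last unclosed < until its > arrives) and uses a hypothetical TagBuffer type; the real StreamTextNormaliser classifies tags and emits transcript rows, which this sketch does not attempt.

```rust
/// Hypothetical holder for text that may end in a chunk-split tag.
#[derive(Default)]
struct TagBuffer {
    held: String,
}

impl TagBuffer {
    /// Feed one streamed chunk and return the prefix that is complete
    /// enough to classify. Text from the last unclosed '<' onward stays
    /// buffered until a later chunk supplies the closing '>'.
    fn push(&mut self, chunk: &str) -> String {
        self.held.push_str(chunk);
        let split = match self.held.rfind('<') {
            // An opening '<' with no '>' after it: the tag is still partial.
            Some(open) if !self.held[open..].contains('>') => open,
            // No partial tag at the tail: everything can be released.
            _ => self.held.len(),
        };
        self.held.drain(..split).collect()
    }

    /// Flush whatever is still buffered when the stream ends.
    fn finish(&mut self) -> String {
        std::mem::take(&mut self.held)
    }
}
```

A stray < in ordinary prose would be held until the stream ends under this rule, which is one reason a production normaliser needs a stricter notion of "complete enough to classify".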
The current ratatui surface keeps the composer pinned at the bottom edge and
scrolls transcript paragraphs upward from that anchor, but the live turn state
is still assembled from three sources: history_state.lines,
current_turn_stream_segments, and active_stream_blocks. That split is the
remaining complexity boundary for the tool-call cutover. The current repair
work keeps scroll ownership on the ratatui transcript, fixes net-growth
preservation when pending tool paragraphs are replaced by completed results,
and defaults local text-protocol parsing to the hybrid tagged-plus-XML chain.
The larger single-document cutover plan is recorded in
docs/src/tool-call-cutover.md.
The live parser path for interactive turns remains the shared stream parser,
the tool-call parser selected by the conversation loop, and the
StreamTextNormaliser boundary that converts malformed inline tool markup into
transcript-safe rows. The structured_parser module is present in tree as an
optional framework and does not replace the live runtime parser path unless the
ADR-043 adoption gates are satisfied.
A transcript buffering foundation (src/state/transcript_delta.rs) provides
StreamingBlockBuffer plus TranscriptBlockKind for active structured-stream
blocks. The buffer map is keyed by block index in TuiMode and runs in
parallel with the transcript-first line path: transcript_display_rows()
reads the block kind to gate the live streaming cursor, while
task_output_view_with() reads buffered byte counts to expose a compact live
throughput indicator in the output title during structured streaming. Bounded
suffix deduplication still routes through
bounded_incremental_suffix() in the shared streaming path, but the render
surface no longer carries the earlier staged delta-consumer helpers that never
landed in production.
The runtime envelope schema (schemas/runtime_envelope_v1.json) accepts tool
names matching [a-z][a-z0-9_-]* and MCP-namespaced tools
(mcp.<provider>.<tool>), covering all built-in and external tool
registrations.
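As an illustration of the naming rule, the check below accepts names matching [a-z][a-z0-9_-]* and the mcp.<provider>.<tool> form. It is a std-only sketch, not the schema validator itself, and it assumes that the <provider> and <tool> segments each follow the same base pattern, which the schema text above does not state.

```rust
/// True when `s` matches [a-z][a-z0-9_-]*.
fn is_base_name(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        Some(c) if c.is_ascii_lowercase() => {}
        _ => return false,
    }
    chars.all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_' || c == '-')
}

/// True for built-in style names and for mcp.<provider>.<tool> names.
/// Assumption: both MCP segments use the base-name pattern.
fn is_valid_tool_name(name: &str) -> bool {
    if let Some(rest) = name.strip_prefix("mcp.") {
        let mut parts = rest.split('.');
        return match (parts.next(), parts.next(), parts.next()) {
            (Some(provider), Some(tool), None) => is_base_name(provider) && is_base_name(tool),
            _ => false,
        };
    }
    is_base_name(name)
}
```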
Crate design boundaries -- text processing
VexCoder uses several crates that touch text at different abstraction layers. Each crate occupies a distinct role with no overlap. The boundary rule is: never use a search/indexing crate for internal text processing, and never use a text-processing crate for file-content search or structural parsing.
Non-overlapping crate roles
| Crate | Role | Scope | NOT used for |
|---|---|---|---|
| aho-corasick | Multi-pattern literal matching | File content search, keyword extraction from source text | Git output parsing, secret redaction |
| regex-lite | Lightweight internal text processing | Git output parsing, secret redaction, rate-limit extraction, format validation | Code search, RAG, semantic indexing, codebase search |
| tree-sitter | Structural AST indexing | Language-aware parsing of source files into syntax trees | Text processing, log parsing, redaction |
| globset / ignore | Filesystem traversal | .gitignore-aware path matching and directory walking | File content search, string processing |
| quick-xml | XML tool-call parsing | Structured extraction of <function=...> / <parameter=...> tags from model output | Git parsing, log analysis |
| indexmap | Ordered insertion-preserving maps | Streaming tool-call accumulation preserving insertion order | Search indexing, text processing |
| tower-http | HTTP middleware | Request/response tracing for the local API server | Application logic, text processing |
regex-lite -- ASCII-only internal text processing
regex-lite is the only regex crate in the dependency tree. All patterns
are ASCII-only (\d = [0-9], \w = [0-9A-Za-z_]). regex-lite does not provide
Unicode-aware character classes or case folding, so none of the patterns
rely on non-ASCII matching. This is intentional -- vexcoder's
regex-lite usage exclusively targets machine-readable ASCII output from git,
HTTP headers, and API responses.
Conventional use cases DISTINCT from RAG/semantic search/codebase_search:
- Parsing structured output from external tools (git status, git diff, git apply, git log)
- Extracting known fields from semi-structured strings (retry delays, durations)
- Sanitizing/redacting sensitive data from logs, transcripts, and telemetry
- Format validation (API key formats, token patterns, connection strings)
None of these overlap with codebase search, RAG, or semantic indexing.
The regex-lite modules live under src/runtime/ as three focused files:
- git_parse.rs -- Structured parsing of git status --porcelain, git diff --stat, git diff --name-status, git log --oneline, and git apply output into typed enums and structs. Patterns compile once via OnceLock<regex_lite::Regex> and are reused across calls.
- secrets.rs -- Output redaction for vendor API keys (sk-...), AWS access keys (AKIA...), GitHub PATs (ghp_/gho_/ghu_/ghs_/ghr_), PEM private key headers, bearer tokens, connection strings with embedded credentials, and generic secret assignments. Wired into sanitize_assistant_text so secrets never leak into the transcript or logs.
- rate_limit.rs -- Extracts retry delay hints from Retry-After header values and error response body text ("try again in N seconds"). The header path is wired into map_api_status_error in the API client with fallback to body text for 429 detection.
Design rationale: regex-lite was chosen over the full regex crate because
(a) vexcoder does not allow non-ASCII characters in these internal patterns,
(b) the ~94 KB binary size overhead vs ~373 KB for full regex is meaningful
for a CLI binary, and (c) the O(m*n) execution guarantee is the same.
Stream parser -- no regex
The stream parser (src/api/stream.rs) and text normaliser
(src/api/stream/text_normaliser.rs) handle SSE framing, JSON delta
parsing, and embedded XML-like tool call markup using zero-regex string
scanning (starts_with, contains, manual index arithmetic). quick-xml
handles structured XML extraction. regex-lite is not used in the streaming
path.
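To make the zero-regex style concrete, here is a minimal std-only sketch of SSE frame parsing built on starts_with and strip_prefix. The SseEvent type and function names are hypothetical, and the sketch covers only the event and data fields and comment lines; it is not the parser in src/api/stream.rs.

```rust
/// One parsed SSE frame: optional event name plus the joined data payload.
#[derive(Debug, PartialEq)]
struct SseEvent {
    event: Option<String>,
    data: String,
}

/// Return the value of `field` on this line, dropping one leading space
/// after the colon as the SSE format allows.
fn field_value<'a>(line: &'a str, field: &str) -> Option<&'a str> {
    let rest = line.strip_prefix(field)?.strip_prefix(':')?;
    Some(rest.strip_prefix(' ').unwrap_or(rest))
}

/// Parse the lines of one frame (the text between blank-line separators).
/// Returns None when the frame holds only comment lines or nothing at all.
fn parse_sse_frame(frame: &str) -> Option<SseEvent> {
    let mut event = None;
    let mut data: Vec<&str> = Vec::new();
    for line in frame.lines() {
        if line.starts_with(':') {
            continue; // comment line, often used as a keep-alive
        }
        if let Some(value) = field_value(line, "event") {
            event = Some(value.to_string());
        } else if let Some(value) = field_value(line, "data") {
            data.push(value);
        }
    }
    if event.is_none() && data.is_empty() {
        return None;
    }
    Some(SseEvent { event, data: data.join("\n") })
}
```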
Full git parsing stack
The git parsing stack is the foundation of vexcoder's value as a CLI tool working with git repos. The following git output formats are parsed:
| Command | Parser | Output type |
|---|---|---|
| git status --porcelain | parse_git_status | ParsedGitStatus with per-file status entries |
| git diff --stat | parse_diff_stat | ParsedDiffStat with per-file changes and summary |
| git diff --name-status | parse_name_status | ParsedNameStatus with status chars and rename detection |
| git log --oneline | parse_git_log_oneline | ParsedGitLog with hash + subject entries |
| git apply (stdout+stderr) | parse_git_apply | ParsedGitApply with outcome classification per line |
All parsers live in src/runtime/git_parse.rs and are re-exported from
src/runtime.rs. git_rollup.rs orchestrates git command execution with
timeout and cancellation support, using parse_git_status to produce
structured rollups for context assembly.
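For readers unfamiliar with the porcelain format, the sketch below parses git status --porcelain (v1) lines of the form "XY path" into typed entries. It deliberately uses plain string slicing rather than regex-lite so it stays dependency-free, and the StatusEntry type and function names are hypothetical; it does not reproduce ParsedGitStatus or parse_git_status.

```rust
/// Hypothetical per-file entry: index status, worktree status, and path.
#[derive(Debug, PartialEq)]
struct StatusEntry {
    index: char,
    worktree: char,
    path: String,
    /// Original path for renames and copies ("old -> new").
    orig_path: Option<String>,
}

/// Parse one porcelain v1 line such as " M src/app.rs" or "R  a.rs -> b.rs".
fn parse_porcelain_line(line: &str) -> Option<StatusEntry> {
    let bytes = line.as_bytes();
    // Two ASCII status characters, one space, then at least one path byte.
    if bytes.len() < 4 || bytes[2] != b' ' || !bytes[0].is_ascii() || !bytes[1].is_ascii() {
        return None;
    }
    let (index, worktree) = (bytes[0] as char, bytes[1] as char);
    let rest = &line[3..];
    let (orig_path, path) = match rest.split_once(" -> ") {
        Some((old, new)) if index == 'R' || index == 'C' => {
            (Some(old.to_string()), new.to_string())
        }
        _ => (None, rest.to_string()),
    };
    Some(StatusEntry { index, worktree, path, orig_path })
}

/// Parse the whole command output, skipping lines that do not fit the format.
fn parse_porcelain(output: &str) -> Vec<StatusEntry> {
    output.lines().filter_map(parse_porcelain_line).collect()
}
```

Paths that git quotes (for example names with unusual characters) are left as-is here; a full parser would need to unquote them.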
Secret redaction -- always on
Secret redaction runs on every assistant text output through
sanitize_assistant_text in src/runtime/policy.rs. The following
patterns are detected and replaced with [REDACTED]:
- Vendor API keys (sk- prefix, 20+ chars)
- AWS access key IDs (AKIA prefix, 16 uppercase alphanumeric)
- GitHub personal access tokens (ghp_, gho_, ghu_, ghs_, ghr_ prefixes, 36+ chars)
- PEM private key headers (-----BEGIN ... PRIVATE KEY-----)
- Bearer tokens (preserving the Bearer prefix)
- Connection strings with embedded passwords (protocol://user:password@host)
- Generic secret assignments (API_KEY=..., token: "...", etc.)
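As one worked instance of the list above, the sketch below redacts AWS access key IDs (AKIA followed by 16 uppercase alphanumeric characters). It uses plain string scanning instead of the regex-lite patterns in secrets.rs so that it has no dependencies, and the function name is hypothetical.

```rust
const REDACTED: &str = "[REDACTED]";

/// Replace every "AKIA" + 16 uppercase-alphanumeric run with [REDACTED].
fn redact_aws_key_ids(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    let mut rest = text;
    while let Some(pos) = rest.find("AKIA") {
        let candidate = &rest[pos + 4..];
        // Count the uppercase letters and digits that directly follow the prefix.
        let tail_len = candidate
            .bytes()
            .take_while(|b| b.is_ascii_uppercase() || b.is_ascii_digit())
            .count();
        out.push_str(&rest[..pos]);
        if tail_len >= 16 {
            out.push_str(REDACTED);
            rest = &candidate[16..]; // the 16 skipped bytes are ASCII
        } else {
            // Not a key: keep the literal text and continue after it.
            out.push_str("AKIA");
            rest = candidate;
        }
    }
    out.push_str(rest);
    out
}
```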
Structured tool call design
The stream parser handles three tool-call markup formats from model output:
- XML tags (<function=name>, <parameter=key>value</parameter>) -- extracted by quick-xml in the text normaliser. The normaliser uses zero-regex string scanning (starts_with, contains, manual index arithmetic) to detect tag boundaries, then delegates structured extraction to quick-xml.
- JSON tool calls -- parsed via serde_json from tool_calls arrays in chat-completion deltas. Streamed deltas accumulate into indexmap::IndexMap entries preserving insertion order.
- Structured content blocks -- tool_use blocks with id, name, and input fields parsed from content-block deltas.
No regex is used in the streaming tool-call path. regex-lite is reserved
for post-hoc processing of git output and secret redaction, never for
real-time stream parsing.
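The insertion-order requirement for streamed JSON tool calls can be sketched without indexmap by pairing a Vec (for order) with a HashMap (for lookup). This is a std-only stand-in for the IndexMap-based accumulation described above, and the type, field, and method names are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical accumulated tool call built from streamed fragments.
#[derive(Debug, Default, PartialEq)]
struct PendingToolCall {
    id: String,
    name: String,
    arguments: String,
}

/// Keeps calls in the order they were first opened.
#[derive(Default)]
struct ToolCallAccumulator {
    order: Vec<PendingToolCall>,
    position: HashMap<String, usize>,
}

impl ToolCallAccumulator {
    /// Apply one delta: open the call on first sight, then append the
    /// argument fragment. The name may arrive only on the first delta.
    fn apply_delta(&mut self, id: &str, name: Option<&str>, args_fragment: &str) {
        let existing = self.position.get(id).copied();
        let slot = match existing {
            Some(i) => i,
            None => {
                self.order.push(PendingToolCall {
                    id: id.to_string(),
                    ..Default::default()
                });
                let i = self.order.len() - 1;
                self.position.insert(id.to_string(), i);
                i
            }
        };
        let call = &mut self.order[slot];
        if let Some(name) = name {
            call.name = name.to_string();
        }
        call.arguments.push_str(args_fragment);
    }

    /// Return the calls in the order they were opened.
    fn finish(self) -> Vec<PendingToolCall> {
        self.order
    }
}
```

An IndexMap gives the same ordering guarantee with a single structure, which is presumably why the runtime uses it for this seam.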
Crate expansion decisions
The following crates appear in comparable open-source Rust CLI toolchains. Each is already active in vexcoder's dependency tree, planned for the next batch, or rejected with rationale.
Accepted now means the design choice is settled in the repo. It does not mean the crate is added immediately without a live integration seam. vexcoder keeps dependency additions coupled to real code paths and tests so the tree does not accumulate unused crates.
| Crate | Comparable CLI usage | vexcoder decision | Rationale |
|---|---|---|---|
| bm25 | Text ranking for code search results | Next batch planned (ADR-033 Phase 5) | Ranked retrieval improves codebase_search relevance. Will sit behind the aho-corasick literal-match layer, not in the regex-lite text-processing layer. |
| similar | Diff algorithm for computing inline text diffs | Active (replaces diffy) | Generic diff algorithm now wired into src/edit_diff.rs. No branding dependency. |
| which | Locating executables on $PATH | Next batch planned | git_rollup.rs currently assumes git is on PATH. which::which("git") provides a clear error when git is missing. |
| walkdir | Recursive directory traversal | Design rejects | vexcoder uses ignore (from the ripgrep ecosystem) which already provides recursive traversal with .gitignore support. Adding walkdir would duplicate traversal logic. ignore is the conventional choice for git-aware CLI tools. |
| notify | Filesystem event watching | Next batch planned | Enables watch-mode for git_rollup to detect working-tree changes without polling. Will integrate with the existing git_rollup.rs orchestration layer. |
Vexcoder-specific crates
The following crates are in vexcoder's tree but not in comparable CLI toolchains. Each serves a design need specific to vexcoder's architecture.
| Crate | vexcoder usage | Why comparable CLIs omit it | Design rationale |
|---|---|---|---|
| axum | HTTP routing and handler composition for the local API server surface | Comparable CLIs may use a thinner direct HTTP surface or a different server seam. | axum is already the active server foundation in vexcoder; tower-http sits on top of it for request tracing, not in place of it. |
| tower-http | TraceLayer HTTP middleware for the local API server (src/server/http.rs) | Comparable CLIs use axum directly without tower middleware. vexcoder's LocalApiServer (ADR-026) requires request/response tracing for debugging multi-agent sessions. | Conventional for axum-based servers needing observability. |
| fs2 | File-locking for .vex/state/ durable writes | Comparable CLIs use a different persistence model. | Prevents concurrent vexcoder sessions from corrupting task-state files. write_json_safe uses temp+fsync+rename; fs2 adds advisory locking as a second safety layer. |
| portable-pty | Pseudo-terminal allocation for sandboxed command execution | Comparable CLIs use platform-specific PTY code directly. | vexcoder's command runner needs PTY for interactive tool output (e.g., git commit with editor). portable-pty provides cross-platform PTY without platform-specific FFI. |
| rmcp (1.2.x) | MCP (Model Context Protocol) client for external tool providers | Comparable CLIs implement MCP transport directly using earlier transport library versions (e.g., pre-1.0). | vexcoder supports [[mcp_servers]] config for connecting to external tool providers (ADR-024 PM-01). vexcoder pins rmcp 1.2.x to track the current stable MCP transport spec; the version boundary matters because the MCP wire protocol stabilized across the 1.x release series. |
| quick-xml | XML tool-call tag parsing from model output | Comparable CLIs use string-based parsing for tool calls. | vexcoder's stream parser delegates structured XML extraction to quick-xml rather than hand-rolling an XML parser. Conventional for XML processing in Rust. |
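The temp+fsync+rename pattern mentioned in the fs2 row can be sketched with the standard library alone. The helper below is hypothetical (it is not write_json_safe, takes raw bytes rather than a serializable value, and omits the fs2 advisory lock); the ".tmp" sibling name is also an assumption.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Write `bytes` to `path` so that readers see either the old file or the
/// complete new file, never a partial write.
fn write_bytes_safe(path: &Path, bytes: &[u8]) -> io::Result<()> {
    // Assumption: a sibling file with a ".tmp" extension is safe to use.
    let tmp = path.with_extension("tmp");
    {
        let mut file = File::create(&tmp)?;
        file.write_all(bytes)?;
        // Flush file contents and metadata to disk before the rename.
        file.sync_all()?;
    }
    // Replace the destination with the fully written temp file.
    fs::rename(&tmp, path)
}
```

Advisory locking addresses a different failure mode: two sessions racing on the same temp file or interleaving saves, which the rename alone does not prevent.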
Ongoing boundary work
The long-term architecture work is tracked in the ADR set under adr/.
- ADR-025 defines the canonical machine-readable runtime request and event contract.
- ADR-026 defines the proposed LocalApiServer transport binding over that contract.
- ADR-028 is now active in the current tree: the facade helpers are stored under src/app/, transport code has been extracted from src/local_api.rs into src/server/ submodules (http.rs, sse.rs, socket.rs, handlers/mod.rs, handlers/session.rs, util.rs), and dependency-direction enforcement tests verify inward-only import rules across all layers, including grouped, multiline, and super::-relative crate::{server::...} / crate::{bin::...} imports.
- ADR-029 is now accepted: the stream parser covers all documented SSE event types (error envelopes, heartbeats, thinking/signature deltas, citations, server-tool blocks, web-search results, cache/geo/detail metadata) and TaskState persists plan, session notes, context compaction records, and cache usage stats for multi-agent handoff. ADR-029 is a declared dependency of ADR-030 and a prerequisite for full invariant compliance: StreamEvent::Error lets orchestrating agents detect sub-agent stream failures, and the TaskState extensions are the handoff payload that lets an orchestrator reconstruct a sub-agent's context on resume.
- ADR-030 is now accepted with an explicit six-point verification suite: provider events normalize into canonical runtime events, task state owns execution truth, the orchestrator decides whether the task continues or stops, and task handoff or resume consumers depend on that same runtime-owned control flow. ADR-030 is also load-bearing for multi-agent orchestration: Invariants 1, 4, and 5 are the semantic correctness guarantees that make agent handoffs coherent. Without these invariants proven end-to-end, multi-agent orchestration has undefined behaviour at handoff points.
- ADR-031 extends the active operator surface with timeline selection, stable step identity, explicit approved/running/completed lifecycle rendering, prompt-anchored transcript scrolling, a larger multiline composer, direct ANSI task rendering during orchestration, and keyboard navigation for timeline selection and inspector detail. Each pending tool call carries a stable step_id and compact input preview. The task-state timeline still derives pending rows as AwaitingApproval, Approved, or Running from canonical state, and the Approved state is tracked for manual approvals, session auto-approvals, and capability-grant auto-approvals. Batches A through E are merged into main. Batch C/D implemented viewport alignment (output-pane scroll ownership and six-line inspector cap) across both the direct ANSI and ratatui renderers. The fullscreen composer now also auto-fits to current display row and column changes, including narrower half-screen or quarter-screen display snaps. Batch E removed the legacy activity_rows derivation, draw_timeline_fallback(), draw_legacy_activity_row(), and the legacy_row field from TaskStepView, and the current ANSI path renders those task-state updates as transcript paragraphs instead of reserving a dedicated top strip.
- ADR-032 adds prompt-area interactivity: interactive / slash command picker and @path file picker with Up/Down/Enter/Esc navigation and hierarchical directory drill-down, !command shell execution, pasted-block handling, a responsive auto-fit composer surface that keeps those controls visible under display resize, and a context guard that limits project-instructions and notes token budgets.
- ADR-033 introduces the hybrid retrieval context architecture: a codebase_search tool (Phase 1) backed by structural keyword indexing, optional semantic vector search via an external embedding endpoint (Phase 2), write guards that steer write_file toward apply_patch / edit_file for large files (Phase 3), and history condensing that compresses older tool results to stay within the context budget (Phase 4).
- ADR-034 defines the proposed post-milestone multi-agent lane: worktree-isolated agent definitions, orchestrator-owned session-task lifecycle, /agents, /watch, and explicit session-task release surfaces, plus delegation-time concurrency and prompt-size enforcement built on the canonical ADR-025/ADR-030 contracts. The current hardening pass makes the delegation cap serialized, adds release-route and concurrency-stress coverage, and normalizes parent-task watch rollups onto the same lowercase status surface used by session tasks.
- ADR-038 is now Accepted for memory-first TTFC work. Phase 1 is merged in-tree: context assembly reuses a bounded process-local cache for small file rollups, and automatic git status/diff capture is opt-in rather than mandatory. Phase 1a added search lane tightening (search config during index warmup, incremental refresh independence from auto_index). Phase 2 adds src/disk_policy.rs (DiskPermission enum, check_path classifier, VEX_DISK_POLICY env) and src/config/cache.rs (OnceLock-based Config::load_cached). The follow-up batches are:
  - Batch C extracted src/config/load.rs (1361 lines) into a directory module: src/config/load/paths.rs (path discovery), src/config/load/merge.rs (layer merge helpers), and src/config/load/parse.rs (enum + header parsing), with orchestration and tests retained in src/config/load/mod.rs.
  - Batch D splits src/tools/operator.rs (865 lines) into src/tools/operator/mod.rs, core.rs, file_ops.rs, git_ops.rs, and search.rs, preserving behavior while isolating the later disk-policy enforcement seam.
  - Batch E on PR #281 splits src/runtime/context_assembler.rs into src/runtime/context_assembler/mod.rs (orchestration + tests) and src/runtime/context_assembler/reads.rs (candidate-path extraction, rollup conversion, related-path inference).
  - Batch F on the same PR adds enforce() / enforce_runtime() to src/disk_policy.rs, tests/disk_policy_tests.rs, make check-disk-policy, and the arch-contracts.yml CI step.
  - Batch G (PR #282) adds src/tools/operator/policy.rs for operator-boundary disk-policy assertions, wires assert_durable_access() into TaskState::save() and TaskState::load(), and fixes cross-platform check_path() for Windows backslash separators.
  - Batch H (PR #283) extracts src/runtime/task_state.rs (807 lines) into src/runtime/task_state/{mod.rs, persist.rs}, isolating all persistence logic (save/load, directory discovery, file listing, active summary reads) into a dedicated module.

  WAL evaluation concluded: not warranted because task-state saves are per-session and write_json_safe already performs crash-safe writes (temp + fsync + rename). ADR-038 is now Accepted with 0 remaining items.
The transport layer (src/server/) now reaches the runtime exclusively through the application facade (src/app/), and src/local_api.rs retains only the LocalApiMode / LocalApiFrontend runtime-mode bridge types.
Tool-Call Cutover
This note records the current tool-call and transcript rendering findings for the ratatui task surface, the deliberate cutover choices applied in PR 348, and the remaining architecture work after that cutover.
Current constraints
The ratatui task surface already keeps the composer pinned at the bottom edge. The remaining complexity is no longer the pane split; it is the live transcript state.
Today the transcript is assembled from three mutable sources:
- history_state.lines for committed transcript paragraphs and tool rows.
- current_turn_stream_segments for in-progress assistant text.
- active_stream_blocks for typed block metadata and live cursor state.
That split means paragraph replacement has to keep multiple structures in sync whenever a pending tool preview turns into a completed tool-result paragraph. It also means the renderer has to infer one live transcript from several buffers instead of reading one canonical document.
Research summary
The attached tool-call research compared three approaches.
1. Keep the current split model and patch individual bugs
This is the lowest-disruption option, but it keeps the same root problem: scroll math, parser normalization, and paragraph replacement all remain spread across unrelated buffers.
2. Normalize streamed events into an intermediate adapter layer
This improves protocol coverage, but it still leaves paragraph assembly split between the adapter and the ratatui transcript state. It reduces duplication without removing it.
3. Move to a unified document model with a block-aware virtual viewport
This is the recommended direction. A single paragraph/block store becomes the source of truth for:
- pending tool previews
- completed tool results
- final assistant text
- waiting-state telemetry
- wrapped-row viewport math
The viewport then consumes one ordered document instead of reconstructing rows from multiple mutable sources.
PR 348 cutover choices
PR 348 keeps the ratatui-native transcript surface and makes four explicit choices so the UI, parser, and API route all move in the same direction.
1. Viewport contract
- The composer stays pinned to the bottom edge.
- Short transcript bodies now start directly below the status row instead of being bottom-filled with blank space.
- As new rows arrive, the transcript grows downward until it fills the body. Once the body is full, the live window follows the bottom and older rows scroll upward out of view.
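The viewport contract above can be sketched as a small row-window calculation. This is a minimal illustration, not the renderer's actual code: `visible_rows` is a hypothetical helper, and it assumes the transcript has already been wrapped into display rows.

```rust
use std::ops::Range;

/// Returns the half-open range of wrapped transcript rows shown in the body.
///
/// Hypothetical helper for illustration only.
/// - Short transcripts start at row 0, directly below the status row.
/// - Once the body is full, the window follows the bottom and older rows
///   scroll out of view.
pub fn visible_rows(total_rows: usize, body_height: usize) -> Range<usize> {
    // saturating_sub yields 0 while the transcript still fits in the body.
    let first = total_rows.saturating_sub(body_height);
    first..total_rows
}
```

With a ten-row body, a three-row transcript shows rows 0 to 2, while a 25-row transcript shows rows 15 to 24.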
2. Transcript rendering contract
- Pending tool paragraphs still render directly into the transcript body instead of a separate timeline strip.
- Completed tool-result replacement preserves scroll position by using the net transcript growth across the full replacement, not the height of the inserted paragraph alone.
- Normalized StreamDelta text remains the single visible assistant-text path for downstream consumers.
- Textual StreamBlockDelta updates keep block identity and cursor metadata, but they do not form a second display-text stream.
3. API-route contract
- The local API/runtime envelope is now transcript-first.
- Plain StreamDelta text is normalized into synthetic final_text transcript blocks (transcript_block_start, transcript_block_delta, transcript_block_complete) instead of emitting a separate live assistant_delta / terminal assistant_message pair.
- The assistant_delta and assistant_message events are removed. All downstream consumers must read transcript block events only.
4. Parser contract
- Local text-protocol turns default to the hybrid parser chain.
- Tagged <function=...> parsing stays the fast path.
- Generic <tool_call>, <invoke>, and <tool_use> wrappers are accepted as fallback input, then normalized into the tagged text protocol for assistant history and the next tool round.
Next cutover
The next architecture step is to replace the split transcript state with one canonical task document. The API route has already cut over to the transcript-first shape; the remaining work is to make the in-process task state match that same model.
That cutover should:
- Store pending tool previews, completed tool results, waiting rows, and assistant text as one ordered paragraph list.
- Keep block identity stable so scroll math can reason about net insert, replace, and remove operations directly.
- Let the ratatui viewport render wrapped display rows from that paragraph list without reconstructing state from history_state.lines, current_turn_stream_segments, and active_stream_blocks.
- Remove the remaining split between history_state.lines, current_turn_stream_segments, and active_stream_blocks so the renderer and the runtime both consume one ordered document.
Until that larger cutover lands, the ratatui transcript path should continue to prefer paragraph-preserving repairs over additional side buffers.
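One possible shape for the canonical task document is sketched below. All type and method names (`TaskDocument`, `Paragraph`, `ParagraphKind`, `push`, `replace`) are hypothetical, and the sketch assumes each paragraph already holds its wrapped display rows. The point it illustrates is that a stable paragraph id lets a pending tool preview be replaced in place while the caller receives the net row growth for scroll math.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum ParagraphKind {
    PendingTool,
    ToolResult,
    AssistantText,
    Waiting,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Paragraph {
    pub id: u64,
    pub kind: ParagraphKind,
    /// Pre-wrapped display rows (an assumption of this sketch).
    pub rows: Vec<String>,
}

/// Hypothetical single ordered store for every transcript paragraph.
#[derive(Debug, Default)]
pub struct TaskDocument {
    paragraphs: Vec<Paragraph>,
    next_id: u64,
}

impl TaskDocument {
    /// Appends a paragraph and returns its stable id.
    pub fn push(&mut self, kind: ParagraphKind, rows: Vec<String>) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.paragraphs.push(Paragraph { id, kind, rows });
        id
    }

    /// Replaces a paragraph in place, keeping its id and position.
    /// Returns the net change in row count, or None for an unknown id.
    pub fn replace(&mut self, id: u64, kind: ParagraphKind, rows: Vec<String>) -> Option<isize> {
        let paragraph = self.paragraphs.iter_mut().find(|p| p.id == id)?;
        let delta = rows.len() as isize - paragraph.rows.len() as isize;
        paragraph.kind = kind;
        paragraph.rows = rows;
        Some(delta)
    }

    /// Total wrapped rows across the whole document.
    pub fn total_rows(&self) -> usize {
        self.paragraphs.iter().map(|p| p.rows.len()).sum()
    }
}
```

A viewport built on this store would only need `total_rows()` and the delta returned by `replace` to keep its scroll position, instead of reconciling three buffers.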
Quick Start
This page gets you from clone to a running session in the fewest steps.
1. Build the binary
git clone https://github.com/aistar-au/vexcoder.git
cd vexcoder
cargo build --release
The binary will be at target/release/vex.
2. Create a workspace
./target/release/vex init
This scaffolds:
- .vex/config.toml
- .vex/validate.toml
- AGENTS.md
3. Configure your model endpoint
Local example:
# .vex/config.toml
model_url = "http://localhost:8080/v1"
model_name = "local/default"
model_profile = "models/local-balanced.toml"
For a local Messages-v1 server, use plain HTTP unless you have explicitly configured TLS:
# .vex/config.toml
model_url = "http://localhost:8000/v1/messages"
model_name = "your-model-name"
model_profile = "models/local-balanced.toml"
Remote example:
# .vex/config.toml
model_url = "https://your-endpoint.example/v1/messages"
model_name = "your-model-name"
model_profile = "models/api-structured.toml"
Export a token only when the endpoint requires one:
export VEX_MODEL_TOKEN="your-token"
4. Start the interactive UI
./target/release/vex
5. Run one-shot or batch commands
One-shot plain text:
./target/release/vex -p "summarise this repository"
Batch mode:
./target/release/vex exec --task "review src/app.rs" --format jsonl
6. Verify the local gate
make gate-fast
The local pre-push hook also runs cargo nextest run, which uses nextest's
default cross-platform concurrency. The CI workflow runs 8 parallel jobs with
cargo registry and build-artifact caching.
Once inside an interactive session, the model can explore the codebase using
codebase_search (for functions, types, and code patterns), list_files
(for directory structure), list_dir (non-recursive directory listing), and
glob_files (workspace-wide glob matching) before making targeted reads.
Configuration
VexCoder reads configuration from layered TOML files plus environment variables. The normal starting point is:
vex init
Resolution order
Highest priority wins:
- Environment variables
- Repo-local .vex/config.toml
- User config: ~/.config/vex/config.toml or ~/.vex/config.toml
- System config: /etc/vex/config.toml
- Built-in defaults
VEX_MODEL_TOKEN is environment-only. It is never read from config files.
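The "highest priority wins" rule can be sketched as picking the first layer that supplies a value. `resolve` is a hypothetical helper, not the loader in src/config/load/, and it assumes each layer has already been reduced to an optional string for a single key.

```rust
/// Picks the value for one key from layers ordered highest priority first
/// (environment, repo-local, user, system), falling back to the built-in
/// default when no layer sets it. Hypothetical helper for illustration.
pub fn resolve<'a>(layers: &[Option<&'a str>], built_in: &'a str) -> &'a str {
    layers
        .iter()
        .flatten() // skip layers that do not set the key
        .copied()
        .next() // first hit is the highest-priority value
        .unwrap_or(built_in)
}
```

For example, when the environment leaves model_name unset but the repo-local file sets it, the repo-local value wins over the user and system layers.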
Automatic context assembly now keeps small file rollups in a process-local
memory cache. Search indexes under .vex/index/ and task-state JSON under
.vex/state/ remain the intended disk-backed layers.
Active config keys
These keys are read by the current runtime from config files:
| Key | Purpose | Default |
|---|---|---|
| model_url | Model endpoint URL | http://localhost:8080/v1 |
| model_url_skip_tls_check | Skip HTTPS certificate validation for the model endpoint | false |
| model_name | Model identifier | local/default |
| working_dir | Workspace root for tool execution | current directory |
| model_backend | local-runtime or api-server | inferred |
| model_protocol | messages-v1 or chat-compat | inferred |
| tool_call_mode | structured or tagged-fallback | inferred |
| model_profile | Path to a repo-tracked profile under models/ | backend default profile |
| max_project_instructions_tokens | Project instructions token budget | 4096 |
| max_memory_tokens | Notes token budget | 2048 |
| sandbox | Command sandbox driver: passthrough, macos-exec, or container | passthrough |
| sandbox_profile | Sandbox profile path or container image name | unset |
| sandbox_require | Abort startup instead of falling back to passthrough when the sandbox probe fails | false |
| notes_path | Notes file used by /memory | unset |
notes_path is user-config only.
When model_profile is set, the runtime loads the profile at startup and uses
its request parameters (temperature, top_p, max_tokens, stop sequences,
reasoning budget, and structured-tool fallback). Relative paths are resolved
from the workspace repo root when one is available, otherwise from the current
working directory.
Tool-call formats
tool_call_mode controls how the runtime expects tool invocations to arrive
from the model layer.
| Mode | Meaning | Current parser boundary |
|---|---|---|
| structured | Prefer native structured tool calls from the backend | JSON tool-call arrays and content-block tool-use payloads are parsed via serde_json; streamed fragments keep insertion order with indexmap |
| tagged-fallback | Accept XML-like fallback tags from local runtimes that do not emit native structured deltas | Tagged <function=...> scanning remains the fast path, and the local-runtime fallback now defaults to a tagged-plus-XML parser chain that also accepts generic <tool_call> and <invoke> wrappers before normalizing them into the tagged text protocol |
The runtime currently documents three tool-call shapes, two structured and one text fallback:
- JSON tool_calls arrays from chat-completion style APIs.
- Content-block tool_use records from block-oriented APIs.
- XML-like fallback tags such as <function=name> and <parameter=key>.
These paths are distinct from regex-lite processing. regex-lite is used for
git output parsing, secret redaction, and rate-limit extraction; it is not used
for live tool-call parsing.
Feature config sections
[compaction]
Controls proactive conversation compaction. When enabled, the runtime compacts the conversation history when the estimated token count approaches the context budget, keeping recent turns verbatim and folding older context into a summary.
| Key | Purpose | Default |
|---|---|---|
| enabled | Enable proactive compaction | false |
| threshold_percent | Compact when token usage exceeds this percentage of the context window (10 to 99) | 80 |
| keep_recent_turns | Number of most-recent turns kept verbatim after compaction (1 to 32) | 4 |
| summary_max_tokens | Maximum tokens for the compaction summary (64 to 4096) | 1024 |
[compaction]
enabled = true
threshold_percent = 75
keep_recent_turns = 6
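The threshold check described above can be sketched with integer arithmetic. `should_compact` is a hypothetical name; clamping an out-of-range percentage to the documented 10 to 99 range is an assumption of this sketch, since the loader may reject such values instead.

```rust
/// Returns true when estimated usage exceeds `threshold_percent` of the
/// context window. Hypothetical helper; the clamp is an assumption.
pub fn should_compact(estimated_tokens: u64, context_window: u64, threshold_percent: u8) -> bool {
    let pct = u64::from(threshold_percent.clamp(10, 99));
    // Compare tokens / window > pct / 100 without floating point.
    estimated_tokens.saturating_mul(100) > context_window.saturating_mul(pct)
}
```

With threshold_percent = 75 and an 8000-token window, compaction would trigger once the estimate passes 6000 tokens.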
[undo]
Controls the in-memory checkpoint stack used by /undo.
| Key | Purpose | Default |
|---|---|---|
| enabled | Whether /undo is available | true |
| max_checkpoints | Maximum checkpoints kept per session | 20 |
[undo]
enabled = true
max_checkpoints = 30
[search]
Controls structural index builds and codebase_search behavior.
When enabled = false, both codebase_search and /reindex are unavailable.
| Key | Purpose | Default |
|---|---|---|
| enabled | Enable codebase search indexing | true |
| auto_index | Warm the structural index at interactive and batch session start | true |
| exclude | Workspace-relative path prefixes to exclude from indexing | ["target/", "node_modules/", ".git/"] |
| max_file_size | Skip files larger than this byte count | 1048576 (1 MiB) |
Incremental index updates triggered by file writes during a session always
apply exclude and max_file_size filters regardless of the auto_index
setting. auto_index only controls whether the index is pre-warmed at
session startup.
exclude entries are literal workspace-relative prefixes, not glob patterns.
Use trailing slashes for directory trees such as target/ or src/vendor/.
Entries missing a trailing slash are automatically normalized at config load
time (e.g. "src" becomes "src/").
[search]
enabled = true
auto_index = true
exclude = ["target/", "node_modules/", ".git/", "src/vendor/"]
max_file_size = 524288
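Why the trailing slash matters for literal prefixes can be seen in a short sketch. `normalize_exclude` and `is_excluded` are hypothetical helpers that follow the rules stated above (literal prefix match, missing slash appended); they are not the indexer's real functions.

```rust
/// Appends a trailing slash when the entry lacks one ("src" becomes "src/").
pub fn normalize_exclude(entry: &str) -> String {
    if entry.ends_with('/') {
        entry.to_string()
    } else {
        format!("{}/", entry)
    }
}

/// Literal prefix match against a workspace-relative path; no glob handling.
pub fn is_excluded(rel_path: &str, excludes: &[String]) -> bool {
    excludes
        .iter()
        .any(|prefix| rel_path.starts_with(prefix.as_str()))
}
```

Because "target" is normalized to "target/", a file named targets.txt at the workspace root is still indexed, while everything under target/ is skipped.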
[auto_memory]
Controls automatic memory extraction from assistant turns. When enabled, short
factual notes are extracted after each turn and appended to the notes file with
timestamped [auto] tags.
| Key | Purpose | Default |
|---|---|---|
| enabled | Enable automatic extraction | false |
| max_notes_per_turn | Maximum notes extracted per turn (1 to 10) | 3 |
[auto_memory]
enabled = true
max_notes_per_turn = 5
Environment variables
VEX_MODEL_URL
The full model endpoint URL.
- URLs containing /chat/completions or ending in /v1 default to chat-compat.
- Other URLs default to messages-v1.
- For plain local inference servers, prefer explicit HTTP localhost URLs such as http://localhost:8000/v1/messages. If you enter an HTTPS localhost URL in the interactive startup prompt, vex now suggests the equivalent plain-HTTP localhost endpoint before the fullscreen session starts.
- Same-machine local inference runtimes commonly expose only plain HTTP. That remains supported when you connect via localhost, 127.x.x.x, ::1, or 0.0.0.0. LAN-reachable model servers on RFC 1918 private addresses (192.168.x.x, 10.x.x.x, 172.16–31.x.x) and link-local addresses (169.254.x.x) are also allowed over plain HTTP. Only truly remote (public-internet) endpoints require HTTPS.
- If a local endpoint returns HTTP 400 due to context overflow, the error now shows the server's message verbatim and suggests increasing --ctx-size on the server or using /compact to reset the conversation.
- For non-context-overflow 400s, the error includes the detected protocol (MessagesV1 vs ChatCompat) and suggests checking the model name, protocol format, and whether the server supports streaming.
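The protocol-inference rule from the first two bullets can be sketched as follows. The `Protocol` enum and `infer_protocol` function are hypothetical; only the variant names mirror the MessagesV1 / ChatCompat labels mentioned above, and the sketch applies the two documented checks literally.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Protocol {
    MessagesV1,
    ChatCompat,
}

/// Infers the default protocol from the endpoint URL alone.
/// Hypothetical helper; an explicit VEX_MODEL_PROTOCOL would take precedence.
pub fn infer_protocol(url: &str) -> Protocol {
    if url.contains("/chat/completions") || url.ends_with("/v1") {
        Protocol::ChatCompat
    } else {
        Protocol::MessagesV1
    }
}
```

Under this rule the Quick Start URL http://localhost:8080/v1 resolves to chat-compat, while http://localhost:8000/v1/messages resolves to messages-v1.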
VEX_MODEL_TOKEN
Bearer token for authenticated endpoints.
VEX_MODEL_URL_SKIP_TLS_CHECK
Development-only escape hatch for HTTPS model endpoints with self-signed or otherwise non-system-trusted certificates.
- Accepts true, false, 1, or 0.
- Emits a startup warning on every launch when enabled.
- Must not be committed in repo-local .vex/config.toml.
For any model endpoint outside local and private networks, HTTPS is mandatory.
Plain http:// model URLs are rejected at startup for public-internet hosts so
prompts, repository context, and model responses are not sent over unencrypted
network paths. This rule does not block local inference servers reached via
localhost, 127.x.x.x, ::1, 0.0.0.0, or RFC 1918 / link-local LAN
addresses (192.168.x.x, 10.x.x.x, 172.16–31.x.x, 169.254.x.x).
VEX_MODEL_URL_SKIP_TLS_CHECK only relaxes certificate verification for HTTPS
endpoints; it does not permit plain HTTP for public-internet hosts.
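The host classes that may use plain HTTP map closely onto predicates in Rust's standard library. The sketch below is a hypothetical `plain_http_allowed` check, not the runtime's validator; it assumes the host has already been extracted from the URL without brackets or port, and that hostnames other than localhost are treated as public.

```rust
use std::net::IpAddr;

/// Returns true when a plain http:// model URL would be acceptable for `host`.
/// Hypothetical helper; host must be a bare name or IP literal.
pub fn plain_http_allowed(host: &str) -> bool {
    if host.eq_ignore_ascii_case("localhost") {
        return true;
    }
    match host.parse::<IpAddr>() {
        Ok(IpAddr::V4(ip)) => {
            ip.is_loopback()          // 127.x.x.x
                || ip.is_unspecified() // 0.0.0.0
                || ip.is_private()     // 10/8, 172.16/12, 192.168/16 (RFC 1918)
                || ip.is_link_local()  // 169.254.x.x
        }
        Ok(IpAddr::V6(ip)) => ip.is_loopback(), // ::1
        // Any other hostname is assumed to be public-internet in this sketch.
        Err(_) => false,
    }
}
```

A host such as 172.32.0.1 falls outside the 172.16 to 172.31 private block, so it would require HTTPS under this rule.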
VEX_MODEL_NAME
Model identifier sent to the API.
VEX_MODEL_PROTOCOL
Overrides protocol inference. Accepted values: messages-v1, chat-compat.
VEX_MODEL_BACKEND
Overrides backend inference. Accepted values: local-runtime, api-server.
VEX_TOOL_CALL_MODE
Overrides tool-call encoding. Accepted values: structured,
tagged-fallback.
VEX_TOOL_PARSER
Overrides the local text-protocol parser chain. Accepted values:
tagged, hybrid.
- tagged keeps the zero-regex <function=...> and <parameter=...> fast path only.
- hybrid keeps that fast path and falls back to quick-xml extraction for generic <tool_call>, <invoke>, and <tool_use> wrappers.
Local endpoints default to hybrid so XML-style tool wrappers still execute
when the backend does not emit native structured tool deltas.
Example:
export VEX_TOOL_PARSER=tagged
VEX_MODEL_PROFILE
Selects a repo-tracked model profile such as models/api-structured.toml.
An invalid or missing path is a startup failure.
VEX_WORKDIR
Overrides the working directory used for tool execution.
VEX_MODEL_HEADERS_JSON
Adds extra request headers as a JSON object.
Example:
export VEX_MODEL_HEADERS_JSON='{"X-Client-Id":"vexcoder"}'
VEX_MAX_PROJECT_INSTRUCTIONS_TOKENS
Overrides the project instructions token budget.
VEX_MAX_MEMORY_TOKENS
Overrides the notes token budget.
VEX_CONTEXT_INCLUDE_GIT
Opt in to automatic git status and diff injection during context assembly.
- Accepts true, false, 1, 0, yes, no, on, or off.
- Default: false.
- Explicit git tools and review flows still call git directly; this flag only controls the automatic context path used before a normal model turn.
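The accepted spellings for this flag can be sketched as a small parser. `parse_bool_flag` is a hypothetical helper; trimming whitespace and ignoring ASCII case are assumptions of this sketch, as is returning None for unrecognized input so the caller can fall back to the documented default of false.

```rust
/// Parses the boolean spellings listed for VEX_CONTEXT_INCLUDE_GIT.
/// Hypothetical helper; case-insensitivity and trimming are assumptions.
pub fn parse_bool_flag(raw: &str) -> Option<bool> {
    match raw.trim().to_ascii_lowercase().as_str() {
        "true" | "1" | "yes" | "on" => Some(true),
        "false" | "0" | "no" | "off" => Some(false),
        _ => None, // unrecognized: let the caller apply the default
    }
}
```

A caller could combine it with the default as `parse_bool_flag(value).unwrap_or(false)`.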
VEX_CONTEXT_GIT_TIMEOUT_MS
Controls the timeout used by context-related git commands.
- Default: 2000.
- Applies to automatic git context when VEX_CONTEXT_INCLUDE_GIT=1 and to the existing review helpers that call git through the shared runtime wrapper.
VEX_DISK_POLICY
Controls the disk-policy enforcement mode (ADR-038).
- Accepted values: off, warn, strict.
- Default: off.
- When set to strict, forbidden disk access (anything outside .vex/index/ and .vex/state/) causes a panic. warn logs a warning instead.
- Intended for CI gates; not typically set in interactive use.
VEX_SANDBOX
Selects the command sandbox driver. Accepted values: passthrough,
macos-exec, container.
- passthrough preserves the current process-spawn behavior.
- macos-exec wraps commands with sandbox-exec on macOS.
- container wraps commands with the installed container runtime and requires VEX_SANDBOX_PROFILE to name the container image.
- The built-in macos-exec default is intentionally compatibility-first: it allows broad file access, network access, process spawning, IPC lookups, and signals so common development tools continue to work. Use a custom profile if you need stricter containment than process wrapping plus policy hooks.
VEX_SANDBOX_PROFILE
Optional sandbox driver parameter.
- For macos-exec, this is a profile path. When unset, the runtime uses a built-in compatibility-focused policy string.
- For container, this is the image name passed to the container runtime. Startup runs a short run --rm <image> true probe through that runtime so the selected image is validated before the first wrapped command.
VEX_SANDBOX_REQUIRE
Controls startup fallback when the selected sandbox probe fails.
- Accepts true, false, 1, or 0.
- When false, startup emits a warning and falls back to passthrough.
- When true, startup aborts instead of running without containment.
VEX_MAX_TOKENS
Upper bound override for the per-turn generation budget. When set, the value
is treated as the maximum max_tokens for a single turn. The runtime also
polls the local inference server's context size at startup and derives an
effective ceiling of 75% of n_ctx; the actual max_tokens sent is
min(VEX_MAX_TOKENS, n_ctx × 0.75). When not set, the model profile's
max_tokens value serves as the default, still bounded by the server cap.
The runtime also derives per-file read limits and search result budgets from
the effective token budget when explicit overrides are not set.
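The per-turn budget rule, min(VEX_MAX_TOKENS, n_ctx × 0.75) with the profile value as the fallback request, can be sketched like this. `effective_max_tokens` is a hypothetical name, and rounding the 75% cap down to a whole token is an assumption.

```rust
/// Computes the max_tokens value sent for one turn.
/// Hypothetical helper: `env_override` stands for VEX_MAX_TOKENS,
/// `profile_max_tokens` for the profile default, `n_ctx` for the polled
/// server context size.
pub fn effective_max_tokens(env_override: Option<u64>, profile_max_tokens: u64, n_ctx: u64) -> u64 {
    // 75% of the server context, rounded down (assumption).
    let server_cap = n_ctx.saturating_mul(3) / 4;
    let requested = env_override.unwrap_or(profile_max_tokens);
    requested.min(server_cap)
}
```

With an 8192-token server context the cap is 6144, so an override of 16384 is reduced to 6144 while a profile default of 2048 passes through unchanged.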
VEX_MAX_COMMAND_OUTPUT_BYTES
Maximum bytes kept in the accumulated stdout/stderr buffer returned to the
model after a run_command tool call. The full output is always streamed to
the TUI transcript. Default: 51200 (50 KiB).
VEX_READ_FILE_MAX_LINES
Maximum lines returned by the read_file tool when no explicit limit
parameter is provided. When not set, derives from VEX_MAX_TOKENS: roughly
10% of the context budget at ~20 tokens per line.
| Context budget | Auto-cap |
|---|---|
| 4 K tokens | ~50 lines |
| 32 K tokens | ~160 lines |
| 128 K tokens | ~640 lines |
| 1 M+ tokens | up to 10,000 lines |
The read_file tool also accepts offset (1-based line number) and limit
parameters for targeted partial reads.
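The auto-cap table can be approximated with the stated formula (about 10% of the budget at about 20 tokens per line). `read_file_auto_cap` is a hypothetical helper. The floor of 50 lines and the ceiling of 10,000 lines are inferred from the first and last table rows and are assumptions of this sketch, not confirmed constants.

```rust
/// Approximates the default read_file line cap for a given token budget.
/// Hypothetical helper; the 50 and 10_000 bounds are inferred from the table.
pub fn read_file_auto_cap(context_tokens: u64) -> u64 {
    let budget_for_reads = context_tokens / 10; // roughly 10% of the budget
    let lines = budget_for_reads / 20;          // roughly 20 tokens per line
    lines.clamp(50, 10_000)
}
```

Using decimal thousands, a 32 K budget yields 160 lines and a 128 K budget yields 640 lines, matching the middle rows of the table.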
VEX_DIFF_PREFERRED_ABOVE_LINES
Line threshold above which write_file emits a warning suggesting
apply_patch or edit_file instead. The model sees the warning in the tool
result and is expected to switch strategy on the next attempt. Default: 200.
VEX_WRITE_FILE_MAX_LINES
Hard line limit for write_file. Calls exceeding this are rejected outright
with an error directing the model to use apply_patch or edit_file.
Default: 500.
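The two write_file thresholds combine into a three-way decision, sketched below. `WriteDecision` and `classify_write` are hypothetical names; the sketch assumes both limits are exclusive ("above" and "exceeding"), so a file exactly at a limit stays on the lower side.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum WriteDecision {
    Accept,
    /// Accepted, but the tool result suggests apply_patch or edit_file.
    AcceptWithWarning,
    /// Rejected with an error directing the model to apply_patch or edit_file.
    Reject,
}

/// `preferred_above` stands for VEX_DIFF_PREFERRED_ABOVE_LINES (default 200),
/// `hard_max` for VEX_WRITE_FILE_MAX_LINES (default 500).
pub fn classify_write(line_count: usize, preferred_above: usize, hard_max: usize) -> WriteDecision {
    if line_count > hard_max {
        WriteDecision::Reject
    } else if line_count > preferred_above {
        WriteDecision::AcceptWithWarning
    } else {
        WriteDecision::Accept
    }
}
```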
VEX_SEARCH_MAX_RESULTS
Maximum number of results returned by the codebase_search tool. Default:
10.
VEX_INDEX_MAX_FILES
Maximum number of files indexed for semantic search. Default: 5000.
VEX_EMBEDDING_PROVIDER
Embedding provider for semantic search. Accepted values: compat (standard
/v1/embeddings compatible endpoint) or native (single-text embedding
endpoint). Semantic search is disabled when this variable is unset.
VEX_EMBEDDING_MODEL
Model identifier sent to the embedding endpoint. Required when
VEX_EMBEDDING_PROVIDER is set.
VEX_EMBEDDING_URL
Base URL for the embedding endpoint. Required when VEX_EMBEDDING_PROVIDER
is set.
VEX_EMBEDDING_API_KEY
Bearer token for authenticated embedding endpoints. Set this explicitly for
the embedding endpoint when required; the runtime does not fall back to
VEX_MODEL_TOKEN.
VEX_EMBEDDING_BATCH_SIZE
Number of texts sent per embedding API call. Default: 32.
VEX_HISTORY_KEEP_TURNS
Number of recent conversation turns kept at full fidelity. Older turns are
condensed: tool results keep their first 5 lines plus a
(N more lines) indicator, keeping the conversation within the context
budget without losing the thread of earlier work. Default: 10.
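Condensing an older tool result can be sketched as keeping the leading lines and appending a count of what was dropped. `condense_tool_result` is a hypothetical helper; the exact indicator text and its placement on a separate final line are assumptions based on the "(N more lines)" wording above.

```rust
/// Keeps the first `keep` lines of a tool result and appends a
/// "(N more lines)" indicator when anything was dropped.
/// Hypothetical helper for illustration.
pub fn condense_tool_result(text: &str, keep: usize) -> String {
    let lines: Vec<&str> = text.lines().collect();
    if lines.len() <= keep {
        return text.to_string(); // short results stay at full fidelity
    }
    let mut out = lines[..keep].join("\n");
    out.push_str(&format!("\n({} more lines)", lines.len() - keep));
    out
}
```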
VEX_MCP_TIMEOUT
MCP server connection timeout in seconds applied to every configured server
at session start. Each server entry may also set timeout_secs in the
config file; the per-server value takes priority over this environment
variable. Range: 1–300. Default: 30.
vex init scaffold
vex init writes a commented config skeleton. It includes some reserved
sections for future expansion.
- The active runtime keys are the top-level keys listed above.
- [[hooks]] is active today.
- sandbox, sandbox_profile, and sandbox_require are active runtime features and apply to TUI, batch mode, inline !command, hooks, and validation subprocesses.
- [[mcp_servers]] is active today. MCP servers are connected at session start, loaded from the user config layer, and merged into the runtime tool registry as mcp.<server>.<tool> names. Servers are explicitly shut down when the session ends (TUI exit, batch completion, or API server stop).
- Commented [api] remains a scaffold placeholder in config files. VEX_API_* environment variables (transport, host, port, socket, key, protocol, TLS paths) are active and functional for API server configuration.
- [[mcp_servers]] is rejected in repo-local and system config layers to avoid committed or machine-global auto-launch of arbitrary MCP processes.
MCP servers
Use [[mcp_servers]] only in the user config file. Each server is connected at
session start; load failures abort startup instead of leaving a partial MCP
registry in memory. Connected servers are explicitly cancelled at session end
via McpRegistry::shutdown().
HTTP headers may be written literally, as bare ${NAME} references, or as
templates that mix literal text with ${NAME} segments resolved from the
current process environment.
[[mcp_servers]]
name = "docs"
transport = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "."]
[[mcp_servers]]
name = "remote"
transport = "http"
url = "https://mcp.example.internal/mcp"
timeout_secs = 60
[mcp_servers.headers]
Authorization = "Bearer ${VEX_MCP_AUTH}"
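Header templates that mix literal text with ${NAME} segments can be expanded in a single left-to-right pass. The sketch below is a hypothetical `expand_header` function that takes the environment lookup as a parameter so it can be exercised without touching the real process environment. Treating an unset variable or an unterminated reference as an error is an assumption; the runtime's actual behavior in those cases is not stated here.

```rust
/// Expands `${NAME}` segments in a header value using `lookup`.
/// Hypothetical helper; error handling choices are assumptions.
pub fn expand_header(
    template: &str,
    lookup: &dyn Fn(&str) -> Option<String>,
) -> Result<String, String> {
    let mut out = String::new();
    let mut rest = template;
    while let Some(start) = rest.find("${") {
        out.push_str(&rest[..start]); // literal text before the reference
        let after = &rest[start + 2..];
        let end = after
            .find('}')
            .ok_or_else(|| format!("unterminated reference in {:?}", template))?;
        let name = &after[..end];
        let value = lookup(name)
            .ok_or_else(|| format!("environment variable {} is not set", name))?;
        out.push_str(&value);
        rest = &after[end + 1..];
    }
    out.push_str(rest); // trailing literal text
    Ok(out)
}
```

In real use the lookup could be `&|name| std::env::var(name).ok()`, which resolves names from the current process environment.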
When MCP servers are loaded successfully:
- /mcp list shows the current server inventory.
- /mcp show <server> shows the tool names exported by one server.
- /tools includes both built-in tools and MCP tools.
Minimal examples
Local endpoint:
model_url = "http://localhost:8080/v1"
model_name = "local/default"
model_profile = "models/local-balanced.toml"
Local Messages-v1 endpoint example:
model_url = "http://localhost:8000/v1/messages"
model_name = "your-model-name"
model_profile = "models/local-balanced.toml"
Remote endpoint:
model_url = "https://api.example.internal/v1/messages"
model_name = "repo-assistant"
model_profile = "models/api-structured.toml"
Token for authenticated endpoints:
export VEX_MODEL_TOKEN="your-token"
CLI and TUI Commands
This page documents the commands and flags implemented in the current binary.
CLI
vex
Starts the interactive full-screen CLI UI. While a task is running, the task
surface uses the ratatui-native renderer for a human-readable header, optional
changed-file row, a full-height transcript body above the composer, and a
larger multiline composer. Tool calls, waiting-state telemetry, and assistant
responses stream into transcript paragraphs on that shared body instead of a
dedicated visible timeline strip. When completed turns record usage metadata, the
header appends a compact ~N.Nk ctx cumulative session indicator. The prompt
surface keeps active / command hints, active @path file suggestions, a current
character count and focus marker in the composer header, submit-time @path
expansion, pasted blocks, and multiline editing available in the same
fullscreen layout. The composer auto-fits to the available display rows and
columns as the window grows, shrinks, or snaps to smaller layouts, so the
prompt surface reflows instead of holding onto a stale fixed-height block. For
repo-overview prompts, the runtime now steers the model
toward list_files at the workspace root or codebase_search before any
targeted read_file; read_file itself requires an explicit non-empty path.
vex --resume [task-id]
Resumes a saved task. With no task id, VexCoder offers recent tasks for selection.
vex -p "PROMPT" or vex --print "PROMPT"
Runs one prompt turn and prints the result to stdout. If stdin is piped, the stdin content is prepended to the prompt.
vex exec --task "TEXT"
Runs a non-interactive batch task.
Useful flags:
- --task-file PATH
- --max-turns N
- --auto-approve once|task
- --format jsonl|text
- --output PATH
Each JSONL turn record includes a tokens object with input, output, and
estimated fields.
vex doctor [--json]
Runs a read-only environment health check. It validates config loading, checks model endpoint reachability, reports sandbox fallback status, probes configured MCP servers without starting them, inspects state-directory writability, and verifies that any present policy file parses cleanly.
Exit code is non-zero only when one or more checks fail. --json emits a JSON
array of {check,status,message} objects.
vex export <task-id> [--format jsonl|markdown] [--output PATH] [--force]
Exports a saved task from .vex/state (or VEX_STATE_DIR).
- jsonl matches the batch-turn schema used by vex exec
- markdown omits full assistant response text and only includes tool outcomes
- --output PATH writes to a file instead of stdout
- --force allows overwriting an existing output file
vex init [--dir PATH]
Creates .vex/config.toml, .vex/validate.toml, and AGENTS.md without
overwriting existing files.
vex branch <name>
Creates and switches to a new git branch from HEAD.
If a saved task state exists, VexCoder records the branch name on the most
recent task file in .vex/state (or VEX_STATE_DIR).
vex pr-summary
Builds a diff from the current branch against the merge-base of the default
remote branch (origin/HEAD) and runs one model turn to draft a PR title and
body.
The result prints to stdout. The current template starts with a Title: line
followed by a Markdown body, so you can review it locally or pipe it into your
own git-hosting CLI workflow.
vex migrate config [--output PATH]
Writes a TOML fragment based on legacy environment variables.
vex completions <bash|zsh|fish|powershell>
Writes shell completion scripts to stdout.
vex install-hooks and vex uninstall-hooks
Installs or removes the repository prepare-commit-msg hook.
vex skills list
Lists installed skills.
vex skills install SOURCE [--subdir PATH]
Installs a skill from a git URL or tarball URL.
vex skills remove NAME
Removes an installed skill by name.
TUI slash commands
Commands entered inside the interactive UI start with /.
Session and task state
- /new: save the current task and start a fresh session with a new task ID.
- /resume [task-id]: restore a previously saved task. Lists recent tasks when no ID is given.
- /compact: reset conversation history, turn evidence, and token counters while keeping the current task ID and permission grants. Use this to recover from context-window overflow or to free up context budget.
- /fork [label]: save the current task and start a new task seeded with the same grants.
- /undo: revert the last file-modifying tool call from the in-memory checkpoint stack. Binary-safe: restores raw bytes for text and binary files and removes rename destinations when applicable. Returns a diagnostic when the stack is empty or when undo is disabled via [undo] enabled = false.
- /quit or /exit: end the session.
- /about: show version and build info.
Memory
- /memory
- /memory add <note>
- /memory clear
- /memory auto on: enable automatic memory extraction for the current session. After each assistant turn, short factual notes are extracted and appended to the notes file with [auto] tags.
- /memory auto off: disable automatic memory extraction for the current session.
- /memory auto clear: remove all [auto]-tagged notes from the notes file.
Permissions
- /permissions
- /allow <capability> [once|session]
- /deny <capability>
Model and diff helpers
- /model
- /model <name>
- /diff
- /diff --staged
Edit loop
- /edit <instruction>
  - Expands @path mentions inside the instruction before the edit loop starts so picked files can be inlined as context.
  - Grants task-scoped write-file, apply-patch, and run-command permissions for the active edit workflow unless that capability is already session-scoped.
- /fix
  - Restores the edit loop from the last validation failure and re-seeds the same task-scoped edit permissions without narrowing existing session grants.
Read-only semantic turns
- /explain [path]
  - Accepts either a plain workspace-relative path or @path; @path is normalized to the requested file target before context assembly runs.
- /review [--base <git-ref>] [--files <glob>] [<instruction>]
  - Starts a single review turn without entering the edit loop.
  - With no flags, reviews git diff HEAD.
  - --base <git-ref> reviews git diff <git-ref> after validating the ref.
  - --files <glob> assembles matching workspace files instead of a diff and cannot be combined with --base.
  - Expands @path mentions inside the free-form review instruction before the review turn starts. When --files receives @glob, the leading @ is stripped before file matching.
  - Patch requests are silently denied during the turn.
- /plan <instruction>
  - Generates a concise implementation plan for the given instruction.
  - Assembles workspace context via ContextAssembler; renders plan_template.txt.
  - Expands @path mentions inside the instruction before the plan turn starts.
  - Never enters the edit loop; patch requests are silently denied during the turn.
- /init [environment]
  - Scaffolds .vex/config.toml, .vex/validate.toml, and AGENTS.md in the current workspace.
  - Reports the selected environment label in the transcript when one is supplied.
- /context
- /mcp [list|show <server>]
  - Zero-turn MCP inspection surface.
  - /mcp and /mcp list show loaded servers, transports, and tool counts.
  - /mcp show <server> lists the server's fully qualified mcp.<server>.<tool> names.
  - If no servers are loaded, the transcript shows [mcp] no MCP servers loaded.
- /tools [desc]
  - Zero-turn tool inventory.
  - Always shows built-in tools and retrieval/mutation guidance.
  - Includes loaded MCP tools under a dedicated [tools:mcp] section.
  - /tools desc adds one-line descriptions from the tool schemas.
- /usage
- /commands
- /help
When a read-only turn asks for a repo summary instead of a specific file, the
runtime prefers list_files and codebase_search first. If the model emits a
read_file call without a concrete path, VexCoder returns a clarification
instead of looping the raw tool error, even when the malformed read_file
arrives in the same parallel tool round as other read-only calls.
/usage prints the most recent turn's token counts and the cumulative session
totals. If the runtime does not return usage metadata, the values are estimated
from character counts and marked (estimated). /new and /compact reset the
session totals.
Test generation
- /generate-tests [path] [--framework <name>]
  - Starts a single semantic turn using the test-generation prompt template.
  - Assembles context for the requested path, or the most recently assembled file when no path is provided.
  - Only test-file mutations are allowed; source-file edits must use /edit.
Custom commands
- .vex/commands/*.toml
- ~/.config/vex/commands/*.toml
- Custom slash commands load at session start from project and user command directories.
- Project-scoped commands override user-scoped commands with the same name.
- Templates support {{context}} and {{input}} substitution.
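Placeholder substitution for a custom command template can be sketched as a single pass over the template text. `render_template` is a hypothetical function, not the runtime's renderer. Leaving unknown placeholders untouched and never re-scanning substituted text are assumptions of this sketch; the second choice keeps a literal {{input}} inside the assembled context from being expanded again.

```rust
/// Substitutes {{context}} and {{input}} in one left-to-right pass.
/// Hypothetical helper; unknown placeholders are left as written.
pub fn render_template(template: &str, context: &str, input: &str) -> String {
    let mut out = String::with_capacity(template.len());
    let mut rest = template;
    while let Some(start) = rest.find("{{") {
        out.push_str(&rest[..start]);
        let tail = &rest[start..];
        if let Some(after) = tail.strip_prefix("{{context}}") {
            out.push_str(context);
            rest = after;
        } else if let Some(after) = tail.strip_prefix("{{input}}") {
            out.push_str(input);
            rest = after;
        } else {
            // Not a known placeholder: emit the braces and keep scanning.
            out.push_str("{{");
            rest = &tail[2..];
        }
    }
    out.push_str(rest);
    out
}
```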
Validation helpers
- /run [command]
- /test
  - Run without starting a model turn.
  - Command output is captured for the transcript, with per-command stdout, stderr, and exit status summarized after each command completes.
- /reindex
  - Rebuilds the codebase structural index in the background without blocking the TUI. Reports completion back to the transcript when finished.
  - Refuses to run when [search].enabled = false.
Free-form input transforms
@path
- Expands a workspace-relative file or directory into the prompt when the turn is submitted.
- While composing, the prompt footer searches the entire repo tree, including nested subdirectories, ranks matches by basename and path relevance, and keeps a bounded top-ranked candidate set per keystroke instead of sorting the full workspace on every keypress.
- When a file mention is active, Up and Down move the suggestion picker through the full match list, Enter inserts the selected workspace-relative path into the composer, and Esc dismisses the picker so the raw mention can still be submitted unchanged.
- Files are inlined as fenced text blocks. Missing paths are annotated inline instead of aborting the turn.
- Directories render a compact workspace-relative listing.
- Slash commands with free-form instructions (/edit, /plan, /review) expand selected @path mentions before the model turn starts. /explain treats @path as the requested file target.
- Repo summaries still need tool evidence: use a plain prompt when you want the model to start with list_files or codebase_search, and use @path only when you already know the file or directory you want to inline.
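One way to keep a bounded top-ranked candidate set, as the @path picker is described to do, is a fixed-size min-heap: each match is pushed, and the weakest entry is dropped whenever the heap grows past the limit. The sketch below assumes a hypothetical scoring scheme (basename prefix, then basename substring, then path substring) and hypothetical function names; the actual ranking in VexCoder may differ.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical relevance score: basename hits outrank hits elsewhere in the path.
fn score_path(path: &str, query: &str) -> Option<u32> {
    let path_lc = path.to_lowercase();
    let query_lc = query.to_lowercase();
    let basename = path_lc.rsplit('/').next().unwrap_or(path_lc.as_str());
    if basename.starts_with(query_lc.as_str()) {
        Some(300)
    } else if basename.contains(query_lc.as_str()) {
        Some(200)
    } else if path_lc.contains(query_lc.as_str()) {
        Some(100)
    } else {
        None
    }
}

/// Return at most `limit` matching paths, best first, without sorting every path.
pub fn top_candidates<'a>(paths: &[&'a str], query: &str, limit: usize) -> Vec<&'a str> {
    // Key order: higher score, then shorter path, then alphabetical.
    // Wrapping the key in Reverse turns the max-heap into a min-heap,
    // so pop() always removes the weakest candidate kept so far.
    let mut heap: BinaryHeap<Reverse<(u32, Reverse<usize>, Reverse<&'a str>)>> =
        BinaryHeap::new();
    for &path in paths {
        if let Some(score) = score_path(path, query) {
            heap.push(Reverse((score, Reverse(path.len()), Reverse(path))));
            if heap.len() > limit {
                heap.pop();
            }
        }
    }
    // Ascending order of Reverse(key) is descending order of key: best first.
    heap.into_sorted_vec()
        .into_iter()
        .map(|Reverse((_, _, Reverse(path)))| path)
        .collect()
}
```

With `n` paths and a limit of `k`, this does O(n log k) work per keystroke instead of the O(n log n) a full sort would need.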
!command
- Runs a shell command immediately from the workspace without starting a model turn when the composer is submitted.
- Uses the same run_command approval gate as tool calls.
- Starts a captured command session inside the managed TUI instead of yielding control back to the parent CLI session.
- The transcript records the command, PID, streamed output, and final [command session exit: N] status.
Tool inventory
The model can invoke the following tools during a turn. Read-only tools run without confirmation; mutating tools require operator approval (or a session/capability auto-approval grant).
Read-only tools
| Tool | Purpose |
|---|---|
| read_file | Read file content from an explicit non-empty path. Accepts offset (1-based line) and limit for partial reads. For repo overviews, use list_files or codebase_search first. |
| list_files | List files and directories under a path, or the workspace root when omitted. Prefer this for initial repo exploration. |
| list_directory | Alias for list_files. |
| search_files | Search text across files and return matching lines. |
| search | Alias for search_files. |
| find_files | Find files by name pattern (glob) within the workspace. |
| list_dir | Non-recursive directory listing. Workspace-confined and .gitignore-aware. Optional path (defaults to workspace root); optional max_entries (default 200, hard cap 500). |
| glob_files | Workspace-wide glob matching. .gitignore-aware with bounded results. Required pattern (supports *, **, ?, [abc], [a-z], [^x]); optional max_results (default 50, hard cap 200). |
| codebase_search | Search the structural index for functions, types, and code patterns by name or keyword. Returns ranked code snippets with file paths and line numbers. When embeddings are configured, also performs semantic reranking. Prefer this over read_file for exploring unfamiliar code. |
| git_status | Show git repository status. |
| git_diff | Show git diff output. |
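The numeric parameters in the table (1-based offset and limit for read_file, defaults and hard caps for list_dir and glob_files) can be read as in the sketch below. The helper names are hypothetical, and treating an offset of 0 like 1 is an assumption of the example, not documented behavior.

```rust
/// Resolve an optional caller-supplied limit against a fallback and a hard cap.
pub fn effective_limit(requested: Option<usize>, fallback: usize, hard_cap: usize) -> usize {
    requested.unwrap_or(fallback).min(hard_cap)
}

/// Select lines using a 1-based starting line and an optional line count.
pub fn read_lines(content: &str, offset: Option<usize>, limit: Option<usize>) -> Vec<&str> {
    // A missing or zero offset is treated as line 1 (assumption of this sketch).
    let start = offset.unwrap_or(1).max(1) - 1;
    let lines = content.lines().skip(start);
    match limit {
        Some(n) => lines.take(n).collect(),
        None => lines.collect(),
    }
}
```

For example, `effective_limit(None, 200, 500)` models list_dir with no max_entries, and `effective_limit(Some(1000), 50, 200)` models a glob_files request that asks for more than the cap allows.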
Mutating tools
| Tool | Purpose |
|---|---|
| write_file | Write full file content. Files above VEX_DIFF_PREFERRED_ABOVE_LINES (default 200) trigger a warning suggesting apply_patch or edit_file. Files above VEX_WRITE_FILE_MAX_LINES (default 500) are rejected. |
| edit_file | Replace one exact unique snippet (old_str → new_str). Preferred for targeted edits. Transcript previews keep multiline diff hunks so added and removed rows stay visible during review. |
| apply_patch | Apply full-file content as a patch. Preferred for large-scale changes where edit_file is impractical. |
| rename_file | Rename or move a file within the workspace. |
| run_command | Execute a shell command in the workspace. |
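The write_file thresholds and the edit_file uniqueness rule can be sketched as below. The type and function names are hypothetical; the two thresholds are passed in as parameters and stand for VEX_DIFF_PREFERRED_ABOVE_LINES and VEX_WRITE_FILE_MAX_LINES. How VexCoder counts lines or reports ambiguity is not specified in this document, so those details are assumptions.

```rust
#[derive(Debug, PartialEq)]
pub enum WriteDecision {
    Accept,
    /// Accepted, but the caller should suggest apply_patch or edit_file.
    AcceptWithWarning,
    Reject,
}

/// Classify a full-file write by its line count.
pub fn check_write(content: &str, preferred_above: usize, max_lines: usize) -> WriteDecision {
    let lines = content.lines().count();
    if lines > max_lines {
        WriteDecision::Reject
    } else if lines > preferred_above {
        WriteDecision::AcceptWithWarning
    } else {
        WriteDecision::Accept
    }
}

#[derive(Debug, PartialEq)]
pub enum EditError {
    NotFound,
    /// The snippet matched this many times, so the edit is ambiguous.
    Ambiguous(usize),
}

/// Replace `old_str` with `new_str` only when `old_str` occurs exactly once.
pub fn apply_edit(content: &str, old_str: &str, new_str: &str) -> Result<String, EditError> {
    if old_str.is_empty() {
        return Err(EditError::NotFound);
    }
    // `matches` counts non-overlapping occurrences.
    match content.matches(old_str).count() {
        0 => Err(EditError::NotFound),
        1 => Ok(content.replacen(old_str, new_str, 1)),
        n => Err(EditError::Ambiguous(n)),
    }
}
```

Requiring a unique match is what makes a snippet-based edit safe to apply without line numbers: if the snippet appears twice, the caller has to include more surrounding context.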
Search ranking
codebase_search uses a Tree-sitter-based structural index that extracts
functions, structs, enums, impls, traits, modules, constants, and type
aliases from Rust source files. The index is built at session start and
updated incrementally on file writes.
Results are scored by:
- Exact name match: highest priority
- Substring / fuzzy name match
- Parent scope match
- Content keyword match (per word)
Results are capped at VEX_SEARCH_MAX_RESULTS (default 10). When an
embedding provider is configured (VEX_EMBEDDING_PROVIDER), results are
additionally reranked by semantic similarity using the persisted vector index
at .vex/index/.
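The scoring order above can be sketched as an additive score followed by a capped, stable sort. The `Symbol` type, the weights, and the function names are hypothetical, and this example only implements substring matching (no fuzzy matching and no semantic reranking); the actual weights are not given in this document.

```rust
/// One indexed item (hypothetical shape for this sketch).
#[derive(Debug, Clone)]
pub struct Symbol {
    pub name: String,
    pub parent: String,
    pub content: String,
}

impl Symbol {
    pub fn new(name: &str, parent: &str, content: &str) -> Self {
        Symbol {
            name: name.to_string(),
            parent: parent.to_string(),
            content: content.to_string(),
        }
    }
}

// Assumed weights; only their relative order follows the documented priorities.
const EXACT_NAME: u32 = 100;
const PARTIAL_NAME: u32 = 50;
const PARENT_SCOPE: u32 = 20;
const CONTENT_WORD: u32 = 5;

/// Score one symbol against a query, case-insensitively.
pub fn score_symbol(symbol: &Symbol, query: &str) -> u32 {
    let q = query.trim().to_lowercase();
    if q.is_empty() {
        return 0;
    }
    let name = symbol.name.to_lowercase();
    let mut total = 0;
    if name == q {
        total += EXACT_NAME;
    } else if name.contains(q.as_str()) {
        total += PARTIAL_NAME;
    }
    if symbol.parent.to_lowercase().contains(q.as_str()) {
        total += PARENT_SCOPE;
    }
    // Content keywords are scored once per query word.
    let content = symbol.content.to_lowercase();
    for word in q.split_whitespace() {
        if content.contains(word) {
            total += CONTENT_WORD;
        }
    }
    total
}

/// Return the best-scoring symbols, highest first, capped at `max_results`.
pub fn rank<'a>(symbols: &'a [Symbol], query: &str, max_results: usize) -> Vec<&'a Symbol> {
    let mut scored: Vec<(u32, &Symbol)> = symbols
        .iter()
        .map(|s| (score_symbol(s, query), s))
        .filter(|(score, _)| *score > 0)
        .collect();
    scored.sort_by(|a, b| b.0.cmp(&a.0)); // stable: ties keep index order
    scored.into_iter().take(max_results).map(|(_, s)| s).collect()
}
```

Here `max_results` plays the role of VEX_SEARCH_MAX_RESULTS.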
Error handling
Context-overflow recovery
When the conversation exceeds the server's context window, VexCoder detects the overflow from the HTTP 400 response body and provides actionable guidance:
- Local endpoints: suggests restarting the server with a larger context size (e.g. --ctx-size 8192) or using /compact to reset the conversation.
- Remote endpoints: suggests using /compact to reset the conversation.
The server's error message is shown verbatim, capped at 300 characters.
For non-context-overflow HTTP 400 errors from local endpoints, the error includes the detected protocol (MessagesV1 vs ChatCompat) and suggests checking the model name, protocol format, and whether the server supports streaming.
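The overflow guidance described above can be sketched as follows. Everything here is illustrative: the type and function names are hypothetical, and the phrases used to recognize an overflow in the response body are assumptions, since this document does not list the patterns VexCoder actually checks. The cap counts characters rather than bytes so that multi-byte text is never split.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum EndpointKind {
    Local,
    Remote,
}

/// Documented cap for the verbatim server message.
const MAX_SERVER_MESSAGE_CHARS: usize = 300;

/// Keep at most 300 characters of the server's message.
pub fn cap_message(message: &str) -> String {
    message.chars().take(MAX_SERVER_MESSAGE_CHARS).collect()
}

/// Guess whether an HTTP 400 body reports a context overflow.
/// The matched phrases are assumptions made for this sketch.
pub fn is_context_overflow(status: u16, body: &str) -> bool {
    let body = body.to_lowercase();
    status == 400
        && body.contains("context")
        && (body.contains("exceed") || body.contains("too long"))
}

/// Build the operator-facing text: capped server message plus a hint.
pub fn overflow_guidance(kind: EndpointKind, server_message: &str) -> String {
    let hint = match kind {
        EndpointKind::Local => {
            "Restart the server with a larger context size (e.g. --ctx-size 8192) \
             or use /compact to reset the conversation."
        }
        EndpointKind::Remote => "Use /compact to reset the conversation.",
    };
    format!("{}\n{}", cap_message(server_message), hint)
}
```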
Keyboard notes
- Ctrl+C requests cancellation for the active turn.
- Alt+Up and Alt+Down move the selected entry in the adaptive task timeline.
- Tab and Shift+Tab also move timeline selection forward and backward while the task surface is active.
- The visible timeline window scales with display height instead of staying fixed at six rows.
- The composer auto-fits to the current display row and column budget, so snapping the display to half-screen or quarter-screen sizes reflows the prompt surface instead of overflowing or leaving empty space.
- PageUp, PageDown, Ctrl+Up, and Ctrl+Down scroll the transcript/output pane upward from the prompt edge instead of moving the cursor.
- Ctrl+Home jumps to the oldest visible transcript content, and Ctrl+End returns to the current bottom edge.
- The transcript pane keeps the full session scrollback visible while follow mode is on; new model responses append at the bottom instead of replacing the prior response view.
- Transcript scrolling follows wrapped display rows, so long paragraphs, embedded newlines, and multiline diff previews remain reachable in both fullscreen and fallback transcript views.
- Selecting older timeline entries manually switches the output pane into inspector detail for that step until follow mode resumes.
- Shift+Enter inserts a newline without submitting the turn.
- Pasted text is inserted into the larger multiline prompt surface during normal editing.
- The composer header shows a current focus indicator (focused/unfocused) and a character count that updates as you type.
Legacy Config Note
VexCoder keeps vex migrate config as a small compatibility helper for older
local setups that still export legacy VEX_* values.
There is no separate migration workflow documented for normal installs. For current setup guidance, use the main docs instead.
If you do need the compatibility helper, run vex migrate config --help to see
its current CLI surface.