What It Is For
Thread episodic context is useful when the user refers to prior discussion instead of a durable fact:

- “continue the implementation plan from earlier”;
- “what did you answer yesterday in the research thread?”;
- “use the API constraints we discussed above”;
- “why did you say that the image showed this model?”;
- “find the message where we decided to remove that fallback.”
What It Is Not
Thread episodic context is not a raw transcript dump. It is also not a replacement for durable memory.

| Need | Use |
|---|---|
| “Remember that my birthday is May 6.” | Durable memory. |
| “What did we discuss three turns ago?” | Retained recent history or thread episodic context. |
| “What was the plan from yesterday’s architecture discussion?” | Thread/workspace episodic recall. |
| “Always answer me in Russian.” | Durable memory or explicit current instruction. |
| “Read the PDF I uploaded in that old message.” | Thread episodic recall may recover the message; artifact_read loads the file only if needed. |
Storage Model
Thread episodic context uses a database control plane plus memvid searchable capsules. The database stores the authoritative metadata:

- workspace id;
- thread id;
- turn id;
- item id;
- chunk index;
- role/source context;
- status;
- text hash;
- capsule/indexing state;
- retry and repair state.
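
As a rough illustration of the metadata listed above, the sketch below models one control-plane row per chunk. The field names mirror the list; everything else is an assumption made for the example, not Pioneer's actual schema: the `ChunkRecord` and `build_records` names, the `"pending"` status value, SHA-256 as the text hash, and the paragraph-packing splitter with its 400-character limit.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkRecord:
    """Hypothetical control-plane row for one indexed chunk."""
    workspace_id: str
    thread_id: str
    turn_id: str
    item_id: str
    chunk_index: int
    role: str          # role/source context, e.g. "user" or "assistant"
    status: str        # assumed values such as "pending" or "indexed"
    text_hash: str     # assumed to be SHA-256 of the chunk text
    retry_count: int = 0


def split_into_chunks(text: str, max_chars: int = 400) -> list[str]:
    """Pack whole paragraphs greedily into chunks (an assumed strategy).

    A single paragraph longer than max_chars is kept as one oversized chunk.
    """
    chunks: list[str] = []
    current = ""
    for paragraph in (p.strip() for p in text.split("\n\n")):
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


def build_records(workspace_id: str, thread_id: str, turn_id: str,
                  item_id: str, role: str, text: str) -> list[ChunkRecord]:
    """Create one pending record per chunk of a single message."""
    return [
        ChunkRecord(
            workspace_id=workspace_id,
            thread_id=thread_id,
            turn_id=turn_id,
            item_id=item_id,
            chunk_index=index,
            role=role,
            status="pending",
            text_hash=hashlib.sha256(chunk.encode("utf-8")).hexdigest(),
        )
        for index, chunk in enumerate(split_into_chunks(text))
    ]
```

In this sketch a short message yields a single record with chunk index 0, while a long answer yields several records that share the same item id.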
Why Chunks Exist
Some messages are short. A single user message might be one sentence, and the whole message can be indexed as one chunk.

Some messages are long. An assistant answer can contain a full research report, multiple alternatives, code, tool summaries, and conclusions. Searching that as one giant block gives poor recall: one relevant sentence is buried inside unrelated text, and one hit can consume too much prompt budget.

Chunks give Pioneer smaller searchable units with exact provenance: each chunk can be traced back through its chunk index, item id, turn id, and thread id.

What Gets Indexed
Thread episodic context indexes visible conversation material that can help future turns. The normal sources are user messages and assistant messages. Tool and task outputs are not dumped wholesale into the index; when they become useful, they should be summarized or projected into a human-meaningful text surface first.

This matters because raw tool output can be huge, noisy, or sensitive. The thread context layer is for recallable conversation context, not for storing every byte that passed through a tool.

Recall Flow
Thread episodic recall is usually planned before the main model answers. The preflight planner does not read memvid directly. It returns a structured plan. Rust validates that plan and runs the search through the gateway service. If the plan is invalid, unsupported, too broad, or too ambiguous, the gateway can skip episodic recall safely.

Current Thread And Workspace Recall
There are two useful recall shapes:

| Shape | Meaning |
|---|---|
| Current-thread recall | Search earlier indexed context from the same thread. Useful for long threads and local follow-ups. |
| Workspace-thread recall | Search indexed context across threads in the same workspace. Useful for “what did we discuss yesterday?” or “find the prior research thread.” |
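
The plan validation described under Recall Flow, combined with the two shapes in the table above, can be sketched roughly as follows. This is an illustration in Python, not the Rust code the page refers to. The `RecallPlan` fields, the scope strings, the candidate ceiling, and the two-word "too vague" heuristic are all assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RecallScope(Enum):
    CURRENT_THREAD = "current_thread"        # assumed wire value
    WORKSPACE_THREADS = "workspace_threads"  # assumed wire value


@dataclass(frozen=True)
class RecallPlan:
    """Hypothetical structured plan returned by the preflight planner."""
    scope: str
    query: str
    max_candidates: int


@dataclass(frozen=True)
class ValidatedPlan:
    scope: RecallScope
    query: str
    max_candidates: int
    workspace_id: str
    thread_id: Optional[str]  # None means "any thread in the workspace"


MAX_CANDIDATES = 20  # assumed ceiling for "too broad"
MIN_QUERY_WORDS = 2  # assumed heuristic for "too ambiguous"


def validate_plan(plan: RecallPlan, workspace_id: str,
                  thread_id: str) -> Optional[ValidatedPlan]:
    """Return a bounded plan, or None so the caller skips episodic recall."""
    try:
        scope = RecallScope(plan.scope)
    except ValueError:
        return None  # unsupported scope
    query = plan.query.strip()
    if len(query.split()) < MIN_QUERY_WORDS:
        return None  # too vague to search
    if not 1 <= plan.max_candidates <= MAX_CANDIDATES:
        return None  # invalid or too broad
    # Current-thread recall pins the search to one thread; workspace recall
    # leaves the thread open but always stays inside the workspace.
    scoped_thread = thread_id if scope is RecallScope.CURRENT_THREAD else None
    return ValidatedPlan(scope, query, plan.max_candidates,
                         workspace_id, scoped_thread)
```

Returning `None` rather than raising keeps the skip path cheap: the caller simply continues without injected snippets.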
Artifact References
Conversation history often includes files: photos, screenshots, PDFs, CSVs, logs, and generated reports. Thread episodic context does not inline those file bytes into indexed chunks. Instead, if a recalled snippet came from a message with artifacts, Pioneer can render a scoped artifact-reference block for that snippet.

The model can then use the artifact tool domain and call artifact_read if it actually needs the image or file content. If the question can be answered from the text snippet alone, no artifact bytes are loaded.
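
The separation between references and bytes can be illustrated with a small sketch. The reference-block layout, the `ArtifactRef` and `Snippet` types, and the in-memory `ArtifactStore` are hypothetical stand-ins; only the `artifact_read` name comes from this page, and its real signature may differ.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ArtifactRef:
    """Hypothetical reference: metadata only, never the file bytes."""
    artifact_id: str
    filename: str
    media_type: str


@dataclass
class Snippet:
    item_id: str
    text: str
    artifacts: list[ArtifactRef] = field(default_factory=list)


def render_snippet(snippet: Snippet) -> str:
    """Render recalled text plus references scoped to this snippet only."""
    lines = [f"[recalled from item {snippet.item_id}]", snippet.text]
    if snippet.artifacts:
        lines.append("artifacts attached to this message:")
        for ref in snippet.artifacts:
            lines.append(f"- {ref.artifact_id} ({ref.filename}, {ref.media_type})")
    return "\n".join(lines)


class ArtifactStore:
    """In-memory stand-in that counts how often bytes are actually loaded."""

    def __init__(self, blobs: dict[str, bytes]):
        self._blobs = blobs
        self.reads = 0

    def artifact_read(self, artifact_id: str) -> bytes:
        self.reads += 1
        return self._blobs[artifact_id]
```

Rendering a snippet never touches the store; bytes move only when `artifact_read` is called for a specific artifact id.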
Configuration
Thread context is controlled under gateway.thread_episodic; the Configuration Reference lists the available settings.
Operational Notes
Indexing is asynchronous and retryable. A message may be visible in the timeline before its thread-context chunk is searchable. That is expected. Recent conversation history still goes into the provider request directly, so the assistant does not need episodic recall for the last few messages.

Recall is best-effort. If the query is too vague, the indexed text is not ready, or the prompt budget is too small, no snippet may be injected. The assistant should still answer from the current prompt and available tools rather than assuming recall is perfect.

Developer Rules
- Keep thread episodic storage separate from durable memory storage.
- Do not turn every thread chunk into an active durable memory record.
- Do not add a fallback that simply dumps recent history again; recent history is already part of normal conversation messages.
- Filter memvid hits through the database control plane before prompt injection.
- Keep artifact refs scoped to the recalled snippet that produced them.
- Do not send artifact bytes unless the model explicitly reads a specific artifact through artifact_read.
- Keep search bounded by workspace/thread filters, candidate limits, and prompt budgets.
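
Two of the rules above, filtering search hits through the control plane and keeping recall bounded, can be sketched together. The `Hit` shape, the `indexed_keys` set standing in for a control-plane lookup, and the character-based budget are assumptions made for illustration; a real budget would more likely be measured in tokens.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Hit:
    """Hypothetical search hit returned by the capsule search."""
    item_id: str
    chunk_index: int
    workspace_id: str
    thread_id: str
    text: str
    score: float


def select_snippets(hits: list[Hit],
                    indexed_keys: set[tuple[str, int]],
                    workspace_id: str,
                    thread_id: Optional[str],
                    max_candidates: int,
                    budget_chars: int) -> list[Hit]:
    """Keep only hits the control plane still vouches for, within bounds."""
    allowed = [
        hit for hit in hits
        if hit.workspace_id == workspace_id
        and (thread_id is None or hit.thread_id == thread_id)
        and (hit.item_id, hit.chunk_index) in indexed_keys
    ]
    allowed.sort(key=lambda hit: hit.score, reverse=True)

    selected: list[Hit] = []
    used = 0
    for hit in allowed[:max_candidates]:
        if used + len(hit.text) > budget_chars:
            continue  # skip an oversized hit; a smaller one may still fit
        selected.append(hit)
        used += len(hit.text)
    return selected
```

Passing `thread_id=None` corresponds to workspace-thread recall; the workspace filter is applied in every case.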
Related Pages
- Memory Architecture explains durable memory and active recall.
- Prompt And Context explains how recalled snippets enter the model prompt.
- Artifact Store explains artifact refs and artifact_read.
- Configuration Reference lists thread episodic settings.