Clanker Activity Model
Scope: Current shipped mapping from the shared attention contract into text and voice behavior.
- Shared attention model: presence-and-attention.md
- Unified text initiative cycle: initiative.md
- Voice pipeline and stage settings: ../voice/voice-provider-abstraction.md
- Architecture overview: overview.md
This document is the source of truth for how the current runtime surfaces one shared conversational continuity contract through text and voice. It documents the shipped spokes and guardrails, not a giant transport-level finite-state machine.
Autonomy Principle
Settings shape context, budgets, and transport. They do not script the bot's creative behavior. Admission gates are cost/noise gates, not relevance gates. The model can always choose silence through [SKIP], and tools are capabilities it may use when they help.


Mental Model
The canonical behavioral picture is:
- one shared continuity contract with ACTIVE and AMBIENT
- a text spoke that decides how attention surfaces in channels
- a voice spoke that decides how attention surfaces in VC
- orthogonal overlays such as music playback, wake latch, and assistant output lock
What matters operationally is continuity, not a literal central module. Text and voice may read different local signals as long as they preserve the same product contract.
What is currently shipped under that model:
Text has four shipped surfaces:
- Directly addressed reply
- Recent-window follow-up reply
- Cold ambient reactive reply
- Unified initiative cycle
Voice has four shipped surfaces:
- Directed or engaged voice reply
- Voice admission and floor ownership
- Voice thought engine
- Tool-assisted reply inside the active turn
Discovery feed items are optional context inside the ambient text surface. Music playback is an overlay on top of attention, not a third attention mode.
At the product level, the desired text split is simple:
- if text attention is ACTIVE, new turns should go straight to immediate reply evaluation
- if text attention is AMBIENT, the ambient text thought cycle should own whether Clanker surfaces a follow-up or initiative post
Current shipped text behavior is slightly messier than that ideal because there is still a separate cold ambient reactive reply lane alongside the ambient thought cycle.
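
The difference between the desired split and the shipped one can be sketched as a lookup from attention state to the lanes that may handle a new non-addressed turn. This is a minimal illustration only: the type names, lane labels, and `shippedTextLanes` are hypothetical and do not come from the runtime.

```typescript
// Hypothetical names throughout; this is not the runtime's actual API.
type TextAttention = "ACTIVE" | "AMBIENT";
type TextLane =
  | "immediate_reply_evaluation"
  | "cold_ambient_reactive_reply"
  | "ambient_thought_cycle";

// Desired product-level split: one lane per attention state.
function desiredTextLanes(attention: TextAttention): TextLane[] {
  return attention === "ACTIVE"
    ? ["immediate_reply_evaluation"]
    : ["ambient_thought_cycle"];
}

// Shipped behavior: AMBIENT still has a reactive lane next to the thought cycle.
function shippedTextLanes(attention: TextAttention): TextLane[] {
  return attention === "ACTIVE"
    ? ["immediate_reply_evaluation"]
    : ["cold_ambient_reactive_reply", "ambient_thought_cycle"];
}
```

The extra `cold_ambient_reactive_reply` entry is the "slightly messier" part described above.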
Text Activity Paths
1. Directly Addressed Reply
This is the clearest text-side promotion into ACTIVE.
Triggers include:
- the user mentions the bot
- the user replies to a bot message
- the user clearly uses the configured bot name or alias
Behavior:
- the message enters the reply pipeline immediately
- direct address promotes conversational attention, but it does not mark the turn as response-required
- the model still decides the exact reply or [SKIP]
- global blockers still apply, including permissions, rate limits, cooldowns, and runtime failures
Main settings:
- permissions.replies.allowReplies
- interaction.activity.minSecondsBetweenMessages
- interaction.activity.replyCoalesceWindowSeconds
Direct address bypasses the ambient and response-window admission gates. Those knobs still shape prompt context, but they do not decide whether an explicitly addressed turn enters the reply pipeline.
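
A minimal sketch of the direct-address path, assuming a simplified message shape. `IncomingTurn`, `isDirectlyAddressed`, and `admitDirectAddress` are hypothetical names. The whole-word name match and the treatment of `minSecondsBetweenMessages` as a hard block are assumptions, not the logic in src/bot/replyAdmission.ts.

```typescript
// Hypothetical shapes; the real message and settings types live in the runtime.
interface IncomingTurn {
  content: string;
  mentionsBot: boolean;
  repliesToBotMessage: boolean;
}

interface DirectAddressSettings {
  allowReplies: boolean;
  minSecondsBetweenMessages: number;
}

// Assumption: a name or alias counts only when it appears as a whole word.
function isDirectlyAddressed(turn: IncomingTurn, botNames: string[]): boolean {
  if (turn.mentionsBot || turn.repliesToBotMessage) return true;
  const words = turn.content.toLowerCase().split(/[^a-z0-9]+/);
  return botNames.some((alias) => words.indexOf(alias.toLowerCase()) !== -1);
}

// Direct address skips the ambient and response-window gates, but global
// blockers still apply and the turn is never marked response-required.
function admitDirectAddress(
  addressed: boolean,
  settings: DirectAddressSettings,
  secondsSinceLastBotMessage: number,
): { entersPipeline: boolean; responseRequired: false } {
  if (!addressed) return { entersPipeline: false, responseRequired: false };
  const blocked =
    !settings.allowReplies ||
    secondsSinceLastBotMessage < settings.minSecondsBetweenMessages;
  return { entersPipeline: !blocked, responseRequired: false };
}
```

Note that `responseRequired` is always `false` in this sketch: entering the pipeline only gives the model the chance to reply or [SKIP].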
Startup catchup reuses this same direct-address path for missed addressed turns after downtime. It scans the full reply-eligible startup channel set, hydrates recent history into the store, and replays addressed turns into the normal decision loop rather than forcing a guaranteed reply.
Relevant code:
- src/bot/replyAdmission.ts
- src/bot/replyPipeline.ts
- src/prompts/promptText.ts
2. Recent-Window Follow-Up Reply
If a user replies directly to a recent bot message, or the latest recent bot reply was aimed at the same author and no other human has already moved the room on, a later non-addressed text turn can still enter reply admission as a follow-up. This is the text spoke's current shipped version of sustained ACTIVE attention.
Behavior:
- this is still a reactive reply path, not ambient cold-start posting
- permissions.replies.allowUnsolicitedReplies still matters
- interaction.activity.responseWindowEagerness controls how wide the recent-message follow-up window is for this text-side active-thread check
- a conflicting newer human message collapses the turn back to AMBIENT
- the model can still [SKIP]
- this is message-window continuity, not a voice-style real-time timer
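
The follow-up check above can be sketched as a walk over the most recent messages. Everything here is illustrative: `RecentMessage`, `followUpWindowSize`, and `isFollowUp` are hypothetical names, and the linear mapping from eagerness to a message count is an assumption rather than the shipped formula.

```typescript
// Hypothetical message shape; `recent` is ordered oldest to newest and
// excludes the turn currently being evaluated.
interface RecentMessage {
  authorId: string;
  isBot: boolean;
  repliedToAuthorId?: string; // set on bot replies aimed at a specific user
}

// Assumed mapping: eagerness 0..100 scales linearly to a 1..maxWindow window.
function followUpWindowSize(eagerness: number, maxWindow = 10): number {
  const clamped = Math.min(100, Math.max(0, eagerness));
  return Math.max(1, Math.round((clamped / 100) * maxWindow));
}

function isFollowUp(
  authorId: string,
  recent: RecentMessage[],
  responseWindowEagerness: number,
  allowUnsolicitedReplies: boolean,
): boolean {
  if (!allowUnsolicitedReplies) return false;
  const tail = recent.slice(-followUpWindowSize(responseWindowEagerness));
  // Walk newest to oldest until we hit the latest bot reply.
  for (const message of tail.slice().reverse()) {
    if (message.isBot) return message.repliedToAuthorId === authorId;
    // A different human spoke after the bot: the thread collapses to AMBIENT.
    if (message.authorId !== authorId) return false;
  }
  return false; // no bot reply inside the window
}
```

Because the window is counted in messages, a quiet channel can keep a thread eligible for a long time, which matches the message-window (not timer) framing above.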
Relevant code:
- src/bot/replyAdmission.ts
- src/bot/replyPipeline.ts
- src/prompts/promptText.ts
3. Cold Ambient Reactive Reply
If there is no recent thread to continue, the bot can still consider a non-addressed text reply as an ambient chime-in. This is a reactive ambient surface, not an engaged-thread surface.
Behavior:
- this is separate from both direct address and recent-window follow-ups
- interaction.activity.ambientReplyEagerness is the admission dial for these cold reactive turns
- reply-channel membership adds a deterministic admission bonus, but the model still decides whether to speak or [SKIP]
- this is the main place where current shipped text behavior still goes beyond the cleaner ACTIVE immediate-reply vs AMBIENT thought-loop split
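
One way to picture the admission dial and the reply-channel bonus is a probability roll. This is a sketch under loud assumptions: the document does not say whether admission is probabilistic or threshold-based, the bonus size of 15 is invented for illustration, and both function names are hypothetical.

```typescript
// Assumption: eagerness is a 0..100 dial read as a percentage chance,
// and reply-channel membership adds a fixed bonus (15 is a made-up value).
function ambientAdmissionChance(
  ambientReplyEagerness: number,
  inReplyChannel: boolean,
  replyChannelBonus = 15,
): number {
  const base = Math.min(100, Math.max(0, ambientReplyEagerness));
  const boosted = inReplyChannel ? base + replyChannelBonus : base;
  return Math.min(100, boosted) / 100;
}

// The roll is injectable so the gate can be exercised deterministically.
function admitColdAmbient(
  chance: number,
  roll: () => number = Math.random,
): boolean {
  return roll() < chance;
}
```

Passing this gate only admits the turn into the reply pipeline; the model can still answer with [SKIP].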
Relevant code:
- src/bot/replyAdmission.ts
- src/bot/replyPipeline.ts
- src/prompts/promptText.ts
4. Unified Initiative Cycle
This is the cold-start ambient text surface. It handles both conversational chime-ins and standalone proactive posts.
Behavior:
- runs on an in-process 60 second tick
- applies deterministic gates first: enabled, cooldown, daily cap, and eagerness probability
- initiative.text.eagerness belongs to this ambient surface only; it does not gate immediate ACTIVE text replies
- builds context for eligible channels, passive discovery feed items, source performance, and memory
- gives the model a bounded tool loop
- the model decides whether to post, which channel fits, whether to use tools, whether to include links, and whether to request media
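
The deterministic gate order (enabled, cooldown, daily cap, eagerness probability) can be sketched as follows. The field names mirror the initiative.text.* settings in this document, but `InitiativeState`, `GateResult`, and `runInitiativeGates` are hypothetical and not taken from src/bot/initiativeEngine.ts.

```typescript
interface InitiativeTextSettings {
  enabled: boolean;
  eagerness: number; // 0..100, read here as a percentage chance
  minMinutesBetweenPosts: number;
  maxPostsPerDay: number;
}

// Hypothetical runtime counters consulted on each tick.
interface InitiativeState {
  minutesSinceLastPost: number;
  postsToday: number;
}

type GateResult =
  | { pass: true }
  | { pass: false; reason: "disabled" | "cooldown" | "daily_cap" | "eagerness_roll" };

// Cheap deterministic checks run before any model call is made.
function runInitiativeGates(
  settings: InitiativeTextSettings,
  state: InitiativeState,
  roll: () => number = Math.random,
): GateResult {
  if (!settings.enabled) return { pass: false, reason: "disabled" };
  if (state.minutesSinceLastPost < settings.minMinutesBetweenPosts) {
    return { pass: false, reason: "cooldown" };
  }
  if (state.postsToday >= settings.maxPostsPerDay) {
    return { pass: false, reason: "daily_cap" };
  }
  if (roll() >= settings.eagerness / 100) {
    return { pass: false, reason: "eagerness_roll" };
  }
  return { pass: true };
}
```

Only a `{ pass: true }` result would lead to context building and the bounded tool loop; the model still decides whether anything is posted.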
Canonical channel pool:
permissions.replies.replyChannelIds
initiative.discovery.* controls feed and media infrastructure for the same ambient cycle.
Main settings:
- initiative.text.enabled
- initiative.text.eagerness
- initiative.text.minMinutesBetweenPosts
- initiative.text.maxPostsPerDay
- initiative.text.lookbackMessages
- initiative.text.allowActiveCuriosity
- initiative.text.maxToolSteps
- initiative.text.maxToolCalls
- initiative.discovery.*
Relevant code:
- src/bot/initiativeEngine.ts
- src/services/discovery.ts
- src/prompts/promptText.ts
- src/store/settingsNormalization.ts
Voice Activity Paths
1. Directed Or Engaged Voice Reply
Voice behavior runs only inside an active voice session. The reply path considers direct address, command follow-up state, speaker ownership, recent room context, and output locking. This is the main voice-side ACTIVE surface.
Behavior:
- deterministic gates run first
- if the turn survives, the active reply path handles generation
- the generation layer decides the actual words or silence
Main settings:
- voice.enabled
- voice.conversationPolicy.ambientReplyEagerness
- interaction.activity.responseWindowEagerness
- interaction.activity.reactivity
- voice.soundboard.eagerness
- voice.conversationPolicy.commandOnlyMode
- voice.conversationPolicy.replyPath
- voice.admission.mode
- voice.admission.musicWakeLatchSeconds
Relevant code:
- src/voice/voiceReplyDecision.ts
- src/voice/turnProcessor.ts
- src/voice/voiceConfigResolver.ts
2. Voice Admission Gate
Voice admission is not a separate mind. It is the voice spoke's cost/floor gate informed by the current continuity state.
Voice admission currently has two layers:
- deterministic gates
- classifier or generation-owned skip behavior, depending on reply path
Current public surface:
- voice.admission.mode
- voice.admission.musicWakeLatchSeconds
- agentStack.overrides.voiceAdmissionClassifier
Runtime behavior:
- bridge reply path always behaves as classifier-first, because the text-to-realtime bridge has no native [SKIP]
- brain reply path defaults to generation_decides, but can optionally use classifier_gate as a cost gate before the main brain
- native reply path does not use the text classifier path
- voice.conversationPolicy.commandOnlyMode stays an overlay on top of shared attention: explicit wakeups, direct-address turns, and owned command follow-ups can still enter, but command-only does not create a third attention mode
- internal runtime labels like hard_classifier and generation_only are implementation details, not canonical settings names
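
The per-path behavior above amounts to a small resolution step. In this sketch the reply paths and the two admission mode values come from this document, while `EffectiveAdmission`, its labels, and `resolveVoiceAdmission` are hypothetical names chosen for illustration (they are not the internal runtime labels).

```typescript
type ReplyPath = "native" | "bridge" | "brain";
// The two voice.admission.mode values named in this document.
type AdmissionMode = "generation_decides" | "classifier_gate";
// Hypothetical labels for the resulting behavior.
type EffectiveAdmission =
  | "classifier_first"
  | "generation_decides"
  | "no_text_classifier";

function resolveVoiceAdmission(
  replyPath: ReplyPath,
  mode: AdmissionMode,
): EffectiveAdmission {
  switch (replyPath) {
    case "bridge":
      // No native [SKIP] on the bridge, so a classifier must decide first.
      return "classifier_first";
    case "brain":
      // The brain can skip on its own; the classifier is an optional cost gate.
      return mode === "classifier_gate" ? "classifier_first" : "generation_decides";
    case "native":
      // Provider-native audio never goes through the text classifier.
      return "no_text_classifier";
  }
}
```

The configured mode only changes the outcome on the brain path; bridge and native ignore it in this sketch.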
Relevant code:
- src/voice/voiceReplyDecision.ts
- src/settings/agentStack.ts
3. Voice Thought Engine
This is the ambient voice surface. It runs only while a voice session is active and the room is quiet enough.
Behavior:
- gated by shared ambient attention, silence, and spacing, not by a global cron
- the thought loop waits for voice attention to decay back to AMBIENT; a warm active exchange is not treated as ambient just because the room is briefly quiet
- eagerness is a probability gate before generation
- even after the gate passes, the generated thought can still be rejected or skipped
- delivery uses the active voice output transport
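
The gating described in this list can be sketched as one predicate evaluated while a session is live. The settings fields mirror initiative.voice.*; `VoiceRoomState` and `shouldAttemptVoiceThought` are hypothetical names, and reading eagerness as a percentage chance is an assumption.

```typescript
interface VoiceThoughtSettings {
  enabled: boolean;
  eagerness: number; // 0..100, read here as a percentage chance
  minSilenceSeconds: number;
  minSecondsBetweenThoughts: number;
}

// Hypothetical snapshot of the voice session at evaluation time.
interface VoiceRoomState {
  sessionActive: boolean;
  attention: "ACTIVE" | "AMBIENT";
  secondsOfSilence: number;
  secondsSinceLastThought: number;
}

function shouldAttemptVoiceThought(
  settings: VoiceThoughtSettings,
  room: VoiceRoomState,
  roll: () => number = Math.random,
): boolean {
  if (!settings.enabled || !room.sessionActive) return false;
  // A warm exchange stays ACTIVE even if the room is briefly quiet.
  if (room.attention !== "AMBIENT") return false;
  if (room.secondsOfSilence < settings.minSilenceSeconds) return false;
  if (room.secondsSinceLastThought < settings.minSecondsBetweenThoughts) return false;
  // Probability gate before generation; the thought can still be rejected later.
  return roll() < settings.eagerness / 100;
}
```

A `true` result only means generation is attempted; the generated thought can still be rejected or skipped before delivery.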
Canonical cadence settings:
- initiative.voice.enabled
- initiative.voice.eagerness
- initiative.voice.minSilenceSeconds
- initiative.voice.minSecondsBetweenThoughts
Implementation note:
- the live thought generator resolves provider/model from initiative.voice.execution via the resolved voice-initiative binding
Relevant code:
- src/voice/thoughtEngine.ts
- src/voice/voiceThoughtGeneration.ts
- src/settings/agentStack.ts
4. Voice Runtime Modes
Voice uses three reply-path shapes under the same attention model:
- native: provider-native audio in and audio out
- bridge: local transcription, then labeled text into the realtime provider
- brain: orchestrator-owned text generation, then realtime or API TTS delivery
The operator-facing knobs are:
- voice.conversationPolicy.replyPath
- voice.conversationPolicy.ttsMode
- agentStack.runtimeConfig.voice.runtimeMode
- agentStack.runtimeConfig.voice.generation
Channel Scope Rules
Text Scope
Reactive replies and ambient text delivery respect the text permission surfaces:
- permissions.replies.replyChannelIds: unsolicited reply channels where the bot vibes (eagerness bonus, standalone posts, softer prompt tone)
- permissions.replies.discoveryChannelIds: discovery channels where the bot can post freely on its own initiative (proactive thoughts, feed shares)
- permissions.replies.allowedChannelIds
- permissions.replies.blockedChannelIds
- permissions.replies.blockedUserIds
discoveryChannelIds is the canonical initiative pool for proactive posts. If it is empty, the initiative cycle has no eligible text channels. A channel can appear in both lists to get both behaviors (vibe bonus + proactive posting).
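
The two-list behavior described in this paragraph can be sketched as a per-channel lookup. This is illustrative only: `TextChannelLists`, `textChannelBehaviors`, and `hasInitiativePool` are hypothetical, and the allowed/blocked lists are left out because their exact semantics are not described here.

```typescript
// Hypothetical subset of the permissions.replies.* surface.
interface TextChannelLists {
  replyChannelIds: string[];
  discoveryChannelIds: string[];
}

// A channel may appear in both lists and then gets both behaviors.
function textChannelBehaviors(
  channelId: string,
  lists: TextChannelLists,
): { vibeBonus: boolean; proactivePosts: boolean } {
  return {
    vibeBonus: lists.replyChannelIds.indexOf(channelId) !== -1,
    proactivePosts: lists.discoveryChannelIds.indexOf(channelId) !== -1,
  };
}

// An empty discovery list leaves the initiative cycle with no eligible channels.
function hasInitiativePool(lists: TextChannelLists): boolean {
  return lists.discoveryChannelIds.length > 0;
}
```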
Being active in VC can inform text prompt context, but it does not by itself promote unrelated text turns into ACTIVE.
Voice Scope
Voice session eligibility is controlled separately. Shared attention can span text and voice in the same social context, but transport access is still modality-specific:
- voice.channelPolicy.allowedChannelIds
- voice.channelPolicy.blockedChannelIds
- voice.channelPolicy.blockedUserIds
- voice.sessionLimits.*
Text channel permissions do not determine which voice channels the bot may join. Likewise, being active in text can inform VC continuity without automatically granting voice floor ownership.
Setting Map
Shared Activity Axes
| Surface | What it controls |
|---|---|
| interaction.activity.ambientReplyEagerness | Current shipped cold ambient text-reply admission when there is no direct address or active follow-up thread |
| interaction.activity.responseWindowEagerness | How wide text recent-message follow-up admission is; in voice today it mainly biases follow-up prompting/classification rather than directly changing the runtime recency timer |
| interaction.activity.reactivity | Shared quick reactions such as emoji responses and other lightweight acknowledgements |
| voice.conversationPolicy.ambientReplyEagerness | Ambient voice replies when the bot is in VC but not directly addressed |
Text Reply Controls
| Surface | What it controls |
|---|---|
| permissions.replies.allowReplies | Global text reply master switch |
| permissions.replies.allowUnsolicitedReplies | Whether non-addressed reactive follow-up replies may enter admission |
| interaction.activity.minSecondsBetweenMessages | Global spacing between bot text messages |
Text Initiative Controls
| Surface | What it controls |
|---|---|
| initiative.text.enabled | Master switch for ambient text delivery |
| initiative.text.eagerness | Probability gate before the ambient text LLM call; does not govern immediate ACTIVE text turns |
| initiative.text.minMinutesBetweenPosts | Minimum gap between ambient text considerations |
| initiative.text.maxPostsPerDay | Daily ambient text budget |
| initiative.text.lookbackMessages | Per-channel context window size |
| initiative.text.allowActiveCuriosity | Whether web_search, web_scrape, and browser_browse are available |
| initiative.text.maxToolSteps / initiative.text.maxToolCalls | Ambient text tool-loop budgets |
| initiative.discovery.* | Passive feed collection, source curation, and media budgets |
Voice Controls
| Surface | What it controls |
|---|---|
| voice.conversationPolicy.commandOnlyMode | Restricts voice behavior toward commands and explicit wakeups |
| voice.conversationPolicy.replyPath | native, bridge, or brain |
| voice.conversationPolicy.ttsMode | Realtime or API TTS output for brain path |
| voice.admission.mode | Public voice-side admission surface |
| voice.admission.musicWakeLatchSeconds | Follow-up wake window during music playback |
| voice.soundboard.eagerness | How readily the bot uses Discord soundboard beats when they fit |
| voice.transcription.* | ASR enablement and language hinting |
| voice.sessionLimits.* | Session duration and concurrency limits |
| agentStack.runtimeConfig.voice.* | Provider/runtime-specific transport and generation config |
| initiative.voice.* | Ambient voice thought cadence |
Tool Calling Model
Text and voice share most of the same conversational tool surface, but voice keeps durable-memory search automatic rather than model-invoked. The tools are capabilities the model may choose when they help.
Voice exposes the actual per-turn tool set in the turn prompt. The static capability text is only a reference; the Tools: line names the schemas mounted for that specific LLM call. Full-brain voice turns keep ordinary recall/research/browser tools available when their runtimes exist, but prune media/control tools from casual turns unless the transcript, active playback state, pending media choice, or session-ending context makes them relevant. Provider-native realtime sessions use the same local registry for executable realtime tools. Tools without a voice executor, such as video_context, stay on the text/reply-loop surface until voice execution is wired.
Core shared conversational tools include:
- conversation_search
- memory_write
- web_search
- web_scrape
- browser_browse
- spawn_code_worker
Reply-loop conditional tools include:
- memory_search
- image_lookup
- start_screen_watch
Voice-only or voice-centric tools include:
- music_*
- play_soundboard
- join / leave voice-channel actions
- provider-native realtime function calls
Relevant code:
- src/tools/replyTools.ts
- src/voice/voiceToolCallInfra.ts
- src/voice/voiceToolCallDispatch.ts
Defaults Worth Remembering
Base initiative defaults from settingsSchema.ts:
- interaction.activity.ambientReplyEagerness = 20
- interaction.activity.responseWindowEagerness = 55
- interaction.activity.reactivity = 40
- voice.conversationPolicy.ambientReplyEagerness = 50
- initiative.text.enabled = true
- initiative.text.eagerness = 20
- initiative.text.minMinutesBetweenPosts = 360
- initiative.text.maxPostsPerDay = 3
- initiative.voice.enabled = true
- initiative.voice.eagerness = 50
- initiative.voice.minSilenceSeconds = 45
- initiative.voice.minSecondsBetweenThoughts = 60
Base voice defaults from settingsSchema.ts:
- voice.conversationPolicy.replyPath = "brain"
- voice.conversationPolicy.ttsMode = "realtime"
- voice.admission.mode = "generation_decides"
- voice.soundboard.eagerness = 40
Preset resolution can override parts of the effective voice runtime, so the active preset still matters.
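
For quick reference, the defaults listed in this section can be written as one nested object with a dotted-path lookup. The values are copied from the lists in this section. The nesting is inferred from the dotted setting names, and `baseDefaults` and `readDefault` are hypothetical helpers rather than exports of settingsSchema.ts.

```typescript
// Values copied from this section; the object shape is an assumption
// inferred from the dotted setting names.
const baseDefaults = {
  interaction: {
    activity: { ambientReplyEagerness: 20, responseWindowEagerness: 55, reactivity: 40 },
  },
  voice: {
    conversationPolicy: { ambientReplyEagerness: 50, replyPath: "brain", ttsMode: "realtime" },
    admission: { mode: "generation_decides" },
    soundboard: { eagerness: 40 },
  },
  initiative: {
    text: { enabled: true, eagerness: 20, minMinutesBetweenPosts: 360, maxPostsPerDay: 3 },
    voice: { enabled: true, eagerness: 50, minSilenceSeconds: 45, minSecondsBetweenThoughts: 60 },
  },
} as const;

// Resolve a dotted setting name against the defaults; unknown paths yield undefined.
function readDefault(path: string): unknown {
  let node: unknown = baseDefaults;
  for (const key of path.split(".")) {
    if (typeof node !== "object" || node === null || !(key in node)) return undefined;
    node = (node as Record<string, unknown>)[key];
  }
  return node;
}
```

This only describes base defaults; an active preset may still change the effective voice runtime, as noted above.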
Source Files
- src/bot/replyAdmission.ts
- src/bot/replyPipeline.ts
- src/bot/initiativeEngine.ts
- src/services/discovery.ts
- src/voice/voiceReplyDecision.ts
- src/voice/thoughtEngine.ts
- src/voice/voiceThoughtGeneration.ts
- src/voice/voiceConfigResolver.ts
- src/prompts/promptText.ts
