docs/tmp/realtime-full-brain-parity-plan.md

Realtime vs Full-Brain Parity Plan

Goal

Bring provider-native realtime voice behavior closer to the full-brain/orchestrator path without a rewrite.

Current Assessment

Tool access is fairly close to parity.
Screen-watch start and use are fairly close to parity.
Context richness, memory behavior, reasoning quality, and debugging visibility are not yet in parity.
Having the same tools does not yet produce the same judgment.

Recommended Plan

1. Unify turn context first

Build one shared voice turn context layer used by both full-brain and provider-native paths.

Include:

active sharers
screen-watch notes
membership events
memory slice
recent tool outcomes
music state
addressing context

Primary integration targets:

src/bot/voiceReplies.ts
src/voice/voiceReplyPipeline.ts
realtime instruction refresh / provider-native tool path
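
As a rough illustration of what the shared layer could look like, the sketch below defines one context type and one builder that both paths would call. Every type, field, and function name here (VoiceTurnContext, buildVoiceTurnContext, renderTurnContext, the default limits) is hypothetical and chosen to mirror the list above; none of it is taken from the existing code.

```typescript
// Hypothetical types: field names follow the plan's list, not existing code.
interface ScreenWatchNote { sharerId: string; text: string; atMs: number }
interface MembershipEvent { userId: string; kind: "join" | "leave"; atMs: number }
interface ToolOutcome { tool: string; ok: boolean; summary: string }

interface VoiceTurnContext {
  activeSharers: string[];
  screenWatchNotes: ScreenWatchNote[];
  membershipEvents: MembershipEvent[];
  memorySlice: string[];
  recentToolOutcomes: ToolOutcome[];
  musicState: { playing: boolean; track?: string };
  addressing: { speakerId: string; directedAtBot: boolean };
}

// Keep only the newest `n` items so both paths see the same bounded window.
function lastN<T>(items: readonly T[], n: number): T[] {
  return n > 0 ? items.slice(-n) : [];
}

// Single builder: both paths call this instead of assembling context themselves.
// The default limits are placeholder values, not tuned numbers.
function buildVoiceTurnContext(
  raw: VoiceTurnContext,
  limits = { notes: 5, events: 5, tools: 3 },
): VoiceTurnContext {
  return {
    ...raw,
    activeSharers: [...raw.activeSharers].sort(), // stable order for diffing
    screenWatchNotes: lastN(raw.screenWatchNotes, limits.notes),
    membershipEvents: lastN(raw.membershipEvents, limits.events),
    recentToolOutcomes: lastN(raw.recentToolOutcomes, limits.tools),
  };
}

// Text form that a realtime instruction refresh could embed (assumed usage).
function renderTurnContext(ctx: VoiceTurnContext): string {
  const music = ctx.musicState.playing
    ? (ctx.musicState.track ?? "unknown track")
    : "idle";
  return [
    `sharers: ${ctx.activeSharers.join(", ") || "none"}`,
    `music: ${music}`,
    `addressed to bot: ${ctx.addressing.directedAtBot}`,
    ...ctx.screenWatchNotes.map((n) => `note[${n.sharerId}]: ${n.text}`),
    ...ctx.memorySlice.map((m) => `memory: ${m}`),
    ...ctx.recentToolOutcomes.map(
      (t) => `tool[${t.tool}]: ${t.ok ? "ok" : "failed"} ${t.summary}`,
    ),
  ].join("\n");
}
```

The point of the design is that trimming and ordering happen in one place, so any difference in behavior between the two paths cannot come from how context was assembled.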

2. Make screen-watch context identical

Provider-native should receive the same screen-watch capability summary and rolling note buffer that the brain path sees, not just access to start_screen_watch.

Expose the same view of:

who is sharing
who can be watched
what notes already exist
what screen-watch state is active now

This is likely the most user-visible parity win.
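
One way to guarantee an identical view is to render it from a single function that both paths call. The sketch below assumes a hypothetical ScreenWatchView type whose four fields map onto the four items listed above; the names and the output format are illustrative only.

```typescript
// Hypothetical view type; the four fields mirror the four items listed above.
interface ScreenWatchView {
  sharing: string[];           // who is sharing
  watchable: string[];         // who can be watched
  notes: string[];             // what notes already exist
  activeTarget: string | null; // what screen-watch state is active now
}

const orNone = (xs: readonly string[]): string =>
  xs.length > 0 ? xs.join(", ") : "none";

// One renderer for both paths, so the wording cannot drift between them.
function describeScreenWatch(view: ScreenWatchView): string {
  return [
    `sharing: ${orNone(view.sharing)}`,
    `watchable: ${orNone(view.watchable)}`,
    `active watch: ${view.activeTarget ?? "none"}`,
    `notes (${view.notes.length}):`,
    ...view.notes.map((n) => `- ${n}`),
  ].join("\n");
}
```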

3. Normalize tool loop semantics

Keep one canonical tool descriptor set and one canonical tool-result summary shape for both paths.

After provider-native tool execution, inject a compact structured result summary back into live context so follow-up reasoning matches brain behavior more closely.

Goal:

same tools
same result understanding
same retry/failure behavior
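
A canonical result summary could be as small as the sketch below. The ToolResultSummary shape, the 200-character cap, and the rule that only timeouts are retryable are all assumptions made for illustration; the actual policy should be whatever the brain path already does.

```typescript
type ToolStatus = "ok" | "error" | "timeout";

// Hypothetical canonical shape injected back into live context after a tool call.
interface ToolResultSummary {
  tool: string;
  status: ToolStatus;
  summary: string;    // compact text, truncated to a fixed budget
  retryable: boolean;
}

// Raw outcome as either path might report it (assumed shape).
type RawToolOutcome =
  | { ok: true; output: string }
  | { ok: false; error: string; timedOut?: boolean };

// Cut text to at most `maxChars`, marking the cut with an ellipsis.
function truncate(text: string, maxChars: number): string {
  return text.length > maxChars
    ? text.slice(0, Math.max(0, maxChars - 1)) + "…"
    : text;
}

function summarizeToolResult(
  tool: string,
  raw: RawToolOutcome,
  maxChars = 200,
): ToolResultSummary {
  if (raw.ok) {
    return { tool, status: "ok", summary: truncate(raw.output, maxChars), retryable: false };
  }
  const status: ToolStatus = raw.timedOut ? "timeout" : "error";
  // Assumed policy: only timeouts are worth retrying.
  return {
    tool,
    status,
    summary: truncate(raw.error, maxChars),
    retryable: status === "timeout",
  };
}
```

If both paths run tool output through the same summarizer, then "same result understanding" and "same retry/failure behavior" reduce to sharing one function and one policy.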

4. Bring memory loading into parity

Make provider-native use the same continuity and behavioral-memory policy as the full-brain path where feasible.

Even if transport differs, memory selection should not.

This improves:

follow-up coherence
identity continuity
room-state awareness
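
To make "memory selection should not differ" concrete, selection can be written as a pure function of the candidate entries and a policy, with no knowledge of the transport. The MemoryEntry and MemoryPolicy types, the score field, and the two kinds below are hypothetical stand-ins for whatever the full-brain path actually uses.

```typescript
// Hypothetical memory record; `score` stands in for whatever relevance
// signal the existing full-brain policy uses.
interface MemoryEntry {
  id: string;
  kind: "continuity" | "behavioral";
  text: string;
  score: number;
}

// Hypothetical policy knobs shared by both paths.
interface MemoryPolicy {
  maxContinuity: number;
  maxBehavioral: number;
  minScore: number;
}

// Pure selection: same entries + same policy give the same slice,
// regardless of which reply path asks.
function selectMemorySlice(
  entries: readonly MemoryEntry[],
  policy: MemoryPolicy,
): MemoryEntry[] {
  const ranked = entries
    .filter((e) => e.score >= policy.minScore)
    // Highest score first; ties broken by id so the order is deterministic.
    .sort((a, b) => b.score - a.score || (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
  const take = (kind: MemoryEntry["kind"], max: number): MemoryEntry[] =>
    ranked.filter((e) => e.kind === kind).slice(0, Math.max(0, max));
  return [
    ...take("continuity", policy.maxContinuity),
    ...take("behavioral", policy.maxBehavioral),
  ];
}
```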

5. Unify capability logging and operator replay

Add the same capability snapshots and decision breadcrumbs to provider-native turns that the full-brain path already emits.

Especially:

screen-watch capability
tool availability
selected target
memory load summary
post-tool result summary

This reduces cases where behavior differs but the reason is unclear.
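
For operator replay, it helps if both paths emit the same snapshot record so two turns can be compared field by field. The sketch below assumes a hypothetical CapabilitySnapshot whose fields follow the list above, plus a small diffSnapshots helper; neither exists in the codebase as far as this plan states.

```typescript
// Hypothetical snapshot emitted once per turn by either path.
interface CapabilitySnapshot {
  replyPath: string;
  screenWatchCapable: boolean;
  toolsAvailable: string[];
  selectedTarget: string | null;
  memoryLoadSummary: string;
  postToolSummary: string | null;
}

// Compare values structurally; arrays are compared ignoring order.
function normalize(value: unknown): string {
  return JSON.stringify(Array.isArray(value) ? [...value].sort() : value);
}

// Names of the fields on which two snapshots disagree.
// `replyPath` is skipped because it is expected to differ.
function diffSnapshots(a: CapabilitySnapshot, b: CapabilitySnapshot): string[] {
  const keys: (keyof CapabilitySnapshot)[] = [
    "screenWatchCapable",
    "toolsAvailable",
    "selectedTarget",
    "memoryLoadSummary",
    "postToolSummary",
  ];
  return keys.filter((k) => normalize(a[k]) !== normalize(b[k]));
}
```

An empty diff for the same scenario means both paths saw the same capabilities, so any remaining behavioral difference has to come from reasoning rather than from inputs.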

6. Add parity golden tests

Create side-by-side scenarios that run under both:

replyPath="brain"
provider-native realtime tool ownership

Start with:

watch X's stream
search and answer
tool failure recovery
follow-up after recent bot reply
multi-speaker interruption and resume
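
A parity suite can be sketched as a harness that runs each scenario under both paths and compares structural facts rather than reply wording, since model text will vary. In the sketch below, only replyPath="brain" comes from this plan; the "provider-native" label, the TurnTrace fields, and the ScenarioRunner signature are hypothetical. A real runner would likely be asynchronous; it is synchronous here to keep the example short.

```typescript
type ReplyPath = "brain" | "provider-native"; // second label is assumed

// Hypothetical structural trace of one turn; compares behavior, not wording.
interface TurnTrace {
  toolsCalled: string[];
  recoveredFromFailure: boolean;
}

// Hypothetical hook that executes a scenario under one path.
type ScenarioRunner = (scenario: string, path: ReplyPath) => TurnTrace;

interface ParityResult {
  scenario: string;
  mismatches: string[]; // empty means the two paths agreed
}

function checkParity(
  scenarios: readonly string[],
  run: ScenarioRunner,
): ParityResult[] {
  return scenarios.map((scenario) => {
    const brain = run(scenario, "brain");
    const native = run(scenario, "provider-native");
    const mismatches: string[] = [];
    // Tool order matters here: the sequence of calls is part of the behavior.
    if (brain.toolsCalled.join(",") !== native.toolsCalled.join(",")) {
      mismatches.push("toolsCalled");
    }
    if (brain.recoveredFromFailure !== native.recoveredFromFailure) {
      mismatches.push("recoveredFromFailure");
    }
    return { scenario, mismatches };
  });
}
```

Each starting scenario in the list above would become one entry in the scenarios array, with a golden expectation that its mismatches list is empty.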

Recommended Order

Shared turn context
Screen-watch parity
Tool-result parity
Memory parity
Logging/replay parity
Golden parity suite

Impact vs Effort

High impact / medium effort

shared turn context
screen-watch parity

High impact / low-medium effort

logging parity

Medium-high impact / medium effort

tool-result parity

High impact / higher effort

memory parity

Critical for safety

golden parity tests

Suggested First Implementation Pass

Extract a shared builder for:

active stream roster
screen-watch capability
rolling screen notes
recent room events
compact memory slice

Then consume it from both:

src/bot/voiceReplies.ts
realtime instruction refresh / provider-native tool path

Product Language

Make both paths see the same room before asking them to behave the same.