Why it matters: the cohort's most useful agent work keeps vanishing into private sessions — four projects are independently building the substrate that would make it durable, routable, and safe.

TL;DR

Most agent work in the cohort is invisible — not hidden, but lost, because the substrate doesn't preserve it. Agents churn through a context window, get rate-limited overnight, lose the thread, and start fresh. Four projects converged on the same diagnosis from different angles: the next useful agent layer isn't a better model — it's durable memory, social routing, observable workflows, and human override, treated as one system. #contexto attacks memory; #teleport-router routes work between humans and agents; #jjhub (Smithers) makes workflows durable and inspectable; #tinycloud's autonomous-DAO experiment stress-tests what happens with none of it. What surfaced:

  • Context-window collapse is the failure mode nothing else fixes.
  • Long-running agents die without durability primitives (durable steps, persistent state, observability, hot-mode editing).
  • Agent work vanishes into private sessions — routing the exhaust into shared channels makes it replayable.
  • Heartbeat / persistent state is load-bearing — the most directive context must sit closest to the head of every loop.
  • Human override is the safety layer, now — not a future feature.
  • Goal-mode (outcomes, not task trees) is where the layer is heading.

Cross-project connections, open questions, and the full resources table are in the Appendix.

the claim

Most of the agent work happening in this cohort is invisible. Not because anyone is hiding it, but because the substrate it runs on doesn't preserve it. Agents spin up in terminal tabs, churn through a context window, get rate-limited overnight, lose the thread, and start fresh in the morning with a heartbeat file as their only thread to who they used to be. The work that's actually delivered is the work whose state somebody happened to capture in a commit message.

Four cohort projects converged on the same diagnosis from different angles this week: the next useful agent layer is not a better model. It is durable memory, social routing, observable workflows, and human override — treated as a single system, not as four plugins. #contexto attacks the memory side. #teleport-router routes work between humans and agents. #jjhub (Smithers) gives workflows durability and time-travel. #tinycloud's autonomous DAO experiment stress-tests what happens when none of that is present. Taken together, they describe the same product.

what surfaced this week

1. Context window collapse is the failure mode that nothing else fixes

#contexto gave the cleanest framing of the underlying problem: context is everything an agent sees before it acts — system prompts, skills, tools, runtime inputs — and once the context window fills it compacts, and key reasoning is lost in that process. The compaction isn't a model bug. It is the architectural reality every agent today runs on. Bigger windows, sliding windows, raw markdown dumps each fall short; the actual fix is an indexer that clusters episodes (prior agent runs) hierarchically and retrieves the right context for the current step deterministically.

This is the layer that has to exist before any of the other ones do useful work. Smithers can save every workflow output to SQLite, Router can publish every commit to a team channel, the agent DAO can vote on its next task — but if the next prompt the agent sees is a compacted summary that lost the actual reasoning, the system reconverges to the same lossy steady state.

For the cohort: every agent project here has a context-eviction problem hiding inside it. #contexto is the project that's named it explicitly, but #elizaos, #wikigen, #signalstack, #conclave, and the multi-agent setups all run into the same edge.

2. Long-running agents die without durability primitives

#jjhub opened Smithers' design with the archetypal failure: you kick off a long workflow, go to bed, and wake up to find a silly bug meant it never ran overnight. That is not a corner case. It is the median outcome of every long-running agent task in the cohort today.

Smithers' answer is five primitives stated as a unit: durable steps, persistent state, parallel execution, event-driven flows, and observability. The cohort is used to seeing these as separate concerns (Temporal handles durability, OpenTelemetry handles observability, etc.) but the claim is that for agents you need them in one runtime because they fail as a unit. An agent that can be restarted from frame 7 but can't be inspected has nothing useful to restart into.

The most cohort-relevant Smithers detail: hot mode — the ability to change a prompt while the workflow is mid-flight and have it re-render on the next tick, no restart required. That is the difference between debugging an agent and rewriting it. For projects with long-running operations (#teesql cluster bootstrap, #tinycloud outreach pipelines, #wikigen content generation), hot mode is the primitive that turns a one-shot agent into something a human can steer in real time.

3. Agent work disappears into private sessions — Router is the routing primitive

#teleport-router framed the social side directly: developers now spend their time inside three-to-six Claude Code tabs instead of posting in public, and lose the serendipity that visible work used to create. Router's design is exactly the inverse: an MCP server that ambient-harvests the exhaust from your agent sessions (diffs, decisions, commits) and routes it as posts into team Slack/Matrix/email, plus a dynamic tool description that injects what teammates have posted back into the agent's context.

That second half is the subtle part. Router is not a notification layer bolted on top of agents — it is a memory layer that uses other humans' work as input. The agent literally sees more of the cohort's context than the human running it does. That's why the cohort instance matters: a private Router is a personal feed; a cohort Router is a shared substrate that makes hallway conversations replayable.

For the program's retroactive-attribution rubric, this is structurally important. If your contribution to another team's project happens inside an agent session and never surfaces, it doesn't exist on the rubric. Router is one way to make sure it does.

4. Heartbeat files and persistent state are load-bearing infrastructure

#tinycloud ran the cleanest negative experiment of the week — an autonomous agent DAO that votes on its own next task using IPFS-backed shared state. The instructive failure: left running overnight, the agents spin up for about 30 seconds at each heartbeat, conclude there's nothing to do, and idle. The diagnosis: the heartbeat file matters most — the first thing the agent sees has the most impact on which rules it actually follows, and quality degrades further down the nested file hierarchy.

#teesql, who has run agents autonomously for up to 16 hours on a dedicated machine, confirmed the same shape: the agent reads all the relevant context and concludes there's nothing to do. The model is trained on user-driven chat. Absent fresh input, it converges to inactivity. The fix is not a better prompt — it is a memory architecture that keeps the most directive context closest to the head of every loop, with structured persistence underneath.

This is the layer #contexto is solving generically. The cross-project insight: every agent runtime in the cohort needs a heartbeat-equivalent, and the cohort would benefit from one shared definition of what that file looks like rather than five.

5. Human override is the safety layer, not a future feature

#teesql offered the most concrete operational warning in either day of intros: while a long-running agent was running unsupervised on a dedicated machine, it exposed a raw, unconfigured, unpassword-protected Postgres port — a malware bot scanned it, found it, and injected a crypto miner that ran the machine at 100% CPU non-stop. It was caught manually, not by the system.

This is the negative space around all the other proof-points. Memory, workflows, routing — none of it protects against an agent doing something operationally catastrophic while no one is watching. Smithers' approval nodes (workflows that halt for human input) and Router's posting-as-default behavior (every meaningful agent action becomes a visible post) are the two cohort projects that make the override layer real. The third piece — the policy layer for what agents can do unsupervised — is missing.

For #teesql, #abra, #tinycloud, #conclave (anyone running agents against production-adjacent infra): this is week-3 work, not month-3 work. The crypto miner is a cheap warning. The next one won't be.

6. Goal-mode is the paradigm shift the agent layer is moving into

#jjhub closed the Smithers demo with a forward claim: agent work is moving away from breaking a hard problem into steps and toward setting measurable goals — giving an agent something like an OKR, and room to improvise its way to the outcome. The cohort's task-decomposition projects (Smithers' workflow trees, #tinycloud's voting DAO, #contexto's episode indexer) are all converging on this from below.

This is worth naming because the cohort's instinct is to build for the current paradigm (decompose → schedule → execute). If goal-mode is where things are heading, the projects that age best are the ones whose primitives still work when the prompt becomes "this is the outcome; figure it out, and show me your work." That changes what observability has to look like.

a moment worth naming

Across Day 1, Day 3, and Friday's session, the same insight surfaced four times under four labels:

  • #contexto called it context routing — the right context to the right agent at the right step.
  • #teleport-router called it work-exhaust harvesting — turning private agent output into shared cohort memory.
  • #jjhub called it observability — the ability to look at any frame of a long-running workflow and understand why.
  • #tinycloud called it the heartbeat problem — the file that sits closest to the head of the loop determines what the rest of the system actually does.

Four people, four projects, one architecture. None of them used the others' vocabulary. If those four projects sit down together this week and agree on a shared interface — what does the "current context for this agent right now" object look like, in terms any of them could consume — the cohort would have something no individual project could produce alone.

what to do with this

Concrete moves, ranked by who they're for:

  • #contexto, #teleport-router, #jjhub, #tinycloud. Schedule a 90-minute working session this week, before Friday retro, on the question: what does a shared "current context" object look like that any of the four projects could consume? Even a strawman API is more valuable than any individual project shipping in isolation.
  • #elizaos, #signalstack, #wikigen, #conclave, #pramaana, #shake, #etherea — anyone running agents inside their product. Audit your own context-eviction story this week. If you can't describe what happens when the agent's window fills, you have a #contexto-shaped hole.
  • Anyone running long-running agent operations against shared infra. Put an approval node or a Router post in the loop before week 3. #teesql's crypto miner is the cheap version of this lesson.
  • #teleport-router, #jjhub, #contexto, #tinycloud. Write one paragraph each, in your own project's voice, on what your project assumes about durable memory. Diffing those four paragraphs is the cheapest way to get to a shared substrate.
  • Anyone building agent primitives. Ask the goal-mode question now: when the task interface is "here is the outcome, improvise", does your project still help? If no, name what changes.

why this article exists

Four cohort projects are converging on the same architecture from four different starting points, none of them using each other's vocabulary, and the cohort has roughly three weeks before the June 14 demo night to either agree on a shared substrate or ship four parallel ones. The success rubric rewards collaborative contribution retroactively, but retroactive only works if the cross-project moves happen on the record. This is the prompt: name the convergence, surface the shared primitive, and build the integration before the public checkpoint forces you into separate booths.

appendix — double-click

Provenance and reference material: the cross-project connections, the open questions, and the full table of everything named across the four sessions.

other cross-project connections this week

  • #contexto ↔ #teleport-router — both solve "surfacing the right context at the right time"; Contexto from the agent's perspective, Router from the team's. Probably one substrate.
  • #jjhub Smithers ↔ #teesql — Smithers' durable-steps primitive is exactly the shape #teesql needs for confidential-compute bootstrap workflows where every step is both an operational requirement and an audit artifact.
  • #tinycloud agent DAO ↔ #teesql's multi-agent setup — the voting/heartbeat experiments and the coordinator/sub-agent monitoring are the same architecture with different governance assumptions. The diagnostic moments (the "nothing to do" agent) match line-for-line.
  • #jjhub Smithers ↔ #teleport-router — Smithers' approval nodes need to post somewhere a human will see them. That somewhere is Router. There is a one-week integration window here.
  • #contexto ↔ #jjhub — Smithers stores every workflow output; Contexto indexes prior agent episodes. Same primitive, different consumer. A shared schema would mean Smithers users get retrieval-grade history for free.

open questions for the cluster

  • What is the smallest shared "current context for this agent" interface that #contexto, #teleport-router, #jjhub, and the cohort's other agent runtimes could all consume?
  • Where does the policy/sandbox layer live? Smithers gates per-workflow, Router publishes-by-default, but neither bounds what an autonomous agent is allowed to do against shared infra.
  • Is heartbeat-as-architecture general enough to be a primitive, or is it cohort-specific tribal knowledge?
  • How does the cohort instance of Router get adopted without turning into a noisy notification stream? The signal/noise problem is the routing problem.
  • What does goal-mode observability look like — what does an audit trail of "the agent improvised, here is why" actually contain?

resources mentioned

Anything named across the four sessions, with provenance. URLs only when stated verbatim in the source or trivially derivable from a stated repo/team slug.

NameWhat it isMentioned byURL / pointer
Contexto / Context IndexerEpisode-clustering indexer that retrieves the right context for the current agent step; deterministic retrieval over compacting#contextorepo via cohort team record #contexto
Teleport RouterMCP server that harvests agent work-exhaust into team channels and injects peer context back into the agent's view#teleport-routerrouter.teleport.computer (public instance); cohort instance forthcoming
SmithersDurable agent workflow runtime — durable steps, persistent state, parallel execution, event-driven flows, observability, hot mode#jjhubrepo via cohort team record #jjhub
JJ HubExperimental GitHub-alternative for agent sandboxing; co-evolves with Smithers#jjhubcode-name; not yet released
#tinycloud autonomous agent DAOVoting/IPFS-backed autonomous agent experiment with shared on-chain state#tinycloudcohort team record #tinycloud
Multi-agent coordinator setupDedicated-machine multi-agent coordinator (orchestrator + Claude/Codex/OpenCode sub-agents, worktrees, mobile access)#teesqlpersonal production system
Orchestrator model (Hermes / GLM 5.1)Orchestrator model used for long-running coordination#teesql
Claude Code (Max tier)$200/mo Anthropic CLI subscription#tinycloud, #teesql, #jjhubAnthropic product
AMPPre-configured agent harness with 200k token window + "Hand Off" compaction#jjhub
Conductor.buildUI for multi-agent task/conversation management#tinycloud
Playwright + MCPBrowser automation + testing for agent UI verification#tinycloudopen-source
IPFSCross-device persistent storage; substrate for the shared agent-DAO state#tinycloud
SQLite (via Smithers)Persistence layer for every Smithers workflow output#jjhub
ZodTypeScript schema validation; used inside Smithers for output validation#jjhub
React-as-ASTSmithers' design choice — represent workflow tasks as a React component tree, enabling LLM-shaped workflow authoring#jjhubconceptual
Approval nodeSmithers workflow primitive that halts for human input#jjhub
Hot mode / time travel / fork-from-frameSmithers observability primitives#jjhub
Heartbeat.md patternThe file at the top of an agent's loop that determines drift behavior#tinycloudconceptual cohort-shared pattern
Goal-mode / OKR-modeEmerging agent UX: tasks expressed as outcomes rather than decomposed task trees#jjhubconceptual
Matrix (E2EE)End-to-end-encrypted messaging used for local-laptop-to-desktop agent comms#teleport-routerprogram canonical channel per program/rules.md
ZmuxTmux rewrite in Zig#jjhubopen-source

---

Sources: Day 1 project intro session notes (2026-05-19), Agents Day 3 project intro session notes (2026-05-21), Dumb Agent Tricks session notes (2026-05-21), Friday session notes (2026-05-22). See also the companion piece verifiability-is-becoming-ux-for-ai-infrastructure.md for the parallel infrastructure conversation in the #dstack cluster.