A large humanoid cobot orchestrating smaller assembly robots on a conveyor.


Swink AgentShore™

An experiment in autopilot for a multi-agent workforce.


A real run · sped up

Swink AgentShore™ working a backlog on its own. The RL Agent picks the next play; LLM Agents do the build across multiple harnesses at once, ebbing and flowing with the backlog. See the concurrency profile of this run →

The question behind the experiment

What could a one-person scrum team look like with AI?

Swink AgentShore™ isn't trying to replace the Human Engineer. The experiment is whether one Human Engineer can ride point on a full delivery loop, with a Reinforcement Learning (RL) Agent sequencing the work and LLM Agents doing it. Not another coding agent. Long-running orchestration. Less vibes, more ability to be away from the keyboard.

1. The spec. Work with product owners and stakeholders to write the PRD. Define what "done" looks like. This part is irreplaceable: intent has to come from a person.

2. The build. Issue triage, implementation, code review, QA, merge. Done by LLM Agents, sequenced by the RL Agent, running locally against the spend and time caps you set.

3. The polish. User testing. Vibe coding for the edge cases the spec didn't anticipate. The taste-and-judgment work the Human Engineer keeps for themself.

The 80/20 isn't a guarantee. It's the shape of the bet. The interesting question is which 80% the RL Agent can actually take, and which 20% it shouldn't even try.

Two kinds of agents

Not every agent is an LLM

"AI agent" has narrowed to mean LLM. That's not wrong. It's just incomplete. Reinforcement learning agents are a different shape of intelligence: no context window, a partial view of an environment, and a choice of actions. They're built for the job LLM Agents aren't: watching a system over time, learning which actions move it forward, and picking the next best move under uncertainty.

LLM Agent

The muscle

Codes, reviews, debugs, calls tools. Anything you'd hand to a senior individual contributor with an open editor and a terminal.

Sees: A context window: the prompt and the history.
In Swink AgentShore™: A mix of Claude, Codex, Gemini, and Grok, sized by the RL Agent to balance cost and throughput.

RL Agent

The manager

Watches the project state and picks the next play. Doesn't write code. Doesn't read your prompts. Learns from the trajectory of decisions it has already made.

Sees: A partial view of an environment and a choice of actions.
In Swink AgentShore™: One small actor-critic network. Inspectable, replayable, tuned to this project.

RL isn't replacing LLM Agents here. It's coordinating them. Different tools, each doing what it's best suited for.
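To make "a partial view of an environment and a choice of actions" concrete, here is a minimal sketch of one actor-critic step in Python. Everything in it is an assumption for illustration: the play names, the three-feature observation, the linear actor and critic, and the learning rates. It is not Swink AgentShore™'s network, only the general shape of the technique.

```python
import math
import random

# Hypothetical plays and observation size, invented for this sketch.
PLAYS = ["triage", "implement", "review", "merge"]
N_FEATURES = 3  # e.g. open issues, budget left, failing checks, each scaled to 0..1


def softmax(logits):
    """Turn raw action scores into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class TinyActorCritic:
    """Linear actor (one weight row per play) plus a linear critic."""

    def __init__(self, lr_actor=0.1, lr_critic=0.1, gamma=0.9):
        self.actor = [[0.0] * N_FEATURES for _ in PLAYS]
        self.critic = [0.0] * N_FEATURES
        self.lr_actor, self.lr_critic, self.gamma = lr_actor, lr_critic, gamma

    def policy(self, obs):
        logits = [sum(w * x for w, x in zip(row, obs)) for row in self.actor]
        return softmax(logits)

    def value(self, obs):
        return sum(w * x for w, x in zip(self.critic, obs))

    def pick(self, obs, rng=random):
        """Sample the index of the next play from the current policy."""
        return rng.choices(range(len(PLAYS)), weights=self.policy(obs))[0]

    def update(self, obs, action, reward, next_obs, done):
        """One-step temporal-difference actor-critic update."""
        target = reward + (0.0 if done else self.gamma * self.value(next_obs))
        td_error = target - self.value(obs)
        probs = self.policy(obs)
        # Critic moves its estimate toward the TD target.
        self.critic = [w + self.lr_critic * td_error * x
                       for w, x in zip(self.critic, obs)]
        # Actor follows the gradient of log pi(action | obs), scaled by the TD error.
        for a, row in enumerate(self.actor):
            grad = (1.0 if a == action else 0.0) - probs[a]
            self.actor[a] = [w + self.lr_actor * td_error * grad * x
                             for w, x in zip(row, obs)]
        return td_error
```

The actor never sees prompts or code, only a short vector of numbers. A play that turned out better than the critic expected becomes more likely the next time a similar observation shows up.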

Swink AgentShore™ is a stacked framework: the Tauri desktop app on top, the RL Agent / Skill Dispatcher / Audit Core in the middle, and the adapter layer at the bottom (LLM Agents through Claude / Codex / Gemini / Grok, GitHub, BEADS, SQLite).

The labor mix

Four shores, four different jobs

Where labor lives has always been a tradeoff between cost, coverage, and collaboration. Each shore is an answer to a different mix, not a ranking. Different work, different shore. AgentShoring is a new option in the mix, not a replacement for the others.

Two terms in play. AgentShoring is the labor category (the new shore, the last row in the table below). Swink AgentShore™ is this product, one implementation of it.

Shore	Strengths	Tradeoff	Best for
Onshore	Same-room collaboration, native cultural fit, real-time judgment	Cost	Tight in-person feedback loops
Nearshore	Same-day overlap, cultural proximity, easier travel	Smaller talent pools than offshore hubs	Same-day collaboration
Offshore	Scale, deep talent pools, 24/7 coverage when paired with onshore	Async coordination, more structured handoffs	Steady-state throughput
AgentShoring	Local execution, replayable decisions, transparent budgets	Bounded by training data; needs clear goal specification	Measurable, on-demand work

AgentShoring sits beside the others, not above them. It optimizes for measurability and on-demand execution. The work happens locally, against a budget you set, with every play replayable from its checkpoint. That solves a different problem than offshore scale or nearshore overlap. Pick the shore that fits the work.

Swink AgentShore™ is the management tier. It watches the repo, the budget, and the open issues, then picks the next play and the LLM Agent that should run it.
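As a rough sketch of what "picks the next play and the LLM Agent that should run it" could look like under spend and time caps, consider the Python below. The types, the per-play cost and time estimates, and the "fastest agent that fits" rule are all hypothetical. The rule is a hand-written stand-in for the learned policy, not a description of it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentOption:
    name: str            # harness label, e.g. "claude"
    est_cost_usd: float  # made-up per-play estimate
    est_minutes: float   # made-up per-play estimate


@dataclass
class ProjectState:
    open_issues: list
    spend_left_usd: float
    minutes_left: float


def next_assignment(state, options):
    """Return (issue, agent) for the next play, or None when nothing fits the caps."""
    if not state.open_issues:
        return None
    affordable = [o for o in options
                  if o.est_cost_usd <= state.spend_left_usd
                  and o.est_minutes <= state.minutes_left]
    if not affordable:
        return None
    # Stand-in rule: fastest agent that fits. A learned policy would replace this line.
    agent = min(affordable, key=lambda o: o.est_minutes)
    return state.open_issues[0], agent


def run(state, options):
    """Dispatch plays until the backlog is empty or a cap would be exceeded."""
    log = []
    while (choice := next_assignment(state, options)) is not None:
        issue, agent = choice
        state.open_issues.pop(0)
        state.spend_left_usd -= agent.est_cost_usd
        state.minutes_left -= agent.est_minutes
        log.append((issue, agent.name))
    return log
```

The point of the shape is that the caps are checked before every play, so the loop stops on its own when the budget runs out rather than when someone notices.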

What stays human

The shoring metaphor is honest about its limits. Three things stay onshore, forever:

The problem statement. Someone has to write the issue, the PRD, the goal. Swink AgentShore™ reads them; it doesn't author them.
The reward function. What counts as "shipped" is a human decision. The RL Agent learns to maximize a signal someone designed.
The override. Humans can pause, redirect, or kill a session. Hard gates exist for things that must never happen, even on human override.
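One way to picture the difference between the override and a hard gate is the Python sketch below. The gate names, the `Session` class, and the `authorize` method are hypothetical, invented for illustration. The only idea taken from the text is the ordering: hard gates are checked before any override flag is consulted.

```python
from dataclasses import dataclass

# Hypothetical hard gates, named for illustration only.
HARD_GATES = {
    "force_push_main": "rewrites shared history",
    "delete_repo": "irreversible",
}


class HardGateError(Exception):
    """Raised when an action is blocked regardless of who asked for it."""


@dataclass
class Session:
    paused: bool = False
    killed: bool = False

    def pause(self):
        self.paused = True

    def resume(self):
        self.paused = False

    def kill(self):
        self.killed = True

    def authorize(self, action, human_override=False):
        # Hard gates come first, so no override flag can reach past them.
        if action in HARD_GATES:
            raise HardGateError(f"{action} blocked: {HARD_GATES[action]}")
        # A killed session stays dead, override or not.
        if self.killed:
            return False
        # A paused session only proceeds when a human explicitly says so.
        if self.paused and not human_override:
            return False
        return True
```

In this sketch, `Session().authorize("delete_repo", human_override=True)` raises `HardGateError`, while a paused session can still be pushed forward by a human for an ordinary action.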

The honest negative

AgentShoring is not always the right decision. Agents can excel given a clear, well-structured PRD, a single repo, and an overnight run. There are levels to this: for simpler work, vibe coding or a Ralph loop can be enough. Learned orchestration earns its complexity when the right next move isn't obvious enough to script, the budget matters, and the cost of merging the wrong thing is high. In every case, what to build and what good looks like need to stay human.

Implementation

Dig into the tech

How the RL Agent picks plays, the three-layer BEADS · GitHub · SQLite ledger, cross-framework review, the autonomy posture, and how you actually run it.

Open the tech overview →