Sync & Data Transport

How operational and cognition data moves between machines in the ProtocolWarden fleet, and which repo owns which part of that movement.

The organizing principle: group by manifest, never by data type. The manifest is already the scoping authority: it declares the ecosystem, hosts cognition, and anchors sessions. "What data syncs, from where, to where, organized how" is one more thing the manifest declares. That keeps the design from sprawling into a repo-per-data-type-per-project explosion: there is one sync surface per manifest, and adding new data means adding a declaration, not a new repo.

Layers

Three layers, following the ecosystem's public-mechanism / private-binding split.

flowchart TD
    M["<b>Manifest</b> — declaration / contract<br/>PlatformManifest (public) · private manifests<br/>declares data classes + per-asset sync mode + folder layout<br/><i>public manifest = structure only, no destinations/secrets</i>"]

    SM["<b>Sync Mechanism</b> (public)<br/>Syncthing schema / vocabulary<br/>setup · backup · restore orchestration + shims<br/>reads shape + binding → emits Syncthing config"]

    PB["<b>Private binding</b><br/>private fleet layer: device IDs, keys, machine-link setup<br/>private manifests: destination folders, per-machine routing"]

    PL["<b>Payload</b> — backup/restore implementations live in <i>each</i> repo<br/>(only a repo knows how to dump/restore its own data)<br/>operational data · per-project cognition / CLI memory"]

    M -- shape --> SM
    PB -- identity + destinations --> SM
    SM -- invokes via shim contract --> PL
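
The data flow in the diagram (shape from the manifest, identity and destinations from the private binding, both consumed by the mechanism) can be sketched roughly as follows. This is an illustration only: `AssetShape`, `Binding`, `emit_folders`, and the emitted dictionary keys are hypothetical names, and the output is a plain Python structure, not the actual Syncthing configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssetShape:
    """Structure-only declaration, as a public manifest might carry it (hypothetical)."""
    name: str
    mode: str     # one of the sync modes: "copy", "in-repo", "external"
    folder: str   # location within the manifest's declared folder layout


@dataclass(frozen=True)
class Binding:
    """Private-layer data: machine identity and destinations (hypothetical)."""
    device_ids: tuple
    destinations: dict  # asset name -> destination folder on this machine


def emit_folders(shapes, binding):
    """Combine public shape and private binding into per-folder sync entries."""
    folders = []
    for shape in shapes:
        dest = binding.destinations.get(shape.name)
        if dest is None:
            # The shape is declared publicly, but no private destination is bound.
            raise KeyError(f"no private destination bound for asset {shape.name!r}")
        folders.append({
            "id": shape.name,
            "path": dest,
            "mode": shape.mode,
            "layout": shape.folder,
            "devices": list(binding.device_ids),
        })
    return folders
```

Note that the public `AssetShape` never carries a destination or device ID; those only appear once the private `Binding` is merged in, which mirrors the public-mechanism / private-binding split.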

Layers and ownership

| Layer | Role | Visibility |
| --- | --- | --- |
| Manifests | Declare sync layout (data classes, modes, folder layout) for their scope | Public / Private |
| Sync Mechanism | Syncthing schema/vocabulary; setup/backup/restore orchestration + shim contract | Public, configurable |
| Private fleet layer | Machine identity keys, machine-link setup scripts | Private |
| Each participating repo | Owns its backup/restore implementation behind the shim contract | Per-repo |

The private fleet layer (machine identity keys + link-setup scripts) is kept distinct from the SSH key index, which only tracks SSH keys and their lifecycle (in use / rotated / retired).
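
One way to picture the shim contract is as a small interface that the mechanism calls and each repo implements. The sketch below assumes a two-method contract (`backup` and `restore`); the real contract's shape is not specified here, and `BackupShim`, `WorkItemsShim`, and `run_backup` are hypothetical names.

```python
from pathlib import Path
from typing import Protocol, runtime_checkable


@runtime_checkable
class BackupShim(Protocol):
    """Assumed contract, owned by the public mechanism."""

    def backup(self, target: Path) -> None: ...

    def restore(self, source: Path) -> None: ...


class WorkItemsShim:
    """Hypothetical repo-side implementation: only the repo knows its own data."""

    def __init__(self):
        self.items = []

    def backup(self, target: Path) -> None:
        target.mkdir(parents=True, exist_ok=True)
        (target / "work_items.txt").write_text("\n".join(self.items), encoding="utf-8")

    def restore(self, source: Path) -> None:
        text = (source / "work_items.txt").read_text(encoding="utf-8")
        self.items = text.splitlines()


def run_backup(shims, root: Path):
    """Mechanism-side orchestration: invoke every registered shim in turn."""
    done = []
    for name, shim in shims.items():
        if not isinstance(shim, BackupShim):
            raise TypeError(f"{name!r} does not satisfy the shim contract")
        shim.backup(root / name)
        done.append(name)
    return done
```

The orchestration never inspects what a shim writes; it only decides when and where each one runs, so dump and restore logic stays inside the repo that owns the data.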

Sync modes

Defined in the public Sync Mechanism repo, chosen per-asset in the manifest:

| Mode | Meaning | Example |
| --- | --- | --- |
| copy | Small things snapshotted into a sync/ directory | configs, work items, campaigns |
| in-repo | Large things synced in place | large model / media assets |
| external | Cannot live in-repo; synced from an out-of-tree location | large backup archives |
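
As a rough sketch of how a fixed mode vocabulary and per-asset choice might fit together, the three modes can be modelled as an enumeration against which manifest declarations are validated. The mode strings come from the table above; `SyncMode`, `parse_assets`, and the flat name-to-mode mapping are assumptions for illustration.

```python
from enum import Enum


class SyncMode(Enum):
    """The legal sync modes, as the mechanism's vocabulary might define them."""
    COPY = "copy"
    IN_REPO = "in-repo"
    EXTERNAL = "external"


def parse_assets(declared):
    """Validate a manifest's per-asset mode choices (hypothetical format).

    `declared` maps asset name -> mode string.
    """
    parsed = {}
    for asset, mode in declared.items():
        try:
            parsed[asset] = SyncMode(mode)
        except ValueError:
            legal = ", ".join(m.value for m in SyncMode)
            raise ValueError(
                f"asset {asset!r}: unknown sync mode {mode!r} (legal: {legal})"
            ) from None
    return parsed
```

For example, `parse_assets({"configs": "copy", "media": "in-repo"})` succeeds, while a manifest that declares a mode outside the vocabulary is rejected with an error naming the offending asset.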

Invariants

  • The manifest is the only place grouping is defined; new data is a declaration in an existing manifest, never a new data-type repo.
  • A public manifest declares structure only — never destinations, device IDs, or secrets. Those inject from the private layer.
  • Backup/restore implementations live in their own repo, behind a shim contract owned by the public mechanism.
  • Payload location and sync mode are declared per-asset; the mechanism's vocabulary defines the legal modes.
  • Coverage is observable — an asset with no declared sync mode is a detectable gap, not a silent omission.
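
Two of these invariants lend themselves to mechanical checks. The sketch below assumes a manifest's assets are available as a mapping of asset name to a dictionary of declared fields; the field name `mode` and the forbidden key names are hypothetical, chosen only to illustrate the idea.

```python
# Hypothetical key names that should never appear in a public manifest.
FORBIDDEN_PUBLIC_KEYS = {"destination", "device_id", "secret"}


def coverage_gaps(assets):
    """Return the names of assets that declare no sync mode."""
    return sorted(name for name, decl in assets.items() if "mode" not in decl)


def public_leaks(assets):
    """Return (asset, key) pairs where a public manifest carries private data."""
    return sorted(
        (name, key)
        for name, decl in assets.items()
        for key in decl
        if key in FORBIDDEN_PUBLIC_KEYS
    )


assets = {
    "configs": {"mode": "copy"},
    "media": {},                                            # no mode declared
    "archives": {"mode": "external", "destination": "/x"},  # leaks a destination
}
gaps = coverage_gaps(assets)
leaks = public_leaks(assets)
```

Here `gaps` lists `media` as a coverage gap and `leaks` flags the `destination` key on `archives`, so both problems surface as explicit findings instead of silent omissions.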