Tillered Arctic

Clustering

Understanding peer discovery and state synchronization

This article explains how Arctic agents discover each other, establish trust, and synchronize state across a distributed cluster.

Cluster Identity

Every Arctic cluster has a unique identifier generated during bootstrap. Peers with the same license belong to the same logical cluster.

License-Based Clustering

Clusters are defined by license:

  • All peers bootstrapped with the same license join the same cluster
  • Peer connections verify license signatures
  • Peers bootstrapped with different licenses cannot join the same cluster

This design ensures only authorized nodes can participate.

Peer Discovery

Peers discover each other through explicit addition or transitive discovery.

Explicit Addition

When you run arctic peers add:

  1. Local peer contacts the remote peer's API
  2. Both peers exchange identity information
  3. Both verify signatures against the shared license
  4. Both store the peer information locally
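
The exchange-and-verify steps above can be sketched as follows. This is a minimal illustration, not Arctic's actual protocol: the `Peer` class, the `add_peer` function, and the use of an HMAC keyed by the license material are all assumptions made for the example. A real deployment would more likely verify asymmetric license signatures.

```python
import hashlib
import hmac
from dataclasses import dataclass, field


@dataclass
class Peer:
    peer_id: str
    license_key: bytes  # hypothetical stand-in for the license material
    known_peers: dict = field(default_factory=dict)

    def identity(self) -> dict:
        """Build an identity record signed with the license key (assumed scheme)."""
        sig = hmac.new(self.license_key, self.peer_id.encode(), hashlib.sha256)
        return {"peer_id": self.peer_id, "signature": sig.hexdigest()}

    def verify(self, identity: dict) -> bool:
        """Check that the record was signed with the same license as ours."""
        expected = hmac.new(
            self.license_key, identity["peer_id"].encode(), hashlib.sha256
        ).hexdigest()
        return hmac.compare_digest(expected, identity["signature"])


def add_peer(local: Peer, remote: Peer) -> bool:
    """Exchange identities, verify in both directions, then store on both sides."""
    local_id, remote_id = local.identity(), remote.identity()
    if not (local.verify(remote_id) and remote.verify(local_id)):
        return False  # licenses differ, so neither side stores anything
    local.known_peers[remote.peer_id] = remote_id
    remote.known_peers[local.peer_id] = local_id
    return True
```

In this sketch, two peers holding the same `license_key` end up in each other's `known_peers`, while a peer holding a different key is rejected by both sides.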

Transitive Discovery

Peers also learn about one another during synchronization:

  1. Peer A knows about Peer B
  2. Peer B knows about Peer C
  3. During sync, B tells A about C
  4. A initiates connection with C

This allows the cluster to grow without manually connecting every pair.
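
The A, B, C scenario above can be modeled as a merge of peer lists. The sketch below is illustrative only: the `known` mapping and the `sync_once` function are hypothetical names, and real synchronization would exchange signed peer records rather than bare identifiers.

```python
def sync_once(known: dict[str, set[str]], a: str, b: str) -> set[str]:
    """One sync round in which peer b shares its peer list with peer a.

    Returns the peers that a newly learned about and should now contact.
    """
    learned = known[b] - known[a] - {a}  # a never lists itself as a peer
    known[a] |= learned
    return learned


# A knows B, B knows A and C, C knows B.
known = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
new_for_a = sync_once(known, "A", "B")
```

After the call, `new_for_a` contains the peers A would initiate connections with, and `known["A"]` includes both B and C.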

Cluster Credential

The cluster shares a credential for CLI access across all peers.

Shared Credential

  • Created during first peer's bootstrap
  • Propagated to new peers automatically
  • Allows CLI to work with any cluster peer

Credential Rotation

When you rotate the credential:

  1. New secret is generated
  2. Old secret remains valid for a 24-hour grace period
  3. New credential syncs to all peers
  4. All peers accept both during grace period
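
The grace-period behavior can be sketched as a check that accepts the current secret at any time and the previous secret only within 24 hours of rotation. The `ClusterCredential` class and its field names are assumptions made for illustration; they do not describe how Arctic stores credentials.

```python
import hmac
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

GRACE_PERIOD = timedelta(hours=24)  # grace period stated in the steps above


@dataclass
class ClusterCredential:
    current: str
    previous: Optional[str] = None
    rotated_at: Optional[datetime] = None

    def rotate(self, new_secret: str, now: datetime) -> None:
        """Install a new secret and remember the old one for the grace period."""
        self.previous, self.current, self.rotated_at = self.current, new_secret, now

    def accepts(self, secret: str, now: datetime) -> bool:
        """Accept the current secret, or the previous one while in the grace period."""
        if hmac.compare_digest(secret.encode(), self.current.encode()):
            return True
        if self.previous is None or self.rotated_at is None:
            return False
        if now - self.rotated_at >= GRACE_PERIOD:
            return False
        return hmac.compare_digest(secret.encode(), self.previous.encode())
```

Passing `now` explicitly keeps the check deterministic, which makes the 24-hour boundary easy to test.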

Peer Lifecycle

Joining

  1. Agent bootstrapped with license
  2. Added to cluster via peers add
  3. Receives cluster state via sync
  4. Participates in ongoing synchronization

Leaving

  1. Peer calls peers remove-self
  2. Broadcasts signed deactivation announcement
  3. Other peers mark peer as inactive
  4. Peer stops participating in sync

Failure

If a peer becomes unreachable:

  • Connection attempts to that peer fail
  • Peer marked as unhealthy locally
  • Services involving that peer may fail
  • Peer can rejoin when connectivity restored
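
Local health tracking of this kind can be modeled as a small state record per peer. The sketch below is hypothetical: the `PeerHealth` class and the failure threshold of three attempts are assumptions, since the article does not say how many failed attempts mark a peer as unhealthy.

```python
from dataclasses import dataclass


@dataclass
class PeerHealth:
    healthy: bool = True
    consecutive_failures: int = 0

    def record(self, reachable: bool, threshold: int = 3) -> None:
        """Update local health after one connection attempt.

        The threshold value is an assumption made for this example.
        """
        if reachable:
            # Connectivity restored: the peer counts as healthy again.
            self.consecutive_failures = 0
            self.healthy = True
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= threshold:
            self.healthy = False
```

Because the record is purely local, each peer may hold a different view of the same remote peer's health.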

Consistency Model

Arctic provides eventual consistency:

  • Updates are applied locally immediately
  • Propagation to other peers takes seconds
  • All peers eventually converge to the same state
  • No central coordinator required

Implications

  • Reads may see stale data briefly
  • Concurrent updates are resolved automatically
  • Deletions are permanent
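
One common way to get these properties without a coordinator is a deterministic merge rule that every peer applies identically. The sketch below assumes a last-writer-wins rule with tombstones that always beat live values; the article does not state which resolution strategy Arctic uses, and `Entry` and `merge` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    value: Optional[str]
    timestamp: float
    peer_id: str
    deleted: bool = False  # tombstone marker


def _rank(entry: Entry) -> tuple:
    # Tombstones outrank live values (modeling permanent deletion), then the
    # newest timestamp wins, then peer_id breaks ties deterministically.
    return (entry.deleted, entry.timestamp, entry.peer_id)


def merge(local: dict[str, Entry], remote: dict[str, Entry]) -> dict[str, Entry]:
    """Merge two replicas; the result is the same regardless of argument order."""
    merged = dict(local)
    for key, theirs in remote.items():
        ours = merged.get(key)
        if ours is None or _rank(theirs) > _rank(ours):
            merged[key] = theirs
    return merged
```

Because `_rank` defines a total order, merging is commutative and idempotent, so peers that exchange state in any order reach the same result. The reliance on timestamps is also why clock skew between peers matters under a rule like this.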

Troubleshooting

Peers Not Syncing

# Check connectivity
arctic health --all-contexts

# Force sync
arctic cluster sync

# Check logs
journalctl -u arctic-agent | grep gossip

State Divergence

If peers have inconsistent state:

  1. Verify network connectivity between peers
  2. Check for clock skew (NTP recommended)
  3. Trigger manual sync on each peer
  4. Review logs for reconciliation errors
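
If you can obtain each peer's state as structured data, comparing digests is a quick way to confirm whether peers have actually diverged. This is a generic sketch: how state is exported from a peer is not covered here, and `state_digest` and `diverged` are hypothetical helpers.

```python
import hashlib
import json


def state_digest(state: dict) -> str:
    """Hash a canonical JSON encoding so that key order does not affect the result."""
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()


def diverged(states: dict[str, dict]) -> bool:
    """Return True if the peers do not all report the same state digest."""
    return len({state_digest(s) for s in states.values()}) > 1
```

Matching digests after a manual sync suggest the peers have converged; differing digests point to the peers whose logs are worth reviewing first.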

See Also