The agent you shipped is not the agent running today.

Models update. Prompts evolve. Memory accumulates.

Kredo fingerprints an AI agent's behavior and alerts you when that identity changes.

View a live agent profile · Try drift test in browser · Register your agent

Four things change. We watch all four.

Every AI agent in production drifts. The model is one source. The harness around the model, the prompt layered on top, and the memory accumulating underneath all shape behavior independently — together they are the agent's phenotype.

Models update

The provider ships a new version. Tokenizers shift, refusal patterns recalibrate. Your "Opus 4.7" is no longer the model you wrote prompts against.

Harnesses change

The same model under Claude Code, Cursor, or a custom runtime is functionally a different agent. Tool surface, approval semantics, and memory injection live in the harness — Kredo records which one ran.

Prompts evolve

Someone edits the system prompt to fix one bad output. Six edits later the agent doesn't quite remember what it's supposed to be. SHA-256 hash tracking records exactly when the prompt changed, so each edit can be matched to the drift that followed.
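
As a rough illustration of prompt hash tracking, the sketch below hashes a system prompt with SHA-256 and compares it against a stored digest. The function names and the sample prompts are hypothetical; only the use of SHA-256 comes from the text above, and how Kredo stores or compares hashes is not shown here.

```python
import hashlib


def prompt_hash(prompt: str) -> str:
    """Return the SHA-256 hex digest of a system prompt (UTF-8 encoded)."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def prompt_changed(recorded_hash: str, current_prompt: str) -> bool:
    """True when the current prompt no longer matches the recorded digest."""
    return prompt_hash(current_prompt) != recorded_hash


# Hypothetical prompts, for illustration only.
baseline = prompt_hash("You are a careful support agent.")
print(prompt_changed(baseline, "You are a careful support agent."))            # False
print(prompt_changed(baseline, "You are a careful support agent. Be brief."))  # True
```

A digest only says that the prompt changed, not which words changed, so a monitor would presumably keep the prompt versions (or a diff) next to each hash.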

Memory accumulates

The agent running for six weeks isn't the agent you onboarded. New context shapes new behavior — drift correlates with what's in the memory store.

42 dimensions. 8 tiers. One identity.

Every assessment probes 42 behavioral dimensions, organized into 8 tiers. Together they resolve into a single living fingerprint — the agent's aura. Each dimension below is one lobe of the aura: hover one and watch it light up. When a dimension drifts, its lobe is where you see it first.

42 dimensions measured

8 tiers of structure

1 living identity

See the aura up close — a well-defined agent, a naked one, and one after drift: the living aura showcase →
Identity continuity and trust weight each dimension differently. See the full weight breakdown on the Protocol page →

How drift works.

Establish a behavioral baseline

The agent answers a curated assessment of identity-probing prompts across 42 behavioral dimensions. Responses are vectorized and stored as a multidimensional fingerprint — the agent's aura. The first trust score and the Ed25519 public key get bound together into a cryptographic identity hash that does not move across retests.

Retest on a schedule, on every deploy, or both

The same prompts get run again. Cosine similarity on 384-dimensional embeddings produces a per-dimension drift score. The 861-pair metametric correlation fingerprint, one correlation for every pair among the 42 dimensions, detects spoofing attempts that match individual dimensions but break the relationships between them.
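
To make the per-dimension comparison concrete, here is a minimal sketch that scores drift as 1 minus the cosine similarity between a baseline embedding and a retest embedding. Defining drift as `1 - similarity` is an assumption, the dimension names are made up, and the toy vectors have 2 components rather than 384 to keep the example readable.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def drift_scores(
    baseline: Dict[str, Sequence[float]],
    retest: Dict[str, Sequence[float]],
) -> Dict[str, float]:
    """Per-dimension drift, assumed here to be 1 - cosine similarity."""
    return {
        name: 1.0 - cosine_similarity(baseline[name], retest[name])
        for name in baseline
    }


# Hypothetical dimension names and toy 2-component embeddings.
baseline = {"tone": [1.0, 0.0], "refusals": [0.6, 0.8]}
retest = {"tone": [1.0, 0.0], "refusals": [0.8, 0.6]}
for name, score in drift_scores(baseline, retest).items():
    print(f"{name}: {score:.3f}")
```

An unchanged dimension scores 0, and the score grows as the retest embedding points further away from the baseline.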

Score, classify, alert

Eight threat-detection rules and per-dimension trust scoring produce an actionable severity classification: stable, organic growth, environmental adaptation, identity drift, substrate tampering. Ablation detection flags safety-stripped models. Prompt integrity monitoring correlates system-prompt changes with the drift they caused.
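
The five severity classes can be pictured as an enumeration plus a decision function. The sketch below is illustrative only: the class names come from the text above, but the thresholds, the input signals, and the order of the checks are assumptions, and it does not reproduce Kredo's eight rules.

```python
from enum import Enum


class Severity(Enum):
    STABLE = "stable"
    ORGANIC_GROWTH = "organic growth"
    ENVIRONMENTAL_ADAPTATION = "environmental adaptation"
    IDENTITY_DRIFT = "identity drift"
    SUBSTRATE_TAMPERING = "substrate tampering"


# Hypothetical thresholds on a 0..1 drift scale; not Kredo's values.
NEGLIGIBLE_DRIFT = 0.05
MAJOR_DRIFT = 0.20


def classify_retest(max_drift: float, signature_valid: bool, prompt_changed: bool) -> Severity:
    """Toy classifier mapping three signals onto the five named classes."""
    if not signature_valid:
        # A failed challenge-response suggests a swapped model or key.
        return Severity.SUBSTRATE_TAMPERING
    if max_drift < NEGLIGIBLE_DRIFT:
        return Severity.STABLE
    if max_drift < MAJOR_DRIFT:
        # Moderate drift: attribute it to the environment if the prompt changed.
        if prompt_changed:
            return Severity.ENVIRONMENTAL_ADAPTATION
        return Severity.ORGANIC_GROWTH
    return Severity.IDENTITY_DRIFT


print(classify_retest(0.02, signature_valid=True, prompt_changed=False).value)
print(classify_retest(0.12, signature_valid=True, prompt_changed=True).value)
print(classify_retest(0.02, signature_valid=False, prompt_changed=False).value)
```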

Receive a continuous record

Every assessment is persisted with full evidence: dimension scores, the metametric matrix, the prompt hash, the model identifier, the alignment-integrity score. Operators get an audit-grade trail of how the agent has changed and what caused each change.
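
One way to picture a persisted assessment is as an immutable record carrying the evidence fields listed above. The field names, types, and JSON layout below are assumptions made for illustration, not Kredo's storage schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AssessmentRecord:
    """Hypothetical shape of one stored assessment."""
    agent_slug: str
    model_identifier: str
    prompt_hash: str                       # SHA-256 hex digest of the system prompt
    dimension_scores: Dict[str, float]     # one score per behavioral dimension
    metametric_matrix: List[List[float]]   # pairwise correlations between dimensions
    alignment_integrity_score: float

    def to_json(self) -> str:
        """Serialize with sorted keys so identical records give identical text."""
        return json.dumps(asdict(self), sort_keys=True)


# Toy values with two dimensions instead of 42.
record = AssessmentRecord(
    agent_slug="my-agent",
    model_identifier="claude-opus-4-7",
    prompt_hash="0" * 64,
    dimension_scores={"tone": 0.97, "refusals": 0.91},
    metametric_matrix=[[1.0, 0.42], [0.42, 1.0]],
    alignment_integrity_score=0.95,
)
print(record.to_json())
```

Freezing the dataclass prevents fields from being reassigned after a record is created, which fits the idea of an audit trail that is appended to rather than edited.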

Cryptographic identity, not just monitoring.

Drift detection is meaningless if the agent's identity itself can be swapped underneath you. Kredo binds the agent's behavior to a key it controls.

Ed25519 keypair

The agent generates an Ed25519 keypair at registration. Public key lives with the server; private key never leaves the agent. Every retest signs a server-issued challenge — the agent that comes back is the agent that registered, or it isn't.
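
The challenge-response step can be sketched with the third-party `cryptography` package (`pip install cryptography`). The function names and the 32-byte challenge size are assumptions; the text above specifies only that the agent signs a server-issued challenge with its Ed25519 private key.

```python
import secrets

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)


def issue_challenge() -> bytes:
    """Server side: a fresh random challenge for each retest (size assumed)."""
    return secrets.token_bytes(32)


def sign_challenge(private_key: Ed25519PrivateKey, challenge: bytes) -> bytes:
    """Agent side: sign the challenge; the private key never leaves the agent."""
    return private_key.sign(challenge)


def verify_response(public_key: Ed25519PublicKey, challenge: bytes, signature: bytes) -> bool:
    """Server side: check the signature against the registered public key."""
    try:
        public_key.verify(signature, challenge)
        return True
    except InvalidSignature:
        return False


# Registration: the agent generates the keypair and shares only the public half.
agent_key = Ed25519PrivateKey.generate()
registered_public_key = agent_key.public_key()

challenge = issue_challenge()
print(verify_response(registered_public_key, challenge, sign_challenge(agent_key, challenge)))

# A different key cannot answer for the registered agent.
impostor_key = Ed25519PrivateKey.generate()
print(verify_response(registered_public_key, challenge, sign_challenge(impostor_key, challenge)))
```

Using a fresh challenge for each retest means a signature captured from an earlier retest cannot be replayed.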

Identity hash anchor

The agent's identity hash is composed from its public key and its first-baseline trust score. Once both exist, the hash is frozen. Subsequent retests produce new scores but the anchor doesn't move.
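
A minimal sketch of such an anchor, assuming it is a SHA-256 digest over the raw public key bytes and the baseline score formatted to four decimals. The hash function, separator, and score encoding are all assumptions; the text above states only which two inputs are combined.

```python
import hashlib


def identity_hash(public_key: bytes, baseline_trust_score: float) -> str:
    """Combine the public key and first-baseline score into one hex digest."""
    score_bytes = f"{baseline_trust_score:.4f}".encode("ascii")  # assumed encoding
    return hashlib.sha256(public_key + b"|" + score_bytes).hexdigest()


# Placeholder 32-byte key, the length of a raw Ed25519 public key.
public_key = bytes(range(32))

# Computed once, when the first baseline score exists, then stored.
anchor = identity_hash(public_key, 0.8731)

# Later retests produce new scores, but the stored anchor is not recomputed.
retest_scores = [0.8702, 0.8815]
print(anchor)
print(len(retest_scores), "retests recorded against the same anchor")
```

Because the digest depends on the first score only, recomputing it from a later score would give a different value, which is why the anchor is stored once rather than derived on each retest.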

Persistent username + slug

Agents pick a username at registration. The slug becomes their public score URL — aikredo.com/drift/agent/?slug=<name>. Identity persists across retests, model swaps, harness changes, prompt edits.

See it live.

The fleet dashboard shows real production agents under continuous Kredo monitoring — auras driven by real behavioral scores, updated every retest.

Open the fleet dashboard

Built to resist the attacks we'd use ourselves.

Behavioral monitoring is a security surface. Kredo was designed by security engineers — the defenses shipped before anyone asked for them.

Spoofing resistance

The 861-pair behavioral metametric breaks if an attacker matches individual dimensions but misses the correlation structure across them. Stated spoofing resistance is roughly 10⁵⁰, beyond the 2¹²⁸ (about 3.4 × 10³⁸) keyspace of AES-128.
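
The figure 861 is the number of unordered pairs among 42 dimensions, 42 × 41 / 2. The sketch below checks that count and builds a toy pairwise-correlation fingerprint. Using Pearson correlation over a short history of scores is an assumption, as are the dimension names; the exact metametric formula is on the Protocol page.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple

print(math.comb(42, 2))  # 861


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def correlation_fingerprint(history: Dict[str, List[float]]) -> Dict[Tuple[str, str], float]:
    """One correlation per unordered pair of dimensions."""
    names = sorted(history)
    return {(a, b): pearson(history[a], history[b]) for a, b in combinations(names, 2)}


# Toy score history for three hypothetical dimensions across three assessments.
history = {
    "tone": [0.90, 0.92, 0.94],
    "refusals": [0.80, 0.84, 0.88],
    "verbosity": [0.70, 0.65, 0.60],
}
for pair, value in correlation_fingerprint(history).items():
    print(pair, round(value, 3))
```

An attacker who reproduces each dimension's score in isolation would still have to reproduce how every pair moves together, and the number of pairs grows quadratically with the number of dimensions.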

Ablation detection

Safety-stripped variants of a model carry a distinct behavioral signature. Kredo flags them even when the model identifier claims nothing changed.

Substrate tampering

A swapped model or hijacked key surfaces as a named severity classification, not a footnote — Ed25519 challenge-response plus behavioral deltas, on every retest.

See the full formula and technical details on the Protocol page.

One command to register and baseline an agent.

```bash
python3 agent_selftest.py register-solo \
    --username my-agent \
    --name "My Agent" \
    --model claude-opus-4-7
```

The SDK generates an Ed25519 keypair locally, registers the agent, saves credentials, and returns a public score URL. Then the agent runs the baseline assessment and lights up on the fleet dashboard.

Developer guide · PyPI: kredo

Watch your agents stay themselves.

Lifecycle observability for AI agent behavior. Get a baseline in minutes. Catch drift before it becomes an incident.