Mining the Loop: How Changes Become Institutional Memory
> Git commits become structured changelog entries and architectural decision records, then feed back into AI agents as queryable institutional memory.
Numbers in this post reflect the system at publication (February 2026). See our team page for current figures.
Every engineering team faces the same challenge: changes happen constantly, but the why behind those changes disappears. Six months later, someone asks “why did we adopt DuckDB for pipeline stages?” and the answer lives only in the head of whoever made that call—if they’re still around.
We built a mining workflow that closes this loop. Changes flow through git commits, get processed by our mining pipeline, become structured changelog entries and architectural decision records, and then feed back into our AI agents through CLI queries. The result: institutional memory that both humans and AI can access.
The Problem: Decisions Evaporate
Consider a typical scenario. A developer commits:
feat(canonical): add DuckDB runtime for pipeline stages
This commit represents a significant architectural choice. The team evaluated options, considered trade-offs, and landed on DuckDB for specific reasons. But all that context lives in:
- A Slack thread (probably deleted)
- Someone’s memory (definitely fading)
- A comment in the code (maybe, if you’re lucky)
Three months later, a new team member asks: “Should I use DuckDB or SQLite for this new stage?” Without institutional memory, they either reinvent the wheel or make inconsistent choices.
The Loop: From Commits to Context
Our mining workflow transforms git history into queryable knowledge:
Git Commits
│
▼
┌─────────────────────┐
│ mine sync │ ← Build index from git history
└─────────────────────┘
│
▼
┌─────────────────────┐
│ mine candidates │ ← Surface commits for review
└─────────────────────┘
│
▼
┌─────────────────────┐
│ Classification │ ← Human or LLM assessment
│ (changelog or ADR) │
└─────────────────────┘
│
├──────────────────────┐
▼ ▼
┌─────────────┐ ┌───────────────┐
│ Changelog │ │ Decisions │
│ Ledger │ │ Registry │
│ (JSONL) │ │ (YAML files) │
└─────────────┘ └───────────────┘
│ │
▼ ▼
┌─────────────┐ ┌───────────────┐
│ CHANGELOG.md│ │ orkestra CLI │
│ per package │ │ queries │
└─────────────┘ └───────────────┘
│ │
└──────────────────────┘
│
▼
┌───────────────┐
│ AI Agents │
│ (via CLI) │
└───────────────┘
The key insight: both changelogs and architectural decisions flow from the same git history, processed through a unified pipeline. This ensures nothing falls through the cracks.
How Mining Works
Step 1: Sync the Index
uv run orkestra mine sync
This command scans git history and builds an index of all commits. It extracts structured signals from each commit:
- Conventional commit type (`feat`, `fix`, `chore`, `docs`)
- Scope (which package or area)
- Breaking change markers
- Files touched and complexity metrics
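To make the signal extraction concrete, here is a minimal sketch of the parsing step. The regex, the `RELEASABLE_TYPES` set, and the function name are our illustration, not the actual orkestra internals:

```python
import re

# Conventional commit subject: type(scope)!: subject
COMMIT_RE = re.compile(
    r"^(?P<type>\w+)(\((?P<scope>[^)]+)\))?(?P<breaking>!)?: (?P<subject>.+)$"
)

# Which commit types surface in release notes (assumption for this sketch).
RELEASABLE_TYPES = {"feat", "fix", "perf"}

def extract_signals(subject: str) -> dict:
    """Return structured signals from a conventional commit subject line."""
    m = COMMIT_RE.match(subject)
    if not m:
        # Non-conventional commits still get indexed, just with empty signals.
        return {"commit_type": None, "scope": None, "breaking": False,
                "is_releasable_type": False}
    commit_type = m.group("type")
    return {
        "commit_type": commit_type,
        "scope": m.group("scope"),
        "breaking": m.group("breaking") == "!",
        "is_releasable_type": commit_type in RELEASABLE_TYPES,
    }
```

Running it on the example commit above yields `commit_type: "feat"`, `scope: "canonical"`, and `is_releasable_type: True`.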
Step 2: Check Coverage Status
uv run orkestra mine status
Here’s what our current status looks like:
Mining Status
=============
Decisions
---------
Coverage: 100.0%
Processed: 15637 (of 15637)
Extracted: 476
Skipped: 15161
Changelog
---------
Coverage: 100.0%
Processed: 15637 (of 15637)
Released: 6799
Skipped: 8838
15,637 commits processed. 476 became architectural decisions. 6,799 became changelog entries. Every commit classified.
Step 3: Get Candidates for Review
uv run orkestra mine candidates --limit 50 --full
This surfaces commits that haven’t been processed yet, with full context for classification:
{
"sha": "90571786d99166c0039ca2e80e4d9cd96184bd85",
"date": "2026-01-26",
"subject": "feat(canonical): add DuckDB runtime for pipeline stages",
"signals": {
"commit_type": "feat",
"scope": "canonical",
"breaking": false,
"is_releasable_type": true,
"domains_affected": ["pipeline", "data-architecture"]
},
"body": "Establishes DuckDB as canonical in-process analytical database...",
"files_changed": ["packages/canonical/pipelines/stages/duckdb_runtime.py", "..."],
"stats": {"files": 8, "insertions": 450, "deletions": 120}
}
The signals help guide classification: `is_releasable_type: true` suggests this should appear in the changelog. The large insertion count and infrastructure files suggest it might also be an architectural decision.
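As an illustration of how these signals might be combined during triage, here is a hypothetical heuristic; the thresholds and the `decision-candidate` label are invented for this sketch, not part of the real pipeline:

```python
def suggest_classification(candidate: dict) -> set[str]:
    """Rough triage heuristic over a mined candidate (illustrative only)."""
    suggestions = set()
    signals = candidate.get("signals", {})
    stats = candidate.get("stats", {})
    # Releasable commit types default to a changelog entry.
    if signals.get("is_releasable_type"):
        suggestions.add("changelog")
    # Large multi-file changes often hide an architectural decision.
    if stats.get("insertions", 0) >= 300 and stats.get("files", 0) >= 5:
        suggestions.add("decision-candidate")
    # Breaking changes always deserve a second look.
    if signals.get("breaking"):
        suggestions.add("decision-candidate")
    return suggestions
```

For the DuckDB candidate above (450 insertions across 8 files, releasable type), this flags both a changelog entry and a decision candidate.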
Step 4: Classify Commits
Two paths diverge here: changelog entries and architectural decisions.
For changelog entries:
uv run orkestra mine classify abc123 --changelog added
This records that commit abc123 should appear in the changelog under the “Added” category.
For architectural decisions:
First, get a real decision ID:
uv run orkestra decisions new --domain pipeline --dry-run
# Returns: DEC-PL-143
Then classify with the decision ID:
uv run orkestra mine classify abc123 --decision DEC-PL-143
This links the commit to a decision record that will be created or updated.
For batch processing (what we actually do):
# Apply a classifications file (generated with LLM assistance)
uv run orkestra mine commit --input classifications.jsonl
The JSONL format supports both domains in one pass:
{"sha":"abc123","domain":"changelog","action":"release","category":"added","summary":"Add DuckDB runtime for pipeline stages"}
{"sha":"abc123","domain":"decisions","action":"extract","decision_id":"DEC-PL-142"}
{"sha":"def456","domain":"changelog","action":"skip","reason":"Chore: dependency update"}
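A consumer of this format can validate rows before they reach the ledger. This is a simplified sketch whose field rules follow the example lines above; the real pipeline's validation is more thorough:

```python
import json

REQUIRED = {"sha", "domain", "action"}

def parse_classifications(text: str) -> list[dict]:
    """Parse a classifications JSONL payload, raising on malformed rows."""
    rows = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines
        row = json.loads(line)
        missing = REQUIRED - row.keys()
        if missing:
            raise ValueError(f"line {lineno}: missing fields {sorted(missing)}")
        # A decision extraction must name the decision it feeds.
        if (row["domain"] == "decisions" and row["action"] == "extract"
                and "decision_id" not in row):
            raise ValueError(f"line {lineno}: extraction needs a decision_id")
        rows.append(row)
    return rows
```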
Step 5: Render Outputs
uv run orkestra changelog render --package <pkg>
This generates per-package CHANGELOG.md files from the ledger. The changelogs are derived artifacts—delete them and they regenerate perfectly from the source ledger.
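Conceptually, rendering is a group-by over the ledger. A minimal sketch, assuming entries carry the `category` and `summary` fields from the JSONL example above (the real renderer also handles versions and release dates):

```python
from collections import defaultdict

# Keep-a-Changelog-style section ordering (assumption for this sketch).
CATEGORY_ORDER = ["added", "changed", "fixed", "removed"]

def render_changelog(entries: list[dict]) -> str:
    """Render markdown from ledger rows; deleting the output loses nothing."""
    by_category = defaultdict(list)
    for entry in entries:
        by_category[entry["category"]].append(entry["summary"])
    lines = ["# Changelog", ""]
    for category in CATEGORY_ORDER:
        if by_category[category]:
            lines.append(f"## {category.title()}")
            lines.extend(f"- {summary}" for summary in by_category[category])
            lines.append("")
    return "\n".join(lines)
```

Because the markdown is a pure function of the ledger, regeneration is always safe.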
The Decision Record Structure
Extracted decisions become YAML files with rich metadata:
id: DEC-PL-142
title: Adopt DuckDB as Canonical Processing Runtime for Pipeline Stages
domain: pipeline
status: active
created: '2026-01-26'
summary: |
Establishes DuckDB as the canonical in-process analytical database for pipeline
stage transformations. Provides a shared runtime module that resolves settings
from pipeline defaults with stage-level overrides.
context: |
Pipeline stages performing data transformations each independently configured
DuckDB connections. This led to inconsistent settings, duplicated configuration
code, and no way to tune DuckDB globally for a pipeline run.
rationale:
- DuckDB provides efficient in-process OLAP with zero configuration deployment
- Centralized runtime module eliminates duplicated DuckDB setup across stages
- Hierarchical settings enable global tuning with stage-level overrides
- Memory limits and thread counts can be adjusted per-pipeline
impact:
positive:
- Consistent DuckDB configuration across all pipeline stages
- Single point of control for memory/thread tuning
- Reduced code duplication in conversion and export stages
negative:
- Adds dependency on shared runtime module
- Stages must adopt new configuration pattern
source_commits:
- sha: 90571786d99166c0039ca2e80e4d9cd96184bd85
message: 'feat(canonical): add DuckDB runtime for pipeline stages'
date: '2026-01-26'
role: primary
files:
- packages/canonical/pipelines/stages/duckdb_runtime.py
- packages/canonical/pipelines/runner.py
- packages/canonical/pipelines/stages/convert_hdx_admin_boundaries_to_geoparquet.py
related:
- DEC-DA-014 # Data architecture decisions that influenced this
Every decision links back to its source commits. Every decision specifies which files it affects. Relationships between decisions are explicit.
CLI Integration: Querying Institutional Memory
This is where the loop closes. Agents can query decisions through the CLI:
# Search by topic
uv run orkestra decisions search --query "retry"
Returns decisions about retry logic, error handling, and recovery patterns.
# Get full details on a specific decision
uv run orkestra decisions info DEC-PL-142
Returns the complete decision record with context, rationale, and impact.
# List recent decisions for context
uv run orkestra decisions list --limit 15
Shows what architectural choices were made recently.
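Under the hood, `decisions search` is conceptually a text match over record fields. A naive sketch, assuming records are loaded as dicts with `title`, `summary`, and `context` keys (the real CLI's ranking is richer than a substring test):

```python
def search_decisions(records: list[dict], query: str) -> list[dict]:
    """Case-insensitive substring search over decision record text fields."""
    q = query.lower()
    hits = []
    for rec in records:
        haystack = " ".join([
            rec.get("title", ""),
            rec.get("summary", ""),
            rec.get("context", ""),
        ]).lower()
        if q in haystack:
            hits.append(rec)
    return hits
```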
How Agents Use This
Our orchestrator’s baseline instructions include:
**Essential CLI commands:**
- `orkestra decisions search "X"` — Find architectural decisions
When an agent is asked to implement something related to DuckDB, it can first check:
uv run orkestra decisions search --query "DuckDB"
And discover DEC-PL-142, learning:
- Why we chose DuckDB (context)
- How to use it correctly (agent_guidance)
- What files to look at (files)
- What related decisions exist (related)
The agent doesn’t reinvent the wheel. It builds on established patterns.
The Three Questions Test
Not every commit deserves a decision record. We use the Three Questions Test to filter:
- Was this hard to make? Did it require significant analysis, trade-off evaluation, or debate?
- Is it costly to change? Would reversing this decision require significant rework?
- Does it have system-wide impact? Does it affect multiple packages or establish patterns others will follow?
If a commit answers “yes” to at least one of these questions, it’s a candidate for decision extraction. Our typical rate: 1-4 decisions per 100 commits.
For changelog entries, the bar is lower: any user-facing change (features, fixes, improvements) gets recorded. Internal chores, documentation updates, and refactors typically get skipped. Our typical rate: 30-50 changelog entries per 100 commits.
Data Storage: Append-Only Ledgers
The mining system uses append-only JSONL ledgers for conflict-free multi-agent operation:
packages/orchestration/ai_assets/reference/changelog/
├── commits_processed.jsonl # Classification ledger (both domains)
├── release_notes.jsonl # Changelog entries
└── commits_index.yaml # Derived index (gitignored)
packages/orchestration/ai_assets/reference/decisions/
├── registry.yaml # Decision index
└── records/
├── DEC-AD-001.yaml
├── DEC-AD-002.yaml
└── ...
The JSONL format with merge=union in .gitattributes means multiple agents can classify commits simultaneously without merge conflicts. Each line is independent.
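For reference, the `.gitattributes` entry that enables this is a one-liner; the glob below is our assumption based on the layout above, and `merge=union` is git's built-in line-union merge driver:

```
packages/orchestration/ai_assets/reference/changelog/*.jsonl merge=union
```

With union merging, concurrent appends from different agents are simply concatenated rather than flagged as conflicts.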
Validation Gates
Before any mining session, we run validation:
uv run orkestra mine validate --quick
This checks:
- SHA format validity
- Decision ID format compliance
- No duplicate entries for the same SHA
- Referenced decisions actually exist
After classification, we validate again before committing changes.
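The quick validation pass can be sketched as follows. The regexes encode the formats shown in this post (hex SHAs, `DEC-XX-NNN` decision IDs), and duplicates are keyed per SHA and domain since one SHA can legitimately appear in both the changelog and decisions ledgers; this is an illustration, not the actual validator:

```python
import re

SHA_RE = re.compile(r"^[0-9a-f]{7,40}$")
DECISION_ID_RE = re.compile(r"^DEC-[A-Z]{2}-\d{3}$")

def validate_ledger(rows: list[dict], known_decisions: set[str]) -> list[str]:
    """Return human-readable validation errors; an empty list means clean."""
    errors = []
    seen = set()
    for i, row in enumerate(rows):
        sha = row.get("sha", "")
        if not SHA_RE.match(sha):
            errors.append(f"row {i}: bad SHA {sha!r}")
        # One classification per (sha, domain) pair.
        key = (sha, row.get("domain"))
        if key in seen:
            errors.append(f"row {i}: duplicate entry for {sha!r}")
        seen.add(key)
        decision_id = row.get("decision_id")
        if decision_id:
            if not DECISION_ID_RE.match(decision_id):
                errors.append(f"row {i}: bad decision id {decision_id!r}")
            elif decision_id not in known_decisions:
                errors.append(f"row {i}: unknown decision {decision_id}")
    return errors
```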
Why This Matters
The feedback loop we’ve built solves several problems:
For new team members: Instead of asking “why did we do X?”, they can search the decisions registry. The context is preserved.
For AI agents: They don’t operate in a vacuum. They can query institutional knowledge before making recommendations. When asked to add a new pipeline stage, they can discover the DuckDB pattern and follow it.
For architectural consistency: Decisions are explicit and searchable. When someone proposes an approach that contradicts an existing decision, the system can surface the conflict.
For changelog generation: Release notes aren’t a last-minute scramble. They’re a byproduct of continuous classification during development.
For onboarding: New agents inherit the full context of the codebase. They don’t just see the code—they see the decisions that shaped it.
Current State
As of today:
- 15,637 commits processed through the pipeline
- 476 architectural decisions extracted and documented
- 6,799 changelog entries recorded
- 100% coverage across both domains
Every commit since we started has been classified. The institutional memory is complete and queryable.
Getting Started
If you want to implement something similar:
- **Start with conventional commits.** The mining pipeline works best when commits have structured prefixes (`feat:`, `fix:`, `chore:`).
- **Define your domains.** We use domains like `pipeline`, `agent-design`, `observability`, and `data-modeling`. These organize decisions by area.
- **Build the classification habit.** Mining works when teams regularly classify commits. Batch processing with LLM assistance helps scale.
- **Make decisions queryable.** The value compounds when agents can search decisions via CLI. Structure your output for machine consumption.
- **Close the loop.** Decisions should influence future work. Include decision references in agent instructions and code review checklists.
The goal isn’t perfect documentation. It’s making the why behind changes accessible to both humans and AI, today and six months from now. When changes become institutional memory, teams build on established patterns instead of reinventing them.
The mining workflow is part of our orchestration engine, specifically the context engine module in our orchestration package.
Related reading
More from the Maguyva build log
Agent Observability: Hooks, Alloy, and Grafana
We wired Claude Code and Codex into one Grafana stack with OpenTelemetry and Alloy, then used traces and logs to find and fix agent behavior issues at the source.
Skill Mining: From 3,500 Candidates to 466 Capabilities
We screened 3,500 skill candidates and adopted 466. A systematic mining and ingestion loop for building a coherent AI agent skill library at scale.
Progressive Disclosure: CLI Windows into Agent Systems
Agent systems are opaque by default. Progressive disclosure gives operators layered CLI views from quick status checks to full agent internals and decision traces.