
Alignment Principles

How we build

CortaLabs is an empirical AI research company. We develop cognitive architectures: systems that reason, remember, and get better over time. The capabilities we're pursuing carry real consequences, so we wrote down the constraints we actually operate within. This is that document.

01

Empirical rigor

Every claim about system behavior has to be backed by evidence. We don't ship confidence without verification, and we don't accept a theoretical safety argument as a substitute for an empirical test. If we can't validate a claim, we don't make it.

A cognitive system that believes something false is worse than one that knows nothing. Our research process is built around that problem: reproducible evaluation, falsifiable claims, and a bias toward "we don't know yet" over "probably fine."
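
To make the point concrete, here is a minimal sketch of what a reproducible, falsifiable evaluation can look like. It's illustrative Python, not our harness; run_eval, EVAL_SEED, and dataset_fingerprint are hypothetical names.

```python
# Minimal sketch of a reproducible evaluation (illustrative only;
# these names are hypothetical, not CortaLabs APIs).
import hashlib
import random

EVAL_SEED = 42  # pinned so anyone can rerun the exact same evaluation

def dataset_fingerprint(path: str) -> str:
    """Hash the evaluation data so results are tied to exact inputs."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def run_eval(model, cases: list[dict]) -> dict:
    """Run a deterministic pass/fail evaluation, reported with its seed."""
    rng = random.Random(EVAL_SEED)
    rng.shuffle(cases)
    passed = sum(1 for c in cases if model(c["input"]) == c["expected"])
    return {"seed": EVAL_SEED, "total": len(cases), "passed": passed}

# A reported number ships with the seed and data hash that produced it,
# so the claim is falsifiable: rerun it, or reject it.
```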

02

Knowledge integrity

The systems we build handle information. We design them so that memory is preserved accurately, context isn't silently dropped, and nothing gets corrupted because it was cheaper or faster to cut corners. Data integrity is an architectural decision, not a checkbox.

If a system knows something, it should represent that thing correctly. That sounds obvious, but much of the AI deployed today fails to meet even that standard. We think it matters enough to build for.
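
One concrete pattern: checksum memory at write time and verify it at read time, so corruption fails loudly instead of propagating. The sketch below is illustrative Python under that assumption; MemoryRecord and its helpers are hypothetical names, not our internal API.

```python
# Illustrative sketch: verify integrity on every read instead of
# trusting stored bytes. Hypothetical names, not a CortaLabs API.
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class MemoryRecord:
    content: str
    digest: str  # SHA-256 of content, computed once at write time

def write_record(content: str) -> MemoryRecord:
    return MemoryRecord(content, hashlib.sha256(content.encode()).hexdigest())

def read_record(record: MemoryRecord) -> str:
    # Fail loudly on corruption; never return silently degraded memory.
    if hashlib.sha256(record.content.encode()).hexdigest() != record.digest:
        raise ValueError("memory record failed integrity check")
    return record.content
```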

03

Full accountability

We own every outcome our systems produce. Complex systems fail — when ours do, the record already exists. Every decision, every action, every agent involved is traced, timestamped, and preserved in an immutable audit journal. Provenance isn't reconstructed after the fact. It's captured as it happens.

Governance isn't a policy we wrote and filed away. It's an architectural constraint — baked into how our systems log, attribute, and remember. If something went wrong, we can always look back, see exactly what happened, know why, and know who was responsible. That's the minimum standard for systems that make consequential decisions.
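
One way to get that property is a hash-chained, append-only journal: each entry commits to the entry before it, so history can't be quietly rewritten. Below is a minimal illustrative sketch in Python, with hypothetical names, not our production system.

```python
# Illustrative sketch of an append-only, hash-chained audit journal.
# Each entry commits to its predecessor, so tampering breaks the chain.
import hashlib
import json
import time

class AuditJournal:
    def __init__(self) -> None:
        self._entries: list[dict] = []

    def append(self, agent: str, action: str, detail: dict) -> dict:
        prev = self._entries[-1]["hash"] if self._entries else "genesis"
        entry = {
            "ts": time.time(),  # captured as it happens
            "agent": agent,     # who was responsible
            "action": action,   # what was decided or done
            "detail": detail,
            "prev": prev,       # link to the previous entry
        }
        entry["hash"] = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()
        ).hexdigest()
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain; any rewritten entry breaks a link."""
        prev = "genesis"
        for e in self._entries:
            body = {k: v for k, v in e.items() if k != "hash"}
            digest = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()
            ).hexdigest()
            if e["prev"] != prev or e["hash"] != digest:
                return False
            prev = e["hash"]
        return True
```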

04

Continuous refinement

Alignment doesn't get solved once. It changes as the systems change. We keep revising our methods, our evaluations, and our understanding of what "aligned" actually means at each capability level.

The moment you think you've figured out alignment is the moment you've stopped doing the work. Our processes are designed to resist that.

05

Human benefit

We build tools that make people better at what they already do. Better reasoning, better understanding, better decisions. Not systems that make those decisions instead of them.

The test isn't what the system can do on its own. It's whether the person using it walks away more capable than they were before.

06

Architectural alignment

You can't bolt alignment onto a finished system. It has to be a property of the architecture: how knowledge is stored, how decisions propagate, where accountability lives in the stack. Misalignment at the foundation doesn't get fixed at the surface.

We design systems where doing the right thing is the path of least resistance, not a constraint fighting against the architecture.
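
A small illustration of the idea: if the only public write path also records who acted, skipping the audit trail stops being an available shortcut. The Python below is a hypothetical sketch, not our architecture.

```python
# Illustrative sketch: alignment as an architectural property, not a
# bolt-on. AlignedStore is a hypothetical name, not a CortaLabs API.
import time

class AlignedStore:
    def __init__(self) -> None:
        self._data: dict[str, str] = {}
        self.log: list[dict] = []  # stands in for the audit journal above

    def put(self, key: str, value: str, *, agent: str) -> None:
        # Attribution is a required argument, not an optional afterthought.
        self.log.append({"agent": agent, "key": key, "at": time.time()})
        self._data[key] = value
```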

This document gets updated as our research advances. The principles themselves will evolve. The commitment won't: build cognitive systems that are worth trusting.

Document Provenance
Version: 0.3.3
Revised: 2026-02-08
Status: Living document
Maintainer: CortaLabs