This isn't a model problem. It's an architecture problem. And the problem has a name: the absence of an ontology, a shared conceptual structure that the model, the engineers, and the domain experts all read from.
And when you apply it deliberately — not as an academic exercise, but as a design decision — you get something genuinely useful: a boundary object that enforces logical constraints and ensures the model is operating with semantic meaning, not just statistical fluency.
First: What's a Boundary Object?
The term comes from sociology, not software. A boundary object is something different communities can use from their own perspective while still referring to the same underlying structure. A patient record. A city map. A data schema. Each group reads it differently, but they're all reading the same thing.
In AI systems, that boundary is almost always implicit — and that's where things break. The domain expert describes a concept in natural language. The engineer translates it into fields and types. The model absorbs it as patterns in text. At every translation, something leaks. Relationships get assumed. Constraints become invisible. Vocabulary drifts.
An ontology makes those translations explicit and binding. It says: here are the classes of things that exist in this domain, here are the relationships between them, here are the rules that cannot be violated, and here is the vocabulary we're using. Everyone — human and machine alike — is working from the same map.
What an Ontology Actually Enforces
When you inject an ontology into an AI application, it's doing work across four dimensions simultaneously. This is worth being precise about, because each one does something different.
Concept hierarchy. An ontology doesn't just name things — it situates them. A ClinicalTrial is a type of ResearchStudy. A DrugInteraction is a type of ClinicalEvent. This hierarchy tells the model not just what something is called, but what category of thing it is, and what properties and behaviors come with that. Models that understand hierarchy generalize correctly instead of treating every concept as a flat, isolated term.
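A hierarchy like this can be made explicit in code. The sketch below uses Python's own class system to encode the "is a type of" relationships from the examples above; the class names are illustrative, not from any real ontology:

```python
# Sketch: encode a concept hierarchy so category membership is queryable.
# Class names (ResearchStudy, ClinicalTrial, ...) are hypothetical examples.

class ResearchStudy:
    """Top-level category: any systematic investigation."""

class ClinicalTrial(ResearchStudy):
    """A ResearchStudy conducted on human participants."""

class ClinicalEvent:
    """Top-level category: anything that happens in a clinical context."""

class DrugInteraction(ClinicalEvent):
    """A ClinicalEvent arising from two or more compounds."""

# The hierarchy answers category questions directly:
trial = ClinicalTrial()
assert isinstance(trial, ResearchStudy)        # a trial IS a research study
assert not isinstance(trial, ClinicalEvent)    # and is NOT a clinical event
```

The point is not that you should model ontologies as Python classes, but that subsumption is a queryable structure, not a naming convention.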
Relational semantics. Relationships in an ontology are named and typed, not implied. A Compound inhibits an Enzyme. A Gene encodes a Protein. These aren't foreign keys. They carry meaning. When a model knows that inhibits is a specific causal relationship with directionality and domain implications, it reasons about that differently than if it had only seen those words appear near each other in text.
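Here is a minimal sketch of what "named and typed" means in practice: the relation itself carries a domain, a range, and a directionality flag. The relation and entity names are hypothetical, echoing the examples above:

```python
# Sketch: relationships as first-class, typed objects rather than bare links.
# Names (inhibits, encodes, Compound, Enzyme, ...) are illustrative only.
from dataclasses import dataclass

@dataclass(frozen=True)
class Relation:
    name: str      # e.g. "inhibits"
    domain: str    # class the subject must belong to
    range: str     # class the object must belong to
    causal: bool   # does it carry causal directionality?

INHIBITS = Relation("inhibits", domain="Compound", range="Enzyme", causal=True)
ENCODES  = Relation("encodes",  domain="Gene",     range="Protein", causal=False)

@dataclass(frozen=True)
class Triple:
    subject: str
    relation: Relation
    obj: str

fact = Triple("aspirin", INHIBITS, "COX-1")
# The relation now answers semantic questions that text co-occurrence can't:
assert fact.relation.causal
assert fact.relation.range == "Enzyme"
```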
Logical constraints. This is where ontology earns its keep for production AI. OWL and similar formalisms let you state what is logically impossible: a patient cannot be both enrolled and excluded from the same trial; a compound cannot simultaneously activate and inhibit the same enzyme. These axioms give the model explicit rules to honor — and give you a mechanism to actually check whether it has.
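In OWL this kind of impossibility would be stated as a disjointness axiom; as a rough sketch, the same check can be hand-rolled over a set of triples. The facts and relation names here are made up for illustration:

```python
# Sketch: a disjointness check over asserted triples. A real system would
# state this as an OWL axiom and run a reasoner; this is the hand-rolled idea.

def violates_disjointness(assertions: set[tuple[str, str, str]]) -> list[str]:
    """Flag subjects asserted to both activate and inhibit the same object."""
    violations = []
    for subj, rel, obj in sorted(assertions):
        if rel == "activates" and (subj, "inhibits", obj) in assertions:
            violations.append(f"{subj} cannot both activate and inhibit {obj}")
    return violations

facts = {
    ("compound_x", "activates", "enzyme_a"),
    ("compound_x", "inhibits", "enzyme_a"),   # contradicts the axiom
    ("compound_y", "inhibits", "enzyme_a"),   # fine on its own
}
print(violates_disjointness(facts))
# → ['compound_x cannot both activate and inhibit enzyme_a']
```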
Controlled vocabulary. In any serious domain, terminology is contested. "Heart attack" and "myocardial infarction" refer to the same event, but one is clinical and one is colloquial, and in a medical record that distinction matters. An ontology encodes preferred terms, acceptable synonyms, and terms to avoid. The model stops picking vocabulary statistically and starts picking it semantically.
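The mechanism can be as simple as a mapping from every acceptable surface form to its preferred term. The entries below are illustrative; a production system would load an established terminology such as SNOMED CT or MeSH rather than a hand-written dict:

```python
# Sketch: a controlled vocabulary mapping synonyms to preferred terms.
# Entries are illustrative stand-ins for a real terminology service.

VOCAB = {
    "heart attack":          "myocardial infarction",   # colloquial -> clinical
    "mi":                    "myocardial infarction",
    "myocardial infarction": "myocardial infarction",   # already preferred
}

def normalize(term: str) -> str:
    """Return the preferred clinical term, or raise on unknown vocabulary."""
    key = term.strip().lower()
    if key not in VOCAB:
        raise KeyError(f"term not in controlled vocabulary: {term!r}")
    return VOCAB[key]

assert normalize("Heart attack") == "myocardial infarction"
```

Raising on unknown terms (rather than passing them through) is the design choice that makes the vocabulary a boundary: anything outside it is forced into the open instead of silently drifting.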
The Schema vs. Ontology Distinction (and Why It Matters)
Schema tells you the shape of data. This object has these fields, these types, these required properties. It's structural. Ontology tells you the meaning of data — what these concepts are, how they relate, what constraints govern valid states of the world. It's semantic.
Schema asks: is this data structurally valid?
Ontology asks: is this data true within the conceptual model of the domain?
A medical record can be perfectly valid JSON — all fields present, all types correct — while asserting something that is semantically impossible according to the domain. Schema won't catch it. An ontology reasoner will.
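The gap between the two checks is easy to show concretely. In the sketch below, a record passes a shape-only schema check while violating a domain axiom; the field names and the enrolled/excluded constraint are hypothetical:

```python
# Sketch: a record that passes schema validation but fails a semantic check.
# The axiom (cannot be both enrolled and excluded) is a hypothetical example.

record = {
    "patient_id": "p-001",
    "trial_id": "t-42",
    "enrolled": True,
    "excluded": True,   # structurally valid booleans, semantically impossible
}

def schema_valid(r: dict) -> bool:
    """Shape check only: required fields present with the right types."""
    return (isinstance(r.get("patient_id"), str)
            and isinstance(r.get("trial_id"), str)
            and isinstance(r.get("enrolled"), bool)
            and isinstance(r.get("excluded"), bool))

def semantically_valid(r: dict) -> bool:
    """Domain axiom: cannot be both enrolled in and excluded from a trial."""
    return not (r["enrolled"] and r["excluded"])

assert schema_valid(record)            # the schema is satisfied...
assert not semantically_valid(record)  # ...but the domain model objects
```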
This distinction is what matters for AI. Language models are very good at producing structurally valid outputs. They are much less reliable at producing semantically correct ones, especially in specialized domains where correctness depends on relationships the model has only seen implicitly. The ontology closes that gap.
Where in Your System Does This Actually Live?
One of the most useful things about treating an ontology as a boundary object is that it forces you to decide where in your system the boundary gets enforced. There are four distinct places, and they have very different tradeoffs.
You can bake it into the model through fine-tuning — training on OWL/RDF triples or domain corpora that reflect the ontology's structure. This produces the deepest specificity, but it's expensive and static. The ontology gets frozen in the weights. There's also emerging research here worth watching — the LLMs4OL challenge is now in its second year, actively exploring what happens when you put LLMs to work directly on ontological tasks.
You can use it to structure retrieval — letting the ontology govern what facts get fetched at inference time. GraphRAG does this: instead of pure vector similarity, it traverses a knowledge graph to pull related entities before generating. The ontology shapes the context the model sees without touching the model itself.
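The core of that traversal is a bounded graph walk out from the query's entity. This sketch uses a toy in-memory graph with made-up entities; a real GraphRAG pipeline would query a graph store instead:

```python
# Sketch of ontology-guided retrieval: traverse a small knowledge graph from
# the query's entity instead of relying on vector similarity alone.
# The graph and entity names are made up for illustration.
from collections import deque

GRAPH = {  # entity -> list of (relation, neighbor)
    "aspirin": [("inhibits", "COX-1"), ("treats", "inflammation")],
    "COX-1":   [("produces", "prostaglandins")],
}

def retrieve_subgraph(start: str, max_hops: int = 2) -> list[tuple[str, str, str]]:
    """BFS out from the query entity, collecting typed triples as context."""
    triples, seen, queue = [], {start}, deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue
        for rel, nbr in GRAPH.get(node, []):
            triples.append((node, rel, nbr))
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, depth + 1))
    return triples

print(retrieve_subgraph("aspirin"))
```

The `max_hops` bound is what keeps retrieval relevant: the ontology decides which neighbors count as context, and the hop limit decides how far that context extends.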
You can inject relevant subgraphs directly into context — serializing the classes, relationships, and constraints that are relevant to a given query into the system prompt. This is cheap, flexible, and requires no retraining. The tradeoff is context window size, so you retrieve only the relevant subgraph per query rather than the full ontology.
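Serialization here can be very plain: turn the retrieved triples and the applicable axioms into a block of text the model reads as its world. The wording and triples below are illustrative:

```python
# Sketch: serialize only the query-relevant slice of the ontology into a
# system prompt. Triple selection and prompt wording are illustrative.

def subgraph_to_prompt(triples, constraints):
    lines = ["You must reason within this domain model:", "", "Facts:"]
    lines += [f"- {s} {rel} {o}" for s, rel, o in triples]
    lines += ["", "Hard constraints (never violate):"]
    lines += [f"- {c}" for c in constraints]
    return "\n".join(lines)

prompt = subgraph_to_prompt(
    triples=[("aspirin", "inhibits", "COX-1")],
    constraints=["A compound cannot both activate and inhibit the same enzyme."],
)
print(prompt)
```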
Or you can validate at the output layer — running a reasoner or constraint checker against the model's response before it's returned. This is where you catch hallucinations that contradict domain facts.
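A minimal version of that output gate extracts claims from the response and checks them against the same disjointness rules. The extraction format and relation names below are toy assumptions; real systems would use a proper relation extractor and a reasoner:

```python
# Sketch: validate model output against domain axioms before returning it.
# extract_triples is a toy stand-in for real relation extraction.

def extract_triples(text: str) -> set[tuple[str, str, str]]:
    """Toy extractor: expects lines like 'subject|relation|object'."""
    return {tuple(line.split("|")) for line in text.splitlines() if "|" in line}

DISJOINT = {("activates", "inhibits")}  # relation pairs that cannot co-occur

def validate(model_output: str) -> list[str]:
    triples = extract_triples(model_output)
    errors = []
    for s, rel, o in triples:
        for a, b in DISJOINT:
            if rel == a and (s, b, o) in triples:
                errors.append(f"contradiction: {s} both {a} and {b} {o}")
    return errors

bad = "drug_z|activates|enzyme_q\ndrug_z|inhibits|enzyme_q"
print(validate(bad))   # flags the contradiction before the answer ships
```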
The production systems I've heard about that work well typically combine at least two of these. Ontology-in-context gives the model its operative world; an output validator catches what slips through anyway.
You're Already Doing This (Probably Without Calling It That)
Here's something I find genuinely interesting: any skills file, system prompt, or behavioral spec that's been written carefully is already doing ontological work — whether or not the author thought of it that way.
A well-written skills.md defines what kinds of things exist in the domain, how they relate to each other, what the vocabulary means, and what constraints hold. That is an ontology. It's an informal one, expressed in natural language instead of OWL, but it's doing the same conceptual work.
The difference between a skills file that functions as a boundary object and one that doesn't is intentionality. One defines a world. The other just navigates one that's assumed to be already defined.
When you write deliberately across all four dimensions — hierarchy, relationships, constraints, vocabulary — you give the model something it can actually reason within. Not a list of instructions to follow, but a map of the domain to operate from.
Where the Research Is Right Now
I want to be transparent that this is an active and still-maturing area of practice, not an established playbook. The work being published out of venues like ISWC (International Semantic Web Conference), the KaLLM workshop at ACL, and the LLMs4OL challenge is genuinely exciting — and genuinely early.
One paper worth knowing: Agent-OM (Qiang, Wang, and Taylor, VLDB 2024) demonstrates that LLMs operating within ontological frameworks — using retrieval, matching, and constraint-checking rather than pure generation — significantly outperform models reasoning without that structure, particularly on complex domain-specific tasks. What's notable is that their approach doesn't retrain the model at all. The boundary work happens at inference time, through retrieval and validation. The model stays general; the ontology makes it specific.
If you're building something where being wrong in a specific domain is costly — clinical, legal, scientific, financial — this is worth digging into. The tools exist. The research is advancing. The design pattern is clearer than it's ever been.
The map is there. The question is whether you hand it to your model.