Philosophy
The reality
Citizens, lawyers, and organizations increasingly use AI to handle legal questions. This trend will not reverse. AI is becoming a primary interface to the law, worldwide.
The question is not whether people will use AI for law. They already do. The question is whether they will do it safely.
The problem
Black boxes
Current legal AI tools are black boxes. The data they rely on is opaque. Their reasoning is not auditable. Their confidentiality commitments are neither provable nor verifiable.
"The confidentiality commitments of AI solution providers are neither provable nor verifiable."
— French National Bar Council (CNB), Guide de la déontologie et de l'intelligence artificielle, March 2026
No legal privilege
Attorney-client privilege does not extend to AI conversations. When a lawyer, a company, or an individual consults a cloud AI about a legal matter, those conversations are stored on someone else's servers. They can be:
- Seized in court proceedings or by regulators
- Subpoenaed via domestic or foreign discovery (CLOUD Act, FISA)
- Exposed in a data breach
- Used for profiling or training, depending on the provider's terms
There is no privilege, no secrecy, no control. Every cloud-hosted legal conversation is a liability.
Hallucination
LLMs hallucinate on legal content. Even with retrieval-augmented generation (RAG), reported hallucination rates range from 17% (Lexis+ AI) to 43% (GPT-4) in recent empirical studies. Up to 88% of legal citations can be invented. Courts have already begun sanctioning filings that contain AI-fabricated content.
The mission
Dura Lex does not aim to prevent AI usage in law; it aims to make it safe. Our goals:
- Safety: Strict guidelines, quality checks, and content quality tracking on every document. The system should never hide uncertainty; it should express it. Gaps in coverage should be flagged, not concealed.
- Transparency: Everything traceable and auditable, from source data to final answer. Every document carries its provenance. Every enrichment carries its method and confidence. This is where we are heading: not every link in the chain is fully auditable today, but the architecture is designed for it and we are building toward it.
- Sovereignty: The entire stack deployable on-premise, on sovereign infrastructure, or fully air-gapped. The law comes to the user's data, not the user's data to someone else's cloud. No dependency on foreign providers.
- Professional secrecy: Conversations, queries, and research under the user's control. The architecture is designed so that sensitive data never needs to leave the user's perimeter.
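The transparency goal (every document carries its provenance, every enrichment carries its method and confidence) can be pictured with a few small records. This is only a sketch under assumptions: the class names, field names, the `method` vocabulary, and the 0.8 threshold are all hypothetical and are not taken from the Dura Lex codebase.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Provenance:
    """Where a raw document came from (hypothetical fields)."""
    source_name: str   # placeholder name of the upstream source
    retrieved_at: str  # ISO 8601 date of retrieval
    checksum: str      # SHA-256 of the raw text, so the fetch can be re-verified


@dataclass(frozen=True)
class Enrichment:
    """A derived value that states how it was produced and how sure it is."""
    name: str
    value: str
    method: str        # assumed vocabulary, e.g. "rule", "llm", "human"
    confidence: float  # 0.0 to 1.0

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within [0, 1]")


@dataclass
class TracedDocument:
    text: str
    provenance: Provenance
    enrichments: list[Enrichment] = field(default_factory=list)

    def uncertain(self, threshold: float = 0.8) -> list[Enrichment]:
        """Enrichments below the threshold: to be surfaced, not hidden."""
        return [e for e in self.enrichments if e.confidence < threshold]


raw = "Example statutory text."
doc = TracedDocument(
    text=raw,
    provenance=Provenance(
        source_name="example-official-gazette",
        retrieved_at="2026-01-15",
        checksum=hashlib.sha256(raw.encode("utf-8")).hexdigest(),
    ),
    enrichments=[
        Enrichment("summary", "Placeholder summary.", "llm", 0.62),
        Enrichment("publication_date", "2026-01-10", "rule", 0.99),
    ],
)
```

Calling `doc.uncertain()` returns only the low-confidence `summary` enrichment, which is the piece a front end would flag to the user rather than present as settled.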
The answer: an open operating system for legal data
Dura Lex is architected as an operating system for law:
- A jurisdiction-agnostic kernel — protocols, data types, URI schema, independent of any country
- Jurisdiction drivers — one plugin per legal system (France and EU today, designed for any country)
- A robust ingestion pipeline — structured, versioned, reproducible
- Services — MCP server for AI agents, web portal for humans, full-text search with per-language stemming
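The kernel/driver split can be illustrated with a minimal Python sketch. Every name here (`JurisdictionDriver`, `DriverRegistry`, `list_sources`, the `"fr"` code, the source names) is a hypothetical illustration of the plugin idea, not the project's actual interface.

```python
from typing import Protocol


class JurisdictionDriver(Protocol):
    """Hypothetical driver contract: one implementation per legal system."""

    code: str  # jurisdiction code, e.g. "fr" or "eu" (assumed convention)

    def list_sources(self) -> list[str]:
        """Names of the sources this driver knows how to ingest."""
        ...


class DriverRegistry:
    """Hypothetical kernel-side registry; it contains no country-specific logic."""

    def __init__(self) -> None:
        self._drivers: dict[str, JurisdictionDriver] = {}

    def register(self, driver: JurisdictionDriver) -> None:
        if driver.code in self._drivers:
            raise ValueError(f"driver already registered: {driver.code}")
        self._drivers[driver.code] = driver

    def get(self, code: str) -> JurisdictionDriver:
        try:
            return self._drivers[code]
        except KeyError:
            raise LookupError(f"no driver for jurisdiction: {code}") from None

    def jurisdictions(self) -> list[str]:
        return sorted(self._drivers)


class ExampleFrenchDriver:
    """Illustrative driver; the source names are placeholders."""

    code = "fr"

    def list_sources(self) -> list[str]:
        return ["example-legislation-feed", "example-case-law-feed"]


registry = DriverRegistry()
registry.register(ExampleFrenchDriver())
```

In this sketch, supporting another country means writing one more class that satisfies the protocol and registering it; the registry itself does not change.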
France is the first implementation, not the scope. The schema follows the OpenStreetMap model: a single documents table with JSONB tags. Six universal structural kinds (legislation, decision, record, notice, section, chunk) cover every document type we have encountered in the 25+ jurisdictions tested. Legal categories live in tags, not in the schema. Adding a jurisdiction requires zero schema migration.
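A small in-memory sketch of the "one table, free-form tags" idea follows. Only the six kind names come from the text above; the `Document` field names, the tag keys and values, and the `example:` URIs are assumptions made for illustration, and a plain `dict` stands in for the JSONB column.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(str, Enum):
    """The six structural kinds named in the text."""
    LEGISLATION = "legislation"
    DECISION = "decision"
    RECORD = "record"
    NOTICE = "notice"
    SECTION = "section"
    CHUNK = "chunk"


@dataclass
class Document:
    """One row of the single documents table (field names are assumptions)."""
    uri: str
    kind: Kind
    jurisdiction: str
    tags: dict[str, str] = field(default_factory=dict)  # stands in for JSONB


def find(docs: list[Document], **wanted: str) -> list[Document]:
    """Return documents whose tags contain every wanted key/value pair."""
    return [d for d in docs if all(d.tags.get(k) == v for k, v in wanted.items())]


docs = [
    Document("example:fr/code-civil/art-1240", Kind.SECTION, "fr",
             {"category": "code", "status": "in_force"}),
    # A second jurisdiction only brings new tag values; the row shape is unchanged.
    Document("example:de/bgb/823", Kind.SECTION, "de",
             {"category": "gesetzbuch", "status": "in_force"}),
]
```

Here `find(docs, status="in_force")` matches both rows, while `find(docs, category="code")` matches only the French one. The German row introduces a tag value the French data never used, yet `Document` itself needs no new field, which is the property the "zero schema migration" claim relies on.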
Open source, open data
| Component | License | Rationale |
|---|---|---|
| Software | MIT | Anyone can use, fork, embed, and commercialize it; the only condition is keeping the license notice |
| Enriched data | ODbL | Share-alike — improvements flow back to the commons, no one can close the data |
| Raw source data | Per-source (Licence Ouverte, CC0, etc.) | Government open data |
The OpenStreetMap model: permissive code, copyleft data.
Credibility by audit, not by reputation
Traditional legal publishing relies on editorial curation — selection, ranking, interpretation. This work has immense value, but it implies a filter.
Our approach: discard nothing, structure everything, and make every assertion traceable to its source. Authority comes from traceability, not from a name on the cover.
Doubt is always expressed
The system should never pretend to certainty it does not have. Missing data, low-quality sources, incomplete coverage, untested jurisdictions — all should be surfaced, never hidden.
A tool that tells you "here is the answer" without showing where it looked, what it found, and what it might have missed — that is a black box.
A tool where every step is inspectable, every limitation is stated, and every source is cited — that is a digital common.
Unique in the landscape
We have analyzed 80+ legal MCP servers across 40+ jurisdictions. The vast majority are simple API relays with no behavioral framing. Among the projects we reviewed, Dura Lex is the only one that injects mandatory safety guidelines before every research session, and the only one with a quality feedback mechanism that lets the AI report issues in the data.