<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.dura-lex.org/index.php?action=history&amp;feed=atom&amp;title=MCP%2FReference_resolution</id>
	<title>MCP/Reference resolution - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.dura-lex.org/index.php?action=history&amp;feed=atom&amp;title=MCP%2FReference_resolution"/>
	<link rel="alternate" type="text/html" href="https://wiki.dura-lex.org/index.php?title=MCP/Reference_resolution&amp;action=history"/>
	<updated>2026-04-23T05:36:08Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.45.3</generator>
	<entry>
		<id>https://wiki.dura-lex.org/index.php?title=MCP/Reference_resolution&amp;diff=44&amp;oldid=prev</id>
		<title>Nicolas: Create MCP/Reference resolution page from REFERENCE-RESOLUTION.md (via create-page on MediaWiki MCP Server)</title>
		<link rel="alternate" type="text/html" href="https://wiki.dura-lex.org/index.php?title=MCP/Reference_resolution&amp;diff=44&amp;oldid=prev"/>
		<updated>2026-04-23T02:06:25Z</updated>

		<summary type="html">&lt;p&gt;Create MCP/Reference resolution page from REFERENCE-RESOLUTION.md (via create-page on MediaWiki MCP Server)&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;= Reference Resolution =&lt;br /&gt;
&lt;br /&gt;
== Overview ==&lt;br /&gt;
&lt;br /&gt;
Reference resolution transforms a natural language legal reference (or a document ID) into a query against the corpus. It is the bridge between how humans cite law and how the database stores it.&lt;br /&gt;
&lt;br /&gt;
The resolver is used by:&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;MCP &amp;lt;code&amp;gt;get&amp;lt;/code&amp;gt; tool&amp;#039;&amp;#039;&amp;#039;: user asks for &amp;quot;article 1240 du code civil&amp;quot; → find the document&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Knowledge graph compiler&amp;#039;&amp;#039;&amp;#039; (future): annotations reference articles by citation → resolve to corpus document IDs&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Edge resolution&amp;#039;&amp;#039;&amp;#039;: when &amp;lt;code&amp;gt;corpus.edges.target_id&amp;lt;/code&amp;gt; is NULL, a background job resolves &amp;lt;code&amp;gt;reference&amp;lt;/code&amp;gt; strings to document IDs&lt;br /&gt;
&lt;br /&gt;
== TagQuery ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;TagQuery&amp;lt;/code&amp;gt; is the universal, jurisdiction-agnostic output of every reference parser.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;python&amp;quot;&amp;gt;&lt;br /&gt;
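from dataclasses import dataclass, field&lt;br /&gt;
from datetime import date&lt;br /&gt;
&lt;br /&gt;
@dataclass(frozen=True)&lt;br /&gt;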
class TagQuery:&lt;br /&gt;
    language: str  # required, kw_only&lt;br /&gt;
    kind: str | None = None&lt;br /&gt;
    tag_filters: TagFilterSet = field(default_factory=TagFilterSet)&lt;br /&gt;
    should_sort_in_force_first: bool = False&lt;br /&gt;
    at_date: date | None = None&lt;br /&gt;
    hint: str | None = None&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Fields:&lt;br /&gt;
* &amp;lt;code&amp;gt;language&amp;lt;/code&amp;gt;: ISO 639-1 code, required (disambiguates language variants)&lt;br /&gt;
* &amp;lt;code&amp;gt;kind&amp;lt;/code&amp;gt;: filter on document kind (legislation, decision, record...)&lt;br /&gt;
* &amp;lt;code&amp;gt;tag_filters&amp;lt;/code&amp;gt;: TagFilterSet with all tag predicates (EQ, IN, NOT_IN, ILIKE, EXISTS, NOT_EXISTS, NORMALIZE)&lt;br /&gt;
* &amp;lt;code&amp;gt;should_sort_in_force_first&amp;lt;/code&amp;gt;: order results with &amp;lt;code&amp;gt;tags.in_force=true&amp;lt;/code&amp;gt; first, then by date descending&lt;br /&gt;
* &amp;lt;code&amp;gt;at_date&amp;lt;/code&amp;gt;: temporal version selection — &amp;lt;code&amp;gt;date &amp;lt;= at_date AND (date_end IS NULL OR date_end &amp;gt; at_date)&amp;lt;/code&amp;gt;&lt;br /&gt;
* &amp;lt;code&amp;gt;hint&amp;lt;/code&amp;gt;: optional human-readable interpretation label. TagQueries &amp;#039;&amp;#039;&amp;#039;with&amp;#039;&amp;#039;&amp;#039; &amp;lt;code&amp;gt;hint&amp;lt;/code&amp;gt; are &amp;#039;&amp;#039;candidates&amp;#039;&amp;#039; (collected from all plugins, disambiguated at MCP level); those &amp;#039;&amp;#039;&amp;#039;without&amp;#039;&amp;#039;&amp;#039; are &amp;#039;&amp;#039;confident matches&amp;#039;&amp;#039; (first hit wins)&lt;br /&gt;
&lt;br /&gt;
== TagFilterSet ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;code&amp;gt;TagFilterSet&amp;lt;/code&amp;gt; is the unified, immutable filter model: a tuple of &amp;lt;code&amp;gt;TagFilter&amp;lt;/code&amp;gt; predicates combined with AND.&lt;br /&gt;
It replaces the previous scattered dict parameters (&amp;lt;code&amp;gt;tags&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;tags_ilike&amp;lt;/code&amp;gt;, &amp;lt;code&amp;gt;normalize&amp;lt;/code&amp;gt;).&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;python&amp;quot;&amp;gt;&lt;br /&gt;
import enum&lt;br /&gt;
from dataclasses import dataclass&lt;br /&gt;
&lt;br /&gt;
class TagFilterOp(enum.Enum):&lt;br /&gt;
    EQ = enum.auto()          # tags @&amp;gt; &amp;#039;{&amp;quot;k&amp;quot;: &amp;quot;v&amp;quot;}&amp;#039; (JSONB containment)&lt;br /&gt;
    IN = enum.auto()          # tags-&amp;gt;&amp;gt;&amp;#039;k&amp;#039; = ANY(...)&lt;br /&gt;
    NOT_IN = enum.auto()      # tags ? &amp;#039;k&amp;#039; AND NOT (tags-&amp;gt;&amp;gt;&amp;#039;k&amp;#039; = ANY(...))&lt;br /&gt;
    ILIKE = enum.auto()       # unaccent(tags-&amp;gt;&amp;gt;&amp;#039;k&amp;#039;) ILIKE unaccent(pattern)&lt;br /&gt;
    EXISTS = enum.auto()      # tags ? &amp;#039;k&amp;#039;&lt;br /&gt;
    NOT_EXISTS = enum.auto()  # NOT (tags ? &amp;#039;k&amp;#039;)&lt;br /&gt;
    NORMALIZE = enum.auto()   # regexp_replace comparison (reference resolution only)&lt;br /&gt;
&lt;br /&gt;
@dataclass(frozen=True)&lt;br /&gt;
class TagFilter:&lt;br /&gt;
    key: str                              # tag key or virtual key (source/jurisdiction/language)&lt;br /&gt;
    op: TagFilterOp&lt;br /&gt;
    value: str | list[str] | None = None&lt;br /&gt;
    normalize_pattern: str | None = None  # only for NORMALIZE&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Convenience constructor: &amp;lt;code&amp;gt;TagFilterSet.from_tags({&amp;quot;k&amp;quot;: &amp;quot;v&amp;quot;})&amp;lt;/code&amp;gt; builds EQ filters from a dict.&lt;br /&gt;
&lt;br /&gt;
== Resolution pipeline ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
input string&lt;br /&gt;
    |&lt;br /&gt;
    v&lt;br /&gt;
1. Direct ID lookup (try as corpus.documents.id — covers all kinds in one query)&lt;br /&gt;
    |  found? → return (with CID-based version redirection if at_date is set)&lt;br /&gt;
    v&lt;br /&gt;
2. Jurisdiction parser (FR, EU, GB...) → TagQuery&lt;br /&gt;
    |  parsed? → execute against store&lt;br /&gt;
    |  Note: SIREN 9-digit lookup is a detector in the FR plugin (with Luhn validation)&lt;br /&gt;
    v&lt;br /&gt;
3. Disambiguation (bare article number matches multiple codes → error with suggestions)&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
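&lt;br /&gt;
The Luhn validation mentioned in step 2 can be sketched as follows. The function name &amp;lt;code&amp;gt;is_valid_siren&amp;lt;/code&amp;gt; is hypothetical and the FR plugin may implement the check differently; the checksum itself is the standard Luhn algorithm.&lt;br /&gt;
&lt;br /&gt;
```python
def is_valid_siren(candidate: str) -> bool:
    """Return True if candidate is nine ASCII digits passing the Luhn checksum."""
    if len(candidate) != 9 or not (candidate.isascii() and candidate.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, char in enumerate(reversed(candidate)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0
```
&lt;br /&gt;
Running the checksum before querying lets the detector reject most arbitrary nine-digit strings without touching the store.&lt;br /&gt;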
&lt;br /&gt;
== TagQuery examples ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;python&amp;quot;&amp;gt;&lt;br /&gt;
# &amp;quot;article 1240 du code civil&amp;quot;&lt;br /&gt;
TagQuery(language=&amp;quot;fr&amp;quot;, kind=&amp;quot;legislation&amp;quot;,&lt;br /&gt;
    tag_filters=TagFilterSet.from_tags({&amp;quot;article_number&amp;quot;: &amp;quot;1240&amp;quot;, &amp;quot;code&amp;quot;: &amp;quot;Code civil&amp;quot;}),&lt;br /&gt;
    should_sort_in_force_first=True)&lt;br /&gt;
&lt;br /&gt;
# &amp;quot;article 1147 du code civil&amp;quot; (version in force in 2015)&lt;br /&gt;
TagQuery(language=&amp;quot;fr&amp;quot;, kind=&amp;quot;legislation&amp;quot;,&lt;br /&gt;
    tag_filters=TagFilterSet.from_tags({&amp;quot;article_number&amp;quot;: &amp;quot;1147&amp;quot;, &amp;quot;code&amp;quot;: &amp;quot;Code civil&amp;quot;}),&lt;br /&gt;
    at_date=date(2015, 6, 15))&lt;br /&gt;
&lt;br /&gt;
# &amp;quot;loi n 2021-1109&amp;quot;&lt;br /&gt;
TagQuery(language=&amp;quot;fr&amp;quot;, kind=&amp;quot;legislation&amp;quot;,&lt;br /&gt;
    tag_filters=TagFilterSet.from_tags({&amp;quot;nature&amp;quot;: &amp;quot;LOI&amp;quot;, &amp;quot;number&amp;quot;: &amp;quot;2021-1109&amp;quot;}))&lt;br /&gt;
&lt;br /&gt;
# &amp;quot;pourvoi 20-20.648&amp;quot; — case number with court filter and hint (candidate)&lt;br /&gt;
TagQuery(language=&amp;quot;fr&amp;quot;, kind=&amp;quot;decision&amp;quot;,&lt;br /&gt;
    tag_filters=TagFilterSet(filters=(&lt;br /&gt;
        TagFilter(key=&amp;quot;case_number&amp;quot;, op=TagFilterOp.NORMALIZE,&lt;br /&gt;
                  value=&amp;quot;20-20.648&amp;quot;, normalize_pattern=r&amp;quot;[\s.\-/]&amp;quot;),&lt;br /&gt;
        TagFilter(key=&amp;quot;court&amp;quot;, op=TagFilterOp.EQ, value=&amp;quot;cour_cassation&amp;quot;),&lt;br /&gt;
    )),&lt;br /&gt;
    hint=&amp;quot;pourvoi Cour de cassation&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
# &amp;quot;486329&amp;quot; — CE request number&lt;br /&gt;
TagQuery(language=&amp;quot;fr&amp;quot;, kind=&amp;quot;decision&amp;quot;,&lt;br /&gt;
    tag_filters=TagFilterSet(filters=(&lt;br /&gt;
        TagFilter(key=&amp;quot;case_number&amp;quot;, op=TagFilterOp.NORMALIZE,&lt;br /&gt;
                  value=&amp;quot;486329&amp;quot;, normalize_pattern=r&amp;quot;[\s.\-/]&amp;quot;),&lt;br /&gt;
        TagFilter(key=&amp;quot;court&amp;quot;, op=TagFilterOp.EQ, value=&amp;quot;conseil_etat&amp;quot;),&lt;br /&gt;
    )),&lt;br /&gt;
    hint=&amp;quot;requete Conseil d&amp;#039;Etat&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
# &amp;quot;21/00091&amp;quot; — CA/TJ RG number&lt;br /&gt;
TagQuery(language=&amp;quot;fr&amp;quot;, kind=&amp;quot;decision&amp;quot;,&lt;br /&gt;
    tag_filters=TagFilterSet(filters=(&lt;br /&gt;
        TagFilter(key=&amp;quot;case_number&amp;quot;, op=TagFilterOp.NORMALIZE,&lt;br /&gt;
                  value=&amp;quot;21/00091&amp;quot;, normalize_pattern=r&amp;quot;[\s.\-/]&amp;quot;),&lt;br /&gt;
        TagFilter(key=&amp;quot;court&amp;quot;, op=TagFilterOp.IN, value=[&amp;quot;cour_appel&amp;quot;, &amp;quot;tribunal_judiciaire&amp;quot;]),&lt;br /&gt;
    )),&lt;br /&gt;
    hint=&amp;quot;RG cour d&amp;#039;appel ou tribunal judiciaire&amp;quot;)&lt;br /&gt;
&lt;br /&gt;
# ECLI&lt;br /&gt;
TagQuery(language=&amp;quot;fr&amp;quot;, kind=&amp;quot;decision&amp;quot;,&lt;br /&gt;
    tag_filters=TagFilterSet.from_tags({&amp;quot;ecli&amp;quot;: &amp;quot;ECLI:FR:CCASS:2024:C100001&amp;quot;}))&lt;br /&gt;
&lt;br /&gt;
# &amp;quot;IDCC 3239&amp;quot;&lt;br /&gt;
TagQuery(language=&amp;quot;fr&amp;quot;, kind=&amp;quot;legislation&amp;quot;,&lt;br /&gt;
    tag_filters=TagFilterSet.from_tags({&amp;quot;idcc&amp;quot;: &amp;quot;3239&amp;quot;, &amp;quot;in_force&amp;quot;: &amp;quot;true&amp;quot;}))&lt;br /&gt;
&lt;br /&gt;
# UK: &amp;quot;section 1 of the Theft Act 1968&amp;quot; (ILIKE for fuzzy act title)&lt;br /&gt;
TagQuery(language=&amp;quot;en&amp;quot;, kind=&amp;quot;legislation&amp;quot;,&lt;br /&gt;
    tag_filters=TagFilterSet(filters=(&lt;br /&gt;
        TagFilter(key=&amp;quot;section_number&amp;quot;, op=TagFilterOp.EQ, value=&amp;quot;1&amp;quot;),&lt;br /&gt;
        TagFilter(key=&amp;quot;act_title&amp;quot;, op=TagFilterOp.ILIKE, value=&amp;quot;Theft Act 1968&amp;quot;),&lt;br /&gt;
    )))&lt;br /&gt;
&lt;br /&gt;
# DE: &amp;quot;§ 823 BGB&amp;quot;&lt;br /&gt;
TagQuery(language=&amp;quot;de&amp;quot;, kind=&amp;quot;legislation&amp;quot;,&lt;br /&gt;
    tag_filters=TagFilterSet.from_tags({&amp;quot;paragraph&amp;quot;: &amp;quot;823&amp;quot;, &amp;quot;code&amp;quot;: &amp;quot;BGB&amp;quot;}))&lt;br /&gt;
&lt;br /&gt;
# EU: &amp;quot;Article 101 TFEU&amp;quot;&lt;br /&gt;
TagQuery(language=&amp;quot;en&amp;quot;, kind=&amp;quot;legislation&amp;quot;,&lt;br /&gt;
    tag_filters=TagFilterSet.from_tags({&amp;quot;article_number&amp;quot;: &amp;quot;101&amp;quot;, &amp;quot;treaty&amp;quot;: &amp;quot;TFEU&amp;quot;}))&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Store execution ==&lt;br /&gt;
&lt;br /&gt;
The store translates a TagQuery to SQL via the shared &amp;lt;code&amp;gt;build_tag_filter_conditions&amp;lt;/code&amp;gt;&lt;br /&gt;
helper and needs no jurisdiction knowledge. Each TagFilterOp maps to a specific SQL pattern; the query below combines several of them for illustration:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;sql&amp;quot;&amp;gt;&lt;br /&gt;
SELECT * FROM corpus.documents&lt;br /&gt;
WHERE language = %(language)s                                       -- always required&lt;br /&gt;
  AND kind = %(kind)s&lt;br /&gt;
  AND tags @&amp;gt; %(eq_batch)s                                          -- coalesced EQ filters (GIN)&lt;br /&gt;
  AND tags-&amp;gt;&amp;gt;&amp;#039;k&amp;#039; = ANY(%(in_values)s)                               -- IN filter&lt;br /&gt;
  AND unaccent(tags-&amp;gt;&amp;gt;&amp;#039;code&amp;#039;) ILIKE unaccent(%(code_pattern)s)      -- ILIKE filter&lt;br /&gt;
  AND regexp_replace(tags-&amp;gt;&amp;gt;&amp;#039;case_number&amp;#039;, %(p)s, &amp;#039;&amp;#039;, &amp;#039;g&amp;#039;)          -- NORMALIZE filter&lt;br /&gt;
      = regexp_replace(%(v)s, %(p)s, &amp;#039;&amp;#039;, &amp;#039;g&amp;#039;)&lt;br /&gt;
  AND date &amp;lt;= %(at_date)s                                           -- temporal (if at_date)&lt;br /&gt;
  AND (date_end IS NULL OR date_end &amp;gt; %(at_date)s)&lt;br /&gt;
ORDER BY&lt;br /&gt;
    (tags-&amp;gt;&amp;gt;&amp;#039;in_force&amp;#039;)::boolean DESC NULLS LAST,                   -- should_sort_in_force_first&lt;br /&gt;
    date DESC NULLS LAST&lt;br /&gt;
LIMIT 10;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Multi-candidate resolution ==&lt;br /&gt;
&lt;br /&gt;
When a reference is ambiguous (e.g., a French case number that could match&lt;br /&gt;
multiple courts), the resolver returns multiple TagQuery instances, each with&lt;br /&gt;
a &amp;lt;code&amp;gt;hint&amp;lt;/code&amp;gt; describing the interpretation. The MCP &amp;lt;code&amp;gt;get_document&amp;lt;/code&amp;gt; tool:&lt;br /&gt;
&lt;br /&gt;
# Collects all TagQueries from all jurisdiction plugins&lt;br /&gt;
# Separates &amp;#039;&amp;#039;&amp;#039;confident&amp;#039;&amp;#039;&amp;#039; (no hint) from &amp;#039;&amp;#039;&amp;#039;candidates&amp;#039;&amp;#039;&amp;#039; (with hint)&lt;br /&gt;
# Tries confident matches first — first hit wins (existing behavior)&lt;br /&gt;
# Tries all candidates against the store:&lt;br /&gt;
#* 1 match → returns the document with a warning noting the interpretation&lt;br /&gt;
#* 2+ matches → returns an error listing all candidates with IDs&lt;br /&gt;
#* 0 matches → falls through to &amp;quot;not found&amp;quot;&lt;br /&gt;
&lt;br /&gt;
The &amp;lt;code&amp;gt;get_document&amp;lt;/code&amp;gt; tool also accepts an optional &amp;lt;code&amp;gt;tags&amp;lt;/code&amp;gt; parameter that merges&lt;br /&gt;
additional EQ filters into each TagQuery, narrowing the search (e.g.,&lt;br /&gt;
&amp;lt;code&amp;gt;tags={&amp;quot;court&amp;quot;: &amp;quot;conseil_etat&amp;quot;}&amp;lt;/code&amp;gt;).&lt;br /&gt;
&lt;br /&gt;
== Jurisdiction parsers ==&lt;br /&gt;
&lt;br /&gt;
Each jurisdiction plugin provides a parser that recognizes its citation formats:&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
! Plugin !! Recognizes&lt;br /&gt;
|-&lt;br /&gt;
| duralex-fr || Articles of codes, loi/decret/ordonnance by number, NOR codes, IDCC, ECLI, case numbers, BOFiP IDs, named laws&lt;br /&gt;
|-&lt;br /&gt;
| duralex-eu || CELEX numbers, ECLI, treaty articles, directive/regulation numbers&lt;br /&gt;
|-&lt;br /&gt;
| Future duralex-gb || Neutral citations ([2024] UKSC 1), Act + section, SI numbers&lt;br /&gt;
|-&lt;br /&gt;
| Future duralex-de || § + BGB/StGB/etc, Aktenzeichen, ECLI&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
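Parsers are composable: the MCP server chains all installed jurisdiction parsers. Among confident TagQueries (no &amp;lt;code&amp;gt;hint&amp;lt;/code&amp;gt;) the first hit wins; hinted candidates are collected from all plugins and disambiguated as described under Multi-candidate resolution.&lt;br /&gt;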
&lt;br /&gt;
== Batch resolution (for knowledge graph) ==&lt;br /&gt;
&lt;br /&gt;
The compiler needs to resolve millions of references. The same TagQuery mechanism is used, but with batch-friendly optimizations:&lt;br /&gt;
* Pre-filter by known patterns (regex on reference strings)&lt;br /&gt;
* Group by reference type and execute one query per group&lt;br /&gt;
* Cache resolved IDs in a lookup table for the duration of compilation&lt;br /&gt;
&lt;br /&gt;
== Edge resolution (background job) ==&lt;br /&gt;
&lt;br /&gt;
When &amp;lt;code&amp;gt;corpus.edges.target_id&amp;lt;/code&amp;gt; is NULL, a background job periodically attempts resolution:&lt;br /&gt;
&lt;br /&gt;
&amp;lt;syntaxhighlight lang=&amp;quot;sql&amp;quot;&amp;gt;&lt;br /&gt;
SELECT id, reference FROM corpus.edges WHERE target_id IS NULL;&lt;br /&gt;
-- For each: parse reference → TagQuery → execute → update target_id&lt;br /&gt;
UPDATE corpus.edges SET target_id = %(resolved_id)s WHERE id = %(edge_id)s;&lt;br /&gt;
&amp;lt;/syntaxhighlight&amp;gt;&lt;br /&gt;
&lt;br /&gt;
[[Category:MCP]]&lt;/div&gt;</summary>
		<author><name>Nicolas</name></author>
	</entry>
</feed>