Development/Python
= Python Conventions =

All duralex-* packages follow these conventions. No exceptions.

== Naming ==

=== Casing (PEP 8) ===

{| class="wikitable"
! Element !! Case !! Example
|-
| Functions, methods, variables || <code>snake_case</code> || <code>parse_legislation_article()</code>
|-
| Classes || <code>PascalCase</code> || <code>LegislationArticle</code>
|-
| Constants || <code>UPPER_SNAKE</code> || <code>ALLOWED_TABLE_NAMES</code>
|-
| Enum members || <code>UPPER_SNAKE</code> || <code>Confidence.SOURCE_CHECKED</code>
|-
| Enum string values || <code>lowercase</code> || <code>"source_checked"</code>
|-
| Module filenames || <code>snake_case</code> || <code>connection_pool.py</code>
|}

=== No abbreviations ===

The code is written and maintained by AI. The AI does not tire of typing. Full words always.

{| class="wikitable"
! Wrong !! Right
|-
| <code>ref</code> || <code>reference</code>
|-
| <code>leg</code> || <code>legislation</code>
|-
| <code>dec</code> || <code>decision</code>
|-
| <code>doc</code> || <code>document</code>
|-
| <code>fts</code> || <code>full_text_search</code>
|-
| <code>tbl</code> || <code>table</code>
|-
| <code>q</code> || <code>query</code>
|-
| <code>lim</code> || <code>limit</code>
|-
| <code>flt</code> || <code>filter</code>
|-
| <code>el</code> || <code>element</code>
|-
| <code>ctx</code> || <code>context</code>
|-
| <code>conn</code> || <code>connection</code>
|-
| <code>cfg</code> || <code>configuration</code>
|-
| <code>num</code> || <code>number</code>
|-
| <code>idx</code> || <code>index</code>
|-
| <code>val</code> || <code>value</code>
|}

=== Qualified names ===

Single-word names are ambiguous. Always qualify with the domain.

{| class="wikitable"
! Wrong !! Right !! Why
|-
| <code>query</code> || <code>search_query</code> || Could be SQL, HTTP, FTS...
|-
| <code>text</code> || <code>article_text</code> || Could be anything
|-
| <code>content</code> || <code>html_content</code> || What kind?
|-
| <code>result</code> || <code>search_result</code> || Result of what?
|-
| <code>data</code> || <code>decision_data</code> || Meaningless alone
|-
| <code>items</code> || <code>matched_articles</code> || What items?
|-
| <code>response</code> || <code>search_response</code> || From where?
|-
| <code>path</code> || <code>file_path</code> or <code>concept_path</code> || Filesystem? URI?
|}

=== Booleans read as phrases ===

A boolean variable or parameter must read as a true/false statement.

<syntaxhighlight lang="python">
# Wrong
active = True
force = False
recursive = True

# Right
is_in_force = True
should_force_refresh = False
is_recursive = True
has_been_verified = False
should_include_repealed = True
</syntaxhighlight>

=== Classes: named for what they ARE ===

<syntaxhighlight lang="python">
class LegislationArticle: ...
class CaseLawDecision: ...
class ResolvedReference: ...
class AnnotationEnvelope: ...
class ConceptDefinition: ...
class SearchFilters: ...
class SearchResults: ...
class CompiledPackage: ...
</syntaxhighlight>

=== Methods: verb + explicit object ===

<syntaxhighlight lang="python">
def parse_legislation_article(xml_path: Path) -> LegislationArticle: ...
def resolve_legal_reference(raw_text: str) -> list[ResolvedReference]: ...
def search_full_text(search_query: str, filters: SearchFilters) -> SearchResults: ...
def compile_domain_package(domain: str) -> CompiledPackage: ...
def sanitize_html_content(raw_html: str) -> str: ...
def extract_text_content(element: Element, xpath: str) -> str | None: ...
def validate_date_range(date_from: str | None, date_to: str | None) -> None: ...
</syntaxhighlight>

=== Protocols: named for the capability ===

<syntaxhighlight lang="python">
from typing import Protocol

class LegislationParser(Protocol): ...
class ReferenceResolver(Protocol): ...
class SearchEngine(Protocol): ...
class VersionSelector(Protocol): ...
class DecisionDownloader(Protocol): ...
</syntaxhighlight>

=== Enums ===

<syntaxhighlight lang="python">
from enum import Enum


class ConceptType(Enum):
    QUALIFIABLE = "qualifiable"
    OPEN_STANDARD = "open_standard"
    GUIDING_PRINCIPLE = "guiding_principle"
    PROCEDURAL = "procedural"
    SCALE = "scale"


class Confidence(Enum):
    STUB = "stub"
    MEMORY_ONLY = "memory_only"
    SOURCE_CHECKED = "source_checked"
    CROSS_VALIDATED = "cross_validated"


class Outcome(Enum):
    QUALIFIED = "qualified"
    NOT_QUALIFIED = "not_qualified"
    VALIDATED = "validated"
    INVALIDATED = "invalidated"
    PROCEDURAL = "procedural"
    MOOT = "moot"
</syntaxhighlight>

== Architecture patterns ==

=== Dependency injection ===

Dependencies are passed explicitly. No global singletons, no module-level mutable state.

<syntaxhighlight lang="python">
# Wrong
class SearchEngine:
    def __init__(self):
        self._pool = _get_global_pool()
        self._load_cache()


# Right
class FullTextSearchEngine:
    def __init__(self, connection_pool: ConnectionPool):
        self.connection_pool = connection_pool
</syntaxhighlight>

=== No side effects in <code>__init__</code> ===

Constructors store parameters. They do not open connections, load caches, or perform I/O.

<syntaxhighlight lang="python">
# Wrong
class FrenchReferenceResolver:
    def __init__(self, connection_pool: ConnectionPool):
        self.connection_pool = connection_pool
        self._code_cache = self._load_code_cache()  # I/O in __init__


# Right
class FrenchReferenceResolver:
    def __init__(self, connection_pool: ConnectionPool):
        self.connection_pool = connection_pool
        self._code_cache: dict[str, str] | None = None

    def _ensure_code_cache(self) -> dict[str, str]:
        if self._code_cache is None:
            self._code_cache = self._load_code_cache()
        return self._code_cache
</syntaxhighlight>

=== Composition over inheritance ===

Core libraries define <code>Protocol</code> interfaces. Country packages and plugins provide implementations. Applications compose them.
<syntaxhighlight lang="python">
# duralex -- defines the interface
class ReferenceResolver(Protocol):
    def resolve_legal_reference(self, raw_text: str) -> list[ResolvedReference]: ...


class CompositeReferenceResolver:
    """Chains multiple resolvers. First match wins."""

    def __init__(self, resolvers: list[ReferenceResolver]):
        self.resolvers = resolvers

    def resolve_legal_reference(self, raw_text: str) -> list[ResolvedReference]:
        for resolver in self.resolvers:
            if results := resolver.resolve_legal_reference(raw_text):
                return results
        return []


# duralex-fr -- implements for France
class FrenchLegalReferenceResolver:
    """French legal references: articles, lois, pourvois, ECLI."""
    ...


# Application -- composes at startup
resolver = CompositeReferenceResolver([
    FrenchLegalReferenceResolver(connection_pool=pool),
    SireneCompanyResolver(),
])
</syntaxhighlight>

=== One module = one concept ===

A Python file should contain one coherent concept. If you need a table of contents to navigate the file, split it.

{| class="wikitable"
! Wrong !! Right
|-
| <code>db.py</code> (1400 lines: pool + CRUD + FTS + ingest + dedup + browse) || <code>connection_pool.py</code>, <code>full_text_search.py</code>, <code>ingest_state.py</code>, <code>browse_structure.py</code>
|-
| <code>validation.py</code> (filters + jurisdiction + pagination + dates + courts) || <code>search_filters.py</code>, <code>court_classification.py</code>
|}

Target: '''under 300 lines per file'''. Hard limit: '''500 lines'''.

== Type annotations ==

Every function signature is fully annotated. An auditor reads signatures before reading bodies.

<syntaxhighlight lang="python">
# Wrong
def search(query, table, limit=20): ...


# Right
def search_full_text(
    search_query: str,
    table_name: str,
    result_limit: int = 20,
    date_from: date | None = None,
    date_to: date | None = None,
) -> SearchResults: ...
</syntaxhighlight>

Use <code>|</code> union syntax (Python 3.10+), not <code>Optional</code> or <code>Union</code>.
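
The annotated signature above can be exercised with a small self-contained sketch. It is illustrative only: the <code>SearchResults</code> dataclass, its <code>matched_article_ids</code> field, the table name <code>legislation_articles</code>, and the function body are assumptions made for this example, not the project's actual implementation.

<syntaxhighlight lang="python">
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class SearchResults:
    """Hypothetical stand-in; the real SearchResults class may differ."""

    search_query: str
    matched_article_ids: list[str] = field(default_factory=list)


def search_full_text(
    search_query: str,
    table_name: str,
    result_limit: int = 20,
    date_from: date | None = None,
    date_to: date | None = None,
) -> SearchResults:
    """Illustrative body only: checks the date bounds, performs no search."""
    if date_from is not None and date_to is not None and date_from > date_to:
        raise ValueError(f"date_from {date_from} is after date_to {date_to}")
    # A real implementation would query table_name here (omitted on purpose).
    return SearchResults(search_query=search_query)


# Optional parameters default to None and can simply be left out.
empty_results = search_full_text("dol", table_name="legislation_articles")
</syntaxhighlight>

Because every parameter carries its type, a reader can tell from the signature alone that both date bounds are optional and that the function always returns a <code>SearchResults</code>.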
== Docstrings ==

Every public class and function has a docstring. Docstrings include <code>Examples</code> blocks -- AI reads examples first to understand expected behavior.

<syntaxhighlight lang="python">
def resolve_legal_reference(raw_text: str) -> list[ResolvedReference]:
    """Parse a legal citation string into structured references.

    Runs a pipeline of detectors in priority order (most specific first).
    First match wins. Returns empty list if no pattern matches.

    Args:
        raw_text: A French legal citation in natural language.

    Returns:
        List of resolved references with canonical URIs.

    Examples:
        >>> resolve_legal_reference("article 1240 du code civil")
        [ResolvedReference(uri="fr.law.code.civil.article-1240")]

        >>> resolve_legal_reference("loi n° 85-677")
        [ResolvedReference(uri="fr.law.loi.85-677")]

        >>> resolve_legal_reference("bonjour")
        []
    """
</syntaxhighlight>

== Error handling ==

Errors are explicit. Never swallowed. Never hidden behind a generic fallback.

<syntaxhighlight lang="python">
# Wrong
try:
    result = parse_article(path)
except Exception:
    result = None

# Right
try:
    result = parse_article(path)
except FileNotFoundError:
    raise ArticleNotFoundError(article_id=article_id, path=path) from None
except etree.XMLSyntaxError as error:
    raise ArticleParseError(article_id=article_id, detail=str(error)) from error
</syntaxhighlight>

Custom exception classes inherit from a common base:

<syntaxhighlight lang="python">
class DuralexError(Exception):
    """Base exception for all Dura Lex errors."""


class ArticleNotFoundError(DuralexError):
    """Raised when a legislation article file does not exist on disk."""


class ArticleParseError(DuralexError):
    """Raised when a legislation article XML file cannot be parsed."""


class ReferenceResolutionError(DuralexError):
    """Raised when a legal reference is ambiguous or malformed."""
</syntaxhighlight>

== Language ==

'''All code is in English.''' Variable names, function names, class names, docstrings, comments, error messages -- everything.

'''Content is in the jurisdiction's language.''' Concept names (<code>fr.civil.contrat.formation.consentement.vice.dol</code>), article text, legal vocabulary, court names -- these are in French (or the local language of the jurisdiction).

The boundary is clear: code structure is English, data values are local.

[[Category:Development]]
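
As a rough illustration of the English-code / local-data boundary, here is a sketch with English identifiers, docstrings, and error messages wrapped around French data values. The fields of <code>ConceptDefinition</code>, the helper <code>extract_jurisdiction_code</code>, and the assumption that the first dotted segment of a concept path is the jurisdiction are all hypothetical and not taken from the duralex packages.

<syntaxhighlight lang="python">
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptDefinition:
    """Hypothetical minimal shape; the real class may carry more fields."""

    concept_path: str   # data value, in the jurisdiction's language
    display_label: str  # data value, in the jurisdiction's language


# English identifier on the left, French data values on the right.
fraud_concept = ConceptDefinition(
    concept_path="fr.civil.contrat.formation.consentement.vice.dol",
    display_label="Dol",
)


def extract_jurisdiction_code(concept_path: str) -> str:
    """Return the first dotted segment, assumed here to be the jurisdiction."""
    if not concept_path:
        # Error messages are code, so they stay in English.
        raise ValueError("concept_path must not be empty")
    return concept_path.split(".", maxsplit=1)[0]


jurisdiction_code = extract_jurisdiction_code(fraud_concept.concept_path)
</syntaxhighlight>

Nothing in the code structure would need renaming to support another jurisdiction; only the data values would change.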