SparqlModel Technical Specification

Overview

This document specifies the SparqlModel ORM layer: session API, query compilation, hydration, cascade policy, and stores.

SparqlModel — the SQLModel of SPARQL.

Mapping (literals, terms, to_graph, sync_to_graph, from_graph, parse, serialize) is specified and implemented by TripleModel >=0.10, a required dependency that provides the Pydantic-based TripleModel classes. SparqlModel integrates TripleModel internally; application code works with SPARQLSession and the Pydantic v2 SPARQLModel base class, and reaches for TripleModel directly only for stateless file I/O.

Concern	SparqlModel	TripleModel
`SPARQLSession` CRUD	Yes	No
Query DSL + compiler	Yes	No
Cascade / orphans on `put`	Yes	No
Hydration `depth`	Yes	No
Stores	Yes	No
Model ↔ triples, terms, files	`SPARQLModel(TripleModel)` (0.4+); thin `serializers` wrappers (0.7)	Yes

ORM.md · ECOSYSTEM.md · ROADMAP.md · PRODUCTION.md

SPARQLSession

ORM entry point. Binds a Store (default MemoryStore) and namespace registry.

with SPARQLSession() as session:
    session.put(person)
    found = session.query(Person).where(Person.name == "Odos").first()

Methods

Method	Behavior
`add(model)`	Append triples; no removal of existing subject triples
`put(model, *, flush=True)`	Remove owned subjects (cascade), then write; queue when `flush=False`
`delete(model)`	Remove owned triples for root + embedded composition
`get(model_cls, iri, *, depth=0)`	Load one resource; optional relationship depth 0–2
`query(model_cls)`	Return `Query` builder
`execute(sparql)`	Raw SELECT; auto-prefixes when configured
`flush()` / `rollback_pending()`	Apply or discard pending `put` queue
`close()`	Call `store.close()` when available
`expire(model_cls, iri)`	Evict identity and hydration cache for an IRI (0.9 also drops pending `put` for that subject)
`expunge(model)`	Detach one instance from session cache (store unchanged) — 0.9
`expunge_all()`	Clear identity map and hydration cache (pending queue unchanged) — 0.9
`refresh(model, *, depth=0)`	Reload from store into cached instance when present — 0.9
`merge(model)`	Return canonical session instance for identity key (no store write) — 0.9

Context manager

On clean exit: flush() if the pending queue is non-empty. On exception: rollback_pending() when rollback_on_error=True (default). Always calls close() when close_on_exit=True (default). Does not undo already-flushed writes.
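The exit rules above can be sketched with a toy class. This is a minimal illustration, not the SPARQLSession implementation: ToySession and its list-based "store" are hypothetical, and only the rollback_on_error and close_on_exit names come from this section.

```python
class ToySession:
    """Hypothetical stand-in for SPARQLSession; models exit behaviour only."""

    def __init__(self, *, rollback_on_error=True, close_on_exit=True):
        self.rollback_on_error = rollback_on_error
        self.close_on_exit = close_on_exit
        self.pending = []   # writes queued with put(..., flush=False)
        self.written = []   # writes already applied to the store
        self.closed = False

    def put(self, item, *, flush=True):
        (self.written if flush else self.pending).append(item)

    def flush(self):
        self.written.extend(self.pending)
        self.pending.clear()

    def rollback_pending(self):
        self.pending.clear()  # already-flushed writes are untouched

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        try:
            if exc_type is None:
                if self.pending:
                    self.flush()
            elif self.rollback_on_error:
                self.rollback_pending()
        finally:
            if self.close_on_exit:
                self.close()
        return False  # never swallow the caller's exception


ok = ToySession()
with ok:
    ok.put("a", flush=False)        # flushed on clean exit

failed = ToySession()
try:
    with failed:
        failed.put("kept")                   # flushed immediately, not undone
        failed.put("dropped", flush=False)   # discarded by rollback_pending
        raise ValueError("boom")
except ValueError:
    pass
```

In the failing case the immediate write survives and only the queued write is discarded, which is the "does not undo already-flushed writes" rule.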

Properties

store — backing store
graph — triplemodel.Store (MemoryStore graph, or HttpStore local mirror — not the remote dataset)
namespaces — NamespaceRegistry for compiler and serialization

Session lifecycle (target API)

Current (0.9): The context manager flushes the pending put queue on success and calls rollback_pending on error. expire(model_cls, iri) evicts the identity and hydration cache for an IRI. merge, refresh, expunge, and expunge_all give explicit cache control (sync and async). Sessions are not thread-safe.

Target (1.2):

Method	Behavior
`merge(model)`	Attach detached/transient instance to session; reconcile with identity map
`refresh(model, *, depth=0)`	Reload from store; replace cached attributes
`expunge(model)`	Remove one instance from identity map
`expunge_all()`	Clear identity map and hydration cache
`scoped_session(...)`	Factory for request-scoped sessions (FastAPI pattern)

Object states (SQLAlchemy-aligned):

transient → (add|put) → pending (flush=False) → persistent (in store + identity map)
persistent → delete → (removed from store; expunge clears session)
persistent → expunge → detached (no session; may merge again)

Threading: One SPARQLSession per task/request unless documented otherwise; shared HttpStore requires external synchronization or single-writer discipline.

Query builder

with SPARQLSession() as session:
    session.query(Person).where(Person.name == "Odos").all()
    session.query(Person).where(Person.works_for.name == "Acme").limit(10).first()

.where(*expr) — CompareExpr, AndExpr, or top-level OrExpr
.limit(n) — non-negative integer
.offset(n) — non-negative integer (0.8)
.order_by(field, *, desc=False) — scalar field only; repeatable (0.8)
.count() — returns int; ignores limit/offset/order_by (0.8)
.first() — always uses LIMIT 1; ignores any prior .limit() or .offset() on the same query
.use_not_exists_for_ne() — compile != with NOT EXISTS (default since 0.5.2)
.use_inequality_for_ne() — legacy inequality != (pre-0.5.2 default)
.all(*, depth=0) / .first(*, depth=0) — execute and hydrate
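How the terminal methods treat limit, offset, and order_by can be sketched with a toy builder that renders only the SPARQL solution modifiers. ToyQuery and its modifiers method are hypothetical. Keeping ORDER BY for first() and rejecting bool as an integer are assumptions of this sketch, not documented behavior.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


def _non_negative_int(name, value):
    # bool is a subclass of int in Python; this toy rejects it explicitly.
    if isinstance(value, bool) or not isinstance(value, int) or value < 0:
        raise ValueError(f"{name} must be a non-negative integer, got {value!r}")
    return value


@dataclass
class ToyQuery:
    """Hypothetical builder; models solution modifiers only."""

    limit_n: Optional[int] = None
    offset_n: Optional[int] = None
    order: List[Tuple[str, bool]] = field(default_factory=list)

    def limit(self, n):
        self.limit_n = _non_negative_int("limit", n)
        return self

    def offset(self, n):
        self.offset_n = _non_negative_int("offset", n)
        return self

    def order_by(self, var, *, desc=False):
        self.order.append((var, desc))  # repeatable: keys keep call order
        return self

    def modifiers(self, mode="all"):
        """Render modifiers for mode 'all', 'first', or 'count'."""
        if mode == "count":
            return ""  # count ignores limit/offset/order_by
        parts = []
        if self.order:
            keys = " ".join(
                f"DESC(?{var})" if desc else f"ASC(?{var})"
                for var, desc in self.order
            )
            parts.append(f"ORDER BY {keys}")
        if mode == "first":
            parts.append("LIMIT 1")  # prior limit/offset are ignored
        else:
            if self.limit_n is not None:
                parts.append(f"LIMIT {self.limit_n}")
            if self.offset_n is not None:
                parts.append(f"OFFSET {self.offset_n}")
        return " ".join(parts)


q = ToyQuery().order_by("name", desc=True).limit(10).offset(20)
```

Here q.modifiers("all") keeps all three modifiers, q.modifiers("first") replaces the limit and offset with LIMIT 1, and q.modifiers("count") renders nothing.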

Query builder (target API)

Current (0.8): .offset(n), .order_by(field, *, desc=False), and .count() (which ignores limit/offset/order_by) are available. .first() always uses LIMIT 1 and ignores .limit() / .offset(). Nullable relationship hops compile to OPTIONAL; relationship.is_(None) / is_not(None) test for absence/presence. There is no distinct or field projection yet.

Target (post-1.3):

Method	SPARQL
`.distinct()`	`DISTINCT` projection (if supported)

Precedence: Python & binds tighter than |; (A & B) | C is two disjuncts (fixed 0.2).
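The precedence rule is Python's own, so it can be checked with a few toy expression classes. ToyCompare, ToyAnd, and ToyOr are hypothetical stand-ins for the compiler's expression types and carry no SPARQL logic.

```python
from dataclasses import dataclass


class ToyExpr:
    """Base class that builds a tree from the & and | operators."""

    def __and__(self, other):
        return ToyAnd(self, other)

    def __or__(self, other):
        return ToyOr(self, other)


@dataclass(frozen=True)
class ToyCompare(ToyExpr):
    label: str


@dataclass(frozen=True)
class ToyAnd(ToyExpr):
    left: ToyExpr
    right: ToyExpr


@dataclass(frozen=True)
class ToyOr(ToyExpr):
    left: ToyExpr
    right: ToyExpr


a, b, c = ToyCompare("A"), ToyCompare("B"), ToyCompare("C")

# & binds tighter than |, so both forms group the conjunction first.
left_grouped = a & b | c    # parsed as (a & b) | c
right_grouped = a | b & c   # parsed as a | (b & c)
```

Comparison operators such as == bind more loosely than both & and | in Python, so each comparison inside a combined filter needs its own parentheses, for example (Person.name == "Odos") & (Person.name != "Acme").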

SPARQL compilation

Person.name == "Odos" → SPARQL triple patterns bound to ?person.

Operator	Semantics
`==`	Pattern match
`!=`	`NOT EXISTS` by default (or `Query.use_inequality_for_ne()` for legacy inequality)
`&`	Conjoin patterns (`AndExpr` or multiple `.where`)
`|`	Disjunction via `FILTER` + `EXISTS` branches (`OrExpr`)
`<`, `>`, `<=`, `>=`	Ordering on bound literal variables
`.in_(tuple)` / `.in_(list)`	`FILTER(?var IN (...))` — bare `str` raises `QueryError` (use `("value",)` or `["value"]`)
`None`	Raises `QueryError` in comparisons; for relationships, use `is_(None)` / `is_not(None)` (0.8)
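A likely reason for rejecting a bare str in .in_ is that a Python string is itself iterable, so "abc" would silently expand to three one-character values. The sketch below shows that guard. The QueryError class, in_filter, and sparql_string here are hypothetical local definitions, and the escaping covers only the characters this toy needs.

```python
class QueryError(ValueError):
    """Local stand-in for the library's QueryError (assumed name only)."""


def sparql_string(value):
    """Render a plain string as a double-quoted SPARQL literal (toy escaping)."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def in_filter(var, values):
    """Build FILTER(?var IN (...)) from a tuple or list of strings."""
    if isinstance(values, str):
        # list("abc") would be ["a", "b", "c"], which is almost never intended.
        raise QueryError('in_ expects a tuple or list; use ("value",) or ["value"]')
    rendered = ", ".join(sparql_string(v) for v in values)
    return f"FILTER(?{var} IN ({rendered}))"
```

With this sketch, in_filter("name", ("Odos",)) yields a one-element IN list, while in_filter("name", "Odos") raises.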

Nested attribute paths (Person.works_for.located_in.name) support arbitrary hop length via join variables and related-type patterns.

Implementation: compiler.py — SparqlModel only; TripleModel does not compile Python filters.

Hydration

with SPARQLSession() as session:
    session.get(Person, iri, depth=2)
    session.query(Person).where(...).all(depth=1)

`depth`	Loads
`0`	Scalars on root
`1`	One hop of `Relationship` fields
`2`	Two hops

validate_depth rejects values outside 0–2.
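The depth table can be read as a bounded recursion. The sketch below uses a plain dict as the "store"; the hydrate function, the dict layout, and the ValueError raised by this validate_depth are assumptions for illustration. Leaving relationship fields as bare IRI strings once depth is exhausted is also an assumption of the toy.

```python
MAX_DEPTH = 2


def validate_depth(depth):
    """Reject anything that is not an integer in 0..MAX_DEPTH."""
    if isinstance(depth, bool) or not isinstance(depth, int):
        raise ValueError(f"depth must be an integer, got {depth!r}")
    if not 0 <= depth <= MAX_DEPTH:
        raise ValueError(f"depth must be between 0 and {MAX_DEPTH}, got {depth}")
    return depth


def hydrate(store, iri, depth=0):
    """Load scalars for iri and follow relationships for `depth` hops."""
    validate_depth(depth)
    record = store[iri]
    loaded = {"id": iri, **record["scalars"]}
    for name, target_iri in record["relationships"].items():
        if depth > 0:
            loaded[name] = hydrate(store, target_iri, depth - 1)
        else:
            loaded[name] = target_iri  # not loaded: keep the reference only
    return loaded


store = {
    "urn:p": {"scalars": {"name": "Odos"}, "relationships": {"works_for": "urn:o"}},
    "urn:o": {"scalars": {"name": "Acme"}, "relationships": {"located_in": "urn:c"}},
    "urn:c": {"scalars": {"name": "Lagos"}, "relationships": {}},
}
```

At depth=1 the organization is loaded but its located_in stays a reference; depth=2 loads the place as well; depth=3 is rejected before any loading happens.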

Integration note (0.3.x): scalar and relationship loading uses sparql_from_graph → TripleModel from_graph via interim _triple.py. 0.4+: SPARQLModel.from_graph on the unified subclass + SparqlModel depth hydration.

SPARQLModel

ORM entity base class. SQLModel-style declaration:

class Person(SPARQLModel):
    rdf_type = "schema:Person"
    __prefixes__ = {"schema": "https://schema.org/"}

    id: IRI
    name: str = Field("schema:name")

Metaclass enables Person.name == "x" in queries (FieldRef)
ensure_id() assigns urn:uuid:… when id is unset
JSON-LD helpers: model_dump_jsonld / model_validate_jsonld (ORM dict API; file JSON-LD via serialize — 0.7)
Subclasses TripleModel (Option A, 0.4+); merged metaclass for query FieldRef
model_config uses extra="forbid"
Field / Relationship are ORM sugar over rdf_field / Predicate (built at class creation, no exec)

Interim (0.3.x): dynamic shadow TripleModel classes via sparqlmodel._triple — removed in 0.4.

See also Models and Pydantic validation for application patterns.

Validation architecture

Three layers; all are complementary, not interchangeable.

Layer	When	Mechanism
Application (Pydantic)	`SPARQLModel(...)` / `model_validate`	Field types, `Field` constraints, `extra="forbid"`
Mapping (TripleModel)	`from_graph(..., validate_type=True)`	Expected `rdf:type` on subject; literal coercion per field
Graph shapes (optional)	`put` — 0.14	SHACL via `triplemodel[shacl]`; after Pydantic passes

Write path (0.4+): validated SPARQLModel → cascade in graph.py → sync_to_graph(model, store.graph, …).

Write path (0.3.x interim): validated SPARQLModel → to_triplemodel → TripleModel.model_validate → sync_to_graph.

Read path (0.4+): graph → SPARQLModel.from_graph → optional depth hydration → identity map; Pydantic ValidationError surfaced as HydrationError.

Planning rule: new ORM features should extend Pydantic annotations and Field kwargs before adding ad-hoc validation in session or compiler code. See SparqlModel Roadmap (Pydantic-first).

Relationships

works_for: Organization | None = Relationship("schema:worksFor", model=Organization)

Value type	Semantics
Embedded `SPARQLModel`	Composition — cascade on `put`/`delete`
`IRI`	Reference — no cascade delete of target
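The composition/reference split decides which subjects a cascade touches. A minimal sketch with plain dataclasses follows; ToyPerson, ToyOrg, and cascade_subjects are hypothetical, and the toy does not guard against cyclic embeds.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Optional, Union


@dataclass
class ToyOrg:
    id: str
    name: str


@dataclass
class ToyPerson:
    id: str
    name: str
    # Either an embedded model (composition) or a bare IRI string (reference).
    works_for: Optional[Union[ToyOrg, str]] = None


def cascade_subjects(model):
    """Collect the root subject plus every embedded model's subject."""
    subjects = [model.id]
    for f in fields(model):
        value = getattr(model, f.name)
        if is_dataclass(value):  # embedded model: part of the cascade
            subjects.extend(cascade_subjects(value))
        # A plain IRI string is a reference, so its target is left alone.
    return subjects


embedded = ToyPerson("urn:p", "Odos", works_for=ToyOrg("urn:o", "Acme"))
referenced = ToyPerson("urn:p", "Odos", works_for="urn:o")
```

The embedded variant cascades to both subjects; the referenced variant cascades to the person only.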

Relationships and hydration (target API)

Current (0.2): Single object per predicate on load; depth 0–2 eager-loads Relationship fields; composition cascade on put/delete.

Target (0.13):

list[T] / collection fields for multi-valued literals and IRIs (via TripleModel) — 0.13
Language-tagged fields (LangString, multi-lang maps) — 0.13 (TripleModel)
Polymorphic session.query(Base).where(...) matching subclasses — 0.13
Property paths, VALUES clause, IRI string filters, query negation — 0.13 (parity backlog)
Compiler emits OPTIONAL for nullable relationship paths in filters (already shipped in 0.8.0)
Optional Relationship(..., back_populates=...) / inverse navigation — 0.13

Persistence policy

SparqlModel-specific; orchestrates which subjects TripleModel (or interim graph.py) syncs.

`put`

Compute cascade_subjects_for_removal (root, nested embeds, orphans on relationship change)
Remove owned_triples_for_subjects from store graph
Add current model graph (model_to_graph → future: TripleModel export + cascade)

`delete`

Remove owned triples for cascade subject set (no re-add).

Ownership rules

Only declared predicates + rdf:type are owned
Extension triples on a subject are not removed by put/delete
Orphan keys use expanded IRIs and stable _:bnode keys
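The ownership rules explain why put can replace a subject without destroying triples that other tools attached to it. The sketch below models the graph as a set of (subject, predicate, object) tuples. The function names owned_triples and toy_put are illustrative and do not reproduce the signatures in graph.py.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def owned_triples(graph, subjects, declared_predicates):
    """Triples on the given subjects whose predicate is declared or rdf:type."""
    owned = set(declared_predicates) | {RDF_TYPE}
    return {t for t in graph if t[0] in subjects and t[1] in owned}


def toy_put(graph, subjects, declared_predicates, new_triples):
    """Remove owned triples for the cascade subjects, then add the new state."""
    graph.difference_update(owned_triples(graph, subjects, declared_predicates))
    graph.update(new_triples)


NAME = "https://schema.org/name"
NOTE = "https://example.org/note"  # extension predicate, not declared on the model

graph = {
    ("urn:p", RDF_TYPE, "https://schema.org/Person"),
    ("urn:p", NAME, "Odo"),
    ("urn:p", NOTE, "imported 2024"),
}

toy_put(
    graph,
    subjects={"urn:p"},
    declared_predicates={NAME},
    new_triples={
        ("urn:p", RDF_TYPE, "https://schema.org/Person"),
        ("urn:p", NAME, "Odos"),
    },
)
```

After the call the stale name is gone, the new name is present, and the extension triple is untouched.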

Mapping integration (TripleModel)

Dependencies (0.5+): triplemodel>=0.10.0,<2, pyoxigraph>=0.5,<0.6 in pyproject.toml (no core rdflib).

Today (0.7+): SPARQLModel(TripleModel); session graphs are triplemodel.Store; graph.py holds cascade/orphan policy; rdf_bridge owns graph I/O. serializers.py is thin wrappers over TripleModel infer_format, load_graph, and serialize.

Target wiring (0.4+):

SparqlModel surface	TripleModel API
`put` graph write	`sync_to_graph(model, graph, mode=...)` + cascade (same instance type)
`get` / query load	`SPARQLModel.from_graph` + depth hydration
`export_model`	`to_graph().serialize(...)` or `serialize()`
Predicate metadata	`rdf_field`, `Predicate`, nested `class Rdf`

Cascade orchestration remains in SparqlModel after wiring.

HttpStore

SPARQL 1.1 over HTTP (httpx) with a local mirror (stores/http.py).

Method	Target
`update_graph`	Remote `INSERT DATA` / `DELETE DATA`, then mirror delta on success
`query` / `execute` (via session)	Remote SELECT
`graph`, `get`, cascade/orphan	Mirror only

Because get reads the mirror while execute reads the remote endpoint, data written by external writers (or otherwise visible only to remote SELECT, without a matching mirror update) can make get return None while execute returns bindings. Single-writer per endpoint is assumed. If both auth and bearer_token are set, Basic auth wins.

put may send DELETE DATA followed by INSERT DATA in one SPARQL Update request; whether that is atomic depends on the endpoint (not guaranteed in 0.2). After HttpStore.close(), query, update_graph, and pull_subjects_into_mirror raise RuntimeError (same for AsyncHttpStore.aclose()).
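SPARQL 1.1 Update allows several operations in one request, separated by a semicolon, which is how a DELETE DATA and an INSERT DATA can travel together. A minimal sketch of assembling such a request body follows. build_update is a hypothetical helper and the triples are assumed to be pre-serialized N-Triples-style lines. It says nothing about how HttpStore builds its requests, and the endpoint still decides whether the pair is applied atomically.

```python
def build_update(remove, add):
    """Join DELETE DATA and INSERT DATA into one SPARQL Update request body."""
    operations = []
    if remove:
        operations.append("DELETE DATA {\n" + "\n".join(remove) + "\n}")
    if add:
        operations.append("INSERT DATA {\n" + "\n".join(add) + "\n}")
    # Operations in a single update request are separated by ';'.
    return " ;\n".join(operations)


old = '<urn:p> <https://schema.org/name> "Odo" .'
new = '<urn:p> <https://schema.org/name> "Odos" .'
body = build_update([old], [new])
```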

HTTP resilience (0.11+)

Constructor kwargs on HttpStore / AsyncHttpStore (sync + async parity):

Parameter	Default	Behavior
`max_retries`	`2`	Retry HTTP 502/503/504 responses, connection errors, and timeouts on SELECT, CONSTRUCT pull, and each UPDATE chunk
`retry_backoff`	`0.5`	Exponential backoff between attempts (cap 30s)
`max_triples_per_update`	`500`	Split `update_graph` into multiple `INSERT DATA` / `DELETE DATA` requests
`query_method`	`"post"`	`"get"` for remote SELECT only (CONSTRUCT stays POST)

Mirror updates run only after all remote UPDATE chunks succeed. Mid-batch remote failure leaves the mirror unchanged; remote state may be partial.
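The interaction of chunking, retries, and the mirror can be sketched without any HTTP. Everything below is a toy under stated assumptions: TransientError stands in for a retryable status or connection/timeout error, the delay formula retry_backoff * 2**attempt is a guess at "exponential backoff" (only the 30 s cap and the defaults come from the table), and max_retries is read as retries after the first attempt.

```python
import time


class TransientError(Exception):
    """Toy stand-in for a retryable HTTP status or connection/timeout error."""


def chunked(items, size):
    """Split items into consecutive lists of at most `size` elements."""
    items = list(items)
    return [items[i:i + size] for i in range(0, len(items), size)]


def backoff_delay(attempt, retry_backoff=0.5, cap=30.0):
    """Assumed schedule: base * 2**attempt, capped at `cap` seconds."""
    return min(cap, retry_backoff * (2 ** attempt))


def send_with_retry(send, chunk, *, max_retries=2, retry_backoff=0.5,
                    sleep=time.sleep):
    """Call send(chunk), retrying transient failures up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return send(chunk)
        except TransientError:
            if attempt == max_retries:
                raise
            sleep(backoff_delay(attempt, retry_backoff))


def update_graph(send, mirror, add, *, max_triples_per_update=500, **retry):
    """Send every chunk remotely; touch the mirror only if all succeed."""
    add = list(add)
    for chunk in chunked(add, max_triples_per_update):
        send_with_retry(send, chunk, **retry)  # an exception aborts the batch
    mirror.update(add)  # reached only after all remote chunks succeeded
```

If the second of three chunks fails permanently, the remote endpoint holds the first chunk while the mirror holds none of the batch, which is the partial-remote-state case described above.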

Store protocol (target API)

Current (0.2): Store — graph, query(sparql), update_graph(add=, remove=).

Target (Production HttpStore 0.10–0.12):

Capability	Notes
`query`	SPARQL 1.1 SELECT (required)
`update`	Chunked `INSERT DATA` / `DELETE DATA` — shipped 0.11.0 (atomic multi-op sequences still endpoint-dependent)
`query_method`	GET vs POST for remote SELECT — shipped 0.11.0
`ask` / `construct`	Optional protocol methods for existence and graph-shaped reads — 0.14 (P2)
HttpStore `read_endpoint` / `write_endpoint`	Fuseki-style split URLs — 0.9.1 (shipped)
Replace-on-pull, `mirror_mode`	Shipped 0.10.0
Mirror sync (GSP `sync_mirror`)	Shipped 0.12.0 (`graph_store_url`, `sync_mirror`)
Retries, batch size limits	Shipped 0.11.0 (`max_retries`, `max_triples_per_update`)
`OxigraphStore` / embedded backends	Optional — 0.14+

Protocols: SPARQL 1.1 Query, SPARQL 1.1 Update, Graph Store HTTP.

Security (SPARQL generation)

Current (0.5+): Filter values serialized via SparqlModel N3 helpers (rdf_n3) on pyoxigraph terms and string IRIs. IRIs with invalid characters raise QueryError. Predicates come from model metadata (trusted code).
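Rejecting IRIs with invalid characters matters because an IRI is written between angle brackets in SPARQL text, and a stray > would end the term early. The sketch below applies the character restrictions of the SPARQL grammar's IRIREF production (no control characters or space, and none of <, >, ", {, }, |, ^, backtick, or backslash). The iri_to_n3 function and the local QueryError class are hypothetical; the behavior of the real rdf_n3 helpers is not shown in this document.

```python
import re


class QueryError(ValueError):
    """Local stand-in for the library's QueryError (assumed name only)."""


# Characters excluded by the SPARQL IRIREF production.
_IRIREF_BODY = re.compile(r'^[^\x00-\x20<>"{}|^`\\]*$')


def iri_to_n3(iri):
    """Wrap a string IRI in angle brackets, refusing unsafe characters."""
    if not isinstance(iri, str) or not _IRIREF_BODY.match(iri):
        raise QueryError(f"invalid IRI for SPARQL serialization: {iri!r}")
    return f"<{iri}>"


safe = iri_to_n3("https://schema.org/Person")
hostile = "http://x/> . DROP ALL ; <http://y"
```

Passing hostile to iri_to_n3 raises instead of producing text that would close the IRI and inject a second statement.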

Target (1.3 GA):

No public API that concatenates untrusted strings into SPARQL text
Predicates and class IRIs remain declaration-time only
LIMIT / OFFSET remain integer-typed at API boundary
Security review documented before 1.3 GA

Async API (target 0.6)

Parallel to the sync stack; sync API remains supported.

Component	Sync (shipped)	Async (0.6.0)
Session	`SPARQLSession`	`AsyncSPARQLSession`
Store	`Store`, `MemoryStore`, `HttpStore`	`AsyncStore`, `AsyncMemoryStore`, `AsyncHttpStore`
Query	`Query.all()` / `first()`	`AsyncQuery` — `await .all()` / `.first()`
FastAPI	`SessionDep`, sync `get_session`	`AsyncSessionDep`, async `get_async_session`

Semantics: Same identity map, cascade, compiler, and hydration rules as sync. One session per asyncio task (not shared across concurrent tasks). HttpStore uses httpx.Client; AsyncHttpStore uses httpx.AsyncClient with the same mirror contract.

Non-goals for 0.6: Replacing sync session; async TripleModel mapping APIs (unified model stays sync; in-memory graph work stays on the event loop thread).

Known limitations

Until 0.4 (unified model)

Area	Behavior
Dual model types	0.3 uses interim `_triple.py` dynamic adapter; 0.4 unifies on `SPARQLModel(TripleModel)`

HttpStore mirror (0.12+)

Area	Behavior
Mirror vs remote	`get` / cascade use mirror; `query` uses remote
External writers	Use `pull_subjects_into_mirror`, `mirror_mode="remote_authoritative"`, or `sync_mirror()` (GSP GET; requires `graph_store_url`)
Multi-writer endpoints	Assume single writer per endpoint; reconcile with `sync_mirror()` after bulk external changes

Permanent constraints

Area	Behavior
Composition vs reference	Embedded `SPARQLModel` cascades; `IRI` references do not
Owned triples	Only declared predicates + `rdf:type` removed on `put`/`delete`
`add` vs `put`	`add` does not remove stale triples
`put(..., flush=False)`	Pending models not visible in `get` until flush
`flush()`	Not a full remote transaction; partial failure re-queues remainder (0.2+)
Sessions	Not thread-safe; one session per task unless scoped externally
Closed session	After `close()`, all CRUD/query methods raise `RuntimeError`; share the store via a new session
Interim mapping	0.3.0: `_triple.py` adapter; 0.4 Option A removes it; `serializers.py` thin wrappers since 0.7

Other (current)

Area	Behavior
Duplicate predicates	Two fields with the same expanded predicate on one model class → `ConfigurationError` at class definition
Write-path cycles	Cyclic embedded `SPARQLModel` graphs → `ConfigurationError` on `put` / `model_to_graph`
Shared composition	Orphan cleanup skips embedded targets still linked from subjects outside the current put cascade
Pending `put`	Identity for that subject evicted when queued; `close()` with pending writes raises `RuntimeError`
Nested query filters	Related resource must have expected `rdf:type`
AND filters, same path	Compiler reuses join variables per relationship path within one WHERE / EXISTS block
JSON-LD	`model_dump_jsonld` vs `export_model(..., "json-ld")` differ; non-cascade embeds omitted from `model_to_jsonld`
Export without `id`	`ensure_id()` may assign `urn:uuid:…`

Optional: export and files

ORM workflows do not require sparqlmodel.serializers.

Long term: all formats via TripleModel; SparqlModel may expose session-scoped helpers only.

FastAPI (optional extra)

Install sparqlmodel[fastapi]. fastapi/deps.py provides init_app, get_session, SessionDep, http_store_lifespan. fastapi/__init__.py provides turtle_response, jsonld_response, negotiated_response.

Feature ownership

Feature	Owner
SHACL shapes / validation engine	TripleModel `[shacl]`
SHACL on `session.put`	SparqlModel hook calling TripleModel
Named graphs / Dataset	TripleModel; SparqlModel consumes
SPARQL federation in apps	SparqlModel
Alternate store backends	SparqlModel `stores/`
OWL reasoner	Out of scope

Maintainer boundaries

For end users, use ORM.md. This table is for contributors.

Symptom	Fix in
Wrong XSD / literal on export	TripleModel
Subject IRI collision	TripleModel
Stale predicate after `put`	TripleModel sync + SparqlModel cascade
Orphan after relationship change	SparqlModel `graph.py`
`!=` / nested filter wrong	SparqlModel `compiler.py`
New RDF format	TripleModel
Fuseki / HTTP store	SparqlModel `stores/`

Anti-patterns: adding new mapping code to graph.py (which holds cascade/orphan policy only); putting session or compiler logic in TripleModel; having triplemodel import sparqlmodel.

Package layout

sparqlmodel/
  session.py       # ORM unit of work
  query.py         # query builder
  compiler.py      # ORM-only
  hydration.py     # depth; → TripleModel load
  model.py         # SPARQLModel(TripleModel) — 0.4+
  fields.py        # Field/Relationship sugar → rdf_field / Predicate
  graph.py         # cascade/orphan policy only
  serializers.py   # thin TripleModel parse/serialize wrappers (0.7)
  stores/
  rdf_bridge.py    # graph I/O (Option A; replaced _triple.py in 0.4)

Dependencies

pydantic>=2.5,<3
pyoxigraph>=0.5,<0.6
triplemodel>=0.10.0,<2
typing-extensions>=4.8

Optional: httpx, fastapi

Related projects

Project	Role
TripleModel	Required mapping engine
Pyoxigraph / TripleModel	In-process graphs and SPARQL execution (`Store`)
semantic-sqlmodel	Optional future backend
