Real-world examples
These examples use real vocabularies, public datasets, and typical integration problems rather than synthetic http://example.org/ toy data. They are adapted from the TripleModel real-world suite to show SparqlModel patterns: load bundled Turtle into a MemoryStore, then use SPARQLSession to run query, get, and execute.
Source tree: examples/realworld/ (scripts below are included from that directory at doc build time).
Overview
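| Example | Script | Data |
|---|---|---|
| Nobel Prize linked data (1901) | nobel_laureates.py | nobel_laureates_1901.ttl |
| DCAT open data catalog | dcat_data_catalog.py | dcat_nobel_catalog.ttl |
| Wikidata capital cities | wikidata_capitals.py | wikidata_capitals.ttl |
| Schema.org NGO registry | schema_org_ngos.py | schema_org_ngos.ttl |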
Provenance and licenses: DATA_SOURCES.md.
Run locally
pip install sparqlmodel
From the SparqlModel repository root:
PYTHONPATH=src python examples/realworld/nobel_laureates.py
PYTHONPATH=src python examples/realworld/dcat_data_catalog.py
PYTHONPATH=src python examples/realworld/wikidata_capitals.py
PYTHONPATH=src python examples/realworld/schema_org_ngos.py
Load bundled Turtle
Each script opens data with the public API :meth:~sparqlmodel.session.SPARQLSession.from_rdf_file (in-memory :class:~sparqlmodel.stores.memory.MemoryStore):
from pathlib import Path

from sparqlmodel import SPARQLSession

DATA_DIR = Path(__file__).resolve().parent / "data"

with SPARQLSession.from_rdf_file(
    DATA_DIR / "nobel_laureates_1901.ttl",
    prefixes=PREFIXES,
) as session:
    ...
Pass a :class:~pathlib.Path (or a path string), not the file contents: TripleModel treats long strings as path-like sources, so Turtle text passed directly would be interpreted as a path.
For production, swap the default in-memory store for HttpStore (see SparqlModel production guide) and keep the same session API.
Nobel Prize linked data (1901)
Problem: Cultural heritage and science datasets publish stable URIs and a shared ontology; you need typed models and filters over an existing graph.
Data: nobel_laureates_1901.ttl — excerpt aligned with Nobel Prize linked data examples.
#!/usr/bin/env python3
"""Nobel Prize linked data (1901): query laureates with :class:`~sparqlmodel.session.SPARQLSession`.

Problem: integrate biographical linked open data where resources already have
stable URIs and a published ontology (common in cultural heritage and science).

Data: ``examples/realworld/data/nobel_laureates_1901.ttl``
Source: https://www.nobelprize.org/about/linked-data-examples/
"""

from __future__ import annotations

from pathlib import Path

from sparqlmodel import IRI, Field, SPARQLModel, SPARQLSession

DATA_DIR = Path(__file__).resolve().parent / "data"
NOBEL = "http://data.nobelprize.org/terms/"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
PREFIXES = {
    "nobel": NOBEL,
    "rdfs": RDFS,
    "foaf": "http://xmlns.com/foaf/0.1/",
}


class Laureate(SPARQLModel):
    """Person or organisation receiving a Nobel Prize (``nobel:Laureate``)."""

    rdf_type = "nobel:Laureate"
    __prefixes__ = PREFIXES

    id: IRI
    name: str = Field("rdfs:label")
    gender: str | None = Field("foaf:gender", default=None)


class NobelPrize(SPARQLModel):
    """Award instance for a category and year (``nobel:NobelPrize``)."""

    rdf_type = "nobel:NobelPrize"
    __prefixes__ = PREFIXES

    id: IRI
    title: str = Field("rdfs:label")
    year: str = Field("nobel:year")


def main() -> None:
    with SPARQLSession.from_rdf_file(
        DATA_DIR / "nobel_laureates_1901.ttl", prefixes=PREFIXES
    ) as session:
        laureates = session.query(Laureate).all()
        prizes = session.query(NobelPrize).all()
        print(
            f"Loaded {len(laureates)} laureates and {len(prizes)} prizes from 1901 excerpt"
        )
        for person in sorted(laureates, key=lambda m: m.name):
            print(f" {person.name} ({person.gender})")

        roentgen = next(p for p in laureates if "Röntgen" in p.name)
        physics = session.query(NobelPrize).where(NobelPrize.year == "1901").all()
        physics_1901 = next(p for p in physics if "Physics" in p.title)
        assert physics_1901.year == "1901"

        male_laureates = session.query(Laureate).where(Laureate.gender == "male").all()
        assert roentgen in male_laureates

        loaded = session.get(Laureate, roentgen.id)
        assert loaded is not None and loaded.name == roentgen.name
        print("Round-trip OK for Wilhelm Conrad Röntgen")


# Example output:
# Loaded 6 laureates and 5 prizes from 1901 excerpt
#  Emil Adolf von Behring (male)
#  Frédéric Passy (male)
#  Jacobus Henricus van 't Hoff (male)
#  Jean Henry Dunant (male)
#  Sully Prudhomme (male)
#  Wilhelm Conrad Röntgen (male)
# Round-trip OK for Wilhelm Conrad Röntgen
Note
rdfs:label values in the bundle include language tags (@en). Equality filters on name must match the stored literal form; this example filters on gender and uses session.get by IRI for round-trip checks.
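The mismatch described in the note can be illustrated without any RDF library. The sketch below models an RDF literal as a lexical form plus an optional language tag; the `Literal` dataclass is a hypothetical stand-in, not a SparqlModel or TripleModel type. It shows that a plain string and its `@en` counterpart are different terms, so an equality filter built from the plain string would not match the tagged value.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for an RDF literal: lexical form plus optional language tag."""

    lexical: str
    lang: str | None = None


# Stored form in the bundle (language-tagged) versus what a naive filter would send.
stored = Literal("Wilhelm Conrad Röntgen", lang="en")
plain = Literal("Wilhelm Conrad Röntgen")

# RDF term equality compares the language tag as well as the lexical form.
assert stored != plain
assert stored == Literal("Wilhelm Conrad Röntgen", lang="en")

# Comparing lexical forms only (as the script does with `"Röntgen" in p.name`
# on hydrated Python strings) sidesteps the tag.
assert stored.lexical == plain.lexical
```

This is why the script locates Röntgen with a substring test on the hydrated name and then round-trips by IRI, rather than filtering on `name` in the query.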
DCAT open data catalog
Problem: Governments and EU portals publish DCAT metadata so users can discover datasets and SPARQL endpoints before downloading data.
Data: dcat_nobel_catalog.ttl.
Use IRI for object fields that are resources in the graph (e.g. dcat:accessURL). Multi-valued dcat:keyword in the bundle hydrates as the first value only (see Troubleshooting).
#!/usr/bin/env python3
"""DCAT data catalog: discover datasets and SPARQL endpoints with the query DSL.

Problem: governments and EU institutions publish metadata as DCAT/DCAT-AP so
users can find datasets and SPARQL/HTTP distributions before downloading data.

Data: ``examples/realworld/data/dcat_nobel_catalog.ttl``
"""

from __future__ import annotations

from pathlib import Path

from sparqlmodel import IRI, Field, SPARQLModel, SPARQLSession

DATA_DIR = Path(__file__).resolve().parent / "data"
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"
PREFIXES = {"dcat": DCAT, "dct": DCT}


class DataCatalog(SPARQLModel):
    rdf_type = "dcat:Catalog"
    __prefixes__ = PREFIXES

    id: IRI
    title: str = Field("dct:title")
    description: str | None = Field("dct:description", default=None)


class Dataset(SPARQLModel):
    rdf_type = "dcat:Dataset"
    __prefixes__ = PREFIXES

    id: IRI
    title: str = Field("dct:title")
    description: str | None = Field("dct:description", default=None)
    keyword: str | None = Field("dcat:keyword", default=None)


class Distribution(SPARQLModel):
    rdf_type = "dcat:Distribution"
    __prefixes__ = PREFIXES

    id: IRI
    title: str = Field("dct:title")
    access_url: IRI = Field("dcat:accessURL")


def main() -> None:
    with SPARQLSession.from_rdf_file(
        DATA_DIR / "dcat_nobel_catalog.ttl", prefixes=PREFIXES
    ) as session:
        catalogs = session.query(DataCatalog).all()
        datasets = session.query(Dataset).all()
        distributions = session.query(Distribution).all()

        print(f"Catalog: {catalogs[0].title}")
        for ds in datasets:
            print(f" Dataset: {ds.title}")
            if ds.keyword:
                print(f" Keyword: {ds.keyword}")
        for dist in distributions:
            print(f" Distribution: {dist.title}")
            print(f" accessURL: {dist.access_url}")

        sparql_dist = session.query(Distribution).where(
            Distribution.access_url == IRI("http://data.nobelprize.org/sparql")
        ).first()
        assert sparql_dist is not None
        assert "Nobel prize" in (datasets[0].keyword or "")
        print("DCAT catalog query OK")


# Example output:
# Catalog: Nobel Media Dataset catalog
#  Dataset: Linked Nobel prizes
#  Keyword: Nobel prize
#  Distribution: Nobel Prize SPARQL endpoint
#  accessURL: http://data.nobelprize.org/sparql
# DCAT catalog query OK
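Because a multi-valued dcat:keyword hydrates as a single value, one way to recover every keyword is to select (dataset, keyword) pairs with raw SPARQL and group them in Python. The helper below is a sketch of the grouping step only. `group_values` is a hypothetical name, and the sample rows are illustrative placeholders for what a SELECT might return; they assume rows can be indexed by variable name, as the `row["city"]` access in the Wikidata script suggests.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable, Mapping


def group_values(
    rows: Iterable[Mapping[str, str]], subject_var: str, value_var: str
) -> dict[str, list[str]]:
    """Group SELECT rows into {subject: [value, ...]}, keeping row order."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for row in rows:
        grouped[str(row[subject_var])].append(str(row[value_var]))
    return dict(grouped)


# Illustrative rows for a query such as:
#   SELECT ?dataset ?keyword WHERE { ?dataset dcat:keyword ?keyword }
# The IRIs and the second and third keywords are placeholders, not bundle contents.
sample_rows = [
    {"dataset": "http://example.org/dataset/1", "keyword": "Nobel prize"},
    {"dataset": "http://example.org/dataset/1", "keyword": "Linked data"},
    {"dataset": "http://example.org/dataset/2", "keyword": "Laureates"},
]

keywords = group_values(sample_rows, "dataset", "keyword")
print(keywords["http://example.org/dataset/1"])  # ['Nobel prize', 'Linked data']
```

Grouping on the Python side keeps the model definition unchanged; whether a list-typed field would hydrate all values is not covered here.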
Wikidata capital cities
Problem: Wikidata (and similar KGs) often assert types with wdt:P31 rather than rdf:type, so the default session.query type pattern (?s a <Class>) may not match.
Data: wikidata_capitals.ttl — Paris and London with population and country (CC0).
Approach: session.execute with Wikidata property patterns, then from_graph(..., validate_type=False). session.execute on MemoryStore supports SELECT (not ASK).
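To make the mismatch concrete, the sketch below runs both type patterns over a tiny hand-written triple list in plain Python. The `match_type` helper is hypothetical and is no substitute for a SPARQL engine. The two triples assume the bundle types its items through wdt:P31 with no rdf:type triple, which is inferred from the problem statement rather than checked against the file.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
WDT_P31 = "http://www.wikidata.org/prop/direct/P31"
WD = "http://www.wikidata.org/entity/"

# Class IRI that the CapitalCity model in the script uses as its rdf_type.
CITY_CLASS = WD + "Q174844"

# Hand-written triples in the Wikidata style: typing is asserted via wdt:P31 only.
triples = [
    (WD + "Q90", WDT_P31, CITY_CLASS),  # Paris
    (WD + "Q84", WDT_P31, CITY_CLASS),  # London
]


def match_type(triples, predicate, cls):
    """Return subjects matching the pattern `?s <predicate> <cls>`."""
    return [s for s, p, o in triples if p == predicate and o == cls]


# The default pattern `?s a <Class>` looks for rdf:type and finds nothing.
by_rdf_type = match_type(triples, RDF_TYPE, CITY_CLASS)
# The Wikidata pattern `?s wdt:P31 <Class>` finds both cities.
by_p31 = match_type(triples, WDT_P31, CITY_CLASS)

print(by_rdf_type)  # []
print(len(by_p31))  # 2
```

The same reasoning explains `validate_type=False`: a type check based on rdf:type would presumably reject resources that are only typed through wdt:P31.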
#!/usr/bin/env python3
"""Wikidata capital cities: session execute + graph load (P31, not only rdf:type).

Problem: knowledge-graph pipelines (Wikidata) often use property assertions
(``wdt:P31``) instead of ``rdf:type``; combine raw SPARQL with ``from_graph``.

Data: ``examples/realworld/data/wikidata_capitals.ttl``
Source: Wikidata Q90, Q84 (CC0 1.0)
"""

from __future__ import annotations

from pathlib import Path

from sparqlmodel import IRI, Field, SPARQLModel, SPARQLSession

DATA_DIR = Path(__file__).resolve().parent / "data"
WD = "http://www.wikidata.org/entity/"
WIKIDATA_PREFIXES = {
    "wd": WD,
    "wdt": "http://www.wikidata.org/prop/direct/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


class Country(SPARQLModel):
    """Wikidata item with an English label (e.g. France, United Kingdom)."""

    rdf_type = "wd:Q6256"
    __prefixes__ = WIKIDATA_PREFIXES

    id: IRI
    label_en: str | None = Field("rdfs:label", default=None)


class CapitalCity(SPARQLModel):
    """Capital city facts: label, population, country item IRI."""

    rdf_type = "wd:Q174844"
    __prefixes__ = WIKIDATA_PREFIXES

    id: IRI
    label_en: str | None = Field("rdfs:label", default=None)
    population: int = Field("wdt:P1082")
    country: IRI = Field("wdt:P17")


def main() -> None:
    with SPARQLSession.from_rdf_file(
        DATA_DIR / "wikidata_capitals.ttl", prefixes=WIKIDATA_PREFIXES
    ) as session:
        large_cities = session.execute(
            """
            PREFIX wdt: <http://www.wikidata.org/prop/direct/>
            SELECT ?city WHERE {
                wd:Q90 wdt:P1082 ?pop .
                FILTER(?pop > 2000000)
                BIND(wd:Q90 AS ?city)
            }
            """
        )
        assert len(large_cities) == 1

        capital_rows = session.execute(
            """
            PREFIX wdt: <http://www.wikidata.org/prop/direct/>
            SELECT ?city ?pop WHERE {
                ?city wdt:P31 wd:Q174844 ; wdt:P1082 ?pop .
            }
            ORDER BY DESC(?pop)
            """
        )
        cities: list[CapitalCity] = []
        for row in capital_rows:
            city = CapitalCity.from_graph(
                session.graph,
                row["city"],
                validate_type=False,
            )
            cities.append(city)

        print("European capitals (Wikidata excerpt):")
        for city in cities:
            country = Country.from_graph(
                session.graph,
                str(city.country),
                validate_type=False,
            )
            print(
                f" {city.label_en}: population={city.population:,} "
                f"country={country.label_en} ({city.country})"
            )

        paris = next(c for c in cities if str(c.id).endswith("Q90"))
        assert paris.label_en == "Paris"
        assert paris.population == 2_103_778
        france = Country.from_graph(session.graph, str(paris.country), validate_type=False)
        assert france.label_en == "France"
        print("Paris load OK (country link via wdt:P17)")


# Example output:
# European capitals (Wikidata excerpt):
#  London: population=8,799,728 country=United Kingdom (http://www.wikidata.org/entity/Q145)
#  Paris: population=2,103,778 country=France (http://www.wikidata.org/entity/Q142)
# Paris load OK (country link via wdt:P17)
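Since session.execute on MemoryStore handles SELECT but not ASK, an existence check can be phrased as a SELECT with LIMIT 1. The helper below sketches that idea. `exists` is a hypothetical name, and it assumes only that the callable passed in accepts a query string and returns a sized sequence of rows, as the `len(large_cities)` call in the script suggests. The fake executor stands in for a real session so the sketch runs on its own.

```python
from typing import Callable, Sequence


def exists(execute: Callable[[str], Sequence], pattern: str, prologue: str = "") -> bool:
    """Emulate `ASK { pattern }` with `SELECT ... LIMIT 1` and a row-count check."""
    query = f"{prologue}\nSELECT * WHERE {{ {pattern} }} LIMIT 1"
    return len(execute(query)) > 0


# Stand-in executor for demonstration: records the query and returns one fake row.
seen: list = []


def fake_execute(query: str) -> list:
    seen.append(query)
    return [{"city": "http://www.wikidata.org/entity/Q90"}]


found = exists(
    fake_execute,
    "?city wdt:P31 wd:Q174844",
    prologue="PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
    "PREFIX wd: <http://www.wikidata.org/entity/>",
)
print(found)  # True
```

With a real session, passing `session.execute` as the first argument should behave the same way, provided its return value supports `len()`.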
Schema.org NGO registry
Problem: Transparency and search pipelines expose schema:NGO records; you want Pydantic validation and session APIs over that graph.
Data: schema_org_ngos.ttl.
#!/usr/bin/env python3
"""Schema.org NGOs: nonprofit registry records via session query and get.

Problem: transparency portals publish organization metadata with schema.org;
map it into Pydantic models for validation and filter with the ORM query DSL.

Data: ``examples/realworld/data/schema_org_ngos.ttl``
"""

from __future__ import annotations

from pathlib import Path

from sparqlmodel import IRI, Field, SPARQLModel, SPARQLSession

DATA_DIR = Path(__file__).resolve().parent / "data"
SCHEMA = "https://schema.org/"


class NgoOrganization(SPARQLModel):
    rdf_type = "schema:NGO"
    __prefixes__ = {"schema": SCHEMA, "xsd": "http://www.w3.org/2001/XMLSchema#"}

    id: IRI
    name: str = Field("schema:name")
    url: str = Field("schema:url")
    nonprofit_status: str | None = Field("schema:nonprofitStatus", default=None)
    founding_year: int | None = Field("schema:foundingDate", default=None)


def main() -> None:
    with SPARQLSession.from_rdf_file(DATA_DIR / "schema_org_ngos.ttl") as session:
        ngos = session.query(NgoOrganization).all()
        print(f"Loaded {len(ngos)} NGO records")
        for org in sorted(ngos, key=lambda o: o.name):
            founded = org.founding_year if org.founding_year is not None else "n/a"
            print(f" {org.name} (founded {founded}) — {org.url}")

        wwf = session.get(NgoOrganization, IRI("https://example.org/org/wwf"))
        assert wwf is not None
        ttl = wwf.serialize(format="turtle")
        assert "World Wide Fund" in ttl or "schema:name" in ttl

        with_status = session.query(NgoOrganization).where(
            NgoOrganization.nonprofit_status == "NonprofitANBI"
        ).all()
        assert len(with_status) == len(ngos)
        print("Schema.org NGO session OK")


# Example output:
# Loaded 3 NGO records
#  International Committee of the Red Cross (founded 1863) — https://www.icrc.org/
#  Médecins Sans Frontières (founded n/a) — https://www.msf.org/
#  World Wide Fund for Nature (founded 1961) — https://www.worldwildlife.org/
# Schema.org NGO session OK
TripleModel vs SparqlModel in these examples
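| Task | TripleModel (upstream) | SparqlModel (here) |
|---|---|---|
| Parse bundled TTL | | SPARQLSession.from_rdf_file |
| Filter rows | Python list comprehensions | session.query(...).where(...) |
| Load one resource | | session.get |
| Wikidata P31 typing | | session.execute, then from_graph(..., validate_type=False) |
| Remote SPARQL | | HttpStore |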
Mapping details (literals, serialize, parse) remain in TripleModel; SparqlModel adds the session and query layer on top of SPARQLModel(TripleModel).
What’s next
Sessions and stores: flush queue, identity map, stores
Query DSL: boolean filters, multi-hop paths, raw SPARQL
SparqlModel production guide: HttpStore and Nobel / Wikidata live endpoints
SparqlModel (ORM) + TripleModel (mapping engine): package boundaries with TripleModel