Wikidata:SPARQL federation input

One of the cool features of SPARQL is federation: it lets you query several SPARQL endpoints together and get a combined result. To better integrate the data available in Wikidata with other linked data sources, we plan to enable SPARQL federation on Wikidata Query Service for a selected number of other SPARQL endpoints. For security and performance reasons, we cannot allow arbitrary endpoints without filtering, so we need a whitelist of approved endpoints. This page is for nominating and discussing which endpoints should be supported. Currently supported endpoints are listed in the User Manual.

The suggested SPARQL endpoints must satisfy the following conditions:

  • Complies with the SPARQL 1.1 protocol, "query operation" part, at least to the extent necessary to make a federated SERVICE clause work (most SPARQL endpoints do). To test, this query should work:
SELECT ?a ?b ?c WHERE {
  { SELECT ?a ?b ?c WHERE {
      ?a ?b ?c
    } LIMIT 1
  }
}
VALUES ?a { <http://www.openlinksw.com/virtrdf-data-formats#default-iid> }
  • Contains data that can be linked to Wikidata - i.e., either contains Wikidata IDs or can be queried by values contained in one of the Wikidata properties.
  • Has data freely available under a license compatible with CC0 (preferred) or another free database license allowing unrestricted reuse. Attribution licenses like CC-BY are OK too. Currently, we do not accept endpoints with reuse restriction clauses like NC/ND.
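
For orientation, a federated query against an approved endpoint might look roughly like the sketch below. The endpoint https://example.org/sparql and the ex:wikidataItem property are hypothetical placeholders, not a real nomination; the wd:, wikibase: and bd: prefixes are the ones predefined on Wikidata Query Service.

```sparql
# Sketch only: the endpoint and the ex: vocabulary are hypothetical placeholders.
PREFIX ex: <https://example.org/ontology/>

SELECT ?item ?itemLabel ?remoteRecord WHERE {
  # Local part, evaluated on Wikidata Query Service.
  VALUES ?item { wd:Q42 }

  # Remote part, sent to the external endpoint; it must be on the whitelist.
  SERVICE <https://example.org/sparql> {
    ?remoteRecord ex:wikidataItem ?item .
  }

  # Standard Wikidata label service for readable labels.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10
```

The join on ?item only works if the remote data stores Wikidata entity IRIs (or values that also appear in a Wikidata property), which is what the second condition above asks for.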

Please post the URL of the endpoint, a short description of it, and, if available, the URL of its documentation. Check first that the endpoint is not already in use (and working) or rejected. Thank you for helping to improve Wikidata.

External lists

Suggestions

The best licenses for inclusion are CC0 and the like. Attribution licenses appear to be OK too; we will acknowledge them on the licensing page for the service. If for some reason such acknowledgement is not enough, please do not add the endpoint here or, if it's already added, please alert us on the Discussion page and we will remove it.

See also:

Licence suitable

Endpoints that are immediately suitable for inclusion.

Canadian Performing Arts Knowledge Graph

SPARQL Endpoint
http://kg.artsdata.ca/sparql
Federation endpoint
http://db.artsdata.ca/repositories/artsdata
Documentation
Under development (Note: we would like to activate this endpoint in order to add illustrations and examples to the documentation.)
Item about database/website/endpoint
Artsdata.ca (Q76180364)
Licence
All triples are CC0 (https://creativecommons.org/publicdomain/zero/1.0/)
Background
Players in the Canadian arts sector, led by Crow's Theatre and the Canadian Arts Presenting Association (CAPACOA) and supported by the Canadian Government, are currently undertaking a flagship digital literature initiative, during which, based on shared metadata strategies and prototypes, they seek to evolve radically new collaboration mindsets in the arts. Their intent is to enable smarter, more efficient digital collaboration across the entire performing arts sector and to help artists and arts organizations to reap benefits from digital tools and strategies focused on interoperability, discoverability and shareability. Linked data publication in the area of the performing arts is still in a pilot phase. So far, only a small share of performing arts related data is available as linked open data. This project will expand the data for the performing arts with links to entities in Wikidata.
@Saumier: I am getting HTTP 400 for any query I try to run against this endpoint... Smalyshev (WMF) (talk) 22:25, 3 June 2019 (UTC)

Licence unclear

Unclear license status, please help us to figure it out.

Not suitable

Endpoint suggestions rejected for license or other reasons. May be reconsidered if license or circumstances change.

Incoming nominations

The nominations are initially placed here and then sorted and moved into the specific topics above.

TORA

Topographical register at the Swedish National Archives; see Phabricator T200066.

SPARQL Endpoint
https://tora.entryscape.net/snorql/
Documentation
https://riksarkivet.se/tora-english
Item about database/website/endpoint
Property:P4820
Licence
CC BY 4.0
Background
This is the Swedish National Archives, an authority on locations, so we can use its data to quality-assure data in Wikidata, e.g. Swedish church parishes.

DBpedia German

SPARQL Endpoint

http://de.dbpedia.org/sparql

Documentation

http://de.dbpedia.org/

Item about database/website/endpoint
Licence

Creative Commons Attribution-ShareAlike 3.0 and GNU Free Documentation License

Background

It's the endpoint for the German DBpedia chapter.


YAGO

SPARQL Endpoint
https://linkeddata1.calcul.u-psud.fr/sparql
Documentation
https://io.datascience-paris-saclay.fr/dataset/YAGO
Item about database/website/endpoint
YAGO (Q8045810)
Licence
CC-BY 3.0
Background

Scholarlydata

SPARQL Endpoint
http://www.scholarlydata.org/sparql
Documentation
http://www.scholarlydata.org/
Item about database/website/endpoint
Scholarlydata (Q67656887)
Licence
CC-BY 3.0
Background

geo.admin.ch Linked Data Service

SPARQL Endpoint
https://ld.geo.admin.ch/query
Documentation
https://ld.geo.admin.ch/
Licence
https://opendata.swiss/en/dataset/swissboundaries3d-gemeindegrenzen
Background
Linked Data representation of the swissBOUNDARIES 3D dataset by geo.admin.ch, the Swiss federal geoportal. URIs link to Wikidata URIs where appropriate, which is useful for visualizing data related to Swiss entities. One can get up-to-date shapes of these entities as Well-Known Text (WKT).
Discussion
@TheKtk: Unfortunately, the License link above returns 404. Could you update it? (done)
I’ve moved this back into the “incoming nominations” section (from “not suitable”) – I’m not sure what the license situation was when this endpoint was first considered, but right now it seems perfectly suitable (“open use, must provide the source”, presumably similar to CC BY). --Lucas Werkmeister (WMDE) (talk) 11:07, 3 November 2019 (UTC)

OpenAIRE

SPARQL Endpoint
http://lod.openaire.eu/sparql
Documentation
http://lod.openaire.eu/eu-open-research-data
Item about database/website/endpoint
OpenAIRE (Q25106053)
Licence
https://creativecommons.org/licenses/by/4.0/
Background
At the Wikibase workshop on Grants and Projects an ontology was drafted for projects and grants. This ontology is now also being used in OpenAIRE. It would be interesting to run various federated queries based on this DiNGO ontology between Wikidata and OpenAIRE.

data.europa.eu

SPARQL Endpoint
http://publications.europa.eu/webapi/rdf/sparql
Documentation
http://data.europa.eu/euodp/data/dataset/sparql-cellar-of-the-publications-office
Licence
https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32011D0833&qid=1495026738506&from=EN
Background
This SPARQL endpoint is a portal to the EU's linked data resource containing EU publications. Linking the content of this repository to the general knowledge in Wikidata would be a valuable addition. --Andrawaag (talk) 09:57, 23 November 2019 (UTC)
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX movie: <http://data.linkedmdb.org/resource/movie/>
PREFIX dcterms: <http://purl.org/dc/terms/>
# Note: ?subject is selected but not bound by any pattern below.
SELECT ?film ?label ?subject WHERE {
  SERVICE <http://data.linkedmdb.org/sparql> {
    ?film a movie:film .
    ?film rdfs:label ?label .
    ?film owl:sameAs ?dbpediaLink
    FILTER regex(STR(?dbpediaLink), "dbpedia", "i")
  }
}
LIMIT 100

linkeddata.uriburner.com

SPARQL Endpoint

http://linkeddata.uriburner.com/sparql

Documentation

http://linkeddata.uriburner.com

Item about database/website/endpoint
Licence

All the data is licensed CC-BY-SA 3.0, just like Wikipedia.

Background

This is a SPARQL endpoint for an underlying RDFization service that results in powerful gateway links to the massive LOD Cloud Knowledge Graph. It is as old as DBpedia itself, since it's the same underlying platform, i.e., a Virtuoso instance with the Sponger Middleware Module enabled.

Examples using standard SPARQL-FED.

PREFIX swp: <http://www.semanticwebprimer.org/ontology/apartments.ttl#>
PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT (AVG(?bedrooms) AS ?avgNumRooms)
WHERE {
  ?apartment swp:hasNumberOfBedrooms ?bedrooms .
}

digital-agenda-data.eu

SPARQL Endpoint
http://digital-agenda-data.eu/data/sparql
Documentation
Item about database/website/endpoint
Data Visualisation Tool - Data & Indicators (Q99843931)
Licence
Creative Commons Attribution 4.0

https://digital-agenda-data.eu/datasets/digital_agenda_scoreboard_key_indicators

Background
Description from EU open data portal: European Commission services selected about a hundred indicators, divided into thematic groups, which illustrate some key dimensions of the European information society (such as broadband and telecom markets, internet usage, eCommerce, research in ICT, etc). These indicators allow a comparison of progress across countries as well as over time. You can browse the data with the help of a dedicated visualization tool (where you are also able to download selected information in CSV and XLS), or you can download the whole database in CSV, TSV, HTML and RDF-N3/Turtle.
The site was released in 2015 and received a major update in 2018. Most of the data seems to be a couple of years old, but it is a good dataset for demonstrating federation.

European Environment Agency SPARQL endpoint

SPARQL Endpoint
http://semantic.eea.europa.eu/sparql
Documentation
Item about database/website/endpoint
European Environment Agency (Q632988)
Licence
Licence linked from front page is CC-BY-2.5-dk.
https://creativecommons.org/licenses/by/2.5/dk/deed.en_GB

Datasets are under their own licences, defined case by case. Based on a brief look, the EU and World Bank datasets are under CC-BY, and ITIS is CC0. However, the contents of the FAO datasets are CC-BY-NC if they are new, or something similar if they are older. The WHO datasets don't have any licence information. However, it seems that only the metadata of WHO/FAO is on the SPARQL endpoint and the actual data is in linked external data dumps.

Background

Object-oriented search engine where you can search the content of data in Eionet. The background database is a set of statistics by:

  • European Environment Agency
  • Eurostat
  • Food and Agriculture Organization of the United Nations
  • Integrated Taxonomic Information System (ITIS)
  • World Bank
  • World Health Organization

Bulgarian Election Data

SPARQL Endpoint

https://elections.ontotext.com/repositories/elections

Documentation

GitHub of project

Item about database/website/endpoint

None yet

Contact

Nikola Tulechki (talk)

Licence

CC0

Background

This is a Knowledge Graph with official Bulgarian election results. Wikidata entities are used to identify geographical entities such as jurisdictions, as well as election-specific entities such as political parties. In the future individual candidates will also be matched to Wikidata objects.

Example SPARQL-query which uses the endpoint

This displays the candidate lists of the Movement for Rights and Freedoms (Q164242) for municipal elections in Dulovo Municipality (Q2387177).

see query results on the endpoint

FoodEx2


Description

FoodEx2 (version 2 of the EFSA Food classification and description system for exposure assessment) food taxonomy. Wikidata has the property FoodEx2 code (P4637).

SPARQL Endpoint

https://data.food.gov.uk/codes/ui/sparql-form

Documentation

https://data.food.gov.uk/codes/ui/about

Item about database/website/endpoint
Licence

https://www.food.gov.uk/crown-copyright

Example SPARQL-query which uses the endpoint
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix owl: <http://www.w3.org/2002/07/owl#>
prefix xsd: <http://www.w3.org/2001/XMLSchema#>
prefix dct: <http://purl.org/dc/terms/>
prefix foaf: <http://xmlns.com/foaf/0.1/>
prefix skos: <http://www.w3.org/2004/02/skos/core#>
prefix version: <http://purl.org/linked-data/version#>
prefix ldp: <http://www.w3.org/ns/ldp#>
prefix time: <http://www.w3.org/2006/time#>
prefix reg: <http://purl.org/linked-data/registry#>
prefix ui: <http://purl.org/linked-data/registry-ui#>
prefix qb: <http://purl.org/linked-data/cube#>
prefix org: <http://www.w3.org/ns/org#>
select *
where {
 ?item reg:register <${registry.baseURI}/system/prefixes>;
       version:currentVersion ?itemVer.
}

Motivation

Supporting federation with the FoodEx2 SPARQL endpoint would allow us to ask detailed questions about food items. Much of this data is not yet in Wikidata. YULdigitalpreservation (talk) 15:49, 17 February 2021 (UTC)

Discussion


The German Digital Library Knowledge Graph (DDB-KG)

Description
This SPARQL endpoint (BETA) contains resources from the German Digital Library (Deutsche Digitale Bibliothek, DDB).
SPARQL Endpoint
http://ddbkg.fiz-karlsruhe.de
Documentation
https://ise-fizkarlsruhe.github.io/ddbkg/docs/
Licence
https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en
Example SPARQL-query which uses the endpoint
https://ise-fizkarlsruhe.github.io/ddbkg/docs/examples/

Motivation

The resources in this endpoint consist of objects in the collection of the DDB, which is well represented in Europeana. Some entities are currently linked to the German Integrated Authority File (Gemeinsame Normdatei).

Discussion


Experimental Schema.org Knowledge Graph Exchange repo (Fact Checks, Political data)

Description

Experiments with combining Schema.org data from Fact Checkers with WD and other Linked Data sources. For example, FullFact UK, UK Parliament, and information about associated entities from Wikidata. Currently populated with Full Fact data from https://fullfact.org/well-known/feed/.

SPARQL Endpoint

https://dydra.com/danbri/fffc-feed/sparql

Documentation

https://twitter.com/danbri/status/1514619420651446274

Item about database/website/endpoint

Schema.org (Q3475322)


Licence

I believe https://fullfact.org/api/license/ for currently loaded data. We plan a page at Schema.org with full details but wanted to establish interop with Wikidata first.

Example SPARQL-query which uses the endpoint
# Thanks to @generalising
# Related discussion: https://twitter.com/generalising/status/1515711399707787276
PREFIX : <http://schema.org/>
SELECT DISTINCT ?sameAs (SAMPLE(?authName) AS ?name) (COUNT(DISTINCT ?cr) AS ?count) {
  GRAPH <https://fullfact.org/well-known/feed/> {
    ?cr a :ClaimReview .
    ?cr :itemReviewed ?claim .
    ?claim :author ?auth .
    OPTIONAL { ?auth :name ?authName } .
    ?auth :sameAs ?sameAs
  }
}
GROUP BY ?sameAs
ORDER BY DESC(?count)


Motivation

Exploring approaches to combining Schema.org and Wikidata information.

Discussion


Factgrid

Description
FactGrid is a database for historians, containing linked open data for more than 20 projects. Technically it is a Wikibase instance maintained by Wikimedia Germany.
SPARQL Endpoint
https://database.factgrid.de/query/
Documentation
https://blog.factgrid.de/welcome, technical setup
Item about database/website/endpoint
FactGrid (Q90405608)
Licence
CC0
Example SPARQL-query which uses the endpoint
 # get ten humans from FactGrid
 SELECT ?item ?itemLabel ?itemDescription WHERE {
   ?item wdt:P2 wd:Q7 .
   SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
 }
 LIMIT 10
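
The query above is written for FactGrid's own query service, where the wd: and wdt: prefixes resolve to FactGrid entities. Run from Wikidata Query Service, those prefixes would point at Wikidata instead, so a federated version would probably need full IRIs inside a SERVICE clause. The following is only a sketch. It assumes that FactGrid has been added to the whitelist, that its SPARQL endpoint is https://database.factgrid.de/sparql, and that its entity and property namespaces follow the usual Wikibase layout under https://database.factgrid.de/. None of these details are confirmed on this page.

```sparql
# Sketch: ten humans from FactGrid, queried from Wikidata Query Service.
# Assumed (not confirmed on this page): the endpoint URL and the fg:/fgt: namespaces.
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX fg:   <https://database.factgrid.de/entity/>
PREFIX fgt:  <https://database.factgrid.de/prop/direct/>

SELECT ?item ?label WHERE {
  SERVICE <https://database.factgrid.de/sparql> {
    # The subquery keeps the LIMIT on the remote side,
    # so only ten rows travel back to the caller.
    SELECT ?item ?label WHERE {
      ?item fgt:P2 fg:Q7 ;        # same P2/Q7 pattern as in the query above
            rdfs:label ?label .
      FILTER(LANG(?label) = "en")
    }
    LIMIT 10
  }
}
```

The wikibase:label service is left out on purpose: it runs on the local service, so plain rdfs:label lookups inside the SERVICE block are the safer choice for remote items.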

Motivation

Linked, open data show their full power when connected. FactGrid therefore wants to become part of the Wikibase ecosystem.

Discussion

Removena (talk) 09:55, 30 May 2022 (UTC)

Olaf Simons (talk) 09:49, 30 May 2022 (UTC)

BNCF

Description
Nuovo soggettario (Q16583225) is linked from Wikidata through BNCF Thesaurus ID (P508) and links to Wikidata (query)
SPARQL Endpoint
https://digitale.bncf.firenze.sbn.it/openrdf-workbench/repositories/NS/query
Documentation
https://lod-cloud.net/dataset/bncf-ns
Item about database/website/endpoint
Nuovo soggettario (Q16583225)
Licence
CC-BY (source of the info)
Example SPARQL-query which uses the endpoint
https://digitale.bncf.firenze.sbn.it/openrdf-workbench/repositories/NS/query?query=PREFIX+skos%3A%3Chttp%3A%2F%2Fwww.w3.org%2F2004%2F02%2Fskos%2Fcore%23%3E%0ASELECT+%3Fid+%3Fid1+%0AWHERE+%0A%7B%0A++++%3Fid+skos%3AcloseMatch+%3Fid1.+FILTER+regex(str(%3Fid1)%2C+%22wikidata%22)%0A%7D
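
For readability, the query encoded in the link above decodes to the following. Only the URL encoding has been undone and the whitespace tidied; the comment line is the only addition.

```sparql
# Thesaurus concepts with a skos:closeMatch link whose target IRI contains "wikidata".
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?id ?id1
WHERE
{
    ?id skos:closeMatch ?id1 .
    FILTER regex(str(?id1), "wikidata")
}
```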

Motivation

Useful for maintenance of the links from Nuovo soggettario (Q16583225) to Wikidata and from Wikidata to it through BNCF Thesaurus ID (P508); also useful because it is synced with other thesauri (Bibliothèque nationale de France ID (P268), GND ID (P227), Library of Congress authority ID (P244) etc.). --Epìdosis 21:50, 15 May 2023 (UTC)

Discussion