Oct 30, 2025

DATASET: Getty GPI merged with Wikidata IDs for art people and galleries

DATASET_GPIulanWIKIDATA_FULL.csv links the actors recorded in the Getty Provenance Index (GPI) to their corresponding Wikidata identifiers. It covers artists, collectors, dealers, galleries, and auction houses, but not museums.

VIEW DATASET HERE

The file contains 61,419 records, each representing a person or organization appearing in the GPI.
Every record retains six key GPI fields and is enriched, where available, with open-data identifiers and descriptions from Wikidata. 


DOWNLOAD DATASET CSV 


Merge details

  • Merge key: ULANurl (Getty ULAN link)

  • Join type: Left join — all GPI rows are preserved, even if no Wikidata match exists

  • Wikidata coverage: 11,146 matched entities out of 24,881 unique ULANs (≈45%)

  • Columns included: All original GPI columns plus Wikidata fields such as

    • item (Wikidata QID)

    • itemLabel (name in Wikidata)

    • itemDescription

    • External identifiers: VIAF, GND, ISNI, RKD, Proveana, and others
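
As a rough illustration of the merge described above, here is a minimal Python sketch of a left join on ULANurl, followed by a coverage count over unique ULANs. It is not the script used to build the dataset. The sample rows, the ULAN URL form, the placeholder QID, and the helper names left_join and coverage are all invented for the example.

```python
def left_join(gpi_rows, wikidata_rows, key="ULANurl"):
    """Left-join GPI rows with Wikidata rows on `key`; every GPI row is kept."""
    # Index Wikidata rows by the merge key (first occurrence wins in this sketch).
    wd_by_key = {}
    for wd in wikidata_rows:
        if wd.get(key):
            wd_by_key.setdefault(wd[key], wd)

    # Wikidata column names in first-seen order, excluding the merge key.
    wd_cols = list(dict.fromkeys(c for wd in wikidata_rows for c in wd if c != key))

    merged = []
    for gpi in gpi_rows:
        # Rows with an empty key never match anything.
        match = wd_by_key.get(gpi[key], {}) if gpi.get(key) else {}
        out = dict(gpi)
        for col in wd_cols:
            out[col] = match.get(col, "")  # empty string when there is no match
        merged.append(out)
    return merged


def coverage(merged, key="ULANurl", matched_col="item"):
    """Return (matched unique keys, all unique non-empty keys)."""
    unique = {r[key] for r in merged if r.get(key)}
    matched = {r[key] for r in merged if r.get(key) and r.get(matched_col)}
    return len(matched), len(unique)


# Hypothetical sample rows; names, URLs, and the QID are made up.
gpi = [
    {"name": "Dealer A", "ULANurl": "http://vocab.getty.edu/ulan/500000001"},
    {"name": "Dealer A (variant)", "ULANurl": "http://vocab.getty.edu/ulan/500000001"},
    {"name": "Collector B", "ULANurl": "http://vocab.getty.edu/ulan/500000002"},
    {"name": "Unknown C", "ULANurl": ""},
]
wikidata = [
    {
        "ULANurl": "http://vocab.getty.edu/ulan/500000001",
        "item": "http://www.wikidata.org/entity/Q000001",
        "itemLabel": "Dealer A",
    },
]

merged = left_join(gpi, wikidata)
matched, unique = coverage(merged)
print(len(merged), matched, unique)
```

All four GPI rows survive the join, and coverage is counted over unique ULANs rather than rows, which is why the dataset's row count and its ULAN count differ.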

Use
This dataset enables cross-referencing between the Getty Provenance Index and Wikidata, facilitating linked-data research on art-market actors, networks, and provenance patterns.


It is particularly useful for identifying entities appearing in both GPI and open knowledge graphs, enriching provenance chains with additional biographical and institutional context.

It is also useful for identifying gaps in the data (for example, ULAN identifiers missing from either GPI or Wikidata) and for targeting actions that improve data coherence and completeness.
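
One possible way to look for such gaps is to sort each row into a category based on the ULANurl and item columns. The sketch below assumes that missing values appear as empty cells; the category names, the classify_gap helper, and the sample rows are invented for illustration.

```python
from collections import Counter


def classify_gap(row):
    """Label a merged row by which identifiers it carries."""
    has_ulan = bool((row.get("ULANurl") or "").strip())
    has_item = bool((row.get("item") or "").strip())
    if not has_ulan:
        return "no_ulan_in_gpi"               # GPI record lacks a ULAN link
    if not has_item:
        return "ulan_without_wikidata_match"  # ULAN present, no Wikidata item joined
    return "linked"


# Hypothetical merged rows; values are made up.
merged_rows = [
    {"name": "Dealer A",
     "ULANurl": "http://vocab.getty.edu/ulan/500000001",
     "item": "http://www.wikidata.org/entity/Q000001"},
    {"name": "Collector B",
     "ULANurl": "http://vocab.getty.edu/ulan/500000002",
     "item": ""},
    {"name": "Unknown C", "ULANurl": "", "item": ""},
]

gap_counts = Counter(classify_gap(r) for r in merged_rows)
print(dict(gap_counts))
```

To run this on the real file, the rows could be read with csv.DictReader from DATASET_GPIulanWIKIDATA_FULL.csv. Rows labelled ulan_without_wikidata_match are candidates for adding a ULAN statement in Wikidata, and rows labelled no_ulan_in_gpi are candidates for reconciliation on the GPI side.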

📘 Columns from the Getty Provenance Index (GPI)

  1. URI Linkedart json – Persistent URI for the Linked Art JSON record

  2. name – Preferred name of the person or organization

  3. ULANurl – Getty ULAN identifier in URL form (merge key)

  4. starId – Internal GPI identifier

  5. birthYear – Year of birth (where applicable)

  6. biography GPI – Textual biographical note from GPI


🟦 Columns from Wikidata

  1. item – Full Wikidata entity URI (e.g., http://www.wikidata.org/entity/Q5582)

  2. itemLabel – English label (name of the entity)

  3. itemDescription – Short descriptive phrase from Wikidata

  4. ulan – ULAN numeric identifier used in Wikidata (e.g., 500044458)

  5. VIAF – Virtual International Authority File ID (P214)

  6. GND – German National Library identifier (P227)

  7. Lexikon – Künstlerlexikon der Schweiz identifier (P9585)

  8. Proveana – Proveana database ID (P9434)

  9. RKD – RKDartists ID (P650)

  10. ArtHist – arthist.net identifier (P10015 or equivalent, if present)

  11. BritishM – British Museum person or org ID (P1711 or similar)

  12. ISNI – International Standard Name Identifier (P213)

  13. LoC – Library of Congress ID (P244)

  14. BNF – Bibliothèque nationale de France ID (P268)

  15. YadVashem – Yad Vashem Holocaust database ID (P6890)

  16. SNAC – Social Networks and Archival Context ID (P3430)

  17. Joconde – French museum catalogue ID (P347)

  18. BiografischPortaal – Biografisch Portaal van Nederland ID (P651)

  19. ULANurl_wikidata – Getty ULAN URL as represented within Wikidata
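
Because item holds a full entity URI and ULANurl holds a URL, while ulan holds a bare number, it can help to reduce the URLs to short identifiers before comparing columns. The following is a small sketch of that step. The helper names are invented, and the ULAN URL form used in the example is an assumption about how the column is written.

```python
import re


def qid_from_uri(uri):
    """Extract the QID from a Wikidata entity URI, or None if absent."""
    m = re.search(r"/entity/(Q\d+)$", (uri or "").strip())
    return m.group(1) if m else None


def ulan_id_from_url(url):
    """Extract the trailing numeric ULAN ID from a URL, or None if absent."""
    m = re.search(r"(\d+)/?$", (url or "").strip())
    return m.group(1) if m else None


def ulan_columns_agree(row):
    """True when the numeric `ulan` value matches the ID inside `ULANurl`."""
    return ulan_id_from_url(row.get("ULANurl")) == (row.get("ulan") or None)


qid = qid_from_uri("http://www.wikidata.org/entity/Q5582")
# Assumed URL form; the numeric ID is the example value from the column list.
ulan_id = ulan_id_from_url("http://vocab.getty.edu/ulan/500044458")
print(qid, ulan_id)
```

A check like ulan_columns_agree can flag rows where the ULAN recorded in Wikidata differs from the one in the GPI link, which should not happen after a join on ULANurl but is cheap to verify.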

