Showing posts with label provenance research. Show all posts

Jun 5, 2020

DATASET: Text Analysis Challenge - Detect Looted Art - GLAMhack2020



Download  Provenance Texts Dataset: CSV 

Text Analysis Challenge: Detect Looted Art.

Help Automate Analysis, Flagging and Ranking of Museum Art Provenance Texts by the Probability of a Hidden History

The Question: How to sift through the millions of objects in museums to identify top priorities for intensive research by humans? 

The Goal: Automatically Classify and Rank 60,000+ art provenance texts by probability that further research will turn up a deliberately concealed history of looting, forced sale, theft or forgery. 

The Challenge: Analyse texts quickly for Red Flags, quantify, detect patterns, classify, rank, and learn. Whatever it takes to produce a reliable list of top suspects.

For this challenge several datasets will be provided.

1) DATASET: 60,000+ art provenance texts for analysis

(example)
https://www.harvardartmuseums.org/collections/object/296887 | [Pierre Matisse Gallery, New York, New York], by 1932, to M. Gutmann, 1936. Maurice Wertheim, by 1937, bequest, to Fogg Art Museum, 1951.;NOTE: Provenance derived from "Degas to Matisse: The Maurice Wertheim Collection," John O'Brian, Harry N. Abrams, New York, 1988. | 1951.76
https://www.harvardartmuseums.org/collections/object/229045 | Private Collector, Paris, Said to have been bought directly from artist, 1918.;Maurice Wertheim, Purchased from the Valentine Gallery, 1937, Bequest to Fogg Art Museum, 1951. | 1951.52
https://www.harvardartmuseums.org/collections/object/229044 | Paul Guillaume, Paris, France, 1925, 1935. By 1930, per caption of photo, Paul Guillaume dining room dated c. 1930, Georgel 2006;Maurice Wertheim, 1937, Bequest to Fogg Art Museum, 1951. | 1951.51
https://www.harvardartmuseums.org/collections/object/229043 | Unidentified owner, Paris, sold, [through Hôtel Drouot, Paris, June 16, 1906, no 30.]. Dr. Alfred Wolff, Munich, (1912). Sir Michael Sadler, England, (1912). [De Hauke & Co., New York], sold, to A. Conger Goodyear, New York, (1929-1937) sold, [through Wildenstein && Co., New York];to Maurice Wertheim, New York (1937-1951) bequest, to Fogg Art Museum, 1951. | 1951.49

2) DATASET: 1000 Red Flag Names

(example)

Bignou, Etienne | Bignou | Etienne | Etienne Bignou
Billiet, Director | Billiet | Director | Director Billiet
Binder, Dr. Moritz Julius | Binder | Dr. Moritz Julius | Dr. Moritz Julius Binder
Bing Collection | Bing Collection | | Bing Collection
Birtschansky, Zacharie | Birtschansky | Zacharie | Zacharie Birtschansky
Bisson, E. | Bisson | E. | E. Bisson
Blanc, Pierre | Blanc | Pierre | Pierre Blanc
Bleye, Willi | Bleye | Willi | Willi Bleye
Bloch, Dr. Vitale | Bloch | Dr. Vitale | Dr. Vitale Bloch
Bloch-Bauer Collection | Bloch-Bauer Collection | | Bloch-Bauer Collection
Blot | Blot | | Blot
Bode, Dr. | Bode | Dr. | Dr. Bode
Bodenschatz, General Karl | Bodenschatz | General Karl | General Karl Bodenschatz
Boedecker, Alfred | Boedecker | Alfred | Alfred Boedecker
Boehler, Julius, Jr. | Boehler | Julius | Julius Boehler
Boehler, Julius, Sr. | Boehler | Julius | Julius Boehler
Boehm, Dr. Franz | Boehm | Dr. Franz | Dr. Franz Boehm
Boehmer, Bernhard | Boehmer | Bernhard | Bernhard Boehmer
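
Reconciling names found in a provenance text with the Red Flag Names can be approached with approximate string matching. The following is only a minimal sketch using Python's standard `difflib`, not the code in the project repository. The function name `find_red_flag_names`, the short name list, and the `tolerance` value of 0.85 are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

# A few full names taken from the example above (titles such as "Dr." dropped).
RED_FLAG_NAMES = ["Etienne Bignou", "Pierre Blanc", "Julius Boehler", "Vitale Bloch"]


def normalize(text):
    """Lowercase the text and replace every run of non-letters with one space."""
    return re.sub(r"[^a-zà-ÿ]+", " ", text.lower()).strip()


def find_red_flag_names(provenance, names=RED_FLAG_NAMES, tolerance=0.85):
    """Return the red-flag names that approximately occur in a provenance text.

    Each name is compared with every window of the same number of words in
    the text. A window counts as a match when its similarity ratio reaches
    `tolerance` (a hypothetical threshold that would need tuning).
    """
    words = normalize(provenance).split()
    hits = []
    for name in names:
        target = normalize(name)
        size = len(target.split())
        for start in range(len(words) - size + 1):
            window = " ".join(words[start:start + size])
            if SequenceMatcher(None, window, target).ratio() >= tolerance:
                hits.append(name)
                break  # one hit per name is enough
    return hits


if __name__ == "__main__":
    # The misspelling "Etiene" is deliberate, to show the tolerance at work.
    print(find_red_flag_names("Sold to Etiene Bignou, Paris, 1935."))
```

Raising `tolerance` reduces false positives but misses more spelling variants, which is why a fuller list of aliases would be a useful complement.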

3) DATASET: Red Flag Words or Phrases

(example)
flaguncertainty: likely, probably, possibly, maybe, ?
flaganonymity: private collector, anonymous, art market, unidentified, unknown, property of a European collector, private collection, property of a lady, anon.
flagpuncutation: ?, [
flagmove: transfer, removed
flagreliability: telephone, to at least, until at least, by 19, before 19, according to
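
Counting these words and phrases per category can be sketched in a few lines of Python. This is an illustration only: the grouping of phrases under each flag column is one reading of the example above, the function name `count_red_flags` is hypothetical, and plain substring counting is a deliberately crude stand-in for proper tokenization.

```python
# Red-flag phrases by category, as read from the example above (lowercased).
RED_FLAG_WORDS = {
    "flaguncertainty": ["likely", "probably", "possibly", "maybe", "?"],
    "flaganonymity": [
        "private collector", "anonymous", "art market", "unidentified",
        "unknown", "property of a european collector", "private collection",
        "property of a lady", "anon.",
    ],
    "flagpuncutation": ["?", "["],
    "flagmove": ["transfer", "removed"],
    "flagreliability": [
        "telephone", "to at least", "until at least", "by 19", "before 19",
        "according to",
    ],
}


def count_red_flags(provenance):
    """Count case-insensitive substring occurrences of each category's phrases."""
    text = provenance.lower()
    return {
        category: sum(text.count(phrase) for phrase in phrases)
        for category, phrases in RED_FLAG_WORDS.items()
    }


if __name__ == "__main__":
    sample = "[Pierre Matisse Gallery, New York, New York], by 1932, to M. Gutmann, 1936."
    print(count_red_flags(sample))
```

Because "?" appears in two categories, a single question mark is counted once in each.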


You're the doctor and the texts are your patients! Who's in good health and who's sick? How sick? With what disease? What kind of tests and measurements can we perform on the texts to help us reach a diagnosis? What kind of markers should we look for? What kind of patterns?

What digital methods can we use - and put into practice during the Hackathon - to "diagnose" these texts and prioritize them for "treatment" (i.e., additional provenance research)?


  • IDENTIFY Red Flag Names and Words in each Text?
  • COUNT Red Flag Names and Words in each Text?
  • CHARACTERIZE each Text (number of words? sentiment? completeness v gaps? other features to be identified that may be useful)?
  • ANALYZE for patterns, links and networks?
  • CALCULATE probability that the provenance conceals a Nazi-era history that will prove problematic if investigated in detail
  • RANK according to urgency for further in-depth provenance research 
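
The last two steps (CALCULATE and RANK) can be sketched as a weighted score over the features gathered in the earlier steps. This is a heuristic illustration, not a calibrated probability and not the project's method. The feature names, the `TextFeatures` class and the three weights are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class TextFeatures:
    """Per-text measurements; the field names are hypothetical."""
    text_id: str
    red_flag_names: int       # matches against the Red Flag Names list
    red_flag_words: int       # matches against the Red Flag Words list
    has_1933_1945_gap: bool   # no documented owner for part of 1933-1945


# Hypothetical weights; they would need tuning against known cases.
NAME_WEIGHT = 3.0
WORD_WEIGHT = 1.0
GAP_WEIGHT = 2.0


def risk_score(features):
    """Combine the counts into one number; higher means more urgent."""
    score = NAME_WEIGHT * features.red_flag_names
    score += WORD_WEIGHT * features.red_flag_words
    if features.has_1933_1945_gap:
        score += GAP_WEIGHT
    return score


def rank_by_urgency(all_features):
    """Return the texts sorted from most to least urgent."""
    return sorted(all_features, key=risk_score, reverse=True)


if __name__ == "__main__":
    texts = [
        TextFeatures("a", 0, 1, False),
        TextFeatures("b", 2, 0, True),
        TextFeatures("c", 0, 0, True),
    ]
    for item in rank_by_urgency(texts):
        print(item.text_id, risk_score(item))
```

A weighted sum is chosen here only because it is easy to inspect and adjust by hand; a trained classifier would be a natural next step once labelled cases exist.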


(Voyant-Tools Whitelist containing names: 
keywords-14ca7131716f24c62c6529fcc143bbd2  )


What might a successful result look like?




  • A list of 50 provenances from the DATASET ranked most likely to conceal looted art
  • A color-coded evaluation of each provenance (RED, ORANGE, GREEN) by likelihood of concealing looted art
  • Instructions on how to analyze the provenances, with the tools, functions or code to use (for example, how to use Voyant-Tools to count all the Red Flag Names and inject the result back into the spreadsheet)
  • Ideas for going further....
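
The first two result formats listed above, a top-50 list and a RED/ORANGE/GREEN evaluation, could be produced from any per-text score along the following lines. This is a sketch under stated assumptions: the thresholds `orange_at` and `red_at`, the function names and the CSV column names are hypothetical and not part of the challenge data.

```python
import csv
import io


def color_code(score, orange_at=2.0, red_at=5.0):
    """Map a score to RED, ORANGE or GREEN using hypothetical thresholds."""
    if score >= red_at:
        return "RED"
    if score >= orange_at:
        return "ORANGE"
    return "GREEN"


def top_suspects_csv(scores, limit=50):
    """Return CSV text for the `limit` highest-scoring texts.

    `scores` maps a text identifier to its score. The result can be pasted
    or imported back into a spreadsheet.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)[:limit]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["text_id", "score", "color"])
    for text_id, score in ranked:
        writer.writerow([text_id, score, color_code(score)])
    return out.getvalue()


if __name__ == "__main__":
    print(top_suspects_csv({"a": 1.0, "b": 8.0, "c": 2.0}, limit=2))
```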


Triage: "assignment of degrees of urgency to wounds or illnesses to decide the order of treatment of a large number of patients or casualties."

Link to Glamhack2020 project for participants
https://hack.glam.opendata.ch/project/7



Help us to test and improve and test and improve the code!

Link to Code on Github 

https://github.com/parisdata/GLAMhack2020

Issues to resolve:
- While the word list counts and general name extraction seem to work pretty well, reconciliation with the list of 1000 Red Flag Names still needs work. To test: a tighter tolerance combined with a more complete listing of aliases might help.
- The extraction of transaction years after 1900 gives interesting though incomplete indicators. The decision to exclude dates in parentheses (as they are sometimes biographical dates and not transaction dates) needs to be reviewed and refined.
(Note: the texts come from multiple sources applying multiple formats and were deliberately entered exactly as is.)
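
To make the second issue concrete, here is a minimal sketch of year extraction with and without the parentheses rule. It is not the code in the repository. The function name `transaction_years` and the `skip_parentheses` switch are assumptions, and the sample text is an excerpt from one of the Harvard provenances shown earlier.

```python
import re

PARENTHESIZED = re.compile(r"\([^()]*\)")   # innermost "( ... )" groups
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")    # four-digit years 1900-2099


def transaction_years(provenance, skip_parentheses=True):
    """Return the years after 1900 found in a provenance text, in order.

    With skip_parentheses=True, anything inside parentheses is ignored,
    on the assumption that it may be a biographical date.
    """
    text = PARENTHESIZED.sub(" ", provenance) if skip_parentheses else provenance
    return [int(year) for year in YEAR.findall(text) if int(year) > 1900]


if __name__ == "__main__":
    sample = (
        "Dr. Alfred Wolff, Munich, (1912). Sir Michael Sadler, England, (1912). "
        "to Maurice Wertheim, New York (1937-1951) bequest, to Fogg Art Museum, 1951."
    )
    print(transaction_years(sample))
    print(transaction_years(sample, skip_parentheses=False))
```

In this excerpt the parentheses hold ownership dates rather than biographical ones, so skipping them discards most of the useful years. That is the kind of case the rule needs to handle better.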

Results RAW (with uncorrected errors for analysis)




May 27, 2020

DATASET: Art Collectors who died or were detained in Nazi Concentration Camps, Ghettos or Jails



In this post, we publish a tiny subset of names that should immediately set off alarms when they appear in the provenance of an artwork:


Art collectors, dealers, curators, historians, and other art market professionals who were detained or died in Nazi custody, whether in concentration camps, ghettos, or jails.


This dataset is drawn from Wikidata Queries (see below), and so is necessarily limited to those very few names that have been properly documented in Wikidata, with occupation, date and place of death.



DATASET: Art Collectors who were detained or who died in Nazi Concentration Camps, Ghettos or Jails referenced in Wikidata


Version: 1

Licence: CC0

Date published: May 27, 2020

Publisher: Open Art Data

Source of information: Wikidata Query

File formats:


  • CSV


DOWNLOAD CSV


  • Google Sheet


VIEW GOOGLE SHEET


  • RDF Turtle

(to be published)

Description:

In this first version, published on May 27, 2020, the dataset contains only 47 names (see below), with the following information:


Fields in the DATASET
Wikidata url
Wikidata Qcode
Name
pic
Detention P2632
PlaceDetention
DeathDate
PlaceDied
Died P20
Birth P19
PlaceBorn
BirthDate
VIAF_ID
Library_of_Congress_authority_ID
GND_ID
ULAN_ID
FirstName
Lastname

To appreciate how extremely incomplete this dataset is, one can compare it to the 351 names of Jewish collectors on the German Lostart site or to the 371 names (for France alone) of victims of spoliation listed in France's ERR database, or the over three million names in the central database of Yad Vashem.

It is obvious, then, that this tiny dataset makes no claim to completeness. Its purpose is rather to provide an angle of attack for linking up information about murdered Jewish collectors to other information, including provenance information in artworks. 

Each name has a Wikidata identifier, which serves as a linked data hub to other identifiers, authority files and references such as Viaf, Library of Congress, DNB, Dictionary of Art Historians, Lostart, Yad Vashem, Wikipedia, DBpedia, Ulan, etc.
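
As an illustration of the hub idea, the identifier columns of one dataset row can be turned into links. This is a sketch only: the URL templates below are assumptions about each service's public address pattern and should be checked before use, and the function name `identifier_links` is hypothetical. The field names are those listed under "Fields in the DATASET".

```python
# Assumed URL patterns for each identifier column (verify before relying on them).
URL_TEMPLATES = {
    "Wikidata Qcode": "https://www.wikidata.org/wiki/{}",
    "VIAF_ID": "https://viaf.org/viaf/{}",
    "Library_of_Congress_authority_ID": "https://id.loc.gov/authorities/names/{}",
    "GND_ID": "https://d-nb.info/gnd/{}",
    "ULAN_ID": "https://vocab.getty.edu/page/ulan/{}",
}


def identifier_links(row):
    """Build one link per non-empty identifier field in a dataset row."""
    return {
        field: template.format(row[field])
        for field, template in URL_TEMPLATES.items()
        if row.get(field)
    }


if __name__ == "__main__":
    # Values for Franz Roh as they appear in the extract of the dataset.
    franz_roh = {
        "Wikidata Qcode": "Q66832",
        "VIAF_ID": "34587588",
        "Library_of_Congress_authority_ID": "n50048042",
        "GND_ID": "118602152",
        "ULAN_ID": "500048070",
    }
    for field, url in identifier_links(franz_roh).items():
        print(field, url)
```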

At present, very few Jewish collectors who perished in the Holocaust are properly referenced and linked. 

But every journey begins with a first step. 

The DATASET presented here is such a first step: to link Jewish art collectors and art professionals who were detained or died in Nazi camps or ghettos into the world of linked data via Wikidata, and thus, hopefully, to help anchor them in the digital Knowledge Graph.

A further observation:
In describing datasets it is common to inventory what is present. In describing datasets related to the Holocaust, however, it is also important to inventory what is absent. This is to avoid the frequent error of imagining that what we see is all there is to know, and to have an idea of the scope of the work that remains to be done.



Extract of Dataset:

List of Forty-Seven Names of Art Collectors, Dealers, Curators, Historians who Died or were Detained in Nazi Camps or Ghettos, or Jails


Wikidata Qcode | Name | BirthDate | VIAF_ID | Library of Congress authority ID | GND_ID | ULAN_ID
Q2546745 | Walter Westfeld | 1889-03-04 | 81690025 | | 137508662 |
Q1039376 | Carl Laszlo | 1923-07-16 | 2511905 | n83226831 | 118569996 |
Q1515593 | Gertrud Kantorowicz | 1876-10-09 | 18030172 | n86032621 | 119368153 |
Q55676461 | Jerzy Langman | 1903-07-31 | 61280166 | no2005063216 | 127273298 |
Q15995945 | Wilhelm Mautner | 1889-11-28 | 181186070 | | 1013999878 |
Q1913457 | Max Silberberg | 1878-02-27 | 50130833 | | 123274826 |
Q98887 | Henri Hinrichsen | 1868-02-05 | 54912104 | no96023216 | 116897899 |
Q54501351 | Max Frankenburger | 1860-08-27 | 5294492 | no98080821 | 116716649 |
Q19236821 | Walter Cohen | 1880-02-18 | 40183001 | nr95021800 | 119198592 |
Q14777672 | Ludwig Pollak | 1868-09-14 | 39488480 | nr89003403 | 118838172 |
Q89993 | Robert Eisler | 1882-01-01 | 763850 | nr97031281 | 116435526 |
Q86225 | Arthur Mahler | 1871-08-01 | 28271851 | | 110216106 |
Q11779916 | Mieczysław Paszkiewicz | 1925-02-10 | 26463 | n81056769 | |
Q66832 | Franz Roh | 1890-02-21 | 34587588 | n50048042 | 118602152 | 500048070
Q28192906 | Q28192906 | 1863-05-03 | 54896846 | | 116308257 |
Q762753 | August Liebmann Mayer | 1885-10-27 | 102317582 | nr88011775 | 117542563 | 500322318
Q6280128 | Josef Rosensaft | 1911-01-15 | 20488355 | n86056702 | |
Q19258964 | Alfred Werner | 1911-03-31 | 56619941 | n79148949 | 130178594 |
Q122226 | Jan Krugier | 1928-05-12 | 45150837 | n85212724 | 121187519 |
Q62070061 | Olga Bloch | 1900-08-30 | | | |
Q3085081 | François Lang | 1908-02-27 | 27877412 | nr93019463 | |
Q5503811 | Friedrich Gutmann | 1886-11-15 | 3375432 | | 123271770 |
Q15433563 | Lisa Hamburg | 1890-09-10 | 306223608 | | 1107446635 |
Q1597216 | Heinrich Feurstein | 1877-04-11 | 57367931 | no2016109679 | 116484640 |
Q1598924 | Heinrich Stahl | 1868-04-13 | 68530888 | no97004669 | 136897487 |
Q63110939 | Wilhelm Kurtz | 1897-08-20 | 316495431 | | 1072437112 |
Q3174304 | Jean Riboud | 1919-11-15 | 18117722 | n83185011 | 122914171 |
Q6202160 | Jindřich Waldes | 1876-07-02 | 55020467 | n00009745 | 122092759 |
Q94971510 | Sigmund Fein | 1880-07-09 | | | |
Q2038252 | Otto Bernheimer | 1877-07-14 | 64750637 | no2004014234 | 116146672 |
Q62579 | Joachim Ernst, Duke of Anhalt | 1901-01-11 | | | |
Q11779942 | Mieczysław Porębski | 1921-03-31 | 118048919 | n85827144 | 119251752 |
Q468466 | Karolina Lanckorońska | 1898-08-11 | 76324812 | n80001031 | 122554140 |
Q669893 | Georges Mandel | 1885-06-05 | 49264696 | n91053655 | 119078872 |
Q87857 | Fritz Grünbaum | 1880-04-07 | 74077040 | n86138300 | 119368129 |
Q47500000 | Anton Mayer | 1879-04-22 | 80994143 | nr2005025243 | 117542393 |
Q9354188 | Tadeusz Dobrowolski | 1899-08-17 | 117652610 | n80145073 | |
Q47512272 | Ludwik Rajewski | 1900-12-16 | 93474687 | n84219822 | |
Q7085852 | Ole Henrik Moe | 1920-01-11 | 311036799 | n82212059 | 13623674X |
Q19753989 | Karl Freund | 1882-01-01 | 44770206 | n90699618 | 140496785 |
Q3426274 | René Gimpel | 1881-01-01 | 91549548 | n86036259 | 1044616733 |
Q9342555 | Stanisław Małkowski | 1889-08-22 | 102102730 | | |
Q60040474 | Adam Abel | 1886-01-01 | 68427564 | | 116553375 |
Q55006961 | Heinrich Rieger | 1868-12-25 | 297121379 | no2008119220 | 131888331 |
Q42887282 | Lucien Graux | 1878-04-04 | 9881824 | n86022614 | |
Q9138146 | Abe Gutnajer | 1888-01-01 | 303116224 | | |
Q88803939 | Leo Grünstein | 1876-07-18 | 69694062 | no2008136862 | 116893001 |



About the Wikidata Queries and data preparation

Two queries were used:

1) for art people detained in Nazi camps or Ghettos
and 

2) for art people who died in Nazi camps or Ghettos


Art People


{ ?item wdt:P106 wd:Q1792450.} UNION { ?item wdt:P31 wd:Q1007870. } UNION { ?item wdt:P106 wd:Q173950.} UNION { ?item wdt:P921 wd:Q328376.} UNION { ?item wdt:P106 wd:Q10732476.} UNION { ?item wdt:P106 wd:Q446966.} UNION { ?item wdt:P106 wd:Q22132694.} UNION { ?item wdt:P106 wd:Q674426.}

(Please note that artists are not included in this selection.)


Died in Camp

{ ?placedied wdt:P31 wd:Q328468.} UNION { ?placedied wdt:P31 wd:Q152081. } UNION { ?placedied wdt:P31 wd:Q153813.} UNION { ?placedied wdt:P31 wd:Q153813.} UNION { ?placedied wdt:P31 wd:Q2583015.}



Detained in Camp

{ ?place_detention wdt:P31 wd:Q328468.} UNION { ?place_detention wdt:P31 wd:Q152081. } UNION { ?place_detention wdt:P31 wd:Q153813.} UNION { ?place_detention wdt:P31 wd:Q153813.} UNION { ?place_detention wdt:P31 wd:Q2583015.}
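
The fragments above are not complete queries. One plausible way to assemble them is sketched below in Python. The overall query shape (the SELECT clause, the label service and the joining triple) is an assumption, not the original query. The properties P20 and P2632 are taken from the field list of the dataset, and the function names `build_query` and `run_query` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"

# "Art People" block, copied from the fragment above.
ART_PEOPLE = (
    "{ ?item wdt:P106 wd:Q1792450.} UNION { ?item wdt:P31 wd:Q1007870. } "
    "UNION { ?item wdt:P106 wd:Q173950.} UNION { ?item wdt:P921 wd:Q328376.} "
    "UNION { ?item wdt:P106 wd:Q10732476.} UNION { ?item wdt:P106 wd:Q446966.} "
    "UNION { ?item wdt:P106 wd:Q22132694.} UNION { ?item wdt:P106 wd:Q674426.}"
)

# Place types from the fragments above (the repeated Q153813 is listed once).
CAMP_TYPES = ["Q328468", "Q152081", "Q153813", "Q2583015"]


def build_query(place_property, place_var):
    """Assemble a full query, e.g. build_query("P20", "placedied")."""
    camp_union = " UNION ".join(
        f"{{ ?{place_var} wdt:P31 wd:{qid}. }}" for qid in CAMP_TYPES
    )
    return f"""SELECT DISTINCT ?item ?itemLabel ?{place_var} ?{place_var}Label WHERE {{
  {ART_PEOPLE}
  ?item wdt:{place_property} ?{place_var}.
  {camp_union}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}"""


def run_query(query):
    """Send the query to the Wikidata endpoint and return the result rows."""
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(url, headers={"User-Agent": "provenance-example/0.1"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]


if __name__ == "__main__":
    print(build_query("P20", "placedied"))           # died in camp
    print(build_query("P2632", "place_detention"))   # detained in camp
```

`run_query` needs network access and is not called in the example; the printed queries can also be pasted into the Wikidata Query Service by hand.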


Fields Added Manually

Last Name, First Name