Showing posts with label NER. Show all posts

Dec 24, 2024

Nov 10, 2024

Experiments in Automated Entity Extraction with Pinpoint: Toledo Museum of Art Provenance PDF


Pinpoint is a tool for investigative journalists. It performs automatic entity extraction from PDF files. Can it be useful for processing provenance texts?

In this post, we examine Pinpoint's results for the Provenance Research PDF file published by the Toledo Museum of Art and archived at:

https://web.archive.org/web/20121224083005if_/http://www.toledomuseum.org:80/wordpress/wp-content/uploads/Provenance-Research-lowres.pdf

Since the Toledo Museum of Art doesn't appear to publish provenance on its website as of 2024, this older PDF file offers insight into the ownership history of artworks.