By Iain Bean, Cogapp.

This February marked my first Cogapp Company hack day — a day where we set aside our usual work and put our creative energies into something a little bit different.

The theme for this Coghack was ‘Museum APIs’, so for a bit of inspiration I had a look at a few museum collection sites. I’ve always been interested in graphic design and typography, so I was drawn to the Science Museum’s extensive collection of posters covering topics from transport to public health.

I wanted to inject some interactivity into the experience of viewing these posters, so after some thought I decided on a Hangman-style letter guessing game using Optical Character Recognition (OCR).
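As a rough illustration of the letter-guessing idea, here is a minimal sketch in Python. It assumes the poster's text has already been extracted by OCR into a plain string; the function names `mask_text` and `is_solved` and the sample text are hypothetical and are not taken from the Cogapp project.

```python
from typing import Iterable


def mask_text(text: str, guessed: Iterable[str]) -> str:
    """Hide every letter that has not been guessed yet.

    Spaces, digits and punctuation are left visible, as in Hangman.
    """
    guessed_lower = {g.lower() for g in guessed}
    return "".join(
        ch if (not ch.isalpha() or ch.lower() in guessed_lower) else "_"
        for ch in text
    )


def is_solved(text: str, guessed: Iterable[str]) -> bool:
    """Return True once every letter in the text has been guessed."""
    guessed_lower = {g.lower() for g in guessed}
    return all(ch.lower() in guessed_lower for ch in text if ch.isalpha())


# Hypothetical OCR output for a poster (not real collection data).
poster_text = "KEEP CALM"
print(mask_text(poster_text, {"e", "a"}))
```

Matching is case-insensitive here, so a guess of "e" reveals both "E" and "e" in the poster text.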

… Read the full post by Iain Bean on the Cogapp blog.

Dutia, K. and Stack, J., Heritage Connector: A Machine Learning Framework for Building Linked Open Data from Museum Collections. Applied AI Letters. 2021;e23.


As with almost all data, museum collection catalogues are largely unstructured, variable in consistency and overwhelmingly composed of thin records. The form of these catalogues means that the potential for new forms of research, access and scholarly enquiry that range across multiple collections and related datasets remains dormant. In the project Heritage Connector: Transforming text into data to extract meaning and make connections, we are applying a battery of digital techniques to connect similar, identical…

A conversation between Tim Boon and Kalyan Dutia

Tim: Let’s start by explaining why we’re here and who we are. The Science Museum Group has a current research project, Heritage Connector, which is bringing the power of modern computing to address some long-standing issues to do with how we record information about the objects in our collections. One way of describing the aim of the project is that we want to make it easier for people to find and appreciate these millions of objects, pictures and documents.

I come to this as a Science Museum curator and historian with several…

Rhiannon Lewis and John Stack

The Heritage Connector project explores how AI-generated knowledge graphs can facilitate new forms of exploration, discovery and research for digitised cultural heritage collections. As covered in our previous blog post, all collection catalogue data:

  • is inherently reductive;
  • is uneven, incomplete and forever a work in progress;
  • includes biases, both in content (where are there omissions?) and in cataloguing, which inevitably gives prominence to particular areas over others;
  • is structured in a tabular format (usually in a relational database system).

As Windhager notes in Visualization of Cultural Heritage Collection Data: State of the Art and Future Challenges, ‘CH [cultural heritage] collections are assemblages inherited from the past, experienced in the present, and preserved for the future.’

Continue reading on the Heritage Connector blog…

The ai4lam (Artificial Intelligence for Libraries, Archives and Museums) community call at 16.00 GMT on 16 February 2021 featured lightning talks on named entity recognition.

Kalyan Dutia presented the talk Combining NER and Knowledge Graphs in the Heritage Connector Project.

Watch the recording on the Heritage Connector blog

Between 2019 and 2021 the Science Museum Group is digitising hundreds of thousands of objects from its remarkable collection as they are moved to a new, purpose-built store in Wiltshire.

As each of these objects is added to the collection website, you could be the first person to see it published online through a web page that displays objects with a total lifetime page view count of zero.

Launch site

Find out more about the digitisation project and Never Been Seen.


Thanks to jamieu for help with the JavaScript :-)

Built using the Science Museum Group’s Collection Online API.

Code licensed under the open source MIT License.

Kalyan Dutia

As part of the Heritage Connector project we’re seeking to create new links between collection items and collections at scale by making use of existing metadata and mining structured data from text, as well as using Wikidata as a centralised point of connection between collections. These challenges require a set of technologies beyond those found in existing collections management systems. This blog post describes exactly which technologies we’re using and how we’re using them.

Continue reading on the Heritage Connector blog

Rhiannon Lewis and John Stack

The Heritage Connector project seeks to understand how existing digital tools and methods can be used to build relationships at scale between digitised collection objects that are inconsistently, and at times thinly, catalogued. Online collections have been with us for around twenty years now, and their digitisation has enabled access to databases with a wealth of collections knowledge. However, these databases have determined, and limited, how this collection knowledge was structured and accessed. Machine learning presents an opportunity to build links at scale through knowledge graphs between Wikidata and museum collections, so that we can begin to acknowledge and overcome these limitations.

Continue reading on the Heritage Connector blog…

On Friday 19 June 2020, the Science Museum hosted a free, public webinar on Wikidata and cultural heritage collections.

This was the first in a series of convenings as part of the Heritage Connector project.

Recordings of the webinar are available online.

Section of a CB1 manual telephone exchange switchboard, 1925–1960, Science Museum Group Collection, CC BY-NC-SA 4.0

Although museums have extensive displays and exhibition programmes, it is usual for them also to have significant numbers of collection objects in storage. These objects are available for loan and for research purposes. As the Science Museum Group moves over 300,000 objects to a new storage facility, we are photographing, cataloguing and publishing these objects online.

Because of the need for a rapid digitisation programme, the approach is necessarily one of breadth rather than depth. We have therefore begun to explore the opportunities for artificial intelligence to add descriptive metadata keywords for the digitised objects.

Amazon’s Rekognition service was used to…

John Stack // Digital Director of the Science Museum Group
