Introduction to Transkribus and OCR

This two-part workshop will explain and demonstrate Transkribus, a Handwritten Text Recognition (HTR) platform that has become a popular tool for making historical documents more readable and accessible. Transkribus currently has over 1,700 regular users representing 80 institutions, and is regularly used in crowdsourcing projects across a range of collections. The sessions are led by AHRC-funded PhD student Joe Nockels, who has recently published work using Transkribus on National Library of Scotland (NLS) material in cooperation with the software's developers at the University of Innsbruck. They aim to help you avoid the common pitfalls of automatic transcription, so that you can work creatively without worrying about errors.



Guidance and support will be given on downloading the software. The first session will then cover how HTR technologies have filled the gaps left by Optical Character Recognition (OCR), and how the two differ as tools. A demonstration will follow of how to upload documents to Transkribus, how to segment them into lines for transcription, and how to train and run an HTR model to produce transcripts automatically. Other powerful functions of Transkribus, such as the keyword spotting tool, will be highlighted, and further resources will be provided.



The second session will ask participants to feed back on their experiences of using the software. This will be followed by a facilitated discussion of how to use Transkribus to produce a scholarly edition, or to make personal materials such as old postcards or letters more readable.



Transkribus is a community project at heart, and any transcription made, even of a few pages, furthers the developers' efforts to produce accurate software. Taking part in these sessions will not only introduce you to an essential transcription tool but also enable others to improve their own projects on the back of your work.



Due to high demand for our training events, our cancellation and no-show policy applies to bookings for this event.
