Working on Digitised Manuscripts with Transkribus

This is the third workshop in the digitised documents series.

This workshop explains and demonstrates Transkribus, a Handwritten Text Recognition (HTR) platform that has been widely used since its release to make historical documents more readable and accessible. Transkribus currently has over 1,700 regular users, representing 80 institutions, and is regularly used in crowdsourcing projects on a range of collections. The workshop is led by AHRC-funded PhD student Joe Nockels, who has recently published work using Transkribus on National Library of Scotland (NLS) material in cooperation with the software's developers at the University of Innsbruck. The two sessions will help you avoid the common pitfalls of automatic transcription, so that you can work creatively without worrying about errors.

The first part will cover how HTR technologies have served to fill the gaps left by Optical Character Recognition (OCR) and how they differ as tools. Then we will see how to upload documents to Transkribus; how to segment them into lines to be transcribed; and how to train and run an HTR model to automatically produce transcripts. Other powerful functions of Transkribus will be highlighted, such as the keyword spotting tool, and further resources will be provided.  

Transkribus is a community project at heart, and any transcription you make, even of a few pages, furthers the effort to produce accurate software. Taking part in these sessions will not only introduce you to an essential transcription tool but also help others improve their own projects on the back of your work.

This is an intermediate-level workshop. Intermediate sessions explore specific aspects of the method (libraries, tools etc.) and offer a more in-depth understanding of the workshop topics. Some previous knowledge of digitised documents is required to follow the content. If you want to familiarise yourself with the basics of working with digitised documents, you can attend our Introduction to Digitised Document Workshop.

Those who have registered to take part will receive an email with full details on how to join us for this workshop. 

If you’re new to this training event format, or to CDCS training events in general, read more on what to expect from CDCS training. Here you will also find details of our cancellation and no-show policy, which applies to this event. 

In the next sessions of the digitised document series, we are going to look at: 

  

Return to the Training Homepage to see other available events. 

You might be interested in

Digital Method of the Month: Machine Learning

Silent Disco: Introduction to Sentiment Analysis

Beyond Social Networks: Advanced Uses of Gephi in Humanities Research

Introduction to Topic Modelling with Bert

Introduction to Programming with R and RStudio

Advanced Uses of LLMs

Null Hypothesis Testing in R

CDCS Fika April 2025

Introduction to Bayesian Statistics