The Text & Data Analysis Summer School Programme will take place over five days. We're pleased to be able to share the course timetable, a detailed schedule for each day, and information on our speakers and workshop instructors. This page will be updated as preparations progress.
COURSE PROGRAMME
Overview
Daily Schedule
After participants have registered, we will gather for our first lecture. Dr Jessica Witte will deliver a seminar titled "Medicalized Fasting: A Sentiment Analysis of Anorexia in Victorian Medical Journals".
The first hands-on workshop will teach you how to extract information from scanned texts using the Tesseract OCR package. The main focus will be on English texts, but we will also deal with similar processes for texts written in other languages.
After lunch, the second workshop will focus on text analysis. Using the quanteda package, we will explore the text that was extracted in the morning. While we will mostly be working in English, we will also explore how to analyse content across different languages.
In the first BYOD session, participants will work together on ongoing research datasets provided in advance by the summer school attendees, focusing on good practices and troubleshooting.
The first day will conclude with a plenary lecture by Professor Melissa Terras entitled "The Boundaries of Digitised Content: designing research projects within collection constraints". The talk will be followed by a reception.
Our focus will remain on unstructured data for the second day of the Summer School.
In the introductory lecture, Dr Justin Chun-Ting Ho will talk about his research on social media and nationalism.
The first workshop of the day will teach you how to crawl and extract data from the web using rvest and Selenium. We will explore different techniques, different website structures, and how to solve common problems encountered in Web Scraping.
In the afternoon, the material gathered during the morning workshop will be used as the basis for a session on how to conduct sentiment analysis across textual data.
The day will conclude with a BYOD session. Please make sure to contact the organisers in advance if you want your data to be featured.
The third day of the Summer School will mark a change of focus to structured-data analysis.
Dr Ugur Ozdemir will deliver an introductory lecture on his work on "Basic Human Values and Populist Electoral Support".
In the morning workshop, we will have an introduction to Data Analysis with a focus on Descriptive Statistics. In this session, we will discuss how to use descriptive statistics to report important features of a sample and the general patterns shown in a dataset. We will focus on measures of central tendency and measures of variability.
After the lunch break, the second workshop will turn to inferential statistics, with a focus on hypothesis testing. It will demonstrate how to run statistical tests in R and how to decide whether a null hypothesis can be rejected.
We will wrap up with another BYOD session. Please make sure to contact the organisers in advance if you would like your own data to be featured.
We will continue to explore Structured Data Analysis on the fourth day, but with a focus on more advanced Statistical Modelling.
The morning lecture will be given by Dr Ben Collier, who will talk about his work on the evolution of cybercrimes during the pandemic.
Our morning workshop will teach you how to perform Regression Analysis. Firstly, we will introduce a variety of regression models that are used for understanding the relationship between variables in a dataset. Thereafter, we will deal with linear models and generalised linear models.
In the second workshop, we will move to mixed-effects modelling, learning how to handle data with repeated measures. Mixed-effects models are powerful statistical tools that help us understand the world better by allowing us to account for individual differences in our analyses.
As usual, if you would like to contribute to the end-of-day BYOD workshop with your data, please get in touch with the organisers.
Our final day will be entirely dedicated to good Data Visualisation.
After our last introductory lecture, we will look at the basic principles of Data Visualisation. We will then introduce the "Grammar of Graphics" and the various types of plot that can be built with it.
After the break, we will shift to more advanced Data Visualisation, using spatial data. We will work with examples of real-world data, and cover the fundamentals of using spatial data and visualising it effectively.
As usual, the day will conclude with a BYOD session, where you can explore how what you have learned can be applied to your own projects.
We will celebrate the end of the Summer School with a Ceilidh party to be held at the Teviot Debating Hall at 8 pm.
Our Speakers
Melissa Terras
Melissa Terras is Professor of Digital Cultural Heritage at the University of Edinburgh’s College of Arts, Humanities, and Social Sciences.
Her research focuses on the digitisation of cultural heritage, including its technologies, procedures, and impact, and how this intersects with internet technologies. She is the director of the Centre for Data, Culture, & Society.
Ben Collier
Dr Ben Collier is Lecturer in Digital Methods at the University of Edinburgh.
He collaborates with the Cambridge Cybercrime Centre on a number of active research projects. Ben has experience in using a range of qualitative and quantitative research methods and is particularly interested in criminological research which engages with Internet infrastructure.
Justin Chun-Ting Ho
Dr Justin Chun-Ting Ho is a postdoctoral fellow at Academia Sinica, the national academy of Taiwan. He formerly worked at Sciences Po.
His research focuses on nationalism and populism, in particular on how they are communicated via social media. It employs a range of computational methods, including computational text analysis, social network analysis, and machine learning.
Ugur Ozdemir
Dr Ugur Ozdemir is Lecturer in Quantitative Political Science at the University of Edinburgh.
His research interests include comparative political behaviour, formal models of electoral politics, and quantitative methods. He is a dedicated advocate of bridging the gap between theoretical modelling and empirical analysis.
Benjamin Bach
Dr Benjamin Bach is Lecturer in Design Informatics and Visualization at the University of Edinburgh.
His research designs and investigates interactive information visualization interfaces to help people explore, communicate, and understand data. Before joining the University of Edinburgh, Benjamin worked at Harvard University, Monash University, and the Microsoft Research-Inria Joint Centre.
Jessica C. Witte
Dr Jessica Witte is a postdoctoral fellow at the University of Edinburgh.
Her research interests include developing textual analysis tools and applying them to historical texts, particularly those concerned with the medicalization of women’s bodies. She uses these tools to better understand the epistemic dimension of women’s experiences, an understanding that can inform better interventions grounded in individualized experience and patient advocacy.
Our Practical Workshop Instructors & Helpers
Andrew McLean
Andrew is an Archaeology PhD student based at the School of History, Classics and Archaeology. His research interests currently focus on the economy of the Roman Adriatic, while his methodological approaches include GIS and statistical analysis. He is expanding on traditional Least Cost Path (LCP) analysis by using circuit theory to model maritime movement. Through this, he is familiar with QGIS, R, Circuitscape, shell scripting and programming languages such as Julia and Python.
Fang Jackson-Yang
Fang is a PhD student at the School of Philosophy, Psychology, and Language Sciences. Her research investigates how speakers encode prominent information in simulated conversations and how listeners predict upcoming utterances in comprehension. She works with both laboratory and corpus data. She conducts data analyses in R using multivariate statistical tools such as mixed-effects models.
James Besse
James is a PhD student in Science, Technology and Innovation Studies. His research covers the implementation of e-ID systems, specifically looking at the EU Settlement Scheme (EUSS). He uses text mining alongside social surveys and interviews to understand user experience with the EUSS. More broadly, James is interested in the social impacts of new technologies and in how digital methods can help to understand them. His areas of expertise include statistics, web scraping, data visualization, research design and the use of R.
Javiera Alfaro Chat
Javiera is a PhD student at the School of Philosophy, Psychology, and Language Sciences. Her research focuses on the psycholinguistic aspects of bilingual language processing. Specifically, she looks at code-switching (the mixing of two languages in one sentence) and at factors that might influence a bilingual speaker to select one language over the other in specific places in a sentence. Javi works with laboratory data, using in-person and online language experiments in conjunction with questionnaires. She conducts data analyses in R using multivariate statistical tools such as generalized mixed-effects models.
Summer School Ceilidh Party
Please join us on the final evening of the Summer School for a Scottish ceilidh, which will be an opportunity to dance and raise our glasses to celebrate what you have achieved over the course of the week!
This will be held at the Teviot Debating Hall, just a few steps from our main venue.
From 8 pm, we will dance to the music of the 7 Hills Ceilidh Band.



