RESEARCH PROJECTS

 

Green shoots continue to appear even in the most challenging circumstances. This year we have been delighted to be able to support a variety of new projects. CDCS has a text and data mining facility at the University of Edinburgh called defoe for interrogating large and heterogeneous text-based archives, and in the summer of 2020 we went ahead with our first CDCS Text Mining Lab, hosted remotely but otherwise as planned. We're currently in the early stages of our second Lab and continue to work closely with colleagues across the University of Edinburgh to develop support for this area of strategic focus. We have also supported individual projects, providing resources and guidance, and seen some of our previous pilot projects grow and bloom.

TEXT MINING LAB 2020

We organised our first CDCS Text Mining Lab with support from colleagues at the Edinburgh Parallel Computing Centre, the National Library of Scotland, and our own Library and University Collections. Thirteen projects were pitched, covering a fantastic range of research questions: from tracing the reception of writers over centuries, to exploring the depiction of Highlanders in English publications, to the portrayal of mental health issues in newspapers. We helped researchers learn about text mining scripts and develop queries, giving them insight into the requirements and challenges of computational text analysis.

Listen to Galina Andreeva (Business School) and Dave O'Brien (Edinburgh College of Art) talk about their projects and what they got out of the lab process in this video. 

TEXT MINING LAB 2021

This spring we ran another call and welcomed six new projects into our Lab. With the help of colleagues Amy Krause and Anna Roubíčková at the EPCC, we'll be supporting researchers to develop queries to extract information about: the emergence of public protest as a political tool; how the economy is framed in public discussion; the representation of Kashmir; and the reception of Latin poetry.

Alongside new projects, we are also supporting last year's projects as they develop. We have funded assistance for work on the emergence and growth of newspaper reviews and on the use of Scots in printed chapbooks, on which researchers Sarah van Eyndhoven and Lisa Gotthard will present at the upcoming International Society for the Linguistics of English Conference, organised by the University of Eastern Finland.

Rosa Filgueira

Dr Rosa Filgueira worked with us for six months in 2020, developing our text mining infrastructure and supporting our researchers with their projects during the 2020 Lab. She has now taken up a post as Assistant Professor at Heriot-Watt University and will be the 2021-2022 National Library of Scotland Digital Scholarship Fellow. Her Fellowship will develop her work with historical collections, exploring new ways to unlock the full value of the National Library of Scotland's Data Foundry collections by building a new AI toolbox called 'frances' and a web user interface that allows researchers to extract complex information from the collections.

OTHER PROJECTS WE'VE SUPPORTED

Alison Cullingford, Suzanne Trill and Cordelia Beattie viewing an Alice Thornton manuscript in Durham Cathedral

AHRC Funding Success: Alice Thornton's Books

We are thrilled for our colleagues Dr Cordelia Beattie and Dr Suzanne Trill, who have been awarded over £800,000 from the AHRC for their project ‘Alice Thornton’s Books: Remembrances of a Woman’s Life in the Seventeenth Century’. Suzanne and Cordelia approached us in 2019 looking for support to explore the potential of a digital scholarly edition of the Thornton manuscripts using TEI markup. We were delighted to be able to help them access training, technical help and software for the pilot project that laid the solid foundations for this exciting and ambitious larger project.

Beyond Humanitarian Emergencies

Led by Dr Kate Wright, working in collaboration with Dr Anouk Lang (Edinburgh), Dr Dani Madrid-Morales (Houston, Texas), and RA Dr Andrew Jones (Exeter), 'Beyond Humanitarian Emergencies' analyses 20 years of Anglophone news output to see whether the meanings commonly associated with the term 'humanitarian' are changing over time and in relation to specific issues (refugees, climate change, Covid-19), and to examine differences between news outlets around the world.

With CDCS funding, the team completed the world’s first global corpus of humanitarian news, comprising 1.6 million broadcast, print and online news texts in the English language which contain the word ‘humanitarian’. Covering a ten-year period (2010-2020) and including news from 593 media outlets across 93 countries, this dataset will be made available through DataShare. The team worked with students in Autumn 2020 to conduct some exploratory data analysis and visualisation projects, specifically on the relationship between humanitarianism and Covid-19, and are now training a neural network with subcorpora of news texts from different regions. They plan to investigate word associations and compare discursive differences in the way the news media in different locations represent humanitarian crises and action.

Understanding the Drivers of the DDI programme

Edinburgh’s current aspiration is to be a ‘data capital of Europe’ through the ‘Edinburgh City Region Deal’ – a £1.3bn investment in the area, grounded in a vision of economic prosperity brought about by data-driven innovation. The University of Edinburgh has positioned itself in a vanguard role through the Data Driven Innovation (DDI) programme, which will receive £350,000,000 of City Region Deal funding over a ten-year period of research and development.

Following on from a panel about DDI during the 2019 Data Justice Week, which highlighted public questions about the motivations and underpinning policy behind the programme, Morgan Currie, Jeremy Knox and Callum McGregor have begun a research project that seeks to understand the policy origins and the different values and goals driving DDI projects since the programme began in 2019. Support from CDCS has gone towards hiring a research assistant who has amassed primary source documents – founding policy documents and other textual artefacts behind the City Region Deal and DDI – and has begun an analysis of the origins, justifications, and aims of the DDI programme.

‘It’s all about the feelings…’

CDCS funding enabled Beverley Hood and her research assistant Alison Mayne to gather information about gendered representations of AI within written publications, as well as actual AI incarnations developed for commerce, research and cultural imaginings within film, TV, literature and art. The resulting data showed very clearly the intersectional nature of this bias, and the research is therefore continuing with an expanded scope as ‘It’s all about the feelings…’, a pilot performance project exploring bias within AI sentiment recognition systems.

The project has now received further funding from the ECA RKE Fund (£2,000) and a Challenge Investment Fund (£4,869.40). It will use creative practice-based research methods to develop a pilot digital performance exploring new critical ways to make visible and discuss the biases being perpetuated within sentiment recognition systems. The project aims to create positive digital literacy around the use of sentiment recognition and the challenges of algorithmic bias. It is intended as a first step towards an ambitious, large-scale touring performance work, involving an interdisciplinary group of researchers and external partners (Hood, Hill, Catanzariti, Goldsmith, Experiential AI Research Group and Tramway), which would take these pressing themes and concerns from academia to the tech industry and the wider general public.

Using social media to understand business dynamics during the Covid-19 pandemic

Covid-19 and the lockdown have dramatically affected business activities across the globe. This has raised many questions about what determines the resilience of different businesses. Traditionally, businesses are evaluated using financial statements, but these are submitted only once a year. This project, led by Dr Galina Andreeva, investigates the potential value of novel sources of up-to-date information, such as Twitter and newspapers, which offer an invaluable resource for tracking changing sentiments and attitudes.

CDCS funding has facilitated the development and presentation of the traditional benchmark model for Scottish tourism and hospitality, one of the most severely disrupted sectors. It has also been used to purchase a Twitter developer licence, enabling data collection and the development of programmatic resources. The research will now concentrate on exploring how this information can enhance the benchmark business evaluation model, and on developing a set of Jupyter notebooks with Python code.

Improvbot.AI 

In 2020, for the first time since it began in 1947, the Edinburgh Festival Fringe (as we know it) did not go ahead. CDCS Director Prof. Melissa Terras, fellow researchers, and the Improverts – the Fringe’s longest running improv comedy group – responded to the situation by planning to provide festival entertainment via Twitter. They created a bot that would generate event blurbs using AI technology, producing an imagined Festival Fringe programme and inspiring improvised sketches that were shared online throughout August.