
A Gentle Introduction to Coding for Data Analysis

10-14 June 2024

This course is designed for researchers who are complete beginners with no prior knowledge of coding and data analysis. Through lectures and exercises, attendees will learn how to code in Python, starting from core concepts such as variables and loops, through to coding live data visualisation.

The course has a practical focus, with all coding happening in student pairs with support from teachers and instructors.

Overview

This course explores the basics of programming: variables, functions, loops, operating on data structures, data wrangling, visualisation, and publishing to the web. By the end of the course, attendees will understand how to bridge the gap between humans and computers, and how to apply the skills they have learnt to their own data analysis and research.

Key Principles:

  • No previous coding experience is required.
     
  • Students are encouraged to collaborate with one another to achieve goals and tasks. Instructors will be available in the room to help.
     
  • The course combines mini-lectures, guided coding examples and programming challenges.
     
  • Students will work towards describing and visualising data.
     
  • The course provides the foundations needed to continue your coding journey through self-study.

Pair Programming

 

For this stream, we are going to use a technique called Pair Programming, where two developers work together on the same task. One person writes the code (the driver) while the other reviews each line and provides feedback (the navigator). This approach differs from traditional solo programming and offers various advantages.

The Driver's responsibility is to focus on the mechanics of operating the computer and entering code. The Navigator's responsibility is to think about what needs to be done and where we are going. The two will communicate continually and shift roles frequently, probably every few minutes.


 

 

Discover more about Pair Programming

Daily Schedule

Monday

Registration and Welcome

Morning Sessions:

Introduction to Python and Noteable: To kick off the week, you will be introduced to the summer school and given an overview of everything you will learn during the week. We will then look at what Python is and where it came from, as well as how the University of Edinburgh Noteable service makes getting started with programming in Python even easier.

Afternoon Sessions:

Conditions and Logic: The first key concepts that we will come across in this gentle introduction are ‘conditions and logic’. In the world around us we take instructions for granted, but getting a computer to understand when to execute code and when not to means we must think in a very basic, intuitive way. This session will cover how computers handle these concepts and how to put them into practice in Python.
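To give a flavour of what this looks like in practice, here is a minimal sketch of a condition in Python. The variable names and values are made up for illustration and are not taken from the course materials.

```python
# Illustrative values only; not taken from the course materials.
temperature_celsius = 18
is_raining = False

# Each condition is a question with a True/False answer.
# Python checks them from top to bottom and runs the first branch that is True.
if temperature_celsius > 20 and not is_raining:
    advice = "Walk up Calton Hill"
elif is_raining:
    advice = "Take an umbrella"
else:
    advice = "Bring a jumper"

print(advice)  # Bring a jumper
```

Changing `is_raining` to `True` and re-running the code is a quick way to see a different branch being chosen.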

Keynote Lecture:

Prof. Melissa Terras

Title: How do the Humanities Keep Up with AI? Opportunities and Issues for Research

This talk explores the evolving dynamics between AI technologies and the humanities, asking how traditional fields like literature and history can integrate AI to enhance scholarly research and cultural understanding. It will discuss existing and potential methodologies, interdisciplinary collaborations, and the critical role humanistic inquiry plays in guiding the ethical development and application of artificial intelligence.

Melissa Terras is Professor of Digital Cultural Heritage within Design Informatics at the University of Edinburgh, UK. She is Director of Creative Informatics, the Edinburgh based AHRC Creative Cluster (2018-2024) supporting innovation in creative and cultural contexts, and a founding Director of Transkribus, the AI-powered platform for text recognition of historical documents.

Tuesday

Morning Seminar:

For our first seminar of the week, we will hear from Ozan Evkaya.

Morning Sessions:

Functions: Having got to grips with how to run code and understand data types in Python, the next thing we will consider is how to make code that does something repetitive. Typically, in life we want to run analysis on multiple rows in a data set or produce the same types of outputs for subtly different questions. Functions will become our superpower in doing this!
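As a small, hedged illustration of the idea, the sketch below defines one function and reuses it on two different lists. The function name `mean` and the numbers are assumptions chosen for this example.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Made-up measurements, used only to show the function being reused.
heights_cm = [150, 160, 170]
weights_kg = [55.0, 65.0]

print(mean(heights_cm))  # 160.0
print(mean(weights_kg))  # 60.0
```

The calculation is written once and then applied to as many data sets as needed, which is what makes functions so useful for repetitive analysis.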

Afternoon Sessions:

After the second session on functions, we are going to focus on dates and times.

Often data in research is a lot more than just numbers and ‘Yes/No’. Perhaps you deal with geographical data, or times and dates. In this session, we will dive into the different data types you might come across when using Python and gain an understanding of how to deal with these – addressing how they behave differently from one another.
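For instance, dates are not plain numbers, and Python treats them accordingly. The sketch below uses the standard `datetime` module with the course dates from the top of this page. It is only one possible way to work with dates and may differ from what is shown in the session.

```python
from datetime import date, timedelta

# The course dates, taken from the top of this page.
first_day = date(2024, 6, 10)
last_day = date(2024, 6, 14)

# Subtracting two dates gives a timedelta (a duration), not a plain number.
duration = last_day - first_day
print(duration.days)  # 4

# Adding a duration to a date gives another date.
print(first_day + timedelta(days=7))  # 2024-06-17

# Text and numbers also behave differently under the same operator.
print("10" + "14")  # 1014
print(10 + 14)      # 24
```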

Wednesday

Morning Seminar:

For our second presentation of the week, we will be joined remotely by Hannah Claus, a Research Assistant at the Ada Lovelace Institute.

Morning Sessions:

Collections: As we reach the halfway point of the week, we will start to delve a little deeper into more complex data structures, the first of which will be collections. Often, we want to group different types of objects together to see patterns and trends in our data. For this, understanding how collections work in Python is very important.
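As a rough sketch, Python's four built-in collection types look like this. The example values are illustrative, and the session itself may cover a different selection.

```python
# list: ordered and changeable
session_titles = ["Functions", "Collections", "Loops"]

# tuple: ordered but fixed once created
start_time = (9, 40)

# set: unordered, and duplicates are dropped automatically
languages = {"Python", "R", "Python"}

# dict: looks up a value by its key
session_day = {"Functions": "Tuesday", "Loops": "Wednesday"}

session_titles.append("List Comprehensions")
print(len(session_titles))   # 4
print(len(languages))        # 2
print(session_day["Loops"])  # Wednesday
```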

Afternoon Sessions:

List Comprehensions: With our new knowledge of collections, we will then dive into the concept of list comprehensions. This distinctive feature of Python offers a cleaner, more succinct way to scan through larger objects and pull out what you need.
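A minimal sketch of the syntax, using made-up scores and names:

```python
# Made-up data for illustration.
scores = [42, 67, 88, 51, 73]
names = ["ada", "grace", "alan"]

# Keep only the scores that are 60 or above.
passed = [score for score in scores if score >= 60]
print(passed)  # [67, 88, 73]

# Build a new list by transforming every item.
capitalised = [name.capitalize() for name in names]
print(capitalised)  # ['Ada', 'Grace', 'Alan']
```

Each comprehension reads almost like a sentence: "give me `score` for every `score` in `scores` if it is at least 60".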

Loops: Bringing Wednesday to a close is the topic of loops. Often there is a desire to repeat some process over and over to make data cleaner and life easier. To this end, ‘looping’ through some code is a great way to do so without typing the same thing out hundreds of times.
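For example, a `for` loop can tidy up every entry in a list with a single block of code. The messy city names below are invented for this sketch.

```python
# Invented, deliberately messy input.
raw_cities = ["  edinburgh ", "GLASGOW", " dundee"]

clean_cities = []
for city in raw_cities:
    # strip() removes surrounding spaces; title() fixes the capitalisation.
    clean_cities.append(city.strip().title())

print(clean_cities)  # ['Edinburgh', 'Glasgow', 'Dundee']
```

The same three lines would work unchanged on a list of three cities or three thousand.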

Thursday

Morning Sessions:

Interlacing Loops, Functions and Lists: As we round off the foundational skills required to use Python effectively and efficiently, we look at the process of interlacing loops, functions and lists. Using all three in conjunction with each other is a powerful way to make code easy to understand and run.
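One way the three pieces can fit together is sketched below: a function makes a decision about a single value, a loop applies it to every value, and a list collects the results. The function `classify`, its `pass_mark` parameter and the scores are hypothetical.

```python
def classify(score, pass_mark=50):
    """Label a single score as 'pass' or 'fail'."""
    if score >= pass_mark:
        return "pass"
    return "fail"


scores = [35, 50, 72]  # made-up scores

labels = []
for score in scores:
    labels.append(classify(score))

print(labels)  # ['fail', 'pass', 'pass']
```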

Data Importing and Handling: Before we can apply all the skills learnt so far in the week, this session will provide an easy-to-follow overview of how to import different data files in Python. To do this we will use a package called ‘pandas’. This session will therefore introduce both the idea of bringing other people’s functions into Python through packages and the use of ‘pandas’ for data handling.
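As a hedged sketch of what importing data with pandas can look like: `pd.read_csv` normally takes the path to a file, but here the (invented) data is held in a string so the example runs without any file on disk.

```python
import io

import pandas as pd  # 'import' brings a package's functions into our code

# Invented data, written out as CSV text for this sketch.
csv_text = """city,year,visitors
Edinburgh,2023,120
Glasgow,2023,95
Edinburgh,2024,130
"""

# With a real file this would be something like pd.read_csv("my_data.csv").
visits = pd.read_csv(io.StringIO(csv_text))

print(visits.shape)          # (3, 3)
print(list(visits.columns))  # ['city', 'year', 'visitors']
print(visits.head())
```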

 

Afternoon Sessions:

Data Cleaning, Summaries and Overview: Having got to know pandas, we will then look at a simple data set and start to understand how one might filter and wrangle data to get it into good shape for analysis. We will then touch upon functions from ‘pandas’ which allow us to obtain data summaries and other useful overviews.
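A short sketch of filtering and summarising with pandas follows. The table is invented for illustration, and the session may use different functions from those shown here.

```python
import pandas as pd

# Invented data with one missing value.
visits = pd.DataFrame({
    "city": ["Edinburgh", "Glasgow", "Edinburgh", "Dundee"],
    "visitors": [120, 95, None, 40],
})

# Cleaning: drop the rows where the visitor count is missing.
cleaned = visits.dropna(subset=["visitors"])

# Filtering: keep only the rows with more than 50 visitors.
busy = cleaned[cleaned["visitors"] > 50]
print(busy["visitors"].mean())  # 107.5

# Summaries: total visitors per city, then an overall overview.
print(cleaned.groupby("city")["visitors"].sum())
print(cleaned["visitors"].describe())
```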

Friday

Morning Seminar:

As we begin our final day, we will hear from Paweł Orzechowski, a lecturer at the Usher Institute at the University of Edinburgh.

Morning Sessions:

Data Visualisation Basics: In this session, we will look at different ways in which we can start to visualise the data (prepared the day before), and at some of the many packages available for doing this in Python.
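As one possible illustration, the sketch below draws a simple bar chart with matplotlib. The choice of matplotlib is an assumption, since this page does not name the plotting packages used in the session, and the numbers are made up.

```python
import matplotlib

matplotlib.use("Agg")  # draw off-screen so the sketch runs without a display

import matplotlib.pyplot as plt

# Made-up numbers for illustration.
days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
hours_coding = [3, 4, 5, 5, 4]

fig, ax = plt.subplots()
ax.bar(days, hours_coding)
ax.set_xlabel("Day")
ax.set_ylabel("Hours spent coding")
ax.set_title("A first bar chart")
```

In a notebook environment the `matplotlib.use("Agg")` line is usually unnecessary, because the chart is displayed below the cell. Calling `fig.savefig("chart.png")` would write the chart to an image file instead.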

Afternoon Sessions:

Data Visualisation in Practice: In our last teaching session, we will look at the latest tools for making data visualisation in Python as easy and as well presented as possible, and provide participants with the resources needed to take their Python training further.

Next Steps: In the final session of the summer school, together with the attendees of the other stream, we will discuss the results of the week and the next steps you can take to continue developing your computational skills.

| Time | Monday | Tuesday | Wednesday | Thursday | Friday |
|---|---|---|---|---|---|
| 09:00-09:30 | Registration | | | | |
| 09:30-09:40 | Welcome | Setting Up | Setting Up | Setting Up | Setting Up |
| 09:40-10:40 | Introduction to Python and Noteable | Seminar | Seminar | Interlacing Loops, Functions and Lists | Seminar |
| 10:40-11:00 | Coffee | Coffee | Coffee | Coffee | Coffee |
| 11:00-12:30 | Introduction to Python and Noteable | Functions | Collections | Data Importing, Handling | Data Visualisation Basics |
| 12:30-13:30 | Lunch | Lunch | Lunch | Lunch | Lunch |
| 13:30-15:00 | Conditions and Logic | Functions | List Comprehensions | Data Summaries and Overviews | Data Visualisation in Practice |
| 15:00-15:30 | Coffee | Coffee | Coffee | Coffee | Coffee |
| 15:30-17:00 | Keynote | Dates and Times | Loops | Data Cleaning | Next Steps |

Evening: Reception, Ceilidh Club, Pub Quiz

Green: Room 2.55 or Room 1.55 in Wing A
Grey: Teaching Rooms, 1.50 and 1.52 in Wing B
Yellow: Small Events Space, Room 4.55 in Wing A
Blue: Social events happening outside the building

Our Speakers & Instructors

Aislinn Keogh

Aislinn is a PhD student in the Centre for Language Evolution. Her research combines behavioural experiments and agent-based modelling to investigate the role of language production biases in the emergence of linguistic structure. She is proficient in Python, R and JavaScript and is passionate about the use of simulation-based techniques for experimental design and data analysis.

Chris Oldnall

Chris is a PhD mathematics researcher with the MAC-MIGS Centre for Doctoral Training, who is affiliated with the Institute of Genetics and Cancer. His work is interdisciplinary and involves combining causal inference with genomics. He loves teaching people how to get the most out of ‘big data’ by using data analysis techniques appropriately and accurately, and, most importantly, how to implement these in Python and R.

Hannah Claus

Hannah Claus is a passionate and driven researcher and DeepMind scholar, and currently works at the Ada Lovelace Institute. Beyond the technical aspects, Hannah is deeply invested in the ethical considerations of AI, emphasising the importance of understanding intelligent systems for coexistence. Analysing the impact AI has on different societies and communities, she works on providing more accessible knowledge and skills on AI to improve AI literacy.

Martin Disley

Martin Disley is a practice-led design researcher based at the Institute for Design Informatics at the University of Edinburgh. His critical engineering studio practice blends artistic inquiry and investigative computing, producing outputs in software, film, installation, and text. His PhD research explores adversarial design and investigative aesthetics as Research through Design methods for explainability and interpretability of generative computer vision.

Prior to pursuing his PhD, he worked as a software developer for a music technology startup and as a research engineer at the University of Edinburgh.

Melissa Terras

Melissa Terras is Professor of Digital Cultural Heritage at the University of Edinburgh’s College of Arts, Humanities, and Social Sciences.

Her research focuses on the digitisation of cultural heritage, including its technologies, procedures, and impact, and how this intersects with internet technologies. She was the founding director of the Centre for Data, Culture, & Society.

Ozan Evkaya

Ozan Evkaya (FHEA) is a University Teacher in Statistics at the School of Mathematics and has taught mathematics students in higher education across a range of subjects. Outside of university teaching, Ozan is a co-organiser of the TEMSE seminars and the local GenAI group in the School of Mathematics, a co-organiser of the EdinbR group and a member of the RSS Edinburgh local community. Previously, he held postdoc positions at Padova University (2021) and KU Leuven (2020), after completing his PhD in Statistics (2018) at Middle East Technical University.

Pawel Orzechowski

Pawel is a Lecturer in Programming for Health and Social Care. He teaches programming at the Usher Institute (School of Medicine), the Edinburgh Futures Institute and the Business School.

Building on years of experience in the tech industry and teaching in coding bootcamps, Pawel will help you kickstart your coding journey.

Xan Cochran

Xan Cochran (they/them) is a Research Masters’ student in Informatics, with supervisors in Informatics and Philosophy. Their research concerns the metaphysics of ‘levels’ in scientific discourses, and combines philosophical analysis with the computational modelling of epistemic communities. They hold degrees in English Literature and Developmental Linguistics, and have for the last decade been working as a tutor for the University, having tutored in the Schools of Informatics; Philosophy, Psychology, and Language Sciences; Social and Political Sciences; Biological Sciences; Design; Music; Physics and Astronomy; Mathematics; Economics; and Geosciences. In their spare moments, they paint and write science fiction.

Jessica Teed

Jess is a PhD student in the School of Philosophy, Psychology, and Language Sciences. She is interested in exploring how the brain processes visual information and memories using behavioural paradigms combined with neuroimaging techniques such as functional MRI, transcranial magnetic stimulation, and electroencephalography. Jess's research focuses on the functional organisation of visual representations including object perception and visual imagery, using Python and R for experimental design and analysis.