Explore Unstructured Data: The Secret World of XML

In Person

eXtensible Markup Language (XML) is one of the secret ingredients of modern computing infrastructure. Although little known, it underpins all sorts of critical digital systems. A Word document or Excel spreadsheet is in fact built from XML data. SharePoint is, at its core, an XML data platform. Many mainstream applications use XML in one form or another.

In Digital Scholarship, XML is used in a vast range of fields: for example, text transcription (TEI-XML), mathematical notation (MathML), and metadata in various guises, such as bibliographic and archival catalogues (Dublin Core, Bib-XML, MODS/METS, etc.). It is, in short, an essential piece of digital apparatus with a wide range of roles and potential uses.

Understand XML and you will gain access to a niche but very potent digital framework.

The session will provide an overview of XML data structures and some background on the technology. It will also offer a brief introduction to constructing XML data sets.
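As a rough illustration of what constructing a small XML data set can look like, here is a minimal sketch using Python's standard-library xml.etree.ElementTree module. The catalogue, record, title and date names and their values are made up for this example and are not taken from the workshop materials.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalogue: element names, attribute names and values are
# illustrative assumptions, not part of the workshop materials.
catalogue = ET.Element("catalogue")

# Each record is a child element carrying an "id" attribute.
record = ET.SubElement(catalogue, "record", attrib={"id": "r1"})
ET.SubElement(record, "title").text = "Letters from Edinburgh"
ET.SubElement(record, "date").text = "1850"

# Serialise the tree to a string (no XML declaration with encoding="unicode").
xml_text = ET.tostring(catalogue, encoding="unicode")
print(xml_text)
```

The nesting of elements in the code mirrors the nesting in the resulting document: a catalogue contains records, and each record contains its own fields.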

We will also provide an introduction to XPath, the mechanism used to navigate the tree structure of XML, and a taster of the technologies that can make use of XML data, in this case XSLT (eXtensible Stylesheet Language Transformations).
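To give a flavour of XPath-style navigation, the sketch below uses Python's standard-library xml.etree.ElementTree, which supports only a limited subset of XPath. The sample document and its element names are hypothetical. XSLT is not shown because the standard library does not include an XSLT processor; a third-party package such as lxml would be one option.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: element and attribute names are illustrative only.
SAMPLE = """
<catalogue>
  <record id="r1"><title>Letters from Edinburgh</title><date>1850</date></record>
  <record id="r2"><title>Harbour Ledger</title><date>1872</date></record>
</catalogue>
"""

root = ET.fromstring(SAMPLE)

# ".//title" selects every <title> element anywhere beneath the root.
titles = [t.text for t in root.findall(".//title")]

# "[@id='r2']" is a predicate: keep only the record whose id attribute is "r2".
r2 = root.find("./record[@id='r2']")
r2_date = r2.find("date").text

# "[date='1850']" keeps records that have a <date> child with that exact text.
records_1850 = root.findall("./record[date='1850']")

print(titles)
print(r2_date)
print([r.get("id") for r in records_1850])
```

Each path expression walks from the root down through child elements, with predicates in square brackets narrowing the selection.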

This is an intermediate-level workshop. In order to undertake this session, it is recommended that you have a basic understanding of Python and the Python command line. You can build up familiarity with Python by attending our Introduction to Programming with Python course.

A background in, and some familiarity with, HTML would also be useful, as it shares many principles with XML. Some aspects of working with HTML will be covered in the course Build Your Personal or Project Website with GitHub Pages. If you want to familiarise yourself with HTML, you can have a look at the W3Schools tutorials.

XML is useful for anyone interested in the following:

  • Text analysis, text mining, text production (transcriptions, scholarly editions, including non-extant languages).
  • Non-relational data structures.
  • Archival or collection metadata.
  • Semantic web technologies.
  • Modern computing infrastructure.

 

This workshop will be taught by Ed MacKenzie.

Those who have registered to take part will receive an email with full details on how to get ready for this workshop.

If you’re new to this training event format, or to CDCS training events in general, read more on what to expect from CDCS training. Here you will also find details of our cancellation and no-show policy, which applies to this event.

Return to the Training Homepage to see other available events.

Room 4.35, Edinburgh Futures Institute

This room is on Level 4, in the North East side of the building.

When you enter via the level 2 East entrance on Middle Meadow Walk, the room will be on the 4th floor straight ahead.

When you enter via the level 2 North entrance on Lauriston Place underneath the clock tower, the room will be on the 4th floor to your left.

When you enter via the level 0 South entrance on Porters Walk (opposite Tribe Yoga), the room will be on the 4th floor to your right.
