Hands-on

This part walks through three stages of working with data, each with an example: collecting data (how do I get from sources to structured data, and what is structured data anyway?), preprocessing data (how can or must I edit the data for my own use?), and analysing data (what is the structured data good for, and what do I do with it?). We will only touch on many of the practices and concepts involved; further literature and tutorials go deeper. You may find some topics missing. That is unavoidable, but comments are always welcome.

Our source is a correspondence compiled as part of the digital edition “Der Sturm” at the Akademie der Wissenschaften und der Literatur in Mainz.1 The project, which edits letters from individuals belonging to the international avant-garde surrounding the journal “Der Sturm”, had different users in mind and offers its data in different formats:

On the project website, the letters edited so far can be read, and there is an index of the named entities in the texts, such as persons, places and works. In addition, the sources and the index data can be downloaded via an API. This means we can approach the letters via the front door (the website) or the back door (the command line) and compare the two approaches. Along the way we will touch on basic concepts for working with data and on the steps needed to get from a source to a data set.
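To give a first impression of what "getting from a source to a data set" can mean, here is a minimal sketch in Python. It does not use the edition's API or its real data format: the XML fragment, its element names and all values are invented placeholders, and the helper `letter_to_record` is hypothetical. The sketch only shows the general idea of turning one marked-up letter into one structured record.

```python
import xml.etree.ElementTree as ET

# Invented, heavily simplified letter. This is NOT the format used by the
# edition; element names and values are placeholders for illustration only.
SAMPLE_LETTER = """
<letter id="example-001">
  <sender>Example Sender</sender>
  <recipient>Example Recipient</recipient>
  <place>Example Town</place>
  <date when="1914-01-01"/>
  <text>Dear friend, thank you for your last letter.</text>
</letter>
"""


def letter_to_record(xml_string):
    """Turn one marked-up letter into a flat record (a dictionary)."""
    root = ET.fromstring(xml_string.strip())
    return {
        "id": root.get("id"),
        "sender": root.findtext("sender"),
        "recipient": root.findtext("recipient"),
        "place": root.findtext("place"),
        # The date is stored in an attribute, not in the element text.
        "date": root.find("date").get("when"),
        # A very rough measure of length: whitespace-separated tokens.
        "word_count": len(root.findtext("text").split()),
    }


record = letter_to_record(SAMPLE_LETTER)
print(record)
```

A list of such records, one per letter, is already a small data set: it can be sorted by date, grouped by sender or place, or written to a table.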

Letters are a common source genre across many periods and fields, and they can be used both for textual analysis and for gathering structured data. The example is therefore meant to show processes that are useful in actual historical research.


  1. DER STURM. Digitale Quellenedition zur Geschichte der internationalen Avantgarde, erarbeitet und herausgegeben von Marjam Trautmann und Torsten Schrade. Mainz, Akademie der Wissenschaften und der Literatur, 2018. Online: https://sturm-edition.de/, licence: CC-BY-4.0.