Appendix A — Glossary

API Application Programming Interface, a facility offered by a web resource that allows search queries independent of a GUI, often performed using scripts
bash a Unix shell, the default program that interprets commands in the command line on many Linux systems
bias systematic error that results from an unbalanced sample
big data huge amount of data, identifiable through repeated freezing of your standard program when opening a file
born digital data data which originated in a digital form
CLI Command Line Interface, text interface that allows interaction with the computer; see also bash
close reading careful and attentive interpretation of a text
CMS Content Management System
Console See CLI
Crowdsourcing projects that include the active participation of the public to generate content, transcribe sources etc.
csv comma separated values, a structured text format, using commas as separators between columns
distant reading quantitative approach to huge amounts of texts, using computational methods to search for interpretable patterns
GUI Graphical User Interface
HTML Hypertext Markup Language, a structured text format, like the format this guide is written in, to render documents in a browser
Jupyter notebook web application/interactive coding environment that runs in a browser; lets you create and share code (https://jupyter.org)
machine learning umbrella term for methods that use data to learn how to perform a task and to improve their results
machine readable property of data, for example text, that has been transformed into a format processable by a computer
OCR Optical Character Recognition, process of transforming text in an image into machine-readable text
OS Operating System
open source freely available source code that can be used, modified and redistributed under the terms of its license
OSS Open Source Software
Regular Expression syntax for searching and replacing text using patterns (instead of exact matches)
terminal See CLI
web scraping extracting data from websites
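
To make two of the entries above more concrete, here is a minimal Python sketch of reading csv data and of searching and replacing with a Regular Expression. The sample data and the function names (read_rows, find_years, mask_years) are invented for illustration and do not come from this guide; treat it as one possible approach rather than a recommended workflow.

```python
import csv
import io
import re

# Hypothetical sample data, invented for illustration only.
SAMPLE_CSV = "name,year\nAda,1815\nGrace,1906\n"

# Pattern: exactly four digits standing on their own (a simple stand-in for a year).
YEAR_PATTERN = r"\b\d{4}\b"


def read_rows(text):
    """Parse csv text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def find_years(text):
    """Return every match of the year pattern in the text."""
    return re.findall(YEAR_PATTERN, text)


def mask_years(text):
    """Replace every match of the year pattern with a placeholder."""
    return re.sub(YEAR_PATTERN, "YYYY", text)


if __name__ == "__main__":
    rows = read_rows(SAMPLE_CSV)
    print(rows[0]["name"], rows[0]["year"])
    print(find_years("Born 1815, died 1852."))
    print(mask_years("Born 1815, died 1852."))
```

The pattern matches any four-digit number, not only one specific year, which is the difference between a pattern and an exact match. Note that the csv module returns every value as text, so the years would need converting before any calculation.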