This is a brief introduction to relational databases for historians, meaning any researcher who works with data that has a time dimension: historians, economic historians and cultural historians, but also, for example, archeologists, linguists and geologists.
First, this text aims to convince such historians of the usefulness of relational databases in particular settings. Relational databases are not always the best tool for the job, so it is useful to know when to use them and when not to.
Secondly, once the choice for a relational database is made, one needs to learn how to build and use it. This text does not go into that, but refers to resources known to the author.
Thirdly, it addresses the issue of how to structure the time dimension in relational databases, which may also be helpful for historian-programmers.
When to use a relational database for historical research
What is a database
- a set of files
- a spreadsheet
- an XML database
- a relational database
- Examples: MS Access, FileMaker, MySQL
Relational databases in essence
Tables: put similar things together. The things in a table are related, both horizontally (within a row) and vertically (within a column).
Relating tables to each other.
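As an illustration of tables and of relating them to each other, here is a minimal sketch using Python's built-in sqlite3 module. The table names (person, letter), the column names and the rows are invented for this example and do not come from any real dataset. Each table holds one kind of thing, and the letter table points at the person table through a key.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Two hypothetical tables: one row per person, one row per letter.
conn.executescript("""
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE letter (
    letter_id    INTEGER PRIMARY KEY,
    sender_id    INTEGER NOT NULL REFERENCES person(person_id),
    year_written INTEGER
);
""")

# Invented sample rows, for illustration only.
conn.executemany("INSERT INTO person VALUES (?, ?)",
                 [(1, "Person A"), (2, "Person B")])
conn.executemany("INSERT INTO letter VALUES (?, ?, ?)",
                 [(10, 1, 1650), (11, 1, 1652), (12, 2, 1651)])

# Relate the two tables: count the letters sent by each person.
rows = conn.execute("""
SELECT person.name, COUNT(*)
FROM letter
JOIN person ON person.person_id = letter.sender_id
GROUP BY person.person_id
ORDER BY person.name
""").fetchall()

for name, n_letters in rows:
    print(name, n_letters)
```

The name of a sender is stored once, in the person table, and every letter refers to it by number. Correcting a name therefore means changing one row rather than every letter.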
When could you typically use them, and when not.
What to put where?
Normalization … do it but don’t take it too far.
Design considerations: what is the purpose? How are you going to use the data? Who is going to use the data? Technical stuff (capacity of the computer, speed of the connection, etc.). Interface design.
General principles / rules of thumb
- Never stop using your own brain – but don’t trust it when you’re a beginner
I.e. think about whether whatever you read applies to your own situation. Tweak and adapt it as necessary. However, if you are new to this, it is better to try the suggestions as given first, so that you do not jump to adaptations too quickly.
- Better store what you know and keep it separate from what you guess. What you guess may be stored, but you’d better turn it into calculations.
- Store the quote, or better the quote in context, separately from how you code it into stored data. This helps with later maintenance.
- Add note fields, and police them, especially in multi-user contexts
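The rules of thumb about separating what you know from what you guess can be sketched as follows. This is only one possible reading, and all names (PersonRecord, stated_age, birth_year) and the sample quote are hypothetical. The record stores the source quote and the facts the source states; the guess, here an estimated birth year, is not stored but calculated when needed.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PersonRecord:
    source_quote: str                         # the quote, kept apart from its coding
    year_of_record: int                       # known: year the source was written
    stated_age: Optional[int] = None          # known only if the source gives an age
    stated_birth_year: Optional[int] = None   # known only if the source gives it
    note: str = ""                            # free-text note field


def birth_year(record: PersonRecord) -> Tuple[Optional[int], str]:
    """Return (year, kind); kind is 'stated', 'estimated' or 'unknown'."""
    if record.stated_birth_year is not None:
        return record.stated_birth_year, "stated"
    if record.stated_age is not None:
        # A guess, derived by calculation rather than stored as a fact.
        return record.year_of_record - record.stated_age, "estimated"
    return None, "unknown"


# Invented example record.
example = PersonRecord(source_quote="aged about 30 years",
                       year_of_record=1700, stated_age=30)
print(birth_year(example))
```

If the coding of the quote later turns out to be wrong, the quote is still there to check against, and the estimate changes automatically once the stored facts are corrected.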
Programming languages and database software usually offer variables and/or fields to store dates. This seems straightforward, but for many historical applications, and for applications covering certain present-day cultures or countries, such fields are hardly sufficient or even problematic. Focusing on historical database applications, there are a number of things to consider: the formatting of the date field, the calendar, and different forms of uncertainty. The following sections deal with these topics.
- Formatting, especially in international context
- Day of the week
- Date fields do not necessarily go back very far in time; what to do then?
Date fields are not always suitable because they require a specific date. Quick-and-dirty (QaD) solution: use the first of the month if you do not know the day, or the first of the year if you do not know the month.
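A minimal sketch of the QaD solution in Python follows. The function name to_stored_date and the accompanying precision label are assumptions added for illustration, not part of the solution as described: the label records which parts of the date were actually known, so that a padded first of January can later be told apart from a real one. Note also that Python's date type only covers the years 1 to 9999 and assumes the Gregorian calendar throughout.

```python
from datetime import date
from typing import Optional, Tuple


def to_stored_date(year: int,
                   month: Optional[int] = None,
                   day: Optional[int] = None) -> Tuple[date, str]:
    """Pad an incomplete date to a full one and report its real precision."""
    if month is None:
        # Month unknown (any day given is ignored): use the first of the year.
        return date(year, 1, 1), "year"
    if day is None:
        # Day unknown: use the first of the month.
        return date(year, month, 1), "month"
    return date(year, month, day), "day"


print(to_stored_date(1848))         # only the year is known
print(to_stored_date(1848, 3))      # year and month are known
print(to_stored_date(1848, 3, 13))  # the full date is known
```

Storing the precision in a separate field next to the date keeps the date field sortable while avoiding false exactness.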
Uncertainty – the source is not specific
Uncertainty – unknown offset
To do: perio.do