diff --git a/Ddf_documentation.md b/Ddf_documentation.md
index 6049ee8c203f3e14d13fa2c56808e223547a4865..afe756c8a2b103a0403329d32246ccfa66cfc799 100644
--- a/Ddf_documentation.md
+++ b/Ddf_documentation.md
@@ -266,11 +266,11 @@ A new column `BKO_id` is added to the `BKO` dataframe which aligns `Work_ref` in
 
 [<sup id="fn1">1</sup>](#inline1) Georg Klingenberg, "Die ROMTEXT-Datenbank," _Informatica e diritto_ 4 (1995): 223-232.
 
-[<sup id="fn2">2</sup>](#inline2) Theodor Mommsen & Paul Kruger, _Corpus Iuris Civlis. Editio stereotypa quinta. Vol 1: Institutiones. Digesta._ Berlin: Weidmann, 1889.
+[<sup id="fn2">2</sup>](#inline2) Theodor Mommsen and Paul Krüger, _Corpus Iuris Civlis. Editio stereotypa quinta. Vol 1: Institutiones. Digesta._ Berlin: Weidmann, 1889.
 
 [<sup id="fn3">3</sup>](#inline3) Friedrich Bluhme, "Die Ordnung der Fragmente in den Pandectentiteln: Ein Beitrag der Entstehungsgeschichte der Pandecten," _Zeitschrift der Savigny-Stiftung für Rechtsgeschichte_ 4 (1820): 257-472.
 
-[<sup id="fn4">4</sup>](#inline4) Tony Honore, "Justinian's Digest: The distribution of authors and works to the three committees," _Roman Legal Tradition_ 3 (2006): 1-47.
+[<sup id="fn4">4</sup>](#inline4) Tony Honoré, "Justinian's Digest: The distribution of authors and works to the three committees," _Roman Legal Tradition_ 3 (2006): 1-47.
 
 [<sup id="fn5">5</sup>](#inline5) Dario Mantovani, _Digesto e masse bluhmiane_. Milan: Giuffré, 1987.
 
diff --git a/sql/README.md b/sql/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..efaf7b0a3ea737b781de68053776042c123bd29f
--- /dev/null
+++ b/sql/README.md
@@ -0,0 +1,156 @@
+## Relational database of Justinian's _Digest_ in SQLite
+
+The _Digest_ is the definitive Roman law compendium compiled under emperor Justinian I (533 CE). The text is arranged in 50 books, 432 thematic sections, 9132 quoted passages and 21055 text units. 
This relational database presents the _Digest_ in interlinked data tables chained together by unique keys. The `digest.db` database can be used for generating advanced analytical insights about Roman law as represented by the _Digest_ and for searching text and associated data in a structured and efficient manner.
+
+### 0. Table of contents
+
+[1. Instructions for using `digest.db`](#1-instructions-for-using-digest-db)
+
+[2. Tables of the database](#2-tables-of-the-database)
+
+[3. Sample SQL queries](#3-sample-sql-queries)
+
+[4. Future steps](#4-future-steps)
+
+[5. Support for users](#5-support-for-users)
+
+[6. Footnotes](#6-footnotes)
+
+### 1. Instructions for using `digest.db`
+
+The database can be queried with `SQL` statements in four types of `SQLite` interfaces listed below. All of these interfaces are free of charge. They all support exporting the result of `SQL` queries in a `csv` flat file to be opened in a notebook, text or spreadsheet editor.
+
+#### 1.1 `Online application - SQLite online`
+
+Once downloaded to your computer, the `digest.db` file can be opened and queried in a web browser on [sqliteonline.com](https://sqliteonline.com/).
+
+#### 1.2 `Graphical user interface (GUI) - DB Browser for SQLite`
+
+The GUI of `DB Browser` supports functionalities such as colour-coding data types and linking additional databases. It benefits from an active and responsive community of developers. The application's window structure helps to formulate `SQL` queries. Instructions for installation and use can be accessed on [sqlitebrowser.org](https://sqlitebrowser.org/dl/).
+
+#### 1.3 `Command line interface (CLI) - sqlite3`
+
+The CLI is most suitable for quick reference queries and mass data manipulation. Instructions for installation can be accessed on [sqlite.org](https://www.sqlite.org/download.html).
+
+#### 1.4 `Python sqlite3 package`
+
+The database can be connected to directly from a Python script with the `sqlite3` package. 
Rather than performing standard queries, this interface is most suitable for those who wish to perform additional machine-assisted analysis.
+
+Alternatively, one may export tables and results of `SQL` statements as `csv` flat files which can be loaded as `pandas` dataframes in Python code for further processing.
+
+### 2. Tables of the database
+
+#### 2.1 `text`
+
+The core "text" table includes the 21,055 text units with foreign keys which chain text units in many-to-one relationships to supplementary information stored in other tables. The text is pulled from the Amanuensis software[<sup id="inline1">1</sup>](#fn1) incorporating the text from the ROMTEXT database.[<sup id="inline2">2</sup>](#fn2) The text itself is from Theodor Mommsen's printed edition of the _Digest_.[<sup id="inline3">3</sup>](#fn3) The numbering by books, sections, passages and text units follows the scholarly convention. In Mommsen, if a passage includes multiple sentences (here denoted as "text units"), the first one is marked by "r.", the second by "1" and so on. This zero numbering is reflected in the database with "0" replacing Mommsen's "r."
+
+#### 2.2 `jurist`
+
+The "jurist" table lists the 37 jurists quoted in the _Digest_ with estimated dates for their "birth", their most active "date" and their "death". The dates are based on the articles written on the individual jurists in the _Paulys Realencyclopädie_[<sup id="inline4">4</sup>](#fn4) and Adolf Berger's _Dictionary of Roman law_.[<sup id="inline5">5</sup>](#fn5) Dates were calculated on the assumption that jurists had a maximum life expectancy of 60 years and were most active at the age of 40. These assumptions are based on the studies by Bruce Frier[<sup id="inline6">6</sup>](#fn6) and Walter Scheidel.[<sup id="inline7">7</sup>](#fn7)
+
+#### 2.3 `book`
+
+The "book" table lists the 1381 individual books from which text units are quoted. 
The shorthand reference in the "ref" column follows the format of inscriptions attached to text units in ROMTEXT.
+
+#### 2.4 `work`
+
+The "work" table aggregates books which constitute one larger multi-volume work. There are 250 works quoted in the _Digest_.
+
+#### 2.5 `section`
+
+The "section" table includes the titles of the _Digest_'s 432 thematic sections with the id number of the text unit with which the section starts in the "text" table.
+
+#### 2.6 `bko`
+
+The "bko" table presents information related to the theory about the _Digest_'s compositional structure formulated by Friedrich Bluhme in 1820[<sup id="inline8">8</sup>](#fn8) and revised by Paul Krüger for Theodor Mommsen's edition of the _Digest_. The theory was accompanied by a tabular summary known as the Bluhme-Krüger Ordo (hence "bko") which was revised and expanded by Tony Honoré.[<sup id="inline9">9</sup>](#fn9)
+
+### 3. Sample SQL queries
+
+There are some sample queries in the `SQL_queries.txt` file to assist users unfamiliar with the `SQL` query language. These queries are ready to be copy-pasted as a multi-line `SQL` statement into the interface of your choice. The queries can be customised by replacing the relevant values: names of tables and their columns are fixed, but all other values can be changed. Please play around and [send a message](mailto:m.ribary@surrey.ac.uk) for [support](#5-2-support-for-sql-queries). 
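As an illustration of keeping table and column names fixed while changing only the values, the sketch below runs a parameterised query through Python's built-in `sqlite3` module. It is a minimal sketch under stated assumptions: the in-memory tables only mimic the `text` and `jurist` columns used in this README, the sample rows are invented placeholders rather than passages from the _Digest_, and the helper name `units_by_jurist` is hypothetical. To query the real data, the connection would instead be opened on the downloaded `digest.db` file.

```python
import sqlite3

# Assumption: a miniature stand-in for the schema, limited to the columns
# mentioned in this README. For the real data, use sqlite3.connect("digest.db").
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jurist (id INTEGER PRIMARY KEY, name TEXT, date INTEGER);
    CREATE TABLE text (id INTEGER PRIMARY KEY, jurist_id INTEGER, text TEXT);
    -- Invented placeholder rows, not quotations from the Digest.
    INSERT INTO jurist VALUES (1, 'Jurist A', 150), (2, 'Jurist B', 210);
    INSERT INTO text VALUES
        (1, 1, 'de proprietate rei'),
        (2, 2, 'usus fructus sine proprietate'),
        (3, 2, 'si quis heres');
""")


def units_by_jurist(connection, jurist_id, pattern):
    """Return (text id, text) rows of one jurist whose text matches a LIKE pattern."""
    query = """
        SELECT t.id, t.text
        FROM text AS t
        JOIN jurist AS j ON t.jurist_id = j.id
        WHERE j.id = ? AND t.text LIKE ?
        ORDER BY t.id
    """
    # The ? placeholders carry the customisable values; table and column
    # names stay exactly as written in the statement.
    return connection.execute(query, (jurist_id, pattern)).fetchall()


rows = units_by_jurist(conn, 2, "%proprieta%")
print(rows)
```

Passing the jurist id and the search pattern as `?` parameters means that only the values change between runs, while the statement itself stays untouched.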
+
+#### 3.1 Example analytical query
+
+```sql
+-- #2 COUNT THE NUMBER OF TEXT UNITS FOR EACH JURIST SORTED BY THEIR ERAS
+SELECT j.name, j.date,
+    COUNT(t.jurist_id) as number_of_textunits,
+    CASE
+        WHEN j.date < 0 THEN 'E'
+        WHEN j.date < 190 THEN 'C-'
+        WHEN j.date < 240 THEN 'C+'
+        ELSE 'P'
+    END AS era
+FROM text as t
+LEFT JOIN jurist as j
+ON t.jurist_id=j.id
+GROUP BY t.jurist_id
+ORDER BY j.date;
+```
+
+This query sorts the jurists of the _Digest_ into so-called eras: "early and pre-classical" ('E'), "early classical" ('C-'), "late classical" ('C+'), and "post-classical" ('P'). The `date` column in the `jurist` table includes the date when the jurist was most active. For the purpose of this periodisation, the query takes the year 0, the year 190 and the year 240 as the boundaries of the eras. Additionally, the query counts the number of text units authored by a particular jurist by linking the `jurist` and the `text` tables on a common key (`jurist_id`). The output is ordered by date and lists the jurists, their eras and the number of text units they have in the _Digest_ in an `SQL` table ready to be exported.
+
+The user may define different boundaries, or name the eras differently by replacing the numeric values and the encoding of eras stated in single quotation marks. Fewer or more eras can be defined by removing a `WHEN` line or adding more to the query as appropriate.
+
+#### 3.2 Example search and filter query
+
+```sql
+-- #4 WHERE DOES PAPINIAN (id=23) USE THE TERM "PROPRIETAS"?
+SELECT t.id, j.id, t.text
+FROM text AS t
+LEFT JOIN jurist AS j
+ON t.jurist_id = j.id
+WHERE (t.text like '%proprieta%') AND (j.id = 23);
+```
+
+This query returns text units in the "text" table authored by Papinian where the term "proprietas" or one of its morphological variations is used. The query joins the "text" table to the "jurist" table by the "jurist_id" foreign key in the "text" table matching the "id" in the "jurist" table. 
Papinian's "id" is "23" which is used as a filtering value in the `WHERE` clause. The other filtering value makes use of the `%` wildcard supported by `SQLite` which matches any sequence of zero or more characters. Hence, the query with `t.text like '%proprieta%'` in the `WHERE` clause returns text units where "proprieta-" appears in any of its morphological variations.
+
+A slight modification of the statement's `SELECT` clause makes it possible to count the matching text units automatically.
+
+```sql
+-- #5 HOW MANY TIMES DOES PAPINIAN (id=23) USE THE TERM "PROPRIETAS"?
+SELECT COUNT(t.id)
+FROM text AS t
+LEFT JOIN jurist AS j
+ON t.jurist_id = j.id
+WHERE (t.text like '%proprieta%') AND (j.id = 23);
+```
+
+### 4. Future steps
+
+The current version of `digest.db` is intended to be polished with input from its users. While major flaws and inconsistencies in the data were captured during the pre-processing stage, it is expected that typographical errors and some inconsistencies remain. Please leave a comment or send an [email](mailto:m.ribary@surrey.ac.uk) if you spot an error. A reporting tool or a collaborative editing method will be added in due course.
+
+The database is also intended to be enriched with additional features in its tables and additional tables including new perspectives about the textual data. One possible expansion is a high-level taxonomy of legal concepts projected onto the textual units and thematic sections which will assist topical research of Roman law.
+
+Currently there is no custom-made GUI for using `digest.db`. As the project and the database mature, an appropriate user-friendly interface and visualisation tool will be created to open up the database to those less familiar with the `SQL` query language.
+
+### 5. Support for users
+
+#### 5.1 Feedback and collaborative development
+
+Please leave a comment or send an [email](mailto:m.ribary@surrey.ac.uk) with any feedback you may have about the database. 
Features, tables and functionalities will be added to the database with input from users.
+
+In time, the database will receive a custom interface which builds on top of the `SQLite` database and the `SQL` queries. The interface will be designed according to user feedback. Please let me know what you want to see in future releases and how the database could support your research better.
+
+#### 5.2 Support for SQL queries
+
+Please leave a comment or send an [email](mailto:m.ribary@surrey.ac.uk) if you would like to request a sample `SQL` query for your research, or if you need help with adjusting one of the existing queries. These queries will be continuously added to the `SQL_queries.txt` file.
+
+### 6. Footnotes
+
+[<sup id="fn1">1</sup>](#inline1) Peter Riedlberger and Günther Rosenbaum, eds. (2020): _Amanuensis_ V5.0. München. URL: http://www.riedlberger.de/08amanuensis.html [Last accessed on 19 May 2020]
+
+[<sup id="fn2">2</sup>](#inline2) Georg Klingenberg, "Die ROMTEXT-Datenbank," _Informatica e diritto_ 4 (1995): 223-232.
+
+[<sup id="fn3">3</sup>](#inline3) Theodor Mommsen and Paul Krüger, _Corpus Iuris Civilis. Editio stereotypa quinta. Vol 1: Institutiones. Digesta._ Berlin: Weidmann, 1889.
+
+[<sup id="fn4">4</sup>](#inline4) Georg Wissowa, Wilhelm Kroll, Karl Mittelhaus, Konrat Ziegler and Hans Gärtner, eds., _Paulys Realencyclopädie der classischen Altertumswissenschaft: Neue Bearbeitung_. Stuttgart: Metzler, 1893-1980.
+
+[<sup id="fn5">5</sup>](#inline5) Adolf Berger, "Encyclopedic dictionary of Roman law," _Transactions of the American Philosophical Society_ 43 (1953): 333-809.
+
+[<sup id="fn6">6</sup>](#inline6) Bruce Frier, "Roman life expectancy: Ulpian's evidence," _Harvard Studies in Classical Philology_ 86 (1982): 213-251.
+
+[<sup id="fn7">7</sup>](#inline7) Walter Scheidel, "Roman age structure: Evidence and models," _The Journal of Roman Studies_ 91 (2001): 1-26. 
+
+[<sup id="fn8">8</sup>](#inline8) Friedrich Bluhme, "Die Ordnung der Fragmente in den Pandectentiteln: Ein Beitrag der Entstehungsgeschichte der Pandecten," _Zeitschrift der Savigny-Stiftung für Rechtsgeschichte_ 4 (1820): 257-472.
+
+[<sup id="fn9">9</sup>](#inline9) Tony Honoré, "Justinian's Digest: The distribution of authors and works to the three committees," _Roman Legal Tradition_ 3 (2006): 1-47.
\ No newline at end of file