Commit 7cd0d83f authored by Ribary, Marton Dr (School of Law)

Sample SQL queries

parent a50cf901
@@ -26,12 +26,61 @@ A copy of `digest_skeleton.db` is made in the same directory. This `digest.db` f
### 2. Instructions for using `digest.db`
The `digest.db` database can be used for generating advanced analytical insights about Roman law as represented by the _Digest_. The database can be queried with standard `SQL` statements in the three types of `SQLite` interfaces listed below.
1. `Command line interface (CLI) - sqlite3`
Instructions for installation can be accessed on [sqlite.org](https://www.sqlite.org/download.html).
2. `Graphical user interface (GUI) - DB Browser for SQLite`
Instructions for installation and use can be accessed on [sqlitebrowser.org](https://sqlitebrowser.org/dl/).
3. `Online application - SQLite online`
Instructions for installation and use can be accessed on [sqliteonline.com](https://sqliteonline.com/).
The websites of the listed applications include instructions for querying databases with `SQL` statements. All three interfaces allow exporting query results to flat files such as `csv`. An exported `csv` file can be opened in a regular spreadsheet application such as Excel or LibreOffice Calc. It can also be loaded as a `pandas` dataframe in Python for further processing.
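As a minimal sketch of the export-and-reload workflow described above, the following Python snippet uses only the standard library (`sqlite3` and `csv`). It builds a tiny in-memory stand-in for `digest.db`, so the rows, the `export_query` helper and the file name `jurists.csv` are illustrative assumptions; only the `jurist` columns `id`, `name` and `date` are taken from the sample queries.

```python
import csv
import os
import sqlite3
import tempfile


def export_query(conn, sql, csv_path):
    """Run `sql` on `conn` and write the result, with a header row, to `csv_path`."""
    cursor = conn.execute(sql)
    header = [col[0] for col in cursor.description]
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(cursor)
    return header


# Toy stand-in for digest.db: these rows are invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jurist (id INTEGER PRIMARY KEY, name TEXT, date INTEGER)")
conn.executemany(
    "INSERT INTO jurist (id, name, date) VALUES (?, ?, ?)",
    [(1, "Jurist A", -50), (2, "Jurist B", 150), (3, "Jurist C", 220)],
)

# Export a query result to a csv file in a temporary directory.
csv_path = os.path.join(tempfile.mkdtemp(), "jurists.csv")
header = export_query(conn, "SELECT name, date FROM jurist ORDER BY date;", csv_path)
conn.close()

# Reload the exported file, much as a spreadsheet application would.
with open(csv_path, newline="", encoding="utf-8") as handle:
    rows = list(csv.DictReader(handle))
```

With the real database, `sqlite3.connect("digest.db")` would replace the in-memory connection. If `pandas` is installed, `pandas.read_csv(csv_path)` should load the same file as a dataframe.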
### 3. Sample SQL queries
There are some sample queries in the `SQL_queries.txt` file to assist users unfamiliar with the `SQL` query language. The queries are all ready to be copied and pasted as a multi-line `SQL` query into the interface of your choice. The queries can be customised by replacing the relevant values. Names of tables and their columns are fixed, but all other values can be customised. Please play around.
Take the following `SQL` query from `SQL_queries.txt`.
```sql
-- Count the number of text units for each jurist
SELECT j.name, j.date,
COUNT(t.jurist_id) as number_of_textunits,
CASE
WHEN j.date < 0 THEN 'E'
WHEN j.date < 190 THEN 'C-'
WHEN j.date < 240 THEN 'C+'
ELSE 'P'
END AS era
FROM text as t
LEFT JOIN jurist as j
ON t.jurist_id=j.id
GROUP BY t.jurist_id
ORDER BY j.date;
```
This query sorts the jurists of the _Digest_ into so-called eras: "early and pre-classical" ('E'), "early classical" ('C-'), "late classical" ('C+'), and "post-classical" ('P'). The `date` column in the `jurist` table holds the date when the jurist was most active.[<sup id="inline1">1</sup>](#fn1) For the purpose of this periodisation, the query takes the years 0, 190 and 240 as the boundaries of the eras. Additionally, the query counts the number of text units authored by each jurist by joining the `text` and `jurist` tables on a common key (`text.jurist_id` matched to `jurist.id`). The output lists the jurists, their eras and the number of text units they have in the _Digest_, ordered by date, in an `SQL` table ready to be exported.
The user may define different boundaries, or name the eras differently, by replacing the numeric values and the era labels given in single quotation marks. Fewer or more eras can be defined by removing a `WHEN` line or adding further `WHEN` lines to the query as appropriate.
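The customisation described above can also be scripted. The sketch below is one possible approach, not part of `SQL_queries.txt`: a hypothetical helper, `era_case`, generates the `CASE` expression from a list of boundaries and labels, and the result is tried on an in-memory table with invented rows. Only the `jurist` column names and the default boundaries (0, 190, 240) come from the sample query.

```python
import sqlite3


def era_case(boundaries, final_label, column="j.date"):
    """Build a CASE expression from (upper_bound, label) pairs.

    Pairs must be sorted by upper bound, because CASE returns the first
    matching WHEN branch. Labels are assumed to be trusted, simple strings.
    """
    lines = ["CASE"]
    for upper, label in boundaries:
        lines.append(f"  WHEN {column} < {int(upper)} THEN '{label}'")
    lines.append(f"  ELSE '{final_label}'")
    lines.append("END AS era")
    return "\n".join(lines)


# The boundaries used in the sample query, and a custom two-era variant.
default_case = era_case([(0, "E"), (190, "C-"), (240, "C+")], "P")
custom_case = era_case([(100, "early")], "late")

# Toy stand-in for the jurist table: these rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jurist (id INTEGER PRIMARY KEY, name TEXT, date INTEGER)")
conn.executemany(
    "INSERT INTO jurist (id, name, date) VALUES (?, ?, ?)",
    [(1, "Jurist A", -50), (2, "Jurist B", 150),
     (3, "Jurist C", 220), (4, "Jurist D", 300)],
)


def eras(case_sql):
    """Return the era assigned to each jurist, ordered by date."""
    sql = f"SELECT j.name, {case_sql} FROM jurist AS j ORDER BY j.date;"
    return [era for _name, era in conn.execute(sql)]


default_eras = eras(default_case)
custom_eras = eras(custom_case)
```

Printing `default_case` shows a `CASE` block that can be pasted into any of the sample queries in place of the existing one.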
#### Help with SQL queries
Please leave a comment or send an [email](mailto:m.ribary@surrey.ac.uk) if you would like to request a sample `SQL` query for your research, or if you need help adjusting one of the existing queries.
### 4. Future steps
The current version of `digest.db` is intended to be polished with input from its users. While major flaws and inconsistencies in the data were captured during the pre-processing stage, it is expected that typographical errors and some inconsistencies remain. Please leave a comment or send an [email](mailto:m.ribary@surrey.ac.uk) if you spot an error. A reporting tool or a collaborative editing method will be added in due course.
The database is also intended to be enriched with additional features in its tables and with additional tables offering new perspectives on the textual data. One possible expansion is a high-level taxonomy of legal concepts projected onto the textual units and thematic sections, which will assist topical research of Roman law.
Currently there is no custom-made GUI for using `digest.db`. As the project and the database mature, an appropriate user-friendly interface and visualisation tool will be created to open up the database to those less familiar with the `SQL` query language.
### Footnotes
[<sup id="fn1">1</sup>](#inline1) See the method of arriving at these dates under [Jurist dataframes](https://github.com/mribary/pyDigest/blob/master/Ddf_documentation.md#3-additional-dataframes) in the Ddf documentation.
\ No newline at end of file
-- #1 PERIODISATION OF JURISTS ordered by their active date
SELECT j.name, j.date,
CASE
WHEN j.date < 0 THEN 'E'
WHEN j.date < 190 THEN 'C-'
WHEN j.date < 240 THEN 'C+'
ELSE 'P'
END AS era
FROM jurist as j
ORDER BY j.date;
-- #2 COUNT THE NUMBER OF TEXT UNITS for each jurist sorted by their eras
SELECT j.name, j.date,
COUNT(t.jurist_id) as number_of_textunits,
CASE
WHEN j.date < 0 THEN 'E'
WHEN j.date < 190 THEN 'C-'
WHEN j.date < 240 THEN 'C+'
ELSE 'P'
END AS era
FROM text as t
LEFT JOIN jurist as j
ON t.jurist_id=j.id
GROUP BY t.jurist_id
ORDER BY j.date;
-- #3 COUNT THE NUMBER OF TEXT UNITS IN AN ERA
WITH eras AS
(SELECT j.name, j.date,
COUNT(t.jurist_id) as number_of_textunits,
CASE
WHEN j.date < 0 THEN 'E'
WHEN j.date < 190 THEN 'C-'
WHEN j.date < 240 THEN 'C+'
ELSE 'P'
END AS era
FROM text as t
LEFT JOIN jurist as j
ON t.jurist_id=j.id
GROUP BY t.jurist_id
ORDER BY j.date)
SELECT date, era, SUM(number_of_textunits) as sums
FROM eras
GROUP BY era
ORDER BY AVG(date);
\ No newline at end of file
"""
Code written by Carlos Fonseca
https://gist.github.com/carlosefonseca/8334277
A bash script to export all tables from an SQLite database
to TSV files in a directory named after the input database.
The directory name is the base name of the database (the last
dot and everything after it is discarded) with -tables
appended to it.
The directory is created if it doesn't already exist.
Existing files named as tables from the db plus the extension .tab
are overwritten. Other files won't be touched.
"""
#!/usr/bin/env bash
# obtains the names of all data tables from the database passed as the first argument
TS=$(sqlite3 "$1" "SELECT tbl_name FROM sqlite_master WHERE type='table' and tbl_name not like 'sqlite_%';")
# exports each table to a csv file named after the table
for T in $TS; do
sqlite3 "$1" <<!
.headers on
.mode csv
.output $T.csv
select * from $T;
!
done
\ No newline at end of file