- Description of the files in this folder
- File name: remove_incubation_reporting_Campylobacter.R
- Brief Description:
- Key Features:
- Input Files: (with examples)
- Plot of Incubation Period
- Output Files: (with example)
- File name: Plot_reported_and_adjusted_campylobacter_cases.R
- Brief Description:
- Key Features:
- Input Files: (with examples)
- Output Files:
- File name: Pathogen_Linkage_fixed_time_lag.R
- Brief Description:
- Key Features:
- Input Files: (with examples for humidity)
- Output Files:
- File name: Conditional_incidence_and_reconstruction_one_factor.R
- File name: Plots_Campylobacter_environment_one_variables_quantile.R
- Brief Description:
- Key Features:
- Input Files:
- Output Files:
- File name: Conditional_incidence_and_reconstruction.R
- Brief Description:
- Key Features:
- Input Files:
- Output_Files: (also used as input for intermediate steps)
- Supporting Functions:
- File name: Conditional_incidence_and_reconstruction_two_factors.R
- Brief Description:
- File name: Conditional_incidence_and_reconstruction_4_factors.R
- Brief Description:
- File name: Conditional_incidence_and_reconstruction_for_constant_situation.R
- Brief Description:
- File name: Campylobacter_environment_analysis_subset_two_variables_quantile_semester.R
- File name: Plots_Campylobacter_environment_one_variables_quantile_by_semester.R
- Brief Description:
- File name: Spatial_simulation comparative_Campy.R
- Brief Description:
- Description of the supporting Functions used in the main codes
- File name: Function_Input_files.R
- Brief Description:
- File name: Function_Wavelet_analysis.R
- Brief Description:
- Output Files:
- File name: Function_Conditional_Incidence_Uniform.R
- Name of the function(s): Conditional_incidence_Uniform, Conditional_incidence_Uniform_two_factors
- Brief Description:
- Output Files:
- File name: Function_Conditional_Incidence_Uniform_two_factors.R
- File name: Function_Conditional_incidence_Quantile.R
- Name of the function(s): Conditional_incidence_quantiles, Conditional_incidence_Quantiles_one_factor
- Brief Description:
- Output Files:
- File name: Function_Function_Plot_Conditional_Incidence_Uniform.R
- Brief Description:
- File name: Function_Function_Plot_Conditional_Incidence_Quantiles.R
- Name of the function(s): Plot_Conditional_incidence_quantiles,
- Brief Description:
- File name: Function_Plot_Conditional_incidence_quantiles_two_factors.R
- File name: Function_Reconstruction.R
- Name of the function(s): Reconstruction_time_series,
- Brief Description:
- Output Files:
- File name: Function_Plot_Reconstruction.R
- Name of the function(s): Plot_Reconstruction_time_series
- Brief Description:
- Output Files:
title: "Content of Codes used in Campylobacter project"
author: "Gianni Lo Iacono"
bibliography: campylobacter.bib
output:
  pdf_document:
    toc: yes
    highlight: tango
    fig_caption: yes
    citation_package: natbib
date: '2023-01-06'
knitr::opts_chunk$set(echo = TRUE)
library(knitr)
hook_output = knit_hooks$get('output')
knit_hooks$set(output = function(x, options) {
  # this hook is used only when the linewidth option is not NULL
  if (!is.null(n <- options$linewidth)) {
    x = xfun::split_lines(x)
    # any lines wider than n should be wrapped
    if (any(nchar(x) > n)) x = strwrap(x, width = n)
    x = paste(x, collapse = '\n')
  }
  hook_output(x, options)
})
Description of the files in this folder
File name: remove_incubation_reporting_Campylobacter.R
Brief Description:
The code adjusts the Campylobacter data to remove the incubation period and the reporting delay.
Key Features:
The incubation period was assumed to follow a log-normal distribution [@Sartwell1995], with the location parameter, $\mu$, and the scale parameter, $\sigma$, inferred from the mean, $m$, and standard deviation, $s$, of the incubation-period data observed by Horn and Lake [@Horn2013]. The delay in reporting was assumed to follow a uniform distribution between 1 and 4 days, based on the informed opinion of one of the authors (G. Nichols).
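As an illustration only, the sketch below shows one common way to obtain log-normal parameters from a mean and standard deviation (moment matching: sigma^2 = log(1 + s^2/m^2), mu = log(m) - sigma^2/2) and to shift specimen dates back by a sampled incubation period plus reporting delay. The values of `m` and `s`, the toy dates, the continuous form of the uniform delay and the rounding to whole days are all assumptions made for this example; the actual script may proceed differently.

```r
# Hypothetical summary statistics in days (NOT the Horn and Lake values)
m <- 3.5  # assumed mean incubation period
s <- 2.0  # assumed standard deviation of the incubation period

# Moment matching for a log-normal distribution
sigma <- sqrt(log(1 + s^2 / m^2))
mu <- log(m) - sigma^2 / 2

set.seed(1)
spec_date <- as.Date("2005-06-15") + 0:4  # toy specimen dates

# Sample an incubation period and a reporting delay for each case
incubation <- rlnorm(length(spec_date), meanlog = mu, sdlog = sigma)
delay <- runif(length(spec_date), min = 1, max = 4)  # continuous delay assumed

# Shift each specimen date back by the sampled total lag (whole days)
adjusted_date <- spec_date - round(incubation + delay)
```

Because the sampled lag is always larger than one day, every adjusted date falls before the corresponding specimen date.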
Input Files: (with examples)
../Data_Base/Incubation_period_Horn_Lake.csv
file_name<-paste("Data_Base/Incubation_period_Horn_Lake.csv",sep="")
Incub_data<-read.csv( file_name )
head(Incub_data)
Plot of Incubation Period
plot(Incub_data$day,Incub_data$freq,xlab="Day",ylab="Frequency")
../Data_Base/Campylobacter_reported.csv
Campylobacter_data_df<-read.csv(file="Data_Base/Campylobacter_reported.csv",sep=",")
colnames(Campylobacter_data_df)<-c("Spec_Date", "POSTCODE")
head(Campylobacter_data_df)
Output Files: (with example)
../Data_Base/Campylobacter_adjusted_date.csv
Campylobacter_adjusted_date_df<-read.csv(file="Data_Base/Campylobacter_adjusted_date.csv",sep=",")
head(Campylobacter_adjusted_date_df[,-1])
File name: Plot_reported_and_adjusted_campylobacter_cases.R
Brief Description:
The code plots reported and adjusted campylobacter cases.
Key Features:
The data are aggregated by month.
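A minimal sketch of monthly aggregation in base R is given below. The dates are toy values and the object names (`dates`, `monthly_cases`) are hypothetical; the plotting script may aggregate in a different way.

```r
# Toy adjusted dates of infection (hypothetical values)
dates <- as.Date(c("2005-01-03", "2005-01-20", "2005-02-11", "2006-01-05"))

# Year-month key for each case, then a count of cases per key
month_key <- format(dates, "%Y-%m")
monthly_cases <- as.data.frame(table(month = month_key),
                               stringsAsFactors = FALSE)
```

Each row of `monthly_cases` holds one year-month label and the number of cases (column `Freq`) falling in it.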
Input Files: (with examples)
../Data_Base/Campylobacter_adjusted_date.csv
Campylobacter_adjusted_date_df<-read.csv("Data_Base/Campylobacter_adjusted_date.csv")
head(Campylobacter_adjusted_date_df)
Output Files:
../Graphs/Campy_seasonal_reported
../Graphs/Campy_seasonal_adjusted
File name: Pathogen_Linkage_fixed_time_lag.R
Brief Description:
The code associates the values of selected environmental variables with the Campylobacter cases, at the location of the diagnostic laboratory postcode and the date of occurrence, for a chosen time lag. It also associates the corresponding yearly population of the catchment area for each diagnostic laboratory postcode.
Key Features:
The time lag is an input (line 180; e.g. time_lag<-7 sets a 7-day time lag). The code calculates the environmental variables averaged over the past time_lag days (except for cumulative rainfall, which is the cumulative sum of rain over the same period).
The code also extracts information about the latitude and longitude of the laboratory location, and estimates the day length from the latitude of the laboratory location and the time of the year.
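The trailing average and cumulative sum over the past time-lag days can be sketched with `stats::filter` from base R. This is an illustration under stated assumptions: the helper names `trailing_mean` and `trailing_sum`, the toy data frame and the choice to include the current day in the window are hypothetical and may not match the script.

```r
time_lag <- 7

# Trailing window over the current day and the previous k - 1 days;
# the first k - 1 entries are NA because the window is incomplete.
trailing_mean <- function(x, k) {
  as.numeric(stats::filter(x, rep(1 / k, k), sides = 1))
}
trailing_sum <- function(x, k) {
  as.numeric(stats::filter(x, rep(1, k), sides = 1))
}

set.seed(3)
daily <- data.frame(
  Date = as.Date("2005-01-01") + 0:13,
  humidity = round(runif(14, 60, 95), 1),  # toy daily humidity
  rain = round(rexp(14, rate = 0.5), 1)    # toy daily rainfall
)

daily$humidity_lag <- trailing_mean(daily$humidity, time_lag)
daily$cumul_rain <- trailing_sum(daily$rain, time_lag)
```

With `sides = 1` the filter only uses current and past values, so no information from later days enters the lagged variables.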
Input Files: (with examples for humidity)
../Data_Base/MEDMI/,variable,.csv where variable can be:
- humidity
- max_air_temp
- min_air_temp
- rain
- wind_speed
variable<-"humidity"
file<-paste("Data_Base/MEDMI/",variable,".csv",sep="")
variable_df_1<-read.csv(file)
head(variable_df_1[,c(1:7)])
../Data_Base/Population_catchments_1989_2016.csv
catchment_population_df<-read.csv(paste("Data_Base/Population_catchments_1989_2016.csv",sep=""))
colnames(catchment_population_df)<-c("PostCode", paste0("residents_", 1989:2016))
head(catchment_population_df)
../Data_Base/Campylobacter_Adjusted_date.csv
Campylobacter_cases<-read.csv("Data_Base/Campylobacter_Adjusted_date.csv")
colnames(Campylobacter_cases)<-c("Date","Spec_Date", "PostCode")
head(Campylobacter_cases)
Output Files:
../Data_Base/Laboratory_environment_,time_lag_char,.csv where time_lag_char is the time lag in days
time_lag_char<-"7"
merged_lab_time_lag<-read.csv(paste("Data_Base/Laboratory_environment_",time_lag_char,".csv",sep=""))
head(merged_lab_time_lag)
../Data_Base/Campylobacter_environment_,time_lag_char,.csv where time_lag_char is the time lag in days
time_lag_char<-"7"
Campylobacter_cases_df2<-read.csv(paste("Data_Base/Campylobacter_environment_",time_lag_char,".csv",sep=""))
head(Campylobacter_cases_df2)
File name: Conditional_incidence_and_reconstruction_one_factor.R
File name: Plots_Campylobacter_environment_one_variables_quantile.R
Brief Description:
The two codes estimate and plot the incidence of the disease conditional on one local environmental factor only. The range of each environmental variable is divided into quantiles. This ensures that each bin contains the same number of points.
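The quantile binning can be illustrated with `quantile` and `cut` from base R. The sketch uses simulated temperatures and a hypothetical number of bins (`n_bins`); it is not taken from the scripts themselves.

```r
set.seed(42)
temperature <- runif(1000, min = -5, max = 30)  # toy values
n_bins <- 10                                    # hypothetical number of bins

# Bin edges at equally spaced probabilities, so bins are equally populated
breaks <- quantile(temperature, probs = seq(0, 1, length.out = n_bins + 1))
bin <- cut(temperature, breaks = breaks, include.lowest = TRUE)

# Number of observations per bin
bin_counts <- table(bin)
```

The bins are narrow where observations are dense and wide where they are sparse, which keeps the statistical noise comparable across bins.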
Key Features:
The year range goes from 1990 to 2009. This ensures that the surveillance data in each postcode are well represented.
The variables of interest are:
variables<-c("Maximum_air_temperature","Minimum_air_temperature","Mean_wind_speed",
"Mean_Precipitation","Relative_humidity","daylength")
and the time-lag:
time_lags <- c(7,14,30,60,90)
Input Files:
../Data_Base/Laboratories_Post_Codes.csv
Input Files: via source code: /Codes/Function_Input_files.R
../Data_Base/Campylobacter_environment_time_lag_char.csv
../Data_Base/Laboratory_environment_,time_lag_char,.csv
../Data_Base/Conditional_probability_variable_time_lag_char_Quantiles.csv
where
variable can be:
"Maximum_air_temperature","Minimum_air_temperature","Mean_wind_speed","Mean_Precipitation",
"Relative_humidity","daylength"
and the time-lag:
7,14,30,60,90
Example of visual output (see also Figure 4 in the manuscript):
../Graphs/Campylobacter_Maximum_air_temperature_quantile.tiff
Output Files:
../Data_Base/Conditional_probability_variable_time_lag_char_Quantiles.csv
File name: Conditional_incidence_and_reconstruction.R
Brief Description:
The code estimates and plots the incidence of the disease conditional on three local environmental factors. It performs a series of diagnostics and then reconstructs the time series of cases in England and Wales given the local environmental factors.
Key Features:
The year range goes from 1990 to 2009. This ensures that the surveillance data in each postcode are well represented.
The variables of interest are combinations of three factors chosen from:
variables<-c("Maximum_air_temperature","Minimum_air_temperature","Mean_wind_speed",
"Mean_Precipitation","Relative_humidity","daylength")
and the time-lag:
time_lags <- c(7,14,30,60,90)
For the reconstruction, the code uses the conditional incidence calculated over bins of uniform size rather than over quantiles.
Input Files:
../Data_Base/Laboratories_Post_Codes.csv
Input Files: via source code: /Codes/Function_Input_files.R
../Data_Base/Campylobacter_environment_time_lag_char.csv
../Data_Base/Laboratory_environment_,time_lag_char,.csv
Output_Files: (also used as input for intermediate steps)
../Data_Base/Conditional_probability_variable_z_variable_y_variable_x_time_lag_char_Quantiles.csv
where
variable_(x,y,z) can be:
"Maximum_air_temperature","Minimum_air_temperature","Mean_wind_speed",
"Mean_Precipitation","Relative_humidity","daylength"
and the time-lag:
7,14,30,60,90
For an example of the visual output, see the description of Function_Function_Plot_Conditional_Incidence_Quantiles.R below (also Figure 5 in the manuscript).
Supporting Functions:
wavelet_analysis(Env_Pathogen_data, Pars_wavelet_analysis) via source code: /Codes/Function_Wavelet_analysis.R
Conditional_incidence_Uniform(Env_laboratory_data,Env_Pathogen_data,Pars_cond_incidence) via source code: /Codes/Function_Conditional_Incidence_Uniform.R
Plot_Conditional_incidence_Uniform(Conditional_incidence_unif_info,Pars_plot_cond_prev_uniform) via source code: /Codes/Function_Plot_Conditional_incidence_Uniform.R
Conditional_incidence_Quantiles(Env_laboratory_data,Env_Pathogen_data,Pars_cond_prev_quantiles) via source code: /Codes/Function_Conditional_Incidence_Quantiles.R
Plot_Conditional_incidence_quantiles(Conditional_incidence_quantiles_info,Pars_plot_cond_prev_quantiles) via source code: /Codes/Function_Plot_Conditional_Incidence_Quantiles.R
Reconstruction_time_series(Conditional_incidence_unif_info,Env_laboratory_data,Pars_reconstruction) via source code: /Codes/Function_Reconstruction.R
Plot_Reconstruction_time_series(Reconstruction_time_series,campylobacter_data_national,Pars_Plot_Reconstruction) via source code: /Codes/Function_Plot_Reconstruction.R
File name: Conditional_incidence_and_reconstruction_two_factors.R
Brief Description:
As Conditional_incidence_and_reconstruction.R but focusing on two local environmental factors.
File name: Conditional_incidence_and_reconstruction_4_factors.R
Brief Description:
As Conditional_incidence_and_reconstruction.R but focusing on four local environmental factors.
File name: Conditional_incidence_and_reconstruction_for_constant_situation.R
Brief Description:
As Conditional_incidence_and_reconstruction.R but with two of the environmental variables held constant.
File name: Campylobacter_environment_analysis_subset_two_variables_quantile_semester.R
File name: Plots_Campylobacter_environment_one_variables_quantile_by_semester.R
Brief Description:
The codes estimate and plot how the risk of Campylobacter in humans depends on selected environmental variables for the two semesters of the year.
File name: Spatial_simulation comparative_Campy.R
Brief Description:
The code compares the reported and predicted daily numbers of campylobacteriosis cases per catchment area, averaged over the entire 19 years, and shows the differences in a scatter plot and geographically.
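A minimal sketch of the per-catchment comparison is shown below. The data frame `daily`, its column names and its values are hypothetical; only the idea of averaging reported and predicted daily cases per catchment and taking their difference comes from the description above.

```r
# Toy daily reported and predicted cases for three catchments (hypothetical)
daily <- data.frame(
  PostCode  = rep(c("AB1", "CD2", "EF3"), each = 4),
  reported  = c(1, 0, 2, 1,  3, 4, 2, 3,  0, 1, 0, 1),
  predicted = c(0.8, 0.9, 1.2, 1.5,  2.5, 3.5, 3.0, 2.0,  0.4, 0.6, 0.5, 0.3)
)

# Average daily number of cases per catchment over the whole period
by_catchment <- aggregate(cbind(reported, predicted) ~ PostCode,
                          data = daily, FUN = mean)

# Positive values mean that the model under-predicts in that catchment
by_catchment$difference <- by_catchment$reported - by_catchment$predicted
```

A scatter plot of `by_catchment$predicted` against `by_catchment$reported`, with the identity line added via `abline(0, 1)`, would then show how closely the two agree.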
Description of the supporting Functions used in the main codes
File name: Function_Input_files.R
Brief Description:
The code reads two files:
- one csv file containing the number of Campylobacter cases, their location (closest diagnostic laboratory) and the likely date of infection (estimated from the date when the specimen reaches the laboratory, after removing the incubation period and reporting delay), along with the associated environmental factors
- one csv file containing the environmental factors for all diagnostic laboratories (irrespective of a Campylobacter case being recorded) for each day considered in the study (from 1990 to 2009).
The environmental variables have been averaged over the past time_lag days.
Example of the input files (for time_lag_char=7):
library(readr)
time_lag_char<-"7"
Env_Pathogen_data_all <-
read_csv(paste("Data_Base/Campylobacter_environment_",time_lag_char,".csv",sep=""))
colnames(Env_Pathogen_data_all)<-c("PostCode","Date",
"Maximum_air_temperature",
"Minimum_air_temperature",
"Mean_wind_speed",
"Cumul_Precipitation",
"Mean_Precipitation",
"Relative_humidity",
"daylength",
"residents","Cases")
head(Env_Pathogen_data_all)
Env_laboratory_data_all <-
read_csv(paste("Data_Base/Laboratory_environment_",time_lag_char,".csv",sep=""))
colnames(Env_laboratory_data_all) <-c("PostCode","Date",
"Maximum_air_temperature",
"Minimum_air_temperature",
"Mean_wind_speed",
"Cumul_Precipitation",
"Mean_Precipitation",
"Relative_humidity",
"daylength",
"residents")
head(Env_laboratory_data_all)
File name: Function_Wavelet_analysis.R
Brief Description:
This script performs a wavelet analysis to check for periodicity in the data. This is not necessary in most cases, but the script also estimates the number of cases at national level, which is used for comparison with the model reconstruction (example below).
Output Files:
campylobacter_data_national<-read.csv("Data_Base/Pathogen_data_national.csv")
head(campylobacter_data_national[,c(-1)])
File name: Function_Conditional_Incidence_Uniform.R
Name of the function(s): Conditional_incidence_Uniform, Conditional_incidence_Uniform_two_factors
Brief Description:
The code examines how the risk of Campylobacter in humans depends on three specific environmental variables. The range of each environmental variable is divided into bins of uniform size. The output file (see below) is used for simulating the time series of cases.
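The idea of a conditional incidence over uniform bins can be sketched for a single variable as the number of cases in a bin divided by the number of residents exposed to conditions in that bin. The two toy data frames below loosely mirror the laboratory and case inputs, but their columns, values and the bin width are assumptions; the actual function works on three variables and may normalise differently.

```r
set.seed(7)
# All laboratory-days with their exposure and population (toy values)
lab <- data.frame(temp = runif(500, 0, 30), residents = 1000)
# Recorded cases with the exposure on the likely day of infection (toy values)
cases <- data.frame(temp = runif(60, 0, 30), Cases = 1)

# Bins of uniform width (5 degrees, hypothetical)
breaks <- seq(0, 30, by = 5)
lab$bin <- cut(lab$temp, breaks, include.lowest = TRUE)
cases$bin <- cut(cases$temp, breaks, include.lowest = TRUE)

# Residents exposed and cases recorded in each bin
residents_tot <- tapply(lab$residents, lab$bin, sum)
cases_tot <- tapply(cases$Cases, cases$bin, sum)
cases_tot[is.na(cases_tot)] <- 0  # bins without any case

# Conditional incidence: cases per resident exposed, for each bin
incidence <- cases_tot / residents_tot
```

Uniform bins keep a fixed, regular grid of environmental conditions, which makes it straightforward to look up the incidence for any day and location during the reconstruction.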
Output Files:
../Data_Base/Conditional_incidence_variable_x_variable_y_variable_z_time_lag_char_Uniform.csv
Example of the file:
Conditional_incidence_unif<-
read.csv("Data_Base/Conditional_incidence_Maximum_air_temperature_Relative_humidity_daylength_14_Uniform.csv")
head(Conditional_incidence_unif[,c(-1)])
File name: Function_Conditional_Incidence_Uniform_two_factors.R
As Function_Conditional_Incidence_Uniform.R, but the code focuses on two specific environmental variables.
File name: Function_Conditional_incidence_Quantile.R
Name of the function(s): Conditional_incidence_quantiles, Conditional_incidence_Quantiles_one_factor
Brief Description:
The function Conditional_incidence_quantiles examines how the risk of Campylobacter in humans depends on three specific environmental variables. The range of each environmental variable is divided into quantiles, which ensures that each bin contains the same number of points. The output files are used for visualization. Conditional_incidence_Quantiles_one_factor works as Conditional_incidence_quantiles but focuses on only one environmental factor.
Output Files:
../Data_Base/Conditional_probability_variable_z_variable_y_variable_x_time_lag_char_Quantiles.csv
Example of the file:
Conditional_incidence_quantiles<-read.csv("Data_Base/Conditional_probability_daylength_Relative_humidity_Maximum_air_temperature_14_Quantiles.csv")
head(Conditional_incidence_quantiles[,c(-1)])
Note that the column residents_tot represents the total number of people in all catchments exposed to the environmental factors in the same row. The column residents represents the number of people in those catchments where there is at least one recorded case, exposed to the environmental factors in the same row. This information is not used.
File name: Function_Function_Plot_Conditional_Incidence_Uniform.R
Brief Description:
The code plots the conditional incidence estimated via Function_Conditional_incidence_Uniform.R. It is used only for internal diagnostics.
../Graphs/Conditional_probability_variable_xvariable_ytime_lag_char_uniform.tiff
File name: Function_Function_Plot_Conditional_Incidence_Quantiles.R
Name of the function(s): Plot_Conditional_incidence_quantiles, Plot_Conditional_incidence_quantiles_two_factors
Brief Description:
The code plots the conditional incidence estimated via Function_Conditional_incidence_Quantile.R
../Graphs/Campylobacter_Maximum_air_temperature_Relative_humidity_daylength_13_14_quantile.tiff
File name: Function_Plot_Conditional_incidence_quantiles_two_factors.R
As Function_Function_Plot_Conditional_Incidence_Quantiles.R but focusing on two factors only. The code plots the conditional incidence estimated via Conditional_incidence_Quantile.R when one variable (typically variable_z) is made redundant by setting the quantile cut point for variable_z to 1 (so there is no differentiation by variable_z).
File name: Function_Reconstruction.R
Name of the function(s): Reconstruction_time_series, Reconstruction_time_series_two_factors
Brief Description:
The function Reconstruction_time_series uses the empirical environmental factors and the calculated conditional incidence to estimate the expected number of Campylobacter cases at a given location and time, depending on the three underlying environmental factors. Reconstruction_time_series_two_factors works as Reconstruction_time_series but focuses on two environmental factors.
Output Files:
../Data_Base/Conditional_probability_variable_x_variable_y_variable_z_time_lag_char_reconstructed.csv
Example of the file:
time_series<-read.csv("Data_Base/Maximum_air_temperature_Relative_humidity_daylength_14_reconstructed.csv")
head(time_series[,c(-1)])
Here Lambda represents the rate of infection; multiplied by the number of residents in the catchment area, it gives the expected number of cases in that catchment.
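As a small worked illustration of this relationship, the sketch below multiplies Lambda by the catchment population and sums over catchments to obtain a national daily total. The data frame is a toy example; apart from Lambda, the column names (`PostCode`, `Date`, `residents`) and all values are assumptions and may not match the actual output file.

```r
# Toy reconstructed output for two catchments and two days (hypothetical)
time_series <- data.frame(
  PostCode  = c("AB1", "AB1", "CD2", "CD2"),
  Date      = as.Date(c("2005-06-01", "2005-06-02",
                        "2005-06-01", "2005-06-02")),
  Lambda    = c(2e-6, 3e-6, 1e-6, 4e-6),
  residents = c(100000, 100000, 250000, 250000)
)

# Expected number of cases per catchment and day
time_series$expected_cases <- time_series$Lambda * time_series$residents

# National expected number of cases per day
national <- aggregate(expected_cases ~ Date, data = time_series, FUN = sum)
```

For the toy values, the expected cases are 0.2 and 0.3 for the first catchment and 0.25 and 1 for the second, giving national totals of 0.45 and 1.3 on the two days.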
File name: Function_Plot_Reconstruction.R
Name of the function(s): Plot_Reconstruction_time_series
Brief Description:
The function plots the time series of cases, also aggregated by day of the year, and compares them with the empirical data.
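Aggregation by day of the year can be sketched as follows. The simulated series, the data frame `reconstructed` and its column names are hypothetical, and the plotting function may compute the yearly average differently.

```r
# Two non-leap years of toy daily expected cases with a seasonal cycle
dates <- seq(as.Date("2005-01-01"), as.Date("2006-12-31"), by = "day")
day_of_year <- as.integer(format(dates, "%j"))
reconstructed <- data.frame(
  Date = dates,
  day_of_year = day_of_year,
  expected_cases = 1 + sin(2 * pi * day_of_year / 365)
)

# Average over years for each day of the year
yearly_average <- aggregate(expected_cases ~ day_of_year,
                            data = reconstructed, FUN = mean)
```

The same aggregation applied to the reported national cases gives a curve that can be overlaid on `yearly_average` to compare the seasonal pattern.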
Output Files:
Examples of outputs are:
../Graphs/daylength_Maximum_air_temperature_Relative_humidity_14_time_series.tiff
../Graphs/daylength_Maximum_air_temperature_Relative_humidity_14_yearly_average.tiff