Configuration
Table of Contents
- Data input and manipulation: Time measurements
- Data input and manipulation: How to center the growth curve on zero?
- Data input and manipulation: Handling non-positive measurements?
- 96-well grid plotting
- Hypothest test ploting
- Growth parameter estimation and reporting
- Estimating variance or goodness of fit
- User communication
AMiGA
has many parameters that can be configured by the user. Some of those parameters are configured using the Command-line interface. Other parameters can be configured in the libs/config.py
file. These are parameters that define user preferenecs for common operations and it would be impractical to ask the user to define them every time they submit a command to AMiGA
. The libs/config.py
defines the default values for these parameters. Here, I elaborate on these parameters and explain how users can adjust them.
The libs/config.py
is a Python
script that define a single dictionary
called config
. You can open this Python
script in a text editor. If you are not familiar with dictionaries, it a basic data structure that operates similar to a word dictionary. In a dictionary
, there are a set of keys
and for each key there is a value
(i.e. definition). The values of the config
dictionary are esentially defining the set of parameters (or keys) used by AMiGA
.
The parameters that the user can configure can be divided into several groups based on their purpose:
- Data input and manipulation
- 96-well grid plotting
- GP regression modelling
- Hypothesis testing plotting
- Growth parameter definitions
- User communication
Data input and manipulation: Time measurements
time_input_unit
and time_output_unit
and interval
: While AMiGA does not read the time points in the input plate reader files, it does require the user to define the time interval between measurements. . Users can define the time interval via the command-line with the --interval
argument. But if the user often uses the same time interval for their growth assays, they can simply define the default interval
in the libs/config.py
file as well as the time_input_units
. For example, if OD measurements are taken every 30 minutes. The user can do either
config['interval'] = 1800
config['time_input_unit'] = 'seconds'
or
config['interval'] = 30
config['time_input_unit'] = 'minutes'
or
config['interval'] = 0.5
config['time_input_unit'] = 'hours'
The time_output_unit
basically tells AMiGA
how to convert the time interval which affects how the results (growth parameters, growth curves, and plots) are interpreted.
Data input and manipulation: How to center the growth curve on zero?
After several pre-processing steps, AMiGA
transforms growth curves with a natural logarithm. The, AMiGA
needs to adjust the growth curve so that the first OD measurement is near zero. The default approach is to simply subtract the first OD measurement from all other measurements in the growth curve (i.e., \(\text{Adjusted OD}(t) = \text{OD}(t) - \text{OD}(0)\)). When users are fitting replicate growth curves together, an alterantive approach is called PolyFit
. Here, AMiGA
will fit first five OD measurements with a third-degree polynomial. Based on the polynomial fit, it will estiamte the OD measurement at the first time point then subtract this estimated OD measurement from the other OD measurements in the growth curve (i.e., \(\text{Adjusted OD}(t) = \text{OD}(t) - f(t=0)\) where \(f(t)\) is the polynomial fit. By default, AMiGA
uses the similar approach of subtracting the first OD measurement but users can opt to use PolyFit
instead.
config['PolyFit'] = True
Data input and manipulation: Handling non-positive measurements?
Optical density measurement are supposed to be non-zero and positive values. However, there are cases where plate readers can measure negative or zero values. This is especially common if the OD measuremnets are blank-corrected. In these cases, the OD measurements over time for blank media is subtracted from the OD measurements of treatment wells where you expect to see growth. AMiGA
can handle non-positive values using two ways: the limit of detection LOD
method or the optimal offset Delta
method.
The LOD
method is premised on the fact that all plate readers have a limit of detection. This may be either the lowest optical density measurement it can accurately detect or the smallest increments of optical density that it can accurately detect. If you know this value, you can basically assume that an optical density of value zero should be raised to the limit of detection. But growth curves can also be negative (as before-mentioned due to things like blank subtraction). So, AMiGA
first translates vertically the whole growth curve such that the lowest non-positive value becomes zero then raises all the measurements by the limit of detection. In other words, \(\text{Adjusted OD}(t) = \text{OD}(t) + \mid\min \text{OD}(t)\mid + \text{LOD}\). Users can also force the LOD correction for all measurement such that no OD measurement is lower than the limit of detection.
To use the LOD
method, you can adjust the following parameters. The first selects LOD
as your method of choice; the second defines the limit of detection; and the third determines if you would like the floor for all your measurements to become the limit of detection.
config['handling_nonpositives'] = 'LOD'
config['limit of detection'] = 0.010
config['force_limit_of_detection'] = False
The Delta
method assumes that you do not know the limit of detection and would rather use the data to infer a reasonable offset. In a similar fashion, here, AMiGA
wants to vertically translate the whole growth curve such that the lowest non-positive value becomes zero then raise all the measurements buy the estimated offset. The offset is determined based on the distribution of actual changes in OD over time. In particular, AMiGA
computes the difference in OD between all or some consecutive time intervals. It can then pick the OD based one statistical descriptor of this distribution.
To use the Delta
method, you can adjust the following parameter. The first selects Delta
as your method of choice; the second determienes the number of time intervals to use for determing the optimal delta; and the third determines which delta to pick from the distribution
config['handling_nonpositives'] = 'Delta'
config['number_of_deltas'] = 5
config['choice_of_deltas'] = 'median'
For example, the above parameters state the change in OD for the first five time intervals will be computed and the optimal delta is considered their median. If the user would like to use the whole growth curve for computing delta, they can simply pick a very large number like 1000. This way they don’t have to adjut the number for growth curves with different number of time measurements.
On a final note, if you choose the LOD
method but a particular growth curve has a negative measurement that has an absolute vlaue larger than the limit of detection, then AMiGA
will be forced to use the Delta
method. In a similar fashion, if you choose number_of_deltas
where the OD turns out does not change at all, then AMiGA
will incrementally increase the number_of_deltas
by a multiplier until a change is detected or number_of_deltas
is higher than the number of time points.
96-well grid plotting
The summarize
and fit
function of AMiGA
can plot the growth curves for 96-well plate reader files in an crisp PDF format. The user can adjust several parameters for the aesthetics of these plots.
fcg
or fold-change threshold for growth and fcd
or fold-change threshold for death dictate how each growth cuve will be colored. If the user includes in the meta-data which wells correspond to control and which correspond to cases or treatments, AMiGA
will automatically compute the Fold-Change
(See Summarize and Plot for more details). Here, the parameters define the threshold at which thes growth curve would be colored: if a well has a fold-change greater than 1.5 or less than 0.5 the its curve will be colored.
config['fcg'] = 1.50 # fold-change threshold for growth
config['fcd'] = 0.50 # fold-change threshold for death
The colors can be defined with the following parameters. Here, I define the colors in (R,G,B,A) format where A is a value that adjusts the transparency fo the color. All values range from 0.0 to 1.0 (and map to 0 and 255 in decimal notation). However, you can can also define colors with text label (e.g., ‘red’) or hex stirn format (e.g., ‘#0099CC’). See List of named colors for a long list of colors that you can use with Python
.
config['fcg_line_color'] = (0.0,0.0,1.0,1.0) # blue
config['fcg_face_color'] = (0.0,0.0,1.0,0.15) # transparent blue
config['fcd_line_color'] = (1.0,0.0,0.0,1.0) # red
config['fcd_face_color'] = (1.0,0.0,0.0,0.15) # transparent red
config['fcn_line_color'] = (0.0,0.0,0.0,1.0) # black
config['fcn_face_color'] = (0.0,0.0,0.0,0.15) # transparetn black
config['gp_line_fit'] = 'yellow'
fcg_line_color
and fcg_face_color
define the color for the line and the area of the growth curve where growth is detected based on fold-change; fcd_line_color
and fcd_face_color
likewisedefine colors for wells where death is detected; and fcn_line_color
and fcn_face_color
define colors for the remaining wells. The last parameter gp_line_fit
defines the color for the dashed line that plots the curves predicted by Gaussian Process regression. These lines can only be plotted by the fit
function.
In these plots, AMiGA
also adds text in each well for the well ID (e.g. “A1” on top left corner) or Maximum OD (on top right corner). Users can also adjus the color for these wells in a similar fashion
config['fcn_well_id_color'] = (0.65,0.165,0.16,0.8) # light maroon
config['fcn_od_max_color'] = (0.0,0.0,0.0,1.0) # black
Finally, users can adjust the label for the y-axis. For the fit
function, the generated plot will be based on growth curves that are log-transformed. So the final label would look something like this “ln Optical Density”. AMiGA
was designed to analyze optical density data but it can model any count data (e.g. “CFUs” or “fluorescence”). By adjusting the y-axis label, users can distinguish the type of data that they are looking at.
config['grid_plot_y_label'] = 'Optical Density'
Hypothest test ploting
The test
function in AMiGA
can generate figures that plot the predicted fit for the two growth curves being compared. Users can adjust the colors for the lines, y-axis label, and several parameters about aesthetics of the figure.
config['hypo_colors'] = [(0.11,0.62,0.47),(0.85,0.37,0.01)] # seagreen and orange
config['hypo_plot_y_label'] = 'OD'
config['HypoPlotParams'] = {'overlay_actual_data':True,
'fontsize':15,
'tick_spacing':5,
'legend':'outside'}
The default colors for the lines correspond to seagreen and orange and the default label is “OD”. Users can also adjust the fontsize (in points), the spacing of the x-axis ticks. Here, the tick_spacing
is defined as 5
and because our default is time_output_interval
is hours
, AMiGA
will plot ticklabels at intervals of 5 hours
. By default, the legend
is plotted outside
but can instead be plotted inside
. Finally, users can opt to overlay_actual_data
or turn off this feature with False
. If True
, AMiGA
will plot the predicted growth curve in bold line and the raw growth curves in thin lines. All growth curves will however be log-transformed and centered at zero (because ln OD = 0 → OD=1, starting value of zero in log OD indicates arbitrary starting population size of one).
Growth parameter estimation and reporting
One of advantages of inferring the growth rate over time is that AMiGA
can identify the time at which growth rate begins. It does so by identifying the time point at which the growth rate is statistically different from a growth rate of zero. We refer to this Adaptation Time
as the time at which the confidence itnerval of the growth rate deviates from zero. Users can adjust the confidence for this interval.
config['confidence_adapt_time'] = 0.95
Another unique feature for AMiGA
is its novel algorithm for detecting and characterizing diauxic shifts. See Detect Diauxie for more details on the parameters.
config['diauxie_ratio_varb'] = 'K'
config['diauxie_ratio_min'] = 0.20
config['diauxie_k_min'] = 0.10
By default AMiGA
simply provides the mean estimate for each growth parameter (e.g. growth rate). However, if the user passes the --sample-posterior
argument to the fit
function, AMiGA
will also generate distributions for these parameters, in particular the mean and standard deviation. To do so, AMiGA
samples from the posterior distribution of the growth model using a certain number of sampled curves. It computes the growth rates for each of those curves and then reports the mean and standard deviation of these distributions. Users can adjust how many posterior samples are drawn but keep in mind that the more samples drawn the longer the process will take.
config['n_posterior_samples'] = 100
Finally, AMiGA
can infer up to 15 growth parameters but if the user would like the summary files to include only the ones they are interested in, they can simply alter the list below to their liking. See Fit Curves for descriptions of these parameters.
config['report_parameters'] = ['auc_lin','auc_log','k_lin','k_log','death_lin','death_log',
'gr','dr','td','lagC','lagP',
't_k','t_gr','t_dr','diauxie']
Estimating variance or goodness of fit
If the user opts to empirically-estimate time-dependent Gaussian noise, AMiGA
will compute the variance of replicate data over time. It will then apply a Gaussian filter to smoothen the variance over time. Users can adjust the window for smoothign variance. The defaul is 6
measurements, which correspond to 1 hour (defaul interval is 600 seconds, so 6 x 6 = 3600 seconds = 1 hour).
config['variance_smoothing_window'] = 6
As a quick check of the goodness of fit of predicted growth curves, AMiGA
computes the K_Error
which estiamtes the deviation of the predicted carrying capacity from the expected carrying capacity. See Fit Curves for descriptions of these parameters. Users can adjust the threshold at which AMiGA
will flag a well if it has a K_Error
above this threshold. The default is 20%
.
config['k_error_threshold'] = 20
User communication
AMiGA
implements Gaussian Process regression using the GPy
package. Many functions and features of this package often communicate warnings. By default, AMiGA
will not show these warnings to the user but if you would like to see them, you can simply turn off the below parameter wtih False
.
config['Ignore_RuntimeWarning'] = True