Normalize Parameters
Table of Contents
- What is normalization?
- How to normalize growth parameters?
- How to normalize parameters from a summary of pooled replicates?
- Command-Line arguments
What is normalization?
In a typical analysis, researchers have control samples that indicate the expected behavior under known conditions. A typical negative control may be the growth of an isolate on blank media (i.e., low or no growth), and a typical positive control may be growth on a preferred substrate (e.g., glucose).
When testing the growth of an isolate on a new condition, it often helps to compare it to other known conditions, especially negative or positive controls. `AMiGA` has a normalization function that simplifies this process for the user. Normalization can be performed either by subtracting the growth parameters of the control sample(s) from those of the treatment samples, or by dividing the growth parameters of the treatment samples by those of the control sample(s).
Normalization by subtraction results in values centered around zero, where a value of zero indicates no difference from the control samples. For example,
\[\text{Normalized AUC} = \text{AUC}(\text{Isolate on Substrate}) - \text{AUC}(\text{Isolate on Blank Media})\]
Normalization by division results in values centered around 1, where a value of 1 indicates no difference from the control samples. For example,
\[\text{Normalized AUC} = \frac{\text{AUC}(\text{Isolate on Substrate})}{\text{AUC}(\text{Isolate on Blank Media})}\]
How to normalize growth parameters?
After modelling growth curves and estimating growth parameters with the `fit` function, users can adjust the estimated growth parameters with the `normalize` function. To properly normalize data, users must indicate in their meta-data how samples are grouped and which samples are the control samples in each group. For example, on a Biolog PM plate, all 96 wells belong to the same group because they are inoculated with the same cell culture, and the `A1` well is the `Negative Control` well. As previously shown, `AMiGA` automatically adds a `Group` and a `Control` column to the meta-data for Biolog PM plates to streamline this normalization.
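The role of the `Group` and `Control` columns can be illustrated with a minimal sketch in plain Python. This is not AMiGA's implementation: the helper `normalize_by_group` and the `normalized` key are hypothetical names chosen for illustration, and the rows reuse a few `auc_lin` values from Example 1 below.

```python
from collections import defaultdict
from statistics import mean


def normalize_by_group(rows, param, method="subtraction"):
    """Normalize `param` in each row against the controls of its group.

    Hypothetical helper, not part of AMiGA. Each row is a dict with
    'Group' and 'Control' keys, where Control == 1 marks a control well.
    """
    # Gather the control values for every group.
    controls = defaultdict(list)
    for row in rows:
        if row["Control"] == 1:
            controls[row["Group"]].append(row[param])

    normalized = []
    for row in rows:
        if not controls[row["Group"]]:
            raise ValueError(f"group {row['Group']} has no control sample")
        baseline = mean(controls[row["Group"]])
        if method == "subtraction":
            value = row[param] - baseline
        elif method == "division":
            value = row[param] / baseline
        else:
            raise ValueError("method must be 'subtraction' or 'division'")
        # Copy the row and attach the normalized value under an assumed key.
        normalized.append({**row, "normalized": value})
    return normalized


rows = [
    {"Well_ID": "A1", "Group": 1, "Control": 1, "auc_lin": 0.21},
    {"Well_ID": "B1", "Group": 1, "Control": 0, "auc_lin": 0.85},
    {"Well_ID": "A1", "Group": 2, "Control": 1, "auc_lin": 0.22},
    {"Well_ID": "B1", "Group": 2, "Control": 0, "auc_lin": 0.85},
]

for row in normalize_by_group(rows, "auc_lin", method="subtraction"):
    print(row["Group"], row["Well_ID"], round(row["normalized"], 2))
```

Each sample is compared only with the control(s) of its own group, so the two `B1` wells end up with slightly different normalized values even though their raw `auc_lin` is the same.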
Example 1
As an example, let us assume that we profiled an isolate with two replicate plates of Biolog PM1 and fit the growth curves with the following command.
python $amiga/amiga.py fit -i /Users/firasmidani/experiment/ -o split_merged
This would have generated the following `split_merged_summary.txt` table.
Well_ID | Plate_ID | Isolate | Substrate | Group | Control | auc_lin | … | t_gr |
---|---|---|---|---|---|---|---|---|
A1 | erand_PM1-1 | erand | None | 1 | 1 | 0.21 | … | 4.4 |
B1 | erand_PM1-1 | erand | L-Arabinose | 1 | 0 | 0.85 | … | 4.1 |
C1 | erand_PM1-1 | erand | N-Acetyl-D-Glucosamine | 1 | 0 | 0.83 | … | 4.0 |
… | … | … | … | … | … | … | … | … |
F12 | erand_PM1-1 | erand | Inosine | 1 | 0 | 0.23 | … | 4.5 |
G12 | erand_PM1-1 | erand | L-Malic Acid | 1 | 0 | 0.33 | … | 3.8 |
H12 | erand_PM1-1 | erand | 2-Aminoethanol | 1 | 0 | 0.68 | … | 4.0 |
… | … | … | … | … | … | … | … | … |
A1 | erand_PM1-2 | erand | None | 2 | 1 | 0.22 | … | 4.5 |
B1 | erand_PM1-2 | erand | L-Arabinose | 2 | 0 | 0.85 | … | 4.0 |
C1 | erand_PM1-2 | erand | N-Acetyl-D-Glucosamine | 2 | 0 | 0.83 | … | 3.9 |
… | … | … | … | … | … | … | … | … |
F12 | erand_PM1-2 | erand | Inosine | 2 | 0 | 0.19 | … | 4.6 |
G12 | erand_PM1-2 | erand | L-Malic Acid | 2 | 0 | 0.29 | … | 3.9 |
H12 | erand_PM1-2 | erand | 2-Aminoethanol | 2 | 0 | 0.72 | … | 4.0 |
We can adjust the growth parameters by simply using the `normalize` function.
python $amiga/amiga.py normalize -i /Users/firasmidani/experiment/summary/split_merged -o "new_normalized" --normalize-method "division"
- `-i` points to the summary file generated by the `fit` function.
- `-o` allows the user to pick the name for the new summary table. If the user does not pass the `-o` argument, `AMiGA` will create a new file with a `_normalized` suffix, for example, `split_merged_normalized.txt`. If the user wants to replace the file with a new version that includes normalized parameters, they must add the `--over-write` argument.
- `--normalize-method` specifies that the user wants to adjust the growth parameters by `division`. The user can instead opt for `subtraction`, which is the default option.
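To make the effect of `division` concrete, the following sketch applies it by hand to a few `auc_lin` values copied from the summary table above. It only illustrates the arithmetic; the dictionary layout and the helper `fold_over_control` are assumptions made for this example and say nothing about how AMiGA stores or names its output.

```python
# auc_lin values copied from the summary table, keyed by (Plate_ID, Well_ID).
auc_lin = {
    ("erand_PM1-1", "A1"): 0.21,   # Substrate: None (control well)
    ("erand_PM1-1", "B1"): 0.85,   # L-Arabinose
    ("erand_PM1-1", "F12"): 0.23,  # Inosine
    ("erand_PM1-2", "A1"): 0.22,   # Substrate: None (control well)
    ("erand_PM1-2", "B1"): 0.85,   # L-Arabinose
    ("erand_PM1-2", "F12"): 0.19,  # Inosine
}


def fold_over_control(plate, well):
    """Divide a well's auc_lin by the A1 control of the same plate."""
    return auc_lin[(plate, well)] / auc_lin[(plate, "A1")]


for plate, well in auc_lin:
    print(plate, well, round(fold_over_control(plate, well), 2))
```

Under division, the control wells themselves come out as exactly 1, L-Arabinose lands at roughly four times the control on both plates, and Inosine stays close to 1, which is the "no difference from control" value described earlier.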
However, you can also run the normalization more explicitly if the data did not have the `Group` and `Control` columns.
python $amiga/amiga.py normalize -i /Users/firasmidani/experiment/summary/split_merged --over-write --group-by "Plate_ID" --normalize-by "Substrate:None"
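One way to read this command is that `--group-by` names the column(s) that define groups and `--normalize-by` names a `column:value` pair that marks the control samples. The sketch below derives equivalent `Group` and `Control` values under that reading. It is an interpretation for illustration only, not AMiGA's parsing code, and `add_group_and_control` is a hypothetical helper.

```python
def add_group_and_control(rows, group_by, normalize_by):
    """Attach Group and Control values derived from the two arguments.

    Hypothetical helper: `group_by` is a comma-separated list of column
    names and `normalize_by` is a 'column:value' pair marking controls.
    """
    group_columns = group_by.split(",")
    control_column, control_value = normalize_by.split(":", 1)

    group_ids = {}
    annotated = []
    for row in rows:
        # Rows sharing the same values in the group-by columns share a group.
        key = tuple(row[column] for column in group_columns)
        group = group_ids.setdefault(key, len(group_ids) + 1)
        control = 1 if row[control_column] == control_value else 0
        annotated.append({**row, "Group": group, "Control": control})
    return annotated


summary_rows = [
    {"Well_ID": "A1", "Plate_ID": "erand_PM1-1", "Substrate": "None"},
    {"Well_ID": "B1", "Plate_ID": "erand_PM1-1", "Substrate": "L-Arabinose"},
    {"Well_ID": "A1", "Plate_ID": "erand_PM1-2", "Substrate": "None"},
    {"Well_ID": "B1", "Plate_ID": "erand_PM1-2", "Substrate": "L-Arabinose"},
]

for row in add_group_and_control(summary_rows, "Plate_ID", "Substrate:None"):
    print(row["Plate_ID"], row["Well_ID"], row["Group"], row["Control"])
```

With these arguments, each plate becomes its own group and the wells whose `Substrate` is `None` are flagged as controls, matching the `Group` and `Control` columns shown in the summary table.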
Example 2
Let’s say you have a simple experiment with two isolates under two conditions (minimal media or trehalose). Below is the mapping file that we created. We have already fit the curves, and the summary file was saved as `CD89_wt_summary.txt`.
Well | Plate_ID | Isolate | Substrate | Ribotype | Comments | Group | Control |
---|---|---|---|---|---|---|---|
A1 | CD_treA | CD89_wt | Minimal Media | RT027 | Wild-type | 1 | 1 |
A2 | CD_treA | CD89_wt | Minimal Media | RT027 | Wild-type | 1 | 1 |
A3 | CD_treA | CD89_wt | D-Trehalose | RT027 | Wild-type | 1 | 0 |
A4 | CD_treA | CD89_wt | D-Trehalose | RT027 | Wild-type | 1 | 0 |
B1 | CD_treA | CD89_ko | Minimal Media | RT027 | treA knock-out | 2 | 1 |
B2 | CD_treA | CD89_ko | Minimal Media | RT027 | treA knock-out | 2 | 1 |
B3 | CD_treA | CD89_ko | D-Trehalose | RT027 | treA knock-out | 2 | 0 |
B4 | CD_treA | CD89_ko | D-Trehalose | RT027 | treA knock-out | 2 | 0 |
We can adjust the growth parameters by simply using the `normalize` function.
python $amiga/amiga.py normalize -i /Users/firasmidani/experiment/summary/CD89_wt_summary.txt -o "CD89_wt_summary_normalized" --normalize-method "division"
If we did not explicitly include the `Group` and `Control` columns, we could still normalize the data directly with the following command.
python $amiga/amiga.py normalize -i /Users/firasmidani/experiment/summary/CD89_wt_summary.txt -o "CD89_wt_summary_normalized" --normalize-method "division" --group-by 'Isolate' --normalize-by 'Substrate:Minimal Media'
How to normalize parameters from a summary of pooled replicates?
In this example, my working directory (`/Users/firasmidani/experiment/`) contains multiple data files in the `data` sub-folder. Let us assume that I fit growth curves for replicates pooled across all unique combinations of `Isolate` and `Substrate`.
python $amiga/amiga.py fit -i /Users/firasmidani/experiment/ -o "pooled_analysis" --pool-by "Isolate,Substrate"
I can specify that `AMiGA` should normalize each group (unique combination of `Isolate` and `Substrate`) by its wells where the `Substrate` is `Negative Control`. In this case, each unique model is specified by an `Isolate` and a `Substrate`. If there are multiple control wells, `AMiGA` will normalize to the mean of their growth parameters.
python $amiga/amiga.py normalize -i /Users/firasmidani/experiment/summary/pooled_normalized_merged_by_ribotype -o "pooled_normalized_merged_by_ribotype_normalized" --group-by "Isolate,Substrate" --normalize-by "Substrate:Negative Control"
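The averaging of multiple control wells described above can be sketched as follows. The wells and `auc_lin` values are hypothetical and chosen only to make the arithmetic easy to follow, and subtraction is used because it is the default method. This is an illustration of the idea, not AMiGA's code.

```python
from statistics import mean

# Hypothetical wells from a single group: two controls and two treatments.
wells = [
    {"Well": "A1", "Substrate": "Negative Control", "Control": 1, "auc_lin": 0.20},
    {"Well": "A2", "Substrate": "Negative Control", "Control": 1, "auc_lin": 0.30},
    {"Well": "A3", "Substrate": "D-Trehalose", "Control": 0, "auc_lin": 0.75},
    {"Well": "A4", "Substrate": "D-Trehalose", "Control": 0, "auc_lin": 0.85},
]

# The baseline is the mean of all control wells in the group.
baseline = mean(w["auc_lin"] for w in wells if w["Control"] == 1)

# Subtract the shared baseline from every well in the group.
normalized = {w["Well"]: w["auc_lin"] - baseline for w in wells}

for well, value in normalized.items():
    print(well, round(value, 2))
```

Because the baseline is an average, an individual control well will generally not normalize to exactly zero; only the mean of the normalized control values is zero.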
Command-Line arguments
To see the full list of arguments that `amiga normalize` will accept, run
python $amiga/amiga.py normalize --help
which will return the following message
usage: amiga.py [-h] -i INPUT [--over-write] [--verbose] [--group-by GROUP_BY]
[--normalize-by NORMALIZE_BY]
[--normalize-method {division,subtraction}]
Compare two growth curves
optional arguments:
-h, --help show this help message and exit
-i INPUT, --input INPUT
--over-write over-write file otherwise a new copy is made with
"_normalize" suffix
--verbose
--group-by GROUP_BY
--normalize-by NORMALIZE_BY
--normalize-method {division,subtraction}
See Command Line Interface for more details on these arguments.