Subset Data
Table of Contents
- Example One: How to subset on a single variable?
- Example Two: How to subset on multiple variables?
- Example Three: How to subset on multiple variables?
- Example Four: Can you use whitespaces and quotes?
- Example Five: Can you ignore certain wells?
- Troubleshooting: Why am I getting “TypeError reduce()” error message?
You can use AMiGA to analyze only specific wells, even if they are spread across different plates. To do this, use AMiGA's subsetting function. The following examples show how.
Example One: How to subset on a single variable?
You have hundreds of files in your data folder, but you only want to analyze wells corresponding to two specific isolates.
python amiga.py fit -i /Users/firasmidani/experiments -s Isolate:CD_treA,CD_treX --merge-summary -o CD_treA_treX
The subsetting argument above assumes that the relevant mapping files have an Isolate column. AMiGA can auto-generate this column for Biolog plates that are named so that AMiGA recognizes them. If you are not analyzing Biolog plates, you will have to pass a mapping or mapping\meta.txt file to provide this information to AMiGA. See Preparing Metadata for more details.
The proper syntax for the argument is the variable of interest (Isolate), followed by a colon (:), followed by the values of that variable separated by commas (,).
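The syntax above can be illustrated with a minimal Python sketch. The function name parse_selection is hypothetical and this is not AMiGA's own parsing code; it only shows how a single Variable:value1,value2 string breaks down into its parts.

```python
def parse_selection(spec):
    """Split 'Variable:value1,value2' into (variable, [values])."""
    # Everything before the first colon is the variable name.
    variable, _, values = spec.partition(":")
    # The remainder is a comma-separated list of accepted values.
    return variable, values.split(",")


variable, values = parse_selection("Isolate:CD_treA,CD_treX")
print(variable, values)  # Isolate ['CD_treA', 'CD_treX']
```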
The summary results are merged into a single file whose name includes a time stamp: summary_{Year}-{Month}-{Day}_{Hour}-{Minutes}-{Seconds}.txt. However, if you pass the -o argument, you can give the file a unique name instead of a time stamp.
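To show what a name following the time-stamp pattern looks like, here is a small Python sketch. The function name is hypothetical, and zero-padded fields are an assumption; the exact formatting AMiGA uses may differ.

```python
from datetime import datetime


def timestamped_summary_name(now=None):
    """Build a file name following summary_{Year}-{Month}-{Day}_{Hour}-{Minutes}-{Seconds}.txt."""
    # Default to the current time; zero-padding of each field is assumed here.
    now = now or datetime.now()
    return now.strftime("summary_%Y-%m-%d_%H-%M-%S.txt")


print(timestamped_summary_name(datetime(2021, 3, 4, 15, 6, 7)))
# summary_2021-03-04_15-06-07.txt
```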
Example Two: How to subset on multiple variables?
Same as above, but you want to analyze these isolates only when they are grown on a select set of substrates.
python amiga.py fit -i /Users/firasmidani/experiments -s "Isolate:CD_treA,CD_treX;Substrate:alpha-D-Glucose,D-Fructose,D-Trehalose"
If you are selecting on more than one variable, you must separate your selections with a semicolon (;). Because most Unix shells treat an unquoted semicolon as a command separator, wrap the whole argument in double quotes.
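The effect of selecting on several variables can be sketched in plain Python. This is an illustration, not AMiGA's implementation: it assumes a well is kept only if it matches one of the listed values for every variable. The function name select_wells and the sample rows (including CD_630 and L-Arabinose) are hypothetical.

```python
def select_wells(rows, criteria):
    """Keep rows whose value for every variable is among the allowed values."""
    return [
        row for row in rows
        if all(row.get(variable) in allowed for variable, allowed in criteria.items())
    ]


# Hypothetical mapping rows; real mapping files may hold more columns.
rows = [
    {"Well": "A1", "Isolate": "CD_treA", "Substrate": "D-Trehalose"},
    {"Well": "A2", "Isolate": "CD_treA", "Substrate": "L-Arabinose"},
    {"Well": "A3", "Isolate": "CD_630", "Substrate": "D-Fructose"},
]
criteria = {
    "Isolate": ["CD_treA", "CD_treX"],
    "Substrate": ["alpha-D-Glucose", "D-Fructose", "D-Trehalose"],
}

selected = select_wells(rows, criteria)
print([row["Well"] for row in selected])  # ['A1']
```

Well A2 is dropped because its substrate is not listed, and A3 because its isolate is not listed.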
Example Three: How to subset on multiple variables?
You have many files in your data folder and you pass meta-data to AMiGA using individual mapping files and a mapping\meta.txt file. The meta.txt file includes additional information about the presence of antibiotics in these wells.
python amiga.py fit -i /Users/firasmidani/experiments -s "Isolate:CD_treA,CD_treX;Substrate:alpha-D-Glucose,D-Fructose,D-Trehalose;Antibiotics:None,clindamycin"
Example Four: Can you use whitespaces and quotes?
Arguments can get quite lengthy. To make them easier to read and write, you can add white space around the delimiters (colons, semicolons, and commas) to visually separate the contents of the subsetting (or hypothesis) argument, as long as each argument is wrapped in double quotes.
python amiga.py fit -i /Users/firasmidani/experiments -s "Isolate : CD_treA , CD_treX ; Substrate : alpha-D-Glucose , D-Fructose,D-Trehalose ; Antibiotics : None , clindamycin"
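One way to see why the white space is harmless is a whitespace-tolerant parser. The sketch below is a hypothetical illustration (parse_subset is not an AMiGA function): it strips the space around every variable name and value, so the spaced and compact forms yield the same selection.

```python
def parse_subset(arg):
    """Parse 'Var:a,b;Var2:c' into {variable: [values]}, ignoring surrounding white space."""
    criteria = {}
    for selection in arg.split(";"):
        variable, _, values = selection.partition(":")
        criteria[variable.strip()] = [value.strip() for value in values.split(",")]
    return criteria


spaced = ("Isolate : CD_treA , CD_treX ; "
          "Substrate : alpha-D-Glucose , D-Fructose,D-Trehalose ; "
          "Antibiotics : None , clindamycin")
compact = ("Isolate:CD_treA,CD_treX;"
           "Substrate:alpha-D-Glucose,D-Fructose,D-Trehalose;"
           "Antibiotics:None,clindamycin")

print(parse_subset(spaced) == parse_subset(compact))  # True
```

Note that None is kept here as the literal text "None", matching how it appears in the argument.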
Example Five: Can you ignore certain wells?
You can also specify wells that should not be analyzed. This only works if you point to well locations in specific plates with the --f or --flag argument.
python amiga.py fit -i /Users/firasmidani/experiments --f "CD_treA.txt:G7,H12;ER1_PM2-1:C3,C4,C5"
This is often useful if you notice, by visually checking figures, that certain wells did not show any growth or showed odd measurements (e.g., gas bubbles can cause sharp spikes in OD measurements).
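The flag argument pairs each plate with the wells to ignore. As a rough sketch of that structure (not AMiGA's code; parse_flags and is_flagged are hypothetical names, and plate names are used exactly as typed), it can be read into a lookup table like this:

```python
def parse_flags(arg):
    """Parse 'plate:well,well;plate:well' into {plate: set of wells}."""
    flags = {}
    for entry in arg.split(";"):
        plate, _, wells = entry.partition(":")
        flags[plate.strip()] = {well.strip() for well in wells.split(",")}
    return flags


def is_flagged(plate, well, flags):
    """Return True if the well on this plate should be skipped."""
    return well in flags.get(plate, set())


flags = parse_flags("CD_treA.txt:G7,H12;ER1_PM2-1:C3,C4,C5")
print(is_flagged("ER1_PM2-1", "C4", flags))  # True
print(is_flagged("ER1_PM2-1", "A1", flags))  # False
```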
Troubleshooting: Why am I getting “TypeError reduce()” error message?
If you get a TypeError: reduce() of empty sequence with no initial value, check your arguments for typos. AMiGA is case-sensitive; for example, Substrate:alpha-D-glucose will result in an error if the substrate in the meta-data is capitalized differently, as in Substrate:alpha-D-Glucose.
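A quick way to catch capitalization typos before running AMiGA is to compare your requested values against the values in your meta-data while ignoring case. The helper below is a hypothetical sketch, not part of AMiGA. The second half shows that Python's functools.reduce raises a TypeError when given an empty sequence and no initial value. That this happens inside AMiGA because nothing matched the selection is an assumption, and the exact wording of the message varies across Python versions.

```python
from functools import reduce


def suggest_case_fixes(requested, available):
    """Map requested values that differ from an available value only by case."""
    by_lower = {value.lower(): value for value in available}
    return {
        value: by_lower[value.lower()]
        for value in requested
        if value not in available and value.lower() in by_lower
    }


# Hypothetical substrate names as they might appear in the meta-data.
available = ["alpha-D-Glucose", "D-Fructose", "D-Trehalose"]
fixes = suggest_case_fixes(["alpha-D-glucose", "D-Fructose"], available)
print(fixes)  # {'alpha-D-glucose': 'alpha-D-Glucose'}

# Reducing an empty list without an initial value raises TypeError.
matches = []  # assumed: nothing matched because of the capitalization typo
try:
    reduce(lambda a, b: a + b, matches)
    raised = False
except TypeError:
    raised = True
print(raised)  # True
```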