stat.mageck
R Documentation
Analysis: Analysis of pooled CRISPR screening data using MAGeCK
Description
caRpools also uses MAGeCK to look for enriched or depleted genes within your screening data. Please note that MAGeCK needs to be installed correctly; this can be tested with 'check.caRpools'.
Within this approach, the read counts of all sgRNAs in one dataset are first normalized by the function set in the MIACCS file. By default, each read count is divided by the dataset median.
Then, the fold changes of the sgRNAs targeting a gene are tested against the fold changes of either the non-targeting controls or randomly picked sgRNAs, as defined by the random picks option within the MIACCS file, using a two-sided Mann-Whitney U test. P-values are corrected for multiple testing using FDR.
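The steps described above (normalization, fold change, two-sided Mann-Whitney U test, FDR correction) can be sketched in base R. This is only an illustration under stated assumptions: the read counts, the gene names GENEA and GENEB, and the control label "random" are invented, and the code neither calls MAGeCK nor reproduces the internals of stat.mageck.

```r
# Hypothetical read counts for 16 sgRNAs: two genes plus eight non-targeting controls.
counts <- data.frame(
  gene      = rep(c("GENEA", "GENEB", "random"), times = c(4, 4, 8)),
  untreated = c(100, 120, 90, 110,
                100, 80, 150, 200,
                100, 100, 100, 100, 200, 200, 200, 200),
  treated   = c(20, 36, 9, 44,
                105, 76, 165, 180,
                98, 103, 92, 108, 204, 194, 212, 186),
  stringsAsFactors = FALSE
)

# Normalize each dataset by dividing by its median (the default described above).
norm.fun <- median
untreated.norm <- counts$untreated / norm.fun(counts$untreated)
treated.norm   <- counts$treated / norm.fun(counts$treated)

# log2 fold change of treated over untreated for every sgRNA.
counts$log2fc <- log2(treated.norm / untreated.norm)

# Test the sgRNAs of each gene against the control population.
controls <- counts$log2fc[counts$gene == "random"]
genes <- setdiff(unique(counts$gene), "random")
p.raw <- sapply(genes, function(g) {
  wilcox.test(counts$log2fc[counts$gene == g], controls,
              alternative = "two.sided")$p.value
})

# Correct the p-values for multiple testing.
p.adj <- p.adjust(p.raw, method = "fdr")
print(data.frame(gene = genes, p.raw = p.raw, p.adj = p.adj))
```

In this toy data set all GENEA sgRNAs are depleted relative to the controls, while the GENEB sgRNAs are spread evenly among them, so only GENEA receives a small adjusted p-value.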
Arguments
untreated.list
A list of untreated sample data frames of read-count data as created by load.file().
*Default* none
*Values* A list of data frames of the untreated samples
treated.list
A list of treated sample data frames of read-count data as created by load.file().
*Default* none
*Values* A list of data frames of the treated samples
namecolumn
In which column are the sgRNA identifiers?
*Default* 1
*Values* column number (numeric)
fullmatchcolumn
In which column are the read counts?
*Default* 2
*Values* column number (numeric)
extractpattern
PERL regular expression that is used to retrieve the gene identifier from the overall sgRNA identifier.
For example, from **AAK1_107_0** it will extract **AAK1**, since this is the gene identifier belonging to this sgRNA identifier. **Please see: Read-Count Data Files**
*Default* expression("^(.+?)(_.+)"), will work for most available libraries.
*Values* PERL regular expression with parentheses indicating the gene identifier (expression)
sort.criteria
MAGeCK argument *--sort-criteria*
*Default* "neg"
*Values* see MAGeCK documentation
mageckfolder
Folder for MAGeCK raw data output (internally used).
*Default* NULL
*Values* (character)
filename
Filename of raw MAGeCK data output.
*Default* "data"
*Values* (character)
adjust.method
Method used to adjust p-values for multiple testing. See MAGeCK documentation.
*Default* "fdr"
*Values* see MAGeCK documentation
fdr.pval
FDR used for correction.
*Default* 0.05
*Values* (numeric)
norm.fun
The mathematical function to normalize data. By default, the median is used.
*Default* median
*Values* Any mathematical function of R (function)
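As a quick check of the extractpattern argument listed above, the default pattern can be tried on a few sgRNA identifiers with base R's sub(). This is a minimal sketch: it passes the pattern as a plain character string rather than an expression object, the identifiers other than AAK1_107_0 are made up, and how stat.mageck applies the pattern internally is not shown here.

```r
# Default pattern from the documentation, written as a plain character string.
extractpattern <- "^(.+?)(_.+)"

# AAK1_107_0 is the example from the documentation; the others are hypothetical.
sgrna.ids <- c("AAK1_107_0", "AAK1_108_0", "ABL1_12_0")

# The first parenthesized group is the gene identifier; keep only that group.
gene.ids <- sub(extractpattern, "\\1", sgrna.ids, perl = TRUE)
print(gene.ids)
```

Because the first group is matched lazily, everything before the first underscore is taken as the gene identifier. Libraries whose gene names themselves contain underscores would need a different pattern.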
Details
none
Value
stat.mageck returns a list of two data frames:
one with gene information, the other with sgRNA information.