stat.wilcox
R Documentation
Analysis: Analysis of pooled CRISPR screening data using a Wilcoxon Test
Description
__Wilcox__
In this approach, the read counts of all sgRNAs in one dataset are first normalized by the function set in the MIACCS file. By default, each read count is divided by the dataset median.
Then, for each gene, the fold changes of its sgRNAs are tested against a reference population using a two-sided Mann-Whitney U test. The reference population consists of either the non-targeting controls or randomly picked sgRNAs, as defined by the random picks option within the MIACCS file. P-values are corrected for multiple testing using FDR.
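The procedure described above can be sketched in base R. This is a minimal illustration on simulated data, not the package's actual implementation: the toy sgRNA names, the control name "random", the simulated counts and the per-sgRNA fold change (normalized treated divided by normalized untreated) are all assumptions made for the example.

```r
set.seed(42)

# Hypothetical toy data: two genes with 6 sgRNAs each and 30 non-targeting controls
sgrna <- c(paste0("geneA_", 1:6), paste0("geneB_", 1:6), paste0("random_", 1:30))
untreated <- data.frame(name = sgrna, count = rpois(length(sgrna), lambda = 200) + 1)
treated   <- data.frame(name = sgrna, count = rpois(length(sgrna), lambda = 200) + 1)
treated$count[1:6] <- treated$count[1:6] * 4  # simulate enrichment of geneA

# Step 1: normalize each dataset by its median read count
norm.untreated <- untreated$count / median(untreated$count)
norm.treated   <- treated$count / median(treated$count)

# Step 2: fold change per sgRNA (assumed definition: treated / untreated)
foldchange <- norm.treated / norm.untreated

# Step 3: derive the gene name from the sgRNA name
gene <- sub("^(.+?)_.+", "\\1", sgrna, perl = TRUE)

# Step 4: two-sided Mann-Whitney U test of each gene against the controls
control.fc <- foldchange[gene == "random"]
genes <- setdiff(unique(gene), "random")
p.value <- sapply(genes, function(g) {
  wilcox.test(foldchange[gene == g], control.fc,
              alternative = "two.sided", exact = FALSE)$p.value
})

# Step 5: correct the p-values for multiple testing (FDR)
p.adjusted <- p.adjust(p.value, method = "fdr")

result <- data.frame(gene = genes, p.value = p.value, p.adjusted = p.adjusted)
print(result)
```

In this simulation only geneA was artificially enriched, so it should stand out against the control population while geneB should not.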
Arguments
untreated.list
A list of data.frames of untreated control samples, e.g. list(df.control1, df.control2)
treated.list
A list of data.frames of treated samples. e.g. list(df.treated1, df.treated2)
namecolumn
Column in which the target names are located, e.g. namecolumn=1 for the first column.
fullmatchcolumn
Column in which the read counts are located, e.g. fullmatchcolumn=2 for the second column.
normalize
Datasets can be normalized by norm.fun if normalize=TRUE.
norm.fun
The function used to normalize the datasets if normalize=TRUE. By default, normalization uses the dataset median, but in principle any other function, e.g. mean, can be used.
extractpattern
Regular expression used to extract the gene name from the sgRNA name. Please make sure that the gene name is accessible as a capture group by enclosing its part of the regular expression in parentheses (). The default value expression("^(.+?)_.+") looks for the gene name (.+?) in front of the separator _ followed by any characters .+, e.g. gene1_anything.
controls
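A set of non-targeting sgRNAs (negative controls) within the datasets serves as the reference population. You can specify the arbitrary gene name for these controls using controls="arbitrary.gene.name.of.controls". If controls=NULL, randomly picked sgRNAs are used instead (see control.picks).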
sorting
Analysis output is by default sorted by gene name (sorting=FALSE). If desired, the output table can be sorted according to the p-value of the genes (sorting=TRUE).
control.picks
If no non-targeting controls are present or set, wilcox will randomly pick this number of sgRNAs from the data set as the alternative population. This is only used if 'controls=NULL'.
*Default* 300
*Values* numeric
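How extractpattern, controls and control.picks work together can be sketched as follows. This is only an illustration under assumptions: the sgRNA names are made up, the pattern is passed as a plain string rather than as an expression, and pick_reference is a hypothetical helper that is not part of the package.

```r
# Hypothetical sgRNA identifiers in the form gene_suffix
sgrna <- c("AAK1_1", "AAK1_2", "ABL1_1", "ABL1_2", "random_1", "random_2")

# Extract the gene name: the first capture group of the default pattern
gene <- sub("^(.+?)_.+", "\\1", sgrna, perl = TRUE)

# Hypothetical helper: return the indices of the reference population.
# If a control gene name is given, use its sgRNAs; otherwise pick
# control.picks sgRNAs at random from the whole data set.
pick_reference <- function(gene, controls = NULL, control.picks = 300) {
  if (!is.null(controls)) {
    which(gene == controls)
  } else {
    sample(seq_along(gene), size = min(control.picks, length(gene)))
  }
}

# Reference population from named non-targeting controls
ref.controls <- pick_reference(gene, controls = "random")

# Reference population from random picks (only used when controls = NULL)
set.seed(1)
ref.random <- pick_reference(gene, controls = NULL, control.picks = 3)
```

The size of the random pick is capped at the number of available sgRNAs here so that the sketch also runs on very small data sets; whether the package does the same is not stated in this documentation.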
Value
stat.wilcox returns a data.frame, which can be visualized by plot.hitident.
The data.frame has the following format:
     untreated  treated foldchange   p.value
AAK1  2.061346 3.007924   1.351672 0.2966311
AATK  3.413357 5.129985   1.398695 0.1146190
ABI1  2.997385 4.384881   1.418959 0.1437962
ABL1  2.269906 2.874087   1.211499 0.3681327
ABL2  2.519391 4.539583   1.732575 0.6335575
For each gene, the foldchange as well as the p-value, derived by the Mann-Whitney U test against the non-targeting controls, are listed.
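A returned table of this shape can be post-processed with base R. The sketch below rebuilds the example values by hand, assuming that the gene names are stored as row names; the 0.05 cutoff is an arbitrary choice for illustration, not a package default.

```r
# Example values copied from the table above (gene names assumed to be row names)
res <- data.frame(
  untreated  = c(2.061346, 3.413357, 2.997385, 2.269906, 2.519391),
  treated    = c(3.007924, 5.129985, 4.384881, 2.874087, 4.539583),
  foldchange = c(1.351672, 1.398695, 1.418959, 1.211499, 1.732575),
  p.value    = c(0.2966311, 0.1146190, 0.1437962, 0.3681327, 0.6335575),
  row.names  = c("AAK1", "AATK", "ABI1", "ABL1", "ABL2")
)

# Sort genes by ascending p-value
res.sorted <- res[order(res$p.value), ]

# Keep genes below an illustrative significance cutoff
hits <- res.sorted[res.sorted$p.value < 0.05, ]

print(res.sorted)
```

None of the five example genes passes the cutoff, so hits is an empty data.frame here.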