The name of the gene for which to calculate the cryptic
initiation site(s).
bamfiles
A character vector of BAM file paths.
types
A vector of the same length as bamfiles indicating the type
of data in each file. Example: "WT" vs "MUT" or "untreated" vs "treated".
annotations
An object of type annotationsSet
containing information on genes.
introns
An object of type annotationsSet
containing the annotations of the intronic regions.
Note: The introns must have the same name as the gene they
are associated with.
sf
A vector of the scaling factors to apply to each
sample. Must be the same length as fragments_file.
replicates
The number of times to re-sample the data. Default = 200.
percentage
The fraction of data to be removed at each simulation.
Default = 0.1.
method
A character string or a vector specifying the method to use to
calculate the cryptic initiation sites. Must be one of "methodC_gaussian"
(default), "methodA", "methodB", "methodC" or "methodD".
paired_end
Logical indicating whether the BAM files
contain paired-end data.
as_fragments
Logical indicating whether paired-end data must be paired
and merged to form fragments.
Details
By definition, the observed f value for a gene is the perpendicular
distance between the differential cumulative RNA-seq values (type1 - type2)
and a diagonal linking the first and last data points. The simulated
f max is the maximum f value for a gene after re-sampling the data.
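
The definition above can be illustrated with a minimal sketch. It is not the
package's implementation: the function name perpendicular_f, the use of the
position index as the x axis, and the choice of an unsigned distance are all
assumptions made for illustration.

```python
import math


def perpendicular_f(diff_coverage):
    """Hypothetical helper: f value at each position of a gene.

    diff_coverage holds the per-position difference (type1 - type2).
    """
    # Cumulative sum of the differential signal along the gene.
    cumulative = []
    total = 0.0
    for value in diff_coverage:
        total += value
        cumulative.append(total)

    n = len(cumulative)
    if n < 2:
        return [0.0] * n

    # Diagonal linking the first point (0, y0) and the last point (n - 1, y1).
    # ASSUMPTION: x is the position index, with no rescaling of either axis.
    y0, y1 = cumulative[0], cumulative[-1]
    dx, dy = float(n - 1), y1 - y0
    length = math.hypot(dx, dy)

    # Point-to-line distance (unsigned) from each cumulative point to the diagonal.
    return [abs(dy * x - dx * (y - y0)) / length
            for x, y in enumerate(cumulative)]


# Toy signal: no difference over the first three positions, then a constant one.
f_values = perpendicular_f([0, 0, 0, 4, 4, 4])
f_max_position = f_values.index(max(f_values))
```

In this toy signal the largest f value falls on the last position before the
difference appears, which is the kind of position the methods below look for.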
Method A identifies a cryptic zone by calculating the positions for which
the observed f value is in the distribution of simulated f max.
Method B identifies a cryptic zone by calculating the positions for which the mean
simulated f value is in the distribution of simulated f max.
Method C identifies a cryptic zone by calculating the positions for which the
simulated f value is in the distribution of simulated f max.
Method D identifies a cryptic zone by calculating the position of each
simulated f max.
Method C gaussian determines the mean and standard deviation of all the
positions for which the simulated f value is in the distribution of simulated
f max.
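
As a rough illustration of the Method C gaussian summary, the sketch below
takes one simulated f curve per re-sampling and reports the mean and standard
deviation of the selected positions. It is not the package's code: the
function name gaussian_summary is hypothetical, and "in the distribution of
simulated f max" is read here as "at or above the smallest simulated f max",
which is an assumption about the selection rule.

```python
import statistics


def gaussian_summary(simulated_f):
    """Hypothetical helper: mean and sd of the selected positions.

    simulated_f holds one list of f values per re-sampling, all the same length.
    """
    # Simulated f max: the largest f value of each re-sampled curve.
    f_max = [max(curve) for curve in simulated_f]

    # ASSUMPTION: a simulated f value is "in the distribution" when it is at
    # least as large as the smallest simulated f max.
    threshold = min(f_max)

    # Collect every position, across all simulations, that passes the threshold.
    positions = [pos
                 for curve in simulated_f
                 for pos, value in enumerate(curve)
                 if value >= threshold]

    # The sample standard deviation needs at least two positions.
    sd = statistics.stdev(positions) if len(positions) > 1 else 0.0
    return statistics.mean(positions), sd


# Three toy simulations over a five-position gene.
mean_pos, sd_pos = gaussian_summary([
    [0, 1, 3, 1, 0],
    [0, 2, 4, 2, 0],
    [0, 1, 2, 3, 0],
])
```

Under the same assumption, Methods A to C would report the start and end of the
selected positions rather than their mean and standard deviation.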
Value
A list with the following components:
methodC_gaussian
Mean and standard deviation of the cTSS position,
calculated using method C (gaussian).
methodA
Start and end positions of the cryptic zones
identified using method A.
methodB
Start and end positions of the cryptic zones
identified using method B.
methodC
Start and end positions of the cryptic zones
identified using method C.
methodD
Start and end positions of the cryptic zones
identified using method D.
gene_information
An object of class annotationsSet
containing the information on the gene.