The main function is doppelgangR(), which takes as minimal input a list of ExpressionSet object, and searches all list pairs for duplicated samples. The search is based on the genomic data (exprs(eset)), phenotype/clinical data (pData(eset)), and "smoking guns" - supposedly unique identifiers found in pData(eset).
Density function, distribution function and random number generation for the skew-t (ST) distribution. Functions copied from sn CRAN library v0.4.18 for argument name compatibility with st.mle function from the same version.
● Data Source:
BioConductor
● Keywords: distribution
● Alias: dst, pst, qst, rst
●
0 images
Fits a skew-t (ST) or multivariate skew-t (MST) distribution to data, or fits a linear regression model with (multivariate) skew-t errors, using maximum likelihood estimation. Functions copied from sn CRAN library v0.4.18 because they were later deprecated in that library.
By default uses the Fisher z-transform for Pearson correlation (atanh), and identifies outliers as those above the quantile of a skew-t distribution with mean and standard deviation estimated from the z-transformed matrix. The quantile is calculated from the Bonferroni-corrected cumulative probability of the upper tail.
phenoFinder
(Package: doppelgangR) :
Calculate pairwise similarities of phenoData between samples for a list containing two ExpressionSets
This function acts as a wrapper to phenoDist to handle cases of one ExpressionSet, a list of two identical ExpressionSets, or a list of two different ExpressionSets.
plot-methods
(Package: doppelgangR) :
Histograms of all pairwise sample correlations, showing
Identified doppelgangers are shown with a red vertical line overlaid on a histogram of pairwise sample correlations. One plot is made per pair of datasets.