numeric scalar indicating the number of most variable features to
use for the PCA. Default is 500, but any ntop argument is
overridden if the feature_set argument is non-NULL.
ncomponents
numeric scalar indicating the number of principal
components to plot, starting from the first principal component. Default is
2. If ncomponents is 2, then a scatterplot of PC2 vs PC1 is produced.
If ncomponents is greater than 2, a pairs plot of the top components
is produced.
exprs_values
character string indicating which values should be used
as the expression values for this plot. Valid arguments are "tpm"
(transcripts per million), "norm_tpm" (normalised TPM
values), "fpkm" (FPKM values), "norm_fpkm" (normalised FPKM
values), "counts" (counts for each feature), "norm_counts" (normalised counts),
"cpm" (counts-per-million), "norm_cpm" (normalised
counts-per-million), "exprs" (whatever is in the 'exprs' slot
of the SCESet object; default), "norm_exprs" (normalised
expression values) or "stand_exprs" (standardised expression values)
or any other named element of the assayData slot of the SCESet
object that can be accessed with the get_exprs function.
colour_by
character string defining the column of pData(object) to
be used as a factor by which to colour the points in the plot.
shape_by
character string defining the column of pData(object) to
be used as a factor by which to define the shape of the points in the plot.
size_by
character string defining the column of pData(object) to
be used as a factor by which to define the size of points in the plot.
feature_set
character, numeric or logical vector indicating a set of
features to use for the PCA. If character, entries must all be in
featureNames(object). If numeric, values are taken to be indices for
features. If logical, vector is used to index features and should have length
equal to nrow(object).
return_SCESet
logical, should the function return an SCESet
object with principal component values for cells in the
reducedDimension slot? Default is FALSE, in which case a
ggplot object is returned.
scale_features
logical, should the expression values be standardised
so that each feature has unit variance? Default is TRUE.
draw_plot
logical, should the plot be drawn on the current graphics
device? Only used if return_SCESet is TRUE, otherwise the plot
is always produced.
pca_data_input
character argument defining which data should be used
as input for the PCA. Possible options are "exprs" (default), which
uses expression data to produce a PCA at the cell level; "pdata" which
uses numeric variables from pData(object) to do PCA at the cell level;
and "fdata" which uses numeric variables from fData(object) to
do PCA at the feature level.
selected_variables
character vector indicating which variables in
pData(object) to use for the phenotype-data based PCA. Ignored if
the argument pca_data_input is anything other than "pdata".
detect_outliers
logical, should outliers be detected in the PC plot?
Only an option when pca_data_input argument is "pdata". Default
is FALSE.
theme_size
numeric scalar giving default font size for plotting theme
(default is 10).
legend
character, specifying how the legend(s) should be shown. Default is
"auto", which hides legends that have only one level and shows others.
Alternatives are "all" (show all legends) or "none" (hide all legends).
...
further arguments passed to plotPCASCESet
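The way ntop and feature_set interact, as described in the arguments above, can be sketched in base R. This is only an illustration and not scater's own code: the helper select_features, the simulated matrix expr_mat, and the choice to rank features by row variance are all assumptions made for the example.

```r
# Hypothetical helper (not part of scater): choose the rows of a
# features-by-cells matrix that would be passed on to the PCA.
select_features <- function(expr_mat, ntop = 500, feature_set = NULL) {
    if (!is.null(feature_set)) {
        # A non-NULL feature_set takes precedence over ntop.
        if (is.character(feature_set))
            stopifnot(all(feature_set %in% rownames(expr_mat)))
        if (is.logical(feature_set))
            stopifnot(length(feature_set) == nrow(expr_mat))
        return(expr_mat[feature_set, , drop = FALSE])
    }
    # Otherwise keep the ntop features with the largest variance.
    feature_var <- apply(expr_mat, 1, var)
    n_keep <- min(ntop, nrow(expr_mat))
    keep <- order(feature_var, decreasing = TRUE)[seq_len(n_keep)]
    expr_mat[keep, , drop = FALSE]
}

set.seed(1)
# Simulated data: 200 features (rows) by 10 cells (columns).
expr_mat <- matrix(rnorm(200 * 10), nrow = 200,
                   dimnames = list(paste0("Gene_", 1:200),
                                   paste0("Cell_", 1:10)))

top50   <- select_features(expr_mat, ntop = 50)                      # ntop is used
by_idx  <- select_features(expr_mat, ntop = 50, feature_set = 1:100) # ntop is ignored
by_name <- select_features(expr_mat, feature_set = c("Gene_3", "Gene_7"))
```

In the second call the numeric feature_set wins, so 100 features are kept even though ntop is 50.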
Details
The function prcomp is used internally to do the PCA.
The function checks whether the object has standardised
expression values (by looking at stand_exprs(object)). If yes, the
existing standardised expression values are used for the PCA. If not, then
standardised expression values are computed using scale (with
feature-wise unit variances or not according to the scale_features
argument), added to the object and PCA is done using these new standardised
expression values.
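The standardisation and PCA steps described above can be approximated in base R as follows. This is a sketch under stated assumptions rather than the function's actual code: the matrix expr_mat is simulated, and the transpose convention (cells as rows when calling scale and prcomp) is chosen for the illustration.

```r
set.seed(42)
# Simulated features-by-cells expression matrix (an assumption for this sketch).
expr_mat <- matrix(rnorm(100 * 20, mean = 5), nrow = 100,
                   dimnames = list(paste0("Gene_", 1:100),
                                   paste0("Cell_", 1:20)))

scale_features <- TRUE

# scale() operates on columns, so put features in columns (cells in rows).
# In this sketch centring is always applied; unit variance per feature is
# applied only when scale_features is TRUE.
stand_exprs <- scale(t(expr_mat), center = TRUE, scale = scale_features)

# PCA on the already-standardised values, so prcomp does no further
# centring or scaling.
pca <- prcomp(stand_exprs, center = FALSE, scale. = FALSE)

# Cell coordinates on the first two principal components.
pc_scores <- pca$x[, 1:2]

# Percentage of variance explained by each component.
pct_var <- 100 * pca$sdev^2 / sum(pca$sdev^2)
```

Plotting pc_scores[, 2] against pc_scores[, 1] gives the kind of PC2 vs PC1 scatterplot that corresponds to ncomponents = 2.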
If the arguments detect_outliers and return_SCESet are both
TRUE, then the element $outlier is added to the pData
(phenotype data) slot of the SCESet object. This element contains
indicator values about whether or not each cell has been designated as an
outlier based on the PCA. These values can be accessed for filtering
low quality cells with, for example, example_sceset$outlier.
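Filtering on such an indicator might look like the following base R sketch. The data frame cell_info stands in for pData(example_sceset) and the matrix expr_mat for the expression values; both are made up for the illustration. The outlier flags are set by hand, are assumed to be logical, and are not the output of any outlier detection.

```r
# Stand-in for the phenotype data: one row per cell.
# The outlier flags are set by hand purely for illustration.
cell_names <- paste0("Cell_", 1:6)
cell_info <- data.frame(
    Cell = cell_names,
    outlier = c(FALSE, FALSE, TRUE, FALSE, TRUE, FALSE),
    stringsAsFactors = FALSE
)

# Stand-in expression matrix with one column per cell.
expr_mat <- matrix(seq_len(4 * 6), nrow = 4,
                   dimnames = list(paste0("Gene_", 1:4), cell_names))

# Keep only the cells that are not flagged as outliers.
keep_cells    <- !cell_info$outlier
filtered_mat  <- expr_mat[, keep_cells, drop = FALSE]
filtered_info <- cell_info[keep_cells, , drop = FALSE]
```

With an SCESet, the analogous column subsetting would presumably be example_sceset[, !example_sceset$outlier], again assuming the indicator is stored as a logical vector.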
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Platform: x86_64-pc-linux-gnu (64-bit)
> library(scater)
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel
Loading required package: ggplot2
Attaching package: 'scater'
The following object is masked from 'package:stats':
filter
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/scater/plotPCA.Rd_%03d_medium.png", width=480, height=480)
> ### Name: plotPCA
> ### Title: Plot PCA for an SCESet object
> ### Aliases: plotPCA plotPCA,SCESet-method plotPCASCESet
>
> ### ** Examples
>
> ## Set up an example SCESet
> data("sc_example_counts")
> data("sc_example_cell_info")
> pd <- new("AnnotatedDataFrame", data = sc_example_cell_info)
> example_sceset <- newSCESet(countData = sc_example_counts, phenoData = pd)
> drop_genes <- apply(exprs(example_sceset), 1, function(x) {var(x) == 0})
> example_sceset <- example_sceset[!drop_genes, ]
>
> ## Examples plotting PC1 and PC2
> plotPCA(example_sceset)
> plotPCA(example_sceset, colour_by = "Cell_Cycle")
> plotPCA(example_sceset, colour_by = "Cell_Cycle", shape_by = "Treatment")
> plotPCA(example_sceset, colour_by = "Cell_Cycle", shape_by = "Treatment",
+ size_by = "Mutation_Status")
Warning message:
Using size for a discrete variable is not advised.
> plotPCA(example_sceset, shape_by = "Treatment", size_by = "Mutation_Status")
Warning message:
Using size for a discrete variable is not advised.
> plotPCA(example_sceset, feature_set = 1:100, colour_by = "Treatment",
+ shape_by = "Mutation_Status")
>
> ## experiment with legend
> example_subset <- example_sceset[, example_sceset$Treatment == "treat1"]
> plotPCA(example_subset, colour_by = "Cell_Cycle", shape_by = "Treatment", legend = "all")
>
> plotPCA(example_sceset, shape_by = "Treatment", return_SCESet = TRUE)
SCESet (storageMode: environment)
assayData: 1973 features, 40 samples
element names: counts, cpm, exprs, is_exprs
protocolData: none
phenoData
sampleNames: Cell_001 Cell_002 ... Cell_040 (40 total)
varLabels: Cell Mutation_Status Cell_Cycle Treatment
varMetadata: labelDescription
featureData: none
experimentData: use 'experimentData(object)'
Annotation:
>
> ## Examples plotting more than 2 PCs
> plotPCA(example_sceset, ncomponents = 8)
> plotPCA(example_sceset, ncomponents = 4, colour_by = "Treatment",
+ shape_by = "Mutation_Status")
>
> dev.off()
null device
1
>