Last data update: 2014.03.03

kmeansDesign    R Documentation

Apply k-means clustering to profile data

Description

This function performs k-means clustering on profile matrices generated by recoup and stores the resulting cluster assignments as a factor in the design element. If no design is present, one is created from the k-means result.
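As a rough illustration of the idea, the sketch below clusters the rows of a simulated matrix with base R's kmeans() and stores the assignment as a factor in a data frame. It is not recoup's internal code: the matrix is toy data standing in for a profile matrix, and the names profiles, toy_design and kcluster are hypothetical.

```r
# Toy stand-in for a profile matrix: 100 regions (rows) x 50 bins (columns),
# built from two well-separated groups of 50 rows each.
set.seed(42)
profiles <- rbind(
    matrix(rnorm(50 * 50, mean = 1), nrow = 50),
    matrix(rnorm(50 * 50, mean = 5), nrow = 50)
)
rownames(profiles) <- paste0("region_", seq_len(nrow(profiles)))

# Cluster the rows into k = 2 groups with base R's kmeans()
km <- kmeans(profiles, centers = 2, nstart = 20, iter.max = 20,
    algorithm = "MacQueen")

# Store the assignment as a factor in a one-column data frame
# ("kcluster" is a hypothetical column name)
toy_design <- data.frame(
    kcluster = factor(km$cluster),
    row.names = rownames(profiles)
)

# Number of regions per cluster
table(toy_design$kcluster)
```

Each row of the data frame corresponds to one region, so the factor can be used to split profiles or heatmaps by cluster.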

Usage

    kmeansDesign(input, design = NULL, kmParams)

Arguments

input

A list object created by recoup, a list partially processed by recoup, or the data member of such a list. See the main input to recoup for further information.

design

A design data frame to be augmented with the clustering result, or NULL. See the respective argument in recoup for further information.

kmParams

A list of parameters for k-means clustering on profiles. See the respective argument in recoup for further information.
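The fields used in the Examples section (k, nstart, algorithm, iterMax, seed) resemble arguments of base R's kmeans(). The sketch below shows one plausible way such a list could be passed on. The helper run_kmeans and the field-to-argument mapping are assumptions made for illustration, not recoup's implementation, and the reference field is left unused here.

```r
# Hypothetical helper: run stats::kmeans() from a kmParams-style list.
# The mapping of list fields to kmeans() arguments is an assumption.
run_kmeans <- function(x, kmParams) {
    if (!is.null(kmParams$seed))
        set.seed(kmParams$seed)    # make the random starts reproducible
    kmeans(
        x,
        centers = kmParams$k,
        nstart = kmParams$nstart,
        iter.max = kmParams$iterMax,
        algorithm = kmParams$algorithm
    )
}

kmParams <- list(k = 2, nstart = 20, algorithm = "MacQueen", iterMax = 20,
    reference = NULL, seed = 42)

# Toy matrix: 40 rows in two well-separated groups, plus small noise
x <- rbind(matrix(0, 20, 10), matrix(10, 20, 10)) +
    matrix(runif(400, -0.5, 0.5), 40, 10)

# Setting the seed inside the helper makes repeated runs comparable
fit1 <- run_kmeans(x, kmParams)
fit2 <- run_kmeans(x, kmParams)
identical(fit1$cluster, fit2$cluster)
```

Fixing the seed before clustering matters because k-means starts from random centers, so cluster labels can otherwise differ between runs.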

Value

The design data frame, either created from scratch or augmented by k-means clustering.
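The two cases (a design created from scratch versus an existing one being augmented) can be sketched in base R as follows. The helper add_clusters, the column name kcluster and the toy gene names are hypothetical; the real function may name and align its columns differently.

```r
# Hypothetical helper: add a cluster factor to a design, or create a design.
add_clusters <- function(clusters, design = NULL) {
    lev <- sort(unique(clusters))
    if (is.null(design)) {
        # No design supplied: build one from the clustering alone
        return(data.frame(
            kcluster = factor(clusters, levels = lev),
            row.names = names(clusters)
        ))
    }
    # Design supplied: align by row name so each region keeps its own label
    design$kcluster <- factor(clusters[rownames(design)], levels = lev)
    design
}

# Toy cluster assignments, named by region
clusters <- c(geneA = 1L, geneB = 2L, geneC = 1L)

# Case 1: design created from scratch
fresh <- add_clusters(clusters)

# Case 2: existing design (rows in a different order) augmented in place
existing <- data.frame(
    strand = factor(c("+", "-", "+")),
    row.names = c("geneC", "geneA", "geneB")
)
augmented <- add_clusters(clusters, existing)
```

Aligning by row name rather than by position avoids mislabeling regions when the existing design is ordered differently from the clustered matrix.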

Author(s)

Panagiotis Moulos

Examples

# Load some data
data("recoup_test_data",package="recoup")

# Calculate coverages
test.tss <- recoup(
    test.input,
    design=NULL,
    region="tss",
    type="chipseq",
    genome=test.genome,
    flank=c(1000,1000),
    selector=NULL,
    plotParams=list(plot=FALSE,profile=TRUE,
        heatmap=TRUE,device="x11"),
    rc=0.5
)

# Re-design based on k-means
kmParams <- list(k=2,nstart=20,algorithm="MacQueen",iterMax=20,
    reference=NULL,seed=42)
design <- kmeansDesign(test.tss$data,kmParams=kmParams)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(recoup)
Loading required package: GenomicRanges
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    colMeans, colSums, expand.grid, rowMeans, rowSums

Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: GenomicAlignments
Loading required package: SummarizedExperiment
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: Biostrings
Loading required package: XVector
Loading required package: Rsamtools
Loading required package: ggplot2
Loading required package: ComplexHeatmap
Loading required package: grid
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/recoup/kmeansDesign.Rd_%03d_medium.png", width=480, height=480)
> ### Name: kmeansDesign
> ### Title: Apply k-means clustering to profile data
> ### Aliases: kmeansDesign
> 
> ### ** Examples
> 
> # Load some data
> data("recoup_test_data",package="recoup")
> 
> # Calculate coverages
> test.tss <- recoup(
+     test.input,
+     design=NULL,
+     region="tss",
+     type="chipseq",
+     genome=test.genome,
+     flank=c(1000,1000),
+     selector=NULL,
+     plotParams=list(plot=FALSE,profile=TRUE,
+         heatmap=TRUE,device="x11"),
+     rc=0.5
+ )
Calculating tss coverage for WT H4K20me1
Calculating tss coverage for Set8KO H4K20me1
Calculating profile for WT H4K20me1
Calculating profile for Set8KO H4K20me1
Constructing genomic coverage profile curve(s)
The resolution of the requested profiles will be lowered to avoid
increased computation time and/or storage space for heatmap profiles...
Calculating tss profile for WT H4K20me1
Calculating tss profile for Set8KO H4K20me1
Constructing genomic coverage heatmap(s)
Constructing coverage correlation profile curve(s)
> 
> # Re-design based on k-means
> kmParams=list(k=2,nstart=20,algorithm="MacQueen",iterMax=20,
+     reference=NULL,seed=42)
> design <- kmeansDesign(test.tss$data,kmParams=kmParams)
Performing k-means (k=2) clustering on total profiles
> 
> dev.off()
null device 
          1 
>