Last data update: 2014.03.03

R: Function to do k-means cluster analysis
kmeansM {maigesPack}    R Documentation

Function to do k-means cluster analysis

Description

This function performs k-means cluster analysis on objects of class maiges, maigesRaw or maigesANOVA. For objects of class maigesDEcluster, use the function kmeansMde instead.

Usage

kmeansM(data, group=c("C", "R")[1], distance="correlation",
        method="complete", sampleT=NULL, doHier=FALSE, sLabelID="SAMPLE",
        gLabelID="GeneName", rmGenes=NULL, rmSamples=NULL, rmBad=TRUE,
        geneGrp=NULL, path=NULL, ...)

Arguments

data

object of class maigesRaw, maiges or maigesANOVA.

group

character string giving the type of grouping: by rows 'R' or columns 'C' (default).

distance

character string giving the type of distance to use. The distance is computed with the function Dist, and the possible values are 'euclidean', 'maximum', 'manhattan', 'canberra', 'binary', 'pearson', 'correlation' (default) and 'spearman'.

method

character string specifying the linkage method for the hierarchical clustering. Possible values are 'ward', 'single', 'complete' (default), 'average', 'mcquitty', 'median' or 'centroid'.

sampleT

list with 2 vectors. The first one specifies the first letter of each sample type to be coloured distinctly, and the second one gives the corresponding colours. If NULL (default), no colour is used.

doHier

logical indicating whether to draw the hierarchical branch in the dimension opposite to the one being clustered. Defaults to FALSE.

sLabelID

character string specifying the sample label ID to be used to label the samples.

gLabelID

character string specifying the gene label ID to be used to label the genes.

rmGenes

character vector specifying genes to be removed.

rmSamples

character vector specifying samples to be removed.

rmBad

logical indicating whether to remove bad spots (slot BadSpots in objects of class maiges, maigesRaw or maigesANOVA). Defaults to TRUE.

geneGrp

numerical or character specifying the gene group to be clustered. This is given by the columns of the slot GeneGrps in objects of classes maiges, maigesRaw and maigesANOVA.

path

numerical or character specifying the gene network to be clustered. This is given by the items of the slot Paths in objects of classes maiges, maigesRaw and maigesANOVA.

...

additional parameters passed to the function Kmeans (for example centers, the number of clusters).
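
As a rough illustration of the default distance argument, the sketch below computes a correlation distance between the rows of a toy matrix in base R. It assumes that 'correlation' denotes one minus the centered Pearson correlation; the matrix x and the object corDist are hypothetical names used only here, and the code does not call Dist itself.

```r
## Sketch only: base R illustration, not the code used inside kmeansM.
set.seed(42)

## Toy matrix: 5 rows (e.g. genes) measured in 8 columns (e.g. samples)
x <- matrix(rnorm(5 * 8), nrow = 5,
            dimnames = list(paste0("row", 1:5), NULL))
x["row2", ] <- 2 * x["row1", ] + 1   # perfectly correlated with row1
x["row3", ] <- -x["row1", ]          # perfectly anti-correlated with row1

## Assumed definition: correlation distance = 1 - Pearson correlation
corDist <- as.dist(1 - cor(t(x)))

## Distances among the first three rows
round(as.matrix(corDist)[1:3, 1:3], 3)
```

Under this definition, rows with the same profile shape have distance close to 0 regardless of scale, while opposite profiles have distance close to 2.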

Details

This function implements the k-means clustering method for the microarray data objects defined in this package. The method uses the function Kmeans from package amap.
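
The effect of the group argument can be sketched on a plain matrix. The example below is a stand-in that uses kmeans from base R (package stats) rather than Kmeans from amap, and it assumes that clustering by columns amounts to clustering the transposed matrix; the objects expr, byCols and byRows are hypothetical.

```r
## Sketch only: base R stand-in, not the maigesPack implementation.
set.seed(1)

## Toy expression matrix: 20 genes (rows) x 6 samples (columns)
expr <- matrix(rnorm(20 * 6), nrow = 20,
               dimnames = list(paste0("g", 1:20), paste0("s", 1:6)))
expr[, 4:6] <- expr[, 4:6] + 5   # shift the last three samples

## Analogue of group = "C": cluster samples, so samples become the rows
byCols <- kmeans(t(expr), centers = 2, nstart = 5)

## Analogue of group = "R": cluster genes, which are already the rows
byRows <- kmeans(expr, centers = 4, nstart = 5)

byCols$cluster          # one cluster label per sample
table(byRows$cluster)   # number of genes in each cluster
```

In this toy data the three shifted samples should fall into one cluster and the three unshifted samples into the other.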

Value

This function displays the heatmaps and invisibly returns the list produced by the function Kmeans.

Author(s)

Gustavo H. Esteves <gesteves@vision.ime.usp.br>

See Also

Kmeans from package amap. somM and hierM for displaying SOM and hierarchical clusters, respectively.

Examples

## Loading the dataset
data(gastro)

## Doing a K-means cluster with 2 groups using all genes, for maigesRaw class
kmeansM(gastro.raw, rmGenes=c("BLANK","DAP","LYS","PHE", "Q_GENE","THR","TRP"),
        sLabelID="Sample", gLabelID="Name", centers=2)

## The same as above, but for the normalized data (gastro.norm)
kmeansM(gastro.norm, rmGenes=c("BLANK","DAP","LYS","PHE", "Q_GENE","THR","TRP"),
        sLabelID="Sample", gLabelID="Name", centers=2)

## Another example with 3 groups
kmeansM(gastro.norm, rmGenes=c("BLANK","DAP","LYS","PHE", "Q_GENE","THR","TRP"),
        sLabelID="Sample", gLabelID="Name", centers=3)

## If you want to use euclidean distance to group genes (or spots) with
## 4 groups
kmeansM(gastro.summ, rmGenes=c("BLANK","DAP","LYS","PHE", "Q_GENE","THR","TRP"),
        sLabelID="Sample", gLabelID="Name", centers=4, group="R", distance="euclidean")
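
The last example can stop with the warning "did not converge in 10 iterations". A possible remedy, sketched below, is to raise the iteration limit. This assumes that iter.max, an argument of Kmeans in amap, is forwarded through '...' in the same way as centers; the value 50 is arbitrary, and hasPkg and res are hypothetical names.

```r
## Sketch only: assumes iter.max reaches amap's Kmeans through '...'.
hasPkg <- requireNamespace("maigesPack", quietly = TRUE)

if (hasPkg) {
  library(maigesPack)
  data(gastro)

  ## Same call as the last example, with a higher iteration limit;
  ## the result is returned invisibly, so assign it to keep it
  res <- kmeansM(gastro.summ,
                 rmGenes=c("BLANK","DAP","LYS","PHE", "Q_GENE","THR","TRP"),
                 sLabelID="Sample", gLabelID="Name", centers=4,
                 group="R", distance="euclidean", iter.max=50)

  ## Inspect the top-level structure of the returned list
  str(res, max.level = 1)
}
```

If the warning persists, a larger iter.max or a different number of centers may be worth trying.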

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(maigesPack)
Loading required package: convert
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: limma

Attaching package: 'limma'

The following object is masked from 'package:BiocGenerics':

    plotMA

Loading required package: marray
Loading required package: graph
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/maigesPack/kmeansM.Rd_%03d_medium.png", width=480, height=480)
> ### Name: kmeansM
> ### Title: Function to do k-means cluster analysis
> ### Aliases: kmeansM
> ### Keywords: hplot
> 
> ### ** Examples
> 
> ## Loading the dataset
> data(gastro)
> 
> ## Doing a K-means cluster with 2 groups using all genes, for maigesRaw class
> kmeansM(gastro.raw, rmGenes=c("BLANK","DAP","LYS","PHE", "Q_GENE","THR","TRP"),
+         sLabelID="Sample", gLabelID="Name", centers=2)
Warning message:
In as.matrix(log2(tmp1$R)) : NaNs produced
> 
> ## The same as above, but for maigesNorm class
> kmeansM(gastro.norm, rmGenes=c("BLANK","DAP","LYS","PHE", "Q_GENE","THR","TRP"),
+         sLabelID="Sample", gLabelID="Name", centers=2)
> 
> ## Another example with 3 groups
> kmeansM(gastro.norm, rmGenes=c("BLANK","DAP","LYS","PHE", "Q_GENE","THR","TRP"),
+         sLabelID="Sample", gLabelID="Name", centers=3)
> 
> ## If you want to use euclidean distance to group genes (or spots) with
> ## 4 groups
> kmeansM(gastro.summ, rmGenes=c("BLANK","DAP","LYS","PHE", "Q_GENE","THR","TRP"),
+         sLabelID="Sample", gLabelID="Name", centers=4, group="R", distance="euclidean")
Warning message:
did not converge in 10 iterations 
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>