R Graphical Manual

Last data update: 2014.03.03

R: Function to do k-means cluster analysis

kmeansM

R Documentation

Function to do k-means cluster analysis

Description

Performs k-means cluster analysis on objects of classes maiges, maigesRaw and maigesANOVA. For objects of class maigesDEcluster, use the function kmeansMde instead.

Usage

kmeansM(data, group=c("C", "R")[1], distance="correlation",
        method="complete", sampleT=NULL, doHier=FALSE, sLabelID="SAMPLE",
        gLabelID="GeneName", rmGenes=NULL, rmSamples=NULL, rmBad=TRUE,
        geneGrp=NULL, path=NULL, ...)

Arguments

`data`	object of class `maigesRaw`, `maiges` or `maigesANOVA`.
`group`	character string giving the type of grouping: by rows 'R' or columns 'C' (default).
`distance`	character string giving the type of distance to use. Distances are computed with the function `Dist` (package amap); the possible values are 'euclidean', 'maximum', 'manhattan', 'canberra', 'binary', 'pearson', 'correlation' (default) and 'spearman'.
`method`	character string specifying the linkage method for the hierarchical clustering. Possible values are 'ward', 'single', 'complete' (default), 'average', 'mcquitty', 'median' or 'centroid'.
`sampleT`	list with 2 vectors. The first specifies the first letter of each sample type to be coloured; the second gives the corresponding colours. If NULL (default), no colour is used.
`doHier`	logical indicating whether to also do the hierarchical clustering on the dimension opposite to the one being clustered. Defaults to FALSE.
`sLabelID`	character string specifying the sample label ID to be used to label the samples.
`gLabelID`	character string specifying the gene label ID to be used to label the genes.
`rmGenes`	character vector specifying genes to be removed.
`rmSamples`	character vector specifying samples to be removed.
`rmBad`	logical indicating whether to remove bad spots (slot `BadSpots` in objects of class `maiges`, `maigesRaw` or `maigesANOVA`).
`geneGrp`	numeric or character value specifying the gene group to be clustered. This is given by the columns of the slot `GeneGrps` in objects of classes `maiges`, `maigesRaw` and `maigesANOVA`.
`path`	numeric or character value specifying the gene network to be clustered. This is given by the items of the slot `Paths` in objects of classes `maiges`, `maigesRaw` and `maigesANOVA`.
`...`	additional parameters for the `Kmeans` function.
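
To make the `sampleT` format concrete, the sketch below builds such a list and shows one plausible way the first letter of each sample label could be matched to a colour. The sample labels and the matching logic are assumptions made for illustration; they are not taken from the maigesPack source.

```r
# sampleT: first vector = first letters of the sample types,
#          second vector = the colour assigned to each type
sampleT <- list(c("N", "T"), c("blue", "red"))

# Hypothetical sample labels (not from the gastro dataset)
labels <- c("N01", "T02", "T03", "N04")

# Assumed mapping: look up the first letter of each label in sampleT[[1]]
# and take the colour at the same position in sampleT[[2]]
cols <- sampleT[[2]][match(substr(labels, 1, 1), sampleT[[1]])]
cols
```

A list of this shape would then be passed as `sampleT=sampleT` in the call to `kmeansM`.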

Details

This function implements the k-means clustering method for the microarray data objects defined in this package. It uses the function Kmeans from package amap.
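
As a rough illustration of clustering genes by correlation rather than by raw Euclidean distance, the following sketch uses only base R on a toy matrix. It is a stand-in, not the amap `Kmeans` routine that `kmeansM` calls: each row is centred and scaled to unit length, after which the squared Euclidean distance between two rows equals 2 * (1 - r), where r is their Pearson correlation, so ordinary `kmeans` groups rows with similar profiles. The data and the helper name `row_std` are hypothetical.

```r
set.seed(1)

# Toy expression matrix: 6 genes (rows) x 8 samples (columns)
profile1 <- sin(seq(0, 2 * pi, length.out = 8))
profile2 <- cos(seq(0, 2 * pi, length.out = 8))
expr <- rbind(
  t(replicate(3, 2.0 * profile1 + rnorm(8, sd = 0.1))),  # genes 1-3
  t(replicate(3, 0.5 * profile2 + rnorm(8, sd = 0.1)))   # genes 4-6
)
rownames(expr) <- paste0("gene", 1:6)

# Centre each row and scale it to unit length (hypothetical helper)
row_std <- function(m) {
  centered <- m - rowMeans(m)
  centered / sqrt(rowSums(centered^2))
}
z <- row_std(expr)

# On standardized rows, squared Euclidean distance is 2 * (1 - correlation)
d2 <- as.matrix(dist(z))^2
r  <- cor(t(expr))
all.equal(d2, 2 * (1 - r), check.attributes = FALSE)

# Base-R k-means on the standardized rows, grouping genes into 2 clusters
fit <- kmeans(z, centers = 2, nstart = 10)
fit$cluster
```

Genes 1 to 3 differ from genes 4 to 6 in amplitude as well as in shape, but the standardization removes the amplitude, so the grouping is driven by profile shape alone.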

Value

This function displays the heatmaps and invisibly returns the list produced by the function Kmeans.
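
Because the result is returned invisibly, nothing is printed unless the call is assigned to a variable. The sketch below shows this behaviour with base R `kmeans` and a hypothetical wrapper, `cluster_quietly`; it is only an analogy for how the return value of `kmeansM` could be captured, and the components shown are those of base `kmeans`, which may differ from what amap `Kmeans` returns.

```r
set.seed(42)

# Toy data: two well-separated groups of 10 points each
x <- rbind(matrix(rnorm(20, mean = 0), ncol = 2),
           matrix(rnorm(20, mean = 5), ncol = 2))

# Hypothetical wrapper that, like kmeansM, returns its result invisibly
cluster_quietly <- function(x, centers) {
  fit <- kmeans(x, centers = centers, nstart = 5)
  invisible(fit)
}

cluster_quietly(x, 2)          # prints nothing at the console
res <- cluster_quietly(x, 2)   # assignment captures the returned list

names(res)           # components of the fitted object
table(res$cluster)   # cluster sizes
```

With `kmeansM`, the analogous pattern would be `res <- kmeansM(...)`, after which the cluster assignments can be inspected from `res`.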

Author(s)

Gustavo H. Esteves <gesteves@vision.ime.usp.br>

Examples

## Loading the dataset
data(gastro)

## Doing a K-means cluster with 2 groups using all genes, for maigesRaw class
kmeansM(gastro.raw, rmGenes=c("BLANK","DAP","LYS","PHE", "Q_GENE","THR","TRP"),
        sLabelID="Sample", gLabelID="Name", centers=2)

## The same as above, but for the normalized data (gastro.norm)
kmeansM(gastro.norm, rmGenes=c("BLANK","DAP","LYS","PHE", "Q_GENE","THR","TRP"),
        sLabelID="Sample", gLabelID="Name", centers=2)

## Another example with 3 groups
kmeansM(gastro.norm, rmGenes=c("BLANK","DAP","LYS","PHE", "Q_GENE","THR","TRP"),
        sLabelID="Sample", gLabelID="Name", centers=3)

## If you want to use euclidean distance to group genes (or spots) with
## 4 groups
kmeansM(gastro.summ, rmGenes=c("BLANK","DAP","LYS","PHE", "Q_GENE","THR","TRP"),
        sLabelID="Sample", gLabelID="Name", centers=4, group="R", distance="euclidean")

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(maigesPack)
Loading required package: convert
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: limma

Attaching package: 'limma'

The following object is masked from 'package:BiocGenerics':

    plotMA

Loading required package: marray
Loading required package: graph
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/maigesPack/kmeansM.Rd_%03d_medium.png", width=480, height=480)
> ### Name: kmeansM
> ### Title: Function to do k-means cluster analysis
> ### Aliases: kmeansM
> ### Keywords: hplot
> 
> ### ** Examples
> 
> ## Loading the dataset
> data(gastro)
> 
> ## Doing a K-means cluster with 2 groups using all genes, for maigesRaw class
> kmeansM(gastro.raw, rmGenes=c("BLANK","DAP","LYS","PHE", "Q_GENE","THR","TRP"),
+         sLabelID="Sample", gLabelID="Name", centers=2)
Warning message:
In as.matrix(log2(tmp1$R)) : NaNs produced
> 
> ## The same as above, but for maigesNorm class
> kmeansM(gastro.norm, rmGenes=c("BLANK","DAP","LYS","PHE", "Q_GENE","THR","TRP"),
+         sLabelID="Sample", gLabelID="Name", centers=2)
> 
> ## Another example with 3 groups
> kmeansM(gastro.norm, rmGenes=c("BLANK","DAP","LYS","PHE", "Q_GENE","THR","TRP"),
+         sLabelID="Sample", gLabelID="Name", centers=3)
> 
> ## If you want to use euclidean distance to group genes (or spots) with
> ## 4 groups
> kmeansM(gastro.summ, rmGenes=c("BLANK","DAP","LYS","PHE", "Q_GENE","THR","TRP"),
+         sLabelID="Sample", gLabelID="Name", centers=4, group="R", distance="euclidean")
Warning message:
did not converge in 10 iterations 
> 
> dev.off()
null device 
          1 
>
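
The two warnings in the transcript above can be reproduced in isolation with base R. The sketch below first shows that `log2` of a negative value gives NaN (the likely source of the "NaNs produced" message for the raw data), then shows how an iteration limit is raised with the `iter.max` argument of base `kmeans`. The amap `Kmeans` function also has an `iter.max` argument, so passing it through `...` is a plausible way to address the "did not converge in 10 iterations" warning, but that call is left commented out here because it has not been verified against maigesPack. The data values are hypothetical.

```r
# log2() of a negative value is undefined and yields NaN (with a warning);
# log2(0) yields -Inf
intensities <- c(-5, 0, 8)                       # hypothetical intensities
logged <- suppressWarnings(log2(intensities))
logged

# Raising the iteration limit in base kmeans()
set.seed(7)
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 4), ncol = 2))
fit <- kmeans(x, centers = 2, iter.max = 50)
fit$iter                                         # iterations actually used

# Not run (assumption: iter.max is forwarded to Kmeans through `...`):
# kmeansM(gastro.summ, rmGenes=c("BLANK","DAP","LYS","PHE", "Q_GENE","THR","TRP"),
#         sLabelID="Sample", gLabelID="Name", centers=4, group="R",
#         distance="euclidean", iter.max=50)
```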