Last data update: 2014.03.03

R: Simulator for gene expression data
generateData    R Documentation

Simulator for gene expression data

Description

A simulator for gene expression data. The simulated values are normally distributed with zero mean, and their covariances are given by a configurable block-diagonal matrix. By default, half of the samples contain differential gene expression values (see parameter diffsamples).
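
The idea of zero-mean normal values with a block-diagonal covariance can be illustrated with a minimal base-R sketch. It is not the package's implementation: the helper names make_block_cov and sample_expression are hypothetical, and unit variances on the diagonal are an assumption.

```r
# Hypothetical helper: block-diagonal covariance matrix.
# 'within' plays the role of cov1, 'between' the role of cov2.
make_block_cov <- function(genes, blocksize, within = 0.2, between = 0) {
  block_id <- (seq_len(genes) - 1) %/% blocksize   # block index of each gene
  same_block <- outer(block_id, block_id, "==")    # TRUE where genes share a block
  sigma <- ifelse(same_block, within, between)
  diag(sigma) <- 1                                 # assumption: unit variances
  sigma
}

# Draw zero-mean multivariate normal rows using the Cholesky factor of sigma.
sample_expression <- function(samples, sigma) {
  z <- matrix(rnorm(samples * ncol(sigma)), nrow = samples)
  z %*% chol(sigma)   # chol() returns R with t(R) %*% R == sigma
}

set.seed(1)
sigma <- make_block_cov(genes = 20, blocksize = 5)
x <- sample_expression(samples = 10, sigma = sigma)
dim(x)   # 10 20
```

Multiplying standard normal draws by the Cholesky factor is one common way to impose a covariance structure; whether the package uses this or another method is not stated here.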

Usage

generateData(samples=50, genes=10000, diffgenes=200, blocksize=50, cov1=0.2, cov2=0, diff=0.6, diffsamples)

Arguments

samples

number of samples

genes

number of gene expression values per sample

diffgenes

number of differentially expressed genes in class 1

blocksize

size of each block in the block-diagonal correlation matrix

cov1

covariance within the blocks in the correlation matrix

cov2

covariance between the blocks in the correlation matrix

diff

difference between the random gene expression values and the differential gene expression values

diffsamples

number of samples containing differential gene expression values compared to the rest (if missing, this parameter is set to half of the total number of samples)
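
The "if missing" default for diffsamples can be sketched with base R's missing(). The wrapper resolve_diffsamples is hypothetical, and the use of integer division for odd sample counts is an assumption rather than documented behaviour.

```r
# Hypothetical wrapper illustrating a default that depends on another argument.
resolve_diffsamples <- function(samples = 50, diffsamples) {
  if (missing(diffsamples)) {
    # Assumption: integer division when 'samples' is odd.
    diffsamples <- samples %/% 2
  }
  diffsamples
}

resolve_diffsamples(samples = 20)                    # 10
resolve_diffsamples(samples = 20, diffsamples = 5)   # 5
```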

Details

The simulator generates two labeled classes:
label 1: samples with differentially expressed genes.
label -1: samples without differentially expressed genes.
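
A small sketch of how the two classes could be constructed: class 1 samples receive a shift of diff on the differential genes, and class -1 samples are left unchanged. Which rows and columns are affected is an assumption made for illustration (here the first diffsamples rows and the first diffgenes columns), and the background is drawn independently for brevity.

```r
set.seed(42)
samples <- 6; genes <- 8; diffgenes <- 3; diffsamples <- 3; diff <- 0.6

# Independent N(0, 1) background for simplicity (no block covariance here).
x <- matrix(rnorm(samples * genes), nrow = samples)
background <- x   # keep a copy to compare against

# Assumption: the first 'diffsamples' rows form class 1 and the first
# 'diffgenes' columns are the differential genes.
labels <- c(rep(1, diffsamples), rep(-1, samples - diffsamples))
cols <- seq_len(diffgenes)
x[labels == 1, cols] <- x[labels == 1, cols] + diff

table(labels)
```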

Value

'generateData' returns a list containing:

data

a (samples x genes) matrix with the simulated gene expression values

labels

a vector with labels (1,-1) for the two classes

Author(s)

Christoph Bartenhagen

Examples

## generate a dataset with 20 samples and 1000 gene expression values
d = generateData(samples=20, genes=1000, diffgenes=100, blocksize=10)
data = d[[1]]
labels = d[[2]]
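
A quick way to inspect such a result is to check the matrix dimensions and the class counts. The helper summarise_simulation below is hypothetical, and the stand-in list only mimics the documented shape of the return value; the real call runs only if RDRToolbox is installed.

```r
# Hypothetical helper: summarise a list shaped like generateData's return value.
summarise_simulation <- function(d) {
  list(dim = dim(d[[1]]), classes = table(d[[2]]))
}

# Stand-in object with the documented shape (not produced by the package).
standin <- list(data = matrix(0, nrow = 20, ncol = 1000),
                labels = rep(c(1, -1), each = 10))
s <- summarise_simulation(standin)
s$dim   # 20 1000

# With RDRToolbox installed, the same helper applies to real output.
if (requireNamespace("RDRToolbox", quietly = TRUE)) {
  d <- RDRToolbox::generateData(samples = 20, genes = 1000,
                                diffgenes = 100, blocksize = 10)
  print(summarise_simulation(d))
}
```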

Results

> library(RDRToolbox)
> ### Name: generateData
> ### Title: Simulator for gene expression data
> ### Aliases: generateData
> 
> ### ** Examples
> 
> ## generate a dataset with 20 samples and 1000 gene expression values
> d = generateData(samples=20, genes=1000, diffgenes=100, blocksize=10)
> data = d[[1]]
> labels = d[[2]]