khan
R Documentation
Microarray gene expression dataset from Khan et al., 2001. Subset of 306 genes.
Description
Khan contains gene expression profiles of four types of small round
blue cell tumours of childhood (SRBCT) published by Khan et
al. (2001). It also contains further gene annotation retrieved from SOURCE at http://source.stanford.edu/.
Usage
data(khan)
Format
khan is a list containing the following components:
$train: data.frame of 306 rows and 64 columns.
The training dataset of 64 arrays and 306 gene expression values.
$test: data.frame of 306 rows and 25 columns.
The test dataset of 25 arrays and 306 gene expression values.
$gene.labels.imagesID: vector of 306 Image clone identifiers
corresponding to the rownames of $train and $test.
$train.classes: factor with 4 levels "EWS",
"BL-NHL", "NB" and "RMS", which correspond to the four groups in
the $train dataset.
$test.classes: factor with 5 levels "EWS",
"BL-NHL", "NB", "RMS" and "Norm", which correspond to the five
groups in the $test dataset.
$annotation: data.frame of 306 rows and 8 columns.
This table contains further gene annotation retrieved from SOURCE
http://SOURCE.stanford.edu in May 2004. For each of the 306 genes,
it contains:
$CloneID: Image Clone ID
$UGCluster: The Unigene cluster to which the gene is assigned
$Symbol: The HUGO gene symbol
$LLID: The locus ID
$UGRepAcc: Nucleotide sequence accession number
$LLRepProtAcc: Protein sequence accession number
$Chromosome: Chromosome location
$Cytoband: Cytoband location
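The components listed above can be inspected directly once the data are loaded. The following is a minimal sketch, assuming the made4 package is installed. The variable names train_dim, test_dim and ids_match are illustrative only and are not part of the package.

```r
# Assumes the made4 package (Bioconductor) is installed.
library(made4)
data(khan)

# Expression tables: genes in rows, arrays in columns
train_dim <- dim(khan$train)
test_dim <- dim(khan$test)
print(train_dim)
print(test_dim)

# Class labels, one per array; tabulate the number of arrays per tumour type
print(table(khan$train.classes))
print(table(khan$test.classes))

# The Image clone identifiers are documented as corresponding to the
# row names of the expression tables; this checks that for $train
ids_match <- all(as.character(khan$gene.labels.imagesID) == rownames(khan$train))
print(ids_match)

# Names of the annotation columns described above
print(colnames(khan$annotation))
```

The number of columns in each expression table should equal the length of the corresponding class factor, which is a quick consistency check before running any supervised analysis on these data.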
Details
Khan et al. (2001) used cDNA microarrays containing 6567 clones, of which
3789 were known genes and 2778 were ESTs, to study the expression of
genes in four types of small round blue cell tumours of childhood (SRBCT).
These were neuroblastoma (NB), rhabdomyosarcoma (RMS), Burkitt lymphoma (BL), a
subset of non-Hodgkin lymphoma, and the Ewing family of tumours
(EWS). Gene expression profiles from both tumour biopsy and cell line
samples were obtained and are contained in this dataset. The data downloaded
from the website were the filtered dataset of 2308 gene expression profiles described
by Khan et al. (2001). This dataset is available from http://bioinf.ucd.ie/people/aedin/R/.
To reduce the size of the MADE4 package and to provide small example datasets, the top 50 genes from the
ends of 3 axes following between-group analysis (bga) were selected. This produced a reduced dataset of 306 genes.
Source
khan is derived from the filtered dataset of 2308 gene expression profiles
published and provided by Khan et al. (2001) on the supplementary
web site to their publication,
http://research.nhgri.nih.gov/microarray/Supplement/.
References
Culhane,A.C. et al. (2002) Between-group analysis of microarray
data. Bioinformatics, 18(12), 1600-1608.
Khan,J., Wei,J.S., Ringner,M., Saal,L.H., Ladanyi,M., Westermann,F.,
Berthold,F., Schwab,M., Antonescu,C.R., Peterson,C. et al. (2001) Classification and diagnostic
prediction of cancers using gene expression profiling and artificial neural networks.
Nat. Med., 7, 673-679.
Examples
data(khan)
summary(khan)
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(made4)
Loading required package: ade4
Loading required package: RColorBrewer
Loading required package: gplots
Attaching package: 'gplots'
The following object is masked from 'package:stats':
lowess
Loading required package: scatterplot3d
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/made4/khan.Rd_%03d_medium.png", width=480, height=480)
> ### Name: khan
> ### Title: Microarray gene expression dataset from Khan et al., 2001.
> ### Subset of 306 genes.
> ### Aliases: khan
> ### Keywords: datasets
>
> ### ** Examples
>
> data(khan)
> summary(khan)
                     Length Class      Mode
train                 64    data.frame list
test                  25    data.frame list
train.classes         64    factor     numeric
test.classes          25    factor     numeric
annotation             8    data.frame list
gene.labels.imagesID 306    -none-     character
cellType              64    -none-     character
>
> dev.off()
null device
1
>