Last data update: 2014.03.03

Data Source

R Release (3.2.3)

Data Type

Data set


Results 1 - 3 of 3 found.

lymphoma (Package: KODAMA) : Lymphoma Gene Expression Dataset

This dataset consists of gene expression profiles of the three most prevalent adult lymphoid malignancies: diffuse large B-cell lymphoma (DLBCL), follicular lymphoma (FL), and B-cell chronic lymphocytic leukemia (B-CLL). It contains 4,682 genes in 62 mRNA samples: 42 samples of DLBCL, 9 samples of FL, and 11 samples of B-CLL. Missing values were imputed and the data were standardized as described in Dudoit et al. (2002).
● Data Source: CranContrib
● Keywords: datasets
● Alias: lymphoma

USA (Package: KODAMA) : State of the Union Data Set

This dataset consists of the spoken, not written, addresses from 1900 until the sixth address by Barack Obama in 2014. Punctuation characters, numbers, words shorter than three characters, and stop-words (e.g., "that", "and", and "which") were removed from the dataset. This resulted in a dataset of 86 speeches, each described by the same 834 distinct meaningful words. Term frequency-inverse document frequency (TF-IDF) was used to compute the feature vectors. TF-IDF is often used as a weighting factor in information retrieval and text mining. Its value increases proportionally to the number of times a word appears in a document, but is offset by the frequency of the word across the corpus, which controls for the fact that some words are generally more common than others.
● Data Source: CranContrib
● Keywords: datasets
● Alias: USA
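
The TF-IDF weighting described for the USA dataset can be illustrated with a minimal Python sketch. It assumes one common variant (term count divided by document length, multiplied by the natural log of the number of documents over the number of documents containing the word); the entry does not state which variant was used to build the dataset, so treat the formula, the function name tf_idf, and the toy corpus as illustrative assumptions rather than the package's actual preprocessing.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {word: weight} dict per tokenized document.

    Assumed variant: tf = count / document length,
    idf = ln(number of documents / documents containing the word).
    """
    n_docs = len(documents)

    # Document frequency: in how many documents each word occurs.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return weights


# Toy corpus (hypothetical, not taken from the USA dataset).
speeches = [
    ["economy", "jobs", "economy", "nation"],
    ["nation", "peace", "security"],
    ["nation", "jobs", "education"],
]
weights = tf_idf(speeches)
```

In this toy corpus "nation" appears in every document, so its weight is zero everywhere, while "economy", which is frequent in the first document and absent from the others, receives the largest weight there.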

MetRef (Package: KODAMA) : Nuclear Magnetic Resonance Spectra of Urines

Nuclear magnetic resonance spectra of urine samples. The data come from a cohort of 22 healthy donors (11 male and 11 female), each of whom provided about 40 urine samples over approximately 2 months, for a total of 873 samples.
● Data Source: CranContrib
● Keywords: datasets
● Alias: MetRef