Last data update: 2014.03.03

R: Clinical Data Table & Variable Definitions
clinicalData R Documentation

Clinical Data Table & Variable Definitions

Description

Clinical data for all samples across all studies, and the corresponding variable definitions. Rownames are the GEO_GSMID feature, which corresponds to the sample names in the expression object for a given study. Treatment information is included.
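As a minimal sketch of how the GEO_GSMID link might be used, the snippet below aligns clinical rows to the sample names of an expression matrix. It runs on made-up toy data, `alignClinical` is a hypothetical helper that is not part of the package, and it assumes the sample IDs are also available in a GEO_GSMID column rather than relying on row names alone.

```r
# Toy stand-in for clinicalData$clinicalTable (all values are made up).
# Assumption: sample IDs are stored in a GEO_GSMID column.
clinicalTable <- data.frame(
  GEO_GSMID = c("GSM0001", "GSM0002", "GSM0003", "GSM0004"),
  study_ID = c("study1", "study1", "study2", "study2"),
  stringsAsFactors = FALSE
)

# Toy expression matrix: genes in rows, samples in columns.
exprMat <- matrix(
  1:6, nrow = 2,
  dimnames = list(c("geneA", "geneB"), c("GSM0003", "GSM0001", "GSM9999"))
)

# Hypothetical helper: return clinical rows in the same order as the
# expression samples; samples without a clinical record get a row of NAs.
alignClinical <- function(clinicalTable, sampleNames) {
  idx <- match(sampleNames, as.character(clinicalTable$GEO_GSMID))
  aligned <- clinicalTable[idx, , drop = FALSE]
  rownames(aligned) <- sampleNames
  aligned
}

aligned <- alignClinical(clinicalTable, colnames(exprMat))
aligned$study_ID
```

Matching on an explicit ID vector keeps the clinical rows and expression columns in the same order even when the two objects were sorted differently. The helper assumes the sample names are unique.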

Usage

data("clinicalData")

Format

A list with the following two items:
- clinicalTable: A data frame. Rownames are the GEO_GSMID feature, which corresponds to the sample names in the expression object for a certain study.
- clinicalVarDef: Character string descriptions of each variable.
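The variable definitions can be queried by name. The sketch below assumes clinicalVarDef is a table with the two columns shown in the Results section (variableName and definition.NA.means.not.recorded). It uses a two-row stand-in copied from that output, and `lookupVarDef` is a hypothetical helper, not a package function.

```r
# Stand-in for clinicalData$clinicalVarDef, copied from the two rows
# printed in the Results section; the real object has one row per variable.
varDef <- data.frame(
  variableName = c("dbUniquePatientID", "study_ID"),
  definition.NA.means.not.recorded = c(
    "Unique patient id created for this database.",
    "GEO GSE study ID."
  ),
  row.names = c("dbUniquePatientID", "study_ID"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the definition string for each requested
# variable, or NA when the name is not listed.
lookupVarDef <- function(varDef, vars) {
  idx <- match(vars, as.character(varDef$variableName))
  defs <- as.character(varDef$definition.NA.means.not.recorded)[idx]
  names(defs) <- vars
  defs
}

lookupVarDef(varDef, c("study_ID", "not_a_variable"))
```

With the package loaded, the same call would presumably take clinicalData$clinicalVarDef as its first argument.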

Details

The GEO study ID can be found from the study_ID variable. If site_ID is NA, it pertains to the batch ID, which may be due to different platforms being used in the same study or different tissue site collections. Columns 112-151 pertain to treatment information. radiotherapyClass, chemotherapyClass, and hormone_therapyClass are indicator variables that signal whether a patient had radiotherapy, chemotherapy, and/or some form of hormone therapy (usually an estrogen or aromatase inhibitor).

More granular information is provided when available: for example, whether the chemotherapy drug was capecitabine is coded in the indicator variable "capecitabine". A value of 1 = yes, 0 = no, and NA = not recorded or could not be inferred from publicly available information. "Other" means that, judging from the study's PubMed publication, the patient most likely had other treatments that were not recorded (often radiotherapy, which is not always recorded and is left to a clinician's discretion in a clinical trial).
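Because the indicators use three states (1, 0, NA), a plain `sum(x == 1)` returns NA as soon as one value is missing. The sketch below shows one way to count each state separately. The data frame is a made-up stand-in for two treatment columns, and `summarizeIndicator` is a hypothetical helper.

```r
# Toy stand-in for two columns of clinicalData$clinicalTable
# (values are made up for illustration only).
tbl <- data.frame(
  chemotherapyClass = c(1, 1, 0, NA, 1),
  capecitabine      = c(0, 1, 0, NA, NA)
)

# Hypothetical helper: count yes / no / not-recorded for one indicator.
summarizeIndicator <- function(x) {
  c(yes = sum(x == 1, na.rm = TRUE),
    no = sum(x == 0, na.rm = TRUE),
    notRecorded = sum(is.na(x)))
}

# One column of counts per indicator variable.
indicatorCounts <- sapply(tbl, summarizeIndicator)
indicatorCounts
```

Keeping the NA count visible makes it harder to misread "not recorded" as "not treated".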

Survival information, such as DFS, RFS, and OS, and treatment response information, such as pCR and RCB, are also recorded when available.
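Since these fields are only filled in when a study reported them, it can help to check how many samples have each one before planning an analysis. The sketch below counts non-missing values per variable on made-up toy data. `countRecorded` is a hypothetical helper, the three OS column names come from the Examples section, and "pCR" as a column name is an assumption (names not found in the table are skipped).

```r
# Toy stand-in for the survival columns of clinicalData$clinicalTable
# (values are made up for illustration only).
tbl <- data.frame(
  OS = c(1, 0, NA, 1),
  OS_months_or_MIN_months_of_OS = c(12, 60, NA, NA),
  OS_up_until_death = c(1, NA, NA, NA)
)

# Hypothetical helper: number of non-NA entries for each requested
# variable; requested names that are not columns of tbl are dropped.
countRecorded <- function(tbl, vars) {
  present <- intersect(vars, colnames(tbl))
  colSums(!is.na(tbl[, present, drop = FALSE]))
}

recorded <- countRecorded(
  tbl,
  c("OS", "OS_months_or_MIN_months_of_OS", "OS_up_until_death", "pCR")
)
recorded
```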

Value

No return value as this is not a function but rather a data object.

References

Planey, Butte. Database integration of 4923 publicly-available samples of breast cancer molecular and clinical data. AMIA Joint Summits on Translational Science Proceedings (2013). PMC3814460

Examples

data(clinicalData)
#check out some of the variable name/definitions
clinicalData$clinicalVarDef[c(1:2),]
#Check out the treatment information. 
#look at first three patients
head(clinicalData$clinicalTable)[c(1:3),c(112:ncol(clinicalData$clinicalTable))]
#how many had chemotherapy?
numChemoPatients <- length(which(
clinicalData$clinicalTable$chemotherapyClass==1))
#how many patients have non-NA OS binary data?
length(which(!is.na(clinicalData$clinicalTable$OS)))
#how many have OS data in the more granular form of months until OS? 
#this variable includes studies that had a ceiling for tracking OS
length(which(!is.na(clinicalData$clinicalTable$OS_months_or_MIN_months_of_OS)))
#how many patients have OS information that is definitively 
#followed up until their death
#(details on how studies collect OS data can be surprising!)
length(which(!is.na(clinicalData$clinicalTable$OS_up_until_death)))

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(curatedBreastData)
Loading required package: ggplot2
Loading required package: impute
Loading required package: XML
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: BiocStyle
> ### Name: clinicalData
> ### Title: Clinical Data Table & Variable Definitions
> ### Aliases: clinicalData
> ### Keywords: datasets
> 
> ### ** Examples
> 
> data(clinicalData)
> #check out some of the variable name/definitions
> clinicalData$clinicalVarDef[c(1:2),]
                       variableName
dbUniquePatientID dbUniquePatientID
study_ID                   study_ID
                              definition.NA.means.not.recorded
dbUniquePatientID Unique patient id created for this database.
study_ID                                     GEO GSE study ID.
> #Check out the treatment information. 
> #look at first three patients
> head(clinicalData$clinicalTable)[c(1:3),c(112:ncol(clinicalData$clinicalTable))]
  aromatase_inhibitor estrogen_receptor_blocker
2                   0                         0
3                   0                         0
4                   0                         0
  estrogen_receptor_blocker_and_stops_production
2                                              0
3                                              0
4                                              0
  estrogen_receptor_blocker_and_eliminator anti_HER2 tamoxifen doxorubicin
2                                        0         0         0           0
3                                        0         0         0           0
4                                        0         0         0           0
  epirubicin docetaxel capecitabine fluorouracil paclitaxel cyclophosphamide
2          1         0            0            1          1                1
3          1         0            0            1          1                1
4          1         0            0            1          1                1
  anastrozole fulvestrant gefitinib trastuzumab letrozole chemotherapy
2           0           0         0           0         0            0
3           0           0         0           0         0            0
4           0           0         0           0         0            0
  hormone_therapy no_treatment methotrexate cetuximab carboplatin other
2               0            0            0         0           0     0
3               0            0            0         0           0     0
4               0            0            0         0           0     0
  taxaneGeneral neoadjuvant_or_adjuvant study_specific_protocol_number
2             0                     neo                              1
3             0                     neo                              1
4             0                     neo                              1
> #how many had chemotherapy?
> numChemoPatients <- length(which(
+ clinicalData$clinicalTable$chemotherapyClass==1))
> #how many patients have non-NA OS binary data?
> length(which(!is.na(clinicalData$clinicalTable$OS)))
[1] 409
> #how many have OS data in the more granular form of months until OS? 
> #this variable includes studies that had a ceiling for tracking OS
> length(which(!is.na(clinicalData$clinicalTable$OS_months_or_MIN_months_of_OS)))
[1] 406
> #how many patients have OS information that is definitively 
> #followed up until their death
> #(details on how studies collect OS data can be surprising!)
> length(which(!is.na(clinicalData$clinicalTable$OS_up_until_death)))
[1] 211