This package contains tools for cleaning (quality control and quality
assurance) and analysis of GWAS data.
Details
GWASTools provides a set of classes for storing data and annotation
from genome-wide association studies (GWAS), and a set of functions for
data cleaning and analysis that operate on those classes.
Genotype and intensity data are stored in external files (GDS or NetCDF), so it is
possible to analyze data sets that are too large to be contained in
memory. The GenotypeReader class and IntensityReader
class unions provide a common interface for GDS and NetCDF files.
Two sets of classes for annotation are provided.
SnpAnnotationDataFrame and
ScanAnnotationDataFrame extend
AnnotatedDataFrame and provide in-memory containers for SNP and
scan annotation and metadata.
SnpAnnotationSQLite and
ScanAnnotationSQLite provide interfaces to SNP and scan
annotation and metadata stored in SQLite databases.
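As a minimal sketch of the in-memory annotation containers, the example
below builds a SnpAnnotationDataFrame and a ScanAnnotationDataFrame from
ordinary data frames. The values are invented for illustration only; the
sketch assumes the usual required columns (integer snpID, chromosome and
position for SNPs, and scanID for scans). The SQLite-backed classes are
not shown.

```r
library(GWASTools)

# Hypothetical toy annotation: 6 SNPs on chromosomes 1 and 2, and 4 scans.
snp_df <- data.frame(
  snpID      = 1:6,                      # unique integer SNP IDs
  chromosome = rep(1:2, each = 3),       # integer chromosome codes
  position   = c(1001L, 2002L, 3003L, 1500L, 2500L, 3500L)
)
scan_df <- data.frame(
  scanID = 101:104,                      # unique integer scan IDs
  sex    = c("F", "M", "F", "M"),
  stringsAsFactors = FALSE
)

# Wrap the data frames in the annotation classes.
snpAnnot  <- SnpAnnotationDataFrame(snp_df)
scanAnnot <- ScanAnnotationDataFrame(scan_df)

# Accessors return the corresponding annotation columns.
getSnpID(snpAnnot)
getChromosome(snpAnnot)
getPosition(snpAnnot)
getScanID(scanAnnot)
getSex(scanAnnot)
```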
The GenotypeData and IntensityData classes
combine genotype or intensity data with SNP and scan annotation,
ensuring, through unique SNP and scan IDs, that the data in the GDS or
NetCDF files are consistent with the annotation. Most functions in the
GWASTools package take GenotypeData and/or IntensityData objects as
arguments.
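The following sketch shows how a GenotypeData object ties a genotype
reader to its annotation. To keep the example self-contained it uses a
small simulated matrix wrapped in a MatrixGenotypeReader rather than a
GDS or NetCDF file; with real data a file-backed reader such as
GdsGenotypeReader would take its place. All values and object names
below are illustrative assumptions.

```r
library(GWASTools)

set.seed(1)  # reproducible toy genotypes

snpID  <- 1:10
chrom  <- rep(1:2, each = 5)
pos    <- 1001:1010
scanID <- 1:4

# Simulated genotypes (0, 1, 2 or NA); rows are SNPs, columns are scans.
geno <- matrix(sample(c(0, 1, 2, NA), 40, replace = TRUE),
               nrow = 10, ncol = 4)

# In-memory stand-in for a file-backed genotype reader.
reader <- MatrixGenotypeReader(genotype = geno, snpID = snpID,
                               chromosome = chrom, position = pos,
                               scanID = scanID)

# Annotation whose IDs match those held by the reader.
snpAnnot  <- SnpAnnotationDataFrame(
  data.frame(snpID = snpID, chromosome = chrom, position = pos))
scanAnnot <- ScanAnnotationDataFrame(data.frame(scanID = scanID))

# Construction fails if reader and annotation IDs do not agree.
genoData <- GenotypeData(reader, snpAnnot = snpAnnot, scanAnnot = scanAnnot)

nsnp(genoData)
nscan(genoData)

# Read a block of genotypes: 5 SNPs starting at SNP 1, 3 scans starting at scan 1.
block <- getGenotype(genoData, snp = c(1, 5), scan = c(1, 3))
dim(block)
```

Because the annotation is checked against the reader when the object is
created, downstream functions that take a GenotypeData argument can rely
on the SNP and scan IDs being aligned.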
Author(s)
Stephanie M. Gogarten, Cathy Laurie, Tushar Bhangale, Matthew
P. Conomos, Cecelia Laurie, Caitlin McHugh, Ian Painter, Xiuwen Zheng,
Jess Shen, Rohit Swarnkar, Adrienne Stilp
References
Laurie, C. C., Doheny, K. F., Mirel, D. B., Pugh, E. W., Bierut, L. J.,
Bhangale, T., Boehm, F., Caporaso, N. E., Cornelis, M. C., Edenberg,
H. J., Gabriel, S. B., Harris, E. L., Hu, F. B., Jacobs, K. B., Kraft,
P., Landi, M. T., Lumley, T., Manolio, T. A., McHugh, C., Painter, I.,
Paschall, J., Rice, J. P., Rice, K. M., Zheng, X., and Weir, B. S., for
the GENEVA Investigators (2010), Quality control and quality assurance
in genotypic data for genome-wide association studies. Genetic
Epidemiology, 34: 591-602. doi: 10.1002/gepi.20516