Last data update: 2014.03.03

R: Human Protein Atlas in R
hpar    R Documentation

Human Protein Atlas in R

Description

This package provides a simple interface to the Human Protein Atlas (HPA). From the Human Protein Atlas Project page:

"The Swedish Human Protein Atlas project, funded by the Knut and Alice Wallenberg Foundation, has been set up to allow for a systematic exploration of the human proteome using Antibody-Based Proteomics. This is accomplished by combining high-throughput generation of affinity-purified antibodies with protein profiling in a multitude of tissues and cells assembled in tissue microarrays. Confocal microscopy analysis using human cell lines is performed for more detailed protein localization. The program hosts the Human Protein Atlas portal with expression profiles of human proteins in tissues and cells."

Usage

hpaRna
hpaNormalTissue
hpaSubcellularLoc

Details

Three data tables are distributed by the HPA project and available within the package as dataframes. The description below is adapted from the HPA site:

  1. Normal tissue data (hpaNormalTissue): Expression profiles for proteins in human tissues based on immunohistochemistry using tissue microarrays. The dataframe includes the Ensembl gene identifier ("Gene"), tissue name ("Tissue"), annotated cell type ("Cell.type"), expression value ("Level"), the type of annotation ("Expression.type": annotated protein expression (APE), based on more than one antibody, or staining, based on one antibody only), and the reliability or validation of the expression value ("Reliability").

  2. Subcellular location data (hpaSubcellularLoc): Subcellular localisation of proteins based on immunofluorescently stained cells. The dataframe includes the Ensembl gene identifier ("Gene"), main subcellular location of the protein ("Main.location"), other locations ("Other.location"), the type of annotation ("Expression.type": annotated protein expression (APE), based on more than one antibody, or staining, based on one antibody only), and the reliability or validation of the expression value ("Reliability"). Note that the Gene Ontology and UniProt both import part of these data into their respective databases. UniProt cites the source as source:HPA.

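  3. RNA data (hpaRna): RNA levels in three different cell lines, based on RNAseq. The dataframe includes the Ensembl gene identifier ("Gene"), analysed cell line ("Cell.line"), number of reads per kilobase gene model and million reads ("RPKM"), and abundance class ("Abundance"). Note that in the hpar 1.14.2 session shown under Results, the corresponding columns are named "Sample", "Value" and "Unit" (with values in FPKM), and more than three cell lines appear.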
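
As an illustration of how these tables can be queried, the following sketch filters a small stand-in for hpaNormalTissue by expression level. The object names normal_tissue, detected and level_counts are chosen here for illustration only; the rows are copied from the head() output shown under Results, so the code runs without the package installed.

```r
# Toy stand-in for hpaNormalTissue: same six columns, rows copied from the
# head() output shown in the Results section of this page.
normal_tissue <- data.frame(
  Gene = rep("ENSG00000000003", 6),
  Tissue = c("adrenal gland", "appendix", "appendix",
             "bone marrow", "breast", "breast"),
  Cell.type = c("glandular cells", "glandular cells", "lymphoid tissue",
                "hematopoietic cells", "adipocytes", "glandular cells"),
  Level = c("Not detected", "Medium", "Not detected",
            "Not detected", "Not detected", "High"),
  Expression.type = "APE",      # single value, recycled for every row
  Reliability = "Supportive",   # single value, recycled for every row
  stringsAsFactors = FALSE
)

# Keep only the tissue/cell-type combinations where the protein was detected.
detected <- normal_tissue[normal_tissue$Level != "Not detected",
                          c("Tissue", "Cell.type", "Level")]
print(detected)

# Count the annotations per expression level.
level_counts <- table(normal_tissue$Level)
print(level_counts)
```

With the package installed, the same subsetting should apply to the full table after data(hpaNormalTissue), provided the column names match those of the installed version.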

Detailed descriptions of gene entries and images are not included in the package but can be accessed from within the R environment through a web browser while on-line.

The full data sets can be individually loaded using the data function (see example below). Data about individual genes of interest can be retrieved with the getHpa function.
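
Conceptually, retrieving the data for a gene of interest amounts to selecting the rows with a matching "Gene" identifier. The sketch below shows that idea on a small stand-in for hpaSubcellularLoc (rows copied from the Results section). The helper lookup_gene is hypothetical and is not part of hpar; it is not meant to describe how getHpa is implemented, and the identifier ENSG99999999999 is a made-up placeholder used to show the no-match case.

```r
# Toy stand-in for hpaSubcellularLoc (rows copied from the Results section).
subcell <- data.frame(
  Gene = c("ENSG00000000003", "ENSG00000000457", "ENSG00000001084"),
  Main.location = c("Cytoplasm", "Cytoskeleton (Microtubules)",
                    "Nucleus;Nucleoli"),
  Other.location = c("", "Nucleus but not nucleoli;Golgi apparatus",
                     "Cytoplasm"),
  Expression.type = "APE",
  Reliability = c("Uncertain", "Uncertain", "Supportive"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of hpar): return all rows of 'tbl' whose
# Gene column matches one of the Ensembl identifiers in 'ids'.
lookup_gene <- function(ids, tbl) {
  tbl[tbl$Gene %in% ids, , drop = FALSE]
}

hit <- lookup_gene("ENSG00000001084", subcell)
print(hit)

# An identifier that is absent from the table yields a zero-row data frame.
miss <- lookup_gene("ENSG99999999999", subcell)
print(nrow(miss))
```

Using %in% rather than == lets the helper accept several identifiers at once and return the rows for all of them.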

HPA data usage policy: The use of data and images from this site in publications and presentations is permitted provided that the following conditions are met:

  1. The publication and/or presentation are solely for informational and non-commercial purposes.

  2. The source of the data and/or image is referred to this site (www.proteinatlas.org) and/or one or more of our publications are cited.

Author(s)

Laurent Gatto <lg390@cam.ac.uk>

References

See the Human Protein Atlas Project page http://www.proteinatlas.org/ for more details and documentation.

Uhlen et al (2010). Towards a knowledge-based Human Protein Atlas. Nat Biotechnol. 28(12):1248-50.

Berglund et al (2008). A gene-centric Human Protein Atlas for expression profiles based on antibodies. Mol Cell Proteomics. 7(10):2019-27.

Uhlen et al (2005). A human protein atlas for normal and cancer tissues based on antibody proteomics. Mol Cell Proteomics. 4(12):1920-1932.

Ponten et al (2008). The Human Protein Atlas - a tool for pathology. J Pathology. 216(4):387-93.

See Also

getHpaDate for release information. Gene-specific information should be accessed using the getHpa function.

The package vignette can be accessed with vignette("hpar").

Examples

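## RNA data: load, preview the first rows, and report the dimensions
data(hpaRna)
head(hpaRna)
dim(hpaRna)

## Normal tissue data
data(hpaNormalTissue)
head(hpaNormalTissue)
dim(hpaNormalTissue)

## Subcellular location data
data(hpaSubcellularLoc)
head(hpaSubcellularLoc)
dim(hpaSubcellularLoc)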
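
In the session output shown under Results, some "Main.location" values list several locations separated by semicolons. A minimal sketch of splitting such entries into one row per gene and location follows; the names main_location, locations and long are illustrative, and the values are copied from that output so that the code runs without the package.

```r
# Main.location values copied from the head(hpaSubcellularLoc) output,
# named by their Ensembl gene identifier.
main_location <- c(
  ENSG00000000003 = "Cytoplasm",
  ENSG00000000460 = "Nucleus but not nucleoli;Mitochondria",
  ENSG00000001084 = "Nucleus;Nucleoli"
)

# Split each entry on the literal ";" separator; the gene names are kept.
locations <- strsplit(main_location, ";", fixed = TRUE)

# Number of locations annotated per gene.
n_locations <- lengths(locations)
print(n_locations)

# Long format: one row per (gene, location) pair.
long <- data.frame(
  Gene = rep(names(locations), times = n_locations),
  Location = unlist(locations, use.names = FALSE),
  stringsAsFactors = FALSE
)
print(long)
```

The long format makes it straightforward to tabulate locations, for example with table(long$Location).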

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(hpar)
This is hpar 1.14.2. For more information, 
please type '?hpar' or 'vignette('hpar')'.

> ### Name: hpar
> ### Title: Human Protein Atlas in R
> ### Aliases: hpar hpaRna hpaRna hpaNormalTissue hpaSubcellularLoc
> ### Keywords: datasets
> 
> ### ** Examples
> 
> data(hpaRna)
> head(hpaRna)
             Gene  Sample Value Unit Abundance
1 ENSG00000000003   A-431  21.3 FPKM    Medium
2 ENSG00000000003    A549  32.5 FPKM    Medium
3 ENSG00000000003  AN3-CA  38.2 FPKM    Medium
4 ENSG00000000003    BEWO  31.4 FPKM    Medium
5 ENSG00000000003  CACO-2  63.9 FPKM      High
6 ENSG00000000003 CAPAN-2  34.2 FPKM    Medium
> dim(hpaRna)
[1] 1546127       5
> data(hpaNormalTissue)
> head(hpaNormalTissue)
             Gene        Tissue           Cell.type        Level
1 ENSG00000000003 adrenal gland     glandular cells Not detected
2 ENSG00000000003      appendix     glandular cells       Medium
3 ENSG00000000003      appendix     lymphoid tissue Not detected
4 ENSG00000000003   bone marrow hematopoietic cells Not detected
5 ENSG00000000003        breast          adipocytes Not detected
6 ENSG00000000003        breast     glandular cells         High
  Expression.type Reliability
1             APE  Supportive
2             APE  Supportive
3             APE  Supportive
4             APE  Supportive
5             APE  Supportive
6             APE  Supportive
> dim(hpaNormalTissue)
[1] 1319440       6
> data(hpaSubcellularLoc)
> head(hpaSubcellularLoc)
             Gene                         Main.location
1 ENSG00000000003                             Cytoplasm
2 ENSG00000000457           Cytoskeleton (Microtubules)
3 ENSG00000000460 Nucleus but not nucleoli;Mitochondria
4 ENSG00000001036 Nucleus but not nucleoli;Mitochondria
5 ENSG00000001084                      Nucleus;Nucleoli
6 ENSG00000001460                               Nucleus
                            Other.location Expression.type Reliability
1                                                      APE   Uncertain
2 Nucleus but not nucleoli;Golgi apparatus             APE   Uncertain
3                                                      APE   Uncertain
4                                                      APE   Uncertain
5                                Cytoplasm             APE  Supportive
6                         Nuclear membrane             APE  Supportive
> dim(hpaSubcellularLoc)
[1] 8857    5