Last data update: 2014.03.03

UniProt.ws-objects    R Documentation

UniProt.ws objects and their related methods and functions

Description

UniProt.ws is the base class for interacting with the UniProt web services from Bioconductor.

In much the same way as an AnnotationDb object allows access to select for many other annotation packages, UniProt.ws is meant to allow usage of select and its supporting methods, so that data can be extracted easily from the UniProt web services.

select, columns and keys are used together to extract data via a UniProt.ws object.

columns shows which kinds of data can be returned for the UniProt.ws object.

keytypes allows the user to discover which keytypes can be passed in to select or keys via the keytype argument.

keys returns keys for the database contained in the UniProt.ws object. By default it returns the primary keys for the database, which are UNIPROTKB keys, but if used with the keytype argument, it returns the keys of that keytype.

select retrieves the data as a data.frame, based on the values supplied for the keys, columns and keytype arguments.
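A request that names an unsupported column or keytype is easiest to catch before any query goes out. The following is a minimal sketch, not part of UniProt.ws: assertAvailable is a hypothetical helper that compares requested values with the ones reported by columns or keytypes. The column, keytype and key values are taken from the Examples section, and the web service calls run only in an interactive session with the package installed.

```r
## Hypothetical helper (an assumption, not part of UniProt.ws):
## stop with a clear message when a requested value is not offered.
assertAvailable <- function(requested, available, what) {
  unknown <- setdiff(requested, available)
  if (length(unknown) > 0) {
    stop("unknown ", what, ": ", paste(unknown, collapse = ", "))
  }
  invisible(TRUE)
}

## Offline illustration with a hand-written vector of available columns
assertAvailable(c("PDB", "UNIGENE"), c("PDB", "UNIGENE", "SEQUENCE"), "column")

## Against the live service: check first, then query
if (interactive() && requireNamespace("UniProt.ws", quietly = TRUE)) {
  library(UniProt.ws)
  up <- UniProt.ws(taxId = 10090)
  wanted <- c("PDB", "UNIGENE", "SEQUENCE")
  assertAvailable(wanted, columns(up), "column")
  assertAvailable("ENTREZ_GENE", keytypes(up), "keytype")
  res <- select(up,
                keys = c("22627", "22629"),
                columns = wanted,
                keytype = "ENTREZ_GENE")
  head(res)
}
```

Checking against columns and keytypes first turns a misspelled name into an immediate local error rather than a failed or empty remote query.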

The UniProt.ws object is loaded whenever you load the UniProt.ws package. This object is set up to retrieve information from Homo sapiens by default, but this value can be changed to any of the species supported by UniProt. The species and taxId methods allow users to see which species is currently being accessed, and taxId<- allows them to change this value.

species shows the genus and species label currently attached to the UniProt.ws object's database.

taxId shows the NCBI taxonomy ID currently attached to the UniProt.ws object's database. Using the equivalently named replacement method (taxId<-) allows the user to change the taxonomy ID, and the species represented along with it.

availableUniprotSpecies is a helper function that lists the species available from UniProt along with their official taxonomy IDs. Because so many species are represented at UniProt, there is also a pattern argument that restricts the results to those whose species names match the search term. Please remember when using this argument that the genus is always capitalized and the species never is.
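The capitalization rule can be applied to a search term before it is passed to availableUniprotSpecies. This is a sketch under assumptions: normalizeSpeciesPattern is a hypothetical helper, not part of UniProt.ws, and it assumes the term begins with the genus (a species-only term such as "musculus" should be passed unchanged).

```r
## Hypothetical helper (not part of UniProt.ws): lowercase the whole
## term, then capitalize its first letter, so that the genus is
## capitalized and the species is not.
normalizeSpeciesPattern <- function(term) {
  term <- tolower(trimws(term))
  paste0(toupper(substring(term, 1, 1)), substring(term, 2))
}

pattern <- normalizeSpeciesPattern("mus MUSCULUS")
pattern

## Query UniProt only in an interactive session with the package installed;
## n caps the number of rows returned.
if (interactive() && requireNamespace("UniProt.ws", quietly = TRUE)) {
  UniProt.ws::availableUniprotSpecies(pattern, n = 5)
}
```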

lookupUniprotSpeciesFromTaxId is another helper that looks up the species for any taxonomy ID that is supported by UniProt.

Usage

  columns(x)
  keytypes(x)
  select(x, keys, columns, keytype, ...)
  species(object)
  taxId(x)

  availableUniprotSpecies(pattern, n=Inf)
  lookupUniprotSpeciesFromTaxId(taxId)
  UniProt.ws(taxId, ...)

Arguments

x

the UniProt.ws object.

object

the UniProt.ws object.

keys

the keys for which to select records from the database. All possible keys are returned by the keys method.

columns

the columns or kinds of things that can be retrieved from the database. As with keys, all possible columns are returned by using the columns method.

keytype

the keytype that matches the keys used. For the select method, this indicates the kind of ID supplied in the keys argument. For the keys method, it indicates which kind of keys to return.

pattern

a string used to limit the results to species whose names match it.

n

the maximum number of results to return.

taxId

a taxonomy ID.

...

other arguments

Value

keys, columns, keytypes, species and lookupUniprotSpeciesFromTaxId each return a character vector of possible values.

taxId returns a numeric value that corresponds to the taxonomy ID.

select and availableUniprotSpecies each return a data.frame.

Author(s)

Marc Carlson

See Also

select

Examples

## Make a UniProt.ws object
up <- UniProt.ws(taxId=9606)

## look at the object
up

## get the current species
species(up)

## look up available species with their tax ids
availableUniprotSpecies("musculus")

## get the current taxId
taxId(up)

## look up the species that goes with a tax id
lookupUniprotSpeciesFromTaxId(9606)

## set the taxId to something else
taxId(up) <- 10090
up

## list the possible key types
head(keytypes(up))

## list the columns that can be retrieved
head(columns(up))

## list all possible keys of type Entrez Gene ID
## (this process is not instantaneous)
if (interactive()) {
  egs <- keys(up, "ENTREZ_GENE")
}

## use select to extract some data
res <- select(up, 
              keys = c("22627","22629"), 
              columns = c("PDB","UNIGENE","SEQUENCE"),
              keytype = "ENTREZ_GENE")
head(res)
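
Because every call above goes to a remote web service, a query can fail for reasons unrelated to the code, for example a dropped connection. One way to keep a script running is to wrap the call in base R's tryCatch. The sketch below assumes that returning NULL on failure is acceptable; safeQuery is a hypothetical wrapper, not part of UniProt.ws.

```r
## Hypothetical wrapper (not part of UniProt.ws): evaluate a query and
## return NULL, with a message, if it signals an error.
safeQuery <- function(expr) {
  tryCatch(expr, error = function(e) {
    message("query failed: ", conditionMessage(e))
    NULL
  })
}

## Offline illustration: one failing and one succeeding expression
safeQuery(stop("service unavailable"))  # prints a message, returns NULL
safeQuery(nchar("P62259"))              # returns 6

## With the live service (interactive sessions only)
if (interactive() && requireNamespace("UniProt.ws", quietly = TRUE)) {
  library(UniProt.ws)
  up <- UniProt.ws(taxId = 10090)
  res <- safeQuery(select(up,
                          keys = c("22627", "22629"),
                          columns = "UNIGENE",
                          keytype = "ENTREZ_GENE"))
  if (!is.null(res)) head(res)
}
```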

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(UniProt.ws)
Loading required package: RSQLite
Loading required package: DBI
Loading required package: RCurl
Loading required package: bitops
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
    get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
    match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
    rbind, rownames, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit

> ### Name: UniProt.ws-objects
> ### Title: UniProt.ws objects and their related methods and functions
> ### Aliases: UniProt.ws class:UniProt.ws UniProt.ws-class
> ###   show,UniProt.ws-method species species,UniProt.ws-method taxId
> ###   taxId,UniProt.ws-method taxId<- taxId<-,UniProt.ws-method cols
> ###   columns columns,UniProt.ws-method keytypes keytypes,UniProt.ws-method
> ###   keys keys,UniProt.ws-method select select,UniProt.ws-method
> ###   availableUniprotSpecies lookupUniprotSpeciesFromTaxId
> ### Keywords: classes methods
> 
> ### ** Examples
> 
> ## Make a UniProt.ws object
> up <- UniProt.ws(taxId=9606)
> 
> ## look at the object
> up
"UniProt.ws" object:
An interface object for UniProt web services
Current Taxonomy ID:
9606
Current Species name:
Homo sapiens
To change Species see: help('availableUniprotSpecies')
> 
> ## get the current species
> species(up)
[1] "Homo sapiens"
> 
> ## look up available species with their tax ids
> availableUniprotSpecies("musculus")
  taxon ID            Species name
1   520121     Anthocoris musculus
2   208057    Anthoscopus musculus
3   238007         Apomys musculus
4     9771   Balaenoptera musculus
5   197864    Blepharisma musculus
6    10090            Mus musculus
7    35531 Mus musculus bactrianus
8    10091  Mus musculus castaneus
9    57486 Mus musculus molossinus
> 
> ## get the current taxId
> taxId(up)
[1] 9606
> 
> ## look up the species that goes with a tax id
> lookupUniprotSpeciesFromTaxId(9606)
[1] "Homo sapiens"
> 
> ## set the taxId to something else
> taxId(up) <- 10090
> up
"UniProt.ws" object:
An interface object for UniProt web services
Current Taxonomy ID:
10090
Current Species name:
Mus musculus
To change Species see: help('availableUniprotSpecies')
> 
> ## list the possible key types
> head(keytypes(up))
[1] "AARHUS/GHENT-2DPAGE" "AGD"                 "ALLERGOME"          
[4] "ARACHNOSERVER"       "BIOCYC"              "CGD"                
> 
> ## list the columns that can be retreived
> head(columns(up))
[1] "3D"                  "AARHUS/GHENT-2DPAGE" "AGD"                
[4] "ALLERGOME"           "ARACHNOSERVER"       "BIOCYC"             
> 
> ## list all possible keys of type entrez gene ID.
> ## (this process is not instantaneous)
> #if(interactive()){
>   egs = keys(up, "ENTREZ_GENE")
Getting mapping data for P68510 ... and P_ENTREZGENEID
> #}
> 
> ## use select to extract some data
> res <- select(up, 
+               keys = c("22627","22629"), 
+               columns = c("PDB","UNIGENE","SEQUENCE"),
+               keytype = "ENTREZ_GENE")
Getting mapping data for 22627 ... and ACC
Getting mapping data for P62259 ... and UNIGENE_ID
Getting mapping data for P62259 ... and PDB_ID
Getting extra data for P62259,Q5SS40,P68510
'select()' returned 1:many mapping between keys and columns
> head(res)
  ENTREZ_GENE  PDB   UNIGENE
1       22627 <NA> Mm.471625
2       22627 <NA> Mm.234700
3       22629 <NA> Mm.332314
                                                                                                                                                                                                                                                         SEQUENCE
1 MDDREDLVYQAKLAEQAERYDEMVESMKKVAGMDVELTVEERNLLSVAYKNVIGARRASWRIISSIEQKEENKGGEDKLKMIREYRQMVETELKLICCDILDVLDKHLIPAANTGESKVFYYKMKGDYHRYLAEFATGNDRKEAAENSLVAYKAASDIAMTELPPTHPIRLGLALNFSVFYYEILNSPDRACRLAKAAFDDAIAELDTLSEESYKDSTLIMQLLRDNLTLWTSDMQGDGEEQNKEALQDVEDENQ
2 MDDREDLVYQAKLAEQAERYDEMVESMKKVAGMDVELTVEERNLLSVAYKNVIGARRASWRIISSIEQKEENKGGEDKLKMIREYRQMVETELKLICCDILDVLDKHLIPAANTGESKVFYYKMKGDYHRYLAEFATGNDRKEAAENSLVAYKAASDIAMTELPPTHPIRLGLALNFSVFYYEILNSPDRACRLAKAAFDDAIAELDTLSEESYKDSTLIMQLLRDNLTLWTSDMQGDGEEQNKEALQDVEDENQ
3          MGDREQLLQRARLAEQAERYDDMASAMKAVTELNEPLSNEDRNLLSVAYKNVVGARRSSWRVISSIEQKTMADGNEKKLEKVKAYREKIEKELETVCNDVLALLDKFLIKNCNDFQYESKVFYLKMKGDYYRYLAEVASGEKKNSVVEASEAAYKEAFEISKEHMQPTHPIRLGLALNFSVFYYEIQNAPEQACLLAKQAFDDAIAELDTLNEDSYKDSTLIMQLLRDNLTLWTSDQQDEEAGEGN
> 