UNIPROTKB
R Documentation
Descriptions of available values for columns and keytypes.
Description
This manual page enumerates the kinds of data represented by the
values returned when the user calls columns or keytypes.
Details
All the possible values for columns and keytypes are listed
below. Not every value applies to every object, so users need to
call these methods on their own object to learn which of the
following values apply in their case.
:UNIPROTKB
The central ID for UniProt and Swiss-Prot
:UNIPARC
UniParc
:UNIREF50
UniRef50
:UNIREF90
UniRef90
:UNIREF100
UniRef100
:EMBL/GENBANK/DDBJ
EMBL/GenBank/DDBJ
:EMBL/GENBANK/DDBJ_CDS
EMBL/GenBank/DDBJ CDS
:PIR
PIR
:UNIGENE
UniGene
:ENTREZ_GENE
Entrez Gene (GeneID)
:GI_NUMBER*
GI number
:IPI
IPI
:REFSEQ_PROTEIN
RefSeq Protein
:REFSEQ_NUCLEOTIDE
RefSeq Nucleotide
:PDB
PDB
:DISPROT
DisProt
:HSSP
HSSP
:DIP
DIP
:MINT
MINT
:ALLERGOME
Allergome
:MEROPS
MEROPS
:PEROXIBASE
PeroxiBase
:PPTASEDB
PptaseDB
:REBASE
REBASE
:TCDB
TCDB
:PHOSSITE
PhosSite
:DMDM
DMDM
:AARHUS/GHENT-2DPAGE
Aarhus/Ghent-2DPAGE
:ECO2DBASE
ECO2DBASE
:WORLD-2DPAGE
World-2DPAGE
:DNASU
DNASU
:ENSEMBL
Ensembl
:ENSEMBL_PROTEIN
Ensembl Protein
:ENSEMBL_TRANSCRIPT
Ensembl Transcript
:ENSEMBL_GENOMES
Ensembl Genomes
:ENSEMBL_GENOMES PROTEIN
Ensembl Genomes Protein
:ENSEMBL_GENOMES TRANSCRIPT
Ensembl Genomes Transcript
:KEGG
KEGG
:PATRIC
PATRIC
:TIGR
TIGR
:UCSC
UCSC
:VECTORBASE
VectorBase
:AGD
AGD
:ARACHNOSERVER
ArachnoServer
:CGD
CGD
:CONOSERVER
ConoServer
:CYGD
CYGD
:DICTYBASE
dictyBase
:ECHOBASE
EchoBASE
:ECOGENE
EcoGene
:EUHCVDB
euHCVdb
:EUPATHDB
EuPathDB
:FLYBASE
FlyBase
:GENECARDS
GeneCards
:GENEFARM
GeneFarm
:GENOLIST
GenoList
:H-INVDB
H-InvDB
:HGNC
HGNC
:HPA
HPA
:LEGIOLIST
LegioList
:LEPROMA
Leproma
:MAIZEGDB
MaizeGDB
:MIM
MIM
:MGI
MGI
:NEXTPROT
neXtProt
:ORPHANET
Orphanet
:PHARMGKB
PharmGKB
:POMBASE
PomBase
:PSEUDOCAP
PseudoCAP
:RGD
RGD
:SGD
SGD
:TAIR
TAIR
:TUBERCULIST
TubercuList
:WORMBASE
WormBase
:WORMBASE_TRANSCRIPT
WormBase Transcript
:WORMBASE_PROTEIN
WormBase Protein
:XENBASE
Xenbase
:ZFIN
ZFIN
:EGGNOG
eggNOG
:GENETREE
GeneTree
:HOVERGEN
HOVERGEN
:KO
KO
:OMA
OMA
:ORTHODB
OrthoDB
:PROTCLUSTDB
ProtClustDB
:BIOCYC
BioCyc
:REACTOME
Reactome
:UNIPATHWAY
UniPathWay
:CLEANEX
CleanEx
:GERMONLINE
GermOnline
:DRUGBANK
DrugBank
:GENOMERNAI
GenomeRNAi
:NEXTBIO
NextBio
:CITATION
citations
:CLUSTERS
clusters
:COMMENTS
comments
:DOMAINS
domains
:DOMAIN
domain
:EC
EC ID
:ID
ID
:EXISTENCE
existence
:FAMILIES
families
:FEATURES
features
:GENES
genes
:GO
GO term
:GO-ID
GO ID
:INTERPRO
InterPro
:INTERACTOR
interactor
:KEYWORDS
keywords
:KEYWORD-ID
keyword-id
:LAST-MODIFIED
last-modified
:LENGTH
length
:ORGANISM
organism
:ORGANISM-ID
organism-id
:PATHWAY
pathway
:PROTEIN NAMES
protein names
:REVIEWED
reviewed
:SCORE
score
:SEQUENCE
sequence
:3D
3d
:TAXON
taxon
:TOOLS
tools
:VERSION
version
:DATABASE(PFAM)
PFAM ids
:DATABASE(PDB)
PDB ids
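Because only some of the values listed above apply to a given object, it can help to check a request against what columns or keytypes actually return before calling select. The following is a minimal sketch in base R. The helper name check_values is hypothetical and not part of UniProt.ws, and the available vector is a short hand-typed excerpt rather than live output.

```r
# Hypothetical helper (not part of UniProt.ws): split the requested values
# into those that are offered and those that are not.
check_values <- function(requested, available) {
  list(valid   = intersect(requested, available),
       invalid = setdiff(requested, available))
}

# Hand-typed excerpt standing in for columns(up); in a live session one
# would pass columns(up) or keytypes(up) here instead.
available <- c("PDB", "SEQUENCE", "UNIPROTKB", "PROTEIN-NAMES")

res <- check_values(c("PDB", "SEQUENCE", "PROTEIN NAMES"), available)
res$valid    # requested values that can be used
res$invalid  # requested values that are not offered
```

A check of this kind would surface spelling differences such as the one visible on this page, where the list gives PROTEIN NAMES while the columns output in the Results reports PROTEIN-NAMES.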
Author(s)
Marc Carlson
Examples
library(UniProt.ws)
up <- UniProt.ws(taxId=9606)
## List the possible values for columns
columns(up)
## List the possible values for keytypes
keytypes(up)
## get some values back
## list the first few keys of type UNIPROTKB
## (this process is not instantaneous)
if(interactive()){
    keys <- head(keys(up, keytype="UNIPROTKB"))
    keys
}
select(up, keys=c("P31946","P62258"), columns=c("PDB","SEQUENCE"),
keytype="UNIPROTKB")
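select can return several rows per key when a column holds multiple values, for example several PDB ids for one UNIPROTKB accession. The following is a minimal base R sketch of collapsing such a result to one row per key. The data frame is typed in by hand from the values shown in the Results on this page (the SEQUENCE column is left out for brevity), and the variable names are illustrative only.

```r
# Hand-typed stand-in for the data frame returned by select(); the values
# are copied from the Results section, without the SEQUENCE column.
res <- data.frame(
  UNIPROTKB = c("P31946", "P31946", "P31946", "P62258", "P62258", "P62258"),
  PDB       = c("2BQ0", "2C23", "4DNK", "2BR9", "3UAL", "3UBW"),
  stringsAsFactors = FALSE
)

# One row per key, with that key's PDB ids joined by ";".
collapsed <- aggregate(PDB ~ UNIPROTKB, data = res,
                       FUN = paste, collapse = ";")
collapsed
```

Joining with ";" is only one option; keeping the ids as a list per key, for instance with split(res$PDB, res$UNIPROTKB), may suit later processing better.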
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(UniProt.ws)
Loading required package: RSQLite
Loading required package: DBI
Loading required package: RCurl
Loading required package: bitops
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/UniProt.ws/colsAndKeytypes.Rd_%03d_medium.png", width=480, height=480)
> ### Name: UNIPROTKB
> ### Title: Descriptions of available values for 'columns' and 'keytypes'.
> ### Aliases: UNIPROTKB UNIPARC UNIREF50 UNIREF90 UNIREF100
> ### EMBL/GENBANK/DDBJ EMBL/GENBANK/DDBJ_CDS PIR UNIGENE ENTREZ_GENE
> ### GI_NUMBER* IPI REFSEQ_PROTEIN REFSEQ_NUCLEOTIDE PDB DISPROT HSSP DIP
> ### MINT ALLERGOME MEROPS PEROXIBASE PPTASEDB REBASE TCDB PHOSSITE DMDM
> ### AARHUS/GHENT-2DPAGE ECO2DBASE WORLD-2DPAGE DNASU ENSEMBL
> ### ENSEMBL_PROTEIN ENSEMBL_TRANSCRIPT ENSEMBL_GENOMES 'ENSEMBL_GENOMES
> ### PROTEIN' 'ENSEMBL_GENOMES TRANSCRIPT' KEGG PATRIC TIGR UCSC
> ### VECTORBASE AGD ARACHNOSERVER CGD CONOSERVER CYGD DICTYBASE ECHOBASE
> ### ECOGENE EUHCVDB EUPATHDB FLYBASE GENECARDS GENEFARM GENOLIST H-INVDB
> ### HGNC HPA LEGIOLIST LEPROMA MAIZEGDB MIM MGI NEXTPROT ORPHANET
> ### PHARMGKB POMBASE PSEUDOCAP RGD SGD TAIR TUBERCULIST WORMBASE
> ### WORMBASE_TRANSCRIPT WORMBASE_PROTEIN XENBASE ZFIN EGGNOG GENETREE
> ### HOGENOM HOVERGEN KO OMA ORTHODB PROTCLUSTDB BIOCYC REACTOME
> ### UNIPATHWAY CLEANEX GERMONLINE DRUGBANK GENOMERNAI NEXTBIO CITATION
> ### CLUSTERS COMMENTS DOMAINS DOMAIN EC ID EXISTENCE FAMILIES FEATURES
> ### GENES GO GO-ID INTERPRO INTERACTOR KEYWORDS KEYWORD-ID LAST-MODIFIED
> ### LENGTH ORGANISM ORGANISM-ID PATHWAY 'PROTEIN NAMES' REVIEWED SCORE
> ### SEQUENCE 3D TAXONOMY-LINEAGE TOOLS VERSION DATABASE(PFAM)
> ### DATABASE(PDB)
> ### Keywords: utilities manip
>
> ### ** Examples
>
> library(UniProt.ws)
> up <- UniProt.ws(taxId=9606)
> ## List the possible values for columns
> columns(up)
[1] "3D" "AARHUS/GHENT-2DPAGE"
[3] "AGD" "ALLERGOME"
[5] "ARACHNOSERVER" "BIOCYC"
[7] "CGD" "CITATION"
[9] "CLEANEX" "CLUSTERS"
[11] "COMMENTS" "CONOSERVER"
[13] "CYGD" "DATABASE(PDB)"
[15] "DATABASE(PFAM)" "DICTYBASE"
[17] "DIP" "DISPROT"
[19] "DMDM" "DNASU"
[21] "DOMAIN" "DOMAINS"
[23] "DRUGBANK" "EC"
[25] "ECHOBASE" "ECO2DBASE"
[27] "ECOGENE" "EGGNOG"
[29] "EMBL/GENBANK/DDBJ" "EMBL/GENBANK/DDBJ_CDS"
[31] "ENSEMBL" "ENSEMBL_GENOMES"
[33] "ENSEMBL_GENOMES PROTEIN" "ENSEMBL_GENOMES TRANSCRIPT"
[35] "ENSEMBL_PROTEIN" "ENSEMBL_TRANSCRIPT"
[37] "ENTREZ_GENE" "ENTRY-NAME"
[39] "EUHCVDB" "EUPATHDB"
[41] "EXISTENCE" "FAMILIES"
[43] "FEATURES" "FLYBASE"
[45] "GENECARDS" "GENEFARM"
[47] "GENES" "GENETREE"
[49] "GENOLIST" "GENOMERNAI"
[51] "GERMONLINE" "GI_NUMBER*"
[53] "GO" "GO-ID"
[55] "H-INVDB" "HGNC"
[57] "HOGENOM" "HPA"
[59] "HSSP" "ID"
[61] "INTERACTOR" "INTERPRO"
[63] "KEGG" "KEYWORD-ID"
[65] "KEYWORDS" "KO"
[67] "LAST-MODIFIED" "LEGIOLIST"
[69] "LENGTH" "LEPROMA"
[71] "MAIZEGDB" "MEROPS"
[73] "MGI" "MIM"
[75] "MINT" "NEXTBIO"
[77] "NEXTPROT" "OMA"
[79] "ORGANISM" "ORGANISM-ID"
[81] "ORPHANET" "ORTHODB"
[83] "PATHWAY" "PATRIC"
[85] "PDB" "PEROXIBASE"
[87] "PHARMGKB" "PHOSSITE"
[89] "PIR" "POMBASE"
[91] "PPTASEDB" "PROTCLUSTDB"
[93] "PROTEIN-NAMES" "PSEUDOCAP"
[95] "REACTOME" "REBASE"
[97] "REFSEQ_NUCLEOTIDE" "REFSEQ_PROTEIN"
[99] "REVIEWED" "RGD"
[101] "SCORE" "SEQUENCE"
[103] "SGD" "SUBCELLULAR-LOCATIONS"
[105] "TAIR" "TAXONOMIC-LINEAGE"
[107] "TCDB" "TIGR"
[109] "TOOLS" "TUBERCULIST"
[111] "UCSC" "UNIGENE"
[113] "UNIPARC" "UNIPATHWAY"
[115] "UNIPROTKB" "UNIREF100"
[117] "UNIREF50" "UNIREF90"
[119] "VECTORBASE" "VERSION"
[121] "VIRUS-HOSTS" "WORLD-2DPAGE"
[123] "WORMBASE" "WORMBASE_PROTEIN"
[125] "WORMBASE_TRANSCRIPT" "XENBASE"
[127] "ZFIN"
> ## List the possible values for keytypes
> keytypes(up)
[1] "AARHUS/GHENT-2DPAGE" "AGD"
[3] "ALLERGOME" "ARACHNOSERVER"
[5] "BIOCYC" "CGD"
[7] "CLEANEX" "CONOSERVER"
[9] "CYGD" "DICTYBASE"
[11] "DIP" "DISPROT"
[13] "DMDM" "DNASU"
[15] "DRUGBANK" "ECHOBASE"
[17] "ECO2DBASE" "ECOGENE"
[19] "EGGNOG" "EMBL/GENBANK/DDBJ"
[21] "EMBL/GENBANK/DDBJ_CDS" "ENSEMBL"
[23] "ENSEMBL_GENOMES" "ENSEMBL_GENOMES PROTEIN"
[25] "ENSEMBL_GENOMES TRANSCRIPT" "ENSEMBL_PROTEIN"
[27] "ENSEMBL_TRANSCRIPT" "ENTREZ_GENE"
[29] "EUHCVDB" "EUPATHDB"
[31] "FLYBASE" "GENECARDS"
[33] "GENEFARM" "GENETREE"
[35] "GENOLIST" "GENOMERNAI"
[37] "GERMONLINE" "GI_NUMBER*"
[39] "H-INVDB" "HGNC"
[41] "HOGENOM" "HPA"
[43] "HSSP" "KEGG"
[45] "KO" "LEGIOLIST"
[47] "LEPROMA" "MAIZEGDB"
[49] "MEROPS" "MGI"
[51] "MIM" "MINT"
[53] "NEXTBIO" "NEXTPROT"
[55] "OMA" "ORPHANET"
[57] "ORTHODB" "PATRIC"
[59] "PDB" "PEROXIBASE"
[61] "PHARMGKB" "PHOSSITE"
[63] "PIR" "POMBASE"
[65] "PPTASEDB" "PROTCLUSTDB"
[67] "PSEUDOCAP" "REACTOME"
[69] "REBASE" "REFSEQ_NUCLEOTIDE"
[71] "REFSEQ_PROTEIN" "RGD"
[73] "SGD" "TAIR"
[75] "TCDB" "TIGR"
[77] "TUBERCULIST" "UCSC"
[79] "UNIGENE" "UNIPARC"
[81] "UNIPATHWAY" "UNIPROTKB"
[83] "UNIREF100" "UNIREF50"
[85] "UNIREF90" "VECTORBASE"
[87] "WORLD-2DPAGE" "WORMBASE"
[89] "WORMBASE_PROTEIN" "WORMBASE_TRANSCRIPT"
[91] "XENBASE" "ZFIN"
> ## get some values back
> ## list all possible keys of type entrez gene ID.
> ## (this process is not instantaneous)
> # if(interactive()){
> keys <- head(keys(up, keytype="UNIPROTKB"))
> keys
[1] "P61981" "P04439" "P30456" "P10316" "P30460" "Q95365"
> # }
> select(up, keys=c("P31946","P62258"), columns=c("PDB","SEQUENCE"),
+ keytype="UNIPROTKB")
Getting mapping data for P31946 ... and PDB_ID
Getting extra data for P31946,P62258
'select()' returned 1:many mapping between keys and columns
UNIPROTKB PDB
1 P31946 2BQ0
2 P31946 2C23
3 P31946 4DNK
4 P62258 2BR9
5 P62258 3UAL
6 P62258 3UBW
SEQUENCE
1 MTMDKSELVQKAKLAEQAERYDDMAAAMKAVTEQGHELSNEERNLLSVAYKNVVGARRSSWRVISSIEQKTERNEKKQQMGKEYREKIEAELQDICNDVLELLDKYLIPNATQPESKVFYLKMKGDYFRYLSEVASGDNKQTTVSNSQQAYQEAFEISKKEMQPTHPIRLGLALNFSVFYYEILNSPEKACSLAKTAFDEAIAELDTLNEESYKDSTLIMQLLRDNLTLWTSENQGDEGDAGEGEN
2 MTMDKSELVQKAKLAEQAERYDDMAAAMKAVTEQGHELSNEERNLLSVAYKNVVGARRSSWRVISSIEQKTERNEKKQQMGKEYREKIEAELQDICNDVLELLDKYLIPNATQPESKVFYLKMKGDYFRYLSEVASGDNKQTTVSNSQQAYQEAFEISKKEMQPTHPIRLGLALNFSVFYYEILNSPEKACSLAKTAFDEAIAELDTLNEESYKDSTLIMQLLRDNLTLWTSENQGDEGDAGEGEN
3 MTMDKSELVQKAKLAEQAERYDDMAAAMKAVTEQGHELSNEERNLLSVAYKNVVGARRSSWRVISSIEQKTERNEKKQQMGKEYREKIEAELQDICNDVLELLDKYLIPNATQPESKVFYLKMKGDYFRYLSEVASGDNKQTTVSNSQQAYQEAFEISKKEMQPTHPIRLGLALNFSVFYYEILNSPEKACSLAKTAFDEAIAELDTLNEESYKDSTLIMQLLRDNLTLWTSENQGDEGDAGEGEN
4 MDDREDLVYQAKLAEQAERYDEMVESMKKVAGMDVELTVEERNLLSVAYKNVIGARRASWRIISSIEQKEENKGGEDKLKMIREYRQMVETELKLICCDILDVLDKHLIPAANTGESKVFYYKMKGDYHRYLAEFATGNDRKEAAENSLVAYKAASDIAMTELPPTHPIRLGLALNFSVFYYEILNSPDRACRLAKAAFDDAIAELDTLSEESYKDSTLIMQLLRDNLTLWTSDMQGDGEEQNKEALQDVEDENQ
5 MDDREDLVYQAKLAEQAERYDEMVESMKKVAGMDVELTVEERNLLSVAYKNVIGARRASWRIISSIEQKEENKGGEDKLKMIREYRQMVETELKLICCDILDVLDKHLIPAANTGESKVFYYKMKGDYHRYLAEFATGNDRKEAAENSLVAYKAASDIAMTELPPTHPIRLGLALNFSVFYYEILNSPDRACRLAKAAFDDAIAELDTLSEESYKDSTLIMQLLRDNLTLWTSDMQGDGEEQNKEALQDVEDENQ
6 MDDREDLVYQAKLAEQAERYDEMVESMKKVAGMDVELTVEERNLLSVAYKNVIGARRASWRIISSIEQKEENKGGEDKLKMIREYRQMVETELKLICCDILDVLDKHLIPAANTGESKVFYYKMKGDYHRYLAEFATGNDRKEAAENSLVAYKAASDIAMTELPPTHPIRLGLALNFSVFYYEILNSPDRACRLAKAAFDDAIAELDTLSEESYKDSTLIMQLLRDNLTLWTSDMQGDGEEQNKEALQDVEDENQ
> dev.off()
null device
1
>