Last data update: 2014.03.03
readMzTabData R Documentation
Read an 'mzTab' file
Description
This function can be used to create an "MSnSet" by reading and parsing an mzTab file. The metadata section is always used to populate the MSnSet's experimentData()@other$mzTab slot.
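As a minimal sketch of how that metadata might be inspected, the snippet below reads the protein section of the example file used further down and pulls out the slot named above. It assumes MSnbase is installed and that the example URL is still reachable; the variable name mtd is only illustrative, and the structure of the returned object is not described in this page, so str() is used to explore it.

```r
library(MSnbase)

# Same example file as in the Examples section of this page
testfile <- "https://raw.githubusercontent.com/HUPO-PSI/mzTab/master/examples/PRIDE_Exp_Complete_Ac_16649.xml-mztab.txt"
prot <- readMzTabData(testfile, "PRT")

# Metadata parsed from the mzTab metadata section, stored in the
# slot named in the Description ('mtd' is an illustrative name)
mtd <- experimentData(prot)@other$mzTab

# Look at the top level only; the exact layout is not documented here
str(mtd, max.level = 1)
```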
Usage
readMzTabData(file, what = c("PRT", "PEP", "PSM"), version = c("1.0",
"0.9"), verbose = TRUE)
Arguments
file
A character with the mzTab file to be read in.
what
One of "PRT", "PEP" or "PSM", defining which of the protein, peptide or PSM sections should be returned as an MSnSet.
version
A character defining the format specification version of the mzTab file. Default is "1.0". Version "0.9" is available for backwards compatibility. See readMzTabData_v0.9 for details.
verbose
A logical; if TRUE, produce verbose output.
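The vector defaults for what and version in the Usage section follow the usual R idiom in which match.arg() picks the first element when the argument is not supplied. The sketch below illustrates that idiom with a hypothetical stand-in function, pick_section(); it is base R only and is an assumption about how such defaults typically behave, not a copy of the MSnbase source.

```r
# Hypothetical stand-in for the argument handling; not MSnbase code
pick_section <- function(what = c("PRT", "PEP", "PSM"),
                         version = c("1.0", "0.9")) {
  what <- match.arg(what)        # falls back to the first choice, "PRT"
  version <- match.arg(version)  # falls back to the first choice, "1.0"
  c(what = what, version = version)
}

pick_section()               # both arguments take their first choice
pick_section("PSM", "0.9")   # explicit choices are passed through

# A value outside the allowed set raises an error rather than being accepted
res <- tryCatch(pick_section("XYZ"), error = function(e) conditionMessage(e))
res
```

Under this idiom, readMzTabData(file) and readMzTabData(file, "PRT") would be expected to request the same section.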
Value
An instance of class MSnSet.
Author(s)
Laurent Gatto
See Also
See MzTab and MSnSetList for details about the internals of readMzTabData.
Examples
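library(MSnbase)

## Example mzTab file from the HUPO-PSI repository
testfile <- "https://raw.githubusercontent.com/HUPO-PSI/mzTab/master/examples/PRIDE_Exp_Complete_Ac_16649.xml-mztab.txt"

## Protein section: feature metadata and abundance values
prot <- readMzTabData(testfile, "PRT")
prot
head(fData(prot))
head(exprs(prot))

## PSM section: feature metadata
psms <- readMzTabData(testfile, "PSM")
psms
head(fData(psms))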
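The modifications feature variable returned for the protein section holds comma-separated entries such as 42-MOD:00425, where the part before the first hyphen is a position and the rest is a modification accession. The sketch below splits one such string, copied from the results on this page, into a small table. The helper name parse_mods is hypothetical, and the code assumes only this simple position-accession form; richer forms allowed by the mzTab specification are not handled.

```r
# Hypothetical helper: split a "pos-ACC,pos-ACC" string into a data frame
parse_mods <- function(x) {
  if (is.na(x) || !nzchar(x)) {
    # No modifications reported for this feature
    return(data.frame(position = integer(0), accession = character(0),
                      stringsAsFactors = FALSE))
  }
  parts <- strsplit(x, ",", fixed = TRUE)[[1]]
  data.frame(
    position  = as.integer(sub("-.*$", "", parts)),  # text before first hyphen
    accession = sub("^[^-]*-", "", parts),           # text after first hyphen
    stringsAsFactors = FALSE
  )
}

mods <- parse_mods("42-MOD:00425,616-MOD:00425,749-MOD:00425,950-MOD:00425")
mods
table(mods$accession)
```

Applied to a whole column, something like lapply(fData(prot)$modifications, parse_mods) would give one table per protein, with zero rows where the value is NA.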
Results
> library(MSnbase)
This is MSnbase version 1.20.7
Read '?MSnbase' and references therein for information
about the package and how to get started.
> ### Name: readMzTabData
> ### Title: Read an 'mzTab' file
> ### Aliases: readMzTabData
>
> ### ** Examples
>
> testfile <- "https://raw.githubusercontent.com/HUPO-PSI/mzTab/master/examples/PRIDE_Exp_Complete_Ac_16649.xml-mztab.txt"
> prot <- readMzTabData(testfile, "PRT")
> prot
MSnSet (storageMode: lockedEnvironment)
assayData: 1249 features, 4 samples
  element names: exprs
protocolData: none
phenoData: none
featureData
  featureNames: X223462890 X19855078 ... X26329627 (1249 total)
  fvarLabels: accession description ... protein_coverage (15 total)
  fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
  pubMedIds: pubmed:21398567
Annotation:
- - - Processing information - - -
 MSnbase version: 1.20.7
> head(fData(prot))
           accession
X223462890 223462890
X19855078   19855078
X21450277   21450277
X6978545     6978545
X51315739   51315739
X117938332 117938332
description
X223462890 Spna2 protein [Mus musculus]
X19855078 RecName: Full=Sodium/potassium-transporting ATPase subunit alpha-3; Short=Na(+)/K(+) ATPase alpha-3 subunit; AltName: Full=Na(+)/K(+) ATPase alpha(III) subunit; AltName: Full=Sodium pump subunit alpha-3
X21450277 sodium/potassium-transporting ATPase subunit alpha-1 precursor [Mus musculus]
X6978545 sodium/potassium-transporting ATPase subunit alpha-2 precursor [Rattus norvegicus]
X51315739 RecName: Full=Protein bassoon
X117938332 spectrin beta chain, brain 1 isoform 1 [Mus musculus]
           taxid              species       database database_version
X223462890 10090 Mus musculus (Mouse) NCBInr_2010_10  nr_101020.fasta
X19855078  10090 Mus musculus (Mouse) NCBInr_2010_10  nr_101020.fasta
X21450277  10090 Mus musculus (Mouse) NCBInr_2010_10  nr_101020.fasta
X6978545   10090 Mus musculus (Mouse) NCBInr_2010_10  nr_101020.fasta
X51315739  10090 Mus musculus (Mouse) NCBInr_2010_10  nr_101020.fasta
X117938332 10090 Mus musculus (Mouse) NCBInr_2010_10  nr_101020.fasta
                        search_engine best_search_engine_score.1.
X223462890 [MS, MS:1001207, Mascot, ]                     6539.67
X19855078  [MS, MS:1001207, Mascot, ]                     6331.91
X21450277  [MS, MS:1001207, Mascot, ]                     4577.11
X6978545   [MS, MS:1001207, Mascot, ]                     4342.81
X51315739  [MS, MS:1001207, Mascot, ]                     4177.55
X117938332 [MS, MS:1001207, Mascot, ]                     4001.66
           search_engine_score.1._ms_run.1. num_psms_ms_run.1.
X223462890                          6539.67                157
X19855078                           6331.91                144
X21450277                           4577.11                112
X6978545                            4342.81                108
X51315739                           4177.55                100
X117938332                          4001.66                109
           num_peptides_distinct_ms_run.1. num_peptides_unique_ms_run.1.
X223462890                              92                            NA
X19855078                               49                            NA
X21450277                               39                            NA
X6978545                                42                            NA
X51315739                               59                            NA
X117938332                              72                            NA
           ambiguity_members
X223462890                NA
X19855078                 NA
X21450277                 NA
X6978545                  NA
X51315739                 NA
X117938332                NA
modifications
X223462890 <NA>
X19855078 32-MOD:00425,525-MOD:00425,606-MOD:00425,725-MOD:00425,739-MOD:00425,940-MOD:00425
X21450277 42-MOD:00425,616-MOD:00425,749-MOD:00425,950-MOD:00425
X6978545 40-MOD:00425,613-MOD:00425,746-MOD:00425,947-MOD:00425
X51315739 <NA>
X117938332 <NA>
           protein_coverage
X223462890                0
X19855078                 0
X21450277                 0
X6978545                  0
X51315739                 0
X117938332                0
> head(exprs(prot))
           protein_abundance_assay.1. protein_abundance_assay.2.
X223462890                          1                      0.853
X19855078                          NA                         NA
X21450277                           1                      0.776
X6978545                            1                      0.784
X51315739                          NA                         NA
X117938332                          1                      0.865
           protein_abundance_assay.3. protein_abundance_assay.4.
X223462890                      0.864                      0.791
X19855078                          NA                         NA
X21450277                       0.819                      0.687
X6978545                        0.848                      0.693
X51315739                          NA                         NA
X117938332                      0.861                      0.795
> psms <- readMzTabData(testfile, "PSM")
> psms
MSnSet (storageMode: lockedEnvironment)
assayData: 8761 features, 0 samples
  element names: exprs
protocolData: none
phenoData: none
featureData
  featureNames: X1661 X2280 ... X20346 (8761 total)
  fvarLabels: sequence PSM_ID ... end (18 total)
  fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
  pubMedIds: pubmed:21398567
Annotation:
- - - Processing information - - -
 MSnbase version: 1.20.7
> head(fData(psms))
      sequence PSM_ID accession unique       database database_version
X1661   QQVLDR   1661 223462890     NA NCBInr_2010_10  nr_101020.fasta
X2280   LVQYLR   2280 223462890     NA NCBInr_2010_10  nr_101020.fasta
X2281   LVQYLR   2281 223462890     NA NCBInr_2010_10  nr_101020.fasta
X2537   LQQLFR   2537 223462890     NA NCBInr_2010_10  nr_101020.fasta
X2809 EAGSVSLR   2809 223462890     NA NCBInr_2010_10  nr_101020.fasta
X5465 LSILSEER   5465 223462890     NA NCBInr_2010_10  nr_101020.fasta
                   search_engine search_engine_score.1. modifications
X1661 [MS, MS:1001207, Mascot, ]                  37.76   0-MOD:01499
X2280 [MS, MS:1001207, Mascot, ]                  44.64   0-MOD:01499
X2281 [MS, MS:1001207, Mascot, ]                  44.76   0-MOD:01499
X2537 [MS, MS:1001207, Mascot, ]                  45.41   0-MOD:01499
X2809 [MS, MS:1001207, Mascot, ]                  55.05   0-MOD:01499
X5465 [MS, MS:1001207, Mascot, ]                  39.82   0-MOD:01499
      retention_time charge exp_mass_to_charge calc_mass_to_charge
X1661             NA      1           902.4821            902.5181
X2280             NA      1           935.5775            935.5800
X2281             NA      1           935.5833            935.5800
X2537             NA      1           948.5956            948.5753
X2809             NA      1           962.5098            962.5393
X5465             NA      1          1090.6232           1090.6230
                  spectra_ref pre post start  end
X1661 ms_run[1]:spectrum=1661   R    Y    20   25
X2280 ms_run[1]:spectrum=2280   K    E   151  156
X2281 ms_run[1]:spectrum=2281   K    E   151  156
X2537 ms_run[1]:spectrum=2537   R    D   786  791
X2809 ms_run[1]:spectrum=2809   K    M  1058 1065
X5465 ms_run[1]:spectrum=5465   K    T   442  449
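The PSM feature variables above lend themselves to simple derived quantities. As a rough sketch, the code below rebuilds the first three rows by hand (values copied from the output above) and computes the relative mass error in parts per million from exp_mass_to_charge and calc_mass_to_charge, then splits spectra_ref into a run index and a spectrum index. The column names ppm_error, ms_run and spectrum are illustrative additions, and the regular expressions assume the ms_run[n]:spectrum=m form seen in this file only.

```r
# First three PSMs, typed in from the head(fData(psms)) output
psm <- data.frame(
  sequence            = c("QQVLDR", "LVQYLR", "LVQYLR"),
  exp_mass_to_charge  = c(902.4821, 935.5775, 935.5833),
  calc_mass_to_charge = c(902.5181, 935.5800, 935.5800),
  spectra_ref         = c("ms_run[1]:spectrum=1661",
                          "ms_run[1]:spectrum=2280",
                          "ms_run[1]:spectrum=2281"),
  stringsAsFactors = FALSE
)

# Relative difference between observed and calculated m/z, in ppm
psm$ppm_error <- with(psm,
  (exp_mass_to_charge - calc_mass_to_charge) / calc_mass_to_charge * 1e6)

# Pull the run and spectrum indices out of the spectra_ref string
psm$ms_run   <- as.integer(sub("^ms_run\\[([0-9]+)\\].*$", "\\1", psm$spectra_ref))
psm$spectrum <- as.integer(sub("^.*spectrum=([0-9]+)$", "\\1", psm$spectra_ref))

psm[, c("sequence", "ppm_error", "ms_run", "spectrum")]
```

With a real object, the same expressions could be applied to fData(psms) directly instead of a hand-built data frame.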