Last data update: 2014.03.03

R: Parse an mzIdentML file
mzID R Documentation

Parse an mzIdentML file

Description

This function takes one or more mzIdentML files and parses them. A single file yields an mzID object; several files yield an mzIDCollection object.

Usage

mzID(file, verbose = TRUE)

Arguments

file

A character string giving the location of the mzIdentML file to be parsed. A character vector of several locations may also be given (see Details).

verbose

Logical. Should information be printed to the console? Default is TRUE.

Details

The mzID function uses the XML package to read the content of an mzIdentML file and store it in an mzID object. In contrast to the way mzR handles mzML files, mzID parses everything in one chunk. Memory can therefore become a problem for very large datasets, but because mzIdentML files are not indexed, accessing the data dynamically would be inefficient.
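
Because the whole file is read in one chunk, it can be worth checking file sizes before parsing. The following is only a sketch: the helper name filesAboveLimit and the 500 MB default are assumptions made for illustration and are not part of mzID.

```r
# Hypothetical helper (not part of mzID): return the files larger than limitMB.
# The default limit is an arbitrary illustration, not a recommendation.
filesAboveLimit <- function(files, limitMB = 500) {
  sizeMB <- file.size(files) / 1024^2  # file.size() gives bytes, NA if missing
  files[!is.na(sizeMB) & sizeMB > limitMB]
}

# Self-contained demonstration with a 2 KiB temporary file
tmp <- tempfile(fileext = ".mzid")
writeBin(raw(2048), tmp)
flagged <- filesAboveLimit(tmp, limitMB = 0.001)  # tiny limit: file is flagged
notFlagged <- filesAboveLimit(tmp, limitMB = 1)   # 1 MB limit: nothing flagged
print(basename(flagged))
unlink(tmp)
```

Files returned by such a check could then be parsed one at a time rather than as part of a larger batch.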

If multiple filenames are passed to the function, they are processed in parallel using foreach and doParallel. The number of workers spawned is the number of available cores or the number of files to parse, whichever is smaller. In this case the return value is an mzIDCollection object. Files that cannot be parsed are left out of the returned object and a warning is issued; no error is thrown.
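
The two rules above can be sketched in plain R. Both helpers, workerCount and droppedFiles, are hypothetical and do not reflect mzID's internal code. droppedFiles further assumes that the names of an mzIDCollection are the file base names without extension, as the names(collection) output in the Results section suggests.

```r
# Hypothetical helper: the smaller of available cores and number of files.
workerCount <- function(nFiles, cores = parallel::detectCores()) {
  if (is.na(cores)) cores <- 1L  # detectCores() may return NA
  min(cores, nFiles)
}

# Hypothetical helper: which input files are absent from the parsed names?
# Assumes names are base names with the extension removed.
droppedFiles <- function(files, parsedNames) {
  expected <- tools::file_path_sans_ext(basename(files))
  files[!expected %in% parsedNames]
}

# Illustration with made-up paths and names
files <- c("/data/a.mzid", "/data/b.mzid", "/data/c.mzid")
parsed <- c("a", "c")  # pretend "b" could not be parsed
dropped <- droppedFiles(files, parsed)
print(dropped)
print(workerCount(length(files), cores = 8))
```

With a real collection, the call would be droppedFiles(files, names(collection)), which gives a quick way to see which files triggered the warning.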

Value

An mzID object, or an mzIDCollection object if multiple files are passed.

See Also

mzID-class, mzIDCollection-class

Examples


# Parsing of the example files provided by HUPO:
# The pattern argument is a regular expression, so match the extension explicitly
exampleFiles <- list.files(system.file('extdata', package = 'mzID'),
                           pattern = '\\.mzid$', full.names = TRUE)
mzID(exampleFiles[1])

mzID(exampleFiles[2])

mzID(exampleFiles[3])

mzID(exampleFiles[4])

mzID(exampleFiles[5])

mzID(exampleFiles[6])

mzID(exampleFiles[7])

mzID(exampleFiles[8])

mzID(exampleFiles[9])

# Parsing into an mzIDCollection
collection <- mzID(exampleFiles[1:3])
names(collection)
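
A note on selecting files: list.files() treats its pattern argument as a regular expression, not a shell glob. The sketch below uses glob2rx() from the utils package to translate a glob into an equivalent regular expression. The file names in it are made up for illustration.

```r
# Translate a shell-style glob into a regular expression
rx <- utils::glob2rx("*.mzid")

# Made-up file names to show what the translated pattern accepts
candidates <- c("search.mzid", "search.mzid.bak", "searchmzid")
matches <- grepl(rx, candidates)
print(data.frame(candidates, matches))
```

The resulting expression can be passed directly as the pattern argument of list.files().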

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(mzID)
> ### Name: mzID
> ### Title: Parse an mzIdentML file
> ### Aliases: mzID
> 
> ### ** Examples
> 
> 
> # Parsing of the example files provided by HUPO:
> exampleFiles <- list.files(system.file('extdata', package = 'mzID'), 
+                            pattern = '*.mzid', full.names = TRUE)
> mzID(exampleFiles[1])
reading 55merge_omssa.mzid... DONE!
An mzID object

Software used:   OMSSA (version: NA)

Rawfile:         D:/TestSpace/NeoTestMarch2011/55merge.mgf

Database:        D:/Software/Databases/Neospora_3rndTryp/Neo_rndTryp_3times.fasta

Number of scans: 39
Number of PSM's: 99
> 
> mzID(exampleFiles[2])
reading 55merge_tandem.mzid... DONE!
An mzID object

Software used:   X!Tandem (version: x! tandem CYCLONE (2010.06.01.5))

Rawfile:         D:/TestSpace/NeoTestMarch2011/55merge.mgf

Database:        D:/Software/Databases/Neospora_3rndTryp/Neo_rndTryp_3times.fasta.pro

Number of scans: 169
Number of PSM's: 170
> 
> mzID(exampleFiles[3])
reading MPC_example_Multiple_search_engines.mzid... DONE!
An mzID object

Software used:   Sequest (version: PVM Slave v.27 (rev. 12))
                 Mascot (version 2.2.0)
                 ProteinScape (version NA)

Rawfile:         proteinscape://www.medizinisches-proteom-center.de/PSServer/Project/Sample/Separation_1D_LC/Fraction_X

Database:        file://www.medizinisches-proteom-center.de/DBServer/ipi.HUMAN/3.15/ipi.HUMAN_decoy.fasta

Number of scans: 18
Number of PSM's: 22
> 
> mzID(exampleFiles[4])
reading Mascot_MSMS_example.mzid... DONE!
An mzID object

Software used:   Mascot (version: 2.2.03)
                 Mascot Parser (version 2.3.0.0)

Rawfile:         file:///dyckall.asc

Database:        file:///C:/inetpub/mascot/sequence/SwissProt/current/SwissProt_51.6.fasta

Number of scans: 4
Number of PSM's: 40
> 
> mzID(exampleFiles[5])
reading Mascot_MSMS_example1.0.mzid... DONE!
An mzID object

Software used:   Mascot (version: 2.2.03)
                 Mascot Parser (version 2.3.0.0)

Rawfile:         file:///dyckall.asc

Database:        file:///C:/inetpub/mascot/sequence/SwissProt/current/SwissProt_51.6.fasta

Number of scans: 4
Number of PSM's: 40
> 
> mzID(exampleFiles[6])
reading Mascot_NA_example.mzid... DONE!
An mzID object

Software used:   Mascot (version: 2.2.03)
                 Mascot Parser (version 2.3.0.0)

Rawfile:         file:///est_coding_test.mgf

Database:        file:///C:/inetpub/mascot/sequence/EST_mini/current/EST_mini_20080623.fasta

Number of scans: 4
Number of PSM's: 4
> 
> mzID(exampleFiles[7])
reading Mascot_top_down_example.mzid... DONE!
An mzID object

Software used:   Mascot (version: 2.2.03)
                 Mascot Parser (version 2.3.0.0)

Rawfile:         file:///MYOGLOBIN_ECD.mgf

Database:        file:///C:/inetpub/mascot/sequence/SwissProt/current/SwissProt_51.6.fasta

Number of scans: 1
Number of PSM's: 5
> 
> mzID(exampleFiles[8])
reading Sequest_example_ver1.1.mzid... DONE!
An mzID object

Software used:   Sequest (version: PVM Master v.27 (rev. 12), (c) 1998-2007)

Rawfiles:        file://www.medizinisches-proteom-center.de/martinlap/C/Eisi/MPC/ProCon/SEQUEST/example_folder/PMXPWE080620_38.187.257.1.dta
                 file://www.medizinisches-proteom-center.de/martinlap/C/Eisi/MPC/ProCon/SEQUEST/example_folder/PMXPWE080620_38.2.69.1.dta
                 file://www.medizinisches-proteom-center.de/martinlap/C/Eisi/MPC/ProCon/SEQUEST/example_folder/PMXPWE080620_38.282.282.1.dta
                 file://www.medizinisches-proteom-center.de/martinlap/C/Eisi/MPC/ProCon/SEQUEST/example_folder/PMXPWE080620_38.505.808.1.dta
                 file://www.medizinisches-proteom-center.de/martinlap/C/Eisi/MPC/ProCon/SEQUEST/example_folder/PMXPWE080620_38.687.687.2.dta
                 file://www.medizinisches-proteom-center.de/martinlap/C/Eisi/MPC/ProCon/SEQUEST/example_folder/PMXPWE080620_38.687.687.3.dta
                 file://www.medizinisches-proteom-center.de/martinlap/C/Eisi/MPC/ProCon/SEQUEST/example_folder/PMXPWE080620_38.693.693.2.dta

Database:        file://www.medizinisches-proteom-center.de/sequestmaster/work/Datenbank/StdCry_nr.fasta

Number of scans: 7
Number of PSM's: 88
Warning message:
In mzIDpsm(doc, ns, addFinalizer = addFinalizer) :
  NAs introduced by coercion
> 
> mzID(exampleFiles[9])
reading mascot_pmf_example.mzid... DONE!
An mzID object

Software used:   Mascot (version: 2.3.02)
                 Mascot Parser (version 2.3.3.0)

Rawfile:         file:///

Database:        file:////usr/local/mascot/sequence/SwissProt/current/SwissProt_57.15.fasta

Number of scans: 1
Number of PSM's: 629
Warning message:
In mzIDpsm(doc, ns, addFinalizer = addFinalizer) :
  NAs introduced by coercion
> 
> # Parsing into an mzIDCollection
> collection <- mzID(exampleFiles[1:3])
starting worker pid=30587 on localhost:11859 at 00:45:32.060
starting worker pid=30596 on localhost:11859 at 00:45:32.177
starting worker pid=30605 on localhost:11859 at 00:45:32.297
Loading required package: mzID
Loading required package: mzID
Loading required package: mzID
loaded mzID and set parent environment
loaded mzID and set parent environment
loaded mzID and set parent environment
reading MPC_example_Multiple_search_engines.mzid...
reading 55merge_omssa.mzid...
reading 55merge_tandem.mzid...
MPC_example_Multiple_search_engines.mzid DONE!
55merge_omssa.mzid DONE!
55merge_tandem.mzid DONE!
> names(collection)
[1] "55merge_omssa"                       "55merge_tandem"                     
[3] "MPC_example_Multiple_search_engines"
> 