Last data update: 2014.03.03
R: Cleavage of polypeptide sequences
Cleavage of polypeptide sequences
Description
This functions cleave polypeptide sequences. Use cleavageSites
to find
the cleavage sites, cleavageRanges
to find the cleavage
ranges and cleave
to get the cleavage products.
Usage
## S4 method for signature 'character'
cleave(x, enzym = "trypsin", missedCleavages = 0,
custom = NULL, unique = TRUE)
## S4 method for signature 'AAString'
cleave(x, enzym = "trypsin", missedCleavages = 0,
custom = NULL, unique = TRUE)
## S4 method for signature 'AAStringSet'
cleave(x, enzym = "trypsin", missedCleavages = 0,
custom = NULL, unique = TRUE)
## S4 method for signature 'character'
cleavageRanges(x, enzym = "trypsin", missedCleavages = 0,
custom = NULL)
## S4 method for signature 'AAString'
cleavageRanges(x, enzym = "trypsin", missedCleavages = 0,
custom = NULL)
## S4 method for signature 'AAStringSet'
cleavageRanges(x, enzym = "trypsin", missedCleavages = 0,
custom = NULL)
## S4 method for signature 'character'
cleavageSites(x, enzym = "trypsin", custom = NULL)
## S4 method for signature 'AAString'
cleavageSites(x, enzym = "trypsin", custom = NULL)
## S4 method for signature 'AAStringSet'
cleavageSites(x, enzym = "trypsin", custom = NULL)
Arguments
x
polypeptide sequences.
enzym
character
, cleavage rule.
missedCleavages
numeric
, number of missed
cleavages.
custom
character
, of length 1 or 2. Could be used to define own
cleaveage rules. The first element would be the pattern and the optional
second element would be an exception (non-cleavage) pattern. Perl-like
regular expressions are supported, see gregexpr
for
details. If custom
is set the enzym
is ignored.
unique
logical
, if TRUE
all duplicated cleavage products
per peptide are removed.
Details
The cleavage rules are taken from:
http://web.expasy.org/peptide_cutter/peptidecutter_enzymes.html
Cleavage rules (cleavage between P1 and P1'):
Rule name P4 P3 P2 P1 P1' P2'
arg-c proteinase
- - - R - -
asp-n endopeptidase
- - - - D -
bnps-skatole-c
- - - W - -
caspase1
F,W,Y,L - H,A,T D not P,E,D,Q,K,R
-
caspase2
D V A D not P,E,D,Q,K,R -
caspase3
D M Q D not P,E,D,Q,K,R -
caspase4
L E V D not P,E,D,Q,K,R -
caspase5
L,W E H D - -
caspase6
V E H,I D
not P,E,D,Q,K,R -
caspase7
D E V D not P,E,D,Q,K,R -
caspase8
I,L E T D not P,E,D,Q,K,R
-
caspase9
L E H D - -
caspase10
I E A D - -
chymotrypsin-high
- - - F,Y not P -
- - - W not M,P -
chymotrypsin-low
- - - F,L,Y not P
-
- - - W not M,P -
- - - M not P,Y -
- - - H not D,M,P,W -
clostripain
- - - R - -
cnbr
- - - M - -
enterokinase
D,E D,E D,E K - -
factor xa
A,F,G,I,L,T,V,M D,E G R -
-
formic acid
- - - D - -
glutamyl endopeptidase
- - - D - -
granzyme-b
I E P D - -
hydroxylamine
- - - N G -
iodosobenzoic acid
- - - W - -
lysc
- - - K - -
lysn
- - - - K -
neutrophil elastase
- - - A,V - -
ntcb
- - - - C -
pepsin1.3
- not H,K,R not P not R F,L,W,Y
not P
- not H,K,R not P F,L,W,Y - not P
pepsin
- not H,K,R not P not R F,L
not P
- not H,K,R not P F,L - not P
proline endopeptidase
- - not H,K,R P not P
-
proteinase k
- - - A,E,F,I,L,T,V,W,Y -
-
staphylococcal peptidase i
- - not E E -
-
thermolysin
- - - not D,E A,F,I,L,M,V
-
thrombin
- - G R G -
A,F,G,I,L,T,V,M A,F,G,I,L,T,V,W P R not D,E
not D,E
trypsin
- - - K,R not P -
- - W K P -
- - M R P -
Exceptions:
Rule name Enzyme name P4 P3 P2 P1 P1'
P2'
trypsin - - C,D K D -
-
- C K H,Y -
- - C R K -
- - R R H,R -
Rule name Enzyme name
arg-c proteinase
Arg-C proteinase
asp-n endopeptidase
Asp-N endopeptidase
bnps-skatole-c
BNPS-Skatole
caspase1
Caspase 1
caspase2
Caspase 2
caspase3
Caspase 3
caspase4
Caspase 4
caspase5
Caspase 5
caspase6
Caspase 6
caspase7
Caspase 7
caspase8
Caspase 8
caspase9
Caspase 9
caspase10
Caspase 10
chymotrypsin-high
Chymotrypsin-high specificity (C-term to [FYW], not before P)
chymotrypsin-low
Chymotrypsin-low specificity (C-term to [FYWML], not before P)
clostripain
Clostripain (Clostridiopeptidase B)
cnbr
CNBr
enterokinase
Enterokinase
factor xa
Factor Xa
formic acid
Formic acid
glutamyl endopeptidase
Glutamyl endopeptidase
granzyme-b
Granzyme B
hydroxylamine
Hydroxylamine
iodosobenzoic acid
Iodosobenzoic acid
lysc
LysC
lysn
LysN
neutrophil elastase
Neutrophil elastase
ntcb
NTCB (2-nitro-5-thiocyanobenzoic acid)
pepsin1.3
Pepsin (pH == 1.3)
pepsin
Pepsin (pH > 2)
proline endopeptidase
Proline-endopeptidase
proteinase k
Proteinase K
staphylococcal peptidase i
Staphylococcal Peptidase I
thermolysin
Thermolysin
thrombin
Thrombin
trypsin
Trypsin
Value
cleave
If x
is a character
it returns a list
of the same
length as x
. Each element contains a character
vector with
the corresponding cleavage products of the polypeptides.
If x
is an AAString
or an
AAStringSet
an
AAStringSet
or an
AAStringSetList
instance of the same length as
x
is returned.
Each element contains an
AAString
or an AAStringSet
instance with the corresponding cleavage products of the polypeptides.
cleavageRanges
If x
is a character
it returns a list
of the same
length as x
. Each element contains a two-column matrix
with
the start and end positions of the peptides.
If x
is an AAString
or an
AAStringSet
instance an
IRanges
or an IRangesList
of the
same length as x
is returned.
cleavageSites
Returns a list
of the same length as x
. Each element
contains an integer
vector with the cleavage positions.
Overview:
Input cleave
cleavageRanges
cleavageSites
character
list
of character
list
of matrix
list
of integer
AAString
AAStringSet
IRanges
list
of integer
AAStringSet
AAStringSetList
IRangesList
list
of integer
Author(s)
Sebastian Gibb <mail@sebastiangibb.de>
References
Gasteiger E., Hoogland C., Gattiker A., Duvaud S.,
Wilkins M.R., Appel R.D., Bairoch A.; "Protein
Identification and Analysis Tools on the ExPASy Server".
(In) John M. Walker (ed): The Proteomics Protocols
Handbook, Humana Press (2005).
http://web.expasy.org/peptide_cutter/peptidecutter_enzymes.html
See Also
AAString
,
AAStringSet
,
AAStringSetList
,
IRanges
,
IRangesList
Examples
library("cleaver")
## Gastric juice peptide 1 (UniProtKB/Swiss-Prot: GAJU_HUMAN/P01358)
gaju <- "LAAGKVEDSD"
cleave(gaju, "trypsin")
# $LAAGKVEDSD
# [1] "LAAGK" "VEDSD"
cleavageRanges(gaju, "trypsin")
# $LAAGKVEDSD
# start end
# [1,] 1 5
# [2,] 6 10
cleavageSites(gaju, "trypsin")
# $LAAGKVEDSD
# [1] 5
cleave(gaju, "trypsin", missedCleavages=1)
# $LAAGKVEDSD
# [1] "LAAGKVEDSD"
cleavageRanges(gaju, "trypsin", missedCleavages=1)
# $LAAGKVEDSD
# start end
# [1,] 1 10
cleave(gaju, "trypsin", missedCleavages=0:1)
# $LAAGKVEDSD
# [1] "LAAGK" "VEDSD" "LAAGKVEDSD"
cleavageRanges(gaju, "trypsin", missedCleavages=0:1)
# $LAAGKVEDSD
# start end
# [1,] 1 5
# [2,] 6 10
# [3,] 1 10
cleave(gaju, "pepsin")
# $LAAGKVEDSD
# [1] "LAAGKVEDSD"
# (no cleavage)
## use AAStringSet
gaju <- AAStringSet("LAAGKVEDSD")
cleave(gaju)
# AAStringSetList of length 1
# [["LAAGKVEDSD"]] LAAGK VEDSD
## Beta-enolase (UniProtKB/Swiss-Prot: ENOB_THUAL/P86978)
enob <- "SITKIKAREILD"
cleave(enob, "trypsin")
# $SITKIKAREILD
# [1] "SITK" "IK" "AR" "EILD"
cleave(enob, "trypsin", missedCleavages=2)
# $SITKIKAREILD
# [1] "SITKIKAR" "IKAREILD"
cleave(enob, "trypsin", missedCleavages=0:2)
# $SITKIKAREILD
# [1] "SITK" "IK" "AR" "EILD" "SITKIK" "IKAR"
# [7] "AREILD" "SITKIKAR" "IKAREILD"
## define own cleavage rule: cleave at K
cleave(enob, custom="K")
# $SITKIKAREILD
# [1] "SITK" "IK" "AREILD"
cleavageRanges(enob, custom="K")
# $SITKIKAREILD
# start end
# [1,] 1 4
# [2,] 5 6
# [3,] 7 12
## define own cleavage rule: cleave at K but not if followed by A
cleave(enob, custom=c("K", "K(?=A)"))
# $SITKIKAREILD
# [1] "SITK" "IKAREILD"
cleavageRanges(enob, custom=c("K", "K(?=A)"))
# $SITKIKAREILD
# start end
# [1,] 1 4
# [2,] 5 12
cleavageSites(enob, custom=c("K", "K(?=A)"))
# $SITKIKAREILD
# [1] 4
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(cleaver)
Loading required package: Biostrings
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Loading required package: S4Vectors
Loading required package: stats4
Attaching package: 'S4Vectors'
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: IRanges
Loading required package: XVector
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/cleaver/cleave-methods.Rd_%03d_medium.png", width=480, height=480)
> ### Name: cleave
> ### Title: Cleavage of polypeptide sequences
> ### Aliases: cleave cleave,character-method cleave,AAString-method
> ### cleave,AAStringSet-method cleavageRanges
> ### cleavageRanges,character-method cleavageRanges,AAString-method
> ### cleavageRanges,AAStringSet-method cleavageSites
> ### cleavageSites,character-method cleavageSites,AAString-method
> ### cleavageSites,AAStringSet-method
> ### Keywords: methods
>
> ### ** Examples
>
> library("cleaver")
>
> ## Gastric juice peptide 1 (UniProtKB/Swiss-Prot: GAJU_HUMAN/P01358)
> gaju <- "LAAGKVEDSD"
>
> cleave(gaju, "trypsin")
$LAAGKVEDSD
[1] "LAAGK" "VEDSD"
> # $LAAGKVEDSD
> # [1] "LAAGK" "VEDSD"
>
> cleavageRanges(gaju, "trypsin")
$LAAGKVEDSD
start end
[1,] 1 5
[2,] 6 10
> # $LAAGKVEDSD
> # start end
> # [1,] 1 5
> # [2,] 6 10
>
> cleavageSites(gaju, "trypsin")
$LAAGKVEDSD
[1] 5
> # $LAAGKVEDSD
> # [1] 5
>
> cleave(gaju, "trypsin", missedCleavages=1)
$LAAGKVEDSD
[1] "LAAGKVEDSD"
> # $LAAGKVEDSD
> # [1] "LAAGKVEDSD"
>
> cleavageRanges(gaju, "trypsin", missedCleavages=1)
$LAAGKVEDSD
start end
[1,] 1 10
> # $LAAGKVEDSD
> # start end
> # [1,] 1 10
>
> cleave(gaju, "trypsin", missedCleavages=0:1)
$LAAGKVEDSD
[1] "LAAGK" "VEDSD" "LAAGKVEDSD"
> # $LAAGKVEDSD
> # [1] "LAAGK" "VEDSD" "LAAGKVEDSD"
>
> cleavageRanges(gaju, "trypsin", missedCleavages=0:1)
$LAAGKVEDSD
start end
[1,] 1 5
[2,] 6 10
[3,] 1 10
> # $LAAGKVEDSD
> # start end
> # [1,] 1 5
> # [2,] 6 10
> # [3,] 1 10
>
>
> cleave(gaju, "pepsin")
$LAAGKVEDSD
[1] "LAAGKVEDSD"
> # $LAAGKVEDSD
> # [1] "LAAGKVEDSD"
> # (no cleavage)
>
>
> ## use AAStringSet
> gaju <- AAStringSet("LAAGKVEDSD")
>
> cleave(gaju)
AAStringSetList of length 1
[["LAAGKVEDSD"]] LAAGK VEDSD
> # AAStringSetList of length 1
> # [["LAAGKVEDSD"]] LAAGK VEDSD
>
>
> ## Beta-enolase (UniProtKB/Swiss-Prot: ENOB_THUAL/P86978)
> enob <- "SITKIKAREILD"
>
> cleave(enob, "trypsin")
$SITKIKAREILD
[1] "SITK" "IK" "AR" "EILD"
> # $SITKIKAREILD
> # [1] "SITK" "IK" "AR" "EILD"
>
> cleave(enob, "trypsin", missedCleavages=2)
$SITKIKAREILD
[1] "SITKIKAR" "IKAREILD"
> # $SITKIKAREILD
> # [1] "SITKIKAR" "IKAREILD"
>
> cleave(enob, "trypsin", missedCleavages=0:2)
$SITKIKAREILD
[1] "SITK" "IK" "AR" "EILD" "SITKIK" "IKAR" "AREILD"
[8] "SITKIKAR" "IKAREILD"
> # $SITKIKAREILD
> # [1] "SITK" "IK" "AR" "EILD" "SITKIK" "IKAR"
> # [7] "AREILD" "SITKIKAR" "IKAREILD"
>
> ## define own cleavage rule: cleave at K
> cleave(enob, custom="K")
$SITKIKAREILD
[1] "SITK" "IK" "AREILD"
> # $SITKIKAREILD
> # [1] "SITK" "IK" "AREILD"
>
> cleavageRanges(enob, custom="K")
$SITKIKAREILD
start end
[1,] 1 4
[2,] 5 6
[3,] 7 12
> # $SITKIKAREILD
> # start end
> # [1,] 1 4
> # [2,] 5 6
> # [3,] 7 12
>
> ## define own cleavage rule: cleave at K but not if followed by A
> cleave(enob, custom=c("K", "K(?=A)"))
$SITKIKAREILD
[1] "SITK" "IKAREILD"
> # $SITKIKAREILD
> # [1] "SITK" "IKAREILD"
>
> cleavageRanges(enob, custom=c("K", "K(?=A)"))
$SITKIKAREILD
start end
[1,] 1 4
[2,] 5 12
> # $SITKIKAREILD
> # start end
> # [1,] 1 4
> # [2,] 5 12
>
> cleavageSites(enob, custom=c("K", "K(?=A)"))
$SITKIKAREILD
[1] 4
> # $SITKIKAREILD
> # [1] 4
>
>
>
>
>
>
> dev.off()
null device
1
>