countDnaKmers
R Documentation
countDnaKmers: Counting k-mers in DNA sequence.
Description
Counts occurrences of DNA k-mers in a given DNA sequence.
The k-mers are counted within a set of search windows,
which are defined by the start and width parameters.
Each value in the start vector defines the left border
of a search window,
and the corresponding value in the width vector gives its size.
At each position of a search window, the k-mer that begins at that position
and extends to the right on the DNA sequence is counted.
The function is intended for counting DNA k-mers in selected regions (e.g. exons)
of a DNA sequence.
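The window semantics described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: count_kmers_sketch is a hypothetical name, width is assumed to be at least 1, and any k-mer containing an 'N' is assumed to be dropped.

```r
# Hypothetical helper (not part of seqTools): counts k-mers per search window.
count_kmers_sketch <- function(dna, k, start, width) {
  dna <- toupper(dna)                       # capitalization does not matter
  width <- rep_len(width, length(start))    # recycle a single width value
  # Enumerate all 4^k k-mers in alphabetical order (last letter varies fastest).
  grid <- expand.grid(rep(list(c("A", "C", "G", "T")), k),
                      stringsAsFactors = FALSE)
  kmers <- do.call(paste0, unname(as.list(rev(grid))))
  counts <- sapply(seq_along(start), function(i) {
    # Every position in {start, ..., start + width - 1} starts one k-mer.
    pos <- start[i]:(start[i] + width[i] - 1)
    found <- substring(dna, pos, pos + k - 1)
    # Assumption: k-mers containing 'N' are skipped.
    found <- found[!grepl("N", found, fixed = TRUE)]
    as.integer(table(factor(found, levels = kmers)))
  })
  # One row per k-mer, one column per search window.
  matrix(counts, nrow = length(kmers),
         dimnames = list(kmers, as.character(start)))
}

count_kmers_sketch("ATAAATA", 2, 1:3, 3)
```

For this call, the window starting at position 1 covers the k-mers AT, TA and AA, so the sketch reports one count each for AA, AT and TA in the first column.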
Usage
countDnaKmers(dna,k,start,width)
Arguments
dna
character. Single DNA sequence (vector of length 1).
dna must not contain characters other than "ATCGN".
Capitalization does not matter.
When an 'N' character is found, the current DNA k-mer is skipped.
k
numeric. Length (number of nucleotides) of the counted DNA k-mers.
start
numeric. Vector of (1-based) start positions of the
search windows (reading frames).
Each window extends from its start position to the right along the DNA string.
width
numeric. Defines the size of the search window for each start
position. Must have the same length as start or length 1
(in which case the value of width is recycled).
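The argument rules above can be checked before calling the function. The following is a minimal sketch, assuming a hypothetical helper check_kmer_args; its error messages are illustrative and are not the messages produced by seqTools.

```r
# Hypothetical pre-check helper (not part of seqTools).
check_kmer_args <- function(dna, start, width) {
  # dna must be a single character string.
  stopifnot(is.character(dna), length(dna) == 1)
  # Only A, T, C, G and N are allowed; case is ignored.
  if (grepl("[^ATCGN]", toupper(dna))) {
    stop("dna contains characters other than A, T, C, G, N")
  }
  # A single width value is recycled over all start positions.
  if (length(width) == 1) {
    width <- rep(width, length(start))
  }
  if (length(width) != length(start)) {
    stop("width must have length 1 or the same length as start")
  }
  # One row per search window.
  data.frame(start = start, width = width)
}

check_kmer_args("ataaata", 1:3, 3)
```

The returned data frame lists one search window per row, which makes the effect of recycling width visible.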
Details
For each search window, k-mers are counted starting at every position in
{start,...,start+width-1}.
Because each k-mer spans k nucleotides,
the last position at which a k-mer can start is nchar(dna)-k+1.
The function throws the error 'Search region exceeds string end'
when start + width + k > nchar(dna) + 2 for any window,
i.e. when the last k-mer of a window would extend past the end of the sequence.
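The bound follows from simple arithmetic: the last k-mer of a window starts at start+width-1 and ends at start+width+k-2, which must not exceed nchar(dna). A small sketch of that check, where exceeds_end is a hypothetical helper and not a seqTools function:

```r
# Hypothetical helper: TRUE for each window whose last k-mer would run
# past the end of the sequence, following the documented condition.
exceeds_end <- function(dna, k, start, width) {
  start + width + k > nchar(dna) + 2
}

dna <- "ATAAATA"              # 7 nucleotides
exceeds_end(dna, 2, 1:3, 3)   # windows from the example on this page
exceeds_end(dna, 2, 4, 3)     # last k-mer covers positions 6-7: still allowed
exceeds_end(dna, 2, 5, 3)     # last k-mer would cover positions 7-8: too far
```

Running such a check first makes it possible to trim or drop offending windows instead of hitting the error.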
Value
matrix. Each column contains the k-mer counts for one search window.
The column names are the values in the start vector.
Each row represents one DNA k-mer.
The sequence of the k-mer is given as the row name.
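Because the result is an ordinary matrix with k-mers as row names and start positions as column names, it can be indexed and summarised with base R. The sketch below rebuilds by hand the matrix that the example on this page produces (the storage mode of the real return value is an assumption) and shows a few typical operations.

```r
# Rebuild the documented result of countDnaKmers("ATAAATA", 2, 1:3, 3) by hand.
bases <- c("A", "C", "G", "T")
kmers <- as.vector(t(outer(bases, bases, paste0)))   # AA, AC, ..., TT
res <- matrix(0L, nrow = 16, ncol = 3,
              dimnames = list(kmers, c("1", "2", "3")))
res["AA", ] <- c(1L, 2L, 2L)
res["AT", ] <- c(1L, 0L, 1L)
res["TA", ] <- c(1L, 1L, 0L)

# Count of one k-mer in the window starting at position 2.
res["AA", "2"]

# Total number of k-mers counted per window (equals width when no N occurs).
colSums(res)

# Keep only k-mers that were observed at least once.
observed <- res[rowSums(res) > 0, , drop = FALSE]
observed
```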
Author(s)
Wolfgang Kaisers
See Also
countGenomeKmers
Examples
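library(seqTools)
# Count 2-mers in three search windows of width 3 starting at positions 1, 2 and 3
seq <- "ATAAATA"
countDnaKmers(seq, 2, 1:3, 3)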
Results
> library(seqTools)
Loading required package: zlibbioc
> seq <- "ATAAATA"
> countDnaKmers(seq, 2, 1:3, 3)
   1 2 3
AA 1 2 2
AC 0 0 0
AG 0 0 0
AT 1 0 1
CA 0 0 0
CC 0 0 0
CG 0 0 0
CT 0 0 0
GA 0 0 0
GC 0 0 0
GG 0 0 0
GT 0 0 0
TA 1 1 0
TC 0 0 0
TG 0 0 0
TT 0 0 0