Last data update: 2014.03.03

R: countGenomeKmers: Counting K-mers in DNA sequences.
countGenomeKmers    R Documentation

countGenomeKmers: Counting K-mers in DNA sequences.

Description

Counts DNA k-mers within a vector of DNA sequences. The k-mers are searched in a set of search windows, which are defined by the start and width parameters. Each value in the start vector defines the left border of a search window, and the corresponding value in the width vector gives the size of that window. From each position of a search window, one DNA k-mer is identified on the right-hand side on the given DNA sequence. The function is intended to count DNA k-mers in selected regions (e.g. exons) on DNA chromosomes while respecting strand orientation.
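The window logic described above can be illustrated with a minimal base R sketch. The helper window_kmers() is hypothetical and not part of seqTools. Its handling of the reverse strand (k-mers read to the left from each window position, then reverse complemented) is inferred from the example output further down this page, and dropping k-mers that run past the sequence ends is an assumption.

```r
# Hypothetical helper (not part of seqTools): list the k-mers that a single
# search window contributes, using base R string functions only.
window_kmers <- function(dna, start, width, k, plus_strand = TRUE) {
  dna <- toupper(dna)
  pos <- seq(start, length.out = width)   # positions inside the window
  if (plus_strand) {
    first <- pos                          # k-mer extends to the right
    last <- pos + k - 1
  } else {
    first <- pos - k + 1                  # k-mer extends to the left
    last <- pos
  }
  # Assumption: k-mers running past either sequence end are dropped.
  ok <- first >= 1 & last <= nchar(dna)
  kmers <- substring(dna, first[ok], last[ok])
  if (!plus_strand) {
    # Reverse complement: complement every base, then reverse the k-mer.
    comp <- chartr("ACGT", "TGCA", kmers)
    kmers <- vapply(strsplit(comp, ""),
                    function(x) paste(rev(x), collapse = ""),
                    character(1), USE.NAMES = FALSE)
  }
  kmers[!grepl("N", kmers, fixed = TRUE)] # k-mers containing 'N' are skipped
}

sq <- "TTTTTCCCCGGGGAAAA"
table(window_kmers(sq, start = 6, width = 4, k = 2))
table(window_kmers(sq, start = 14, width = 4, k = 2, plus_strand = FALSE))
```

For the sequence used in the Examples section, the two calls tabulate the same motifs that appear with non-zero counts in the Results section.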

Usage

countGenomeKmers(dna, seqid, start, width, strand, k)

Arguments

dna

character. Vector of DNA sequences. dna must not contain characters other than "ATCGN"; capitalization does not matter. When an 'N' character is found, the current DNA k-mer is skipped.

seqid

numeric. Vector of (1-based) values describing the index of the analyzed sequences inside the given dna vector.

start

numeric. Vector of (1-based) start positions for reading windows.

width

numeric. Vector of window width values.

strand

factor or numeric. The first factor level (or the numeric value 1) is interpreted as (+)-strand. For any other value, the reverse complement sequence is counted (in the left direction from the start value).

k

numeric. Number of nucleotides in the tabled DNA motifs. Only a single value is allowed (length(k) must be 1).
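The constraints listed for the arguments can be checked before calling the function. The sketch below is a hypothetical helper, not part of seqTools. It assumes that seqid, start, width and strand carry one entry per search window, which the argument descriptions suggest but do not state explicitly.

```r
# Hypothetical pre-flight checks (not part of seqTools).
check_kmer_args <- function(dna, seqid, start, width, strand, k) {
  n <- length(seqid)
  stopifnot(
    is.character(dna),
    !grepl("[^ACGTN]", toupper(dna)),        # only A, C, G, T, N (any case)
    seqid >= 1, seqid <= length(dna),        # 1-based index into dna
    start >= 1, width >= 1,
    length(start) == n, length(width) == n,  # assumption: one entry per window
    length(strand) == n,
    length(k) == 1                           # a single k-mer size
  )
  # First factor level or numeric 1 means (+)-strand, anything else reverse.
  as.integer(strand) == 1L
}

is_plus <- check_kmer_args("TTTTTCCCCGGGGAAAA", c(1, 1), c(6, 14),
                           c(4, 4), c(1, 0), 2)
is_plus
```

When strand is passed as a factor, setting the levels explicitly, e.g. factor(c("+", "-"), levels = c("+", "-")), avoids depending on the locale-specific default sort order of the level labels.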

Details

The function returns a matrix. Each column contains the motif counts for one frame. Each row represents one DNA motif; the sequence of the motif is given as the row name.
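As an illustration of this layout, the sketch below builds a matrix of the same shape by hand and reduces it to the motifs that actually occur. The helper all_kmers() is hypothetical. The alphabetical row order over A, C, G, T matches the k = 2 output shown in the Results section and is assumed, not confirmed, for other values of k. The count values are copied from that output.

```r
# Hypothetical helper (not part of seqTools): all 4^k motifs in
# alphabetical order, with the last letter varying fastest.
all_kmers <- function(k) {
  bases <- c("A", "C", "G", "T")
  grid <- expand.grid(rep(list(bases), k), stringsAsFactors = FALSE)
  # expand.grid varies its first column fastest, so reverse the columns.
  do.call(paste0, grid[, k:1, drop = FALSE])
}

# Toy matrix with the layout described above: one row per motif,
# one column per frame.
counts <- matrix(0L, nrow = 4^2, ncol = 2,
                 dimnames = list(all_kmers(2), c("1", "2")))
counts[c("CC", "CG"), "1"] <- c(3L, 1L)
counts[c("TC", "TT"), "2"] <- c(1L, 3L)

# Keep only motifs observed in at least one frame.
counts[rowSums(counts) > 0, , drop = FALSE]
```

Because the motif is stored as the row name, single motifs can be addressed directly, e.g. counts["CC", ].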

Value

matrix.

Author(s)

Wolfgang Kaisers

Examples

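library(seqTools)
sq <- "TTTTTCCCCGGGGAAAA"
seqid <- as.integer(c(1, 1))    # both windows lie on the first sequence
start <- as.integer(c(6, 14))   # left borders of the two search windows
width <- as.integer(c(4, 4))    # each window spans four positions
strand <- as.integer(c(1, 0))   # 1: (+)-strand, other values: reverse complement
k <- 2                          # count 2-mers
countGenomeKmers(sq, seqid, start, width, strand, k)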

Results


> library(seqTools)
Loading required package: zlibbioc
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/seqTools/countGenomeKmers.Rd_%03d_medium.png", width=480, height=480)
> ### Name: countGenomeKmers
> ### Title: countGenomeKmers: Counting K-mers in DNA sequences.
> ### Aliases: countGenomeKmers
> ### Keywords: countGenomeKmers
> 
> ### ** Examples
> 
> sq <- "TTTTTCCCCGGGGAAAA"
> seqid <- as.integer(c(1, 1))
> start <- as.integer(c(6, 14))
> width <- as.integer(c(4, 4))
> strand <- as.integer(c(1, 0))
> k <- 2
> countGenomeKmers(sq, seqid, start, width, strand, k)
   1 2
AA 0 0
AC 0 0
AG 0 0
AT 0 0
CA 0 0
CC 3 0
CG 1 0
CT 0 0
GA 0 0
GC 0 0
GG 0 0
GT 0 0
TA 0 0
TC 0 1
TG 0 0
TT 0 3
> dev.off()
null device 
          1 
>