Last data update: 2014.03.03

R: Synteny blocks and hits
Synteny                R Documentation

Synteny blocks and hits

Description

Syntenic blocks are DNA segments composed of conserved hits occurring in the same order on two sequences. The two sequences are typically chromosomes of different species that are hypothesized to share homologous regions. Class "Synteny" provides objects and functions for storing and viewing syntenic blocks and hits that are shared between sequences.

Usage

## S3 method for class 'Synteny'
pairs(x,
     bounds = TRUE,
     boxBlocks = FALSE,
     labels = abbreviate(rownames(x), 9),
     gap = 0.5,
     line.main = 3,
     cex.labels = NULL,
     font.labels = 1,
     ...)

## S3 method for class 'Synteny'
plot(x,
     colorBy = 1,
     colorRamp = colorRampPalette(c("#FCF9EE", "#FFF272",
                                    "#FFAC28", "#EC5931",
                                    "#EC354D", "#ECA6B1")),
     barColor = "#CCCCCC",
     barSides = ifelse(nrow(x) < 100, TRUE, FALSE),
     horizontal = TRUE,
     labels = abbreviate(rownames(x), 9),
     cex.labels = NULL,
     width = 0.7,
     ...)

## S3 method for class 'Synteny'
print(x,
      quote = FALSE,
      right = TRUE,
      ...)

Arguments

x

An object of class “Synteny”.

bounds

Logical specifying whether to plot sequence boundaries as horizontal or vertical lines.

boxBlocks

Logical indicating whether to draw a rectangle around hits belonging to the same block of synteny.

colorBy

Numeric giving the index of a reference sequence, or a character string indicating to color by “neighbor”, “frequency”, or “none”. (See details section below.)

colorRamp

A function that will return n colors when given a number n. Examples are rainbow, heat.colors, terrain.colors, cm.colors, or (the default) colorRampPalette.

barColor

Character string giving the background color of each bar.

barSides

Logical indicating whether to draw black lines along the long sides of each bar.

horizontal

Logical indicating whether to plot the sequences horizontally (TRUE) or vertically (FALSE).

labels

Character vector providing names corresponding to each “identifier” for labels on the diagonal.

width

Numeric giving the fractional width of each bar between zero and one.

gap

Distance between subplots, in margin lines.

line.main

If main is specified, line.main provides the line argument to mtext.

cex.labels

Magnification of the labels.

font.labels

Font of labels on the diagonal.

quote

Logical indicating whether to print the output surrounded by quotes.

right

Logical specifying whether to right align strings.

...

Other graphical parameters for pairs or plot, including: main, cex.main, font.main, and oma. Other arguments for print, including print.gap and max.
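
As an illustration of the colorRamp argument described above, any function that takes a number n and returns n colors should fit. The following is a minimal sketch in base R only; gray_ramp is a hypothetical name introduced here for illustration and is not part of DECIPHER.

```r
# The default palette from the Usage section: colorRampPalette() returns
# a function, and that function returns n interpolated colors.
default_ramp <- colorRampPalette(c("#FCF9EE", "#FFF272",
                                   "#FFAC28", "#EC5931",
                                   "#EC354D", "#ECA6B1"))
default_ramp(4)  # character vector of 4 hex colors

# A hand-written alternative (hypothetical): any function of n that
# returns n colors has the same shape.
gray_ramp <- function(n) gray.colors(n, start = 0.9, end = 0.2)
gray_ramp(7)     # character vector of 7 gray levels
```

Either function could then be supplied as colorRamp, in the same way as rainbow or heat.colors.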

Details

Objects of class Synteny are stored as square matrices of list elements with dimnames giving the “identifier” of the corresponding sequences. The synteny matrix can be separated into three parts: along, above, and below the diagonal. Each list element along the diagonal contains an integer vector with the width of the sequence(s) belonging to that “identifier”. List elements above the diagonal (column j > row i) each contain a matrix with “hits” corresponding to matches between sequences i and j. List elements below the diagonal each contain a matrix with “blocks” of synteny between sequences j and i.
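
The layout described above can be pictured with a hand-built mock in base R. This is only a sketch of the storage shape, assuming nothing beyond the description: it is not how FindSynteny constructs its result, the object is a plain list matrix without class "Synteny", and the name mock is hypothetical. The widths, column names, and first rows are copied from the example output in the Results section.

```r
# A 2 x 2 matrix whose cells are list elements, named by identifier.
ids <- c("H9N2", "H5N1")
mock <- matrix(vector("list", 4), nrow = 2, dimnames = list(ids, ids))

# Diagonal: integer vector of sequence widths for each identifier.
mock[[1, 1]] <- c(1557L, 890L, 1025L, 1714L, 1418L, 2341L, 2328L, 2225L)
mock[[2, 2]] <- c(2341L, 2341L, 2233L, 1565L, 1458L, 1760L, 1027L, 865L)

# Above the diagonal (column j > row i): matrix of hits, one row per hit.
mock[[1, 2]] <- matrix(c(7, 2, 0, 73, 1, 2, 0, 0), nrow = 1,
                       dimnames = list(NULL, c("index1", "index2", "strand",
                                               "width", "start1", "start2",
                                               "frame1", "frame2")))

# Below the diagonal: matrix of blocks, one row per block.
mock[[2, 1]] <- matrix(c(7, 2, 0, 2443, 1, 2, 2328, 2329, 1, 76), nrow = 1,
                       dimnames = list(NULL, c("index1", "index2", "strand",
                                               "score", "start1", "start2",
                                               "end1", "end2",
                                               "first_hit", "last_hit")))

# Single-bracket indexing returns a list of length one, so [[1]] is
# needed to reach the contents, as in the Examples section.
mock[1, 1][[1]]            # widths for the first identifier
colnames(mock[1, 2][[1]])  # hit columns
colnames(mock[2, 1][[1]])  # block columns
```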

The pairs method creates a scatterplot matrix from a Synteny object. Dot plots above the diagonal show hits between identifier i and j, where forward hits are colored in black, and hits to the reverse strand of identifier j are colored in red. Plots below the diagonal show blocks of synteny colored by their score, from green (highest scoring) to blue to magenta (lowest scoring).
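
The strand coloring of the dot plots can be imitated in base R. The sketch below is not the pairs method itself. It assumes that a strand value of 0 marks a forward hit and any other value a reverse-strand hit, which the example output (all zeros) does not confirm, and it uses a small hypothetical hits matrix with the column names shown in the Results section.

```r
# Hypothetical hits: two forward hits and one reverse-strand hit.
hits <- cbind(index1 = c(1, 1, 1),
              index2 = c(1, 1, 1),
              strand = c(0, 0, 1),   # assumption: 0 = forward, 1 = reverse
              width  = c(73, 32, 64),
              start1 = c(1, 75, 115),
              start2 = c(2, 76, 300))

# Black for forward hits, red for hits to the reverse strand.
hit_col <- ifelse(hits[, "strand"] == 0, "black", "red")

# One dot per hit, positioned by its start in each sequence.
plot(hits[, "start1"], hits[, "start2"],
     col = hit_col, pch = 16,
     xlab = "position in identifier i",
     ylab = "position in identifier j")
```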

The plot method displays a bar view of the sequences in the same order as the input object (x). The coloring scheme of each bar is determined by the colorBy argument, and the color palette is set by colorRamp. When colorBy is an index, the sequences are colored according to regions of shared homology with the specified reference sequence (by default 1). If colorBy is “neighbor” then shared syntenic blocks are connected between neighboring sequences. If colorBy is “frequency” then positions in each sequence are colored based on the degree of conservation with the other sequences. In each case, regions that have no correspondence in a sequence are colored barColor.
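
The colorBy options can be combined with the other arguments listed under Usage. The calls below are a sketch that uses only arguments documented on this page and the Influenza database from the Examples section. The particular combinations are illustrative choices, not recommended settings, and the code is skipped when DECIPHER is not installed.

```r
if (requireNamespace("DECIPHER", quietly = TRUE)) {
  library(DECIPHER)

  db <- system.file("extdata", "Influenza.sqlite", package = "DECIPHER")
  synteny <- FindSynteny(db, minScore = 50)

  # Color by shared homology with the second sequence instead of the first.
  plot(synteny, colorBy = 2)

  # Color by degree of conservation, with a different palette function.
  plot(synteny, colorBy = "frequency", colorRamp = heat.colors)

  # Connect blocks between neighbors, drawn vertically with narrower bars.
  plot(synteny, colorBy = "neighbor", horizontal = FALSE, width = 0.5)
}
```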

Author(s)

Erik Wright DECIPHER@cae.wisc.edu

See Also

AlignSynteny, FindSynteny

Examples

library(DECIPHER) # in the session below this also attaches Biostrings, RSQLite, and DBI

# a small example:
dbConn <- dbConnect(SQLite(), ":memory:")
s1 <- DNAStringSet("ACTAGACCCAGACCGATAAACGGACTGGACAAG")
s3 <- reverseComplement(s1)
s2 <- c(s1, s3)
Seqs2DB(c(c(s1, s2), s3),
        "XStringSet",
        dbConn,
        c("s1", "s2", "s2", "s3"))
syn <- FindSynteny(dbConn, minScore=1)
syn # Note:  > 100% hits because of sequence reuse across blocks
pairs(syn, boxBlocks=TRUE)
plot(syn)
dbDisconnect(dbConn)

# a larger example:
db <- system.file("extdata", "Influenza.sqlite", package="DECIPHER")
synteny <- FindSynteny(db, minScore=50)
class(synteny) # 'Synteny'
synteny

# accessing parts
i <- 1
j <- 2
synteny[i, i][[1]] # width of sequences in i
synteny[j, j][[1]] # width of sequences in j
head(synteny[i, j][[1]]) # hits between i & j
synteny[j, i][[1]] # blocks between i & j

# plotting
pairs(synteny) # dot plots

plot(synteny) # bar view colored by position in genome 1
plot(synteny, barColor="#268FD6") # emphasize missing regions
plot(synteny, "frequency") # most regions are shared by all
plot(synteny, "neighbor")

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> library(DECIPHER)
Loading required package: Biostrings
Loading required package: BiocGenerics
Loading required package: parallel
Loading required package: S4Vectors
Loading required package: stats4
Loading required package: IRanges
Loading required package: XVector
Loading required package: RSQLite
Loading required package: DBI
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/DECIPHER/Synteny-class.Rd_%03d_medium.png", width=480, height=480)
> ### Name: Synteny
> ### Title: Synteny blocks and hits
> ### Aliases: Synteny-class [.Synteny print.Synteny plot.Synteny
> ###   pairs.Synteny
> 
> ### ** Examples
> 
> # a small example:
> dbConn <- dbConnect(SQLite(), ":memory:")
> s1 <- DNAStringSet("ACTAGACCCAGACCGATAAACGGACTGGACAAG")
> s3 <- reverseComplement(s1)
> s2 <- c(s1, s3)
> Seqs2DB(c(c(s1, s2), s3),
+         "XStringSet",
+         dbConn,
+         c("s1", "s2", "s2", "s3"))
Adding 4 sequences to the database.

4 total sequences in table Seqs.
Time difference of 0.02 secs

> syn <- FindSynteny(dbConn, minScore=1)
  |======================================================================| 100%

Time difference of 0.09 secs

> syn # Note:  > 100% hits because of sequence reuse across blocks
         s1        s2        s3
s1    1 seq 200% hits 100% hits
s2 2 blocks    2 seqs 200% hits
s3  1 block  2 blocks     1 seq
> pairs(syn, boxBlocks=TRUE)
> plot(syn)
> dbDisconnect(dbConn)
[1] TRUE
> 
> # a larger example:
> db <- system.file("extdata", "Influenza.sqlite", package="DECIPHER")
> synteny <- FindSynteny(db, minScore=50)
  |======================================================================| 100%

Time difference of 0.32 secs

> class(synteny) # 'Synteny'
[1] "Synteny"
> synteny
         H9N2     H5N1     H2N2     H7N9     H1N1
H9N2   8 seqs 53% hits 34% hits 48% hits 34% hits
H5N1 7 blocks   8 seqs 30% hits 47% hits 44% hits
H2N2 7 blocks 8 blocks   8 seqs 29% hits 35% hits
H7N9 7 blocks 6 blocks 8 blocks   8 seqs 32% hits
H1N1 6 blocks 8 blocks 6 blocks 6 blocks   8 seqs
> 
> # accessing parts
> i <- 1
> j <- 2
> synteny[i, i][[1]] # width of sequences in i
[1] 1557  890 1025 1714 1418 2341 2328 2225
> synteny[j, j][[1]] # width of sequences in j
[1] 2341 2341 2233 1565 1458 1760 1027  865
> head(synteny[i, j][[1]]) # hits between i & j
     index1 index2 strand width start1 start2 frame1 frame2
[1,]      7      2      0    73      1      2      0      0
[2,]      7      2      0    32     75     76      0      0
[3,]      7      2      0    64    115    116      0      0
[4,]      7      2      0    11    195    196      0      0
[5,]      7      2      0    17    207    208      0      0
[6,]      7      2      0    12    228    229      0      0
> synteny[j, i][[1]] # blocks between i & j
     index1 index2 strand score start1 start2 end1 end2 first_hit last_hit
[1,]      7      2      0  2443      1      2 2328 2329         1       76
[2,]      8      3      0  1776      1      5 2225 2229        77      148
[3,]      1      4      0  1664      1      7 1557 1563       149      190
[4,]      6      1      0  1568      3      3 2341 2341       191      259
[5,]      3      7      0   978     10      3 1018 1011       260      290
[6,]      2      8      0   255     13      1  877  865       291      304
[7,]      4      6      0   192    491    493 1714 1728       305      310
> 
> # plotting
> pairs(synteny) # dot plots
> 
> plot(synteny) # bar view colored by position in genome 1
> plot(synteny, barColor="#268FD6") # emphasize missing regions
> plot(synteny, "frequency") # most regions are shared by all
> plot(synteny, "neighbor")
> 
> dev.off()
null device 
          1 
>