BSgenome.Mmusculus.UCSC.mm10.masked
R Documentation
Full masked genome sequences for Mus musculus (UCSC version mm10)
Description
Full genome sequences for Mus musculus (Mouse) as provided by UCSC (mm10, Dec. 2011) and stored in Biostrings objects. The sequences are the same as in BSgenome.Mmusculus.UCSC.mm10, except that each of them carries the following two masks: (1) the mask of assembly gaps (AGAPS mask), and (2) the mask of intra-contig ambiguities (AMB mask).
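To get a feel for what a mask does without loading the full genome, the following is a minimal sketch on a toy sequence. It assumes only that the Biostrings package is installed. The toy sequence, the mask coordinates, and the variable names are made up for illustration and are not taken from this package; the mask is named "AGAPS" only to mirror the naming used here.

```r
library(Biostrings)

# Toy sequence: a run of five Ns stands in for an assembly gap.
toy <- DNAString("ACGTNNNNNACGT")

# Build a mask covering positions 5-9 and name it "AGAPS"
# (illustrative coordinates, not real mm10 data).
gap_mask <- Mask(mask.width = length(toy), start = 5, end = 9)
names(gap_mask) <- "AGAPS"

# Putting the mask on the sequence turns it into a MaskedDNAString.
masks(toy) <- gap_mask

# Number of letters hidden by the mask.
maskedwidth(toy)

# Letter counts skip the masked region, so the Ns are not counted here...
alphabetFrequency(toy)[c("A", "C", "G", "T", "N")]

# ...but they reappear once the masks are dropped.
alphabetFrequency(unmasked(toy))[c("A", "C", "G", "T", "N")]
```

The same accessors should apply to the chromosome-sized sequences in this package, where the masked letters are the assembly gaps rather than a hand-picked range.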
Note
The masks in this BSgenome data package were made from the following
source data files:
See ?BSgenome.Mmusculus.UCSC.mm10 in the
BSgenome.Mmusculus.UCSC.mm10 package for information about how the sequences
were obtained.
See ?BSgenomeForge and the BSgenomeForge
vignette (vignette("BSgenomeForge")) in the BSgenome
software package for how to make a BSgenome data package.
Author(s)
The Bioconductor Dev Team
See Also
BSgenome.Mmusculus.UCSC.mm10 in the BSgenome.Mmusculus.UCSC.mm10 package
for information about how the sequences were obtained.
BSgenome objects and the
available.genomes function
in the BSgenome software package.
MaskedDNAString objects in the Biostrings
package.
The BSgenomeForge vignette (vignette("BSgenomeForge"))
in the BSgenome software package for how to make a BSgenome
data package.
Examples
BSgenome.Mmusculus.UCSC.mm10.masked
genome <- BSgenome.Mmusculus.UCSC.mm10.masked
seqlengths(genome)
genome$chr1  # a MaskedDNAString object!
## To get rid of the masks altogether:
unmasked(genome$chr1)  # same as BSgenome.Mmusculus.UCSC.mm10$chr1

if ("AGAPS" %in% masknames(genome)) {

  ## Check that the assembly gaps contain only Ns:
  checkOnlyNsInGaps <- function(seq)
  {
    ## Replace all masks by the inverted AGAPS mask
    masks(seq) <- gaps(masks(seq)["AGAPS"])
    unique_letters <- uniqueLetters(seq)
    if (any(unique_letters != "N"))
        stop("assembly gaps contain more than just Ns")
  }

  ## A message will be printed each time a sequence is removed
  ## from the cache:
  options(verbose=TRUE)

  for (seqname in seqnames(genome)) {
    cat("Checking sequence", seqname, "... ")
    seq <- genome[[seqname]]
    checkOnlyNsInGaps(seq)
    cat("OK\n")
  }
}

## See the GenomeSearching vignette in the BSgenome software
## package for some examples of genome-wide motif searching using
## Biostrings and the BSgenome data packages:
if (interactive())
    vignette("GenomeSearching", package="BSgenome")
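The gap check in the examples above inverts the AGAPS mask so that only the gap regions stay visible, then looks at which letters remain. A small sketch of the same idea on toy sequences is given here, assuming only Biostrings is installed. The helper names only_ns_in_gaps and make_toy, the toy sequences, and the mask coordinates are hypothetical and not part of any package; the check returns TRUE/FALSE rather than calling stop(), which is a deliberate deviation from the original.

```r
library(Biostrings)

# Same logic as checkOnlyNsInGaps in the Examples, but returning a logical
# instead of stopping (hypothetical variant, for illustration only).
only_ns_in_gaps <- function(seq) {
  # Replace all masks by the inverted AGAPS mask: only the gaps stay visible.
  masks(seq) <- gaps(masks(seq)["AGAPS"])
  all(uniqueLetters(seq) == "N")
}

# Hypothetical helper: wrap a plain string with a single mask named "AGAPS".
make_toy <- function(chars, start, end) {
  seq <- DNAString(chars)
  m <- Mask(mask.width = length(seq), start = start, end = end)
  names(m) <- "AGAPS"
  masks(seq) <- m
  seq
}

good <- make_toy("ACGTNNNNNACGT", 5, 9)  # masked region holds only Ns
bad  <- make_toy("ACGTNNANNACGT", 5, 9)  # an A sits inside the masked region

only_ns_in_gaps(good)
only_ns_in_gaps(bad)
```

Returning a logical makes it easy to collect results over many sequences, for example with sapply, before deciding how to report failures.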
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> library(BSgenome.Mmusculus.UCSC.mm10.masked)
Loading required package: BSgenome
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Loading required package: S4Vectors
Loading required package: stats4
Attaching package: 'S4Vectors'
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: GenomicRanges
Loading required package: Biostrings
Loading required package: XVector
Loading required package: rtracklayer
Loading required package: BSgenome.Mmusculus.UCSC.mm10
> ### Name: BSgenome.Mmusculus.UCSC.mm10.masked
> ### Title: Full masked genome sequences for Mus musculus (UCSC version
> ### mm10)
> ### Aliases: BSgenome.Mmusculus.UCSC.mm10.masked-package
> ### BSgenome.Mmusculus.UCSC.mm10.masked
> ### Keywords: package data
>
> ### ** Examples
>
> BSgenome.Mmusculus.UCSC.mm10.masked
Mouse genome:
# organism: Mus musculus (Mouse)
# provider: UCSC
# provider version: mm10
# release date: Dec. 2011
# release name: Genome Reference Consortium GRCm38
# 66 sequences:
# chr1 chr2 chr3
# chr4 chr5 chr6
# chr7 chr8 chr9
# chr10 chr11 chr12
# chr13 chr14 chr15
# ... ... ...
# chrUn_GL456372 chrUn_GL456378 chrUn_GL456379
# chrUn_GL456381 chrUn_GL456382 chrUn_GL456383
# chrUn_GL456385 chrUn_GL456387 chrUn_GL456389
# chrUn_GL456390 chrUn_GL456392 chrUn_GL456393
# chrUn_GL456394 chrUn_GL456396 chrUn_JH584304
# (use 'seqnames()' to see all the sequence names, use the '$' or '[[' operator
# to access a given sequence)
> genome <- BSgenome.Mmusculus.UCSC.mm10.masked
> seqlengths(genome)
                chr1                 chr2                 chr3
           195471971            182113224            160039680
                chr4                 chr5                 chr6
           156508116            151834684            149736546
                chr7                 chr8                 chr9
           145441459            129401213            124595110
               chr10                chr11                chr12
           130694993            122082543            120129022
               chr13                chr14                chr15
           120421639            124902244            104043685
               chr16                chr17                chr18
            98207768             94987271             90702639
               chr19                 chrX                 chrY
            61431566            171031299             91744698
                chrM chr1_GL456210_random chr1_GL456211_random
               16299               169725               241735
chr1_GL456212_random chr1_GL456213_random chr1_GL456221_random
              153618                39340               206961
chr4_GL456216_random chr4_GL456350_random chr4_JH584292_random
               66673               227966                14945
chr4_JH584293_random chr4_JH584294_random chr4_JH584295_random
              207968               191905                 1976
chr5_GL456354_random chr5_JH584296_random chr5_JH584297_random
              195993               199368               205776
chr5_JH584298_random chr5_JH584299_random chr7_GL456219_random
              184189               953012               175968
chrX_GL456233_random chrY_JH584300_random chrY_JH584301_random
              336933               182347               259875
chrY_JH584302_random chrY_JH584303_random       chrUn_GL456239
              155838               158099                40056
      chrUn_GL456359       chrUn_GL456360       chrUn_GL456366
               22974                31704                47073
      chrUn_GL456367       chrUn_GL456368       chrUn_GL456370
               42057                20208                26764
      chrUn_GL456372       chrUn_GL456378       chrUn_GL456379
               28664                31602                72385
      chrUn_GL456381       chrUn_GL456382       chrUn_GL456383
               25871                23158                38659
      chrUn_GL456385       chrUn_GL456387       chrUn_GL456389
               35240                24685                28772
      chrUn_GL456390       chrUn_GL456392       chrUn_GL456393
               24668                23629                55711
      chrUn_GL456394       chrUn_GL456396       chrUn_JH584304
               24323                21240               114452
> genome$chr1 # a MaskedDNAString object!
  195471971-letter "MaskedDNAString" instance (# for masking)
seq: ####################################...####################################
masks:
  maskedwidth maskedratio active names                             desc
1     3562779  0.01822655   TRUE AGAPS                    assembly gaps
2           0  0.00000000   TRUE   AMB intra-contig ambiguities (empty)
all masks together:
  maskedwidth maskedratio
      3562779  0.01822655
> ## To get rid of the masks altogether:
> unmasked(genome$chr1) # same as BSgenome.Mmusculus.UCSC.mm10$chr1
195471971-letter "DNAString" instance
seq: NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN...NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
>
> if ("AGAPS" %in% masknames(genome)) {
+
+ ## Check that the assembly gaps contain only Ns:
+ checkOnlyNsInGaps <- function(seq)
+ {
+ ## Replace all masks by the inverted AGAPS mask
+ masks(seq) <- gaps(masks(seq)["AGAPS"])
+ unique_letters <- uniqueLetters(seq)
+ if (any(unique_letters != "N"))
+ stop("assembly gaps contain more than just Ns")
+ }
+
+ ## A message will be printed each time a sequence is removed
+ ## from the cache:
+ options(verbose=TRUE)
+
+ for (seqname in seqnames(genome)) {
+ cat("Checking sequence", seqname, "... ")
+ seq <- genome[[seqname]]
+ checkOnlyNsInGaps(seq)
+ cat("OK\n")
+ }
+ }
Checking sequence chr1 ... OK
Checking sequence chr2 ... caching chr2
OK
Checking sequence chr3 ... caching chr3
OK
Checking sequence chr4 ... uncaching chr2
caching chr4
OK
Checking sequence chr5 ... uncaching chr3
caching chr5
OK
Checking sequence chr6 ... caching chr6
OK
Checking sequence chr7 ... caching chr7
OK
Checking sequence chr8 ... uncaching chr6
uncaching chr5
uncaching chr4
caching chr8
OK
Checking sequence chr9 ... caching chr9
OK
Checking sequence chr10 ... caching chr10
OK
Checking sequence chr11 ... uncaching chr9
uncaching chr8
uncaching chr7
caching chr11
OK
Checking sequence chr12 ... caching chr12
OK
Checking sequence chr13 ... caching chr13
OK
Checking sequence chr14 ... caching chr14
OK
Checking sequence chr15 ... caching chr15
OK
Checking sequence chr16 ... caching chr16
OK
Checking sequence chr17 ... uncaching chr15
uncaching chr14
uncaching chr13
uncaching chr12
uncaching chr11
uncaching chr10
caching chr17
OK
Checking sequence chr18 ... caching chr18
OK
Checking sequence chr19 ... caching chr19
OK
Checking sequence chrX ... caching chrX
OK
Checking sequence chrY ... caching chrY
OK
Checking sequence chrM ... caching chrM
OK
Checking sequence chr1_GL456210_random ... caching chr1_GL456210_random
OK
Checking sequence chr1_GL456211_random ... caching chr1_GL456211_random
OK
Checking sequence chr1_GL456212_random ... caching chr1_GL456212_random
OK
Checking sequence chr1_GL456213_random ... caching chr1_GL456213_random
OK
Checking sequence chr1_GL456221_random ... caching chr1_GL456221_random
OK
Checking sequence chr4_GL456216_random ... caching chr4_GL456216_random
OK
Checking sequence chr4_GL456350_random ... caching chr4_GL456350_random
OK
Checking sequence chr4_JH584292_random ... caching chr4_JH584292_random
OK
Checking sequence chr4_JH584293_random ... uncaching chr4_GL456350_random
uncaching chr4_GL456216_random
uncaching chr1_GL456221_random
uncaching chr1_GL456213_random
uncaching chr1_GL456212_random
uncaching chr1_GL456211_random
uncaching chr1_GL456210_random
uncaching chrM
uncaching chrY
uncaching chrX
uncaching chr19
uncaching chr18
caching chr4_JH584293_random
OK
Checking sequence chr4_JH584294_random ... caching chr4_JH584294_random
OK
Checking sequence chr4_JH584295_random ... caching chr4_JH584295_random
OK
Checking sequence chr5_GL456354_random ... caching chr5_GL456354_random
OK
Checking sequence chr5_JH584296_random ... caching chr5_JH584296_random
OK
Checking sequence chr5_JH584297_random ... caching chr5_JH584297_random
OK
Checking sequence chr5_JH584298_random ... caching chr5_JH584298_random
OK
Checking sequence chr5_JH584299_random ... caching chr5_JH584299_random
OK
Checking sequence chr7_GL456219_random ... caching chr7_GL456219_random
OK
Checking sequence chrX_GL456233_random ... caching chrX_GL456233_random
OK
Checking sequence chrY_JH584300_random ... caching chrY_JH584300_random
OK
Checking sequence chrY_JH584301_random ... caching chrY_JH584301_random
OK
Checking sequence chrY_JH584302_random ... caching chrY_JH584302_random
OK
Checking sequence chrY_JH584303_random ... caching chrY_JH584303_random
OK
Checking sequence chrUn_GL456239 ... uncaching chrY_JH584302_random
uncaching chrY_JH584301_random
uncaching chrY_JH584300_random
uncaching chrX_GL456233_random
uncaching chr7_GL456219_random
uncaching chr5_JH584299_random
uncaching chr5_JH584298_random
uncaching chr5_JH584297_random
uncaching chr5_JH584296_random
uncaching chr5_GL456354_random
uncaching chr4_JH584295_random
uncaching chr4_JH584294_random
uncaching chr4_JH584293_random
uncaching chr4_JH584292_random
caching chrUn_GL456239
OK
Checking sequence chrUn_GL456359 ... caching chrUn_GL456359
OK
Checking sequence chrUn_GL456360 ... caching chrUn_GL456360
OK
Checking sequence chrUn_GL456366 ... caching chrUn_GL456366
OK
Checking sequence chrUn_GL456367 ... caching chrUn_GL456367
OK
Checking sequence chrUn_GL456368 ... caching chrUn_GL456368
OK
Checking sequence chrUn_GL456370 ... caching chrUn_GL456370
OK
Checking sequence chrUn_GL456372 ... caching chrUn_GL456372
OK
Checking sequence chrUn_GL456378 ... caching chrUn_GL456378
OK
Checking sequence chrUn_GL456379 ... caching chrUn_GL456379
OK
Checking sequence chrUn_GL456381 ... caching chrUn_GL456381
OK
Checking sequence chrUn_GL456382 ... caching chrUn_GL456382
OK
Checking sequence chrUn_GL456383 ... caching chrUn_GL456383
OK
Checking sequence chrUn_GL456385 ... caching chrUn_GL456385
uncaching chrUn_GL456383
uncaching chrUn_GL456382
uncaching chrUn_GL456381
uncaching chrUn_GL456379
uncaching chrUn_GL456378
uncaching chrUn_GL456372
uncaching chrUn_GL456370
uncaching chrUn_GL456368
uncaching chrUn_GL456367
uncaching chrUn_GL456366
uncaching chrUn_GL456360
uncaching chrUn_GL456359
OK
Checking sequence chrUn_GL456387 ... caching chrUn_GL456387
OK
Checking sequence chrUn_GL456389 ... caching chrUn_GL456389
OK
Checking sequence chrUn_GL456390 ... caching chrUn_GL456390
OK
Checking sequence chrUn_GL456392 ... caching chrUn_GL456392
OK
Checking sequence chrUn_GL456393 ... caching chrUn_GL456393
OK
Checking sequence chrUn_GL456394 ... caching chrUn_GL456394
OK
Checking sequence chrUn_GL456396 ... caching chrUn_GL456396
OK
Checking sequence chrUn_JH584304 ... caching chrUn_JH584304
OK
>
> ## See the GenomeSearching vignette in the BSgenome software
> ## package for some examples of genome-wide motif searching using
> ## Biostrings and the BSgenome data packages:
> #if (interactive())
> vignette("GenomeSearching", package="BSgenome")
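
The maskedratio reported for chr1 in the output above is the masked width divided by the sequence length. As a quick sanity check, the arithmetic can be reproduced in base R using the two numbers printed in the transcript; the variable names are illustrative only.

```r
# Values copied from the chr1 output above.
masked_width <- 3562779     # letters covered by the AGAPS mask
chr1_length  <- 195471971   # total length of chr1

# Fraction of chr1 hidden by the mask.
masked_ratio <- masked_width / chr1_length
round(masked_ratio, 8)
```

So a little under 2% of chr1 is assembly gap, all of it contributed by the AGAPS mask, since the AMB mask is empty for this sequence.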