We have used an extensive re-annotation of the illuminaRatv1 probe sequences to provide additional information that is not captured in the standard Bioconductor packages. Whereas Bioconductor annotations are based on the RefSeq ID that each probe maps to, our additional mappings provide data specific to each probe on the platform. See below for details. We recommend filtering on probe quality and retaining only probes graded “Perfect” or “Good” for an analysis.
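As a minimal sketch of the recommended filter, the base R snippet below keeps probes graded Perfect or Good. The `quality` vector and its probe names are made-up stand-ins for the values held in `illuminaRatv1PROBEQUALITY`; with the package loaded, an equivalent vector can be built with `unlist(as.list(...))` as in the Examples section. Grades carrying asterisk suffixes (such as `Good***`) appear in the quality table under Results but are not defined on this page, so keeping them is shown as a separate, explicit choice.

```r
# Toy probe-quality vector; names and grades are hypothetical stand-ins for
# unlist(as.list(illuminaRatv1PROBEQUALITY[mappedkeys(illuminaRatv1PROBEQUALITY)]))
quality <- c(probe_a = "Perfect", probe_b = "Bad", probe_c = "Good",
             probe_d = "No match", probe_e = "Perfect***", probe_f = "Good****")

# Strict filter: exact "Perfect" or "Good" grades only
keep_strict <- names(quality)[quality %in% c("Perfect", "Good")]

# Lenient filter: also keep asterisk-suffixed variants of those grades
keep_lenient <- names(quality)[grepl("^(Perfect|Good)\\**$", quality)]

print(keep_strict)
print(keep_lenient)
```

Either vector of probe IDs can then be used to subset the rows of an expression matrix whose row names are Illumina probe IDs.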
Details of custom mappings
illuminaRatv1listNewMappings
List all the custom re-annotation mappings provided by the package
illuminaRatv1fullReannotation
Return all the re-annotation information as a matrix
illuminaRatv1ARRAYADDRESS
Array Address code used to identify the probe at the bead-level
illuminaRatv1NUID
Lumi's nuID (universal naming scheme for oligonucleotides) Reference: Du et al. (2007), Biol Direct 2:16
illuminaRatv1PROBESEQUENCE
The 50 base sequence for the probe
illuminaRatv1PROBEQUALITY
Quality grade assigned to the probe: “Perfect” if it perfectly and uniquely matches the target transcript; “Good” if the probe, although imperfectly matching the target transcript, is still likely to provide a reasonably sensitive signal (up to two mismatches are allowed, based on empirical evidence that the signal intensity for 50-mer probes with less than 95% identity to their targets is less than 50% of the signal associated with perfect matches; see Barbosa-Morais et al., 2010, cited below); “Bad” if the probe matches repeat sequences, intergenic or intronic regions, or is unlikely to provide specific signal for any transcript; “No match” if it does not match any genomic region or transcript.
illuminaRatv1CODINGZONE
Coding status of target sequence: Intergenic / Intronic / Transcriptomic (“Transcriptomic” when the target transcript is non-coding or there is no information on the coding sequence)
illuminaRatv1GENOMICLOCATION
Probe's genomic coordinates (hg19 for human, mm9 for mouse or rn4 for rat)
illuminaRatv1GENOMICMATCHSIMILARITY
Percentage of similarity between the probe and its best genomic match in the alignable region, taking the probe as reference
illuminaRatv1SECONDMATCHES
Genomic coordinates of second best matches between the probe and the genome
illuminaRatv1SECONDMATCHSIMILARITY
Percentage of similarity between the probe and its second best genomic match in the alignable region, taking the probe as reference
illuminaRatv1TRANSCRIPTOMICMATCHSIMILARITY
Percentage of similarity between the probe and its target transcript in the alignable region, taking the probe as reference
illuminaRatv1OTHERGENOMICMATCHES
Genomic coordinates of sequences as alignable with the probe (in terms of number of matching nucleotides) as its main target
illuminaRatv1REPEATMASK
Overlapping RepeatMasked sequences, with number of bases overlapped by the repeat
illuminaRatv1OVERLAPPINGSNP
Overlapping annotated SNPs
illuminaRatv1ENTREZREANNOTATED
Entrez IDs
illuminaRatv1ENSEMBLREANNOTATED
Ensembl IDs
illuminaRatv1SYMBOLREANNOTATED
Gene symbol derived by re-annotation
illuminaRatv1REPORTERGROUPID
For probes marked as controls in Illumina's annotation file, this gives the type of control
illuminaRatv1REPORTERGROUPNAME
Usually a more informative name for the control type
Barbosa-Morais et al. (2010) A re-annotation pipeline for Illumina BeadArrays: improving the interpretation of gene expression data. Nucleic Acids Research
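The two-mismatch allowance for “Good” probes is consistent with the 95% identity threshold quoted in the PROBEQUALITY description. The arithmetic can be checked in a few lines of R. This sketch assumes that mismatches are the only source of lost identity and that identity is computed over the full 50-base probe, which simplifies the alignment-based measure used by the re-annotation pipeline.

```r
# Percent identity of a 50-mer probe as a function of mismatch count
# (simplifying assumption: identity = matching bases / probe length)
probe_length <- 50
mismatches <- 0:4
identity <- 100 * (probe_length - mismatches) / probe_length

# Flag which mismatch counts stay at or above the 95% threshold
check <- data.frame(mismatches, identity, at_least_95 = identity >= 95)
print(check)
```

Under these assumptions, two mismatches give 96% identity and three give 94%, so two is the largest count that stays at or above the threshold.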
Examples
## Load the annotation package
library(illuminaRatv1.db)
## See what new mappings are available
illuminaRatv1listNewMappings()
x <- illuminaRatv1PROBEQUALITY
mapped_probes <- mappedkeys(x)
# Convert to a list
xx <- as.list(x[mapped_probes])
if(length(xx) > 0) {
# Get the PROBEQUALITY for the first five probes
xx[1:5]
# Get the first one
xx[[1]]
}
##Overall table of qualities
table(unlist(xx))
x <- illuminaRatv1ARRAYADDRESS
mapped_probes <- mappedkeys(x)
# Convert to a list
xx <- as.list(x[mapped_probes])
if(length(xx) > 0) {
# Get the ARRAYADDRESS for the first five probes
xx[1:5]
# Get the first one
xx[[1]]
}
## Map from array address to Illumina ID using revmap
y <- revmap(illuminaRatv1ARRAYADDRESS)
mapped_probes <- mappedkeys(y)
# Convert to a list
yy <- as.list(y[mapped_probes])
if(length(yy) > 0) {
# Get the Illumina IDs for the first five array addresses
yy[1:5]
# Get the first one
yy[[1]]
}
x <- illuminaRatv1CODINGZONE
mapped_probes <- mappedkeys(x)
# Convert to a list
xx <- as.list(x[mapped_probes])
if(length(xx) > 0) {
# Get the CODINGZONE for the first five probes
xx[1:5]
# Get the first one
xx[[1]]
}
x <- illuminaRatv1PROBESEQUENCE
mapped_probes <- mappedkeys(x)
# Convert to a list
xx <- as.list(x[mapped_probes])
if(length(xx) > 0) {
# Get the PROBESEQUENCE for the first five probes
xx[1:5]
# Get the first one
xx[[1]]
}
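The reverse mapping from array address to Illumina ID is typically what is needed when bead-level data are keyed by array address. As a hedged sketch of the idea in base R, independent of the package: `address_by_probe` below is a made-up forward lookup (hypothetical probe IDs and addresses), and inverting it gives the kind of address-to-probe lookup that `revmap()` provides for the real mapping object.

```r
# Hypothetical forward lookup: probe ID -> array address (toy values)
address_by_probe <- c(ILMN_X1 = "1000001", ILMN_X2 = "1000002", ILMN_X3 = "1000003")

# Invert it: array address -> probe ID (only valid if addresses are unique)
stopifnot(!anyDuplicated(address_by_probe))
probe_by_address <- setNames(names(address_by_probe), address_by_probe)

# Translate addresses read from bead-level data; unknown addresses give NA
bead_addresses <- c("1000003", "1000001", "9999999")
probe_ids <- unname(probe_by_address[bead_addresses])
print(probe_ids)
```

Addresses missing from the lookup come back as `NA`, which makes unannotated beads easy to count or drop before summarisation.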
Results
> library(illuminaRatv1.db)
Loading required package: AnnotationDbi
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':
IQR, mad, xtabs
The following objects are masked from 'package:base':
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, cbind, colnames, do.call, duplicated, eval, evalq,
get, grep, grepl, intersect, is.unsorted, lapply, lengths, mapply,
match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, rank,
rbind, rownames, sapply, setdiff, sort, table, tapply, union,
unique, unsplit
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
Loading required package: IRanges
Loading required package: S4Vectors
Attaching package: 'S4Vectors'
The following objects are masked from 'package:base':
colMeans, colSums, expand.grid, rowMeans, rowSums
Loading required package: org.Rn.eg.db
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/illuminaRatv1.db/illuminaRatv1NewMappings.Rd_%03d_medium.png", width=480, height=480)
> ### Name: illuminaRatv1listNewMappings
> ### Title: Custom mappings added to the package
> ### Aliases: illuminaRatv1ARRAYADDRESS illuminaRatv1NUID
> ### illuminaRatv1PROBESEQUENCE illuminaRatv1PROBEQUALITY
> ### illuminaRatv1CODINGZONE illuminaRatv1GENOMICLOCATION
> ### illuminaRatv1GENOMICMATCHSIMILARITY illuminaRatv1SECONDMATCHES
> ### illuminaRatv1SECONDMATCHSIMILARITY
> ### illuminaRatv1TRANSCRIPTOMICMATCHSIMILARITY
> ### illuminaRatv1OTHERGENOMICMATCHES illuminaRatv1REPEATMASK
> ### illuminaRatv1OVERLAPPINGSNP illuminaRatv1ENTREZREANNOTATED
> ### illuminaRatv1ENSEMBLREANNOTATED illuminaRatv1SYMBOLREANNOTATED
> ### illuminaRatv1listNewMappings illuminaRatv1fullReannotation
> ### illuminaRatv1REPORTERGROUPNAME illuminaRatv1REPORTERGROUPID
> ### Keywords: datasets
>
> ### ** Examples
>
>
> ##See what new mappings are available
>
> illuminaRatv1listNewMappings()
illuminaRatv1ARRAYADDRESS()
illuminaRatv1NUID()
illuminaRatv1PROBEQUALITY()
illuminaRatv1CODINGZONE()
illuminaRatv1PROBESEQUENCE()
illuminaRatv1SECONDMATCHES()
illuminaRatv1OTHERGENOMICMATCHES()
illuminaRatv1REPEATMASK()
illuminaRatv1OVERLAPPINGSNP()
illuminaRatv1ENTREZREANNOTATED()
illuminaRatv1GENOMICLOCATION()
illuminaRatv1SYMBOLREANNOTATED()
illuminaRatv1REPORTERGROUPNAME()
illuminaRatv1REPORTERGROUPID()
illuminaRatv1ENSEMBLREANNOTATED()
>
>
> x <- illuminaRatv1PROBEQUALITY
>
> mapped_probes <- mappedkeys(x)
> # Convert to a list
> xx <- as.list(x[mapped_probes])
> if(length(xx) > 0) {
+ # Get the PROBEQUALITY for the first five probes
+ xx[1:5]
+ # Get the first one
+ xx[[1]]
+ }
[1] "Bad"
>
>
> ##Overall table of qualities
> table(unlist(xx))
         Bad        Good     Good***    Good****    No match     Perfect
        6305         367          59         204        1229       11967
  Perfect*** Perfect****
        2646         587
>
>
>
> x <- illuminaRatv1ARRAYADDRESS
>
> mapped_probes <- mappedkeys(x)
> # Convert to a list
> xx <- as.list(x[mapped_probes])
> if(length(xx) > 0) {
+ # Get the ARRAYADDRESS for the first five probes
+ xx[1:5]
+ # Get the first one
+ xx[[1]]
+ }
[1] "1570300"
>
> ##Can do the mapping from array address to illumina ID using a revmap
>
> y<- revmap(illuminaRatv1ARRAYADDRESS)
>
> mapped_probes <- mappedkeys(y)
> # Convert to a list
> yy <- as.list(y[mapped_probes])
> if(length(yy) > 0) {
+ # Get the ARRAYADDRESS for the first five probes
+ yy[1:5]
+ # Get the first one
+ yy[[1]]
+ }
[1] "ILMN_1356720"
>
>
>
> x <- illuminaRatv1CODINGZONE
>
> mapped_probes <- mappedkeys(x)
> # Convert to a list
> xx <- as.list(x[mapped_probes])
> if(length(xx) > 0) {
+ # Get the CODINGZONE for the first five probes
+ xx[1:5]
+ # Get the first one
+ xx[[1]]
+ }
[1] "Intronic"
>
> x <- illuminaRatv1PROBESEQUENCE
>
> mapped_probes <- mappedkeys(x)
> # Convert to a list
> xx <- as.list(x[mapped_probes])
> if(length(xx) > 0) {
+ # Get the PROBESEQUENCE for the first five probes
+ xx[1:5]
+ # Get the first one
+ xx[[1]]
+ }
[1] "GAGAGTTGAGCTTTTCGGCCTATATCCGGCGTGGGCGGAGCAACATCCGT"
>
> dev.off()
null device
1
>