Load raw data from a text file produced by image analysis and convert it to an arrayCGH
object, using additional information about the array design.
Supported file types are GenePix Results files (.gpr), outputs from
SPOT, or any text file with appropriate fields "Row" and "Column" and
a specified array design.
file
a connection or character string giving the name of the
file to import
var.names
a vector of variable names used to compute the array
design. If the default is not
overridden, it is set to c("Block", "Column", "Row", "X", "Y") for
gpr files, c("Arr.colx", "Arr.rowy", "Spot.colx", "Spot.rowy") for
SPOT files, and c("Col", "Row") for other text files
spot.names
a list with spot-level variable names to be added to
arrayCGH$arrayValues
clone.names
a list with clone-level variable names to be added
to arrayCGH$cloneValues (only used in case of within-slide replicates)
type
a character value specifying the type of input file:
currently .gpr files ("gpr"), spot files ("spot") and other text
files with fields 'Col' and 'Row' ("default") are supported
id.rep
index of the replicate identifier (e.g. the name of the clone) in the
vector(clone.names)
design
a numeric vector of length 4 specifying the array design as
number of blocks per column, number of blocks per row, number of columns per block, number of
rows per block. This field is mandatory
for "default" text files, optional for "gpr" files, and not used for
"spot" files
add.lines
boolean value to handle the case when array design
does not match number of lines. If TRUE, empty lines are added; if
FALSE, execution is stopped
...
additional import parameters (e.g. 'sep=' or 'comment.char=') to be passed to the read.delim
function. Note that the argument as.is=TRUE is always passed to
read.delim, in order to avoid inappropriate conversion of character
vectors to factors
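The way design and add.lines interact can be sketched in a few lines of R. This is a minimal illustration, not MANOR's own code: the helper pad.to.design, the toy design vector and the toy data frame are all hypothetical, and filling the added rows with NA is only an assumption about what "empty lines" means.

```r
# Hypothetical helper (not part of MANOR): compare the number of lines read
# with the number of spots implied by a design vector of length 4.
pad.to.design <- function(dat, design, add.lines = FALSE) {
  n.expected <- prod(design)            # total number of spots on the array
  n.missing <- n.expected - nrow(dat)
  if (n.missing == 0) return(dat)
  if (n.missing < 0 || !add.lines)
    stop("number of lines does not match array design")
  # Assumption: an "empty line" is a row filled with NA
  pad <- as.data.frame(lapply(dat, function(x) rep(NA, n.missing)))
  rbind(dat, pad)
}

design <- c(2, 2, 3, 4)                 # toy design: 2 * 2 * 3 * 4 = 48 spots
dat <- data.frame(Col = rep(1:6, times = 8), Row = rep(1:8, each = 6))
dat <- dat[-c(5, 17, 30), ]             # simulate three missing lines
padded <- pad.to.design(dat, design, add.lines = TRUE)
nrow(padded)
```

With add.lines = FALSE the same call stops with an error, which mirrors the behaviour described for the add.lines argument.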
Details
Mandatory elements of arrayCGH objects are the array design and the x and y
absolute coordinates of each spot on the array. Output files
from SPOT contain x and y relative coordinates of each spot within a
block, as well as block coordinates on the array; one can therefore
easily construct the corresponding arrayCGH
object.
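As an illustration of the paragraph above, absolute coordinates could be derived from SPOT-style relative coordinates as follows. This is a sketch under stated assumptions, not the code used by import: it assumes that Arr.colx and Arr.rowy are 1-based block indices, that Spot.colx and Spot.rowy are 1-based positions within a block, and that every block is complete. The toy table and the output column names Col and Row are hypothetical.

```r
# Toy SPOT-like table: 2 x 2 blocks, each with 3 columns and 2 rows of spots.
spot <- expand.grid(Spot.colx = 1:3, Spot.rowy = 1:2,
                    Arr.colx = 1:2, Arr.rowy = 1:2)

# Block size, taken here from the data (assumes every block is complete)
n.spot.col <- max(spot$Spot.colx)
n.spot.row <- max(spot$Spot.rowy)

# Absolute coordinates: offset of the block plus position within the block
spot$Col <- (spot$Arr.colx - 1) * n.spot.col + spot$Spot.colx
spot$Row <- (spot$Arr.rowy - 1) * n.spot.row + spot$Spot.rowy

# Each spot should get a distinct (Col, Row) pair; 0 means no duplicates
anyDuplicated(spot[, c("Col", "Row")])
```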
.gpr files currently only contain x and y relative coordinates of each
spot within a block, and block index with no specification of the
spatial block design: if block design is not specified by user, we
compute it using the real pixel locations of each spot (X and Y variables in
usual .gpr files).
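One way the block layout might be recovered from pixel locations is sketched below. This is not MANOR's actual algorithm, which is not described here. The sketch assumes that blocks lying in the same row of blocks have nearly identical mean Y positions, and that the vertical gap between rows of blocks exceeds the height of one block. The simulated table and all pixel values are hypothetical.

```r
# Simulated gpr-like table: 3 blocks across, 2 blocks down,
# each block holding 3 columns and 2 rows of spots (pixel values are made up).
layout <- expand.grid(Column = 1:3, Row = 1:2, block.col = 1:3, block.row = 1:2)
gpr <- data.frame(
  Block  = (layout$block.row - 1) * 3 + layout$block.col,
  Column = layout$Column,
  Row    = layout$Row,
  X      = 1000 + (layout$block.col - 1) * 500 + (layout$Column - 1) * 100,
  Y      = 2000 + (layout$block.row - 1) * 400 + (layout$Row - 1) * 100
)

# Mean vertical pixel position of each block
block.y <- tapply(gpr$Y, gpr$Block, mean)

# Height of one block in pixels, used as a tolerance
block.height <- diff(range(gpr$Y[gpr$Block == 1]))

# Blocks whose mean Y is close to that of block 1 share its row of blocks
blocks.across <- sum(abs(block.y - block.y[1]) <= block.height)
blocks.down <- length(block.y) / blocks.across

c(blocks.across = blocks.across, blocks.down = blocks.down)
```

Passing design explicitly avoids relying on any such inference when the layout is already known.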
If clone.names is provided, an additional data frame is created with
clone-level information (e.g. clone names, positions,
chromosomes, quality marks), aggregated from array-level information
using the identifier specified by id.rep. This identifier is also
added to the arrayCGH object created, with name 'id.rep'.
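The aggregation from spot level to clone level can be pictured with a small example. This is a hedged sketch, not the code used by import: the toy data are hypothetical, the result is a plain list that only mimics the arrayValues, cloneValues and id.rep components named above, and collapsing replicates with unique is an assumption about how the aggregation is done.

```r
# Toy spot-level table with within-slide replicates (values are made up)
spots <- data.frame(
  Clone      = c("cloneA", "cloneA", "cloneB", "cloneB", "cloneC"),
  Chromosome = c(1, 1, 2, 2, 3),
  Position   = c(100, 100, 250, 250, 400),
  LogRatio   = c(0.1, 0.3, -0.2, -0.4, 0.0),
  stringsAsFactors = FALSE
)

clone.names <- c("Clone", "Chromosome", "Position")
id.rep <- 1                      # the replicate identifier is clone.names[1]
id <- clone.names[id.rep]

# Assumption: clone-level variables are constant across replicates,
# so one row per clone is obtained by dropping duplicated rows
clones <- unique(spots[, clone.names])

# Plain list mimicking the components named in the text (not a real arrayCGH)
acgh <- list(arrayValues = spots, cloneValues = clones, id.rep = id)
acgh$cloneValues
```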
Due to space limitations, only the first 100 lines of sample 'gpr' and
'spot' files are given in the standard distribution of
MANOR. Complete files are available at http://bioinfo.curie.fr/projects/manor/index.html
Value
an object of class arrayCGH
Note
People interested in tools for array-CGH analysis can
visit our web-page: http://bioinfo.curie.fr.
> library(MANOR)
Loading required package: GLAD
Attaching package: 'MANOR'
The following object is masked from 'package:base':
norm
> png(filename="/home/ddbj/snapshot/RGM3/R_BC/result/MANOR/import.Rd_%03d_medium.png", width=480, height=480)
> ### Name: import
> ### Title: Import raw file to an arrayCGH object
> ### Aliases: import import.default.aux import.gpr.aux import.spot.aux
> ### Keywords: IO file
>
> ### ** Examples
>
> dir.in <- system.file("extdata", package="MANOR")
>
> ## import from 'spot' files
> spot.names <- c("LogRatio", "RefFore", "RefBack", "DapiFore", "DapiBack", "SpotFlag", "ScaledLogRatio")
> clone.names <- c("PosOrder", "Chromosome")
> edge <- import(paste(dir.in, "/edge.txt", sep=""), type="spot",
+ spot.names=spot.names, clone.names=clone.names, add.lines=TRUE)
[1] "number of lines does not match array design: adding empty lines..."
>
> ## import from 'gpr' files
> spot.names <- c("Clone", "FLAG", "TEST_B_MEAN", "REF_B_MEAN",
+ "TEST_F_MEAN", "REF_F_MEAN", "ChromosomeArm")
> clone.names <- c("Clone", "Chromosome", "Position", "Validation")
>
> ac <- import(paste(dir.in, "/gradient.gpr", sep=""), type="gpr",
+ spot.names=spot.names, clone.names=clone.names, sep="\t",
+ comment.char="@", add.lines=TRUE)
[1] "number of lines does not match array design: adding empty lines..."
[1] "calculating array design..."
>
> dev.off()
null device
1
>