RVPedigree(method = "ASKAT", y = NULL, X = NULL, Phi = NULL,
filename = NULL, type = "bed", regions = NULL, weights = NULL,
Nperm = 100, pvalThreshold = 0.1, VCC3afterVCC1 = FALSE, Ncores = 1)
Arguments
method
character, selects the method to use for the
association testing. Can be one of the following:
"ASKAT" (default)
"NASKAT", normalized ASKAT
"VCC1", VC-C1
"VCC2", VC-C2
"VCC3", VC-C3
y
vector of phenotype data (one entry per individual), of
length n.
X
matrix of covariates including intercept (dimension:
n x p, with p the number of covariates)
Phi
Relationship matrix (i.e. twice the kinship matrix); an
n x n square symmetric positive-definite matrix.
filename
character, path to input file containing haplotype data
type
character, format of the input file containing haplotype
data: 'ped', 'bed' (default) or 'shapeit-haps'
regions
a data frame with details of the genomic regions in
which the association test specified by the method
parameter should be run. The data frame should have one row
per region and (at least) four columns with the following
names:
Name: Name of the region (e.g. Gene 01)
Chr: Chromosome on which the region is located.
StartPos: The base pair coordinate at which the
region starts
EndPos: The base pair coordinate at which the
region ends.
Any other columns will be ignored.
weights
optional numeric vector of genotype weights. If
this option is not specified, the beta distribution is used
for weighting the variants, with each weight given by
w_i = dbeta(f_i, 1, 25)^2, with f_i the minor
allele frequency (MAF) of variant i. This default is the
same as used by the
SKAT
package. This vector is used as the diagonal of the
m x m matrix W, with m the number of
variants.
Nperm
(integer) The number of permutations to be done to
calculate the empirical p-value if the VCC2 or VCC3 method is
used. For other methods this parameter is ignored (default:
100).
pvalThreshold
(numeric) Threshold for the association
p-value. Regions with a p-value below this threshold will not
be present in the output data frame (default: 0.1).
VCC3afterVCC1
(logical) Boolean value that indicates
whether the VC-C3 method should automatically be run on the
variants passing the p-value threshold set using the
pvalThreshold parameter (default: FALSE).
Ncores
(integer) Number of processor (CPU) cores to be used
in parallel when running the association analysis. If
the number of regions is larger than the number of cores, then
each region uses at most one core. If the number of
cores is larger than the number of regions and the VCC2 or
VCC3 methods are selected, the remaining cores are distributed
among the regions to parallelize the permutations used to
determine the p-value (default: 1).
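As a rough illustration of the argument shapes described above, the following sketch builds toy inputs in base R. All values are made up, the kinship matrix assumes unrelated, non-inbred individuals, and the column types chosen for the regions data frame (for example numeric Chr) are assumptions rather than requirements stated in this page.

```r
# Toy inputs for illustration only; all values below are made up.
set.seed(1)
n <- 6                                   # number of individuals

# y: phenotype vector of length n
y <- rnorm(n)

# X: covariate matrix including the intercept column (n x p)
covars <- data.frame(age = c(34, 51, 47, 29, 62, 40),
                     sex = factor(c("F", "M", "F", "M", "F", "M")))
X <- model.matrix(~ age + sex, data = covars)

# Phi: relationship matrix, i.e. twice the kinship matrix.
# Assumption: unrelated, non-inbred individuals (self-kinship 0.5),
# so Phi is the identity matrix.
kinship <- diag(0.5, n)
Phi <- 2 * kinship
stopifnot(isSymmetric(Phi),
          all(eigen(Phi, symmetric = TRUE)$values > 0))

# regions: one row per region, with the four required column names
regions <- data.frame(Name     = c("Gene 01", "Gene 02"),
                      Chr      = c(1, 1),
                      StartPos = c(10000, 50000),
                      EndPos   = c(20000, 65000),
                      stringsAsFactors = FALSE)

# weights: the documented default, w_i = dbeta(f_i, 1, 25)^2,
# computed here from hypothetical minor allele frequencies
maf <- c(0.001, 0.005, 0.01, 0.03)
weights <- dbeta(maf, 1, 25)^2
W <- diag(weights)                       # m x m diagonal weight matrix
```

Because dbeta(f, 1, 25) decreases as f grows, rarer variants receive larger weights under this default.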
Details
The RVPedigree function is the main function of the RVPedigree
package.
Under the hood this function calls ASKAT.region,
NormalizedASKAT.region,
VCC1.region, VCC2.region or
VCC3.region, depending on the method
parameter specified by the user.
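A call using the documented arguments might be wrapped as in the sketch below. The wrapper name run_region_test and its argument genofile are hypothetical, and the path passed as genofile (including whether a file extension is expected) is left to the caller, since this page does not specify it.

```r
# Thin wrapper around the documented call; nothing runs until it is invoked.
# `run_region_test` and `genofile` are hypothetical names for illustration.
run_region_test <- function(y, X, Phi, genofile, regions,
                            method = "ASKAT", Ncores = 1) {
  # Restrict `method` to the values listed under Arguments
  method <- match.arg(method,
                      c("ASKAT", "NASKAT", "VCC1", "VCC2", "VCC3"))
  RVPedigree::RVPedigree(method = method, y = y, X = X, Phi = Phi,
                         filename = genofile, type = "bed",
                         regions = regions, Ncores = Ncores)
}
```

Validating method up front means a misspelled method name fails immediately, before any genotype data are read.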
Value
A data frame containing results of the association test
specified by the method parameter for each region in
the data frame specified by the regions parameter. The
output data frame contains the following columns:
Score.Test: the score of the given association test
P.value: the p-value of the association test
N.Markers: the number of markers in the region
regionname: name of the region/gene on which the
association test was run
Note that regions that do not contain any genetic variants
will be removed from the output.
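To show how the returned data frame could be handled, the sketch below uses a mock result with the documented column names. The numbers are invented and do not come from a real analysis, and the 0.05 cut-off is an arbitrary choice for the example.

```r
# Mock result with the documented column names; the numbers are invented.
res <- data.frame(Score.Test = c(3.1, 12.4, 7.8),
                  P.value    = c(0.08, 0.002, 0.03),
                  N.Markers  = c(5, 14, 9),
                  regionname = c("Gene 02", "Gene 01", "Gene 03"),
                  stringsAsFactors = FALSE)

# Order regions from smallest to largest p-value
res <- res[order(res$P.value), ]

# Keep only regions below a stricter, user-chosen cut-off
top <- res[res$P.value < 0.05, c("regionname", "P.value", "N.Markers")]
top
```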