R Graphical Manual

Browse All

Last data update: 2014.03.03

R: Oja Ranks - Affine Equivariant Multivariate Ranks

ojaRank

R Documentation

Oja Ranks – Affine Equivariant Multivariate Ranks

Description

The function computes the Oja rank of a point x w.r.t. a data set X or, if no point x is given, the Oja ranks of all points in X.

Usage

ojaRank(X, x = NULL, p = NULL, silent = FALSE, na.action = na.fail)

Arguments

`X`	numeric data.frame or matrix containing the data points as rows.
`x`	`NULL` or a numeric vector, the point for which the Oja rank should be computed.
`p`	`NULL` or a number between 0 and 1 which specifies the fraction of hyperplanes to be used for subsampling. If `p = 1`, no subsampling is done. If `p = NULL`, the value of `p` is determined based on the size of the data set. See details.
`silent`	logical, if subsampling is done or the expected computation time is too long, a warning message will be printed unless `silent` is `TRUE`. The default is `FALSE`.
`na.action`	a function which indicates what should happen when the data contain 'NA's. Default is to fail.

Details

The function computes the Oja rank of the point x w.r.t. the data set X or, if no x is specified, the Oja ranks of all data points in X w.r.t. X. For a definition of Oja rank see reference below.

The matrix X needs to have at least as many rows as columns in order to give sensible results. The vector x has to be of length ncol(X). If x is specified, a vector of length ncol(X) is returned. Otherwise the return value is a matrix of the same dimensions as X where the i-th row contains the Oja rank of the i-th row of X.

The function will also work for matrices X with only one column and also vectors. Then centered and normalized (univariate) ranks are returned.

For n = nrow(X) data points in R^k, where k = ncol(X), the computation of the Oja rank necessitates the evaluation of N = choose(n,k) hyperplanes in R^k. Thus for large data sets the function offers a subsampling option in order to deliver (approximate) results within reasonable time. The subsampling fraction is controlled by the parameter p: If p < 1 is passed to the function, the computation is based on a random sample of only pN of all possible N hyperplanes. If p is not specified (which defaults to p = NULL), it is automatically determined based on n and k to yield a sensible trade-off between accuracy and computing time. If choose(n,k)*k^3 < 6e+06, the sample fraction p is set to 1 (no subsampling). Otherwise p is choosen such that the computation (of one rank) usually takes around 20 seconds (on a 1.66 GHz CPU and 1 GB RAM). If all Oja ranks of X are requested, a hyperplane sample is drawn once, all Oja ranks are then computed based on this sample.

Finally, subsampling is feasible. Even for very small p useable results can be expected, see e.g. the examples for the function ojaRCM.

Claudia K<c3><83><c2><b6>llmann is acknowledged for bug-fixing this function.

Value

either a numeric vector, the Oja rank of x, or a matrix of the same dimensions as X containing the Oja ranks of X as rows.

Author(s)

Daniel Vogel, Jyrki M<c3><83><c2><b6>tt<c3><83><c2><b6>nen

References

Oja, H. (1999), Affine invariant multivariate sign and rank tests and corresponding estimates: A review, Scand. J. Statist., 26, 319–343.

Examples

### ----<< Example 1 >>---- : 30 points in R^2
set.seed(123)
X <- rmvnorm(n = 30,mean = c(0,0)) # from package 'mvtnorm'
ojaRank(X)
ojaRank(X, x = c(100,100))
ojaRank(X, x = ojaMedian(X, alg="exact")) # close to zero

# The following two return the same (only in different time)
ojaRank(X)
t(apply(X, 1, function(y){ojaRank(X,y)}))
# but the following two do not (due to different subsampling).
# 1)
set.seed(123); ojaRank(X, p = 0.9, silent = TRUE)
# 2)
set.seed(123) 
t(apply(X, 1, function(y){ojaRank(X, y, p = 0.9, silent = TRUE)}))
# In 1) one subsample for all ranks is drawn, whereas in 2)
# a different sample for each rank is drawn.


### ----<< Example 2 >>---- : three points in R^3: only one hyperplane
# The following commands return the same result.
ojaRank(X = diag(rep(1, 3)), x = c(0,0,0))
ojaRank(X = diag(rep(1, 3)), x = c(-100,-110,-5550))
hyperplane(X = diag(rep(1,3)))[1:3]



### ----<< Example 3 >>---- : 300 points in R^7
# Subsampling is done.
# The following example might take a bit longer:
## Not run: 
set.seed(123)
X <- rmvnorm(n = 300, mean = rep(0, 7))
system.time(or <- ojaRank(x = 1:7, X = X))
# PLEASE NOTE: The computation of the Oja rank is based on a 
# random sub-sample of less than 1% of all possible hyperplanes.
#
#       user      system     elapsed 
#      18.47        0.00       18.47 
print(or,d=4)
# [1]  7.733  6.613  6.839  7.383 18.237 21.851 23.700

## End(Not run)


### ----<< Example 4 >>---- : univariate ranks
ojaRank(1:10)
ojaRank(X = 1:10, x = 5.5)

Results


R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(OjaNP)
Loading required package: ICS
Loading required package: mvtnorm
Loading required package: ICSNP
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/OjaNP/ojaRank.Rd_%03d_medium.png", width=480, height=480)
> ### Name: ojaRank
> ### Title: Oja Ranks - Affine Equivariant Multivariate Ranks
> ### Aliases: ojaRank
> ### Keywords: multivariate nonparametric
> 
> ### ** Examples
> 
> ### ----<< Example 1 >>---- : 30 points in R^2
> set.seed(123)
> X <- rmvnorm(n = 30,mean = c(0,0)) # from package 'mvtnorm'
> ojaRank(X)
             [,1]        [,2]
 [1,] -0.56228243 -0.17795924
 [2,]  0.98048593  0.03176296
 [3,]  0.37565777  0.84091824
 [4,]  0.14494860 -0.72689765
 [5,] -0.69952345 -0.41167858
 [6,]  0.89831793  0.35024385
 [7,]  0.27450234  0.16397828
 [8,] -0.24716321  0.75351274
 [9,]  0.07040030 -0.85071109
[10,]  0.52136138 -0.43239617
[11,] -0.99425381 -0.13395380
[12,] -0.87011671 -0.58489231
[13,] -0.47220638 -0.83785899
[14,]  0.75404098  0.16333734
[15,] -0.78834753  0.38087950
[16,]  0.26894568 -0.39500475
[17,]  0.69727577  0.72548046
[18,]  0.66539344  0.57634894
[19,]  0.47238119 -0.07565708
[20,] -0.25578192 -0.39135486
[21,] -0.74332050 -0.14158425
[22,] -0.58314496  0.58715787
[23,]  0.70740499 -0.52904203
[24,] -0.38067106 -0.51466130
[25,]  0.71958632 -0.09646419
[26,]  0.06103402 -0.04257788
[27,]  0.12542546  0.70658736
[28,] -0.07891634  0.72498679
[29,] -0.99274170  0.14495552
[30,] -0.06869210  0.19254429
> ojaRank(X, x = c(100,100))
[1] 0.8792920 0.7312051
> ojaRank(X, x = ojaMedian(X, alg="exact")) # close to zero
[1] 0.003498390 0.002613336
> 
> # The following two return the same (only in different time)
> ojaRank(X)
             [,1]        [,2]
 [1,] -0.56228243 -0.17795924
 [2,]  0.98048593  0.03176296
 [3,]  0.37565777  0.84091824
 [4,]  0.14494860 -0.72689765
 [5,] -0.69952345 -0.41167858
 [6,]  0.89831793  0.35024385
 [7,]  0.27450234  0.16397828
 [8,] -0.24716321  0.75351274
 [9,]  0.07040030 -0.85071109
[10,]  0.52136138 -0.43239617
[11,] -0.99425381 -0.13395380
[12,] -0.87011671 -0.58489231
[13,] -0.47220638 -0.83785899
[14,]  0.75404098  0.16333734
[15,] -0.78834753  0.38087950
[16,]  0.26894568 -0.39500475
[17,]  0.69727577  0.72548046
[18,]  0.66539344  0.57634894
[19,]  0.47238119 -0.07565708
[20,] -0.25578192 -0.39135486
[21,] -0.74332050 -0.14158425
[22,] -0.58314496  0.58715787
[23,]  0.70740499 -0.52904203
[24,] -0.38067106 -0.51466130
[25,]  0.71958632 -0.09646419
[26,]  0.06103402 -0.04257788
[27,]  0.12542546  0.70658736
[28,] -0.07891634  0.72498679
[29,] -0.99274170  0.14495552
[30,] -0.06869210  0.19254429
> t(apply(X, 1, function(y){ojaRank(X,y)}))
             [,1]        [,2]
 [1,] -0.56228243 -0.17795924
 [2,]  0.98048593  0.03176296
 [3,]  0.37565777  0.84091824
 [4,]  0.14494860 -0.72689765
 [5,] -0.69952345 -0.41167858
 [6,]  0.89831793  0.35024385
 [7,]  0.27450234  0.16397828
 [8,] -0.24716321  0.75351274
 [9,]  0.07040030 -0.85071109
[10,]  0.52136138 -0.43239617
[11,] -0.99425381 -0.13395380
[12,] -0.87011671 -0.58489231
[13,] -0.47220638 -0.83785899
[14,]  0.75404098  0.16333734
[15,] -0.78834753  0.38087950
[16,]  0.26894568 -0.39500475
[17,]  0.69727577  0.72548046
[18,]  0.66539344  0.57634894
[19,]  0.47238119 -0.07565708
[20,] -0.25578192 -0.39135486
[21,] -0.74332050 -0.14158425
[22,] -0.58314496  0.58715787
[23,]  0.70740499 -0.52904203
[24,] -0.38067106 -0.51466130
[25,]  0.71958632 -0.09646419
[26,]  0.06103402 -0.04257788
[27,]  0.12542546  0.70658736
[28,] -0.07891634  0.72498679
[29,] -0.99274170  0.14495552
[30,] -0.06869210  0.19254429
> # but the following two do not (due to different subsampling).
> # 1)
> set.seed(123); ojaRank(X, p = 0.9, silent = TRUE)
             [,1]        [,2]
 [1,] -0.55802960 -0.15266959
 [2,]  0.95149804  0.01557666
 [3,]  0.36492913  0.85347230
 [4,]  0.15544607 -0.73120690
 [5,] -0.70292529 -0.38090489
 [6,]  0.86221221  0.34550849
 [7,]  0.25435541  0.17666418
 [8,] -0.22770509  0.76311443
 [9,]  0.09043469 -0.85178895
[10,]  0.52125782 -0.42575411
[11,] -0.97219538 -0.12312685
[12,] -0.86403094 -0.58119067
[13,] -0.47721156 -0.83665374
[14,]  0.71443743  0.15646975
[15,] -0.75917951  0.38658504
[16,]  0.27521202 -0.37978103
[17,]  0.64835631  0.74712446
[18,]  0.61921478  0.59199352
[19,]  0.45785554 -0.07644911
[20,] -0.25531108 -0.37217913
[21,] -0.72017945 -0.12182411
[22,] -0.56036043  0.59334621
[23,]  0.69097806 -0.54700704
[24,] -0.37151514 -0.50083834
[25,]  0.68576666 -0.10239716
[26,]  0.05612722 -0.02715184
[27,]  0.11098299  0.71145994
[28,] -0.08027701  0.72946955
[29,] -0.96704771  0.15048485
[30,] -0.08554257  0.22117992
> # 2)
> set.seed(123) 
> t(apply(X, 1, function(y){ojaRank(X, y, p = 0.9, silent = TRUE)}))
             [,1]        [,2]
 [1,] -0.55802960 -0.15266959
 [2,]  0.99218753  0.05502594
 [3,]  0.39624887  0.84608702
 [4,]  0.15678627 -0.72498381
 [5,] -0.67453509 -0.41070044
 [6,]  0.86651591  0.35037156
 [7,]  0.27575847  0.16208564
 [8,] -0.25591875  0.75496747
 [9,]  0.07808861 -0.86158873
[10,]  0.55759710 -0.42828275
[11,] -0.97554779 -0.10002450
[12,] -0.86590098 -0.57683292
[13,] -0.46155176 -0.84726202
[14,]  0.72618622  0.15258497
[15,] -0.77951527  0.38306997
[16,]  0.25237210 -0.40842626
[17,]  0.70421832  0.73208227
[18,]  0.67453583  0.57203996
[19,]  0.45476981 -0.06638932
[20,] -0.25739940 -0.39966981
[21,] -0.75137051 -0.12424319
[22,] -0.57129219  0.59650076
[23,]  0.68232088 -0.56446510
[24,] -0.35339958 -0.50187433
[25,]  0.73201306 -0.08733235
[26,]  0.07384314 -0.03424140
[27,]  0.13295764  0.68870267
[28,] -0.06923768  0.70016491
[29,] -1.02213127  0.13234891
[30,] -0.08592449  0.18621720
> # In 1) one subsample for all ranks is drawn, whereas in 2)
> # a different sample for each rank is drawn.
> 
> 
> ### ----<< Example 2 >>---- : three points in R^3: only one hyperplane
> # The following commands return the same result.
> ojaRank(X = diag(rep(1, 3)), x = c(0,0,0))
[1] -1 -1 -1
> ojaRank(X = diag(rep(1, 3)), x = c(-100,-110,-5550))
[1] -1 -1 -1
> hyperplane(X = diag(rep(1,3)))[1:3]
[1] -1 -1 -1
> 
> 
> 
> ### ----<< Example 3 >>---- : 300 points in R^7
> # Subsampling is done.
> # The following example might take a bit longer:
> ## Not run: 
> ##D set.seed(123)
> ##D X <- rmvnorm(n = 300, mean = rep(0, 7))
> ##D system.time(or <- ojaRank(x = 1:7, X = X))
> ##D # PLEASE NOTE: The computation of the Oja rank is based on a 
> ##D # random sub-sample of less than 1% of all possible hyperplanes.
> ##D #
> ##D #       user      system     elapsed 
> ##D #      18.47        0.00       18.47 
> ##D print(or,d=4)
> ##D # [1]  7.733  6.613  6.839  7.383 18.237 21.851 23.700
> ## End(Not run)
> 
> 
> ### ----<< Example 4 >>---- : univariate ranks
> ojaRank(1:10)
      [,1]
 [1,] -0.9
 [2,] -0.7
 [3,] -0.5
 [4,] -0.3
 [5,] -0.1
 [6,]  0.1
 [7,]  0.3
 [8,]  0.5
 [9,]  0.7
[10,]  0.9
> ojaRank(X = 1:10, x = 5.5)
[1] -1.110223e-17
> 
> 
> 
> 
> 
> dev.off()
null device 
          1 
>