This routine returns a matrix or data frame containing all the unique rows of the
matrix or data frame supplied as its argument. That is, all the duplicate rows are
stripped out. Note that the ordering of the rows on exit is not the same
as on entry. It also returns an index attribute for relating the result back
to the original matrix.
Usage
uniquecombs(x)
Arguments
x
is an R matrix (numeric), or data frame.
Details
Models with more parameters than unique combinations of
covariates are not identifiable. This routine provides a means of
evaluating the number of unique combinations of covariates in a
model. The routine calls compiled C code, and is based on sorting,
with consequent O(n log(n)) cost. In principle a hash table based solution
should be only O(n).
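As a rough illustration of the identifiability check described above, the sketch below counts the unique covariate combinations and compares that count with a parameter count. The data frame `dat` and the value of `n_par` are hypothetical, chosen only for illustration. Having no more parameters than unique combinations is a necessary condition for identifiability, not a sufficient one.

```r
library(mgcv)

## Hypothetical covariates: 6 observations, but only 2 distinct (x1, x2) pairs
dat <- data.frame(x1 = c(1, 1, 2, 2, 1, 2),
                  x2 = c(0, 0, 1, 1, 0, 1))

n_unique <- nrow(uniquecombs(dat))  # number of unique covariate combinations
n_par <- 3                          # hypothetical: intercept plus two slopes

## FALSE here means the model cannot be identifiable
possibly_identifiable <- n_par <= n_unique
possibly_identifiable
```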
unique and duplicated can sometimes be used
in place of this routine if the full index is not needed. Relative performance is variable.
If x is not a matrix or data frame on entry then an attempt is made to coerce
it to a data frame.
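For comparison, here is a minimal base R sketch of the unique and duplicated approach mentioned above. It does not use mgcv. The helper `key` is hypothetical and builds one string per row so that an index can be recovered with match. It is not how uniquecombs works internally, and string keys may be unreliable for non-integer doubles.

```r
## Small matrix with repeated rows
X <- matrix(c(1,2,3, 1,2,3, 4,5,6, 1,3,2, 4,5,6, 1,1,1), 6, 3, byrow = TRUE)

Xu <- unique(X)             # unique rows, in order of first appearance
n_dup <- sum(duplicated(X)) # number of rows that repeat an earlier row

## Hypothetical helper: collapse each row into a single string key
key <- function(m) apply(m, 1, paste, collapse = "\r")

## ind[i] is the row of Xu that holds row i of X
ind <- match(key(X), key(Xu))
```

Here unique alone is enough if only the distinct rows are wanted; the extra matching step is only needed to relate them back to the original rows.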
Value
A matrix or data frame consisting of the unique rows of x (in arbitrary order).
The matrix or data frame has an "index" attribute. index[i] gives the row of the returned
matrix that contains row i of the original matrix.
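The sketch below shows one way the "index" attribute might be used: rebuilding the original matrix from the unique rows, and counting how many original rows map to each unique row. The variable names are illustrative only.

```r
library(mgcv)

X <- matrix(c(1,2,3, 1,2,3, 4,5,6, 1,3,2, 4,5,6, 1,1,1), 6, 3, byrow = TRUE)
Xu <- uniquecombs(X)
ind <- attr(Xu, "index")

## Row i of X equals row ind[i] of Xu, so indexing Xu by ind rebuilds X
rebuilt_ok <- all(Xu[ind, , drop = FALSE] == X)

## Number of original rows mapped to each row of Xu
counts <- tabulate(ind, nbins = nrow(Xu))
```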
Examples
require(mgcv)
## matrix example...
X <- matrix(c(1,2,3,1,2,3,4,5,6,1,3,2,4,5,6,1,1,1),6,3,byrow=TRUE)
print(X)
Xu <- uniquecombs(X);Xu
ind <- attr(Xu,"index")
## find the value for row 3 of the original from Xu
Xu[ind[3],];X[3,]
## data frame example...
df <- data.frame(f=factor(c("er",3,"b","er",3,3,1,2,"b")),
x=c(.5,1,1.4,.5,1,.6,4,3,1.7))
uniquecombs(df)