Last data update: 2014.03.03
R: Random sample of clustered data
computeClusterSample | R Documentation |
Random sample of clustered data
Description
Draws a random sample from data previously clustered with computeKmeans.
Usage
computeClusterSample(channel, km, sampleFraction, sampleSize, scaled = FALSE,
includeId = FALSE, test = FALSE)
Arguments
channel |
connection object as returned by odbcConnect.
|
km |
an object of class "toakmeans" obtained with computeKmeans.
|
sampleFraction |
one or more sample fractions to use when sampling the data (multiple
sampling fractions are not yet supported).
|
sampleSize |
total sample size (applies only when sampleFraction is missing).
|
scaled |
logical: indicates whether original (default) or scaled data is returned.
|
includeId |
logical: indicates whether the sample should include the key that uniquely
identifies each data row.
|
test |
logical: if TRUE, only show what would be done without executing it (similar
to the test parameter in the RODBC functions sqlQuery and sqlSave).
|
Value
computeClusterSample returns an object of class "toakmeans" (compatible with class "kmeans").
See Also
computeKmeans
Examples
if(interactive()){
  # initialize connection to Lahman baseball database in Aster
  # (paste0 keeps the connection string free of embedded line breaks)
  conn = odbcDriverConnect(connection=paste0("driver={Aster ODBC Driver};",
    "server=<dbhost>;port=2406;database=<dbname>;uid=<user>;pwd=<pw>"))

  # cluster batting records after 2000 on games (g), runs (r) and hits (h)
  km = computeKmeans(conn, "batting", centers=5, iterMax = 25,
    aggregates = c("COUNT(*) cnt", "AVG(g) avg_g", "AVG(r) avg_r", "AVG(h) avg_h"),
    id="playerid || '-' || stint || '-' || teamid || '-' || yearid",
    include=c('g','r','h'), scaledTableName='kmeans_test_scaled',
    centroidTableName='kmeans_test_centroids',
    where="yearid > 2000")

  # draw a random sample with sampleFraction = 0.01; the result is again
  # an object of class "toakmeans"
  km = computeClusterSample(conn, km, 0.01)
  km

  # pairs plot of the clustered data
  createClusterPairsPlot(km, title="Batters Clustered by G, H, R", ticks=FALSE)
}
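The example above needs a live database connection. To get a feel for how sampleFraction and sampleSize relate without one, the following is a minimal in-memory sketch in base R. It is an illustration only and not how computeClusterSample is implemented. It assumes that sampling is applied per cluster and that a total sample size can be converted to an equivalent fraction. The helper sampleByCluster and the toy data are hypothetical, and base R kmeans stands in for computeKmeans.

```r
# Local illustration only: NOT the implementation of computeClusterSample.
# Assumption: rows are sampled per cluster so every cluster stays represented.
set.seed(42)

# Toy data: two well-separated groups of 100 points each,
# clustered with base R kmeans (stand-in for computeKmeans)
x <- rbind(matrix(rnorm(200, mean = 0), ncol = 2),
           matrix(rnorm(200, mean = 5), ncol = 2))
fit <- kmeans(x, centers = 2)

# Hypothetical helper: draw a fraction of the row indices of each cluster
sampleByCluster <- function(cluster, sampleFraction) {
  idx <- split(seq_along(cluster), cluster)
  unlist(lapply(idx, function(i) {
    # keep at least one row per cluster
    n <- max(1, round(length(i) * sampleFraction))
    i[sample.int(length(i), n)]
  }), use.names = FALSE)
}

# Assumed conversion from a total sample size to a fraction of all rows
sampleSize <- 20
sampleFraction <- sampleSize / nrow(x)

rows <- sampleByCluster(fit$cluster, sampleFraction)
sampled <- data.frame(x[rows, , drop = FALSE], cluster = fit$cluster[rows])

# number of sampled rows per cluster
table(sampled$cluster)
```

Because the per-cluster counts are rounded, the number of rows drawn can differ slightly from the requested sampleSize.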