R: Write ffdf data to a database table by using a DBI...
write.dbi.ffdf
R Documentation
Write ffdf data to a database table by using a DBI connection.
Description
Write ffdf data to a database table by using a DBI connection.
This can for example be used to store large ffdf datasets from R in
Oracle, SQLite, MySQL, PostgreSQL, Hive or other SQL databases.
Mark that for very large datasets, these SQL databases might have tools to speed up by bulk loading.
You might also consider that as an alternative to using this procedure.
character string with the name of the table to store the data in. Passed on to dbWriteTable.
dbConnect.args
a list of arguments to pass to DBI's dbConnect (like drv, dbname, username, password). See the examples.
RECORDBYTES
optional integer scalar representing the bytes needed to process a single row of the ffdf
BATCHBYTES
integer: bytes allowed for the size of the data.frame storing the result of reading one chunk.
See documentation in read.table.ffdf for more details.
by
integer passed on to chunk indicating to write to the database in chunks of this size. Overwrites
the behaviour of BATCHBYTES and RECORDBYTES.
VERBOSE
logical: TRUE to verbose timings for each processed chunk (default FALSE).
...
optional parameters passed on to dbWriteTable
Details
Opens up the DBI connection using DBI::dbConnect, writes data to the SQL table
using DBI::dbWriteTable by extracting the data in batches from the ffdf
and appending them to the table.
Value
invisible()
See Also
dbWriteTable, chunk
Examples
require(ff)
##
## Example query using data in sqlite
##
require(RSQLite)
dbfile <- system.file("smalldb.sqlite3", package="ETLUtils")
drv <- dbDriver("SQLite")
query <- "select * from testdata limit 10000"
x <- read.dbi.ffdf(query = query, dbConnect.args = list(drv = drv, dbname = dbfile),
first.rows = 100, next.rows = 1000, VERBOSE=TRUE)
write.dbi.ffdf(x = x, name = "helloworld", row.names = FALSE, overwrite = TRUE,
dbConnect.args = list(drv = drv, dbname = dbfile),
by = 1000, VERBOSE=TRUE)
test <- read.dbi.ffdf(query = "select * from helloworld",
dbConnect.args = list(drv = drv, dbname = dbfile))
## Not run:
require(ROracle)
write.dbi.ffdf(x = x, name = "hellooracle", row.names = FALSE, overwrite = TRUE,
dbConnect.args = list(drv = dbDriver("Oracle"),
user = "YourUser", password = "YourPassword", dbname = "Mydatabase"),
VERBOSE=TRUE)
## End(Not run)
Results
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(ETLUtils)
Loading required package: ff
Loading required package: bit
Attaching package bit
package:bit (c) 2008-2012 Jens Oehlschlaegel (GPL-2)
creators: bit bitwhich
coercion: as.logical as.integer as.bit as.bitwhich which
operator: ! & | xor != ==
querying: print length any all min max range sum summary
bit access: length<- [ [<- [[ [[<-
for more help type ?bit
Attaching package: 'bit'
The following object is masked from 'package:base':
xor
Attaching package ff
- getOption("fftempdir")=="/tmp/RtmpxFPLsO"
- getOption("ffextension")=="ff"
- getOption("ffdrop")==TRUE
- getOption("fffinonexit")==TRUE
- getOption("ffpagesize")==65536
- getOption("ffcaching")=="mmnoflush" -- consider "ffeachflush" if your system stalls on large writes
- getOption("ffbatchbytes")==16777216 -- consider a different value for tuning your system
- getOption("ffmaxbytes")==536870912 -- consider a different value for tuning your system
Attaching package: 'ff'
The following objects are masked from 'package:bit':
clone, clone.default, clone.list
The following objects are masked from 'package:utils':
write.csv, write.csv2
The following objects are masked from 'package:base':
is.factor, is.ordered
> png(filename="/home/ddbj/snapshot/RGM3/R_CC/result/ETLUtils/write.dbi.ffdf.Rd_%03d_medium.png", width=480, height=480)
> ### Name: write.dbi.ffdf
> ### Title: Write ffdf data to a database table by using a DBI connection.
> ### Aliases: write.dbi.ffdf
>
> ### ** Examples
>
> require(ff)
>
> ##
> ## Example query using data in sqlite
> ##
> require(RSQLite)
Loading required package: RSQLite
Loading required package: DBI
> dbfile <- system.file("smalldb.sqlite3", package="ETLUtils")
> drv <- dbDriver("SQLite")
> query <- "select * from testdata limit 10000"
> x <- read.dbi.ffdf(query = query, dbConnect.args = list(drv = drv, dbname = dbfile),
+ first.rows = 100, next.rows = 1000, VERBOSE=TRUE)
read.dbi.ffdf 1..100 (100) dbi-read=0.002sec ffdf-write=0.015sec
read.dbi.ffdf 101..1100 (1000) dbi-read=0.002sec ffdf-write=0.006sec
read.dbi.ffdf 1101..2100 (1000) dbi-read=0.002sec ffdf-write=0.004sec
read.dbi.ffdf 2101..2860 (760) dbi-read=0.001sec ffdf-write=0.007sec
dbi-read=0.007sec ffdf-write=0.032sec TOTAL=0.039sec
>
> write.dbi.ffdf(x = x, name = "helloworld", row.names = FALSE, overwrite = TRUE,
+ dbConnect.args = list(drv = drv, dbname = dbfile),
+ by = 1000, VERBOSE=TRUE)
2016-07-04 17:38:11 dbWriteTable chunk 1/3
2016-07-04 17:38:11 dbWriteTable chunk 2/3
2016-07-04 17:38:11 dbWriteTable chunk 3/3
> test <- read.dbi.ffdf(query = "select * from helloworld",
+ dbConnect.args = list(drv = drv, dbname = dbfile))
>
> ## Not run:
> ##D require(ROracle)
> ##D write.dbi.ffdf(x = x, name = "hellooracle", row.names = FALSE, overwrite = TRUE,
> ##D dbConnect.args = list(drv = dbDriver("Oracle"),
> ##D user = "YourUser", password = "YourPassword", dbname = "Mydatabase"),
> ##D VERBOSE=TRUE)
> ## End(Not run)
>
>
>
>
>
> dev.off()
null device
1
>