Play sqldf with RPostgreSQL
This post is obsolete, the content in this post is not correct, please see Gabor’s comment.
In the previous post Manipulate data in R via SQL interface, I introduced how to use RpgSQL as DB driver to interface sqldf and PostgreSQL.
But there are some problems when using RpgSQL.
> options(sqldf.driver = “pgSQL”)
> sqldf(“select Species from iris where Species like ‘set%’ limit 3″)
Error in .verify.JDBC.result(r, “Unable to retrieve JDBC result set for “, :
Unable to retrieve JDBC result set for select Species from iris where Species like ‘set%’ limit 3 (ERROR: column “species” does not exist
Position: 8)
>
And RPostgreSQL works fine
> options(sqldf.driver = “PostgreSQL”)
> sqldf(“select Species from iris limit 3″)
Species
1 setosa
2 setosa
3 setosa
> sqldf(“select Species from iris where Species like ‘set%’ limit 3″)
Species
1 setosa
2 setosa
3 setosa
If you have trouble in installing RPostgreSQL under Mac OSX, please refer to Install R Package RPostgreSQL in Mac OSX.
- sqldf does not support the RPostgresSQL driver since it lacks certain capabilities that are needed. If you don’t specify H2 or pgSQL as a driver then it defaults to SQLite so your PostgresSQL example was actually using SQLite.
- also you don’t have to specify the driver option at all. Just load the RpgSQL package, i.e. library(RpgSL), and sqldf will notice that the package is there and assume you wanted to use PostgresSQL instead of SQLite.
- make sure that you are using “postgres” as your user on the postgres database, password of “” and dbname of “test”. That’s probably easiest but if you must change them you can change the last two by using the RpgSQL.password and RpgSQL.dbname options of the RpgSQL driver — see ?pgSQL in the RpgSQL package. I have something like this in my .Rprofile:
options(RpgSQL.password = “mypassword”, RpgSQL.dbname = “mydb”)
- if you want to use mixed case column names in PostgresSQL SQL statements such as Species they must be within double quotes.
So, on a R fresh session just do the following where the last example below shows that it is using PostgresSQL since that example won’t work in SQLite.
> library(sqldf) Loading required package: DBI Loading required package: RSQLite Loading required package: RSQLite.extfuns Loading required package: gsubfn Loading required package: proto Loading required package: chron > library(RpgSQL) Loading required package: RJDBC Loading required package: rJava > > sqldf('select "Species" from iris where "Species" like \'set%\' limit 3') Loading required package: tcltk Loading Tcl/Tk interface ... done Species 1 setosa 2 setosa 3 setosa > sqldf('select regr_slope(demand, "Time") slope, + regr_intercept(demand, "Time") intercept, + corr(demand, "Time") corr from BOD') slope intercept corr 1 1.721429 8.521429 0.8030693Thanks Gabor, RpgSQL works fine now.