Home > PostgreSQL, R, Statistics > Play sqldf with RPostgreSQL

Play sqldf with RPostgreSQL

This post is obsolete, the content in this post is not correct, please see Gabor’s comment.

 

In the previous post Manipulate data in R via SQL interface, I introduced how to use RpgSQL as DB driver to interface sqldf and PostgreSQL.

But there are some problems when using RpgSQL.

> options(sqldf.driver = “pgSQL”)
> sqldf(“select Species from iris where Species like ‘set%’ limit 3″)
Error in .verify.JDBC.result(r, “Unable to retrieve JDBC result set for “, :
Unable to retrieve JDBC result set for select Species from iris where Species like ‘set%’ limit 3 (ERROR: column “species” does not exist
Position: 8)
>

And RPostgreSQL works fine

> options(sqldf.driver = “PostgreSQL”)
> sqldf(“select Species from iris limit 3″)
Species
1 setosa
2 setosa
3 setosa
> sqldf(“select Species from iris where Species like ‘set%’ limit 3″)
Species
1 setosa
2 setosa
3 setosa

If you have trouble in installing RPostgreSQL under Mac OSX, please refer to Install R Package RPostgreSQL in Mac OSX.

Advertisement
  1. Gabor Grothendieck
    October 1, 2010 at 2:09 AM | #1

    - sqldf does not support the RPostgresSQL driver since it lacks certain capabilities that are needed. If you don’t specify H2 or pgSQL as a driver then it defaults to SQLite so your PostgresSQL example was actually using SQLite.

    - also you don’t have to specify the driver option at all. Just load the RpgSQL package, i.e. library(RpgSL), and sqldf will notice that the package is there and assume you wanted to use PostgresSQL instead of SQLite.

    - make sure that you are using “postgres” as your user on the postgres database, password of “” and dbname of “test”. That’s probably easiest but if you must change them you can change the last two by using the RpgSQL.password and RpgSQL.dbname options of the RpgSQL driver — see ?pgSQL in the RpgSQL package. I have something like this in my .Rprofile:

    options(RpgSQL.password = “mypassword”, RpgSQL.dbname = “mydb”)

    - if you want to use mixed case column names in PostgresSQL SQL statements such as Species they must be within double quotes.

    So, on a R fresh session just do the following where the last example below shows that it is using PostgresSQL since that example won’t work in SQLite.

    > library(sqldf)
    Loading required package: DBI
    Loading required package: RSQLite
    Loading required package: RSQLite.extfuns
    Loading required package: gsubfn
    Loading required package: proto
    Loading required package: chron
    > library(RpgSQL)
    Loading required package: RJDBC
    Loading required package: rJava
    >
    > sqldf('select "Species" from iris where "Species" like \'set%\' limit 3')
    Loading required package: tcltk
    Loading Tcl/Tk interface ... done
      Species
    1  setosa
    2  setosa
    3  setosa
    > sqldf('select regr_slope(demand, "Time") slope,
    +  regr_intercept(demand, "Time") intercept,
    +  corr(demand, "Time") corr from BOD')
         slope intercept      corr
    1 1.721429  8.521429 0.8030693
    
  2. chrischou
    October 2, 2010 at 8:47 AM | #2

    Thanks Gabor, RpgSQL works fine now. :)

  1. September 28, 2010 at 4:55 AM | #1

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.