Skip to content
Advertisement

R using RSQLite, DBI and dbplyr, can’t have more than one table in the RSQLite in order to make a join

I’m having a hard time understanding the behavior of these three packages: RSQLite, DBI, and dbplyr.

I’m using the mtcars dataframe and I’m loading it with the tibble package to add the car name as one of the rows.

I also created my own table that has information about where all these cars were built:

By creating these two tables I can practice joins using R. Here is an example of two different ways to write a semi_join:

But I really want to feel more comfortable making joins using SQL syntax. I have an upcoming job that will involve more SQL and I really want to study before I start. For me, the best way to study is by doing this stuff. Watching example online without doing it myself just doesn’t stick for me.

So here I am trying to create a SQL version of stuff using these R packages.

This YouTube video from Andrew Couch has been very helpful. Here is the link.

This show_query thing is interesting. It shows the SQL version of the dplyr code.

enter image description here

Another SO post here was answered and showed me how to find the tables listed using dbListTables.

enter image description here

This dbGetQuery will run the code and get the expected output.

enter image description here

Here is where I’m confused. I’m trying to put the cars_origin_tbl into RSQLite and then practice joining two different tables.

After I run dbListTables() it’s like my first sql_mtcars was forgotten and disappeared.

enter image description here

Now this dbGetQuery works fine because I’m just calling the newer table.

But this other dbGetQuery doesn’t work because I’m referring to both tables, which I thought would both be in RSQLite at this point. I get a message saying there is no sql_mtcars.

I’m basically trying to get two tables into the connection so I can practice joining them, but I’m not sure how to get both tables in there without one disappearing.

Advertisement

Answer

The solution is to use the same connection for both tables. The following works:

You can now semi-join these two tables:

Producing the following SQL:

As a side note, sql_mtcars and sql_mtcars_db are different R objects. The first is a table in local R memory. The second is a local access point to a remote table stored in a database.

User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement