I would like to know whether there is a way of importing JSON data from a MySQL DB to an R dataframe.
I have a table like this:
id created_at json 1 2020-07-01 {"name":"Dent, Arthur","group":"Green","age (y)":43,"height (cm)":187,"wieght (kg)":89,"sensor":34834834} 2 2020-07-01 {"name":"Doe, Jane","group":"Blue","age (y)":23,"height (cm)":172,"wieght (kg)":67,"sensor":12342439} 3 2020-07-01 {"name":"Curt, Travis","group":"Red","age (y)":13,"height (cm)":128,"wieght (kg)":47,"sensor":83287699}
I would like to get the columns ‘id’ and ‘json’. I am using RMySQL package for getting the data from the db to an R dataframe but this gives me only the column ‘id’, the column ‘json’ contains only NAs in each row.
Is there any way how to import/load the data and get the json column displayed? And possibly to extract the “sensor” part of the json values?
The result would be a dataframe (df) like this:
id json 1 {"name":"Dent, Arthur","group":"Green","age (y)":43,"height (cm)":187,"wieght (kg)":89,"sensor":34834834} 2 {"name":"Doe, Jane","group":"Blue","age (y)":23,"height (cm)":172,"wieght (kg)":67,"sensor":12342439} 3 {"name":"Curt, Travis","group":"Red","age (y)":13,"height (cm)":128,"wieght (kg)":47,"sensor":83287699}
Or with with the extracted value:
id sensor 1 "sensor":34834834 2 "sensor":12342439 3 "sensor":83287699
Thank you very much for any suggestions.
Advertisement
Answer
Using unnest_wider
from tidyr
library(dplyr) con <- DBI::dbConnect(RMySQL::MySQL(), 'db_name', user = 'user', password = 'pass', host = 'hostname') t <- tbl(con, 'table_name') t %>% as_tibble() %>% transmute(j = purrr::map(json, jsonlite::fromJSON)) %>% tidyr::unnest_wider(j) DBI::dbDisconnect(con)
Result:
# A tibble: 3 x 6 name group `age (y)` `height (cm)` `wieght (kg)` sensor <chr> <chr> <int> <int> <int> <int> 1 Dent, Arthur Green 43 187 89 34834834 2 Doe, Jane Blue 23 172 67 12342439 3 Curt, Travis Red 13 128 47 83287699
If you want to only retrieve data from the last 24 hours (as the OP requested) change the tbl(con, 'table_name')
statement to:
t <- DBI::dbGetQuery(con, 'SELECT * FROM `table_name` WHERE DATE(`created_at`) > NOW() - INTERVAL 1 DAY')