Access table in other than default scheme (database) from sparklyr
Asked Answered
H

3

6

After I managed it to connect to our (new) cluster using sparklyr with yarn-client method, now I can show just the tables from the default scheme. How can I connect to scheme.table? Using DBI it's working e.g. with the following line: dbGetQuery(sc, "SELECT * FROM scheme.table LIMIT 10") In HUE, I can show all tables from all schemes.

~g

Hessney answered 5/5, 2017 at 13:35 Comment(0)
S
6

You can either use a fully qualified name to register temporary view:

spark_session(sc) %>% 
  invoke("table", "my_database.my_table") %>%
  invoke("createOrReplaceTempView", "my_view")

tbl(sc, "my_view")

or use sql method to switch databases

spark_session(sc) %>% invoke("sql", "USE my_database")

and access table directly with dplyr:tbl:

tbl(sc, "my_table")
Stylolite answered 19/5, 2017 at 13:24 Comment(2)
Thank you very much! I used the second option, which worked fine for me.Hessney
Just a heads up to avoid confusion: if you've already loaded the tidyverse, purrr::invoke() will override sparklyr::invoke() unless the function is prefixed.Pitch
C
0

You can also use DBI'sdbgetQuery to change the database. This is useful bc it will also update your view in Connections to the specific data base rather than the default.

DBI::dbGetQuery(sc, "use <database>")

Lastly you can just reference the database within a tbl statement

dplyr::tbl(sc,"want_db.have_data") %>% ...
Chumash answered 22/9, 2018 at 1:13 Comment(1)
DBI::dbSendQuery would more appropriate as you are not collecting any data.Heerlen
L
0

Another option is to use tbl_change_db to change the default database for the session.

e.g.:

tbl_change_db("other_db")
Lactate answered 11/6, 2019 at 15:27 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.