Setting up Docker image with R and Snowflake Drivers
Asked Answered
Q

1

3

Requirement: To connect and execute the RScript with connection from R to SnowflakeDB

I am trying to set up a Docker image that can communicate with a Snowflake database through R (either using RODBC or ODBC)

Error: The problem seems to be that It fails to install (or locate) the necessary Snowflake Drivers when establishing a connection

Error: nanodbc/nanodbc.cpp:1021: 00000: [unixODBC][Driver Manager]Can't open lib 'SnowflakeDSIIDriver': file not found

Gave connection details below

# Load required libraries
   library(paws)
   library(DBI)
   library(odbc)
   library(anomalize)
    con <- DBI::dbConnect(
      odbc::odbc(),
      Driver = "SnowflakeDSIIDriver",
      Server = "account.snowflakecomputing.com",
      Database = "DEV",
      Schema = "SCHEMA",
      Warehouse = "WH_XS",
      UID = username,
      PWD = password
    )

Here is my current template DockerFile

ARG AIRFLOW_VERSION=2.3.3
ARG PYTHON_RUNTIME_VERSION=3.8
FROM apache/airflow:${AIRFLOW_VERSION}-python${PYTHON_RUNTIME_VERSION}
SHELL ["/bin/bash", "-o", "pipefail", "-e", "-u", "-x", "-c"]
USER 0
RUN sudo apt-get update && apt-get install -y r-base r-base-core r-base-dev libssl-dev libcurl4-openssl-dev libgdal-dev && \
    rm -r /var/lib/apt/lists/*

RUN R -e "install.packages(c('httr','shiny','jsonlite','data.table','forecast','anomalize','tibbletime','DBI','dplyr','dbplyr','odbc','ggplot2','DT','getip','shinyTime','paws','RODBC'), repos = 'https://cran.rstudio.com/')"
Quirt answered 5/1, 2023 at 9:17 Comment(0)
U
1

Your Dockerfile template looks good, it only needs to have the driver installation steps. Try this:

ARG AIRFLOW_VERSION=2.3.3
ARG PYTHON_RUNTIME_VERSION=3.8
FROM apache/airflow:${AIRFLOW_VERSION}-python${PYTHON_RUNTIME_VERSION}
SHELL ["/bin/bash", "-o", "pipefail", "-e", "-u", "-x", "-c"]
USER 0
RUN sudo apt-get update && apt-get install -y r-base r-base-core r-base-dev libssl-dev libcurl4-openssl-dev libgdal-dev && \
    rm -r /var/lib/apt/lists/*

RUN R -e "install.packages(c('httr','shiny','jsonlite','data.table','forecast','anomalize','tibbletime','DBI','dplyr','dbplyr','odbc','ggplot2','DT','getip','shinyTime','paws','RODBC'), repos = 'https://cran.rstudio.com/')"

RUN curl -L -O https://sfc-repo.snowflakecomputing.com/odbc/linux/2.25.7/snowflake-odbc-2.25.7.x86_64.deb
RUN dpkg -i snowflake-odbc-2.25.7.x86_64.deb

ENTRYPOINT ["/bin/bash"]%  

In the example above, I hardcoded a link to download the last version of the driver to make the example more visual, but you can also replace it with a variable.

Additionally, in your connection string, the UID and PWD parameters should be in small letters, like uid and pwd, see below my example:

library(DBI)
library(odbc)
con <- DBI::dbConnect(
   odbc::odbc(),
   Driver = "SnowflakeDSIIDriver",
   Server = "account.snowflakecomputing.com",
   Database = "DEV",
   Schema = "SCHEMA",
   Warehouse = "WH_XS",
   uid = "USERNAME",
   pwd = 'password'
   )

mydata <- DBI::dbGetQuery(con,"SELECT CURRENT_TIMESTAMP()")
mydata

Unconscious answered 5/1, 2023 at 14:35 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.