How can I move files from a local file server (SFTP) directly into Snowflake?
I am looking to move files from a couple of file servers (accessed via SFTP) into Snowflake directly using the Python Snowflake connector.

I have considered using Azure Data Factory (ADF) to pull the files into blob storage and then creating an Azure stage to load the data into Snowflake. However, is there a way to do this without first storing the data in blob storage? In other words, can I move the files directly into Snowflake without landing them somewhere else first? I don't see a way to do this in ADF without going through blob storage. The Snowflake documentation indicates that stages can be created against Azure/AWS storage or a local file system, but can a stage point at a file server that needs to be accessed via SFTP? I've been working on this with the Python connector, but it still seems that the data must be stored somewhere else first (either a local hard drive or blob storage) before it can be staged into Snowflake.

Bayne answered 3/5, 2019 at 17:57 Comment(0)
In the last couple of months, Azure Data Factory has gained native integration with Snowflake via a dedicated Snowflake connector. So the current answer is that there is a connector you can use directly in the ADF portal.

ADF now supports data integration with Snowflake: the Snowflake connector is supported as a source and sink for the Copy activity, in Mapping Data Flows, and in the Lookup activity.

Bayne answered 28/12, 2020 at 16:1 Comment(0)
Assuming you want to use Snowflake's bulk-load mechanism (COPY entire files, not INSERT single rows), the file always needs to end up in blob storage at some point. You have two options:

  • move the files from SFTP to blob storage yourself, then run COPY in Snowflake
  • download the files to the local filesystem, use Snowflake's PUT command to upload them to an internal stage, and then run COPY

There is no way to create a stage that would point to SFTP.
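The second option above can be sketched with the Python connector. This is illustrative only: the host, credentials, stage, table, and file-format details below are all hypothetical placeholders, and the SFTP download uses paramiko as one possible client.

```python
# Sketch of option 2: SFTP -> local disk -> internal stage (PUT) -> table (COPY).
# All connection details, stage, and table names are hypothetical placeholders.

def put_sql(local_path: str, stage: str) -> str:
    """Build a PUT command that uploads a local file to an internal stage."""
    return f"PUT file://{local_path} @{stage} AUTO_COMPRESS=TRUE"

def copy_sql(table: str, stage: str) -> str:
    """Build a COPY INTO command that loads staged files into a table."""
    return (
        f"COPY INTO {table} FROM @{stage} "
        "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1) PURGE = TRUE"
    )

def sftp_to_snowflake(remote_path: str, local_path: str) -> None:
    # Third-party imports are kept inside the function so the SQL helpers
    # above stay usable without paramiko / snowflake-connector installed.
    import paramiko
    import snowflake.connector

    # 1. Download the file from the SFTP server to the local filesystem.
    transport = paramiko.Transport(("sftp.example.com", 22))
    transport.connect(username="user", password="secret")
    sftp = paramiko.SFTPClient.from_transport(transport)
    sftp.get(remote_path, local_path)
    sftp.close()
    transport.close()

    # 2. PUT the local file to an internal stage, then COPY it into the table.
    conn = snowflake.connector.connect(
        account="my_account", user="me", password="secret",
        warehouse="my_wh", database="my_db", schema="public",
    )
    try:
        cur = conn.cursor()
        cur.execute(put_sql(local_path, "my_stage"))
        cur.execute(copy_sql("my_table", "my_stage"))
    finally:
        conn.close()
```

Note that PUT uploads to a Snowflake-managed internal stage, so the file still has to pass through the local filesystem first; there is no way to skip that hop.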

More details are in the Snowflake documentation for the PUT and COPY INTO commands.

Note: in theory you could fetch the data from SFTP (e.g., in Python) and load the rows with INSERT statements through SQL. That bypasses the stage entirely, but it is more complex, far slower for any real volume, and rarely worth the effort.
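For completeness, the INSERT-based alternative from the note could look roughly like this. It streams a CSV file over SFTP without touching local disk and inserts the rows with `executemany`; again, every host, credential, and table name is a placeholder, not a working configuration.

```python
import csv
import io

def insert_sql(table: str, num_cols: int) -> str:
    """Build a parameterized INSERT statement with one placeholder per column."""
    placeholders = ", ".join(["%s"] * num_cols)
    return f"INSERT INTO {table} VALUES ({placeholders})"

def rows_from_csv(text: str) -> list:
    """Parse CSV text (with a header row) into a list of row tuples."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return [tuple(row) for row in reader]

def sftp_insert(remote_path: str) -> None:
    # Hypothetical end-to-end flow; third-party deps are imported here
    # so the helpers above remain usable on their own.
    import paramiko
    import snowflake.connector

    transport = paramiko.Transport(("sftp.example.com", 22))
    transport.connect(username="user", password="secret")
    sftp = paramiko.SFTPClient.from_transport(transport)
    with sftp.open(remote_path) as fh:  # read in memory, no local file
        text = fh.read().decode("utf-8")
    transport.close()

    rows = rows_from_csv(text)
    conn = snowflake.connector.connect(
        account="my_account", user="me", password="secret",
    )
    try:
        conn.cursor().executemany(insert_sql("my_table", len(rows[0])), rows)
    finally:
        conn.close()
```

This avoids blob storage and the local filesystem, but as the answer says, row-by-row loading is much slower than COPY and not recommended beyond trivial volumes.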

Hexose answered 5/5, 2019 at 7:31 Comment(2)
Yes, I would use their bulk-load COPY command. Essentially, the two solutions you offered were the only ones I had found, and I suspected (based on the documentation) that this was the only way to really do it, at least right now, but figured I'd double-check to see if anyone else had any other ideas. I actually had a chat with someone from Snowflake after I posted this question, and they seemed to confirm that this is the way to go. The easiest solution using ADF is to use an ADF copy activity to Blob Storage and then use an Azure stage to copy into Snowflake. – Bayne
Is it not possible to install the Snowflake CLI onto the SFTP server and then run the PUT command directly from the SFTP server? – Yingyingkow
