ETL using Python

3

15

I am working on a data warehouse and looking for an ETL solution that uses Python. I have played with SnapLogic as an ETL tool, but I was wondering whether there are any other solutions out there.

This data warehouse is just getting started. I have not brought any data over yet. It will easily be over 100 GB with the initial subset of data I want to load into it.

Haughay answered 21/9, 2010 at 16:4 Comment(3)
Could you describe what size of data warehouse you're working on? Is it a long-established warehouse, or is it just getting started? – Controvert
Check out pandas, petl and other ETL tools. – Sudatorium
Why is the requirement "uses Python"? You should pick the best tool for the job. – Stander
26

Yes. Just write Python using a DB-API interface to your database.

Most ETL programs provide fancy "high-level languages" or drag-and-drop GUIs that don't help much.

Python is just as expressive and just as easy to work with.

Eschew obfuscation. Just use plain-old Python.

We do it every day and we're very, very pleased with the results. It's simple, clear and effective.
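
To make the "plain Python over DB-API" idea concrete, here is a minimal sketch, not a definitive implementation. It assumes the stdlib `sqlite3` module as a stand-in for whatever DB-API drivers your real source and warehouse use, and the table names (`src_orders`, `dw_orders`), columns and cleaning rules are hypothetical. Rows are read in batches with `fetchmany` so a large table never has to fit in memory.

```python
import sqlite3  # stdlib DB-API 2.0 driver, standing in for the real source/target drivers


def extract(conn, batch_size=1000):
    """Yield batches of rows from the (hypothetical) source table."""
    cur = conn.cursor()
    cur.execute("SELECT id, name, amount FROM src_orders")
    while True:
        rows = cur.fetchmany(batch_size)
        if not rows:
            break
        yield rows


def transform(rows):
    """Example rules (assumptions): tidy the name, store the amount as integer cents."""
    return [
        (row_id, name.strip().title(), round(amount * 100))
        for row_id, name, amount in rows
    ]


def load(conn, rows):
    """Insert one batch and commit it.

    The "?" placeholders are the qmark paramstyle used by sqlite3;
    other drivers may expect a different style such as "%s".
    """
    cur = conn.cursor()
    cur.executemany(
        "INSERT INTO dw_orders (id, customer, amount_cents) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


def run_etl(source, target, batch_size=1000):
    """Stream batches from source to target and return the number of rows loaded."""
    total = 0
    for batch in extract(source, batch_size):
        clean = transform(batch)
        load(target, clean)
        total += len(clean)
    return total


# Demo with two in-memory databases so the sketch runs anywhere.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE src_orders (id INTEGER, name TEXT, amount REAL)")
source.executemany(
    "INSERT INTO src_orders VALUES (?, ?, ?)",
    [(1, "  alice ", 19.99), (2, "BOB", 5.5), (3, "carol", 100.0)],
)
source.commit()

target = sqlite3.connect(":memory:")
target.execute(
    "CREATE TABLE dw_orders (id INTEGER PRIMARY KEY, customer TEXT, amount_cents INTEGER)"
)

loaded = run_etl(source, target, batch_size=2)
```

Swapping in a real database should mostly mean changing the two `connect` calls and the placeholder style; the extract/transform/load functions stay ordinary Python that you can test and version like any other code.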

Schizopod answered 21/9, 2010 at 16:29 Comment(2)
Totally agree. Use SQLAlchemy to get metadata from the source and target tables, and an ODBC driver for the source and target databases. – Gwyn
Doing it this way works! However, it is too slow compared to ETL tools. Is there a faster way to bulk load in 2022? – Rathe
1

You can use pyodbc, a third-party Python library, to extract data from various database sources. Then use pandas DataFrames to manipulate and clean the data as your organization needs, and use pyodbc again to load it into your data warehouse.
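
A rough sketch of that extract, clean, load flow is below. To keep it runnable without an ODBC driver, it assumes the stdlib `sqlite3` module in place of pyodbc (both follow DB-API and use `?` placeholders); with pyodbc you would open the connections with `pyodbc.connect(...)` and your own connection string instead. The table names, columns and cleaning rules are hypothetical.

```python
import sqlite3  # stand-in for pyodbc so the sketch runs without an ODBC driver

import pandas as pd


def extract_frame(conn, sql):
    """Run a query through a DB-API cursor and return the result as a DataFrame."""
    cur = conn.cursor()
    cur.execute(sql)
    columns = [col[0] for col in cur.description]
    rows = [tuple(row) for row in cur.fetchall()]  # plain tuples, whatever the driver returns
    return pd.DataFrame.from_records(rows, columns=columns)


def clean_frame(df):
    """Example rules (assumptions): drop missing emails, normalise them, de-duplicate."""
    df = df.dropna(subset=["email"]).copy()
    df["email"] = df["email"].str.strip().str.lower()
    return df.drop_duplicates(subset=["email"]).reset_index(drop=True)


def load_frame(conn, df, table):
    """Insert every DataFrame row into `table` with a single executemany call."""
    column_list = ", ".join(df.columns)
    placeholders = ", ".join("?" for _ in df.columns)
    sql = f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})"
    cur = conn.cursor()
    cur.executemany(sql, list(df.itertuples(index=False, name=None)))
    conn.commit()


# Demo with in-memory databases standing in for the source system and the warehouse.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE raw_customers (id INTEGER, email TEXT)")
source.executemany(
    "INSERT INTO raw_customers VALUES (?, ?)",
    [
        (1, " Ann@Example.com "),
        (2, None),
        (3, "ann@example.com"),
        (4, "bob@example.com"),
    ],
)
source.commit()

target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE customers (id INTEGER, email TEXT)")

frame = extract_frame(source, "SELECT id, email FROM raw_customers ORDER BY id")
load_frame(target, clean_frame(frame), "customers")
```

If the load step is slow over ODBC, pyodbc cursors have a `fast_executemany` attribute that can be set to `True` before calling `executemany`; whether it helps depends on the driver, so treat it as something to measure rather than a guaranteed fix.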

Deadfall answered 12/5, 2020 at 23:36 Comment(0)
0

You may want to check out the Zed lake. It lets you load a variety of data formats into data "pools". Once the data is loaded, you can use the Zed language to transform it into whatever you need. I find the Zed language much easier than trying to do ETL with SQL. It can scale too.

Utilize answered 28/2, 2023 at 0:34 Comment(0)
