Are there any ETL tools that integrate with Rails models?
P

3

8

I'm researching ETL tools to import flat files into a database and subsequently export XML files.

Many of the tools support generating code to use in your application; however, I haven't found any that support using code already in your application. Our model is complex (relationships, validations, polymorphic associations, callbacks, etc.).

What tools are available that will allow reuse of existing code? Or am I stuck recreating (and maintaining) my model in the ETL tool?

Note: My reason for wanting an ETL tool (as opposed to bulk inserts or activerecord-import) is the transformations. We receive data from over 200 different sources in a variety of formats, levels of completeness, and cleanliness. Also, the "designer" that most tools include is more realistic for the less-technical users who will be defining the transformations.

Papacy answered 23/2, 2012 at 21:55 Comment(2)
Where is the transformation logic? Where do you want it to be? Jinx
It depends. We have a bunch built into the application already, but there are others that need to be done on a per-source basis. We're talking automotive data... Our application knows 99-01, 1999-01, and 1999-2001 are all the same thing, and that HND, HNDA, HONDA, and HONDA/ACURA are all the same thing. These are the tip of the iceberg. Each of our sources has a different format. One may combine years like 99-01 and another puts them in different columns. Some will put multiple makes (HONDA, BMW) in one row, others will use two rows. Again, tip of the iceberg, but those are what the ETL tool should handle. Papacy
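
To make the year example concrete, here is a minimal Ruby sketch of the kind of rule described in the comment above. The module name `YearRange`, the `PIVOT` cutoff for two-digit years, and the error messages are all assumptions made for illustration, not code from the application in question.

```ruby
# Hypothetical helper: turns "99-01", "1999-01" and "1999-2001"
# into the same [first_year, last_year] pair.
module YearRange
  # Each side of the dash is either a four-digit or a two-digit year.
  PATTERN = /\A(\d{4}|\d{2})\s*-\s*(\d{4}|\d{2})\z/

  # Assumption: two-digit years above 30 are 19xx, the rest are 20xx.
  PIVOT = 30

  def self.full_year(digits)
    year = digits.to_i
    return year if digits.length == 4

    year > PIVOT ? 1900 + year : 2000 + year
  end

  def self.parse(text)
    match = PATTERN.match(text.to_s.strip)
    raise ArgumentError, "unrecognized year range: #{text.inspect}" unless match

    first = full_year(match[1])
    last = full_year(match[2])
    raise ArgumentError, "range runs backwards: #{text.inspect}" if first > last

    [first, last]
  end
end
```

With this sketch, `YearRange.parse` returns the same pair for all three spellings from the comment. A source that splits the years across two columns would skip `parse` and call `full_year` on each column instead.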
L
6

ActiveWarehouse might prove useful. Initial search results make the project feel a bit old and defunct, but a little digging yielded a fairly active, well-documented branch of the project on GitHub: https://github.com/activewarehouse/activewarehouse-etl

Licensee answered 5/3, 2012 at 10:9 Comment(1)
It also just went 1.0. I had found this a while ago, good to see it's still alive. I'm going to take a closer look. Papacy
P
3

Write your own. ETL is a very simple process, and Ruby provides enough reflection support to handle it with some simple code. ETL tools are not really helpful here; just generate dotty (Graphviz) files to show the data sources, flows, and transformations.

I've done the same in Smalltalk for a data conversion. There I used Glamour and Mondrian from the Moose reengineering tool suite to provide more visibility.
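
A rough sketch of what "write your own" could look like in Ruby, under some loud assumptions: `Vehicle` is a plain `Struct` standing in for a real Rails model, and `SOURCE_A`, `load_rows`, and `to_dot` are hypothetical names invented for this example. The idea is a per-source table of column mappings, applied through `public_send` so values go through the model's own setters, plus a small function that writes the same mapping out as Graphviz DOT text.

```ruby
require "csv"

# Stand-in for an application model. In a Rails app this would be an
# ActiveRecord class, so its validations and callbacks would be reused.
Vehicle = Struct.new(:make, :first_year, :last_year, keyword_init: true)

# Hypothetical definition for one source:
# source column => [model attribute, transformation].
SOURCE_A = {
  "Mfr"  => [:make, ->(value) { value.strip.upcase }],
  "From" => [:first_year, ->(value) { Integer(value, 10) }],
  "To"   => [:last_year, ->(value) { Integer(value, 10) }]
}.freeze

# Extract rows from CSV text, transform each mapped column, and assign
# the result through the model's own setter via reflection.
def load_rows(csv_text, mapping, model)
  CSV.parse(csv_text, headers: true).map do |row|
    record = model.new
    mapping.each do |column, (attribute, transform)|
      record.public_send("#{attribute}=", transform.call(row[column]))
    end
    record
  end
end

# Describe the same mapping as Graphviz DOT, one edge per column.
def to_dot(source_name, mapping, model)
  edges = mapping.map do |column, (attribute, _transform)|
    %(  "#{source_name}.#{column}" -> "#{model.name}.#{attribute}";)
  end
  "digraph etl {\n#{edges.join("\n")}\n}\n"
end
```

Calling `load_rows("Mfr,From,To\n honda ,1999,2001\n", SOURCE_A, Vehicle)` gives one `Vehicle` with the make upcased and the years as integers. Each new source then only needs its own mapping hash, and `to_dot` gives the less-technical users a picture of what flows where.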

Pigmentation answered 8/3, 2012 at 11:15 Comment(0)
L
0

Modularize: you want the Rails app and the ETL to ask about the meaning of 'HND' from the same place. Set up an API for that.
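
One way to read "the same place" is a small shared Ruby module that both the Rails app and the ETL code require. This is only a sketch: the name `MakeLookup`, its two methods, and the choice of `"HONDA"` as the canonical label are assumptions, and the alias list is limited to the spellings mentioned in the question's comments. An HTTP service in front of the same table would be another way to do it.

```ruby
# Hypothetical shared module: the single place that knows which
# spellings refer to which make.
module MakeLookup
  # Assumption: "HONDA" is the canonical label for all four spellings.
  ALIASES = {
    "HND"         => "HONDA",
    "HNDA"        => "HONDA",
    "HONDA"       => "HONDA",
    "HONDA/ACURA" => "HONDA"
  }.freeze

  # Normalizes case and surrounding whitespace before the lookup.
  def self.key_for(raw)
    raw.to_s.strip.upcase
  end

  # True when the spelling is in the alias table.
  def self.known?(raw)
    ALIASES.key?(key_for(raw))
  end

  # Returns the canonical make, or raises KeyError for unknown input
  # so that bad source data is noticed rather than silently loaded.
  def self.canonical(raw)
    ALIASES.fetch(key_for(raw)) do
      raise KeyError, "unknown make: #{raw.inspect}"
    end
  end
end
```

The app would call `MakeLookup.canonical` from a model callback or validation, and the ETL would call it from a transformation step, so a new alias is added once and both sides pick it up.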

Lianeliang answered 5/3, 2012 at 1:3 Comment(0)
