etl Questions

7

Solved

Currently I'm using an AWS Glue job to load data into Redshift, but after that load I need to run some data cleansing tasks, probably using an AWS Lambda function. Is there any way to trigger a Lamb...
Obvert asked 28/2, 2018 at 16:43

2

I'm setting up a new Jupyter Notebook in AWS Glue as a dev endpoint in order to test out some code for running an ETL script. So far I created a basic ETL script using AWS Glue but, for some reason...
Oleaceous asked 9/7, 2019 at 18:43

1

I have an XML source with a huge number of records. As a sample, I have pasted one record below: <?xml version='1.0' encoding='UTF-8'?><wd:Report_Data xmlns:wd="urn:com.wor...
Shamefaced asked 7/11, 2022 at 11:08

3

Solved

I have a CSV file with a {LF} delimiting each row and a date column with the date format as "12/20/2010" (including quotation marks). My destination column is a SQL Server 2008 database table of ty...
Sickly asked 2/9, 2011 at 19:42

1

I'm trying to use configs in a DAG using "trigger w/config". def execute(**kwargs): dag_run = kwargs['dag_run'] start_date = dag_run.conf['start_dt'] if 'start_dt' in dag_run.conf.keys() el...
Buckish asked 30/3, 2022 at 9:08

4

Solved

I have the following task to solve: Files are being sent at irregular times through an endpoint and stored locally. I need to trigger a DAG run for each of these files. For each file the same ta...

7

Solved

I used the code below to fill a data table - OleDbDataAdapter oleDA = new OleDbDataAdapter(); DataTable dt = new DataTable(); oleDA.Fill(dt, Dts.Variables["My_Result_Set"].Value); I get the er...
Runstadler asked 4/11, 2013 at 20:19

1

I'm trying to write a simple SCDF flow that reads from Kafka, filters the messages by the presence of a specific value, and pushes data into Mongo. As part of this I had to write the following #jsonPath #jso...
Bailes asked 26/6, 2020 at 20:05

4

Solved

I'm looking into ETL tools (like Talend) and investigating whether Apache NiFi could be used. Could NiFi be used to perform the following: Pick up two CSV files that are placed on local disk; Join ...
Frulla asked 20/3, 2017 at 16:22

3

Solved

I have downloaded and installed Visual Studio 2022, then clicked Modify. Now I want to create an SSIS package. For this I started VS22, and in "Manage Extensions" when I t...

4

I have problems with an SSIS process (actually, the same problem occurs for two different processes). We are doing some ETL work using SSIS. We have a Business Intelligence project that executes with...
Ouabain asked 22/1, 2016 at 13:19

3

I have a whole bunch of data in AWS S3 stored in JSON format. It looks like this: s3://my-bucket/store-1/20190101/sales.json s3://my-bucket/store-1/20190102/sales.json s3://my-bucket/store-1/20190...
Riordan asked 20/3, 2019 at 14:01

2

I am using SQL Server 2014 with SSIS. I have a data set like this:
ID | Name | Status
1 | Awesome "Store" | Active
2 | Market, Place | Active
3 | Vendor | Active
In SSMS, when the results are in the grid and I ...
Pam asked 20/12, 2017 at 16:30

2

I want to use ETL to read data from S3, since with ETL jobs I can set DPU to hopefully speed things up. But how do I do it? I tried: import sys from awsglue.transforms import * from awsglue.util...
Romeliaromelle asked 1/11, 2018 at 15:10

8

Solved

On a SQL Server 2016 instance I have a job that calls an SSIS package. That package is in a project in the SSISDB and has parameters. One of those parameters is a string type that is blank by default. I ...
Gaberlunzie asked 5/12, 2017 at 20:56

2

Solved

I'm trying to execute a stored procedure via SSIS. Error: 0xC0207014 at PO Header, OLE DB Source [59]: The SQL command requires a parameter named "@SessionID", which is not found in the paramet...
Budge asked 27/5, 2019 at 9:56

4

We are designing a big data solution for one of our dashboard applications and are seriously considering Glue for our initial ETL. Currently Glue supports JDBC and S3 as the target but our downstream ...
Existence asked 2/3, 2018 at 5:58

1

I am trying to make Microsoft.Azure.Services.AppAuthentication and its dependencies work with SSIS script task. How do I resolve assembly reference errors? static ScriptMain() { AppDomain.Curren...
Diastasis asked 2/3, 2022 at 23:44

10

I have a Data Flow Task that is hanging on execution. The flow is simple: it makes two queries to different tables (both with a couple of joins), then sorts and merges the outputs through a common i...
Dram asked 19/3, 2013 at 19:22

2

Solved

I have a job and multiple transformations. If I wanted to define a database connection in the job and use the same database connection for all the transformations, how do I go about it? I am us...
Vondavonni asked 4/8, 2015 at 1:53

3

Solved

I am currently building an ETL pipeline, which outputs tables of data (on the order of 100+ GB) to a downstream interactive dashboard, which allows filtering the data dynamically (based on pre-defin...
Riggall asked 28/12, 2017 at 4:31

13

I tried to search posts, but I only found solutions for SQL Server/Access. I need a solution in MySQL (5.X). I have a table (called history) with 3 columns: hostid, itemname, itemvalue. If I do a s...
Grishilde asked 6/8, 2009 at 20:22

2

Solved

I'm setting up Airflow right now and loving it, except for the fact that my DAGs are perpetually running behind. See the picture below - this was taken on 2/19 at 15:50 UTC, and you can see that fo...
Brianbriana asked 19/2, 2018 at 15:53

6

Solved

We currently use DataStage ETL to export a CSV/text file with data from 15 tables (3 different schemas) on a daily basis. I am wondering if there is a simpler way to accomplish this without usin...
Farahfarand asked 24/2, 2011 at 3:13

1

I have fetched the data as List<T> by reading from different formats, e.g. CSV, Parquet, Avro, JSON. I want to validate the data with mostly feature e.g. The temperature should remain with ...
Demarche asked 4/12, 2021 at 16:34

© 2022 - 2025 — McMap. All rights reserved.