ETL Questions

7

Solved

If I had to perform ETL on a huge dataset (say 1 TB) stored in S3 as CSV files, both an AWS Glue ETL job and AWS EMR steps could be used. How, then, is AWS Glue different from AWS EMR? And which is the bett...
Circassia asked 7/6, 2020 at 20:19

10

This question was asked here before, but the solutions proposed don't seem to be working for me. I'm trying to import a pipe-delimited text file with " as the text qualifier. The SSIS package is returning the ...
Fabio asked 4/6, 2017 at 23:18

4

Solved

When I open a Script component, I can choose a Connection Manager from a dropdown list: This Connection Manager has it all, if I had it as an object in the C# code, I would not need to write a har...
Luehrmann asked 4/8, 2010 at 7:30

9

Solved

I have a few SSIS packages that were password-protected (their protection level is apparently EncryptAllWithPassword) by a developer who left the company and can't be reached anymore, and trying to...
Abiogenesis asked 4/6, 2009 at 9:50

2

Solved

The Airflow scheduler has kinda left me scratching my head for the past few days, as it backfills DAG runs even with catchup=False. My timezone-aware DAG has a start date of 13-04-2021 19:30 PST or 14-04-2...
Colligan asked 22/4, 2021 at 11:59

2

I have the below simple script for AWS Glue. I have a text file with empty cells and a table which accepts NULL values. When I run the glue job it fails with the exception, "Don't know how to save ...
Mizuki asked 28/11, 2017 at 0:24

6

Solved

DECLARE @PATH NVARCHAR(1000) = N'\\MY-SERVER\C$\Folder\\' DECLARE @TABLE NVARCHAR(50) = SUBSTRING(@FILENAME,0,CHARINDEX('.',@FILENAME)) DECLARE @SQL NVARCHAR(4000) = N'IF OBJECT_ID(''dbo.' + @TAB...
Onondaga asked 30/11, 2017 at 16:50

5

Solved

Getting the error "The version of flat file destination is not compatible with this version of the dataflow" when trying to execute an SSIS package from the catalog; the package executes well from v...
Devoir asked 7/4, 2020 at 15:17

5

I've tried reading the Wikipedia article for "extract, transform, load", but that just leaves me more confused... Can someone explain what ETL is, and how it is actually done?
Porringer asked 3/8, 2010 at 0:30

6

Solved

I have a Python script that imports a large CSV file and then counts the number of occurrences of each word in the file, then exports the counts to another CSV file. But what is happening is that ...
Regeniaregensburg asked 4/10, 2013 at 19:44

4

Solved

I created and executed a dtsx with the corresponding SSMS wizard. This was to import a flat file into an existing table. At the end I saved the "package" as a .dtsx file. Now I need to modify the colu...
Ternopol asked 22/1, 2019 at 11:25

2

Solved

I have an ADF which writes output of a Kusto Function to a Kusto Table daily. I need to upsert the data daily into the table. I did not find a way to update the existing data in Kusto DB. Is there ...
Hagerman asked 24/7, 2019 at 7:43

3

I am writing an ETL script in Python that gets data in CSV files, validates and sanitizes the data as well as categorizes or classifies each row according to some rules, and finally loads it into a...
Deedeeann asked 8/3, 2012 at 19:45

2

Solved

I'm working on a Data Mart loading package in SSIS 2012. When attempting to execute the package in Visual Studio I get this error: "The AcquireConnection method call to the connection manager D...
Newland asked 19/12, 2012 at 0:31

4

Solved

Background: I have a PostgreSQL (v8.3) database that is heavily optimized for OLTP. I need to extract data from it on a semi real-time basis (someone is bound to ask what semi real-time means a...

4

I'm using AWS Glue to move multiple files to an RDS instance from S3. Each day I get a new file into S3 which may contain new data, but can also contain a record I have already saved with some upda...
Suzysuzzy asked 22/11, 2018 at 19:21

7

Solved

In an SSIS package that I'm writing, I have a CSV file as a source. On the Connection Manager General page, it has 65001 as the Code page (I was testing something). Unicode is not checked. The col...
Bradway asked 26/1, 2018 at 0:56

5

Solved

I am receiving the following error when trying to run the package from the Integration Services catalog in SSMS. I changed the 64BitRuntime option to FALSE but it still does not work. The error bel...
Outfoot asked 29/3, 2017 at 13:48

3

Solved

I am working on a data warehouse and looking for an ETL solution that uses Python. I have played with SnapLogic as an ETL, but I was wondering if there were any other solutions out there. This dat...
Haughay asked 21/9, 2010 at 16:04

4

Solved

I have simple text files that contain just normal texts. I was wondering if there is a way to load the text contents to a table in sqlite. So maybe I could Create table myTable(nameOfText TEXT, co...
Plaint asked 10/3, 2013 at 1:31

1

I am unable to run a newly created AWS Glue crawler. I followed the IAM role guide at https://docs.aws.amazon.com/glue/latest/dg/create-an-iam-role.html?icmpid=docs_glue_console Created new Crawler Role...
Lithesome asked 8/1, 2023 at 6:04

3

Solved

I am using the SQL Server 2008 Import and Export Wizard. I need to import a database. I opened the wizard and went through the following actions: for the destination I cho...
Spur asked 6/1, 2014 at 11:14

6

Solved

I'd like to convert result table to JSON array in MySQL using preferably only plain MySQL commands. For example with query SELECT name, phone FROM person; | name | phone | | Jack | 12345 | | John...
Sweven asked 20/1, 2017 at 8:10

2

Solved

I've been trying to add multiple partition columns to a BigQuery table, but it seems to only take one field, even if I add multiple partition fields in the query parameters. I'm partitioning by da...
Moule asked 14/7, 2020 at 0:17

3

Solved

Are staging tables used only in data warehouse projects, or in any SSIS project? I would like to know what a staging table is. Can anyone give me some examples on how to use it and in what circumstan...
Unattended asked 28/3, 2015 at 12:29
