1GB of Data From MySQL to MS Access
The Situation: I am creating an automated task which queries MySQL (through ODBC) and inserts the result set into an MS Access database (.mdb) using OLEDB.

The Code:

OleDbConnection accCon = new OleDbConnection();
OdbcCommand mySQLCon = new OdbcCommand();
try
{
    //connect to mysql
    Connect();                
    mySQLCon.Connection = connection;              

    //connect to access
    accCon.ConnectionString = @"Provider=Microsoft.Jet.OLEDB.4.0;" +
        @"Data source= " + pathToAccess;
    accCon.Open();
    var cnt = 0;

    while (cnt < 5)
    {
        if (accCon.State == ConnectionState.Open)
            break;
        cnt++;
        System.Threading.Thread.Sleep(50);
    }

    if (cnt == 5)
    {
        ToolBox.logThis("Connection to Access DB did not open. Exit Process");
        return;
    }
} catch (Exception e)
{
    ToolBox.logThis("Failed to open connections. msg -> " + e.Message + "\n" + e.StackTrace);
}
OleDbCommand accCmn = new OleDbCommand();
accCmn.Connection = accCon;
//access insert query structure
var insertAccessQuery = "INSERT INTO {0} values({1});";
// key => table name in Access, value => MySQL query to be executed
foreach (var table in tblNQuery)
{
    try
    {
        mySQLCon.CommandText = table.Value;
        //execute the MySQL query
        using (var dataReader = mySQLCon.ExecuteReader())
        {
            //variable to hold row data
            var rowData = new object[dataReader.FieldCount];
            var parameters = "";
            //read the result set from mysql query
            while (dataReader.Read())
            {
                //fill rowData with the row values
                dataReader.GetValues(rowData);
                //build the parameters for insert query
                for (var i = 0; i < dataReader.FieldCount; i++)
                    parameters += "'" + rowData[i] + "',"; //note: values containing quotes will break this query

                parameters = parameters.TrimEnd(',');
                //insert to access
                accCmn.CommandText = string.Format(insertAccessQuery, table.Key, parameters);
                try
                {
                    accCmn.ExecuteNonQuery();
                }
                catch (Exception exc)
                {
                    ToolBox.logThis("Failed to insert to access db. msg -> " + exc.Message + "\n\tInsert query -> " + accCmn.CommandText );
                }                              
                parameters = "";
            }
        }
    }
    catch (Exception e)
    {
        ToolBox.logThis("Failed to populate access db. msg -> " + e.Message + "\n" + e.StackTrace);
    }
}
Disconnect();
accCmn.Dispose();
accCon.Close();

The Issues:

  1. The memory usage climbs very high (300 MB and more) while the MS Access file size barely changes. It seems the inserts are cached in memory rather than written to disk.

  2. It is very slow! The MySQL query itself executes within a few seconds; it is the insertion process that takes long.

I have tried using a prepared statement against MS Access, inserting the values as parameters instead of concatenating strings to build the insert query. However, I get this exception message:

Data type mismatch in criteria expression.
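For context, a minimal sketch of the parameterized approach (the table name, column count, and column types below are illustrative, not from the original code). The "Data type mismatch" error is often resolved by declaring an explicit `OleDbType` for each parameter and mapping missing values to `DBNull.Value`, so Jet does not have to guess the types:

```csharp
using System;
using System.Data.OleDb;

class ParamInsertSketch
{
    // Insert one row read from the MySQL DataReader into Access.
    // "MyTable" and the three column types are assumptions for illustration.
    static void InsertRow(OleDbConnection accCon, object[] rowData)
    {
        using (var cmd = new OleDbCommand(
            "INSERT INTO MyTable VALUES (?, ?, ?)", accCon))
        {
            // OleDb uses positional '?' placeholders; give each one an
            // explicit OleDbType matching the Access column.
            cmd.Parameters.Add("@p0", OleDbType.Integer).Value  = rowData[0] ?? (object)DBNull.Value;
            cmd.Parameters.Add("@p1", OleDbType.VarWChar).Value = rowData[1] ?? (object)DBNull.Value;
            cmd.Parameters.Add("@p2", OleDbType.Date).Value     = rowData[2] ?? (object)DBNull.Value;
            cmd.ExecuteNonQuery();
        }
    }
}
```

Creating the command once and only updating the parameter values per row also avoids re-parsing the statement on every insert, which would help with issue #2.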

Anyone know how to fix this or have a better approach?

Rollicking answered 24/7, 2015 at 3:51 Comment(4)
Please comment after vote down >:(Rollicking
MS Access is not ACID compliant, and MS used that to their advantage by using memory as a cache before flushing. The time it takes can be long due to the row-by-row inserts, and the location of your MySQL database: is it on the same machine, or is there a network in between?Evy
it is all within the same machine. Can I force it to flush the data to disk?!Rollicking
You could try this: msdn.microsoft.com/en-us/library/… however there is a good chance that it will not work since you are using more than one OLEDB connection, which in older versions caused MS Access (by design) not to flush, for performance reasons.Evy

Thanks everyone for the answers. I just found the main problem in my code. The reason for the heavy memory usage (issue #1) was that the ODBC driver was caching the data from MySQL regardless of the C# approach (DataReader). The issue was resolved by checking the Don't cache results of forward-only cursors checkbox in the DSN settings. This also made the process about 30% faster. The more robust approaches are still what Brian Pressler and Egil Hansen suggested, but since they require software installation and/or a migration plan, the easiest way is to stick with this piece of code.

Rollicking answered 25/8, 2015 at 9:14 Comment(0)

You could create a VBA macro that uses the DoCmd.TransferDatabase method to pull data through ODBC into your Access database. It would probably be much faster and simpler as well.

To run the VBA code from an external program or scheduled task, simply initiate Access to open your file with the /x command line switch and it will run the import macro on startup. A GB of data though is still going to take a while. I found an article by David Catriel that implemented this approach.
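The launch step for a scheduled task can be sketched as a single command line (the MSACCESS.EXE path varies by Office version and install location, and the macro name here is an assumption):

```shell
REM Sketch: open the .mdb and run the macro named "ImportFromMySQL" on startup.
"C:\Program Files\Microsoft Office\Office14\MSACCESS.EXE" "C:\data\target.mdb" /x ImportFromMySQL
```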

An even better option is to use a different database engine back-end like the free version of SQL Server Express. Then you have a lot more options and it is much more robust. If you need MS Access forms and reports, you can create an ADP project file if you use SQL Server, or you can use linked tables to get at your data. You could even use Access as a front-end to your MySQL database instead of copying all the data if that would satisfy your requirements.

Laski answered 18/8, 2015 at 22:3 Comment(5)
Thanks Brian for the answer but it is not possible for me to install MS Access on the server machine :(Rollicking
It doesn't have to be on the server machine. Do this on any computer that has MS Access installed, then copy the mdb to the server (if necessary). It will still be much faster.Human
Yeah, you are still going to have to manually run a compact and repair on your database occasionally as well. You might consider a different database back-end like SQL Sever Express (added to my answer).Laski
The MS Access database will be passed to customers. We can't give all of them database access or link a separate machine to the server. Migration to SQL Server is also not possible. If there is no easier way of doing it, I guess I will live with the memory usage.Rollicking
If users are not connecting to the access database on the server (you're giving it to customers), then the DoCmd.TransferDatabase method should work. I have done it with about the same amount of data. You just have to run your macro on a machine with MS Access installed and an ODBC link to the MySql database. Then you can remove the macro if needed and pass it to your customers.Laski

Instead of writing code, you could turn to SQL Server Integration Services (SSIS), and be done before lunch. It is available as an extension to Visual Studio, in case you do not have it on your computer already with SQL Server.

With SSIS you are able to create a reusable SSIS package that can be triggered from the command line or scheduled task. This guide shows how to pull data from MySQL into SQL Server, but the SQL Server part should be easy to replace with Access.

Lennie answered 23/8, 2015 at 6:57 Comment(0)

Some changes, with comments, to add a transaction around the command execution. If transactions are not controlled manually, one is created and committed automatically for every statement, which is a time-consuming action.

            OleDbConnection accCon = new OleDbConnection();
            OdbcCommand mySQLCon = new OdbcCommand();
            try
            {
                //connect to mysql
                Connect();
                mySQLCon.Connection = connection;

                //connect to access
                accCon.ConnectionString = @"Provider=Microsoft.Jet.OLEDB.4.0;" +
                                          @"Data source= " + pathToAccess;
                accCon.Open();
                var cnt = 0;

                while (cnt < 5)
                {
                    if (accCon.State == ConnectionState.Open)
                        break;
                    cnt++;
                    System.Threading.Thread.Sleep(50);
                }

                if (cnt == 5)
                {
                    ToolBox.logThis("Connection to Access DB did not open. Exit Process");
                    return;
                }
            }
            catch (Exception e)
            {
                ToolBox.logThis("Failed to open connections. msg -> " + e.Message + "\n" + e.StackTrace);
            }
            OleDbCommand accCmn = new OleDbCommand();
            accCmn.Connection = accCon;
//access insert query structure
            var insertAccessQuery = "INSERT INTO {0} values({1});";
// key => table name in Access, value => MySQL query to be executed
            foreach (var table in tblNQuery)
            {
//AMK: transaction starts here, one per table, so a committed transaction is never reused
                var transaction = accCon.BeginTransaction();
                accCmn.Transaction = transaction;
                try
                {
                    mySQLCon.CommandText = table.Value;
                    //execute the MySQL query
                    using (var dataReader = mySQLCon.ExecuteReader())
                    {
                        //variable to hold row data
                        var rowData = new object[dataReader.FieldCount];
                        var parameters = "";
                        //read the result set from mysql query
                        while (dataReader.Read())
                        {
                            //fill rowData with the row values
                            dataReader.GetValues(rowData);
                            //build the parameters for insert query
                            for (var i = 0; i < dataReader.FieldCount; i++)
                                parameters += "'" + rowData[i] + "',";

                            parameters = parameters.TrimEnd(',');
                            //insert to access
                            accCmn.CommandText = string.Format(insertAccessQuery, table.Key, parameters);
                            try
                            {
                                accCmn.ExecuteNonQuery();
                            }
                            catch (Exception exc)
                            {
                                ToolBox.logThis("Failed to insert to access db. msg -> " + exc.Message +
                                                "\n\tInsert query -> " + accCmn.CommandText);
                            }
                            parameters = "";
                        }
                    }
//AMK: transaction commits here if every thing is going well
                    transaction.Commit();
                }
                catch (Exception e)
                {
                    ToolBox.logThis("Failed to populate access db. msg -> " + e.Message + "\n" + e.StackTrace);
//AMK: transaction rollback here if there is a problem
                    transaction.Rollback();
                }
            }
            Disconnect();
            accCmn.Dispose();
            accCon.Close();
Shady answered 24/8, 2015 at 14:32 Comment(3)
I tried your code with a little bit of modification. But it doesn't work. I was hoping memory usage go down at every commit and size of the .mdb increases. But it is not happening and transaction does not solve the problemRollicking
Speed or growing file size? Which one is more appropriate for you? If you want the file size to grow, you may push the data by closing the connection and reopening it, or forcing a flush of the data may help. But speed-wise, that is THE killing solution.Shady
I need a common ground. I don't want to make it slower than what it is right now but memory usage is more of a big deal. It goes up to 600MB sometimes and that cannot be tolerated. I am keeping closing and reopening the connection as my last option.Rollicking

Create a DSN (data source name) for the SQL server database. Then select that DSN by opening the Microsoft Access database and choosing to import from that DSN. You should have the ability to import that exact 1GB table (schema, data, everything).

More information on using a DSN: https://support.office.com/en-us/article/Link-to-SQL-Server-data-0474c16d-a473-4458-9cf7-f369b78d3db8

Alternatively you can just link to the SQL server database (not import to an Access table) using that DSN and skip the import altogether.

Sacci answered 20/8, 2015 at 19:19 Comment(0)

Shouldn't the INSERTs be part of a TRANSACTION? Wrapping bulk INSERTs in a transaction usually speeds them up.

Liken answered 21/8, 2015 at 19:5 Comment(0)
