C# Excel file OLEDB read HTML IMPORT
Asked Answered
I

3

6

I have to automate something for the finance dpt. I've got an Excel file which I want to read using OleDb:

string connectionString = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=A_File.xls;Extended Properties=""HTML Import;IMEX=1;""";

using (OleDbConnection connection = new OleDbConnection())
{
    using (DbCommand command = connection.CreateCommand())
    {
        connection.ConnectionString = connectionString;
        connection.Open();

        DataTable dtSchema = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);                        
        if( (null == dtSchema) || ( dtSchema.Rows.Count <= 0 ) )                        
        {                                
            //raise exception if needed                        
        }

        command.CommandText = "SELECT * FROM [NameOfTheWorksheet$]";

        using (DbDataReader dr = command.ExecuteReader())
        {
            while (dr.Read())
            {
                //do something with the data
            }
        }
    }
}

Normally the connectionstring would have an extended property "Excel 8.0", but the file can't be read that way because it seems to be an html file renamed to .xls. when I copy the data from the xls to a new xls, I can read the new xls with the E.P. set to "Excel 8.0".

Yes, I can read the file by creating an instance of Excel, but I rather not.. Any idea how I can read the xls using OleDb without making manual changes to the xls or by playing with ranges in a instanciated Excel?

Regards,

Michel

Inhalant answered 6/8, 2009 at 12:10 Comment(0)
S
5

I asked this same question on another forum and got the answer so I figured I'd share it here. As per this article: http://ewbi.blogs.com/develops/2006/12/reading_html_ta.html

Instead of using the sheetname, you must use the page title in the select statement without the $. SELECT * FROM [HTMLPageTitle]

Smack answered 14/8, 2009 at 19:51 Comment(2)
Thank you for your answer, it looks like it will do the trick. I had to finish my program so I implemented it with by instantiating Excel for now. When I get time I will revise the code!Inhalant
thats an important thing if your HTML file doesn't have a <title> tag in metadata you can try to make a query like this: SELECT * FROM [Table]Norean
N
1

I've been searching so many solution, end up I found something really simple and easy - to import XML file to Excel file, I tried to convert XML to HTML first, use -

http://www.csharpfriends.com/Articles/getArticle.aspx?articleID=63

then I found I could easily change my output file as .xls, instead of .html

        //create the output stream
        XmlTextWriter myWriter = new XmlTextWriter
("result.html", null);

then the output is perfect Excel file from my XML data file.

hope this will save ur work.

Nava answered 30/3, 2010 at 17:18 Comment(0)
E
1

I have run into the same problem. As previously mentioned, it seems to be an html file renamed to .xls. When I copy the data from the xls to a new xls, I can read the new xls with the E.P. set to "Excel 8.0".

In this scenario, the file couldn't be saved in the correct format. So we have to convert that file to the correct format. To do this, use MS Office Excel 2007, Click File -> Convert. The file will be converted to the right format automatically.

Erbium answered 17/8, 2010 at 11:8 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.