MSHTML: CreateDocumentFromString instead of CreateDocumentFromUrl

I'd like to use the MSHTML library to parse some HTML that I have in a string variable. However, I can't figure out how to do this. I can easily parse the contents of a webpage given a known URL, but not the source HTML directly. Is this possible? If so, how?

Public Sub ParseHTML(sHTML As String)
Dim oHTML As New HTMLDocument, oDoc As HTMLDocument

    'This works:'
    Set oDoc = oHTML.createDocumentFromUrl("http://www.google.com", "")

    'I would like to do the following but no such method actually exists:'
    Set oDoc = oHTML.createDocumentFromString(sHTML)

    ....
    'Parse the HTML using the oDoc variable'
    ....
End Sub
Butz answered 3/4, 2012 at 14:23 Comment(0)

You can:

Dim odoc As Object

Set odoc = CreateObject("htmlfile") '// late binding

'// or:
'// Set odoc = New HTMLDocument 
'// for early binding

odoc.open
odoc.write "<p> In his house at R'lyeh, dead <b>Cthulhu</b> waits dreaming</p>"
odoc.Close
MsgBox odoc.body.outerHTML
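
Once the document has been closed, the usual DOM methods should be available on it. The following is a minimal sketch of that, assuming late binding as above; the names HtmlFromString and DemoHtmlFromString and the sample markup are illustrative and not part of the original answer.

```vba
' Hypothetical helper: parse an HTML string into a late-bound MSHTML document.
Public Function HtmlFromString(ByVal sHTML As String) As Object
    Dim oDoc As Object
    Set oDoc = CreateObject("htmlfile")   ' late-bound HTML document
    oDoc.Open
    oDoc.Write sHTML
    oDoc.Close
    Set HtmlFromString = oDoc
End Function

' Hypothetical usage: walk the parsed elements.
Public Sub DemoHtmlFromString()
    Dim oDoc As Object, oItem As Object
    Set oDoc = HtmlFromString("<ul><li>one</li><li>two</li></ul>")

    ' Visit every <li> element and print its text to the Immediate window.
    For Each oItem In oDoc.getElementsByTagName("li")
        Debug.Print oItem.innerText
    Next oItem
End Sub
```

Returning the document as Object keeps every call late-bound, which sidesteps the restricted-interface compile error at the cost of IntelliSense and compile-time checking.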
Gregoor answered 3/4, 2012 at 14:51 Comment(6)
Nice! Note to others: I received a compile error in VBA when I tried to declare odoc As HTMLDocument: "Compile error: Function or interface marked as restricted, or the function uses an Automation type not supported in Visual Basic." Changing the declaration to odoc As Object (as this answer clearly shows) fixed the problem. – Butz
Yep, I agree, nice is the word. – Terryl
@Alex: Hope you don't mind, but I edited your answer to include a way to reference the library late-bound. It's non-obvious and took me some time to find on the web. – Butz
Early vs. late binding has nothing to do with the way the class is instantiated. It's the Dim-ensioning part that is important: Dim ... As Object is late binding, Dim ... As ClassOrInterface is early binding. – Leonor
Worked, thanks! Note that early binding caused an error for me: "Function or interface marked as restricted." – Fetching
If you want to keep the early binding, you can call the restricted Write method using CallByName, like so: CallByName odoc, "Write", vbMethod, "<p> In his house at R'lyeh, dead <b>Cthulhu</b> waits dreaming</p>" – Rozek

For a plain HTML fragment, such as the contents of an Access Rich Text field, this does it:

Dim HTMLDoc As New HTMLDocument

HTMLDoc.Body.innerHTML = strHTMLText
Flowing answered 18/4, 2015 at 9:13 Comment(0)

This is a more complete example (VB.NET). It avoids a null reference exception, and the parsed document ends up in an early-bound variable.

(If you use WPF, just add a reference to System.Windows.Forms.)

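' VB.NET; relies on late-bound calls, so Option Strict must be Off.
' code is the HTML source string to parse.
Dim a As Object = New mshtml.HTMLDocument

a.open()
a.writeln(code)
a.close()

' Pump messages until MSHTML has finished parsing.
Do Until a.readyState = "complete"
    System.Windows.Forms.Application.DoEvents()
Loop

' Assign to a typed variable for early-bound access.
Dim doc As mshtml.HTMLDocument = a
Dim b As mshtml.HTMLSelectElement = doc.getElementsByTagName("Select").item("lang", 0)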
Srini answered 10/12, 2013 at 0:51 Comment(0)