How to let Ruby Mechanize get a page which lives in a string

Normally Mechanize fetches a webpage from a URL, and the get method returns a Mechanize::Page object that exposes many useful methods.

If the page lives in a string, how do I get the same Mechanize::Page object?

require 'mechanize'

html = <<END_OF_STRING
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
<title>Page Title</title>
<style type="text/css">
</style>
</head>
<body>
<h1>This is a test</h1>
</body>
</html>
END_OF_STRING

agent = Mechanize.new

# How can I get the page result from the string html?
#page = ...
Residence answered 3/3, 2012 at 19:56

Mechanize uses Nokogiri to parse the HTML. If the HTML is already in a string and you do not need to fetch it over HTTP, you do not need Mechanize at all. All you want to do is parse the input HTML, right?

The following will let you do this:

require 'nokogiri'
html = 'html here'
page = Nokogiri::HTML(html)

If you have the Mechanize gem installed you will already have Nokogiri.

If you do need a Mechanize::Page object, you can still build one from the string:

require 'mechanize'
html = 'html here'
a = Mechanize.new
# Arguments: uri, response headers, body, status code, agent
page2 = Mechanize::Page.new(nil, { 'content-type' => 'text/html' }, html, nil, a)
Resistor answered 3/3, 2012 at 20:33
