Stumped on clicking a link with Nokogiri and Mechanize

3

5

Perhaps I'm doing it wrong, or there's a more efficient way. Here is my problem:

First, using Nokogiri, I open an HTML document and use CSS selectors to traverse it until I find the link I need to click.

Now that I have the link, how do I use Mechanize to click it? According to the documentation, the click method of the object returned by Mechanize.new takes either a string or a Mechanize::Page::Link object.

I cannot use a string, since there could be hundreds of identical links; I only want Mechanize to click the link that Nokogiri found.

Any ideas?

Electrotechnics answered 20/9, 2011 at 22:44 Comment(0)
14

After you have found the link node you need, you can create the Mechanize::Page::Link object manually, and click it afterwards:

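require 'mechanize'

agent = Mechanize.new
page = agent.get "http://google.com"
# `at` returns a single node (`search` returns a NodeSet); Link.new expects one node.
# The XPath is illustrative and should select the <a> element itself, since it needs an href.
node = page.at ".//a[@class='posted']"
Mechanize::Page::Link.new(node, agent, page).click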
Haydenhaydn answered 21/9, 2011 at 2:36 Comment(2)
That is not the best way to go. Take a look at my answer. - Asuncion
I think this one is better in some cases, even if it is not the easiest. There were many links with the same class on the page, but I needed to pick the one to click based on the table cell it was in, relative to another table cell. So I can use Nokogiri to find that cell and then the link within it. From what I have seen, I can't do that with Mechanize's link_with. - Hemeralopia
5

An easier way than @binarycode's option:

require 'mechanize'

agent = Mechanize.new
page = agent.get "http://google.com"
# link_with returns nil when no link matches
page.link_with(:class => 'posted').click
Asuncion answered 22/9, 2011 at 9:58 Comment(2)
Your approach is best when the conditions used to find the link are very simple. Here the question poster uses Nokogiri to traverse the document, so I provided a solution that lets him use Nokogiri features and implement more complex logic for finding the correct link. - Haydenhaydn
The only limitation then is that the node must respond to .href or ['href'] or ['src']. - Asuncion
2

This is simple; you don't need Mechanize's link_with().click.

You can just get the link's href and update your page variable.

Mechanize keeps track of the current page internally, so it can follow local (relative) links.

Example:

require 'mechanize'

agent = Mechanize.new
page = agent.get "http://somesite.com"

# `at` returns the first matching Nokogiri node, or nil if nothing matches
next_page_link = page.at('your exotic selectors here')
next_page_href = next_page_link && next_page_link['href']  # e.g. '/local/link/file.html'

# A relative href is resolved against the current page,
# e.g. 'http://somesite.com/local/link/file.html'
page = agent.get(next_page_href) if next_page_href
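
To see why a local href such as '/local/link/file.html' ends up at 'http://somesite.com/local/link/file.html', here is a small sketch using only Ruby's standard uri library. It illustrates ordinary relative-URL resolution and is not Mechanize's internal code; the helper name `resolve_href` is made up for this example.

```ruby
require 'uri'

# Resolve an href found in a page against that page's URL, following the
# standard rules for relative references.
def resolve_href(current_page_url, href)
  URI.join(current_page_url, href).to_s
end

# An absolute path replaces the whole path of the current page.
puts resolve_href('http://somesite.com/index.html', '/local/link/file.html')

# A bare file name replaces only the last path segment.
puts resolve_href('http://somesite.com/a/b.html', 'c.html')
```

An href that is already a full URL passes through this kind of resolution unchanged, so the same call covers both local and external links.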
Wiggly answered 10/9, 2014 at 10:32 Comment(0)
