How do I use Mechanize to process JavaScript?

3

28

I'm connecting to a website and logging in.

The website redirects me to new pages, and Mechanize handles all the cookies and redirects, but I can't reach the last page. I repeated the same steps in a browser with Firebug and saw that there are two more pages I have to get through with Mechanize.

I took a quick look at those pages and saw some JavaScript and HTML, but I couldn't understand it because it doesn't look like normal page code. What are those pages for? How can they redirect to other pages? What should I do to get past them?

Soto answered 29/4, 2009 at 12:51 Comment(1)
Why don't you put the JavaScript into a paste site like pastie.org and post the link here? – Fendley
39

If you need to handle pages with JavaScript, try WATIR or Selenium. Both drive a real web browser and can therefore handle any JavaScript. WATIR Classic requires either IE or Firefox with a certain extension installed, and you will see the pages flash on the screen as it works.

Your other option would be to work out what the JavaScript on the offending page does and bypass it manually, but that seems onerous.

Ordinate answered 29/4, 2009 at 13:5 Comment(3)
Thank you, everybody. Watir did just what I wanted to do :) It looks great and takes me further into Ruby :) At first the website I was trying to reach was angry with me because of the user_agent, but when I set it to Firefox, the problem went away. Stack Overflow rocks! I love it here :) – Soto
@Ordinate Can you hide the browser while executing a script in WATIR? – Weatherby
Whoa, thread necromancy. I have no idea anymore and I'd wager a "no" – and at any rate, this is a different question than the one the OP asked, so you can ask it separately on SO so someone else has a chance to answer. I'd also consider looking at PhantomJS, that's a headless (i.e. "no UI") WebKit automator. It works well enough, but last time I wanted to use it for something, the asynchronous API made doing what I needed too convoluted, so I ended up going with Selenium. – Ordinate
14

At present, Mechanize doesn't handle JavaScript. There's talk of eventually merging Johnson's capabilities into Mechanize, but until that happens, you have two options:

  1. Figure out the JavaScript well enough to understand how to traverse those pages.
  2. Automate an actual browser that does understand JavaScript using Watir.
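
For option 1, here is a minimal sketch of what "figuring out the JavaScript" can look like when the intermediate pages only redirect. It assumes the redirect is written as a plain `location` assignment, a `location.replace(...)`/`location.assign(...)` call, or a meta refresh tag. Those patterns, the `js_redirect_target` helper, and the example URLs are hypothetical and not taken from the asker's site; obfuscated or computed redirects will not match.

```ruby
require 'uri'

# Common client-side redirect forms. These are assumptions about what an
# intermediate page might contain, not patterns from the actual site.
JS_REDIRECT_PATTERNS = [
  /(?:window\.|document\.)?location(?:\.href)?\s*=\s*["']([^"']+)["']/i,
  /(?:window\.|document\.)?location\.(?:replace|assign)\(\s*["']([^"']+)["']\s*\)/i,
  /<meta[^>]+http-equiv=["']?refresh["']?[^>]*content=["'][^"']*url=([^"'>\s]+)/i
].freeze

# Returns the absolute redirect target found in +html+, resolved against
# +base_url+, or nil when none of the patterns match.
def js_redirect_target(html, base_url)
  JS_REDIRECT_PATTERNS.each do |pattern|
    match = html.match(pattern)
    return URI.join(base_url, match[1]).to_s if match
  end
  nil
end
```

With Mechanize, the idea would be to pass `page.body` and `page.uri.to_s` to the helper and, if it returns a URL, fetch that URL with `agent.get` so the existing cookies are reused.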
Ist answered 29/4, 2009 at 13:11 Comment(1)
Johnson is a dead project. Is there a replacement? – Maomaoism
5

What are those pages for? How can they redirect to other pages? What should I do to pass these?

Sometimes work is done on those pages. Sometimes the JavaScript is there to prevent automated access like what you're trying to do :). A lot of websites have unnecessary checks to make sure you have a "good" browser, so make sure that your user_agent is set to something common, like IE. Sometimes setting the user_agent to look like an old browser will let you get past without JavaScript.
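
As a rough illustration of the user agent advice, the sketch below builds a request with a browser-like `User-Agent` header. To stay dependency-free it uses Ruby's standard `Net::HTTP` rather than Mechanize; in Mechanize the equivalent setting is `agent.user_agent = ...` (or `agent.user_agent_alias = ...`). The header string, the `FIREFOX_UA` constant, and the `browser_like_request` helper are made up for the example.

```ruby
require 'net/http'
require 'uri'

# Example browser-like User-Agent string. The exact value is an assumption;
# any common, current browser string should serve the same purpose.
FIREFOX_UA = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) ' \
             'Gecko/20100101 Firefox/115.0'

# Builds (but does not send) a GET request that carries the given User-Agent
# instead of Ruby's default one.
def browser_like_request(url, user_agent = FIREFOX_UA)
  request = Net::HTTP::Get.new(URI(url))
  request['User-Agent'] = user_agent
  request
end
```

The returned request can then be sent with `Net::HTTP.start(host, port, use_ssl: true) { |http| http.request(request) }`.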

Website automation is fun because you have to outsmart the website and its software developers, using multiple strategies. Like the others said, Watir is the best tool for getting past JavaScript at the moment.

Fendley answered 29/4, 2009 at 15:48 Comment(0)
