HtmlUnit is an awesome Java library that allows you to programmatically fill out and submit web forms. I'm currently maintaining a pretty old system written in ASP, and instead of manually filling out this one web form every month as I'm required to, I'm trying to find a way to automate the entire task because I keep forgetting about it. It's a form for retrieving data gathered within a month. Here's what I've coded so far:
WebClient client = new WebClient();
HtmlPage page = client.getPage("http://urlOfTheWebsite.com/search.aspx");
HtmlForm form = page.getFormByName("aspnetForm");
HtmlSelect frMonth = form.getSelectByName("ctl00$cphContent$ddlStartMonth");
HtmlSelect frDay = form.getSelectByName("ctl00$cphContent$ddlStartDay");
HtmlSelect frYear = form.getSelectByName("ctl00$cphContent$ddlStartYear");
HtmlSelect toMonth = form.getSelectByName("ctl00$cphContent$ddlEndMonth");
HtmlSelect toDay = form.getSelectByName("ctl00$cphContent$ddlEndDay");
HtmlSelect toYear = form.getSelectByName("ctl00$cphContent$ddlEndYear");
HtmlCheckBoxInput games = form.getInputByName("ctl00$cphContent$chkListLottoGame$0");
HtmlSubmitInput submit = form.getInputByName("ctl00$cphContent$btnSearch");
frMonth.setSelectedAttribute("1", true);
frDay.setSelectedAttribute("1", true);
frYear.setSelectedAttribute("2012", true);
toMonth.setSelectedAttribute("1", true);
toDay.setSelectedAttribute("31", true);
toYear.setSelectedAttribute("2012", true);
games.setChecked(true);
submit.click();
After the click(), I'm supposed to wait for the very same web page to finish reloading, because somewhere on it there is a table that displays the results of my search. Then, when the page is done loading, I need to download it as an HTML file (very much like "Save Page As..." in your favorite browser) so that I can scrape the data and compute the totals. I've already written that part using the Jsoup library.
My questions are: 1. How do I programmatically wait for the web page to finish loading in HtmlUnit? 2. How do I programmatically download the resulting web page as an HTML file?
I've looked into the HtmlUnit docs already and couldn't find a class that'll do what I need.
asXml() does work! Do you know anything about waiting for the page to reload, though? I tried making the thread sleep for 30 seconds after my call to click() and successfully wrote the result of asXml() to an HTML file, but while the <select> elements are properly modified, the results don't show in the table. I'm assuming this might be because I need a new HtmlPage reference to the resulting page (which is basically just the same page again), but how do I do that? – Heir
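For what it's worth, here is a minimal sketch of the idea raised in this comment: keep the page that click() returns instead of re-reading the original page variable, give any background JavaScript a chance to finish, and write asXml() to disk. The class name SearchResultSaver, the saveHtml helper, the results.html file name, the 10-second timeout, and the UTF-8 encoding are all assumptions for illustration. The imports assume an HtmlUnit 2.x release (com.gargoylesoftware packages); newer releases use org.htmlunit instead. The date selection is omitted and would stay as in the question.

```java
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlCheckBoxInput;
import com.gargoylesoftware.htmlunit.html.HtmlForm;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlSubmitInput;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class SearchResultSaver {

    // Hypothetical helper: writes the markup to a file (UTF-8 is an assumption).
    static void saveHtml(String html, Path target) throws IOException {
        Files.write(target, html.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        WebClient client = new WebClient();
        HtmlPage page = client.getPage("http://urlOfTheWebsite.com/search.aspx");
        HtmlForm form = page.getFormByName("aspnetForm");

        // ... set the six date <select> elements exactly as in the question ...

        HtmlCheckBoxInput games = form.getInputByName("ctl00$cphContent$chkListLottoGame$0");
        games.setChecked(true);
        HtmlSubmitInput submit = form.getInputByName("ctl00$cphContent$btnSearch");

        // click() returns the page produced by the submit; keep that reference
        // rather than calling asXml() on the original 'page' variable.
        HtmlPage result = submit.click();

        // Wait up to 10 seconds (assumed value) for background JavaScript jobs.
        client.waitForBackgroundJavaScript(10000);

        // Save the resulting markup, roughly like "Save Page As...".
        saveHtml(result.asXml(), Paths.get("results.html"));
    }
}
```

If the table is still empty with this approach, the results may be loaded in some way this sketch does not cover, so treat it as a starting point rather than a confirmed fix.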