I have spent a day researching a library that can do the following:
- Retrieve the full contents of a webpage in the background, without rendering the result to a view.
- Support pages that fire off AJAX requests to load additional result data after the initial HTML has loaded.
- Let me grab elements from the resulting HTML using XPath or CSS selectors.
- In the future, possibly navigate to a next page (firing events, clicking buttons/links, submitting forms, etc.).
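To make the extraction step concrete, here is a minimal sketch of the kind of selection I mean. It uses only the standard `javax.xml` XPath API on a hard-coded, well-formed snippet. The class name `XPathSketch`, the helper `selectText`, and the sample markup are made up for illustration. It assumes the markup parses as XML, which real pages often do not, and it does nothing about the JavaScript/AJAX loading, which is the part I am actually stuck on.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class XPathSketch {

    // Parses well-formed markup and returns the text content of every node
    // matched by the given XPath expression.
    static List<String> selectText(String xhtml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));

        NodeList nodes = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);

        List<String> texts = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            texts.add(nodes.item(i).getTextContent());
        }
        return texts;
    }

    public static void main(String[] args) throws Exception {
        // Sample markup standing in for the fully loaded page (an assumption).
        String page = "<html><body>"
                + "<div class=\"result\">First</div>"
                + "<div class=\"result\">Second</div>"
                + "<div class=\"other\">Ignored</div>"
                + "</body></html>";

        System.out.println(selectText(page, "//div[@class='result']"));
    }
}
```

What I am missing is a way to obtain that fully loaded markup on Android in the first place, after the page's scripts have run.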
Here is what I have tried without success:
- Jsoup: Works great, but has no support for JavaScript/AJAX (so it does not load the full page).
- Android's built-in HttpEntity: Same problem with JavaScript/AJAX as Jsoup.
- HtmlUnit: Looks like exactly what I need, but after hours I cannot get it to work on Android. (Other users failed when trying to load the 12MB+ worth of jar files. I loaded the full source code and referenced it as a project library, only to find that things such as Applets and java.awt, which HtmlUnit uses, do not exist on Android.)
- Rhino: I find this very confusing. I don't know how to get it working on Android, or even whether it is what I am looking for.
- Selenium Driver: Looks like it could work, but there is no straightforward way to run it headless, so that the actual HTML is not displayed in a view.
I really want HtmlUnit to work, as it seems the best suited to my needs. Is there any way to get it running on Android, or is there another library I have missed that would be suitable?
I am currently using Android Studio 0.1.7 and can move to Eclipse if needed.
Thanks in advance!