How to use HTTP, SOCKS4 and SOCKS5 proxies in Java?
S

3

6

I want to screen-scrape a website, and for that I want to use HTTP, SOCKS4 and SOCKS5 proxies. My questions are as follows:

  1. Is it possible to use these proxies from Java without any external API? For instance, is it possible to send a request with HttpURLConnection through these proxies?

  2. If it is not possible, what external APIs can I use?

  3. I was using the headless browser provided by HtmlUnit, but it takes a long time to load even simple web pages. Could you suggest other APIs (if any) that provide headless browsers that load pages quickly? I don't need to open pages that contain heavy AJAX or JavaScript code; I just need to click form buttons through the headless browser.

Satori answered 16/1, 2010 at 15:50 Comment(0)
B
3

Is it possible to use these proxies from Java without any external API? For instance, is it possible to send a request with HttpURLConnection through these proxies?

Yes, you can configure proxies using (global) system properties, the Proxy class, or a ProxySelector. The latter two options have been available since Java 5 and are more flexible. Have a look at Java Networking and Proxies, as mentioned by jarnbjo, for all the details.
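As a rough illustration of the ProxySelector option, here is a minimal sketch that sends http/https URIs to an HTTP proxy and everything else (for example the socket:// URIs the JDK uses for plain sockets) to a SOCKS proxy. The class name SchemeProxySelector and the addresses 127.0.0.1:8080 and 127.0.0.1:1080 are assumptions made up for this example, not something from the linked guide.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.SocketAddress;
import java.net.URI;
import java.util.List;

/** Hypothetical selector: HTTP proxy for http/https URIs, SOCKS proxy for anything else. */
final class SchemeProxySelector extends ProxySelector {
    private final Proxy httpProxy;
    private final Proxy socksProxy;

    SchemeProxySelector(Proxy httpProxy, Proxy socksProxy) {
        this.httpProxy = httpProxy;
        this.socksProxy = socksProxy;
    }

    @Override
    public List<Proxy> select(URI uri) {
        if (uri == null) {
            throw new IllegalArgumentException("uri must not be null");
        }
        String scheme = uri.getScheme();
        if ("http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme)) {
            return List.of(httpProxy);
        }
        // Other schemes (e.g. "socket", "ftp") fall through to the SOCKS proxy.
        return List.of(socksProxy);
    }

    @Override
    public void connectFailed(URI uri, SocketAddress sa, IOException ioe) {
        // A real selector might mark the proxy as bad and return a different one next time.
        System.err.println("Proxy " + sa + " failed for " + uri + ": " + ioe.getMessage());
    }

    public static void main(String[] args) {
        // Placeholder proxy addresses; replace with real ones.
        Proxy http = new Proxy(Proxy.Type.HTTP, new InetSocketAddress("127.0.0.1", 8080));
        Proxy socks = new Proxy(Proxy.Type.SOCKS, new InetSocketAddress("127.0.0.1", 1080));

        // Installing the selector as the default affects the whole JVM.
        ProxySelector.setDefault(new SchemeProxySelector(http, socks));
        System.out.println(ProxySelector.getDefault().select(URI.create("http://example.com/")));
    }
}
```

Once installed with ProxySelector.setDefault, the selector is consulted for connections that are not given an explicit Proxy, so it is a middle ground between JVM-wide properties and per-connection Proxy objects.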

I was using the headless browser provided by HtmlUnit, but it takes a long time to load even simple web pages. Could you suggest other APIs (if any) that provide headless browsers that load pages quickly? I don't need to open pages that contain heavy AJAX or JavaScript code; I just need to click form buttons through the headless browser.

Unfortunately, the first alternatives I can think of are either HtmlUnit based (like JWebUnit or WebTest) or slower (Selenium, WebDriver - that you can run in headless mode). But maybe you could try HttpUnit if you don't need advanced JavaScript support.

Boone answered 16/1, 2010 at 17:20 Comment(2)
Your answer is very informative. I have already used Selenium too, and you are right that it is slower than HtmlUnit, so there is no point in replacing HtmlUnit with Selenium. I also tried HttpUnit two days ago, but the .jar file I downloaded depends on various other libraries, so when I tried to run the program there were many errors about missing references. I downloaded some of those libraries but couldn't get all of them, so I stopped using it. - Satori
With Maven or Ivy, it would be pretty easy to set up your project (with the dependencies). If you're not using one of these tools, the dependencies are listed here, for example: mvnrepository.com/artifact/httpunit/httpunit/1.6.2 - Boone
M
1

Yes, that is possible. You can find the configuration options for different network proxies here.
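For the system-property route, a minimal sketch might look like the following. The property names (http.proxyHost, http.proxyPort, https.proxyHost, https.proxyPort, http.nonProxyHosts, socksProxyHost, socksProxyPort, socksProxyVersion) are the standard JDK networking properties as far as I know; the class GlobalProxyProperties and its methods are hypothetical helpers invented for this example.

```java
/** Hypothetical helper that configures JVM-wide proxies through system properties. */
final class GlobalProxyProperties {

    private static final String[] KEYS = {
        "http.proxyHost", "http.proxyPort", "https.proxyHost", "https.proxyPort",
        "http.nonProxyHosts", "socksProxyHost", "socksProxyPort", "socksProxyVersion"
    };

    /** Routes http and https traffic through an HTTP proxy, except for hosts matching nonProxyHosts. */
    static void useHttpProxy(String host, int port, String nonProxyHosts) {
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", Integer.toString(port));
        System.setProperty("https.proxyHost", host);
        System.setProperty("https.proxyPort", Integer.toString(port));
        // Patterns are separated by '|', e.g. "localhost|127.*|*.internal.example".
        System.setProperty("http.nonProxyHosts", nonProxyHosts);
    }

    /** Routes socket traffic through a SOCKS proxy; version must be 4 or 5. */
    static void useSocksProxy(String host, int port, int version) {
        if (version != 4 && version != 5) {
            throw new IllegalArgumentException("SOCKS version must be 4 or 5, got " + version);
        }
        System.setProperty("socksProxyHost", host);
        System.setProperty("socksProxyPort", Integer.toString(port));
        System.setProperty("socksProxyVersion", Integer.toString(version));
    }

    /** Removes every proxy property set by this helper. */
    static void clear() {
        for (String key : KEYS) {
            System.clearProperty(key);
        }
    }

    public static void main(String[] args) {
        // Placeholder address; a SOCKS4 server is assumed to listen on 127.0.0.1:1080.
        useSocksProxy("127.0.0.1", 1080, 4);
        System.out.println("socksProxyVersion=" + System.getProperty("socksProxyVersion"));
        clear();
    }
}
```

The same properties can be passed on the command line with -D, for example -DsocksProxyHost=127.0.0.1 -DsocksProxyPort=1080. They apply to the whole JVM, so they suit a scraper that uses one proxy at a time better than one that rotates proxies per request.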

Marozik answered 16/1, 2010 at 16:14 Comment(0)
B
0

You can set up per-connection proxies. Here is an example with the Java 11 HttpClient and the legacy HttpURLConnection:

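import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.SocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;

public class ProxyExamples {

    // Java 11+ HttpClient: the proxy is chosen by a ProxySelector attached to this client only.
    // Proxy.Type.HTTP is used here because, to my knowledge, this client ignores SOCKS entries
    // returned by the selector and connects directly instead.
    public static void java11Http(String url) throws Exception {
        ProxySelector proxySelector = new ProxySelector() {
            @Override
            public List<Proxy> select(URI uri) {
                return List.of(new Proxy(Proxy.Type.HTTP, new InetSocketAddress("127.0.0.1", 1234)));
            }
            @Override
            public void connectFailed(URI uri, SocketAddress sa, IOException ioe) {
                ioe.printStackTrace();
            }
        };

        HttpClient client = HttpClient.newBuilder()
                .proxy(proxySelector)
                .build();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(url))
                .build();

        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }

    // Legacy HttpURLConnection: pass the Proxy to openConnection(); SOCKS and HTTP types both work.
    private static void legacyJavaHttp(String url) {
        SocketAddress proxyAddr = new InetSocketAddress("127.0.0.1", 1234);
        Proxy pr = new Proxy(Proxy.Type.SOCKS, proxyAddr);

        try {
            HttpURLConnection con = (HttpURLConnection) URI.create(url).toURL().openConnection(pr);
            con.setConnectTimeout(5000); // milliseconds
            con.setReadTimeout(5000);    // milliseconds
            con.connect();
            // Prints the status message (e.g. "OK"), not the response body.
            System.out.println(con.getResponseMessage());
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
}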

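With HttpURLConnection you can use either SOCKS or HTTP proxying by passing Proxy.Type.SOCKS or Proxy.Type.HTTP. As far as I know, the Java 11 HttpClient only honors HTTP proxies and ignores SOCKS entries returned by its ProxySelector.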
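If the proxies come from a list of strings, a small parser can turn each entry into a Proxy object. The sketch below is only an illustration: the class ProxySpecs and the spec format (http://host:port, socks://host:port, socks4://..., socks5://...) are assumptions made up for this example. Note that Proxy.Type has a single SOCKS value, so both socks4 and socks5 map to it; the protocol version is not chosen per Proxy object.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.util.Locale;

/** Hypothetical parser for proxy specs such as "socks5://127.0.0.1:1080" or "http://127.0.0.1:8080". */
final class ProxySpecs {

    static Proxy parse(String spec) {
        URI uri = URI.create(spec);
        String scheme = uri.getScheme();
        String host = uri.getHost();
        int port = uri.getPort();
        if (scheme == null || host == null || port == -1) {
            throw new IllegalArgumentException("Expected scheme://host:port but got: " + spec);
        }

        Proxy.Type type;
        switch (scheme.toLowerCase(Locale.ROOT)) {
            case "http":
                type = Proxy.Type.HTTP;
                break;
            case "socks":
            case "socks4":
            case "socks5":
                // java.net.Proxy does not distinguish SOCKS versions.
                type = Proxy.Type.SOCKS;
                break;
            default:
                throw new IllegalArgumentException("Unsupported proxy scheme: " + scheme);
        }
        // createUnresolved avoids a DNS lookup while parsing; the name is resolved on connect.
        return new Proxy(type, InetSocketAddress.createUnresolved(host, port));
    }

    public static void main(String[] args) {
        Proxy proxy = parse("socks5://127.0.0.1:1080");
        System.out.println(proxy.type() + " via " + proxy.address());
    }
}
```

The returned Proxy can then be passed to URL.openConnection(proxy), as in the legacy example above.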

You can read more on Java proxying here: https://docs.oracle.com/javase/8/docs/technotes/guides/net/proxies.html

Bight answered 17/7, 2024 at 10:24 Comment(0)
