Relevant information: issue 3602 on GitHub
I'm working on a project that gathers and tests public/free proxies. I noticed that when I use the curl_multi interface to test these proxies, I sometimes get many error 28 (timeout) results. This never happens if I test each proxy on its own.
The problem is that the issue is not reliably reproducible: it does not always show up, and it could be caused by something in curl or by something else.
Unfortunately, I'm not experienced at network debugging and I don't know how to investigate this issue at a deeper level. However, I wrote 2 C test programs (one of them was originally written by Daniel Stenberg, but I modified its output to match the format of the other C program). These 2 C programs test 407 public proxies using curl:
- with the curl_multi interface (which has the problem)
- with curl on many threads, each curl request running on its own thread (which has no problem)
These are the 2 C programs I wrote for testing. I'm not a C developer, so please let me know about anything wrong you notice in them.
This is the original PHP class that I used for reproducing the issue a month ago.
And these are the test results of the 2 C programs. You can see that the curl_multi tests mostly time out, while the curl-threads results are stable (about 50 out of the 407 proxies are working).
This is a sample from the test results. Compare columns 4 and 5: the curl-threads program times out ~170 times and connects successfully ~40 times, whereas curl_multi makes 0 successful connections and times out ~300 times out of the same 407 proxies.
column(1) : #
column(2) : time(UTC)
column(3) : total execution time (seconds)
column(4) : no error 0 (how many requests result in no error CURLE_OK)
column(5) : error 28 (how many requests result in error 28 CURLE_OPERATION_TIMEDOUT)
column(6) : error 7 (how many requests result in error 7 CURLE_COULDNT_CONNECT)
column(7) : error 35 (how many requests result in error 35 CURLE_SSL_CONNECT_ERROR)
column(8) : error 56 (how many requests result in error 56 CURLE_RECV_ERROR)
column(9) : other errors (how many requests result in errors other than the above)
column(10) : program that used curl
column(11) : cURL version
c(1)  c(2)                c(3)  c(4)  c(5)  c(6)  c(7)  c(8)  c(9)  c(10)                                      c(11)
267   2019-3-28 01:58:01  40    43    176   183   1     4     0     C (curl - threads) (Linux Fedora)          7.59.0
268   2019-3-28 01:59:01  30    0     286   110   1     10    0     C (curl-multi one thread) (Linux Fedora)   7.59.0
269   2019-3-28 02:00:01  30    46    169   181   1     8     2     C (curl - threads) (Linux Fedora)          7.59.0
270   2019-3-28 02:01:01  31    0     331   74    1     1     0     C (curl-multi one thread) (Linux Fedora)   7.59.0
271   2019-3-28 02:02:01  30    42    173   186   1     4     1     C (curl - threads) (Linux Fedora)          7.59.0
272   2019-3-28 02:03:01  30    0     277   116   1     13    0     C (curl-multi one thread) (Linux Fedora)   7.59.0
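For readers who want to reproduce the counts in columns 4 to 9, here is a minimal sketch of how per-request result codes could be bucketed. It is not taken from either of the 2 C programs: the `struct tally` and `tally_results` names are hypothetical, and the constants simply mirror the numeric values of the libcurl codes listed above so that the sketch compiles without the libcurl headers. A real program would use the `CURLcode` values from `curl/curl.h` instead.

```c
#include <stddef.h>

/* Numeric values of the libcurl result codes named in the column list.
 * Assumption: mirrored here as plain ints instead of including curl/curl.h. */
enum {
    RES_OK = 0,               /* CURLE_OK */
    RES_COULDNT_CONNECT = 7,  /* CURLE_COULDNT_CONNECT */
    RES_TIMEDOUT = 28,        /* CURLE_OPERATION_TIMEDOUT */
    RES_SSL_CONNECT = 35,     /* CURLE_SSL_CONNECT_ERROR */
    RES_RECV_ERROR = 56       /* CURLE_RECV_ERROR */
};

/* One row of the results table, columns 4 to 9 (hypothetical struct). */
struct tally {
    int ok;
    int timedout;
    int couldnt_connect;
    int ssl_connect;
    int recv_error;
    int other;
};

/* Count how many of the n result codes fall into each column. */
struct tally tally_results(const int *codes, size_t n)
{
    struct tally t = {0, 0, 0, 0, 0, 0};
    for (size_t i = 0; i < n; i++) {
        switch (codes[i]) {
        case RES_OK:              t.ok++;              break;
        case RES_TIMEDOUT:        t.timedout++;        break;
        case RES_COULDNT_CONNECT: t.couldnt_connect++; break;
        case RES_SSL_CONNECT:     t.ssl_connect++;     break;
        case RES_RECV_ERROR:      t.recv_error++;      break;
        default:                  t.other++;           break;
        }
    }
    return t;
}
```

Since every request ends in exactly one bucket, the six counts in a row should add up to the number of proxies tested, which is a quick sanity check on each line of the table.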
Why does curl_multi time out inconsistently on most of the connections, while curl-threads never does?
I downloaded Wireshark and used it to capture the traffic while each of the 2 C programs was running. I also filtered the traffic to the list of proxies used by the 2 C programs, and saved the files on GitHub.
the curl-threads program (the expected behavior)
63 successful connections and 158 timed-out connections out of 407 proxies.
- this is the program output.
- this is the Wireshark .pcapng raw file.
the curl_multi program (the unexpected behavior)
0 successful connections and 272 timed-out connections out of 407 proxies.
- this is the program output.
- this is the Wireshark .pcapng raw file.
You can open the .pcapng files in Wireshark and see the traffic recorded on my computer during both the expected and the unexpected behavior. I filtered the traffic to the 407 proxy IPs and left Wireshark running for a little while after curl's 30-second limit, because I noticed some packets still showing up. I don't know Wireshark or this level of networking well, but I thought this could be useful.
Note on the bandwidth:
Open the .pcapng file of the curl-threads program (the normal behavior) in Wireshark and go to Statistics > Conversations. You will see a window like this.
I have copied the data and saved it here on GitHub. Now calculate the sum of the bytes sent from A->B and B->A.
The ENTIRE bandwidth needed to work normally is about 692.8 KB.
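The summation described above can be sketched as follows. This is only an illustration: `struct conversation`, `total_bytes` and `average_rate` are hypothetical names, and the struct assumes each row of the Statistics > Conversations data has been reduced to its two byte counters.

```c
#include <stddef.h>

/* One row of the Conversations data, reduced to the two byte counters
 * (hypothetical struct, not part of Wireshark or curl). */
struct conversation {
    unsigned long long bytes_a_to_b;
    unsigned long long bytes_b_to_a;
};

/* Sum the bytes sent in both directions over all conversations. */
unsigned long long total_bytes(const struct conversation *rows, size_t n)
{
    unsigned long long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += rows[i].bytes_a_to_b + rows[i].bytes_b_to_a;
    return sum;
}

/* Average rate in bytes per second over a run of the given length.
 * Returns 0 for a non-positive duration to avoid dividing by zero. */
double average_rate(unsigned long long bytes, double seconds)
{
    return seconds > 0.0 ? (double)bytes / seconds : 0.0;
}
```

As a rough figure, if the whole 692.8 KB were spread over the 30-second limit, the average would be about 23 KB per second. The capture ran slightly longer than 30 seconds, so the real average is somewhat lower.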
CURLOPT_VERBOSE. It may also be worth considering the C version provided by badger on GitHub for consistency. – Geochemistry
strace curl. – Geochemistry
curl for 1 request on 1 thread, and it works fine. curl_multi still produces 28 timeout errors, I don't use it anymore. – Gayomart