TL;DR: a web client uses CONNECT only when it knows it is talking to a proxy and the final URI begins with https://.
When a browser says:
CONNECT www.google.com:443 HTTP/1.1
it means:
Hi proxy, please open a raw TCP connection to Google; any bytes I
write from now on, you just repeat over that connection without any
interpretation. Oh, and one more thing: do that only if you talk to
Google directly. If you go through another proxy yourself, just send
it the same CONNECT instead.
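The proxy's side of that request can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular proxy is implemented: `parse_connect` and `pump` are hypothetical helper names, and a real proxy would also read the remaining request headers, reply with a 2xx status before relaying, and copy bytes in both directions.

```python
import socket

def parse_connect(request_line):
    """Split 'CONNECT host:port HTTP/1.1' into (host, port)."""
    method, target, _version = request_line.strip().split(" ")
    if method != "CONNECT":
        raise ValueError("not a CONNECT request")
    # The CONNECT target is always host:port, never a full URI.
    host, sep, port = target.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("CONNECT target must be host:port")
    return host.strip("[]"), int(port)  # strip brackets around IPv6 literals

def pump(src, dst):
    """Copy bytes from src to dst until src reaches EOF, without interpreting them."""
    while True:
        chunk = src.recv(65536)
        if not chunk:
            break
        dst.sendall(chunk)
```

The proxy never looks inside `chunk`: it could be a TLS handshake, ssh, or anything else.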
Note how this says nothing about TLS (https). In fact, CONNECT is orthogonal to TLS: you can use either one without the other, or both together.
That being said, the intent of CONNECT is to allow an end-to-end encrypted TLS session, so that the data is unreadable to the proxy (or to a whole proxy chain). It works even if a proxy doesn't understand TLS at all, because CONNECT can be issued over plain HTTP and requires nothing more from the proxy than copying raw bytes around.
The connection to the first proxy can itself be TLS (https), although that means the traffic between you and the first proxy is encrypted twice.
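From the client's side, the common case (plain HTTP to the proxy, TLS end to end with the server) can be sketched with Python's standard `http.client`. The proxy name `proxy.example`, port 8080, and the helper name `tunnel_connection` are assumptions made up for the example.

```python
import http.client

def tunnel_connection(proxy_host, proxy_port, target_host, target_port=443):
    """Prepare an HTTPS connection to target_host that goes through a proxy."""
    # The TCP connection is made to the proxy, not to the target.
    conn = http.client.HTTPSConnection(proxy_host, proxy_port, timeout=10)
    # On connect, http.client sends "CONNECT target_host:target_port" to the
    # proxy in plain text, then runs the TLS handshake with the target
    # through the resulting tunnel.
    conn.set_tunnel(target_host, target_port)
    return conn

# Nothing is sent until the first request, for example:
# conn = tunnel_connection("proxy.example", 8080, "www.google.com")
# conn.request("GET", "/")
# print(conn.getresponse().status)
```

The proxy sees only the CONNECT line and the ciphertext that follows it; the GET itself travels inside the TLS session.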
Obviously, it makes no sense to use CONNECT when talking directly to the final server. You just start talking TLS and then issue an HTTP GET. End servers normally disable CONNECT altogether.
To a proxy, CONNECT support adds security risks. Any data can be passed through CONNECT, even an ssh hacking attempt against a server on 192.168.1.*, or SMTP sending spam. The outside world sees these attacks as regular TCP connections initiated by the proxy. The targets don't care about the reason, and they cannot check whether HTTP CONNECT is to blame. Hence it's up to proxies to secure themselves against misuse.
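One way a proxy might guard against this is to check every CONNECT target against a policy before opening the connection. The sketch below is an assumption about what such a policy could look like, not a complete defence: `ALLOWED_PORTS` and `connect_allowed` are hypothetical names, and the check should be applied to the address the proxy actually resolved, not to the hostname the client sent.

```python
import ipaddress

# Hypothetical policy: tunnel only to the standard HTTPS port.
ALLOWED_PORTS = {443}

def connect_allowed(resolved_ip, port):
    """Return True if a CONNECT to this already-resolved address is acceptable."""
    if port not in ALLOWED_PORTS:
        return False  # blocks ssh (22), SMTP (25), and so on
    addr = ipaddress.ip_address(resolved_ip)
    # Refuse internal targets such as 192.168.1.* or the proxy's own loopback.
    if addr.is_private or addr.is_loopback or addr.is_link_local or addr.is_multicast:
        return False
    return True
```

For example, `connect_allowed("192.168.1.10", 22)` is rejected both because of the port and because the address is private.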