Python - Twisted, Proxy and modifying content

So I've looked around at a few things involving writing an HTTP proxy using Python and the Twisted framework.

Essentially, like some other questions, I'd like to be able to modify the data that will be sent back to the browser. That is, the browser requests a resource and the proxy fetches it. Before the resource is returned to the browser, I'd like to be able to modify any part of the response (HTTP headers and content).

This (Need help writing a twisted proxy) was what I initially found. I tried it out, but it didn't work for me. I also found this (Python Twisted proxy - how to intercept packets), which I thought would work; however, I can only see the HTTP requests from the browser.

I am looking for any advice. One thought I have is to use the ProxyClient and ProxyRequest classes and override their methods, but I read that the Proxy class itself is a combination of both.

For those who may ask to see some code, it should be noted that I have worked with only the above two examples. Any help is great.

Thanks.

Maestricht answered 27/2, 2012 at 12:45 Comment(2)
Why didn't those solutions work for you? Did you get a traceback, did nothing happen when you ran it, or did you not understand how to modify the classes for your needs? - Trident
Good question. I forgot to mention this. For the first one, "Need help writing a twisted proxy", I added a ProxyFactory and reactor to the answer code and it worked as a bypass proxy, but no inversion. The second I got working; however, it would only print the HTTP requests from the browser. I was not able to get it to print the requested pages. - Maestricht

To create a ProxyFactory that can modify server response headers and content, you could override the ProxyClient.handle*() methods:

from twisted.python import log
from twisted.web import http, proxy

class ProxyClient(proxy.ProxyClient):
    """Mangle returned header, content here.

    Use `self.father` methods to modify request directly.
    """
    def handleHeader(self, key, value):
        # change response header here
        log.msg("Header: %s: %s" % (key, value))
        proxy.ProxyClient.handleHeader(self, key, value)

    def handleResponsePart(self, buffer):
        # change response part here
        log.msg("Content: %s" % (buffer[:50],))
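        # make all content upper case (demo only: this corrupts compressed
        # or binary bodies; pass `buffer` through unchanged for those)
        proxy.ProxyClient.handleResponsePart(self, buffer.upper())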

class ProxyClientFactory(proxy.ProxyClientFactory):
    protocol = ProxyClient

class ProxyRequest(proxy.ProxyRequest):
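    # Twisted on Python 3 looks the scheme up as bytes; b"http" is also
    # an ordinary str key on Python 2, so this form works on both.
    protocols = {b"http": ProxyClientFactory}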

class Proxy(proxy.Proxy):
    requestFactory = ProxyRequest

class ProxyFactory(http.HTTPFactory):
    protocol = Proxy

I arrived at this solution by reading the source of twisted.web.proxy; I don't know how idiomatic it is.
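The handleResponsePart() override above upper-cases every chunk, which is only safe for uncompressed text. As a rough sketch (not part of twisted.web.proxy), the decision could be factored into a small helper. The names is_modifiable_text and transform_chunk are hypothetical, and the helper assumes the headers seen in handleHeader() have been collected into a dict of lower-cased bytes names to bytes values:

```python
def is_modifiable_text(headers):
    """Return True if the body looks like uncompressed text.

    `headers` is a hypothetical dict mapping lower-cased header names
    to values (both bytes), e.g. filled in from handleHeader() calls.
    """
    content_type = headers.get(b"content-type", b"").lower()
    content_encoding = headers.get(b"content-encoding", b"identity").lower()
    return content_type.startswith(b"text/") and content_encoding == b"identity"


def transform_chunk(headers, chunk):
    """Upper-case a chunk only when that is safe; otherwise pass it through."""
    if is_modifiable_text(headers):
        return chunk.upper()
    return chunk
```

In the ProxyClient subclass, handleHeader() could record each header in such a dict and handleResponsePart() could hand transform_chunk(headers, buffer) to the parent method instead of buffer.upper(). Upper-casing keeps the body length unchanged; a transformation that changes the length would presumably also need the Content-Length header adjusted.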

To run it as a script or via twistd, add at the end:

portstr = "tcp:8080:interface=localhost" # serve on localhost:8080

if __name__ == '__main__': # $ python proxy_modify_request.py
    import sys
    from twisted.internet import endpoints, reactor

    def shutdown(reason, reactor, stopping=[]):
        """Stop the reactor."""
        if stopping: return
        stopping.append(True)
        if reason:
            log.msg(reason.value)
        reactor.callWhenRunning(reactor.stop)

    log.startLogging(sys.stdout)
    endpoint = endpoints.serverFromString(reactor, portstr)
    d = endpoint.listen(ProxyFactory())
    d.addErrback(shutdown, reactor)
    reactor.run()
else: # $ twistd -ny proxy_modify_request.py
    from twisted.application import service, strports

    application = service.Application("proxy_modify_request")
    strports.service(portstr, ProxyFactory()).setServiceParent(application)

Usage

$ twistd -ny proxy_modify_request.py

In another terminal:

$ curl -x localhost:8080 http://example.com
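
If curl is not at hand, roughly the same check can be made from Python 3 with the standard library. This is only a sketch: make_proxy_opener is a hypothetical helper name, and the proxy address is assumed to be the localhost:8080 endpoint configured above.

```python
import urllib.request


def make_proxy_opener(proxy_url="http://localhost:8080"):
    """Build an opener that sends plain-HTTP requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


# Example use (needs the proxy running and network access):
#   opener = make_proxy_opener()
#   with opener.open("http://example.com", timeout=10) as response:
#       print(response.headers)
#       print(response.read()[:200])
```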
Keeton answered 27/2, 2012 at 17:30 Comment(4)
I tried this one and I keep getting a Content Encoding Error. Any ideas? - Maestricht
@dulac: buffer.upper() probably should not be called on arbitrary content. Replace it with plain buffer. - Keeton
Sebastion: OK! This is the one I'm going with. I can view the content as it goes through. Also, I didn't use your main function. I did: from twisted.internet import reactor; reactor.listenTCP(8080, ProxyFactory()); reactor.run(). It works just the same. - Maestricht
@dulac: listenTCP() listens on all interfaces (not only localhost); run lsof -i tcp:8080 to check. Additionally, if there are errors, the endpoints variant shuts down the reactor properly. - Keeton

For a two-way proxy using Twisted, see this article:

http://sujitpal.blogspot.com/2010/03/http-debug-proxy-with-twisted.html

Cosmogony answered 27/2, 2012 at 15:18 Comment(1)
Thanks. I read through this. It WILL work; however, I am wondering if there would be a way to do this with the proxy.Proxy modules like the above two. Would it even matter? - Maestricht
