2010-09-17

HTTP requests specifying the Host header (in Python)

Since it took me some time to find this solution, I think it might be worth to share it.

When you have a web server listening on a single IP address but serving several domains, it's quite common to run:
$ curl -H "Host: domain" http://ip_address/path/to/the/page.html
if you need to view that domain directly pointing to the web server (so avoiding any balancers or network "magics" you might have in place).

Well, the question is: how to do that in Python? I find my answer in the httplib module:
import httplib
conn = httplib.HTTPConnection("ip_address")
conn.putrequest("GET", "/", skip_host=True)
conn.putheader("Host", "domain.ext")
conn.endheaders()
res = conn.getresponse()
print res.read()
HTH

3 comments:

Anonymous said...

You can also do this using urllib2.Request:

req=urllib2.Request('http://ip_address/path/to/the/page.html')
req.add_header('host','domain.ext')
print urllib2.urlopen(req).read()

Kai Hendry said...

I prefer the curl line. :)

Sandro Tosi said...

@foone: what python version are you using? I probably should have specified that I'm on 2.5, and that I tried with urllib2 but didn't work (and it made sense, since in the last line of the doc it says that "a few standard headers (Content-Length, Content-Type and Host) are added when the Request is passed to urlopen() (or OpenerDirector.open())." so it seems something autatically done, maybe in a later release this is changed.)

@Kai: sure, if you just have to retrieve the page to see "if it works" but when you'll have to something more with the retrieved data (compare it to a "master copy", md5sum it, and all the stuff) have it in python helps a lot :)