Making HEAD request in Ruby

C

4

I am fairly new to Ruby and come from a Python background. I want to make a HEAD request to a URL and check some information, such as whether the file exists on the server, its timestamp, ETag, etc. I have not been able to get this working in Ruby.

In Python:

import httplib2
print httplib2.Http().request('url.com/file.xml','HEAD')

In Ruby, I tried this and it throws an error:

require 'net/http'

Net::HTTP.start('url.com'){|http|
   response = http.head('/file.xml')
}
puts response


SocketError: getaddrinfo: nodename nor servname provided, or not known
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:877:in `initialize'
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:877:in `open'
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:877:in `block in connect'
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/timeout.rb:51:in `timeout'
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:876:in `connect'
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:861:in `do_start'
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:850:in `start'
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:582:in `start'
    from (irb):2
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/bin/irb:16:in `<main>'

Carlettacarley answered 1/5, 2013 at 20:27 Comment(0)

B

6

I don't think that passing in a string to :start is enough; in the docs it looks like it requires a URI object's host and port for a correct address:

uri = URI('http://example.com/some_path?query=string')

Net::HTTP.start(uri.host, uri.port) do |http|
  request = Net::HTTP::Get.new uri

  response = http.request request # Net::HTTPResponse object
end
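
As a quick offline check (a sketch, not part of the docs example), you can inspect what URI() actually parses before handing anything to Net::HTTP.start. The helper name uri_parts is made up for illustration. A string without a scheme still parses, but its host and port come back nil:

```ruby
require 'uri'

# Hypothetical helper: returns the pieces of a URL that Net::HTTP needs.
def uri_parts(url)
  uri = URI(url)
  { scheme: uri.scheme, host: uri.host, port: uri.port, path: uri.path }
end

# No scheme: the whole string is treated as a path, so host and port are nil.
p uri_parts('yoururl.com/file.xml')

# With a scheme: host, default port and path are all filled in.
p uri_parts('http://yoururl.com/file.xml')
```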

You can try this:

require 'net/http'

url = URI('http://yoururl.com')  # include the scheme, otherwise url.host is nil

Net::HTTP.start(url.host, url.port){|http|
   response = http.head('/file.xml')
   puts response
}

One thing I noticed - your puts response needs to be inside the block! Otherwise, the variable response is not in scope.
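
The scoping rule is easy to see without any network call. Here is a minimal sketch; with_session is a hypothetical stand-in that, like Net::HTTP.start, returns whatever its block returns:

```ruby
# A variable first assigned inside a block disappears when the block ends.
def first_assigned_in_block
  [42].each { |n| response = n }
  defined?(response)            # nil: no such local variable out here
end

# Declaring the variable before the block keeps the assignment visible.
def declared_before_block
  response = nil
  [42].each { |n| response = n }
  response
end

# Hypothetical stand-in for Net::HTTP.start: yields, then returns the block's value.
def with_session
  yield 'session'
end

# Alternative: skip the outer variable and use the block's return value.
def returned_from_block
  with_session { |session| "response via #{session}" }
end

p first_assigned_in_block
p declared_before_block
p returned_from_block
```

With Net::HTTP the last variant reads as response = Net::HTTP.start(host, port) { |http| http.head(path) }.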

Edit: You can also treat the response as a hash to get the values of the headers:

response.each_value { |value| puts value }
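
To try the header accessors without touching a real server, here is a rough sketch that runs a throwaway stub on localhost. The stub, the helper name head_from_stub, and the header values are assumptions for the demo only; head, [], each_header and body are the regular Net::HTTP calls:

```ruby
require 'net/http'
require 'socket'

# Performs one HEAD request against a throwaway stub on localhost.
# The stub and its header values are assumptions for this demo only.
def head_from_stub
  server = TCPServer.new('127.0.0.1', 0)   # port 0: let the OS pick a free port
  port = server.addr[1]

  stub = Thread.new do
    client = server.accept
    # Consume the request line and headers up to the blank line.
    loop do
      line = client.gets
      break if line.nil? || line == "\r\n"
    end
    # Reply with headers only, as a server does for HEAD.
    client.write("HTTP/1.1 200 OK\r\n" \
                 "Content-Type: application/xml\r\n" \
                 "Content-Length: 2983\r\n" \
                 "Connection: close\r\n\r\n")
    client.close
  end

  Net::HTTP.start('127.0.0.1', port) { |http| http.head('/file.xml') }
ensure
  stub&.join
  server&.close
end

response = head_from_stub
puts response.code                 # status code as a string
puts response['content-type']      # header lookup is case-insensitive
response.each_header { |name, value| puts "#{name}: #{value}" }
p response.body                    # a HEAD response carries no body
```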

Bracy answered 1/5, 2013 at 20:41 Comment(7)

Thank you. priti, the URL I am trying is internal, so you can't access it. In general, it is a URL for downloading an XML file. I don't want to download the file before I know more about it (whether it is stale, a duplicate, etc.), so I want a HEAD request, which fetches the properties without downloading the file – Carlettacarley 1/5, 2013 at 21:5

I tried your second method and it worked, but I am only getting this value back: "#<Net::HTTPOK:0x13f4cf6f>". I am expecting a whole lot of header information and properties about the file – Carlettacarley 1/5, 2013 at 21:6

I am expecting information like this ({'status': '200', 'content-length': '2983', 'accept-ranges': 'bytes', 'server': 'Apache/2.2.17 (Unix)', 'last-modified': 'Wed, 01 May 2013 20:53:26 GMT', 'etag': '"5f56a-ba7-4dbae4f35555"', 'date': 'Wed, 01 May 2013 21:11:30 GMT', 'content-type': 'application/xml'}, '') – Carlettacarley 1/5, 2013 at 21:11

Yes. If you look at the documentation you'll see that the :head method returns an HTTPResponse object that embodies the response status code (here, it's 200 OK). You can print headers in a hash format, like puts response['content-type'] – Bracy 1/5, 2013 at 21:13

But when I call response.body it has the actual content of the file. I don't want to download the content, because I have hundreds of files on the server and they are really huge, around 800 MB each, so it would use up my system memory and slow down the calls. I just need to make a HEAD request and get only the properties of the file – Carlettacarley 1/5, 2013 at 21:24

hlh, thank you. Your second method worked well. It did not download any content, and response['<tag name>'] worked as well – Carlettacarley 1/5, 2013 at 21:30

Krish, you can also treat the response like a hash to get the header values. See my edit. Happy it worked for you. – Bracy 1/5, 2013 at 21:37

E

8

I realize this has been answered, but I had to jump through some hoops, too. Here's something more concrete to start with:

#!/usr/bin/env ruby

require 'net/http'
require 'net/https' # for openssl

uri = URI('http://stackoverflow.com')
path = '/questions/16325918/making-head-request-in-ruby'

response=nil
http = Net::HTTP.new(uri.host, uri.port)
# http.use_ssl = true                            # if using SSL
# http.verify_mode = OpenSSL::SSL::VERIFY_NONE   # for example, when using self-signed certs

response = http.head(path)
response.each { |key, value| puts key.ljust(40) + " : " + value }
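
If the same script has to handle both http:// and https:// URLs, one option is to derive use_ssl from the URI scheme instead of toggling a comment. This is only a sketch: build_client and print_head are hypothetical helper names, and the timeout values are arbitrary.

```ruby
require 'net/http'

# Hypothetical helper: builds (but does not open) a client for the given URL.
def build_client(url)
  uri = URI(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == 'https')   # TLS only for https:// URLs
  http.open_timeout = 5                    # seconds; arbitrary demo values
  http.read_timeout = 10
  http
end

# Hypothetical helper: sends the HEAD request and prints every header.
def print_head(url)
  uri = URI(url)
  response = build_client(url).head(uri.request_uri)
  response.each_header { |key, value| puts "#{key.ljust(40)} : #{value}" }
  response
end

# Usage (needs network access):
# print_head('https://stackoverflow.com/questions/16325918/making-head-request-in-ruby')
```

Keep in mind that VERIFY_NONE turns off certificate checking entirely, so it is best kept to test setups with self-signed certs.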

Endostosis answered 5/3, 2015 at 1:26 Comment(0)

D

3

require 'net/http'

headers = nil

uri = URI('http://my-bucket.amazonaws.com/filename.mp4')

Net::HTTP.start(uri.host, uri.port) do |http|
  headers = http.head(uri.path).to_hash
end

Now the headers variable holds a hash of the response headers.
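
Note that to_hash maps lowercase header names to arrays of strings, so you usually want the first element of each value. Below is a small sketch for pulling out typical file properties; SAMPLE_HEADERS is hand-written sample data in that shape, and file_properties is a made-up helper name:

```ruby
require 'time'

# Hand-written sample in the shape to_hash returns (lowercase names, array values).
SAMPLE_HEADERS = {
  'content-length' => ['2983'],
  'last-modified'  => ['Wed, 01 May 2013 20:53:26 GMT'],
  'etag'           => ['"5f56a-ba7-4dbae4f35555"']
}.freeze

# Hypothetical helper: picks size, timestamp and ETag out of a headers hash.
def file_properties(headers)
  length   = headers['content-length']&.first
  modified = headers['last-modified']&.first
  {
    size: length && length.to_i,                     # bytes, as an Integer
    modified: modified && Time.httpdate(modified),   # parsed into a Time
    etag: headers['etag']&.first
  }
end

p file_properties(SAMPLE_HEADERS)
```

Missing headers simply come back as nil, which makes it easy to compare the timestamp or ETag against a cached copy before deciding to download.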

Declaim answered 27/2, 2016 at 18:34 Comment(0)

S

0

I forgot use_ssl: true and the request was hanging:

require 'net/http'

uri = URI("https://www.google.com/ncr")

response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
  http.head(uri.path)
end

# => #<Net::HTTPOK 200 OK readbody=true>
# or
# => #<Net::HTTPNotFound 404 Not Found readbody=true>
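
Since the response class already encodes the status, a case statement is a handy way to answer "does the file exist?". This is a sketch under assumptions: head_status and remote_file_status are hypothetical names, and lumping everything that is not 2xx, 3xx or 404 into :error is just one possible policy.

```ruby
require 'net/http'

# Hypothetical helper: maps a response object to a simple status symbol.
def head_status(response)
  case response
  when Net::HTTPSuccess     then :exists       # 2xx
  when Net::HTTPRedirection then :redirected   # 3xx, see response['location']
  when Net::HTTPNotFound    then :missing      # 404
  else :error
  end
end

# Hypothetical helper: HEAD request with use_ssl derived from the scheme.
def remote_file_status(url)
  uri = URI(url)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.head(uri.request_uri)
  end
  head_status(response)
end

# Usage (needs network access):
# p remote_file_status('https://www.google.com/ncr')
```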

Schweinfurt answered 28/3 at 1:21 Comment(0)
