Making HEAD request in Ruby
C

4

4

I am fairly new to Ruby, coming from a Python background. I want to make a HEAD request to a URL and check some information, such as whether the file exists on the server, its timestamp, ETag, etc. I have not been able to get this working in Ruby.

In Python:

import httplib2
print httplib2.Http().request('url.com/file.xml','HEAD')

In Ruby, I tried this, and it throws an error:

require 'net/http'

Net::HTTP.start('url.com'){|http|
   response = http.head('/file.xml')
}
puts response


SocketError: getaddrinfo: nodename nor servname provided, or not known
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:877:in `initialize'
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:877:in `open'
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:877:in `block in connect'
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/timeout.rb:51:in `timeout'
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:876:in `connect'
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:861:in `do_start'
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:850:in `start'
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/lib/ruby/2.0.0/net/http.rb:582:in `start'
    from (irb):2
    from /Users/comcast/.rvm/rubies/ruby-2.0.0-p0/bin/irb:16:in `<main>'
Carlettacarley answered 1/5, 2013 at 20:27 Comment(0)
B
6

I don't think that passing a plain string to Net::HTTP.start is the most reliable approach; it expects a bare host name (no scheme or path) plus an optional port, and the docs take both from a URI object:

uri = URI('http://example.com/some_path?query=string')

Net::HTTP.start(uri.host, uri.port) do |http|
  request = Net::HTTP::Get.new uri

  response = http.request request # Net::HTTPResponse object
end
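
The reason for going through URI can be seen without any network access. A minimal sketch, using the placeholder host url.com from the question: a string without a scheme is parsed as a relative path, so host and port come back nil.

```ruby
require 'uri'

bare = URI('url.com/file.xml')         # no scheme: treated as a relative path
full = URI('http://url.com/file.xml')  # scheme present: host, port and path are split out

p bare.host  # => nil
p bare.port  # => nil
p full.host  # => "url.com"
p full.port  # => 80
p full.path  # => "/file.xml"
```

So the scheme has to be part of the string handed to URI(...), while Net::HTTP.start itself wants only the host name and port.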

You can try this:

require 'net/http'

url = URI('http://yoururl.com')  # the scheme is required, otherwise url.host is nil

Net::HTTP.start(url.host, url.port){|http|
   response = http.head('/file.xml')
   puts response
}

One thing I noticed - your puts response needs to be inside the block! Otherwise, the variable response is not in scope.

Edit: You can also treat the response as a hash to get the values of the headers:

response.each_value { |value| puts value }
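
A minimal sketch of reading individual headers from a HEAD response. The throwaway local server, its port, and the canned header values are assumptions made so the example runs without a real host; against a real server only the Net::HTTP.start call and the lines after it would be needed.

```ruby
require 'net/http'
require 'socket'

# Throwaway local server (an assumption for this sketch) that answers one
# request with canned headers, so nothing here touches a real host.
server = TCPServer.new('127.0.0.1', 0) # port 0 lets the OS pick a free port
port = server.addr[1]

stub = Thread.new do
  client = server.accept
  # Consume the request line and headers up to the blank line.
  loop do
    line = client.gets
    break if line.nil? || line == "\r\n"
  end
  client.write("HTTP/1.1 200 OK\r\n" \
               "Content-Type: application/xml\r\n" \
               "Content-Length: 2983\r\n" \
               "ETag: \"5f56a-ba7-4dbae4f35555\"\r\n" \
               "Last-Modified: Wed, 01 May 2013 20:53:26 GMT\r\n" \
               "Connection: close\r\n\r\n")
  client.close
end

# The block's return value becomes the return value of Net::HTTP.start,
# so the response is available outside the block.
response = Net::HTTP.start('127.0.0.1', port) { |http| http.head('/file.xml') }
stub.join
server.close

puts response.code             # status code as a string
puts response['etag']          # header names are case-insensitive
puts response['last-modified']
response.each_header { |name, value| puts "#{name}: #{value}" }
p response.body                # nil: a HEAD response carries no body
```

Because the request is a HEAD, only the status line and headers cross the wire, which is what makes this suitable for checking very large files.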
Bracy answered 1/5, 2013 at 20:41 Comment(7)
Thank you, priti. The URL I am trying is internal, so you can't access it. In general, it is a URL for downloading an XML file. I don't want to download the file before I know more about it (for example, whether it is stale or a duplicate), so a HEAD request fetches its properties without downloading it. - Carlettacarley
I tried your second method and it worked, but I only get this value back: "#<Net::HTTPOK:0x13f4cf6f>". I am expecting a whole lot of header information and properties about the file. - Carlettacarley
I am expecting information like this: ({'status': '200', 'content-length': '2983', 'accept-ranges': 'bytes', 'server': 'Apache/2.2.17 (Unix)', 'last-modified': 'Wed, 01 May 2013 20:53:26 GMT', 'etag': '"5f56a-ba7-4dbae4f35555"', 'date': 'Wed, 01 May 2013 21:11:30 GMT', 'content-type': 'application/xml'}, '') - Carlettacarley
Yes. If you look at the documentation, you'll see that the head method returns an HTTPResponse object that represents the response, including its status code (here, 200 OK). You can read headers with hash-style access, like puts response['content-type']. - Bracy
But when I call response.body it has the actual content of the file. I don't want to download the content, because I have hundreds of files on the server and they are really huge, around 800 MB each, so it would use up my system memory and slow down the calls. I just need to make a HEAD request and get only the properties of the file. - Carlettacarley
hlh, thank you. Your second method worked well. It did not download any content, and response['<tag name>'] worked too. - Carlettacarley
Krish, you can also treat the response like a hash to get the header values. See my edit. Happy it worked for you. - Bracy
E
8

I realize this has been answered, but I had to jump through some hoops, too. Here's something more concrete to start with:

#!/usr/bin/env ruby

require 'net/http'
require 'net/https' # for openssl

uri = URI('http://stackoverflow.com')
path = '/questions/16325918/making-head-request-in-ruby'

response=nil
http = Net::HTTP.new(uri.host, uri.port)
# http.use_ssl = true                            # if using SSL
# http.verify_mode = OpenSSL::SSL::VERIFY_NONE   # for example, when using self-signed certs

response = http.head(path)
response.each { |key, value| puts key.ljust(40) + " : " + value }
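
Two details in the script above can be derived from the URI instead of being set by hand. A sketch, with build_http as a hypothetical helper name: use_ssl follows the scheme, and request_uri supplies a usable request path (including any query string) even when the URI has no path at all.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: configure a connection from the URI without opening it.
def build_http(uri)
  http = Net::HTTP.new(uri.host, uri.port) # no network I/O happens here
  http.use_ssl = (uri.scheme == 'https')   # enable TLS only for https:// URLs
  http
end

plain  = URI('http://stackoverflow.com')
secure = URI('https://stackoverflow.com/questions/16325918/making-head-request-in-ruby')

p build_http(plain).use_ssl?   # => false
p build_http(secure).use_ssl?  # => true
p secure.port                  # => 443
p plain.path                   # => ""
p plain.request_uri            # => "/"
```

With that in place, build_http(uri).head(uri.request_uri) would send the HEAD request for either kind of URL.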
Endostosis answered 5/3, 2015 at 1:26 Comment(0)
D
3
require 'net/http'

headers = nil

uri = URI('http://my-bucket.amazonaws.com/filename.mp4')

Net::HTTP.start(uri.host, uri.port) do |http|
  headers = http.head(uri.path).to_hash
end

Now headers holds a hash of the response headers.
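
One detail worth knowing: to_hash maps each lower-case header name to an array of strings, since a header can appear more than once. A small sketch of flattening that shape; the sample hash is hand-written to stand in for a real response, and flatten_headers is a hypothetical helper name.

```ruby
# Hand-written stand-in for what http.head(...).to_hash returns:
# lower-case header names mapped to arrays of values.
headers = {
  'content-type'   => ['video/mp4'],
  'content-length' => ['2983'],
  'etag'           => ['"5f56a-ba7-4dbae4f35555"']
}

# Hypothetical helper: join repeated values so each header maps to one string.
def flatten_headers(headers)
  headers.transform_values { |values| values.join(', ') }
end

flat = flatten_headers(headers)
puts flat['etag']                 # the ETag as a plain string
puts flat['content-length'].to_i  # sizes arrive as strings, convert as needed
```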

Declaim answered 27/2, 2016 at 18:34 Comment(0)
S
0

I forgot use_ssl: true, and the request was hanging:

require 'net/http'

uri = URI("https://www.google.com/ncr")

response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http| 
  http.head(uri.path)
end

# => #<Net::HTTPOK 200 OK readbody=true>
# or
# => #<Net::HTTPNotFound 404 Not Found readbody=true>
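
To turn those two outcomes into an existence check, the response class can be tested directly. A sketch, where file_exists? is a hypothetical helper name; it relies on Net::HTTPOK being a subclass of Net::HTTPSuccess and Net::HTTPNotFound being a subclass of Net::HTTPClientError.

```ruby
require 'net/http'

# Hypothetical helper: true for any 2xx response to the HEAD request.
def file_exists?(response)
  response.is_a?(Net::HTTPSuccess)
end

# The class hierarchy this relies on:
p Net::HTTPOK < Net::HTTPSuccess            # => true
p Net::HTTPNotFound < Net::HTTPClientError  # => true
p Net::HTTPNotFound < Net::HTTPSuccess      # => nil (unrelated classes)
```

Given the response returned by the block above, file_exists?(response) would be true in the 200 case and false in the 404 case. Redirects (3xx) also count as "not found" in this sketch, which may or may not be what you want.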
Schweinfurt answered 28/3 at 1:21 Comment(0)
