em-http stream with basic auth and gzip hangs
I'm attempting to consume the Gnip PowerTrack API which requires me to connect to an HTTPS stream of JSON with basic auth. I feel like this should be fairly trivial so I'm hoping some rubyist who is smarter than me can point out my obvious mistake.

Here are the relevant parts of my Ruby 1.9.3 code:

require 'eventmachine'
require 'em-http'
require 'json'

usage = "#{$0} <user> <password>"
abort usage unless user = ARGV.shift
abort usage unless password = ARGV.shift
GNIP_STREAMING_URL = 'https://stream.gnip.com:443/foo/bar/prod.json'

http = EM::HttpRequest.new(GNIP_STREAMING_URL)
EventMachine.run do
  s = http.get(
    :head => {
      'Authorization'   => [user, password],
      'accept'          => 'application/json',
      'Accept-Encoding' => 'gzip,deflate'
    },
    :keepalive          => true,
    :connect_timeout    => 0,
    :inactivity_timeout => 0
  )

  buffer = ""
  s.stream do |chunk|
    buffer << chunk
    while line = buffer.slice!(/.+\r?\n/)
      puts JSON.parse(line)
    end
  end
end

The stream connects (my Gnip dashboard reports a connection) but then just buffers and never outputs anything. In fact, it seems like it never enters the s.stream do ... end block. Note that this is a gzip-encoded stream.

Note that this works:

curl --compressed -uusername $GNIP_STREAMING_URL

EDIT: I'm sure this is kinda implicit, but I can't give out any login creds or the actual URL, so don't ask ;)

EDIT #2: yajl-ruby would probably work if I could figure out how to encode credentials for the URL (simple URL encoding doesn't seem to work as I fail authentication with Gnip).

EDIT #3: @rweald found that em-http does not support streaming gzip, so I've created a GitHub issue here.

EDIT #4: I have forked em-http-request and fixed this; you can point at my fork if you want to use em-http this way. The patch has been merged into the maintainer's repo and will ship in the next release.

EDIT #5: My fixes have been published in em-http-request 1.0.3, so this should no longer be an issue.

Ewell answered 19/2, 2012 at 4:55 Comment(0)
G
2

The problem lies within em-http-request. Take a look at its decoders: https://github.com/igrigorik/em-http-request/blob/master/lib/em-http/decoders.rb

You will notice that the GZIP decompressor cannot do streaming decompression :( https://github.com/igrigorik/em-http-request/blob/master/lib/em-http/decoders.rb#L100

To read a gzip-encoded stream with em-http-request, you would first need to fix that underlying streaming decompression problem.
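
For illustration only, here is a minimal sketch of what incremental gzip decoding can look like with Ruby's standard zlib bindings. This is not the em-http-request decoder and not necessarily how a patch to it would look; `StreamingGzipDecoder` is a hypothetical name I made up for the example.

```ruby
require 'zlib'

# Hypothetical helper (not part of em-http-request): decodes a gzip
# stream chunk by chunk instead of waiting for the whole body.
class StreamingGzipDecoder
  def initialize
    # MAX_WBITS + 16 tells zlib to expect a gzip header and trailer
    # rather than a raw zlib stream.
    @inflater = Zlib::Inflate.new(Zlib::MAX_WBITS + 16)
  end

  # Feed one compressed chunk; returns whatever decompressed bytes
  # zlib can produce from the input seen so far (possibly "").
  def decode(chunk)
    @inflater.inflate(chunk)
  end

  # Call once the stream has ended to flush any remaining output
  # and release the zlib state.
  def finish
    @inflater.finish
  ensure
    @inflater.close
  end
end
```

The idea is that each network chunk goes through `decode` as it arrives, and the result is appended to the line buffer, so complete JSON lines can be parsed without waiting for the connection to close.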

Geyser answered 22/2, 2012 at 5:13 Comment(5)
Nice find! Maybe we fix the em-http gem. If not, is there a way to use yajl-ruby or curb to keep the connection alive and then try reconnecting in an exponential backoff pattern? - Ewell
Yeah so I actually found a workaround yesterday that will allow streaming gzip json from GNIP. I am working on cleaning up the code now and you will be able to see it in my github project github.com/rweald/gnip-stream - Geyser
I think I should be able to generalize the fix as well so that it could be added as a patch to em-http-request. I will have a look over the weekend. - Geyser
Only problem with taking yajl-ruby's impl is that they don't want native extensions, but it looks like you're kicking ass on gnip-stream so thanks :) - Let me know if I can help any - Ewell
Bounty awarded really just due to all the great work you're doing on gnip-stream :) Will use curb in the interim. Thanks! - Ewell
K
1

I have been using some code based on this Gist to connect to the Gnip console: https://gist.github.com/1468622

Kerrykersey answered 5/3, 2012 at 21:45 Comment(1)
Thanks! Would have saved so much time had I found that earlier, but I may be able to patch the gnip-stream gem now :) - Ewell
G
0

It looks like using https://github.com/brianmario/yajl-ruby would solve this nicely.

Grandioso answered 21/2, 2012 at 22:21 Comment(2)
It did look promising, but I can't figure out how to encode the username and password such that I don't get this error: "lib/ruby/1.9.1/uri/generic.rb:411:in `check_user': bad component(expected userinfo component or user component)" - Ewell
This actually won't help either. If you look at the yajl-ruby code for http_request you will notice that it only supports gzip if the response is not "Chunked", which the GNIP response is. github.com/brianmario/yajl-ruby/blob/master/lib/yajl/… - Geyser
E
0

Gnip suggested I use curb and here's what I came up with from their example:

require 'rubygems'
require 'curb'

# Usage: <script> username password url
# prints data to stdout.
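usage = "#{$0} <user> <password> <url>"
username, password, url = ARGV.first 3
# Bail out with the usage string if any argument is missing.
abort usage unless username && password && url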

Curl::Easy.http_get url do |c|
  c.http_auth_types = :basic
  c.username = username
  c.password = password
  c.encoding = 'gzip'
  c.on_body do |data|
    puts data
    data.size # on_body must return the number of bytes handled, or curl aborts the transfer.
  end
end

Though I would like something that will reconnect when the connection is dropped and handle different types of failures gracefully.
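
As a starting point for the reconnect part, here is a rough sketch of a retry wrapper with exponential backoff. It is my own illustration, not something from Gnip or curb: `with_backoff`, its parameters and their defaults are all hypothetical, and it only retries when the block raises a StandardError (a stream that ends without raising is not retried).

```ruby
# Hypothetical helper: runs the block and, if it raises, waits
# base_delay, 2*base_delay, 4*base_delay, ... (capped at max_delay)
# before trying again. Gives up after max_attempts failures.
# The sleeper argument exists only so the waiting can be stubbed out.
def with_backoff(max_attempts = 5, base_delay = 1.0, max_delay = 60.0,
                 sleeper = Kernel.method(:sleep))
  attempt = 0
  begin
    yield attempt
  rescue StandardError => e
    attempt += 1
    # Re-raise the last error once the retry budget is used up.
    raise if attempt >= max_attempts
    delay = [base_delay * (2**(attempt - 1)), max_delay].min
    warn "stream failed (#{e.class}: #{e.message}); retrying in #{delay}s"
    sleeper.call(delay)
    retry
  end
end
```

The curb call above could then be wrapped as `with_backoff { Curl::Easy.http_get(url) { |c| ... } }`. A real client would probably also want to reset the attempt counter after a connection has stayed up for a while, and treat authentication failures differently from dropped connections.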

Ewell answered 26/2, 2012 at 17:55 Comment(0)
