Rake task to download and unzip
F

3

9

I would like to update a cities table every week to reflect changes in cities across the world. I am creating a Rake task for the purpose. If possible, I would like to do this without adding another gem dependency.

The data is a publicly available zip file at http://download.geonames.org/export/dump/cities15000.zip.

My attempt:

require 'net/http'
require 'zip'

namespace :geocities do
  desc "Rake task to fetch Geocities city list every 3 days"
  task :fetch do

    uri = URI('http://download.geonames.org/export/dump/cities15000.zip')
    zipped_folder = Net::HTTP.get(uri) 

    Zip::File.open(zipped_folder) do |unzipped_folder| #erroring here
      unzipped_folder.each do |file|
        Rails.root.join("", "list_of_cities.txt").write(file)
      end
    end
  end
end

The return from rake geocities:fetch

rake aborted!
ArgumentError: string contains null byte

As detailed, I'm trying to unzip the file and save it to a list_of_cities.txt file. Once I have the methodology down for accomplishing this, I believe I can figure out how to update my db based on the file. (If you have opinions on how best to handle the actual db update, other than my planned way, I'd love to hear them, but that seems like a different post entirely.)

Floccus answered 19/9, 2015 at 17:25 Comment(0)
S
8

This will save zipped_folder to disk, then unzip it and save its contents:

require 'net/http'
require 'zip'

namespace :geocities do
  desc "Rake task to fetch Geocities city list every 3 days"
  task :fetch do
    uri = URI('http://download.geonames.org/export/dump/cities15000.zip')
    zipped_folder = Net::HTTP.get(uri)

    # Write the raw response body in binary mode ('wb') to avoid encoding conversion.
    File.open('cities.zip', 'wb') do |file|
      file.write(zipped_folder)
    end

    # Zip::File.open takes a path on disk; the block form closes the archive afterwards.
    Zip::File.open('cities.zip') do |zip_file|
      zip_file.each do |file|
        file.extract
      end
    end
  end
end
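
A note on the 'wb' mode: Net::HTTP.get returns the zip as a string of raw bytes, and writing it through a text-mode file handle can (as far as I can tell, depending on the default encodings in your app) trigger a conversion attempt like the ASCII-8BIT to UTF-8 error mentioned in the comments. Here is a minimal, stdlib-only sketch of a binary round trip. The payload is a made-up stand-in for the downloaded body, not real zip data:

```ruby
require 'tmpdir'

# Hypothetical stand-in for a downloaded zip body: arbitrary non-UTF-8 bytes,
# including a null byte and the "\x9C" byte from the error message.
payload = "PK\x03\x04\x00\x9C\xFF".b

path = File.join(Dir.mktmpdir, 'cities.zip')

# File.binwrite opens the file in binary mode, so no encoding conversion
# or newline translation is applied to the bytes.
bytes_written = File.binwrite(path, payload)

# File.binread returns the same bytes, tagged as ASCII-8BIT (binary).
round_trip = File.binread(path)
```

File.binwrite(path, data) is a one-line equivalent of the File.open(path, 'wb') block in the task.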

This will extract all files inside the zip file, in this case cities15000.txt.
You can then read the contents of cities15000.txt and update your database.

If you want to extract to a different file name, you can pass it to file.extract like this:

zip_file.each do |file|
  file.extract('list_of_cities.txt')
end
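
For the step of reading the extracted file before updating the database, here is a rough sketch. It assumes the GeoNames dump layout as I understand it from their readme: one city per line, tab-separated, with geonameid, name, latitude, longitude, country code and population at the positions used below. Please verify those positions against the readme. The parse_city_line helper and the sample row are hypothetical, and the actual database update is left out:

```ruby
# Hypothetical helper: turn one tab-separated line into a hash.
# Column positions are assumed from the GeoNames dump readme; verify before use.
def parse_city_line(line)
  cols = line.chomp.split("\t", -1) # -1 keeps trailing empty columns
  {
    geonameid:    Integer(cols[0]),
    name:         cols[1],
    latitude:     Float(cols[4]),
    longitude:    Float(cols[5]),
    country_code: cols[8],
    population:   Integer(cols[14])
  }
end

# Made-up sample row with 19 columns, for illustration only.
sample = ['1', 'Exampleville', 'Exampleville', '', '12.5', '-45.25', 'P', 'PPL',
          'XX', '', '', '', '', '', '20000', '', '10', 'Etc/UTC', '2015-01-01'].join("\t")

city = parse_city_line(sample)

# For the real file, File.foreach reads line by line without loading it all:
#   File.foreach('cities15000.txt', encoding: 'UTF-8') { |l| parse_city_line(l) }
```

Each resulting hash could then be passed to whatever create-or-update logic your cities model uses.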
Size answered 19/9, 2015 at 19:9 Comment(4)
rake aborted! Encoding::UndefinedConversionError: "\x9C" from ASCII-8BIT to UTF-8 - occurring at the File.open block. I believe this is because we're trying to write a compressed folder to a file? - Floccus
The compressed folder is a zipped file, so it is OK to write it to a file. I tried it myself and it worked fine. - Size
Perhaps I'm missing something. I tried just copying and pasting your code in with no luck (with the output error noted above). Do I need to alter any of it? I'll gladly accept this answer. - Floccus
All we need to do to clean this up is use a binary File.open mode! - Floccus
R
1

I think it can be done more easily without Ruby's own libraries, just by shelling out to wget and unzip:

namespace :geocities do
  desc "Rake task to fetch Geocities city list every 3 days"
  task :fetch do
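    # wget saves the archive to the current directory; unzip reads a file, not stdin
    `wget -c --tries=10 http://download.geonames.org/export/dump/cities15000.zip && unzip -o cities15000.zip`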
  end
end
Roughhew answered 19/9, 2015 at 19:39 Comment(1)
This is a viable option. But I'm trying to avoid using wget, as implementing a command line approach is less robust and debug-friendly, IMO. - Floccus
B
0

Here's a working solution that downloads a zip from a remote URL to the local tmp directory and unzips it there.

url = 'https://example.com/path/to/your_zip_file.zip'
destination = Rails.root.join('tmp', 'your_zip_file.zip')

# Download the file, following redirects (-L); the argument list avoids shell quoting issues
system('curl', '-L', '-o', destination.to_s, url)

# Unzip the downloaded file into tmp using a system command
system('unzip', destination.to_s, '-d', Rails.root.join('tmp').to_s)
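
One caveat with shelling out: system does not raise when the command fails, so a failed download or unzip can pass silently. Here is a small sketch of a wrapper that raises instead. The run! helper is a hypothetical name, and the demo uses the current Ruby interpreter as a stand-in for curl/unzip so it runs anywhere:

```ruby
require 'rbconfig'

# Hypothetical helper: run a command without a shell and raise if it fails.
def run!(*cmd)
  # Passing the command as separate arguments bypasses the shell,
  # so no quoting of paths or URLs is needed.
  ok = system(*cmd)
  # system returns true on exit status 0, false on non-zero, and nil if the
  # command could not be executed at all.
  raise "command failed: #{cmd.join(' ')}" unless ok
  true
end

# Demonstration using the current Ruby interpreter as a stand-in command.
ruby = RbConfig.ruby
succeeded = run!(ruby, '-e', 'exit 0')

failed = begin
  run!(ruby, '-e', 'exit 3')
  false
rescue RuntimeError
  true
end
```

In the task itself this would look like run!('curl', '-L', '-o', destination.to_s, url), which makes rake abort with a readable message if the download fails.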
Bartizan answered 16/5, 2023 at 10:7 Comment(0)
