I wrote a little web crawler and found that the response is a zip file.
In my limited experience with Go programming, I only know how to unzip an existing file.
Can I unzip the Response.Body in memory, without saving it to disk first?
Updating the answer for handling a zip file response body in memory.
Note: make sure you have sufficient memory for handling the zip file.
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io/ioutil"
	"log"
	"net/http"
)

func main() {
	resp, err := http.Get("zip file url")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}

	zipReader, err := zip.NewReader(bytes.NewReader(body), int64(len(body)))
	if err != nil {
		log.Fatal(err)
	}

	// Read all the files from the zip archive
	for _, zipFile := range zipReader.File {
		fmt.Println("Reading file:", zipFile.Name)
		unzippedFileBytes, err := readZipFile(zipFile)
		if err != nil {
			log.Println(err)
			continue
		}
		_ = unzippedFileBytes // this is the unzipped file's bytes
	}
}

func readZipFile(zf *zip.File) ([]byte, error) {
	f, err := zf.Open()
	if err != nil {
		return nil, err
	}
	defer f.Close()
	return ioutil.ReadAll(f)
}
By default, the Go HTTP client handles gzip-compressed responses automatically, so a typical read and close of the response body is enough.
However, there is a catch.
// Reference https://github.com/golang/go/blob/master/src/net/http/transport.go
//
// DisableCompression, if true, prevents the Transport from
// requesting compression with an "Accept-Encoding: gzip"
// request header when the Request contains no existing
// Accept-Encoding value. If the Transport requests gzip on
// its own and gets a gzipped response, it's transparently
// decoded in the Response.Body. However, if the user
// explicitly requested gzip it is not automatically
// uncompressed.
DisableCompression bool
What this means is: if you add an Accept-Encoding: gzip
header to the request manually, then you have to handle the gzip response body yourself.
For example:

	reader, err := gzip.NewReader(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	defer reader.Close()

	body, err := ioutil.ReadAll(reader)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(body))
.zip file, my bad. I will update the code snippet in a while. – Fogged

ioutil.ReadAll(f) is deprecated in favor of io.ReadAll(f). – Chambliss

Avoid io.ReadAll in the first place! This will load the whole body into memory, causing out-of-memory issues and most likely also memory leaks. Instead, try to go with io.Copy. – Theona

I believe the other proposed solutions are not great, as they will not give you the full picture of how to unzip all the content of the zip file.
Furthermore, the example above uses ReadAll,
which should be avoided (since it reads the whole content into memory!).
Instead of io.ReadAll,
this example uses io.Copy
to avoid out-of-memory issues as well as memory leaks.
Instead of reading everything into memory, I first write the response body to a temporary file via io.Copy.
After that I use zip.OpenReader
to read the file back in. I know for sure that my example is currently not causing any (major) memory leaks. Correct me if I'm wrong.
See the example below, which should work with large zip files (e.g. 100 MB+); executing this function within a goroutine shouldn't be a problem either.
package main

import (
	"archive/zip"
	"fmt"
	"io"
	"log"
	"net/http"
	"os"
	"path/filepath"
	"time"
)

// Global reusable HTTP client with time-out set to 30s
var httpClient = &http.Client{
	Timeout: 30 * time.Second,
}

// Destination directory for the extracted files
const destinationPath = "./output"

// Example download function
func download() {
	res, err := httpClient.Get("https://somedomain.com/yourfile.zip")
	if err != nil {
		log.Printf("Error making http request: %v\n", err)
		return
	}
	defer res.Body.Close() // Always close the body resource

	// Create a temporary file to store the response body
	tmpFile, err := os.CreateTemp("", "temp.zip")
	if err != nil {
		log.Printf("Error creating temporary file: %v\n", err)
		return
	}
	defer os.Remove(tmpFile.Name()) // Clean up the temporary file afterwards

	// Copy the response body to the temporary file
	_, err = io.Copy(tmpFile, res.Body)
	if err != nil {
		log.Printf("Error copying response body to temporary file: %v\n", err)
		return
	}
	tmpFile.Close()

	// Unzip the downloaded archive
	err = unzip(tmpFile.Name(), destinationPath)
	if err != nil {
		log.Printf("Failed to unzip file: %v", err)
		return
	}
}
func unzip(filename string, dest string) error {
	reader, err := zip.OpenReader(filename)
	if err != nil {
		return fmt.Errorf("failed to create zip reader: %w", err)
	}
	defer reader.Close()

	for _, file := range reader.File {
		// Guard against "Zip Slip": reject entry names that would escape dest
		if !filepath.IsLocal(file.Name) {
			return fmt.Errorf("illegal file path in archive: %s", file.Name)
		}
		filePath := filepath.Join(dest, file.Name)

		// Create directories as needed
		if file.FileInfo().IsDir() {
			if err := os.MkdirAll(filePath, os.ModePerm); err != nil {
				return fmt.Errorf("failed to create directory: %w", err)
			}
			continue
		}

		// Create the file's parent directory
		if err := os.MkdirAll(filepath.Dir(filePath), os.ModePerm); err != nil {
			return fmt.Errorf("failed to create directory for file: %w", err)
		}
		dstFile, err := os.OpenFile(filePath, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, file.Mode())
		if err != nil {
			return fmt.Errorf("failed to open file: %w", err)
		}

		// Extract the file
		srcFile, err := file.Open()
		if err != nil {
			dstFile.Close()
			return fmt.Errorf("failed to open zip file: %w", err)
		}
		_, err = io.Copy(dstFile, srcFile)

		// Close the open files
		dstFile.Close()
		srcFile.Close()
		if err != nil {
			return fmt.Errorf("failed to copy file contents: %w", err)
		}
	}
	return nil
}
Feel free to extend my example with another example that can do everything in memory (without the need for a temp file, but also without memory leaks, of course).
I hope this helps somebody!