I have a list of URLs in a text file. I want the images to be downloaded to a particular folder. How can I do it? Is there any add-on available for Chrome, or any other program, to download images from URLs?
Create a folder on your machine.
Place your text file of image URLs in that folder.
cd to that folder, then run:
wget -i images.txt
You will find all the downloaded files in the folder.
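If the images should land in a particular folder rather than the current directory, GNU wget also takes a destination directory via -P (--directory-prefix), for example:
wget -i images.txt -P /path/to/folder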
I had to brew install wget first, but after that, this was a breeze! Thanks so much! – Stellate
On Windows 10/11 this is fairly trivial using:
for /F "eol=;" %f in (filelist.txt) do curl -O %f
Note that the inclusion of eol=; lets us mask individual exclusions: add ; at the start of any line in filelist.txt that you do not want downloaded this time. If you use the above in a batch file such as GetFileList.cmd, double those %'s (i.e. %%f).
So on my system I simply type Do GetFileList and all those stored URLs are downloaded. Do is an old DOS trick for keeping many small commands in one self-editing batch file; nowadays I use CMD, where Do Edit calls the file up as Notepad "%~f0" so I can paste in a section like this.
Part of Do.bat
:GetFileList
Rem as posted to https://stackoverflow.com/questions/42878196
for /F "eol=;" %%f in (filelist.txt) do curl -O %%f
exit /b 0
goto error GetFileList
Windows 7 has an FTP command, but it can often throw up a firewall dialog requiring a user-authorization response.
Currently running Windows 7 and wanting to download a list of URLs without downloading wget.exe or another dependency such as curl.exe (which would otherwise be the simplest first command), the shortest compatible way is a PowerShell command (not my favourite for speed, but if needs must...).
The file with the URLs is filelist.txt, and IWR is the PowerShell near-equivalent of wget.
The SecurityProtocol command at the start ensures we are using the modern TLS 1.2 protocol.
-OutF ... Split-Path ... means the filenames will be the same as the remote filenames but saved in the CWD (current working directory); for scripting you can cd /d folder first if necessary.
PS> [Net.ServicePointManager]::SecurityProtocol = "Tls12" ; GC filelist.txt | % {IWR $_ -OutF $(Split-Path $_ -Leaf)}
To run it from CMD, use a slightly different set of quotes around 'Tls12':
PowerShell -C "& {[Net.ServicePointManager]::SecurityProtocol = 'Tls12' ; GC filelist.txt | % {IWR $_ -OutF $(Split-Path $_ -Leaf)}}"
This needs to be made into a function with error handling, but it does the job: I use it repeatedly to download images for image classification projects.
import requests
import pandas as pd

urls = pd.read_csv('cat_urls.csv')  # load the URL list as a dataframe

rows = []
for index, i in urls.iterrows():
    rows.append(i[-1])  # keep the URL from the last column of each row

counter = 0
for i in rows:
    file_name = 'cat' + str(counter) + '.jpg'
    print(file_name)
    response = requests.get(i)
    file = open(file_name, "wb")
    file.write(response.content)
    file.close()
    counter += 1
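For instance, a minimal sketch of that refactor might look like this (the CSV path, the 'cat' prefix and the output folder are placeholders, and the URL is still assumed to sit in the last column of the CSV):
import os
import requests
import pandas as pd

def download_image_list(csv_path='cat_urls.csv', prefix='cat', out_dir='images'):
    """Read image URLs from the last column of a CSV and save each one,
    counting (instead of crashing on) any download that fails."""
    os.makedirs(out_dir, exist_ok=True)
    urls = pd.read_csv(csv_path).iloc[:, -1]
    errors = 0
    for counter, url in enumerate(urls):
        file_name = os.path.join(out_dir, prefix + str(counter) + '.jpg')
        try:
            response = requests.get(url, timeout=30)
            response.raise_for_status()
            with open(file_name, 'wb') as f:
                f.write(response.content)
        except (requests.RequestException, OSError) as e:
            errors += 1
            print('Failed on ' + str(url) + ': ' + str(e))
    return errors
Calling download_image_list() then returns how many URLs failed instead of aborting the whole run.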
import os
import time
import sys
import urllib
from progressbar import ProgressBar

def get_raw_html(url):
    version = (3, 0)
    curr_version = sys.version_info
    if curr_version >= version:  # If the current version of Python is 3.0 or above
        import urllib.request    # urllib library for extracting web pages
        try:
            headers = {}
            headers['User-Agent'] = "Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.27 Safari/537.17"
            request = urllib.request.Request(url, headers=headers)
            resp = urllib.request.urlopen(request)
            respData = str(resp.read())
            return respData
        except Exception as e:
            print(str(e))
    else:  # If the current version of Python is 2.x
        import urllib2
        try:
            headers = {}
            headers['User-Agent'] = "Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.27 Safari/537.17"
            request = urllib2.Request(url, headers=headers)
            try:
                response = urllib2.urlopen(request)
            except urllib2.URLError:  # Handling SSL certificate failures
                import ssl
                context = ssl._create_unverified_context()
                response = urllib2.urlopen(request, context=context)
            raw_html = response.read()
            return raw_html
        except:
            return "Page Not found"

def next_link(s):
    start_line = s.find('rg_di')
    if start_line == -1:  # If no links are found then give an error!
        end_quote = 0
        link = "no_links"
        return link, end_quote
    else:
        start_line = s.find('"class="rg_meta"')
        start_content = s.find('"ou"', start_line + 1)
        end_content = s.find(',"ow"', start_content + 1)
        content_raw = str(s[start_content + 6:end_content - 1])
        return content_raw, end_content

def all_links(page):
    links = []
    while True:
        link, end_content = next_link(page)
        if link == "no_links":
            break
        else:
            links.append(link)  # Append all the links to the list named 'links'
            # time.sleep(0.1)   # A timer could be used to slow down the image download requests
            page = page[end_content:]
    return links

def download_images(links, search_keyword):
    choice = input("Do you want to save the links? [y]/[n]: ")
    if choice == 'y' or choice == 'Y':
        # Write all the links into a text file.
        f = open('links.txt', 'a')  # Open the text file called links.txt
        for link in links:
            f.write(str(link))
            f.write("\n")
        f.close()  # Close the file
    num = input("Enter number of images to download (max 100): ")
    counter = 1
    errors = 0
    search_keyword = search_keyword.replace("%20", "_")
    directory = search_keyword + '/'
    if not os.path.isdir(directory):
        os.makedirs(directory)
    pbar = ProgressBar()
    for link in pbar(links):
        if counter <= int(num):
            file_extension = link.split(".")[-1]
            filename = directory + str(counter) + "." + file_extension
            # print("Downloading image: " + str(counter) + '/' + str(num))
            try:
                urllib.request.urlretrieve(link, filename)
            except urllib.error.HTTPError as e:
                errors += 1
                # print("\nHTTPError on image " + str(counter))
            except urllib.error.URLError as e:
                errors += 1
                # print("\nURLError on image " + str(counter))
            except IOError:
                errors += 1
                # print("\nIOError on image " + str(counter))
        counter += 1
    return errors

def search():
    version = (3, 0)
    curr_version = sys.version_info
    if curr_version >= version:  # If the current version of Python is 3.0 or above
        import urllib.request    # urllib library for extracting web pages
    else:
        import urllib2           # If the current version of Python is 2.x
    search_keyword = input("Enter the search query: ")
    # Download image links
    links = []
    search_keyword = search_keyword.replace(" ", "%20")
    url = 'https://www.google.com/search?q=' + search_keyword + '&espv=2&biw=1366&bih=667&site=webhp&source=lnms&tbm=isch&sa=X&ei=XosDVaCXD8TasATItgE&ved=0CAcQ_AUoAg'
    raw_html = get_raw_html(url)
    links = links + all_links(raw_html)
    print("Total Image Links = " + str(len(links)))
    print("\n")
    errors = download_images(links, search_keyword)
    print("Download Complete.\n" + str(errors) + " errors while downloading.")

search()
In this Python project I run a search on unsplash.com, which brings back a list of URLs, and then save a number of them (pre-defined by the user) to a pre-defined folder. Check it out.
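The linked project isn't reproduced here, but a minimal sketch of that flow, assuming the public Unsplash search API (/search/photos with Client-ID authentication) and placeholder values for the access key, query and folder, could look like this:
import os
import requests

ACCESS_KEY = "YOUR_UNSPLASH_ACCESS_KEY"  # placeholder: register an app at unsplash.com/developers

def download_unsplash(query, count, folder):
    """Search Unsplash for `query` and save the first `count` result images into `folder`."""
    os.makedirs(folder, exist_ok=True)
    resp = requests.get(
        "https://api.unsplash.com/search/photos",
        params={"query": query, "per_page": count},
        headers={"Authorization": "Client-ID " + ACCESS_KEY},
        timeout=30,
    )
    resp.raise_for_status()
    for i, photo in enumerate(resp.json()["results"]):
        image = requests.get(photo["urls"]["small"], timeout=30)
        with open(os.path.join(folder, query + "_" + str(i) + ".jpg"), "wb") as f:
            f.write(image.content)

download_unsplash("cats", 10, "unsplash_cats")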
On Windows, install wget from https://sourceforge.net/projects/gnuwin32/files/wget/1.11.4-1/ and add C:\Program Files (x86)\GnuWin32\bin to your PATH environment variable.
Create a folder containing a text file (e.g. images.txt) that lists all the image URLs you want to download.
In the location bar at the top of File Explorer, type cmd and press Enter.
When the command prompt opens, enter the following:
wget -i images.txt --no-check-certificate
If you want a solution on Windows:
Prepare your text file: create a text file named file.txt containing the URLs of the images, one URL per line.
Open PowerShell: press Win + X, then select "Windows PowerShell" or "Windows PowerShell (Admin)" from the menu. Alternatively, you can search for "PowerShell" in the Start menu and open it from there.
Navigate to the directory: use the cd command to change to the directory where your text file is located (for example, cd C:\path\to\folder). Then run the following script:
$urls = Get-Content "file.txt"
$outputDirectory = "output"
New-Item -ItemType Directory -Force -Path $outputDirectory | Out-Null
foreach ($url in $urls) {
    $uri = [System.Uri]$url
    $filename = [System.IO.Path]::GetFileNameWithoutExtension($uri.Segments[-1])
    $extension = [System.IO.Path]::GetExtension($uri.Segments[-1])
    # Fold any query string into the filename so URLs that differ only by parameters don't overwrite each other
    $parameters = $uri.Query -replace '\?', '_' -replace '=', '-' -replace '&', '_'
    $outputFilename = $filename + $parameters + $extension
    $outputPath = Join-Path -Path $outputDirectory -ChildPath $outputFilename
    Invoke-WebRequest -Uri $url -OutFile $outputPath
}
If you don't want to download software like wget, you could take advantage of Chrome's built-in Webpage, Complete save option:
Create an HTML file images.html containing all of your images: <img src="https://example.com/image1.jpg" /> <img src="https://example.com/image2.jpg" /> ...
Open the HTML file in Chrome.
Choose File > Save Page As… and select Webpage, Complete as the format.
Navigate to where you saved the files and open the corresponding _files folder. It should contain all the images from the webpage.
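If the URL list is long, a small helper can generate that images.html from the text file of URLs; a minimal sketch (both filenames are placeholders):
# Build images.html from a plain-text list of image URLs, one per line.
with open('images.txt') as src, open('images.html', 'w') as out:
    out.write('<html><body>\n')
    for url in src:
        url = url.strip()
        if url:
            out.write('<img src="' + url + '" />\n')
    out.write('</body></html>\n')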