Is there a way to automatically download historical stock prices from Yahoo Finance or Google Finance (in CSV format)? Preferably in Python.
Short answer: Yes. Use Python's urllib to pull the historical data pages for the stocks you want. Go with Yahoo! Finance; Google is less reliable, has less data coverage, and is more restrictive in how you can use the data once you have it. Also, I believe Google's ToS specifically prohibits scraping the data.
Longer answer: This is the script I use to pull all the historical data on a particular company. It pulls the historical data page for a particular ticker symbol, then saves it to a CSV file named after that symbol. You'll have to provide your own list of ticker symbols that you want to pull.
# Written for Python 2. In Python 3, urlretrieve lives in urllib.request
# and ContentTooShortError in urllib.error.
import urllib

base_url = "http://ichart.finance.yahoo.com/table.csv?s="

def make_url(ticker_symbol):
    return base_url + ticker_symbol

output_path = "C:/path/to/output/directory"

def make_filename(ticker_symbol, directory="S&P"):
    return output_path + "/" + directory + "/" + ticker_symbol + ".csv"

def pull_historical_data(ticker_symbol, directory="S&P"):
    try:
        urllib.urlretrieve(make_url(ticker_symbol), make_filename(ticker_symbol, directory))
    except urllib.ContentTooShortError as e:
        outfile = open(make_filename(ticker_symbol, directory), "w")
        outfile.write(e.content)
        outfile.close()
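The script above targets Python 2. A rough Python 3 port might look like the sketch below, assuming only that urlretrieve now lives in urllib.request and ContentTooShortError in urllib.error. The endpoint and output path are copied unchanged from the original and the endpoint may no longer respond. As a deliberate difference, this version reports a truncated download rather than writing e.content, and returns the file name on success.

```python
import urllib.error
import urllib.request

# Endpoint and path copied unchanged from the script above; the endpoint
# may no longer respond.
BASE_URL = "http://ichart.finance.yahoo.com/table.csv?s="
OUTPUT_PATH = "C:/path/to/output/directory"


def make_url(ticker_symbol):
    return BASE_URL + ticker_symbol


def make_filename(ticker_symbol, directory="S&P"):
    return OUTPUT_PATH + "/" + directory + "/" + ticker_symbol + ".csv"


def pull_historical_data(ticker_symbol, directory="S&P"):
    """Download one ticker's CSV; return the file name, or None if truncated."""
    filename = make_filename(ticker_symbol, directory)
    try:
        urllib.request.urlretrieve(make_url(ticker_symbol), filename)
    except urllib.error.ContentTooShortError as error:
        # Raised when fewer bytes arrive than the Content-Length header announced.
        print(f"Incomplete download for {ticker_symbol}: {error}")
        return None
    return filename
```

Calling pull_historical_data(symbol) once for each symbol in your own list reproduces the original workflow; the output directory has to exist beforehand.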
AttributeError: module 'urllib' has no attribute 'ContentTooShortError' – Helban
In Python 3, the urllib module was broken into submodules. The exception class you're looking for can be found here: docs.python.org/3/library/… – Muskeg
When you're going to work with such time series in Python, pandas is indispensable. And here's the good news: it comes with a historical data downloader for Yahoo: pandas.io.data.DataReader.
from pandas.io.data import DataReader
from datetime import datetime
ibm = DataReader('IBM', 'yahoo', datetime(2000, 1, 1), datetime(2012, 1, 1))
print(ibm['Adj Close'])
Here's an example from the pandas documentation.
Update for pandas >= 0.19:
The pandas.io.data module has been removed from pandas 0.19 onwards. Instead, you should use the separate pandas-datareader package. Install with:
pip install pandas-datareader
And then you can do this in Python:
import pandas_datareader as pdr
from datetime import datetime
ibm = pdr.get_data_yahoo(symbols='IBM', start=datetime(2000, 1, 1), end=datetime(2012, 1, 1))
print(ibm['Adj Close'])
GOOG is not accepted by the API (I don't understand why; the GOOG ticker was kept after the move to Alphabet). It works fine for GOOGL and various other symbols. Example adjusted just in case. –
Tart
Extending @Def_Os's answer with an actual demo...
As @Def_Os has already said, using pandas-datareader makes this task real fun.
In [12]: from pandas_datareader import data
pulling all available historical data for AAPL
starting from 1980-01-01
#In [13]: aapl = data.DataReader('AAPL', 'yahoo', '1980-01-01')
# yahoo api is inconsistent for getting historical data, please use google instead.
In [13]: aapl = data.DataReader('AAPL', 'google', '1980-01-01')
first 5 rows
In [14]: aapl.head()
Out[14]:
                 Open       High     Low   Close     Volume  Adj Close
Date
1980-12-12  28.750000  28.875000  28.750  28.750  117258400   0.431358
1980-12-15  27.375001  27.375001  27.250  27.250   43971200   0.408852
1980-12-16  25.375000  25.375000  25.250  25.250   26432000   0.378845
1980-12-17  25.875000  25.999999  25.875  25.875   21610400   0.388222
1980-12-18  26.625000  26.750000  26.625  26.625   18362400   0.399475
last 5 rows
In [15]: aapl.tail()
Out[15]:
                 Open       High        Low      Close    Volume  Adj Close
Date
2016-06-07  99.250000  99.870003  98.959999  99.029999  22366400  99.029999
2016-06-08  99.019997  99.559998  98.680000  98.940002  20812700  98.940002
2016-06-09  98.500000  99.989998  98.459999  99.650002  26419600  99.650002
2016-06-10  98.529999  99.349998  98.480003  98.830002  31462100  98.830002
2016-06-13  98.690002  99.120003  97.099998  97.339996  37612900  97.339996
save all data as CSV file
In [16]: aapl.to_csv('d:/temp/aapl_data.csv')
first 5 rows of d:/temp/aapl_data.csv
Date,Open,High,Low,Close,Volume,Adj Close
1980-12-12,28.75,28.875,28.75,28.75,117258400,0.431358
1980-12-15,27.375001,27.375001,27.25,27.25,43971200,0.408852
1980-12-16,25.375,25.375,25.25,25.25,26432000,0.378845
1980-12-17,25.875,25.999999,25.875,25.875,21610400,0.38822199999999996
1980-12-18,26.625,26.75,26.625,26.625,18362400,0.399475
...
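To work with a file like this later, it can be read back in. The following is a minimal sketch using only the standard library csv module, for cases where pandas is not available. The helper name load_prices is hypothetical, and io.StringIO stands in for the saved file, using the five sample rows shown above.

```python
import csv
import io

# The five sample rows shown above; io.StringIO stands in for the saved file.
SAMPLE = """Date,Open,High,Low,Close,Volume,Adj Close
1980-12-12,28.75,28.875,28.75,28.75,117258400,0.431358
1980-12-15,27.375001,27.375001,27.25,27.25,43971200,0.408852
1980-12-16,25.375,25.375,25.25,25.25,26432000,0.378845
1980-12-17,25.875,25.999999,25.875,25.875,21610400,0.38822199999999996
1980-12-18,26.625,26.75,26.625,26.625,18362400,0.399475
"""


def load_prices(handle):
    """Return one dict per row: Date stays text, every other column becomes float."""
    rows = []
    for record in csv.DictReader(handle):
        row = {"Date": record["Date"]}
        for key, value in record.items():
            if key != "Date":
                row[key] = float(value)
        rows.append(row)
    return rows


prices = load_prices(io.StringIO(SAMPLE))
print(len(prices), prices[0]["Date"], prices[0]["Adj Close"])
```

For a real file, pass an open file object instead, for example open('d:/temp/aapl_data.csv', newline='').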
There is already a Python library called yahoo_finance. Install it first with the following command:
sudo pip install yahoo_finance
Once the yahoo_finance library is installed, here's sample code that downloads the data you need from Yahoo Finance:
#!/usr/bin/python
import yahoo_finance
import pandas as pd
symbol = yahoo_finance.Share("GOOG")
google_data = symbol.get_historical("1999-01-01", "2016-06-30")
google_df = pd.DataFrame(google_data)
# Output data into CSV
google_df.to_csv("/home/username/google_stock_data.csv")
This should do it. Let me know if it works.
UPDATE: The yahoo_finance library is no longer supported.
yahoo_finance no longer works. – Duluth
You can check out the yahoo_fin package. It was initially created after Yahoo Finance changed their API (documentation is here: http://theautomatic.net/yahoo_fin-documentation).
from yahoo_fin import stock_info as si
aapl_data = si.get_data("aapl")
nflx_data = si.get_data("nflx")
aapl_data.head()
nflx_data.head()
aapl_data.to_csv("aapl_data.csv")
nflx_data.to_csv("nflx_data.csv")
aapl_data.to_csv("aapl_data.csv") – Cavefish
It's trivial when you know how:
import yfinance as yf
df = yf.download('CVS', '2015-01-01')
df.to_csv('cvs-health-corp.csv')
If you wish to plot it:
import finplot as fplt
fplt.candlestick_ochl(df[['Open','Close','High','Low']])
fplt.show()
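Since you will usually want more than one ticker, the download-and-save step can be wrapped in a small loop. This is only a sketch: save_histories is a hypothetical helper, and the fetch argument is assumed to be any callable that takes a ticker string and returns an object with a to_csv(path) method, such as a pandas DataFrame.

```python
from pathlib import Path


def save_histories(symbols, fetch, out_dir="."):
    """Fetch each symbol with `fetch` and write <symbol>.csv into out_dir.

    `fetch` is any callable that takes a ticker string and returns an object
    with a `to_csv(path)` method, such as a pandas DataFrame.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for symbol in symbols:
        frame = fetch(symbol)
        path = out_dir / f"{symbol.lower()}.csv"
        frame.to_csv(path)
        written.append(path)
    return written
```

With yfinance imported as above, a call such as save_histories(['CVS', 'AAPL'], lambda s: yf.download(s, '2015-01-01')) would write cvs.csv and aapl.csv into the current directory.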