yfinance: Time of day missing when choosing one-hour intervals

I use the Python package yfinance to get the historical stock prices of a stock (in this example, Tesla's stock).

When I do the following and fetch the stock price for the last week at one-minute intervals:

import yfinance as yf

print(yf.Ticker('TSLA').history(period='7d', interval='1m'))

I get

                                 Open        High         Low       Close   Volume  Dividends  Stock Splits
Datetime
2020-12-03 09:30:00-05:00  586.391479  590.975586  585.549988  586.391479  2999806          0             0
2020-12-03 09:31:00-05:00  586.320007  591.919983  586.320007  591.619995   457446          0             0
2020-12-03 09:32:00-05:00  591.820007  591.907104  586.000000  587.492798   324244          0             0
2020-12-03 09:33:00-05:00  586.909973  590.020020  586.799988  588.919983   306530          0             0
2020-12-03 09:34:00-05:00  588.730774  588.919922  584.330017  584.688416   318614          0             0
...                               ...         ...         ...         ...      ...        ...           ...
2020-12-11 10:20:00-05:00  613.155029  614.059998  612.770020  613.789978    87083          0             0
2020-12-11 10:21:00-05:00  613.876404  613.960022  612.799988  613.235474    58031          0             0
2020-12-11 10:22:00-05:00  613.262390  614.010010  613.262390  614.000000   106497          0             0
2020-12-11 10:23:00-05:00  614.000000  614.000000  612.659973  613.099426    80285          0             0
2020-12-11 10:24:18-05:00  613.215027  613.215027  613.215027  613.215027        0          0             0

[2390 rows x 7 columns]

so I can see the date and the time of day for each interval.

However, when I instead choose one hour intervals:

import yfinance as yf

print(yf.Ticker('TSLA').history(period='7d', interval='1h'))

I get

                  Open        High         Low       Close    Volume  Dividends  Stock Splits
Date
2020-12-03  590.020020  595.890015  582.429993  588.159973  14637166          0             0
2020-12-03  588.164917  591.000000  583.690002  587.432983   4633556          0             0
2020-12-03  587.370117  593.599976  586.430115  592.580017   4635495          0             0
2020-12-03  592.520020  594.500000  589.450012  594.130005   2941966          0             0
2020-12-03  594.110107  598.969971  593.169983  596.325012   6434228          0             0
2020-12-03  596.499878  598.309998  591.500000  594.809998   4211141          0             0
2020-12-03  594.844971  596.539978  592.000000  593.280029   2916165          0             0
2020-12-04  591.010010  597.440002  585.500000  591.739502   9404838          0             0
2020-12-04  591.859985  595.429993  587.750000  591.310120   4337670          0             0
2020-12-04  591.397888  594.789978  589.919983  593.419983   2994462          0             0
2020-12-04  593.530029  596.000000  592.409973  593.159973   2625920          0             0
2020-12-04  593.140015  594.309998  590.330017  592.700012   2374415          0             0
2020-12-04  592.619995  596.700012  592.239990  594.233398   3066786          0             0
2020-12-04  594.215027  599.000000  594.109985  599.000000   2983803          0             0
2020-12-07  604.919678  624.750000  603.049988  624.164978  14539011          0             0
2020-12-07  624.289978  630.000000  624.109985  626.499878   8340672          0             0
2020-12-07  626.450317  629.301575  625.609985  627.753296   3925194          0             0
2020-12-07  627.734985  633.500000  625.500000  632.647583   4394597          0             0
2020-12-07  632.684998  639.989990  631.500000  638.101013   6408641          0             0
2020-12-07  638.000000  648.785583  635.340027  645.309998  10078446          0             0
2020-12-07  645.304993  648.000000  637.099976  642.000000   6027320          0             0
2020-12-08  625.505005  637.340027  618.500000  629.020020  21461425          0             0
2020-12-08  629.099976  630.830017  624.260010  624.909973   5519322          0             0
2020-12-08  624.950012  630.250000  620.929993  629.372681   5926122          0             0
2020-12-08  629.409973  640.000000  628.520020  639.946594   5931369          0             0
2020-12-08  640.000000  651.280029  636.739990  650.429199  10931715          0             0
2020-12-08  650.500000  650.599915  642.000000  646.159973   7110200          0             0
2020-12-08  646.190002  650.479980  644.229980  650.250000   4363843          0             0
2020-12-09  653.690002  654.320007  630.000000  639.059998  16440841          0             0
2020-12-09  639.083801  643.039978  635.000000  635.594788   6210129          0             0
2020-12-09  635.605591  637.799988  628.500000  632.789978   5634442          0             0
2020-12-09  632.809998  633.400024  613.309998  616.940002   7626216          0             0
2020-12-09  616.759583  618.000000  588.000000  616.830017  18922860          0             0
2020-12-09  616.809998  616.820007  598.000000  601.000000   9894340          0             0
2020-12-09  601.000000  607.879883  600.400024  604.169983   4249969          0             0
2020-12-10  574.369995  607.059998  566.340027  600.495911  22449936          0             0
2020-12-10  600.492798  624.330017  600.309998  621.483887  12631129          0             0
2020-12-10  621.710022  622.679993  609.299988  611.215027   8174524          0             0
2020-12-10  611.290894  616.397217  602.260010  615.700012   6841379          0             0
2020-12-10  615.570129  619.869995  609.929993  618.749390   4659236          0             0
2020-12-10  618.580017  624.489990  615.340027  621.729980   6155838          0             0
2020-12-10  621.599976  627.750000  621.280029  627.150024   4087834          0             0
2020-12-11  615.010010  624.000000  607.307007  612.724426  11765035          0             0
2020-12-11  613.319214  613.319214  613.319214  613.319214         0          0             0

meaning I don't know the time of day for the different intervals, only the day on which they were recorded. Why is that? I get the time of day when choosing one-minute intervals, so why don't I get it for one-hour intervals? Is there an easy way to get the time of day with one-hour intervals too, or do I have to compare them to the one-minute intervals and work out which interval corresponds to which hour?

Sural answered 11/12, 2020 at 15:38 Comment(2)
Try 60m instead of 1h - github.com/ranaroussi/yfinance/issues/125 – Metamorphic
@Metamorphic Thank you very much, that solved the problem! If you post it as an answer I will mark it as the accepted one. – Sural

I took a look at the code behind yfinance in my \venv\Lib\site-packages\yfinance folder and experimented with it for a while. Here's what I found:

base.py line 182 (yfinance queries for its data):

quotes = utils.parse_quotes(data["chart"]["result"][0], tz)

utils.py line 131: (yfinance converts time into specified format)

quotes.index = _pd.to_datetime(timestamps, unit="s")

datetimes.py lines 605 - 617 (the function where the conversion takes place)

def to_datetime()

That last function seems to be where the date conversion happens. Changing those settings didn't do anything except break the program, so I went back to base.py and kept experimenting until I found that yfinance reformats the index a second time at line 234. If you put elif params["interval"] == '1h': pass right after line 236, you get the data you want without losing the hour/minute/second part.

Caution: modifying your yfinance library may break something. I'm not recommending this.

base.py lines 234-243 should now look like this:

if params["interval"][-1] == "m":
    df.index.name = "Datetime"
elif params["interval"] == '1h':
    pass
else:
    df.index = _pd.to_datetime(df.index.date)
    if tz is not None:
        df.index = df.index.tz_localize(tz)
    df.index.name = "Date"
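
To see why '60m' keeps the time of day while '1h' loses it, here is a minimal sketch of the branch above as it behaves before the patch (that is, without the elif). The function name normalize_index and the fixed UTC-5 offset are assumptions for illustration only. This is not yfinance's actual code, which works on a pandas index rather than a list.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-5 offset, matching the -05:00 shown in the question's output.
EST = timezone(timedelta(hours=-5))


def normalize_index(timestamps, interval):
    """Hypothetical, simplified stand-in for the unpatched branch in base.py."""
    if interval[-1] == "m":
        # Minute-based intervals ("1m", "60m", ...) keep the full timestamp.
        return list(timestamps)
    # Every other interval, including "1h", is truncated to the date.
    return [ts.replace(hour=0, minute=0, second=0, microsecond=0) for ts in timestamps]


bars = [
    datetime(2020, 12, 3, 9, 30, tzinfo=EST),
    datetime(2020, 12, 3, 10, 30, tzinfo=EST),
]

print(normalize_index(bars, "1h"))   # time of day is dropped
print(normalize_index(bars, "60m"))  # time of day is kept
```

Because the check only looks at the last character of the interval string, '60m' takes the first branch and '1h' does not, even though both describe the same bar length.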

Take a look at your data and you will notice a big problem...

THE BIG PROBLEM is that the data has a value for market open at 9:30, and every bar after that until 4 PM is at the 30-minute mark, not the 00-minute mark:

                           Open    High      ...  Dividends  Stock Splits
2020-12-03 08:00:00-05:00  592.75  596.28    ...          0             0
2020-12-03 09:00:00-05:00  594.15  594.99    ...          0             0
2020-12-03 09:30:00-05:00  590.02  595.90    ...          0             0
2020-12-03 10:30:00-05:00  588.16  591       ...          0             0
...
2020-12-03 15:30:00-05:00  594.84  596.53    ...          0             0
2020-12-03 16:00:00-05:00  593.33  597.24    ...          0             0

At this point we could keep tweaking yfinance, but honestly it would be easier to get the data ourselves. I wrote the following code by picking apart yfinance and adapting it to what we need:

Code to Download 1h Data Directly from Yahoo Finance

import datetime as dt

import requests

ticker = 'TSLA'
base_url = 'https://query1.finance.yahoo.com'
url = "{}/v8/finance/chart/{}".format(base_url, ticker)
# includePrePost asks for pre- and post-market bars as well
params = {'interval': '1h', 'range': '7d', 'includePrePost': True}

response = requests.get(url=url, params=params)
response.raise_for_status()  # fail early on an HTTP error instead of parsing an error body
data = response.json()

result = data['chart']['result'][0]
epoch = result['timestamp']  # Unix timestamps in seconds
prices = result['indicators']['quote'][0]['close']

list_of_time_and_price = []

# Pair each timestamp with its closing price
for entry, price in zip(epoch, prices):
    # fromtimestamp() converts to the local timezone of the machine running this script
    date_and_time = dt.datetime.fromtimestamp(entry).strftime('%Y-%m-%d %H:%M:%S')
    list_of_time_and_price.append([date_and_time, price])

print(list_of_time_and_price)

You can adapt the above to your own needs. Currently it returns a list of datetimes paired with the closing price, and it does include the :30 minute data during the trading day.
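
One thing to watch for: dt.datetime.fromtimestamp() without a tz argument converts to the local timezone of the machine running the script, so the printed times only match the exchange's clock if you happen to be in the same timezone. A minimal sketch of a timezone-aware conversion follows. The helper name to_exchange_time and the fixed UTC-5 offset are assumptions, chosen to match the -05:00 in the question's output. A fixed offset ignores daylight saving time; on Python 3.9+ zoneinfo.ZoneInfo("America/New_York") would handle that.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC-5 offset, as in the -05:00 timestamps shown in the question.
EXCHANGE_TZ = timezone(timedelta(hours=-5))


def to_exchange_time(epoch_seconds, tz=EXCHANGE_TZ):
    """Convert a Unix timestamp (seconds) to a timezone-aware datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=tz)


# Build a sample timestamp from a known instant: 14:30 UTC on 2020-12-03.
sample_epoch = int(datetime(2020, 12, 3, 14, 30, tzinfo=timezone.utc).timestamp())

print(to_exchange_time(sample_epoch).isoformat())  # 2020-12-03T09:30:00-05:00
```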

EDIT

Good thing I double-checked: apparently getting 1h data straight from Yahoo Finance always returns trading data on the :30 minute mark, so the issue isn't with the Python library yfinance but with Yahoo Finance itself. I guess the workaround would be to download 1-minute data and aggregate it into 1-hour bars, but then you're limited to only the last week's worth of data.

If you only need stock data, I'd recommend checking out the Alpha Vantage API, which has intraday data for 2 years and is free with a limit of 5 requests per minute (or something like that).
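
A minimal sketch of that workaround, using only the standard library: it groups one-minute bars into bars that start on the hour. The function name to_hourly, the tuple layout, and the sample values are all made up for illustration. With a pandas DataFrame you would more likely reach for DataFrame.resample instead.

```python
from datetime import datetime


def to_hourly(minute_bars):
    """Aggregate (timestamp, open, high, low, close, volume) tuples into hourly bars.

    Assumes the input is sorted by timestamp. Each bar is assigned to the
    clock hour it falls in, so 09:30 and 09:31 both land in the 09:00 bar.
    """
    hourly = {}
    for ts, open_, high, low, close, volume in minute_bars:
        bucket = ts.replace(minute=0, second=0, microsecond=0)
        if bucket not in hourly:
            # First bar of the hour sets the open.
            hourly[bucket] = [open_, high, low, close, volume]
        else:
            bar = hourly[bucket]
            bar[1] = max(bar[1], high)  # highest high
            bar[2] = min(bar[2], low)   # lowest low
            bar[3] = close              # last close wins
            bar[4] += volume            # volumes add up
    return [(ts, *values) for ts, values in hourly.items()]


# Synthetic sample data, not real quotes.
minute_bars = [
    (datetime(2020, 12, 3, 9, 30), 586.0, 591.0, 585.5, 586.4, 100),
    (datetime(2020, 12, 3, 9, 31), 586.3, 591.9, 586.3, 591.6, 50),
    (datetime(2020, 12, 3, 10, 0), 590.0, 592.0, 589.0, 591.0, 70),
    (datetime(2020, 12, 3, 10, 1), 591.0, 593.5, 590.5, 593.0, 30),
]

hourly_bars = to_hourly(minute_bars)
for bar in hourly_bars:
    print(bar)
```

Note that with a 9:30 open, the first bar of each day covers only 30 minutes of trading.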

Dragelin answered 12/12, 2020 at 0:38 Comment(11)
Thanks for doing all the playing around! It seems like you may have found the source of the problem. Following putty's comment, it seems someone has created a pull request for the same place in the code. And interesting to see an alternative solution using only requests! – Sural
No problem! It was rather fun picking apart the library and now I know why my EMA indicators are slightly off, so I'm really glad I decided to take a look into it. – Dragelin
An interesting observation I made when looking at the stock data you added to your answer is that you have intervals starting at 08:00 and 09:00, which are both before the stock has opened, and I don't understand why you get that data or how the open value can change between 08:00 and 09:00 when the stock is supposedly closed. When I follow putty's tip and use the interval 60m instead of 1h, I only get intervals starting at 09:30, 10:30, 11:30, 12:30, 13:30, 14:30 and 15:30. – Sural
You also have an interval starting at 16:00, which would go from 16:00 to 17:00, during which the stock is closed. But maybe I have misinterpreted the timestamp, and it is in fact the time the interval ended as opposed to started? – Sural
What confuses me further is that your data for the 09:30 interval is different from the data I get when using the interval 60m; in my data this interval opens at 586.3914795 and has its high at 594.9699707. The intervals at 10:30 and at 15:30 seem to correspond more or less to yours, though, except that my 15:30 interval has its high at 596.539978 and not at 596.53. My data is also the same no matter whether I use 1h or 60m; only the timestamps differ. So there seem to be some discrepancies between the data you get and the data I get. – Sural
Regular stocks are actually open from 4am to 8pm, with trading day hours of 9:30am to 4pm. Not all brokers allow you to see pre/post market hour data; I know TD Ameritrade / thinkorswim does. When I queried yfinance I asked for 'prepost' data, which is why I was getting those extra hours: yf.Ticker('TSLA').history(period='7d', interval='60m', prepost=True) – Dragelin
As for the price discrepancies, I rounded my values because the table wasn't lining up correctly. I double-checked and they're all correctly rounded to 2 decimals. If you look at your OP, your 9:30 value for 1m is 586.3914795 while for 1h it's 590.020020, which is just like my 590.02 except not rounded. – Dragelin
Indeed, that is another weird discrepancy. You would think they should be the same for 1h and 60m, wouldn't you? – Sural
Wow, I didn't know about prepost data; thanks for telling me about it. So are some people able to trade stocks outside of trading day hours, or how come the stock price can change? – Sural
Yeah honestly, I'd strongly recommend switching to Alpha Vantage. I just queried their server and their data looks properly formatted, with timestamps of 9:00, 10:00, 11:00 etc. The only downside is that they don't have futures and have that limit of 5 requests per minute, 500 per day, but the quality of data is so worth it. – Dragelin
Glad I could help! Some stocks that have future counterparts like SPY, QQQ, GLD are available to trade all 24 hours of the day, however all stocks (including futures) are closed 5pm Friday until 6pm Sunday. If you have a broker that allows you to trade pre/post market (like TD Ameritrade) then yeah, you can buy those stocks pre/post market. Anyone with a broker that allows them to trade pre/post can do so, and that's why stock prices still move when the "main" market is closed. There are a few stocks that ONLY trade during market hours 9:30 - 4pm, mostly small name stocks. – Dragelin

This worked fine for me:

import yfinance as yf

# '60m' rather than '1h', so the index keeps the time of day
data = yf.download("TSLA", period='7d', interval='60m')

data.to_csv("spy.csv")
Bachman answered 29/12, 2021 at 6:10 Comment(0)
