Google Drive Python API resumable upload error 401 after 2 hours

First of all, I'm sorry if this is too silly a question... this is the first time I've tried to use any of the technologies involved in this script (Python, the Drive API, OAuth 2.0, etc.). I swear I've been searching and experimenting for about a week before posting the question. hehehe

I'm trying to use the google-api-python-client to upload a big file (3.5 GiB) that sits on a terminal-only Debian Linux machine. I've had some success uploading small files, but when I try to upload the big file, the upload stops about 1 to 2 hours after it started with an HTTP 401 error (unauthorized). I've been looking into how to get a new access token, but with little success.

This is my (updated) code so far:

#!/usr/bin/python

import httplib2
import pprint
import time

from apiclient.discovery import build
from apiclient.http import MediaFileUpload
from apiclient import errors
from oauth2client.client import OAuth2WebServerFlow

# Copy your credentials from the APIs Console
CLIENT_ID = 'myclientid'
CLIENT_SECRET = 'myclientsecret'

# Check https://developers.google.com/drive/scopes for all available scopes
OAUTH_SCOPE = 'https://www.googleapis.com/auth/drive'

# Redirect URI for installed apps
REDIRECT_URI = 'urn:ietf:wg:oauth:2.0:oob'

# Run through the OAuth flow and retrieve credentials
flow = OAuth2WebServerFlow(CLIENT_ID, CLIENT_SECRET, OAUTH_SCOPE, REDIRECT_URI)
authorize_url = flow.step1_get_authorize_url()
print 'Go to the following link in your browser: ' + authorize_url
code = raw_input('Enter verification code: ').strip()
credentials = flow.step2_exchange(code)

# Create an httplib2.Http object and authorize it with our credentials
http = httplib2.Http()
http = credentials.authorize(http)

drive_service = build('drive', 'v2', http=http)

# Insert a file
media_body = MediaFileUpload('bigfile.zip', mimetype='application/octet-stream', chunksize=1024*256, resumable=True)
body = {
    'title': 'bigfile.zip',
    'description': 'Big File',
    'mimeType': 'application/octet-stream'
}

retries = 0
request = drive_service.files().insert(body=body, media_body=media_body)
response = None
while response is None:
    try:
        print http.request.credentials.access_token
        status, response = request.next_chunk()
        if status:
            print "Uploaded %.2f%%" % (status.progress() * 100)
            retries = 0
    except errors.HttpError, e:
        if e.resp.status == 404:
            print "Error 404! Aborting."
            exit()
        else:
            if retries > 10:
                print "Retries limit exceeded! Aborting."
                exit()
            else:
                retries += 1
                time.sleep(2**retries)
                print "Error (%d)... retrying." % e.resp.status
                continue
print "Upload Complete!"

After some digging, I found out that the authorized http object automatically refreshes the access token after receiving a 401. Although the access token really does change, the upload still doesn't continue as expected... see the output below:
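To make the behaviour described here concrete, below is a simplified model of "refresh on 401, then resend". It is only a sketch: `FakeCredentials`, `authorized_request` and `make_session_bound_server` are hypothetical names invented for illustration, not part of oauth2client or the Drive API, and the "server" is an assumption that mimics a session which keeps rejecting requests even after the token has been renewed.

```python
class FakeCredentials(object):
    """Hypothetical stand-in for OAuth 2.0 credentials (not oauth2client's class)."""

    def __init__(self):
        self.access_token = "token-1"
        self.refresh_count = 0

    def refresh(self):
        # A real client would exchange its refresh token at the token endpoint.
        self.refresh_count += 1
        self.access_token = "token-%d" % (self.refresh_count + 1)


def authorized_request(credentials, send):
    """Call send(token); on a 401, refresh once and resend with the new token."""
    status = send(credentials.access_token)
    if status == 401:
        credentials.refresh()
        status = send(credentials.access_token)
    return status


def make_session_bound_server(valid_token):
    """Hypothetical server that accepts only one specific token."""
    def send(token):
        if token == valid_token:
            return 200
        return 401
    return send


creds = FakeCredentials()
# Assumed scenario: the token the session started with has expired,
# and the session does not accept any newly issued token either.
send = make_session_bound_server("expired-original-token")
status = authorized_request(creds, send)
print("%d %s" % (status, creds.access_token))
```

In this model the token changes on every attempt while the status stays 401, which is the same pattern as in the output below: the client side refreshes correctly, so the rejection has to come from how the server treats the session.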

ya29.AHES6ZTo_-0oDqwn3JnU2uCR2bRjpRGP0CSQSMHGr6KvgEE
Uploaded 2.28%
ya29.AHES6ZTo_-0oDqwn3JnU2uCR2bRjpRGP0CSQSMHGr6KvgEE
Uploaded 2.29%
ya29.AHES6ZTo_-0oDqwn3JnU2uCR2bRjpRGP0CSQSMHGr6KvgEE
Uploaded 2.29%
ya29.AHES6ZTo_-0oDqwn3JnU2uCR2bRjpRGP0CSQSMHGr6KvgEE
Uploaded 2.30%
ya29.AHES6ZTo_-0oDqwn3JnU2uCR2bRjpRGP0CSQSMHGr6KvgEE
Error (401)... retrying.
ya29.AHES6ZQqp3_qbWsTk4yVDdHnlwc_7GvPZiFIReDnhIIiHao
Error (401)... retrying.
ya29.AHES6ZSqx90ZOUKqDEP4AAfWCVgXZYT2vJAiLwKDRu87JOs
Error (401)... retrying.
ya29.AHES6ZTp0RZ6U5K5UdDom0gq3XHnyVS-2sVU9hILOrG4o3Y
Error (401)... retrying.
ya29.AHES6ZSR-IOiwJ_p_Dm-OnCanVIVhCZLs7H_pYLMGIap8W0
Error (401)... retrying.
ya29.AHES6ZRnmM-YIZj4S8gvYBgC1M8oYy4Hv5VlcwRqgnZCOCE
Error (401)... retrying.
ya29.AHES6ZSF7Q7C3WQYuPAWrxvqbTRsipaVKhv_TfrD_gef1DE
Error (401)... retrying.
ya29.AHES6ZTsGzwIIprpPhCrqmoS3UkPsRzst5YHqL-zXJmz6Ak
Error (401)... retrying.
ya29.AHES6ZSS_1ZBiQJvZG_7t5uW3alsy1piGe4-u2YDnwycVrI
Error (401)... retrying.
ya29.AHES6ZTLFbBS8mSFWQ9zK8cgbX8RPeLghPxkfiKY54hBB-0
Error (401)... retrying.
ya29.AHES6ZQBeMWY50z6fWXvaCcd5_AJr_AYOuL2aiNKpK-mmyU
Error (401)... retrying.
ya29.AHES6ZTs2mYYSEyOqI_Ms4itKDx36t39Oc5RNZHkV4Dq49c
Retries limit exceeded! Aborting.
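As a quick check on the output above, the sketch below replays only the backoff arithmetic of the retry loop (same limit of 10 and same `2**retries` delay) to see how many retries happen and how long they wait in total. `backoff_schedule` is a hypothetical helper written for this illustration; it does not call the Drive API.

```python
def backoff_schedule(max_retries=10):
    """Return the sleep durations the loop goes through before it aborts."""
    retries = 0
    delays = []
    # The loop aborts once retries > max_retries; until then it
    # increments the counter and sleeps 2**retries seconds.
    while retries <= max_retries:
        retries += 1
        delays.append(2 ** retries)
    return delays


delays = backoff_schedule()
total_seconds = sum(delays)
print("%d retries, %d seconds (about %.0f minutes)"
      % (len(delays), total_seconds, total_seconds / 60.0))
```

That is 11 retries, matching the 11 "Error (401)... retrying." lines, spread over a bit more than an hour of waiting, so the 401 was not a short-lived glitch that a longer backoff would have ridden out.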

I'm using Debian Lenny with Python 2.5.2, and I installed the ssl and google-api-python-client packages with pip about a week ago.

Thanks in advance for any help.

EDIT: Apparently, the problem isn't with the API. I tried the same code as above, but with two small files and a 1-hour pause between them (time.sleep()). The output was:

ya29.AHES6ZRUssiLfuhqCP9Cu7C7LuhRV2rYzPldU27wiMJZWb8
Uploaded 66.89%
ya29.AHES6ZRUssiLfuhqCP9Cu7C7LuhRV2rYzPldU27wiMJZWb8
Upload 1 Complete!
ya29.AHES6ZRUssiLfuhqCP9Cu7C7LuhRV2rYzPldU27wiMJZWb8
Uploaded 57.62%
ya29.AHES6ZQd3o1ciwXpNFImH3CK0-dJAtQba_oeIO9DDbIq154
Upload 2 Complete!

For the second upload, a new access token was used successfully. So perhaps the resumable session expires after some time, or is only valid for the specific access token it was started with?

Casein answered 11/1, 2013 at 20:49 Comment(3)
According to the documentation on credentials.authorize, I think the http.request object should automatically refresh the access token when a 401 is received. I'll do some debugging to find out if that's really happening. - Casein
I confirmed it. Even if I do nothing to refresh the access token, the API does a refresh after receiving a 401 (I printed http.request.credentials.access_token and it changes automatically after the first 401). But it's not working... maybe a bug? - Casein
Another update: I tried to upload the file via simple upload (resumable=False), but got an error about the file size: "OverflowError: long int too large to convert to int" - Casein

I filed an issue on the google-api-python-client project, and according to Joe Gregorio from Google, the problem is in the backend:

"This is an issue with the backend and not with the API or with your code. As you deduced, if the upload goes too long the access_token expires and at that point the resumable upload can't be continued. There is work on progress to fix this issue right now, I will update this bug once the issue is fixed on the server side."

Casein answered 14/1, 2013 at 15:2 Comment(7)
This still hasn't been solved after half a year. Does anyone know a workaround? - Plumose
Joe Gregorio said 2 days ago that the issue is fixed. - Plumose
No luck here. It still stops after about 1 hour with a 401. I still believe I will "celebrate" this bug's birthday. heheh - Casein
Anyone brought the cake? The bug's first birthday is today! I haven't tested it recently though... I gave up on this a long time ago. - Casein
Apparently, the old API should work fine. Someone commented that on the linked issue yesterday. Again, I didn't test it. - Casein
Yes, I tested it about a year ago. That API does not check the token in the new way. - Plumose
This bug has since been fixed. - Amoritta

I assume the problem is that after the 1-2 hour limit your access token for the remote server expires, cutting off your connection. I think you could look at your host's API manual. It should have something about "refresh tokens", which get you another access token (note that some hosts only allow you to use one refresh token per session). If an unlimited number is allowed, you can use a combination of a timer and AJAX to keep asking for more access tokens.

If not, then you would have to make an AJAX request for another authorization token and exchange that for another access token every hour. That sounds like a very laborious process, but I think it is the only way if your token keeps expiring.

On another note, have you tried other methods of uploading? If the script above ran for 1-2 hours and only uploaded 1.44% of the file, it could take 100+ hours to upload fully (way too long for only 3 gigs).

Tourer answered 11/1, 2013 at 21:8 Comment(5)
Hi, Devon. Thanks for your quick answer. I suspect the problem really is the expiration of the access token. I tried using the Credentials.refresh function to renew it, but couldn't get it working. See the handling of the 401 exception. The manual talks about access tokens (which expire) and refresh tokens (which never expire), but I can't find a way to implement the access token renewal. As for the time it takes to upload, it's normal for the average upload speed in my city (128 kbps ~ 256 kbps). - Casein
@user1970843: I am not sure what your specific host's documentation says about getting a refresh key, so unfortunately I cannot give you code specific to their requirements. As far as I know, a refresh token should not expire as long as your session is running, but most hosts only let you use it once. Is it possible for you to use AJAX to replicate your first process of getting the authentication token and exchanging it for an access token? If that is a function, you can just use AJAX to re-run it every hour or so. - Tourer
I'm not very familiar with AJAX, but isn't that a technology for websites? Mine is a local Python script, so I can't see where AJAX would apply. I tried repeating the whole authentication process (including visiting the website to get a new access code) after the first 401, still with no luck. - Casein
@user1970843: Oh sorry, I misunderstood. I thought that since you were using an API it was a web-based platform, but now I get it. Either way, you can do the same thing with Python; it is the same idea in a different language. I have a possible idea of what is happening: perhaps your access token expires on your end, but the server is still looking to refresh that same token and cannot find it. Maybe there is some way to keep that same token value and update your token with it, either every hour or by checking for when it expires. - Tourer
But the API doesn't check the token expiration time... I just found out that it tries (automatically) to get a new access token using the refresh token after receiving a 401 from the server, which means (I guess) the token expired on the server end. - Casein
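As a rough check of the upload-time figures discussed above, the sketch below estimates how long a 3.5 GiB file takes at the 128 to 256 kbps mentioned in the comments, and how many 256 KiB chunks the question's `chunksize` implies. It assumes 1 kbps = 1000 bits per second and ignores protocol overhead and retries; `upload_hours` is a hypothetical helper written for this illustration.

```python
import math

FILE_BYTES = 3.5 * 2 ** 30   # 3.5 GiB, as stated in the question
CHUNK_BYTES = 1024 * 256     # chunksize used in the question's code


def upload_hours(size_bytes, kbps):
    """Hours needed to send size_bytes at kbps (1 kbps assumed = 1000 bit/s)."""
    seconds = size_bytes * 8 / (kbps * 1000.0)
    return seconds / 3600.0


chunks = int(math.ceil(FILE_BYTES / CHUNK_BYTES))
percent_per_chunk = 100.0 / chunks

print("chunks: %d (%.4f%% each)" % (chunks, percent_per_chunk))
for kbps in (128, 256):
    print("%d kbps: about %.1f hours" % (kbps, upload_hours(FILE_BYTES, kbps)))
```

Under these assumptions the upload needs roughly 33 to 65 hours, so any single access token lifetime of an hour or two is far shorter than the transfer, and each chunk advances the progress by well under 0.01%, in line with the small steps in the question's output.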
