Easiest way to get file ID from URL on Google Apps Script
Asked Answered
K

14

54

Here is what I'm trying to do: given a Google document URL, I want to get the document ID to create a copy on Google Drive. I know I can achieve that by some regex or replacing on the URL, but as there are several different forms to represent the same document in a URL, I wanted to find a generic solution.

Currently, that's the best I could think:

function getFileIdFromUrl(url) {
  try {
    return getDocIdFromUrl(url);
  } catch (e) {
    return getSpreadsheetIdFromUrl(url);
  }
}

function getDocIdFromUrl(url) {
  var doc = null;
  try {
    doc = DocumentApp.openByUrl(url);
  } catch (e) {
    doc = DocumentApp.openByUrl(url + "/edit");
  }
  return doc.getId();
}

function getSpreadsheetIdFromUrl(url) {
  var spreadsheet = null;
  try {
    spreadsheet = SpreadsheetApp.openByUrl(url);
  } catch (e) {
    spreadsheet = SpreadsheetApp.openByUrl(url + "/edit");
  }
  return spreadsheet.getId();
}

function copy(url) { // may throw an exception if the URL is invalid or private
   var id = getFileIdFromUrl(url);
   var file = DriveApp.getFileById(id);
   file.makeCopy().setSharing(DriveApp.Access.ANYONE_WITH_LINK, DriveApp.Permission.VIEW);
}

The problem is that my solution only covers documents and spreadsheets, I would like to do the same with any uploaded file, for example:

https://docs.google.com/file/d/0B-FYu_D7D7x4REdtRVEzVH0eU0/edit

In short, I wanted something like that:

DriveApp.getFileByUrl(url).makeCopy();

Does anyone know if it's possible?

Any safe solution to extract the file ID from the file URL would fit as well for me.

Thanks

Klepht answered 30/5, 2013 at 15:23 Comment(0)
S
140

DriveApp is indeed missing a getFileByUrl (and also folder for that matter). You may want to open an enhancement request on Apps Script issue tracker.

But what I do on my scripts (since these openByUrl functions are somewhat new), is to get the id using a regex. Like this.

function getIdFromUrl(url) { return url.match(/[-\w]{25,}/); }

This regex works for any google url I've tried: Drive url for folders and files, Fusion Tables, Spreadsheets, Docs, Presentations, etc. It just looks for anything in a string that "looks like" a Google key. That is, any big enough string that has only (google key) valid characters in it.

Also, it works even if it receives the ID directly, instead of the URL. Which is useful when you're asking the link from the user, as some may paste the id directly instead of the url and it still works.

--edit

There are some other answers and comments that address some edge cases that I never encountered myself but might happen, like trying to get a folder-id on a nested folder URL, or when you have G-Suite domain that is 25+ characters long. For those cases, you might want to use a more strict regex.

From a quick look at the suggestions below I recommend the following /[-\w]{25,}(?!.*[-\w]{25,})/ because it is still very simple and should address these cases.

Steatite answered 30/5, 2013 at 15:50 Comment(8)
I did some tests and it looks great! Thank you. Just a question: this number (25) is really the minimum length for google docs ids?Klepht
This is not documented anywhere. I just took a bunch of URL from various files, look at the shortest one and gave a discount just in case :) Also checked that there wasn't another part of the url remotely close to 25 straight chars (without a dot in between, etc), so it won't confuse with another part.Steatite
Using the regex suggested by aquadeep is less likely to make an unwanted match. Then use a (group) to extract only the ID portion of the regex: url.match(/\/d\/(.{25,})\//)[1];Weinert
@Kenigmatic, an unwanted match seems very unlikely, and I more willing to trust 25 characters as a base length than I am that the ID will always start with a number. I also think we can trust that the ID will always be made up of digits, letters, and hyphens. Why use a dot?Allotropy
Just discovered that your edit added an unwanted '$' in the last paragraph—not sure that's supposed to be there? And guess which one I copied and struggled with :).Lovable
Indeed, the idea for the $ was to get the last match of 25+ matches in the string, but that'll break if the url has extra things after the id, which is entirely possible. I've edited and replaced that with a lookaheadSteatite
SonarCloud warns about using regex with backtracking, which may lead to a performance issue called catastrophic backtracking. I'm no expert in regex so @HenriqueG.Abreu, is this regex prone to that issue?Junior
I guess it could be, if the negative lookahead is not optimized (on the last suggested regex in the end of my answer). I haven't tested that on Apps Script. If you're worried about that, I imagine limiting the input URL to some reasonable length should prevent the execution even if bad. It should be an easy testSteatite
E
8

The url is something like this and file id is present in this pattern "/d/XXXXXXXX/" for almost all GoogleDrive/Docs links:
https://drive.google.com/file/d/0B3tB9BU9FRnpcTJmS2FoaktsQzA/view

Using below function, we can get the '/d/fileid/' and then truncate '/d/' from begining and '/' from end.

public static string getIdFromUrl(string url)
{
    Regex r = new Regex(@"\/d\/(.+)\/", RegexOptions.IgnoreCase);
    Match m = r.Match(url);
    return m.ToString().TrimStart('/', 'd').Trim('/');
}
Ewell answered 3/7, 2015 at 7:33 Comment(0)
B
8

I don't have enough reputation to comment on the accepted answer, but the accepted answer from Henrique G. Abreu fails when a Drive URL contains the domain name and the domain name exceeds 25 characters (just found this out the hard way :)

It has been very reliable otherwise and I think is the most elegant and robust among those provided here.

So, expanding on the accepted answer, the following regex will get the last occurrence of a string of word characters or hyphens that is at least 25 characters long, that is immediately preceded by a character that isn't a non word character or a hyphen, and followed optionally by the same type of character, and whatever other junk might come at the end:

/.*[^-\w]([-\w]{25,})[^-\w]?.*/

This LOSES the characteristic of the accepted answer that it will work when passed an ID only, however that's not a use case I require. It works for all different types of Drive, Docs, Sheets URLs for both documents and folders that I've tested.

Belva answered 19/11, 2018 at 3:52 Comment(0)
U
5

An openByUrl method is now available in Google Apps Script.

See ref docs here for Sheets, here for Docs, here for Slides and here for Forms.

Because you wrote:

I want to get the document ID to create a copy on Google Drive

...am assuming you don't need the ID per se. After getting the sheet/doc/slide/form by URL, you can make a copy of it.

Ulmer answered 17/8, 2019 at 5:54 Comment(0)
C
4

There are some more URL extensions not covered above which can contain IDs.

https://drive.google.com/drive/folders/ and https://drive.google.com/open?id= and https://drive.google.com/a/domain.edu.vn/folderview?id=

I thought I'd add my solution which builds on this idea, and covers the above two extensions, as well as the ones using /d/

function getIdFrom(url) {
  var id = "";
  var parts = url.split(/^(([^:\/?#]+):)?(\/\/([^\/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?/);
  if (url.indexOf('?id=') >= 0){
     id = (parts[6].split("=")[1]).replace("&usp","");
     return id;
   } else {
   id = parts[5].split("/");
   //Using sort to get the id as it is the longest element. 
   var sortArr = id.sort(function(a,b){return b.length - a.length});
   id = sortArr[0];
   return id;
   }
 }
Cypress answered 29/10, 2016 at 23:22 Comment(0)
S
3

I just wanted to add the function I created based on the two given answers, because neither was quite what I was looking for.

function templateIdFrom(url) {
  var parts = url.match(/\/d\/(.+)\//);
  if (parts == null || parts.length < 2) {
    return url;
  } else {
    return parts[1];
  }
}

This gets the part after /d/ and up until the next /, which is how the document URLs always contain their IDs. If no match is found for this, then we simply return the original param, which is assumed to be the ID.

Sather answered 13/6, 2016 at 20:45 Comment(0)
S
3

The solution suggested by Henrique may not cover the scenario when a Google Drive file is shared by a Google Workspace user where the domain may be part of the file URL. If the domain name is long, the domain name gets captured instead of the file URL.

https://drive.google.com/a/thisisaverylongdomainname.org/file/d/1djf7XfuKx4Px55x7ahvMa5uznp3Ibe5vd7Y/view?usp=sharing

The File IDs generated by Google Drive do not contain a period (.) so this modified RegEx can prevent the capturing of domain names.

function getFileIdFromDriveUrl(url) {
  var match = url.match(/([a-z0-9_-]{25,})[$/&?]/i);
  return match ? match[1] : null;
}
Selie answered 22/12, 2020 at 13:4 Comment(0)
S
2

For Python:

For fixed lenght google drive IDs you can use this:

regex = "([\w-]){33}|([\w-]){19}"
match = re.search(regex,url)

Google drive use 33 chars for normal drives and 19 chars for team drives

Another approach without using fixed lenght but instead preceding patterns:

regex = "(?<=/folders/)([\w-]+)|(?<=%2Ffolders%2F)([\w-]+)|(?<=/file/d/)([\w-]+)|(?<=%2Ffile%2Fd%2F)([\w-]+)|(?<=id=)([\w-]+)|(?<=id%3D)([\w-]+)"

match = re.search(regex,url)
Septet answered 21/8, 2020 at 21:4 Comment(0)
M
2

I have a sheet where I put a URL into a cell and have it pulled in the App Script to do something else (like creating docs inside of a folder with content from the sheet).

I just used simple str.split("/") to pull the ID out of it. Sometimes, if the URL includes /u/0/, I just move down the index a bit:

if (sheet.getRange("D2").getValue().split("/")[4] === "u") {
  folderId = sheet.getRange("D2").getValue().split("/")[7];
} else {
  folderId = sheet.getRange("D2").getValue().split("/")[5];
}

though it only works with the two given Google Drive URL formats. I haven't encountered many others but drive.google.com/drive/folders/#folderId and drive.google.com/drive/u/o/folders/#folderId.

Metamer answered 13/8, 2021 at 3:31 Comment(0)
I
1

To extract id from url spreadsheets I use the code below. It works with google spreadsheet and Excel in Drive. Maybe works with other docs too.

function getIdSheetFromUrl_(url)
{
    var id = url.split('id=')[1];
    if(!id)
    {
        id = url.split('/d/')[1];
        id = id.split('/edit')[0]; // here we have the id
    }
    return DriveApp.getFileById(id);
}
Imprint answered 10/9, 2020 at 19:29 Comment(0)
F
1

How about this for Google Apps Script to convert google drive URL to ID?

function testConverter(){
 getFileIdFromUrl("https://drive.google.com/file/d/1NqU78123456IxsnEOMISfjoA3Rgz5gZS/view?usp=share_link");

}

function getFileIdFromUrl(strUrl){

var secondpart = strUrl.split("/d/")[1];
return secondpart.split("/")[0];

}
Falstaffian answered 15/4, 2023 at 5:9 Comment(0)
B
0

If you have the URL to the Google Drive file in a cell in your spreadsheet I believe you can use the following formula to extract the fileID:

=mid({cellAddress},33,33)

Example:

=mid(A2,33,33)

Buckeen answered 21/11, 2020 at 3:17 Comment(0)
P
0

I based my answer on this response, but in a few different languages and simplified.

Regex

I updated the regex used to grab the id to \/d\/([^\/\\]+)(?:\/|$) (Regex101 Example). This new regex has the following parts:

  • /d/, which serves as the anchor at the start of the pattern
  • ([^\/\\]+), the only capture group, which matches every character other than / and \, as many times as possibele. This is the id.
  • (?:\/|$), a non capturing group that matches either the next /, or the end of the url

The main difference is that I excluded the URL separator character (/) from the id, which means a URL where the id is followed by more path segments (/d/id-here/history/12) will stop at the end of id-here. This helps future proof against Google changing how they build the URLs, though it will break if they change the /d/ pattern. I also tweaked it so that if can match a url like go.c/spreadsheet/d/id-here, with no trailing /.

Solution

Regardless of language, with the Regex pattern, the solution looks like the following:

  • define the regex
  • test the URL, or an empty string if the URL is null, against the pattern
  • if the URL matches (note the empty string would not), pull the first (and only) capture group from the URL to get the id
  • if the URL does not match the pattern, return null

This provides a null-safe method that can handle changes in ID length and any additional URL nodes after the ID without issue.

While I could optimize this by checking the URL for null first, short-circuiting the regex check if it's null, I chose not to since this should almost always be called with a non-null value anyway. Further, the performance of a pattern match on an empty string is practically free anyway. A slight performance boost for an edge case isn't worth the additional (admittedly marginal) complexity.

Examples

// javascript, Google App Script
/**
 * @param { string | null } url 
 * @return { string | null } string if id found, else null
 */
function getIdFromUrl(url)
{
  const pattern = /\/d\/([^\/\\]+)(?:\/|$)/i;
  return pattern.test(url || '')
    ? url.match(pattern)[1]
    : null;
}
// typescript
/**
 * gets the ID from a Google document or drive URL
 */
function getIdFromUrl(url: string | null): string | null
{
  const pattern = /\/d\/([^\/\\]+)(?:\/|$)/i;
  return pattern.test(url || '')
    ? url.match(pattern)[1]
    : null;
}
// c#
private static readonly Regex _idFromUrlPattern = new Regex(
    @"\/d\/([^\/\\]+)(?:\/|$)",
    RegexOptions.Compiled | RegexOptions.IgnoreCase);
public static string GetIdFromUrl(string url)
{
    var match = _idFromUrlPattern.Match(url ?? "");
    return match.Success
        ? match.Groups[1].Value
        : null;
}
Passacaglia answered 5/5 at 17:59 Comment(0)
R
0

Here is a better way. It returns String value, unlike Array. Instead of giving an error, it returns null if the id is not found in the link.

function getIdFromUrl(url) {
  return url.match(/[-\w]{25,}/)?.[0];
}
Reichard answered 20/6 at 5:22 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.