How to extract URL from Link in Google Sheets using a formula?
Asked Answered
N

18

78

I have copied from a website a series of hyperlinks and pasted them in a google sheet. The values show up as linked text, not hyperlink formulas, and are still linked correctly. For each row, I'm trying to extract the URL ONLY (not the friendly text) and insert it into the adjacent column. How could this be accomplished using a formula?

For example:

=SOMEFUNCTION(cellThatHoldsLink, returnedURLOnly)

This and similar scenarios do not apply because the pasted data are not formulas. I will accept a script (GAS) solution, or any solution for that matter, but would prefer if it could be done using a formula. I have found dozens of HYPERLINK manipulation scripts, but nothing on this particular scenario, or even how to access the property that is holding that url. Thanks.

Noellanoelle answered 5/2, 2016 at 18:10 Comment(8)
Provide a couple of real examples of the content in the cells. Then someone can test the output.Nutt
Sandy, here is an example: docs.google.com/spreadsheets/d/…Noellanoelle
did you orginially paste some links or hyperlinks in that spot, then delete and paste it again?Sullins
I selected the text with a mouse direct from a webpage, right clicked and chose copy, then merely pasted it into the sheet. If there is another way of copying the data by which some needed properties can be transferred, i'm open to that as well.Noellanoelle
yes - i use linkclump every day - its a nifty chrome add on - or if you share the source url from which you got the links - ill show you a cool trick on your sheetSullins
linkclump saves a lot of timeConcurrence
duplicate question here, #28474808 use FORMULATEXT(cellid)Tegantegmen
Interestingly enough, @gaoithe, FORMULATEXT(cellid) won't work for me referencing a cell, even if it only contains just one link (embedded in the text).Depression
S
6

You can use importxml to pull in the whole data table and it's contents with this:

=IMPORTXML(A1,"//tr")

next to it we pulled in the url strings in the tags

=IMPORTXML(A1,"//tr/td[1]/a/@href")

and finally concatenated the strings with the original domain to create your hyperlinks

=ARRAYFORMULA("http://www.bnilouisiana.com/"&INDIRECT("A2:A"&COUNTA(A2:A)))
Sullins answered 6/2, 2016 at 20:53 Comment(6)
I tried this where i just had URL in cell, no hyperlink formula, but yes the cell is a link. I pasted above in to new sheet and it imported all the data but just with the string 'url' not the link. Small example of cells are here docs.google.com/a/fuzion.co.nz/spreadsheets/d/…Fiord
oop hit "add comment' to soon. Do you think the above should work in this situation?Fiord
you havent shared the doc @petednz-fuzionSullins
thank you for trying to assist and confirming this is a 'no hoper' as "they are not string urls" to quote from your PM.Fiord
The second formula. But i also came across posts that said that this function sometimes doesn't workCarrell
Are you pulling your data from the exact same source as the op? Cause if it isnt working then thay is because the xpath will vary for each url or website source @andySullins
M
47

After some update in 2020 all codes I have found on the Internet were broken, so here is my contribution:

/** 
 * Returns the URL of a hyperlinked cell, if it's entered with control + k. 
 * Author: @Frederico Schardong based on https://support.google.com/docs/thread/28558721?hl=en&msgid=28927581 and https://github.com/andrebradshaw/utilities/blob/master/google_apps/convertHiddenLinks.gs 
 * Supports ranges
 */
function linkURL(reference) {
  var sheet = SpreadsheetApp.getActiveSheet();
  var formula = SpreadsheetApp.getActiveRange().getFormula();
  var args = formula.match(/=\w+\((.*)\)/i);
  try {
    var range = sheet.getRange(args[1]);
  }
  catch(e) {
    throw new Error(args[1] + ' is not a valid range');
  }
  
  var formulas = range.getRichTextValues();
  var output = [];
  for (var i = 0; i < formulas.length; i++) {
    var row = [];
    for (var j = 0; j < formulas[0].length; j++) {
      row.push(formulas[i][j].getLinkUrl());
    }
    output.push(row);
  }
  return output
}
Means answered 28/10, 2020 at 2:2 Comment(7)
This is great! Thanks a lot! I still have an edge case to solve, I've multiple links in the same "formula" (formulas[0].length == 1). Do you know how to solve that by any chance?Curb
Finally found the solution, using getRuns() made the job. My "specific" script is here: gist.github.com/glureau/833d83099cb74efc937f9afddba2cca4 you'll just have to adapt the format wantedCurb
For embedded URLs (that are not like HYPERLINK=...) I used this add-on extract-urls.contributor.pw first to convert them and then this linkURL to get the actual URLStoriette
@Kikiwa's gist works the best for me in 2021Marcel
And if you want to just know if your cell have an hyperlink or not, the return have to be return output[0][0] !== nullPyridine
Thanks for the script, but regex in the script does not take into account Cyrillic symbols and parameters in the url. Two examples below: --- 1. [link] (артсайд.рф/catalog/cityboard/billboard827) 2. [link] (sales.russoutdoor.ru/sales/billboards/MSBB18866Б1/…) --- Can you, plz, modify it, @Levi-op?Salve
Another solution: support.google.com/docs/thread/…Peacoat
U
31

If your hyperlink is specified in another cell as a formula—for example let's suppose that cell A1 contains the formula =HYPERLINK("https://www.wikipedia.org/","Wikipedia"), you can extract the Link text using a regular expression. All you need to do is:

=REGEXEXTRACT(FORMULATEXT(A1),"""(.+)"",")

This formula will yield the result:

https://www.wikipedia.org/

No custom functions required.

Unwholesome answered 15/12, 2019 at 20:58 Comment(6)
=REGEXEXTRACT(FORMULATEXT(A1),"""(.+?)""") is better because link text is an optional argument and there may be no comma, and also REGEXEXTRACT returns only the first matchBregenz
Brilliant it works very well when the hyperlink was created with =hyperlink("https ...," ...") but unfortunately, this doesn't work when the hyperlink was pasted from another place. Is it possible to convert a pasted hyperlink to a hyperlink that uses the formula =hyperlink() ? With a macro perhaps?Converted
1. the above =REGEXEXTRACT() is fantastic for the =hyperlink() extractions, thank you @Jordan. 2 @AdolfoCorrea - did you every get an reply/answer/feedback to your above question? I too am interested.Tridimensional
@JohnnyUtahh I haven't found a way to workaround that problem yet. Please if you do, comment here. I'll do the same if I find a way to solve this problemConverted
a bit late to the party, but if you copied many links at once from a site straight into a spreadsheet, you can use this extension on firefox or chrome instead addons.mozilla.org/en-US/firefox/addon/copy-selected-linksThanet
Thank you x 100! I just spent way too many hours trying all sorts of tutorials online to have them all fail and ran into this. Thank you!!Cambell
F
27

This can be done for links auto created by pasting them in or with the link button button by going to Tools -> Script editor and creating the following script:

function GETLINK(input){

return SpreadsheetApp.getActiveSheet().getRange(input).getRichTextValue().getLinkUrl();

}

If cell A1 has the link you will need to reference it as =GETLINK("A1") That's not going to update dynamically if you have a lot of links so use =GETLINK(cell("Address",A1)) to get around that.

Credit to morrisjr1989 on reddit.

Fantastic answered 22/4, 2021 at 4:52 Comment(4)
This GETLINK is the only thing that helped me today. However, with my linked values all in column B, I ended up using =GETLINK("B"&ROW()). (That after wondering for a long time, why, e.g., =GETLINK(B2) wouldn't want to work directly.)Kibler
This raises an exception :( Exception: Argument cannot be null: a1Notation GETLINK @ Code.gs:3Storiette
@Storiette A1 is just an example. You need to use the address of the cell that has the link.Fantastic
=GETLINK("B"&ROW()) worked for me. I found the script editor by going to extensions -> Apps Script and pasting the script there.Subduct
S
7

The built-in SpreadsheetApp service doesn't seem to support pulling such URLs out, but the “Advanced” Sheets service does.

Enable the Advanced Sheets service according to Google's instructions, and then try this code:

function onOpen() {
  var menu = SpreadsheetApp.getUi().createMenu("Extract URLs");
  menu.addItem("Process =EXTRACT_URL(A1) formulas", "processFormulas");
  menu.addToUi();
}

function EXTRACT_URL() {
  return SpreadsheetApp.getActiveRange().getFormula();
}

function processFormulas() {
  var spreadsheet = SpreadsheetApp.getActiveSpreadsheet();
  var sheet = spreadsheet.getActiveSheet();
  var rows = sheet.getDataRange().getFormulas();
  for (var r = 0; r < rows.length; r++) {
    var row = rows[r];
    for (var c = 0; c < row.length; c++) {
      var formula = row[c];
      if (formula) {
        var matched = formula.match(/^=EXTRACT_URL\((.*)\)$/i);
        if (matched) {
          var targetRange = matched[1];
          if (targetRange.indexOf("!") < 0) {
            targetRange = sheet.getName() + "!" + targetRange;
          }
          var result = Sheets.Spreadsheets.get(spreadsheet.getId(), {
            ranges: targetRange,
            fields: 'sheets.data.rowData.values.hyperlink'
          });
          try {
            var value = result.sheets[0].data[0].rowData[0].values[0].hyperlink;
            sheet.getRange(r + 1, c + 1).setValue(value);
          } catch (e) {
            // no hyperlink; just ignore
          }
        }
      }
    }
  }
}

This creates a custom function called EXTRACT_URL, which you should call with a reference to the cell that contains the link; for example, =EXTRACT_URL(B3).

Unfortunately it doesn't work immediately, because the Advanced Sheets service can't be used directly by custom functions. So this script adds a menu called “Extract URLs” to the spreadsheet menu bar, with one menu item labeled “Process =EXTRACT_URL(A1) formulas”. When you click it, it will replace all uses of the EXTRACT_URL function with the URL itself.

Scag answered 22/3, 2018 at 6:43 Comment(3)
This works, but I get an error "Insufficient tokens for quota 'ReadGroup' and limit 'USER-100s' of service 'sheets.googleapis.com'" if processing too many at once. Also worth noting this is a one-off operation as it replaces the formula with the result.Tella
Yes this works! But is very slow because of the quota problem already mentioned. How can it be solved?Empire
@JoeErcolino You can speed it up by updating the code to take multiple values and then write to the sheet once. Doing so, I was able to get, apply additional processing, and print 184 values in 0.793 seconds.Outtalk
S
6

You can use importxml to pull in the whole data table and it's contents with this:

=IMPORTXML(A1,"//tr")

next to it we pulled in the url strings in the tags

=IMPORTXML(A1,"//tr/td[1]/a/@href")

and finally concatenated the strings with the original domain to create your hyperlinks

=ARRAYFORMULA("http://www.bnilouisiana.com/"&INDIRECT("A2:A"&COUNTA(A2:A)))
Sullins answered 6/2, 2016 at 20:53 Comment(6)
I tried this where i just had URL in cell, no hyperlink formula, but yes the cell is a link. I pasted above in to new sheet and it imported all the data but just with the string 'url' not the link. Small example of cells are here docs.google.com/a/fuzion.co.nz/spreadsheets/d/…Fiord
oop hit "add comment' to soon. Do you think the above should work in this situation?Fiord
you havent shared the doc @petednz-fuzionSullins
thank you for trying to assist and confirming this is a 'no hoper' as "they are not string urls" to quote from your PM.Fiord
The second formula. But i also came across posts that said that this function sometimes doesn't workCarrell
Are you pulling your data from the exact same source as the op? Cause if it isnt working then thay is because the xpath will vary for each url or website source @andySullins
S
4

So, with a bit of research, I can see that there are 2 types of links in google sheets

  1. hyperlinks (formula like =HYPERLINK=("www.google.com"))
  2. "embedded" links (where you can't see or edit the formula)

For the first kind you can use @frederico-schardong 's linkURL directly.

For the 2nd kind you can first use the extract URLs add-on and then linkURL.

Storiette answered 8/7, 2021 at 10:1 Comment(1)
The Extract URLS addon for google sheets help me extract the urls where i cant see them. Thank you very much.Forbiddance
T
2

If you happy to use Google Apps Script then use below function to get the hyperlink from a text. When you pass the cell, you should send with double quote. Eg: =GETURL("A4") to get the A4 hyperlink.

function GETURL(input) {
  var range = SpreadsheetApp.getActiveSheet().getRange(input);
  var url = /"(.*?)"/.exec(range.getFormulaR1C1())[1];
  return url;
}

Refer here for example.

Edit: Ignore this answer. This will only work if url is linked in cell.

Twain answered 6/2, 2016 at 13:25 Comment(7)
This wont work if you look at his example - the links were not hyperlink formulas - it is oddly just the text with the links but no visible "link formula"Sullins
Ohh! got it! sorry this will not work for his example.Twain
@Noellanoelle i commented above, if you share a sample link from where your grabbing the urls from i will show you the formula to automatically just pull all those urls in via xpathSullins
Aurielle, here's a sample source sheet from where urls would come from. bnilouisiana.com/… Currently, i'm getting them into a google sheet using the IMPORTHTML formula. For now, i'm using iwebtool.com. it's a manual solution, but beats processing each URL individually.Noellanoelle
@Noellanoelle added a sheet to your sample doc - and put in there the xml extractions - I can only assume that it is not only one page you want to extract - if you have a second link so I can see what the url pattern is we can automatically pull in all the pages for youSullins
also added 4 comments over the formulas for clarificationSullins
This greatly works for cells that contain the HYPERLINK formula. Thanks!Karole
M
2

Found an answer that works on Google Groups (by Troy):

  1. Publish your spreadsheet (copy the url)
  2. Create a new spreadsheet and use IMPORTXML function (replace the url with the url you get in step 1)
  3. You will get the data - you can then copy and paste the values where you need them
  4. Unpublish the spreadsheet (if you do not want it public)

Could be done by Script, but I have no time now. :)

Mercantilism answered 13/8, 2017 at 15:54 Comment(0)
D
1

Updated for 2022:

  1. Record a dummy macro, save it as "ExtractLinks".
  2. Then edit the macro to get to the script editor.
  3. Then paste the following and save:
function ExtractLinks() {
  var spreadsheet = SpreadsheetApp.getActiveSpreadsheet();
  var sheet = spreadsheet.getActiveSheet();
  var rows = sheet.getActiveRange().getValues();
  var z = sheet.getActiveRange().getRowIndex();
  var s = sheet.getActiveRange().getColumn();

  for (var r = 0; r < rows.length; r++) {
    var row = rows[r];
    for (var c = 0; c < row.length; c++) {
      var val = row[c];
      if (val) {
        var targetRange = sheet.getRange(r+z, c+s).getA1Notation();
        var result = Sheets.Spreadsheets.get(spreadsheet.getId(), {
          ranges: sheet.getName() + "!" + targetRange,
          fields: 'sheets.data.rowData.values'
        });
        if (result.sheets[0].data[0].rowData[0].values[0].hyperlink) {
          var url = result.sheets[0].data[0].rowData[0].values[0].hyperlink;
          var text = result.sheets[0].data[0].rowData[0].values[0].effectiveValue.stringValue;
          sheet.getRange(r + z, c + s).setValue(''+url);
        }
      }
    }
  }
};
  1. You can run the macro from Extensions > Macros > ExtractLinks

This is an adaptation of Natso's code, and will work for a range.

Dicta answered 22/1, 2022 at 4:32 Comment(1)
Two additional thoughts: Be sure to have the Sheets API turned on. And watch out for requests per minute quota limitations.Cellar
N
1

I got a pretty indirect method to extract all url. Export the column of embedded url titles in html file. Then online URL extractor would do the job.

Neuro answered 18/4, 2023 at 7:27 Comment(0)
P
0

I was able to solve this for Jotform in a very simple way.

I was looking to include the Edit link in a query, but it would only give me the text "Edit Submission."

However, I noticed that I had the form ID in Column R. I was then able copy the JotForm link and combine it with the cell reference "https://www.jotform.com/edit/"&R2

Pinnule answered 15/1, 2019 at 18:22 Comment(0)
C
0

Ryan Tarpine's Example helped a lot. Thanks!

With the code below, you can replace all embedded links by standard HYPERLINK formulas within a selected Range. Please note, that the Advanced Sheets Service must be activated.

function embeddedURLsToHyperlink() {
  var spreadsheet = SpreadsheetApp.getActiveSpreadsheet();
  var sheet = spreadsheet.getActiveSheet();
  var rows = sheet.getActiveRange().getValues();
  var z = sheet.getActiveRange().getRowIndex();
  var s = sheet.getActiveRange().getColumn();

  for (var r = 0; r < rows.length; r++) {
    var row = rows[r];
    for (var c = 0; c < row.length; c++) {
      var val = row[c];
      if (val) {
        var targetRange = sheet.getRange(r+z, c+s).getA1Notation();
        var result = Sheets.Spreadsheets.get(spreadsheet.getId(), {
          ranges: sheet.getName() + "!" + targetRange,
          fields: 'sheets.data.rowData.values'
        });
        if (result.sheets[0].data[0].rowData[0].values[0].hyperlink) {
          var url = result.sheets[0].data[0].rowData[0].values[0].hyperlink;
          var text = result.sheets[0].data[0].rowData[0].values[0].effectiveValue.stringValue;
          sheet.getRange(r + z, c + s).setFormula('=HYPERLINK("' + url + '","' + text + '")');
        }
      }
    }
  }
}

I you want to process the full sheet, replace lines 4-6 by the following code:

var rows = sheet.getDataRange().getValues();
var z = 1;
var s = 1;
Canoodle answered 4/6, 2020 at 15:25 Comment(0)
E
0

You can create a macro "ExtractURLs", then edit it with the following code which parses consistent text style blocks in the active cell, tries to retrieve URLs, and pastes them into a neighbor cell.


    function ExtractURLs() {
      
      var spreadsheet = SpreadsheetApp.getActive();
    
      var richTextCell = SpreadsheetApp.getActiveRange().getRichTextValue();
    
      var richTextStrings = richTextCell.getRuns();
      
      var linksStr = "";
      var linkCell = spreadsheet.getCurrentCell()
      var link = "";
      var richTextString = "";
    
      for (var x=0; x < richTextStrings.length; x++)
      {
        richTextString = richTextStrings[x].getText();
        
        Logger.log(richTextString);
    
        link = richTextStrings[x].getLinkUrl();
        Logger.log(link);
    
        if (link != null)
        {
         linksStr += link;
         linksStr += ', '; 
        }  
      }  
    
      var targetCell = linkCell.offset(0, 1).activate();
      targetCell.setValue(linksStr.substr(0,linksStr.length-2));
    };

Evertor answered 10/11, 2021 at 15:12 Comment(0)
T
0

I used this macros (Extensions > Macros) to automatically extract links from each cell in a defined range (1 column only)

function extractHyperlinks() {
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
  var range = sheet.getActiveRange();
  var values = range.getRichTextValues();

  var extractedLinks = [];
  var extractedText = [];

  values.forEach(function(row) {
    row.forEach(function(cell) {
      var hyperlink = cell.getLinkUrl();
      var text = cell.getText();
      
      if (hyperlink != null) {
        extractedLinks.push([text, hyperlink]);
        extractedText.push([text]);
      }
    });
  });
  
  if (extractedLinks.length > 0) {
    var outputSheet = createOutputSheet();
    outputSheet.getRange(1, 1, extractedLinks.length, 2).setValues(extractedLinks);
    SpreadsheetApp.getActiveSpreadsheet().setActiveSheet(outputSheet);
  } else {
    Browser.msgBox("No hyperlinks found in the selected range.");
  }
}

function createOutputSheet() {
  var spreadsheet = SpreadsheetApp.getActiveSpreadsheet();
  var sheetName = "Extracted Links";
  
  // Check if a sheet with the same name already exists
  var existingSheet = spreadsheet.getSheetByName(sheetName);
  if (existingSheet != null) {
    // Generate a unique name for the new sheet
    var timestamp = new Date().getTime();
    sheetName = "Extracted Links (" + timestamp + ")";
  }
  
  // Create a new sheet
  var outputSheet = spreadsheet.insertSheet(sheetName);
  outputSheet.getRange(1, 1).setValue("Text");
  outputSheet.getRange(1, 2).setValue("Link");
  
  return outputSheet;
}

Save the macros, define a range to extract, and use the macros. It will automatically create a new cell with original texts and hyperlinks.

Tetrahedron answered 11/7, 2023 at 9:6 Comment(0)
H
0

Despite the fact that OP appears to have already selected a solution, this is for those who are having trouble getting the "GETLINK()" script to work when:

  • The GETLINK() custom google apps script function returns null despite knowing that a cell contains hyperlinks from richtext.

I followed the instructions to create a custom "GETLINK()" function from Tyler Robertson's web post, but it always returned NULL.

His original solution and those posted here didn't work for my case because my cell's rich text had multiple formats in it resulting in a portion of the text (what google documentation calls a "run") having no hyperlink and another run actually containing a hyperlink.

So, the original solution by Tyler Robertson would always return null even though a hyperlink existed in the richtext.

After digging through googles documents for too long, I came to this solution:

function GETLINK(input){
  //input = "'Sheet1'!$A$63"
  var sheet = SpreadsheetApp.getActiveSheet();
  var range = sheet.getRange(input);
  Logger.log(range.getRichTextValue())
  Logger.log(input + " /n" + range.getA1Notation());
  
  var rtRuns = range.getRichTextValue().getRuns();
  
  var url = "";

  for (let i = 0; i < rtRuns.length; i++){
    if(rtRuns[i].getLinkUrl()!==null){ 
      url=rtRuns[i].getLinkUrl();
      break;
    }
  };

  Logger.log(rtRuns  +  " / " + url);

  return url;
}

function sheetName() {
  return SpreadsheetApp.getActiveSpreadsheet().getActiveSheet().getName();
}

And here is an example of the function in use:

getlink( "'" & sheetname() & "'!" & cell("address", offset($A62,1,0) ))

Which evaluates to this:

getlink('Sheet1'!A63)

Finally, here is a screenshot of it in use for those who want to see: enter image description here

Hephzipa answered 6/3, 2024 at 17:32 Comment(0)
G
0

You should simpler way:

  1. Download the sheet as CSV
  2. Don't do anything on the CSV file
  3. Create a new sheet and import the CSV file

That's it!

Gettysburg answered 17/4, 2024 at 13:31 Comment(0)
P
-1

Easiest for pasted links: =getlink(cell("Address",A2))

with

function GETLINK(input){

return SpreadsheetApp.getActiveSheet().getRange(input).getRichTextValue().getLinkUrl();

}

Pandolfi answered 26/3, 2023 at 18:6 Comment(0)
A
-3

Quick way to do it If cell rows are limited --->

  1. just right click cell and click edit link
  2. Remove Text and Click Apply
  3. Use f4 for every cell that follow

Please if you have more to this or can be done with selection of cell. Let me know

Anticyclone answered 5/8, 2019 at 9:5 Comment(2)
But then you also lose the link textGyre
the main problem here is that the F4 overwrites the first url address over the current cell instead of removing the link name as it was desired.Converted

© 2022 - 2025 — McMap. All rights reserved.