Retrieve linked URL from text in a Google Document using Google Apps Script

I'm working with Google Apps Script and I'm trying to retrieve the URL hyperlinked to one word in the text string returned by the GAS function below, but I'm getting the error listed below.

As you can see from my code, I'm a rookie, so any help and 'best practice' is greatly appreciated.

Error Message Returned By GAS IDE

TypeError: Cannot find function getLinkUrl in object HYPERLINK to your “Intro To Google Documents” document. Open your MrBenrudShared folder and create a new blank Google Document. Name it “Your Name: Intro To Google Documents”.. (line 19, file "Code")

GAS Function

function getURLfromHyprlink() {
  var body = DocumentApp.getActiveDocument().getBody();
  Logger.log(body.getNumChildren());

  // The table is body child element #1 of 3.
  var rubricTable = body.getChild(1);
  Logger.log(rubricTable);

  // Find out about row 3 in table
  var studentWorkRow = rubricTable.getChild(2);
  Logger.log(studentWorkRow);

  // Find what is in column2 of hyperlink row
  var studentHyperlinkCell = studentWorkRow.getChild(1);
  Logger.log(studentHyperlinkCell); //tells me it is a table cell

  // Returns text from studentHyperlinkCell
  var hyperlinkText = studentHyperlinkCell.asText().getText();
  var hyperlinkURL = hyperlinkText.getLinkUrl();
  Logger.log(hyperlinkURL);

}

The String Returned By The Above Function

HYPERLINK to your “Intro To Google Documents” document.

Open your MrBenrudShared folder and create a new blank Google Document. Name it “Your Name: Intro To Google Documents”.

The URL is only on the word HYPERLINK, and not on the rest of the string.

The document is here: https://docs.google.com/document/d/18zJMjXWoBNpNzrNuPT-nQ_6Us1IbACfDNXQZJqnj1P4/edit# where you can see the word HYPERLINK, and its hyperlink, in row 3 of the table.

Thanks for your help!

Cards answered 8/9, 2018 at 20:21 Comment(2)
Just asked you to share permission with me on Google. - Nola
Just tried getLinkUrl(0), but it didn't work. - Cards
  • You want to retrieve the URL of a hyperlink within the text of a Google Document.
  • In your situation, the text you want to retrieve is in a table, as can be seen in your shared sample Document.

If my understanding of your question is correct, how about this modification?

Modification points:

  • Retrieve each cell.
  • Retrieve the children of each cell and get the text from each child.
  • In your case, split the text into words.
  • Check each word for a hyperlink and retrieve the link when the word has one.
    • getLinkUrl(offset) is used for this.

The script reflecting the above points is as follows. To use this modified script, copy and paste it into the script editor of your shared Google Document, and run sample().

Modified script:

function sample() {
  var body = DocumentApp.getActiveDocument().getBody();
  var table = body.getTables()[0];
  var rows = table.getNumRows();
  var result = [];
  for (var i = 0; i < rows; i++) {
    var cols = table.getRow(i).getNumCells();
    for (var j = 0; j < cols; j++) {
      var cell = table.getCell(i, j);
      for (var k = 0; k < cell.getNumChildren(); k++) {
        var child = cell.getChild(k).asText();
        var text = child.getText(); // Retrieve text of a child in a cell.
        var words = text.match(/\S+/g); // Split the text into words.
        if (words) {
          var links = words.map(function(e) {return {
            text: text,
            linkedWord: e,
            // Check each word for a link. Note: findText() treats the word as a
            // regular expression and returns its first match in the text.
            url: child.getLinkUrl(child.findText(e).getStartOffset()),
          }}).filter(function(e) {return e.url != null}); // Keep only the words that have a link.
          if (links.length > 0) result.push(links);
        }
      }
    }
  }
  result = Array.prototype.concat.apply([], result); // Flatten the per-child arrays into one list.
  Logger.log(result);
}

Result:

When this script is used for your shared sample Document, the following result is retrieved.

[
  {
    "text": "HYPERLINK  to your “Intro To Google Documents” document. ",
    "linkedWord": "HYPERLINK",
    "url": "https://docs.google.com/document/d/1HDGUxgqZYVQS5b8gLtiQTNumaXRjP2Ao1fHu2EFqn_U/edit"
  },
  {
    "text": "Video",
    "linkedWord": "Video",
    "url": "http://mrbenrud.net/videos/video.php?id=&v=EhnT8urxs_E&title=How to Create a Folder in Google Drive&description="
  },
  {
    "text": "Some instructions will have hyperlinks and other will use different types for formating. ",
    "linkedWord": "hyperlinks",
    "url": "https://docs.google.com/document/d/1tS-Pq2aqG7HpsMA5br2NzrjH9DFdiz9oA0S70vejg4c/edit"
  },
  {
    "text": "Video",
    "linkedWord": "Video",
    "url": "http://mrbenrud.com/index.php/tutorials/project-tutorials/94-how-to-share-a-folder-in-google-drive-with-someone-else-so-they-can-edit-it"
  },
  {
    "text": "Video",
    "linkedWord": "Video",
    "url": "http://mrbenrud.com/index.php/tutorials/project-tutorials/98-how-to-move-a-document-in-google-drive-into-another-folder"
  },
  {
    "text": "Video",
    "linkedWord": "Video",
    "url": "http://mrbenrud.com/index.php/tutorials/project-tutorials/96-how-to-search-for-and-filter-through-images-using-google"
  },
  {
    "text": "Video",
    "linkedWord": "Video",
    "url": "http://mrbenrud.com/index.php/tutorials/project-tutorials/99-how-to-rename-file-on-a-mac-in-osx"
  }
]

Note:

  • This script retrieves all links in the table. If you want the links from a specific cell only, please modify the script accordingly.
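
For a single cell, the same idea could be narrowed down roughly as in the sketch below. It is only an illustration under a few assumptions: getLinksInCell is a hypothetical helper name, not part of the script above, and instead of findText() it takes each word's offset from the regular expression match, so repeated words and words containing regex metacharacters are checked at their own position. The helper relies only on getNumChildren(), getChild(), asText(), getText() and getLinkUrl(offset), so it should accept a TableCell from DocumentApp.

```javascript
// Hypothetical helper (not part of the original answer): collects the linked
// words of one table cell. `cell` is expected to behave like a DocumentApp TableCell.
function getLinksInCell(cell) {
  var links = [];
  for (var k = 0; k < cell.getNumChildren(); k++) {
    var child = cell.getChild(k).asText();
    var text = child.getText();
    var wordPattern = /\S+/g;
    var match;
    // exec() reports where each word starts, so that offset can be passed
    // straight to getLinkUrl(offset).
    while ((match = wordPattern.exec(text)) !== null) {
      var url = child.getLinkUrl(match.index);
      if (url != null) {
        links.push({ text: text, linkedWord: match[0], url: url });
      }
    }
  }
  return links;
}
```

In the script editor this could be called as, for example, Logger.log(getLinksInCell(table.getCell(2, 1))) to inspect row 3, column 2 of the table, which is the cell from the question.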

If I misunderstood your question, I'm sorry.

Serow answered 9/9, 2018 at 2:24 Comment(7)
Tanaike, thank you for working on this!! Two questions: 1st, do I have to break paragraphs into individual words in order for getLinkUrl() to work? 2nd, what if I just wanted to get the linked URL from the first letter in a paragraph string? Shouldn't I be able to use getLinkUrl([0])? - Cards
@Mr. B Thank you for replying. A1: You can also check every character. In your situation, I found that each hyperlink is set on a single word in the text, so I thought that splitting the text into words might be suitable. A2: In that case, for example, you can retrieve it using cell.getChild(k).asText().getLinkUrl(n). In the case of your other question, you can use var paragraphText = paragraph.asText().getLinkUrl(2). - Serow
A2 worked! Thank you SO MUCH! No joke, I've been working on this for over four hours and the only thing I was missing was the .asText() ... Thanks again! - Cards
@Mr. B Welcome. I'm glad your issue was solved. I also learned from your question. Thank you, too. - Serow
Yes. I was actually working on updating the GAS for the question now. How can I give you credit for the answer over there? I wouldn't have it if it wasn't for your help here. - Cards
@Mr. B Thank you for the comment. This question is about retrieving each child from the cells of a table in a Document and retrieving the links. Your other question, on the other hand, is about how to use getLinkUrl(offset). Through this question, you came to understand how to use it. I think that is also an important question, and posting what you learned will be helpful for other users. And I have already got the credit here. Thank you, again. - Serow
@Mr. B Thank you for your response! - Serow
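
As an illustration of the "check every character" option mentioned in A1, here is a minimal sketch. findLinkedRuns is a hypothetical helper name, not something from the answer; it assumes only that the object passed in offers getText() and getLinkUrl(offset), as a Text element obtained through asText() does. It groups consecutive characters that share a link URL into one range.

```javascript
// Hypothetical helper: scans a Text element character by character and groups
// consecutive characters that share the same link URL into runs.
function findLinkedRuns(textElement) {
  var text = textElement.getText();
  var runs = [];
  var current = null;
  for (var offset = 0; offset < text.length; offset++) {
    var url = textElement.getLinkUrl(offset); // null when the character has no link
    if (current && url === current.url) {
      current.endInclusive = offset; // the same link continues
    } else {
      if (current) runs.push(current);
      current = url ? { start: offset, endInclusive: offset, url: url } : null;
    }
  }
  if (current) runs.push(current);
  // Attach the linked text to each run for convenience.
  return runs.map(function (run) {
    return {
      start: run.start,
      endInclusive: run.endInclusive,
      url: run.url,
      linkedText: text.substring(run.start, run.endInclusive + 1)
    };
  });
}
```

In Apps Script the argument would be something like cell.getChild(k).asText(). Note that the offset given to getLinkUrl() is a plain integer such as 0, not an array such as [0].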

This post helped me a lot to understand the mechanics of Google Docs when dealing with hyperlinks.

Here is my solution for a document-wide search & replace:

https://gist.github.com/vladox/f8cd873571ffa8038fb15175a476f20b

var oldLink = "https://your-old-link-here";
var newLink = "https://your-new-link-here";
var documentId = 'YOUR-GOOGLE-DOCUMENT-ID-HERE';
var doc = DocumentApp.openById(documentId);
var searchType = DocumentApp.ElementType.TEXT;

function findAndReplaceLinks() {
  var body = doc.getBody();

  var searchResult = null;
  var searchResultTextElement = null;
  var searchResultText = "";
  // Walk through every TEXT element in the document body.
  while ((searchResult = body.findElement(searchType, searchResult))) {
    searchResultTextElement = searchResult.getElement().asText();
    searchResultText = searchResultTextElement.getText();
    // Logger.log("TEXT: %s", searchResultText);

    var words = searchResultText.match(/\S+/g);
    if (words) {
      words.forEach(function (e) {
        // Escape regex metacharacters, because findText() treats its argument as a pattern.
        e = e.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
        // Logger.log("WORD: %s", e);

        var partialElement = searchResultTextElement.findText(e);
        if (partialElement == null) return; // No match for this word, so skip it.
        var partialElementText = partialElement.getElement().asText();
        var startOffset = partialElement.getStartOffset();
        var endOffsetInclusive = partialElement.getEndOffsetInclusive();
        var partialElementUrl = searchResultTextElement.getLinkUrl(startOffset);
        if (partialElementUrl != null) {
          if (partialElementUrl.includes(oldLink)) {
            var updatedUrl = partialElementUrl.replace(oldLink, newLink);
            Logger.log("REPLACING WORD: %s AT OFFSET: %s", partialElementText.getText().substring(startOffset, endOffsetInclusive + 1), startOffset);
            partialElementText.setLinkUrl(startOffset, endOffsetInclusive, updatedUrl);
          } else if (partialElementUrl.includes(newLink)) {
            Logger.log("ALREADY REPLACED WORD: %s", partialElementUrl);
          }
        }
      });
    }
  }
}
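
A possible variation, sketched here under assumptions rather than taken from the gist: instead of looking each word up again with findText(), which only ever finds the first occurrence of a repeated word, one could walk the character offsets of a Text element and rewrite every contiguous linked range in one call. replaceLinksInText is a hypothetical helper name; it relies only on getText(), getLinkUrl(offset) and setLinkUrl(startOffset, endOffsetInclusive, url), and it returns the number of ranges it changed.

```javascript
// Hypothetical helper: rewrites every contiguous linked range of a Text element
// whose URL contains oldLink. Returns how many ranges were updated.
function replaceLinksInText(textElement, oldLink, newLink) {
  var length = textElement.getText().length;
  var replaced = 0;
  var offset = 0;
  while (offset < length) {
    var url = textElement.getLinkUrl(offset); // null when there is no link here
    // Extend the range while the following characters carry the same URL.
    var end = offset;
    while (end + 1 < length && textElement.getLinkUrl(end + 1) === url) {
      end++;
    }
    if (url && url.indexOf(oldLink) !== -1) {
      textElement.setLinkUrl(offset, end, url.replace(oldLink, newLink));
      replaced++;
    }
    offset = end + 1; // continue after the range just inspected
  }
  return replaced;
}
```

Inside the while loop of findAndReplaceLinks() it might be used as replaceLinksInText(searchResult.getElement().asText(), oldLink, newLink), in place of the word-by-word lookup.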
Spinach answered 8/1, 2022 at 21:19 Comment(0)
