How do I extract the text from a docx file using Apps Script?

The files are saved in a Drive folder. I need to send the text content of every .docx file as an API payload. I've tried using Blob, but to no avail. Is there a way to get this done?

Girosol answered 18/12, 2019 at 10:11 Comment(2)
Did your issue get solved? - Illomened
Yeah, your solution did it. - Girosol

If I understand you correctly, you want to send the text content of a docx file that you have in Drive. If that's correct, then you can do the following:

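function docx() {
  var docxId = "your-docx-id"; // ID of the .docx file in Drive
  var docx = DriveApp.getFileById(docxId);
  var blob = docx.getBlob();
  // Requires the Advanced Drive Service (Drive API v2) to be enabled.
  // convert: true makes Drive store the upload as a Google Docs file.
  var file = Drive.Files.insert({}, blob, {convert: true});
  var id = file.id;
  // Open the converted copy and read its plain text.
  var doc = DocumentApp.openById(id);
  var text = doc.getBody().getText();
  return text;
}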

This code uses the Advanced Drive Service (Drive API v2, which has to be enabled for the script project) to create a Google Docs file from the .docx blob via Drive.Files.insert with convert set to true. You can then open the newly created file with DocumentApp and read its content with getText.
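The question asks about every .docx file in a folder, while the function above handles a single hard-coded ID. Here is a minimal sketch of a folder-wide version. The names collectDocxTexts and sendDocxTexts, the extractText callback (any function that takes a file ID and returns its text, such as the conversion above adapted to accept an ID), the endpoint URL, and the payload shape are all assumptions for illustration, not part of any real API you may be calling.

```javascript
/**
 * Collects the text of every .docx file in a Drive folder.
 * `extractText` is a hypothetical callback: (fileId) => string.
 */
function collectDocxTexts(folderId, extractText) {
  var results = [];
  // MimeType.MICROSOFT_WORD is the Apps Script enum value for .docx files.
  var files = DriveApp.getFolderById(folderId)
      .getFilesByType(MimeType.MICROSOFT_WORD);
  while (files.hasNext()) {
    var file = files.next();
    results.push({ name: file.getName(), text: extractText(file.getId()) });
  }
  return results;
}

/**
 * Sends the collected texts as a JSON body.
 * `endpointUrl` and the { documents: [...] } shape are placeholders.
 */
function sendDocxTexts(folderId, endpointUrl, extractText) {
  var payload = { documents: collectDocxTexts(folderId, extractText) };
  return UrlFetchApp.fetch(endpointUrl, {
    method: 'post',
    contentType: 'application/json',
    payload: JSON.stringify(payload)
  });
}
```

Passing the extractor in as a parameter keeps the folder loop separate from the conversion step, so you can swap in whichever conversion works for your project.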

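Bear in mind that this creates a new Docs file every time you run it. To avoid piling up copies, delete the converted file once you have read its text. Note that the API's Files.delete method is exposed as Drive.Files.remove in the Advanced Drive Service.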
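A minimal sketch of the cleanup, assuming the same Advanced Drive Service (v2) setup as above. The function name extractDocxText and its docxId parameter are my own choices for illustration. The try/finally makes sure the temporary copy is removed even if reading the document fails.

```javascript
/**
 * Returns the text of a .docx file stored in Drive and removes the
 * temporary Google Docs copy created during the conversion.
 * Assumes the Advanced Drive Service (Drive API v2) is enabled.
 */
function extractDocxText(docxId) {
  var blob = DriveApp.getFileById(docxId).getBlob();
  // Convert the .docx into a temporary Google Docs file.
  var tempFile = Drive.Files.insert({}, blob, { convert: true });
  try {
    return DocumentApp.openById(tempFile.id).getBody().getText();
  } finally {
    // Files.delete is exposed as Drive.Files.remove in Apps Script.
    // It deletes the copy permanently rather than moving it to the trash.
    Drive.Files.remove(tempFile.id);
  }
}
```

Since the removal is permanent, make sure you only ever pass it the ID of the converted copy, never the ID of the original .docx.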

I hope this helps.

Illomened answered 18/12, 2019 at 11:8
