Chrome Extension: Get Page Variables in Content Script
E

11

42

Is there any way to retrieve a page's JavaScript variables from a Google Chrome content script?

Extroversion answered 17/10, 2010 at 23:35 Comment(1)
T
85

If you really need to, you can insert a <script> element into the page's DOM; the code inside your <script> element will be executed and that code will have access to JavaScript variables at the scope of the window. You can then communicate them back to the content script using data- attributes and firing custom events.

Sound awkward? Why yes, it is, and intentionally so for all the reasons in the documentation that serg has cited. But if you really, really need to do it, it can be done. See here and here for more info. And good luck!
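A minimal sketch of the data-attribute plus event hand-off described above. It assumes the page-side half reaches the page some other way than inline text (for example from a file bundled with the extension), since a comment below reports that inline injection is now refused by the Content Security Policy. The function names, the attribute name and the event name are invented for illustration, and `root` stands for any element both sides can see, such as `document.documentElement`.

```javascript
// Page side: runs in the page's own JavaScript context, so it can see page globals.
// Stores the value as JSON on a shared element, then signals that it is ready.
function publishPageValue(root, eventName, attrName, value) {
  const text = JSON.stringify(value); // undefined for undefined values and functions
  if (text !== undefined) {
    root.setAttribute(attrName, text);
  }
  root.dispatchEvent(new Event(eventName));
}

// Content-script side: waits for the signal, reads the attribute, then cleans up.
function listenForPageValue(root, eventName, attrName, onValue) {
  function handler() {
    const raw = root.getAttribute(attrName);
    root.removeAttribute(attrName);
    root.removeEventListener(eventName, handler);
    // A missing attribute means the page had nothing serializable to report.
    onValue(raw === null ? undefined : JSON.parse(raw));
  }
  root.addEventListener(eventName, handler);
}
```

In the content script you would call `listenForPageValue(document.documentElement, 'myext-value', 'data-myext-value', callback)` before the page-side code runs, and the page-side code would call `publishPageValue(document.documentElement, 'myext-value', 'data-myext-value', window.someVariableName)`. Only JSON-serializable values survive the trip.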

Thong answered 19/10, 2010 at 6:0 Comment(3)
Ok that's better answer than mine. - Sternforemost
Thanks!! Such a simple and obvious need and so deeply hidden. Both accessing the page variables and communicating back with extension. - Dumbstruck
2024 note: This no longer works. It errors "Refused to execute inline script because it violates the following Content Security Policy directive." - Coorg
J
27

I created a little helper method, have fun :)

To retrieve the window variables "lannister", "always", "pays", "his", "debts", call the following:

var windowVariables = retrieveWindowVariables(["lannister", "always", "pays", "his", "debts"]);
console.log(windowVariables.lannister);
console.log(windowVariables.always);

my code:

function retrieveWindowVariables(variables) {
    var ret = {};

    var scriptContent = "";
    for (var i = 0; i < variables.length; i++) {
        var currVariable = variables[i];
        scriptContent += "if (typeof " + currVariable + " !== 'undefined') $('body').attr('tmp_" + currVariable + "', " + currVariable + ");\n"
    }

    var script = document.createElement('script');
    script.id = 'tmpScript';
    script.appendChild(document.createTextNode(scriptContent));
    (document.body || document.head || document.documentElement).appendChild(script);

    for (var i = 0; i < variables.length; i++) {
        var currVariable = variables[i];
        ret[currVariable] = $("body").attr("tmp_" + currVariable);
        $("body").removeAttr("tmp_" + currVariable);
    }

    $("#tmpScript").remove();

    return ret;
}

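Please note that I used jQuery; you can easily use the native JS "removeAttribute" and "removeChild" instead. The injected snippet also calls $, which resolves to the page's own jQuery, so the page itself must load jQuery for this to work.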

Jaejaeger answered 21/6, 2014 at 17:35 Comment(5)
Is this answer applicable in the context of a content script? - Extroversion
yes, it is.. just use it straight from your plugin js - Jaejaeger
perfect solution, Thx - Incognito
great, but this is not going to work with objects, check out the corrected version of this, scroll bottom posted by user "Taras" - Gherkin
This won't work if you want access to something that can't be serialized. - Kellyekellyn
S
20

Building on Liran's solution, here is a version fixed to handle objects:

function retrieveWindowVariables(variables) {
    var ret = {};

    var scriptContent = "";
    for (var i = 0; i < variables.length; i++) {
        var currVariable = variables[i];
        scriptContent += "if (typeof " + currVariable + " !== 'undefined') $('body').attr('tmp_" + currVariable + "', JSON.stringify(" + currVariable + "));\n"
    }

    var script = document.createElement('script');
    script.id = 'tmpScript';
    script.appendChild(document.createTextNode(scriptContent));
    (document.body || document.head || document.documentElement).appendChild(script);

    for (var i = 0; i < variables.length; i++) {
        var currVariable = variables[i];
        ret[currVariable] = $.parseJSON($("body").attr("tmp_" + currVariable));
        $("body").removeAttr("tmp_" + currVariable);
    }

    $("#tmpScript").remove();

    return ret;
}
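One way to sidestep the "Converting circular structure to JSON" error reported in the comments is to pass JSON.stringify a replacer that remembers the objects it has already visited. This is only a sketch: the name `safeStringify` and the `'[Circular]'` marker are made up, and the helper would have to be defined inside the injected script text, because that is where the stringify call runs.

```javascript
// Hypothetical helper: like JSON.stringify, but an object that has already been
// visited is replaced with a marker string instead of being walked again.
// That covers cycles, and it also marks harmless repeated references.
function safeStringify(value) {
  const seen = new WeakSet();
  return JSON.stringify(value, function (key, val) {
    if (typeof val === 'object' && val !== null) {
      if (seen.has(val)) return '[Circular]';
      seen.add(val);
    }
    return val;
  });
}
```

The content script can still parse the result with `JSON.parse`, but the marked branches arrive as plain strings, so this trades completeness for not throwing.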
Shamrao answered 6/5, 2016 at 7:58 Comment(2)
I got an error when I use JSON.stringify(" + currVariable + ") like TypeError: Converting circular structure to JSON - Towns
If you try to use this in 2023, you'll be greeted with Refused to execute inline script because it violates the following Content Security Policy directive: "script-src 'self' 'wasm-unsafe-eval' 'inline-speculation-rules' - Superlative
E
11

Chrome's documentation gives you a good starting point: https://developer.chrome.com/extensions/content_scripts#host-page-communication

This method lets you extract a global page variable into your content script. It also uses a handshake so that the listener only accepts incoming messages it recognizes. You could just use Math.random() for the handshake, but I was having some fun.

Explanation

  1. This method creates a script tag
  2. It stringifies the function propagateVariable and passes the current handShake and targeted variable name into the string for preservation since the function will not have access to our content script scope.
  3. Then it injects that script tag to the page.
  4. We then create a listener in our content script waiting to hear back from the page to pass back the variable we're after.
  5. By now the injected script has hit the page.
  6. The injected code was wrapped in an IIFE so it runs itself pushing the data to the listener.
  7. Optional: The listener checks that the message carries the correct handshake before using the data. (It's not actually secure, because the token is visible to the page, but it gives us an identifier and some level of trust.)

Round 1

v1.0

const globalToExtract = 'someVariableName';
const array = new Uint32Array(5);
const handShake = window.crypto.getRandomValues(array).toString();

function propagateVariable(handShake, variableName) {
  const message = { handShake };
  message[variableName] = window[variableName];
  window.postMessage(message, "*");
}

(function injectPropagator() {
  const script = `( ${propagateVariable.toString()} )('${handShake}', '${globalToExtract}');`
  const scriptTag = document.createElement('script');
  const scriptBody = document.createTextNode(script);
  
  scriptTag.id = 'chromeExtensionDataPropagator';
  scriptTag.appendChild(scriptBody);
  document.body.append(scriptTag);
})();

window.addEventListener("message", function({data}) {
  console.log("INCOMINGGGG!", data);
  // We only accept messages from ourselves
  if (data.handShake != handShake) return;

  console.log("Content script received: ", data);
}, false);

v1.1 With Promise!

function extractGlobal(variableName) {

  const array = new Uint32Array(5);
  const handShake = window.crypto.getRandomValues(array).toString();

  function propagateVariable(handShake, variableName) {
    const message = { handShake };
    message[variableName] = window[variableName];
    window.postMessage(message, "*");
  }

  (function injectPropagator() {
    const script = `( ${propagateVariable.toString()} )('${handShake}', '${variableName}');`
    const scriptTag = document.createElement('script');
    const scriptBody = document.createTextNode(script);

    scriptTag.id = 'chromeExtensionDataPropagator';
    scriptTag.appendChild(scriptBody);
    document.body.append(scriptTag);
  })();

  return new Promise(resolve => {
    window.addEventListener("message", function({data}) {
      // We only accept messages from ourselves
      if (data.handShake != handShake) return;
      resolve(data);
    }, false);
  });
}

extractGlobal('someVariableName').then(data => {
  // Do Work Here
});

Round 2 - Class & Promises

v2.0

I would recommend tossing the class into its own file and exporting it as a default if using es modules. Then it simply becomes:

new ExtractPageVariable('someGlobalPageVariable').data.then(pageVar => {
  // Do work here 💪
});

class ExtractPageVariable {
  constructor(variableName) {
    this._variableName = variableName;
    this._handShake = this._generateHandshake();
    this._inject();
    this._data = this._listen();
  }

  get data() {
    return this._data;
  }

  // Private

  _generateHandshake() {
    const array = new Uint32Array(5);
    return window.crypto.getRandomValues(array).toString();
  }

  _inject() {
    function propagateVariable(handShake, variableName) {
      const message = { handShake };
      message[variableName] = window[variableName];
      window.postMessage(message, "*");
    }

    const script = `( ${propagateVariable.toString()} )('${this._handShake}', '${this._variableName}');`
    const scriptTag = document.createElement('script');
    const scriptBody = document.createTextNode(script);

    scriptTag.id = 'chromeExtensionDataPropagator';
    scriptTag.appendChild(scriptBody);
    document.body.append(scriptTag);
  }

  _listen() {
    return new Promise(resolve => {
      window.addEventListener("message", ({data}) => {
        // We only accept messages from ourselves
        if (data.handShake != this._handShake) return;
        resolve(data);
      }, false);
    })
  }
}

const windowData = new ExtractPageVariable('somePageVariable').data;
windowData.then(console.log);
windowData.then(data => {
   // Do work here
});
Epizootic answered 30/3, 2019 at 12:55 Comment(3)
Excellent answer, clearly written, and great to see the process as well as the final result. And it works fantastically, to boot. All-star! - Intranuclear
@GiffordN. I was considering wrapping this up into a reusable library and throwing it on npm. - Epizootic
Unfortunately, this does not work with Firefox. I tested with Firefox Developer Edition 96. It says the object is undefined although it is defined when I paste the object name in Debugger. EDIT: It does work, but variables can't have the window prefix. After I put there simply the variable name, it works. Thank you! - Flosser
F
5

As explained partially in other answers, the JS variables from the page are isolated from your Chrome extension content script. Normally, there's no way to access them.

But if you inject a JavaScript tag in the page, you will have access to whichever variables are defined there.

I use a utility function to inject my script in the page:

/**
 * inject - Inject some javascript in order to expose JS variables to our content JavaScript
 * @param {string} source - the JS source code to execute
 * Example: inject('(' + myFunction.toString() + ')()');
 */
function inject(source) {
  const j = document.createElement('script'),
    f = document.getElementsByTagName('script')[0];
  j.textContent = source;
  f.parentNode.insertBefore(j, f);
  f.parentNode.removeChild(j);
}

Then you can do:

function getJSvar(whichVar) {
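  // Look the variable up on window; passing whichVar alone would store its name, not its value
  document.body.setAttribute('data-' + whichVar, window[whichVar]);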
}
inject('(' + getJSvar.toString() + ')("somePageVariable")');

var pageVar = document.body.getAttribute('data-somePageVariable');

Note that if the variable is a complex data type (object, array...), you will need to store the value as a JSON string in getJSvar(), and JSON.parse it back in your content script.
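A hedged sketch of that JSON round trip, with one addition that is not in the answer: the value is wrapped in a small envelope so the content script can tell "the variable was missing or could not be serialized" apart from a real value. The function names and the envelope shape are assumptions made for illustration.

```javascript
// Page side (would be called from inside the injected function):
// always returns a JSON string that is safe to store in an attribute.
function encodeForAttribute(value) {
  let text;
  try {
    text = JSON.stringify(value);
  } catch (err) {
    // e.g. circular structures
    return JSON.stringify({ ok: false, error: String(err.message) });
  }
  if (text === undefined) {
    // JSON.stringify returns undefined for undefined values and functions
    return JSON.stringify({ ok: false, error: 'value is undefined or not serializable' });
  }
  return JSON.stringify({ ok: true, value: value });
}

// Content-script side: getAttribute() returns null when the attribute is absent.
function decodeFromAttribute(text) {
  if (text === null) return { ok: false, error: 'attribute not set' };
  return JSON.parse(text);
}
```

With these, getJSvar() would store `encodeForAttribute(window[whichVar])` and the content script would read `decodeFromAttribute(document.body.getAttribute('data-somePageVariable'))`, then check `.ok` before using `.value`.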

Fatality answered 9/5, 2019 at 18:4 Comment(2)
This is the best option by far - Cardsharp
I like this solution. Thanks for providing it. However, it only worked for me after I replaced document.body.setAttribute('data-'+whichVar,whichVar); with document.body.setAttribute('data-'+whichVar,window[whichVar]);. Before doing so, it was simply setting the property value to a string equal to the property name, such as: my_variable="my_variable" instead of my_variable="the_correct_value". - Edinburgh
P
3

This is way late but I just had the same requirement & created a simple standalone class to make getting variable values (or calling functions on objects in the page) really really easy. I used pieces from other answers on this page, which were very useful.

The way it works is to inject a script tag into the page which accesses the variable you want, then it creates a div to hold the serialised version of the value as innerText. It then reads & deserialises this value, deletes the div and script elements it injected, so the dom is back to exactly what it was before.

    var objNativeGetter = {

        divsToTidyup: [],
        DIVID: 'someUniqueDivId',
        _tidyUp: function () {
            console.log(['going to tidy up ', this.divsToTidyup]);
            var el;
            while(el = this.divsToTidyup.shift()) {
                console.log('removing element with ID : ' + el.getAttribute('id'));
                el.parentNode.removeChild(el);
            }
        },

        // create a div to hold the serialised version of what we want to get at
        _createTheDiv: function () {
            var div = document.createElement('div');
            div.setAttribute('id', this.DIVID);
            div.innerText = '';
            document.body.appendChild(div);
            this.divsToTidyup.push(div);
        },

        _getTheValue: function () {
            return JSON.parse(document.getElementById(this.DIVID).innerText);
        },

        // find the page variable from the stringified version of what you would normally use to look in the symbol table
        // eg. pbjs.adUnits would be sent as the string: 'pbjs.adUnits'
        _findTheVar: function (strIdentifier) {
            var script = document.createElement('script');
            script.setAttribute('id', 'scrUnique');
            script.textContent = "\nconsole.log(['going to stringify the data into a div...', JSON.stringify(" + strIdentifier + ")]);\ndocument.getElementById('" + this.DIVID + "').innerText = JSON.stringify(" + strIdentifier + ");\n";
            (document.head||document.documentElement).appendChild(script);
            this.divsToTidyup.push(script);
        },

        // this is the only call you need to make eg.:
        // var val = objNativeGetter.find('someObject.someValue');
        // sendResponse({theValueYouWant: val});
        find: function(strIdentifier) {
            this._createTheDiv();
            this._findTheVar(strIdentifier);
            var ret = this._getTheValue();
            this._tidyUp();
            return ret;
        }
    };

You use it like this:

chrome.runtime.onMessage.addListener(
    function(request, sender, sendResponse) {

        var objNativeGetter = {
        .... the object code, above
        }

        // do some validation, then carefully call objNativeGetter.find(...) with a known string (don't use any user generated or dynamic string - keep tight control over this)
        var val = objNativeGetter.find('somePageObj.someMethod()');
        sendResponse({theValueYouWant: val});
    }
);
Phial answered 26/3, 2020 at 17:50 Comment(0)
C
2

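I actually worked around it using the localStorage API. Note: to use this, our content script must be able to read the page's localStorage, which content scripts can do by default. In the manifest.json file I also added the "storage" string (strictly, that permission governs chrome.storage, not window.localStorage):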

"permissions": [...,"storage"]

The hijack function lives in the content script:

function hijack(callback) {
    "use strict";
    var code = function() {
      //We have access to topframe - no longer a contentscript          
      var ourLocalStorageObject = {
        globalVar: window.globalVar,
        globalVar2: window.globalVar2
      };
      var dataString = JSON.stringify(ourLocalStorageObject);
      localStorage.setItem("ourLocalStorageObject", dataString);
    };
    var script = document.createElement('script');
    script.textContent = '(' + code + ')()';
    (document.head||document.documentElement).appendChild(script);
    script.parentNode.removeChild(script);
    callback();
  }

Now we can call from the contentscript

document.addEventListener("DOMContentLoaded", function(event) { 
    hijack(callback);
});

or if you use jQuery in your contentscript, like I do:

$(document).ready(function() { 
    hijack(callback);
});

to extract the content:

function callback() {
    var localStorageString = localStorage.getItem("ourLocalStorageObject");
    var ourLocalStorageObject= JSON.parse(localStorageString);

    console.log("I can see now on content script", ourLocalStorageObject);
    //(optional cleanup):
    localStorage.removeItem("ourLocalStorageObject");
}

This can be called multiple times, so if your page changes elements or internal code, you can add event listeners to update your extension with the new data.

Edit: I've added callbacks so you can be sure your data won't be invalid (had this issue myself)

Clam answered 21/6, 2015 at 16:22 Comment(4)
I think this only gives read-only access to simple variables (as supported by JSON) and can't retrieve window, for example. - Mullen
Since we are sharing the same DOM, this is not necessarily true. Here's why: you have the ability to inject script tags at any time during the contentScript's execution to the actual top frame, that can manipulate the frame's internals. The code in your contentScript.js can therefore be used to communicate with the injected code - and it has window's top frame access. You cannot retrieve a reference to the window object, that is correct, but you can establish a protocol for 'talking' to it in order to pass around functions and data. This is a hacky way of doing it, but it is possible. - Clam
You can establish the protocol but you still can't pass around that data (circular references and such). What you can do is inject the actual functions that will interact with the dom, but still that's not what your solution outlines; my comment simply expressed the limitations of your solution. - Mullen
This is a nice hack! - Autoeroticism
M
1

If you know which variables you want to access, you can make a quick custom content-script to retrieve their values.

In popup.js :

chrome.tabs.executeScript(null, {code: 'var name = "property"'}, function() {
    chrome.tabs.executeScript(null, {file: "retrieveValue.js"}, function(ret) {
        for (var i = 0; i < ret.length; i++) {
            console.log(ret[i]); //prints out each returned element in the array
        }
    });
});

In retrieveValue.js :

function returnValues() {
    return document.getElementById("element")[name];
    //return any variables you need to retrieve
}
returnValues();

You can modify the code to return arrays or other objects.

Maryjomaryl answered 19/5, 2017 at 2:3 Comment(0)
L
1

In Manifest V3 there is a notion of an ExecutionWorld: you decide whether your script runs in the isolated world or in the "main world", which is the execution environment shared with the host page's JavaScript.

Specify the world property in Content Script or chrome.scripting like this:

chrome.scripting
  .executeScript({
    target: { tabId: tab.id },
    files: ["index.js"],
    world: "MAIN", // run in the page's own context instead of the isolated world
  })
  .then(() => console.log("script injected"));

Now you can access everything on the page from within index.js
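If the extension only needs to read a value once, `chrome.scripting.executeScript` can also be given a `func` plus `args` and will hand back whatever the function returns. The sketch below assumes the extension declares the "scripting" permission and has host access to the tab; `readPageVariable` is a made-up helper name, and the `scripting` object is passed in as a parameter only so the sketch can be tried with a stub.

```javascript
// Hypothetical helper for a service worker or popup (not a content script).
// `scripting` is expected to be chrome.scripting.
function readPageVariable(scripting, tabId, variableName) {
  return scripting
    .executeScript({
      target: { tabId: tabId },
      world: 'MAIN', // run in the page's own JavaScript context
      // The function is serialized and re-created in the page, so it cannot
      // close over anything here; inputs must go through `args`.
      // Top-level `let`/`const` page variables are not properties of
      // globalThis, so this lookup only finds `var`, function and window.* globals.
      func: function (name) { return globalThis[name]; },
      args: [variableName],
    })
    .then(function (results) {
      // One entry per injected frame; take the first frame's result.
      return results[0].result;
    });
}
```

Usage would look like `readPageVariable(chrome.scripting, tab.id, 'someVariableName').then(console.log)`. The returned value has to be serializable, so DOM nodes, functions and similar objects will not come through intact.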

Lordinwaiting answered 21/5 at 14:7 Comment(0)
S
-1

No.

Content scripts execute in a special environment called an isolated world. They have access to the DOM of the page they are injected into, but not to any JavaScript variables or functions created by the page. It looks to each content script as if there is no other JavaScript executing on the page it is running on. The same is true in reverse: JavaScript running on the page cannot call any functions or access any variables defined by content scripts.

Isolated worlds allow each content script to make changes to its JavaScript environment without worrying about conflicting with the page or with other content scripts. For example, a content script could include JQuery v1 and the page could include JQuery v2, and they wouldn't conflict with each other.

Another important benefit of isolated worlds is that they completely separate the JavaScript on the page from the JavaScript in extensions. This allows us to offer extra functionality to content scripts that should not be accessible from web pages without worrying about web pages accessing it.

Sternforemost answered 17/10, 2010 at 23:41 Comment(2)
reference link would be very helpful - Gooseneck
@Gooseneck Here is the link :-) ~Percolator
T
-1

Works with any JSON-serializable data type. Dates come back as strings and need to be parsed after retrieval.

/**
 * Retrieves page variable or page function value in content script.
 * 
 * Example 1:
 * var x = 'Hello, World!';
 * var y = getPageValue('x'); // Hello, World!
 * 
 * Example 2:
 * function x() { return 'Hello, World!'; }
 * var y = getPageValue('x()'); // Hello, World!
 * 
 * Example 3:
 * function x(a, b) { return a + b; }
 * var y = getPageValue('x("Hello,", " World!")'); // Hello, World!
 */
function getPageValue(code) {
    const dataname = (new Date()).getTime(); 
    const content = `(()=>{document.body.setAttribute('data-${dataname}', JSON.stringify(${code}));})();`;
    const script = document.createElement('script');
    
    script.textContent = content;
    document.body.appendChild(script);
    script.remove();

    const result = JSON.parse(document.body.getAttribute(`data-${dataname}`));
    document.body.removeAttribute(`data-${dataname}`);
   
    return result;
}
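For the Date caveat mentioned at the top of this answer, one option is a reviver passed to JSON.parse. This is a sketch under the assumption that every string shaped like the ISO timestamp produced by `Date.prototype.toJSON` really is a date; `reviveDates` and `ISO_DATE` are invented names.

```javascript
// Matches the format Date.prototype.toJSON emits, e.g. 1970-01-01T00:00:00.000Z
const ISO_DATE = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z$/;

// Reviver for JSON.parse: turns ISO timestamp strings back into Date objects
// and leaves every other value untouched.
function reviveDates(key, value) {
  return typeof value === 'string' && ISO_DATE.test(value) ? new Date(value) : value;
}
```

Inside getPageValue this would mean calling `JSON.parse(text, reviveDates)` instead of `JSON.parse(text)`. Any ordinary string that happens to match the pattern is converted too, which is the price of not carrying type information across.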
Tiga answered 30/4, 2022 at 6:15 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.