Document.querySelector returns null until element is inspected using DevTools

I am attempting to create a Chrome extension that finds "Sponsored" posts on Facebook and removes them.

While doing this, I noticed some rather bizarre behavior of Google Chrome on Facebook.com: certain queries for existing elements (in my case document.querySelector('a[href*="/ads/about"]');) return null. But if you "inspect"-click the element (using the Inspect Tool or CTRL+SHIFT+C), it shows up in DevTools, and running the query again in the console then returns the element. This happens without any scrolling, moving, resizing, or doing anything else to the page.

This can easily be replicated using the instructions above, but for the sake of clarity, I made the following video, which shows exactly this weird behavior:

https://streamable.com/mxsf86

Is this some sort of DOM-querying caching issue? Have you ever encountered anything similar? Thanks

EDIT: the issue has now been reduced to the query returning null up until the element is hovered, and it's not a DevTools-related issue anymore.

Todtoday answered 27/8, 2020 at 17:44 Comment(9)
"and then running the query again in the console will show the element." sounds like the first time you run the code the element is simply not there and you need to wait for it to be added to the DOM.Baresark
After seeing the video: are you sure that link doesn't show up on click or longer mouseover or whatever?Baresark
Perhaps, the div you're trying to get is built using React Portals and placed in another DOM tree.Abfarad
@Baresark I am running the query after like 3-4sec from the actual time I see it on the page... so how could it "not be added to the DOM" yet?Todtoday
@Todtoday $(element).on("mouseover", event => $("<a/>").attr("href", "google.com").appendTo(event.target)) can do it.Baresark
@Baresark the issue is not the "thing that appears on mouseover", it's the actual post itself. The query is for the post itself. The mouseover information is non-relevant for this issue.Todtoday
@Todtoday look, I'm not particularly interested in debugging this but supposedly you are. Yet, you also don't seem to be particularly interested in trying to follow up on my suggestion. So, I don't really know what to do. For the last time - it seems like the element is not there before you query it. It is after you do something with your mouse. Try to find out if the element is there before you do this "something with your mouse". This will allow you to eliminate the mouse stuff as a factor or investigate it further. That's my advice - take it or leave it, don't just bicker with it, though.Baresark
@Baresark you are right! I'm sorry I didn't understand your request the first time, but indeed, the issue has now been reduced to the query returning null up until the element is hovered. I will update the post to reflect this, cheers.Todtoday
The Sponsored is a role="button" with tabindex 0, which reloads content on click and hover. You can see it in the network tab as well. The a is simply not there before that. It does not matter whether you hover it with DevTools open or not.Pantelleria

As already noted, the sponsored links are simply not in the DOM until some mouse event occurs. Once that event fires, the elements are added to the DOM; presumably this is how Facebook avoids people crawling it too easily.

So, if you want to find the sponsored links, you will need to do the following:

  • find out what is the exact event which results in the links being added
  • conduct experiments until you find out how you can programmatically generate that event
  • implement a crawling algorithm that does some scrolling on the wall for a long while and then induces the given event. At that point you might get many sponsored links
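
The second step could be sketched roughly as below. This is only a sketch under assumptions: the helper name pokeAndQuery, the 'mouseover' event type and the 500 ms delay are all hypothetical, and nothing here is confirmed to work on Facebook. Synthetic events also carry isTrusted === false, so a page is free to ignore them.

```javascript
// Hypothetical helper: dispatch a synthetic mouse event on `target`, then
// re-run the query after a short delay so the page has time to react.
// The event type is an assumption; the real trigger has to be found by
// experiment, as described in the list above.
function pokeAndQuery(target, selector, options = {}) {
  const {
    eventType = 'mouseover',        // assumed trigger, not confirmed
    delayMs = 500,                  // arbitrary wait for content to load
    root = target.ownerDocument,    // where to run the query afterwards
    // Injectable so the helper can be exercised without a real DOM:
    createEvent = (type) => new MouseEvent(type, { bubbles: true, cancelable: true }),
  } = options;

  target.dispatchEvent(createEvent(eventType));

  // Resolve with the matching element, or null if nothing appeared.
  return new Promise((resolve) => {
    setTimeout(() => resolve(root.querySelector(selector)), delayMs);
  });
}
```

In a page, this might be called as pokeAndQuery(label, 'a[href*="/ads/about"]').then((link) => console.log(link)), where label is whatever element turns out to react to the event.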

Note: sponsored links are paid for by companies, and they would not be very happy if their ad slots were used up by uninterested bots.

Udder answered 27/8, 2020 at 18:18 Comment(0)

The approach I took to solve this issue is as follows:

// using an IIFE ("Immediately-Invoked Function Expression"):
(function() {
    'use strict';

// using Arrow function syntax to define the callback function
// supplied to the (later-created) mutation observer, with
// two arguments (supplied automatically by that mutation
// observer), the first 'mutationList' is an Array of
// MutationRecord Objects that list the changes that were
// observed, and the second is the observer that observed
// the change:
const nodeRemoval = (mutationList, observer) => {

  // here we use Array.prototype.forEach() to iterate over the
  // Array of MutationRecord Objects, using an Arrow function
  // in which we refer to the current MutationRecord of the
  // Array over which we're iterating as 'mutation':
  mutationList.forEach( (mutation) => {

    // if the mutation.addedNodes property exists and
    // also has a non-falsy length (zero is falsey, numbers
    // above zero are truthy and negative numbers - while truthy -
    // seem invalid in the length property):
    if (mutation.addedNodes && mutation.addedNodes.length) {

        // here we retrieve a list of nodes that have the
        // "aria-label" attribute-value equal to 'Advertiser link':
        mutation.target.querySelectorAll('[aria-label="Advertiser link"]')
          // we use NodeList.prototype.forEach() to iterate over
          // the returned list of nodes (if any) and use (another)
          // Arrow function:
          .forEach(
            // here we pass a reference to the current Node of the
            // NodeList we're iterating over, and use
            // ChildNode.remove() to remove each of the nodes:
            (adLink) => adLink.remove() );
    }
  });
},
      // here we retrieve the <body> element (since I can't find
      // any element with a predictable class or ID that will
      // consistently exist as an ancestor of the ad links):
      targetNode = document.querySelector('body'),

      // we define the types of changes we're looking for:
      options = {
          // we're looking for nodes being added to, or
          // removed from, the observed element:
          childList: true,
          // we're not looking for attribute-changes:
          attributes: false,
          // we also observe all descendants of the target (if
          // this is false, or absent, we look only at
          // changes/mutations on the target element itself):
          subtree: true
},
      // here we create a new MutationObserver, and supply
      // the name of the callback function:
      observer = new MutationObserver(nodeRemoval);

    // here we specify what the created MutationObserver
    // should observe, supplying the targetNode (<body>)
    // and the defined options:
    observer.observe(targetNode, options);

})();

I realise that your question looks for elements matching a different attribute and attribute-value (document.querySelector('a[href*="/ads/about"]')). That selector didn't match anything in my own situation, so I couldn't use it in my code, but it should be as simple as replacing:

mutation.target.querySelectorAll('[aria-label="Advertiser link"]')

With:

mutation.target.querySelector('a[href*="/ads/about"]')

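Though it's worth noting that querySelector() returns only the first node that matches the selector, or null, and its result has no forEach() method; so you will need to null-check the result before calling remove() on it, or use querySelectorAll() with the same selector instead.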
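
A minimal sketch of such a check, assuming a hypothetical helper named removeFirstMatch (it is not part of the code above):

```javascript
// Hypothetical helper: remove the first element under `root` that matches
// `selector`. Returns true if something was removed, false otherwise.
function removeFirstMatch(root, selector) {
  const match = root.querySelector(selector); // an Element, or null
  if (match === null) {
    return false; // nothing matched, so there is nothing to remove
  }
  match.remove();
  return true;
}
```

Inside the observer callback this could be called as removeFirstMatch(mutation.target, 'a[href*="/ads/about"]').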

While that may look like quite a bit of code, without the comments it becomes merely:

(function() {
    'use strict';

const nodeRemoval = (mutationList, observer) => {
  mutationList.forEach( (mutation) => {
    if (mutation.addedNodes && mutation.addedNodes.length) {
        mutation.target.querySelectorAll('[aria-label="Advertiser link"]').forEach( (adLink) => adLink.remove() );
    }
  });
},
      targetNode = document.querySelector('body'),
      options = {
          childList: true,
          attributes: false,
          subtree: true
},
      observer = new MutationObserver(nodeRemoval);

    observer.observe(targetNode, options);

})();
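
One possible refinement, sketched under assumptions rather than tested against Facebook: instead of re-querying the whole mutation.target for every record, the callback could look only at the nodes listed in addedNodes. The helper name collectMatches is hypothetical, and the wiring assumes the script runs after <body> exists.

```javascript
// Hypothetical helper: given the addedNodes of one MutationRecord, return
// every added element that matches `selector`, plus matching descendants.
function collectMatches(addedNodes, selector) {
  const matches = [];
  for (const node of addedNodes) {
    // Skip text and comment nodes; only elements (nodeType 1) can match.
    if (node.nodeType !== 1) continue;
    if (node.matches(selector)) matches.push(node);
    matches.push(...node.querySelectorAll(selector));
  }
  return matches;
}

// Browser-only wiring, skipped where there is no DOM (for example in Node):
if (typeof document !== 'undefined' && typeof MutationObserver !== 'undefined') {
  const observer = new MutationObserver((mutationList) => {
    for (const mutation of mutationList) {
      collectMatches(mutation.addedNodes, '[aria-label="Advertiser link"]')
        .forEach((adLink) => adLink.remove());
    }
  });
  observer.observe(document.body, { childList: true, subtree: true });
}
```

The trade-off is that this only catches elements at the moment they are inserted; the version above, which re-queries mutation.target, also picks up matches that were already present under the target.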

Tallyman answered 27/8, 2020 at 19:6 Comment(0)

I ran into the same issue on Chrome. If it helps anyone, I solved it by accessing the element through the frame that contains it:

window.frames["myframeID"].document.getElementById("myElementID")
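
If the frame is not known in advance, a more general search could look roughly like this. It is a sketch under assumptions: the helper name queryIncludingFrames is hypothetical, it uses querySelector rather than getElementById, and it can only reach same-origin frames, because contentDocument is null for cross-origin ones.

```javascript
// Hypothetical helper: look for `selector` in `doc` first, then in every
// reachable <iframe> document below it, recursively.
function queryIncludingFrames(doc, selector) {
  const direct = doc.querySelector(selector);
  if (direct) return direct;

  for (const frame of doc.querySelectorAll('iframe')) {
    let frameDoc = null;
    try {
      frameDoc = frame.contentDocument; // null for cross-origin frames
    } catch (err) {
      frameDoc = null; // defensive: treat any access error as unreachable
    }
    if (frameDoc) {
      const nested = queryIncludingFrames(frameDoc, selector);
      if (nested) return nested;
    }
  }
  return null; // not found in this document or any reachable frame
}
```

In a page this would be called as queryIncludingFrames(document, '#myElementID').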
Hobbyhorse answered 6/7, 2022 at 13:54 Comment(0)
