In a Chrome extension, how to send a cross-origin message from a parent content script to a content script in a specific child iframe

I am developing a Chrome extension with a manifest that, for now, enables access to all hosts. The background script injects content scripts into all frames. After the DOM is loaded, the content script in the top page/frame begins to walk the DOM tree. When the walker encounters an iframe, it needs to message the specific content script associated with that iframe's window (possibly cross-origin) to begin its work, and it includes some serialized data with this message. The parent window suspends execution and waits for the child to complete its walk and send back a message that it is done, along with serialized data. The parent then continues its work. I have tried two approaches to this problem:

  1. frameElement.contentWindow.postMessage: this works most of the time, but not always. Sometimes the message is never received by the content script message event listener associated with the iframe window. I have not been able to confirm the cause but I think it is listeners attached before my listener calling event.stopImmediatePropagation(). For example, on the yahoo home page (https://www.yahoo.com), when posting a message to my content script associated with iframe source https://s.yimg.com/rq/darla/2-9-9/html/r-sf.html, the message is never received. This is an ad-related iframe. Maybe the blocking of messages is intentional. There is no error when the message is posted and I use a targetOrigin of "*".
  2. chrome.runtime.sendMessage: I can send a message to the background page but cannot figure out how to tell the background page to which frame to relay the message. The parent window content script does not know the chrome extension frameId associated with the child frame element it encountered in the DOM walk. So it cannot tell the background page how to direct the message.

For point 2, I have tried two techniques that I found here on Stack Overflow:

  1. Using the concept described in this question: In the parent window, determine the iframe's position in the window.frames array and post a message to the background page with this index. The background page posts a message to all frames with the desired index in the message data. Only the iframe whose window sits at that index in the window.parent.frames array proceeds with its walk. This works OK but is vulnerable to changes in the window.frames array during the asynchronous messaging process (if an iframe is deleted after the message is sent, the index value may no longer match the desired frame).
  2. Instead of the index value from point 1, use frameElement.name in the parent window. With the same messaging technique, send the name to the child iframe for comparison with its window.name value. I believe window.name gets its value from frameElement.name at the time the iframe element is created. However, since I don't control the frame element creation, the name attribute is often an empty string and can't be relied on to uniquely match iframe elements to their windows.

Is there a way for me to reliably send a message to a content script associated with an iframe element found in walking a DOM tree?

Crap answered 29/6, 2016 at 15:8 Comment(9)
This question may be of help: #26010855 As for point 1, maybe the frame is blocked from loading by an adblocker? - Fortune
I don't have any other extensions installed on my development browser (i.e. no ad blocker) and the iframe appears loaded (I can see the ad) though perhaps I should check its readyState. - Crap
Thanks for the link Xan. That was the post I used as the basis for point 1 solution. But am I wrong that the approach is vulnerable to iframe deletions during the asynchronous messaging from parent to background to children? - Crap
Correct observation. - Fortune
Maybe you can use ports (chrome.runtime.connect) from inside the frame's content script, thus the comm channels will stay open and, if needed, you can even store an array of Port objects in the background page to identify an arbitrary frame by the Port's name property, which would be generated by a caller according to some scheme. - Roughandtumble
I like the idea of ports after I make the initial match between the iframe element in the parent document and the content window/script the iframe element contains. But I don't think the port will solve the problem of making this initial connection. Unless I'm missing something? Xan's solution using window.frames makes this initial connection possible, but can fail due to asynchronous messaging. - Crap
Did you find a viable solution to this problem after all? I am trying the same thing, walking through elements and sending random numbers to the iframes, but some of them do not receive messages. - Tarrsus
The idea of using "run_at":"document_start" as described in the accepted answer helps, but I still encounter frames that don't respond. I haven't looked at this issue for a couple of years, so I can't say how it works on the current version of Chrome. Unfortunately, the extension message system had flaws when it was released and it doesn't seem like there was much focus on fixing them. - Crap
Thanks for your message. I had the run_at document start as well. The interesting thing is I can deliver the messages to those iframes by sending from Chrome Dev Tools. But from the content script, it's very faulty. function receivedMessage (e) {console.log(e);} this.addEventListener("message", receivedMessage, false); document.getElementsByTagName('iframe')['1'].contentWindow.postMessage("hi there", "*") - Tarrsus

When you call chrome.runtime.sendMessage from a content script, the second parameter of the chrome.runtime.onMessage listener ("sender") includes the properties url and frameId. You can send a message (from an extension page, e.g. the background page) to a specific frame using chrome.tabs.sendMessage with the given frameId.
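
A minimal sketch of such a relay in the background page. The message shape ({ type: "relay", targetFrameId, payload }) and the name createRelayListener are hypothetical conventions chosen for illustration; only sender.tab, sender.frameId, sender.url and the frameId option of chrome.tabs.sendMessage come from the extension API.

```javascript
// Background page: forward a message from one frame to another frame in the same tab.
// The { type, targetFrameId, payload } message shape is a hypothetical convention.
function createRelayListener(tabsApi) {
  return function onMessage(message, sender, sendResponse) {
    if (!message || message.type !== "relay" || !sender.tab) {
      return false; // not a relay request; leave it to other listeners
    }
    tabsApi.sendMessage(
      sender.tab.id,
      {
        type: "relayed",
        fromFrameId: sender.frameId, // filled in by the browser, not by the page
        fromUrl: sender.url,
        payload: message.payload,
      },
      { frameId: message.targetFrameId }, // deliver to this frame only
      (response) => sendResponse(response)
    );
    return true; // keep the channel open for the asynchronous sendResponse
  };
}

// Register the listener only when running inside an extension.
if (typeof chrome !== "undefined" && chrome.runtime && chrome.tabs) {
  chrome.runtime.onMessage.addListener(createRelayListener(chrome.tabs));
}
```

If the target frame has been removed in the meantime, the callback runs with an undefined response and chrome.runtime.lastError set, so a real implementation would want to check for that.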

If you want to know the list of all frames (and their frame IDs) at any time, use chrome.webNavigation.getAllFrames. With that list you can construct a tree of the frames in a tab, and then send this information to all frames for further processing.
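
As a rough sketch of the tree construction: getAllFrames reports a frameId, a parentFrameId and a url for every frame, and the top frame has parentFrameId -1. The function name buildFrameTree and the node shape are assumptions made for this example, and the extension needs the "webNavigation" permission to call the API.

```javascript
// Build a parent -> children tree from the array that
// chrome.webNavigation.getAllFrames passes to its callback.
function buildFrameTree(frames) {
  const nodes = new Map();
  for (const f of frames) {
    nodes.set(f.frameId, { frameId: f.frameId, url: f.url, children: [] });
  }
  let root = null;
  for (const f of frames) {
    const node = nodes.get(f.frameId);
    const parent = nodes.get(f.parentFrameId);
    if (parent) {
      parent.children.push(node);
    } else if (f.parentFrameId === -1) {
      root = node; // the top-level frame has no parent
    }
  }
  return root; // null when the list is empty
}

// Hypothetical wiring in the background page:
// chrome.webNavigation.getAllFrames({ tabId }, (frames) => {
//   const tree = buildFrameTree(frames || []);
// });
```

Note that the tree only tells a parent which frame IDs its children have, not which iframe element each ID belongs to.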

Reliable postMessage / onMessage

frameElement.contentWindow.postMessage: this works most of the time, but not always. Sometimes the message is never received by the content script message event listener associated with the iframe window. I have not been able to confirm the cause but I think it is listeners attached before my listener calling event.stopImmediatePropagation()

This can be countered by running your content script with "run_at":"document_start" and immediately registering the message event listener. Your handler will then always be called first, and the page cannot cancel it via event.stopImmediatePropagation(). However, do not blindly trust the information from other frames; always verify the message (e.g. by communicating with the other frames via the background page).
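
A small sketch of such a listener, assuming the content script is declared with "run_at": "document_start" and "all_frames": true in the manifest. The message type "walk-request", the nonce field and the "verify" message are hypothetical names used only for this example.

```javascript
// Content script: register the message listener as the very first thing the script does.
function installMessageListener(win, verifyAndHandle) {
  function listener(event) {
    const data = event.data;
    // Ignore anything that does not match our (hypothetical) message shape.
    if (!data || typeof data !== "object" || data.type !== "walk-request") return;
    // Do not act on the payload yet: any page script can post a look-alike message.
    verifyAndHandle(data, event.source);
  }
  win.addEventListener("message", listener, false);
  // Return a function that removes the listener again.
  return () => win.removeEventListener("message", listener, false);
}

// Register only when running as a content script.
if (typeof window !== "undefined" && typeof chrome !== "undefined" && chrome.runtime) {
  installMessageListener(window, (data) => {
    // Let the background page confirm the message before doing any work.
    chrome.runtime.sendMessage({ type: "verify", nonce: data.nonce });
  });
}
```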

Combining both approaches

The first method (messaging through the background page) offers a secure way to exchange data between frames, but does not offer a general way to link a frame to a specific DOM element.
The second method (postMessage) allows you to target a specific (i)frame element, but any web page can do that too, so the method is not reliable on its own. By combining both, you get a secure communication channel that is linked to a DOM element.

This is a basic example that applies the above methods to communicate between frames A and B:

  1. Content script in A:

    1. Send a message to the background page (e.g. a message including the index of frame B).
  2. Background page:

    1. Receive the message from A.
    2. Generate a random nonce, say R (crypto.getRandomValues).
    3. Store a mapping from R to frameId (and optionally other information that was included in the message from A).
    4. Call the response callback with this random value.
  3. Content script in A:

    1. Receive R from the background page.
    2. Call postMessage on frame B and pass R.
  4. Content script in B:

    1. Receive R from A.
    2. Send a message to the background page to retrieve the frameId (and optionally other information from A).
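
The background-page bookkeeping for the steps above could look roughly like this. It is a sketch under assumptions: the names createNonceRegistry, issue and redeem are hypothetical, nonces are made single-use (the steps do not require that), and the random source is passed in so the logic can be exercised outside the browser.

```javascript
// Background page: map each nonce R to the frame that requested it.
function createNonceRegistry(randomBytes) {
  const pending = new Map(); // nonce -> { parentFrameId, info }

  // Called when frame A asks for a nonce (step 2 of the list).
  function issue(parentFrameId, info) {
    const nonce = Array.from(randomBytes(16), (b) => b.toString(16).padStart(2, "0")).join("");
    pending.set(nonce, { parentFrameId, info });
    return nonce;
  }

  // Called when frame B presents the nonce it received via postMessage (step 4).
  function redeem(nonce, childFrameId) {
    const entry = pending.get(nonce);
    if (!entry) return null; // unknown or already used: do not trust the sender
    pending.delete(nonce); // single-use, an assumption of this sketch
    return { parentFrameId: entry.parentFrameId, childFrameId, info: entry.info };
  }

  return { issue, redeem };
}

// In the extension, the random bytes would come from crypto.getRandomValues:
const secureRandomBytes = (n) => crypto.getRandomValues(new Uint8Array(n));

// Hypothetical wiring:
// const registry = createNonceRegistry(secureRandomBytes);
// chrome.runtime.onMessage.addListener((msg, sender, sendResponse) => {
//   if (msg.type === "request-nonce") sendResponse(registry.issue(sender.frameId, msg.info));
//   if (msg.type === "redeem-nonce") sendResponse(registry.redeem(msg.nonce, sender.frameId));
// });
```

Because redeem is called with sender.frameId of frame B, the background page learns at that point which frame ID belongs to the iframe element that A posted the nonce to, and it can report that ID back to A.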

Note: For a rock-solid application, you need to account for the fact that the frame may be removed during any of those steps. If you neglect the asynchronous nature of this process, you may leave your application in an inconsistent state.

Hilversum answered 3/7, 2016 at 7:1 Comment(7)
If the background page messages a parent frame content script with an array of frameid/url pairs of its child windows, how can the content script match an individual frame element found in a DOM walk to one of these pairs (and thereby know the specific frameId)? I can't use the url to match the iframe element src attribute since iframes may share the same src and redirects can cause the src not to match the final document location. I think that I'm missing a step in your solution. - Crap
Regarding stopImmediatePropagation, I attach two listeners to a child window. Each listener prints a message to the console. In the first listener, if I call event.stopImmediatePropagation, the second listener never prints to console when calling iframe.contentWindow.postMessage from the parent window. If I do not call stop in the first listener, both messages print to console. My browser is Google Chrome 51.0.2704.106 (64-bit), linux. I ran this test using the developer console command line. I see in the example from my question that their script is calling stop on an event. - Crap
@Crap The frames don't contact each other directly, but through the background page. If frame A knows (from the background page) that the parent frame is frame B, then A can send a message and the frame ID of B to the background page, and the background page can then forward the message to frame B. I removed the part about stopImmediatePropagation from my answer, since it is the correct behavior. - Hilversum
I see how that works going from child to parent because the child can have only one parent and can learn that parent frameId from the background page. But how can the parent frame know the frameId of a specific child frame element found in the DOM walk, assuming that parent has multiple children? The background page can provide the parent frame with an array of frameId,url pairs, but how can the parent frame match each pair to the correct iframe element in the DOM? I need the message chain to start at the root of the DOM tree and travel down. - Crap
Hoping to clarify - Parent frame has two children, A Frame and B Frame. By walking DOM, Parent script has found html elements associated with A Frame and B Frame. Parent wants to send a message to content script associated with B Frame element only by routing message through background script. So Parent needs to provide background script with B Frame element's frameId. How does Parent script know B Frame's frameId? Parent can ask background script to send him his child frameIds, but how does he know which is for B Frame element? - Crap
@Rob-W and DAR: awesome question, answer, and comments. Thank you both! - Underplay
@RobW Would love to hear your thoughts on the answer I have posted below. I feel it is more secure and reliable. Thanks! - Divergence

tl;dr

  1. My answer describes a CORS-proof solution for the specific case where the child frame has user focus. In many use cases, the frame we want to interact with is the one that has focus.
  2. As of 2022, Firefox and Safari already have a CORS-proof standard API for this, so if you're targeting them, consider using the standard API instead.
  3. My solution does not use window.postMessage or cryptographic random values.

Prerequisites

The frame to which you are sending a message needs to be a 'valid frame'. A 'valid frame' is:

  1. a frame which has user focus, OR
  2. parent of another valid frame

For simplicity of discussion, I'll assume:

  1. we have <all_urls> host permission
  2. we are working inside only one tab

Glossary

A frame tree is such that each frame has a unique parent but can have multiple children. The depth of a frame is:

  1. Zero if it is the root document, OR
  2. One plus depth of its parent frame

Procedure

Step 1. Track depth of each frame

tl;dr: When a frame loads, we record its depth in the frame tree.

I will assume your content script (CS) already injects itself into each iframe on the page. As soon as it is injected, the CS needs to report its own frame depth to the background page (BG). Using this information, BG will maintain the list of frame IDs at each depth level.

  1. CS can get its own depth by using a recursive algorithm similar to the one described here
  2. BG can access sender.frameId in the onMessage listener to correctly get the frame ID.

BG now has a list reportedFrameDepths (for example) where reportedFrameDepths[depth] is a list/set of all frameIds at that depth.
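
A possible sketch of both halves of this step. It uses a simple loop over window.parent, which may differ from the linked algorithm; the function names and the "frame-depth" message type are hypothetical.

```javascript
// Content script side: count the parent hops between this window and the top window.
// window.parent can be read across origins, and the top window is its own parent.
function getFrameDepth(win) {
  let depth = 0;
  while (win.parent !== win) {
    win = win.parent;
    depth += 1;
  }
  return depth;
}

// Background side: reportedFrameDepths[depth] is a Set of frame IDs at that depth.
function recordFrameDepth(reportedFrameDepths, depth, frameId) {
  if (!reportedFrameDepths[depth]) reportedFrameDepths[depth] = new Set();
  reportedFrameDepths[depth].add(frameId);
  return reportedFrameDepths;
}

// Hypothetical wiring:
// CS: chrome.runtime.sendMessage({ type: "frame-depth", depth: getFrameDepth(window) });
// BG: chrome.runtime.onMessage.addListener((msg, sender) => {
//       if (msg.type === "frame-depth") {
//         recordFrameDepth(reportedFrameDepths, msg.depth, sender.frameId);
//       }
//     });
```

A real implementation would keep one such list per tab and clear entries when frames navigate or are removed.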

Step 2. Check which child frame is focused

tl;dr: Given a frame, we can find which one of its child frames is focused.

We can enumerate all candidate children of this frame by checking reportedFrameDepths[depth + 1], where depth is the frame depth of this frame. Only one of the frames in this list should have user focus.

The focused child will have a non-null document.activeElement value, and document.hasFocus() will be true. We need to check the latter because in certain cases (for example, mail.google.com), document.activeElement is set to a non-focused element (<body>) for many frames.

So we can send a message to all the candidate frames (specify { frameId } in options field of tabs.sendMessage) and get a boolean response from them if they have focus. The one frameId that responds true should be the intended focused child frame.
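
A sketch of this check, with assumed names throughout: hasUserFocus runs in each content script, findFocusedFrame runs in the background page, and askFrame(frameId) is a hypothetical helper that resolves to the frame's boolean reply.

```javascript
// Content script side: does this frame's document hold user focus?
function hasUserFocus(doc) {
  return doc.activeElement !== null && doc.hasFocus();
}

// Background side: ask every candidate frame and return the one that answers true.
async function findFocusedFrame(candidateFrameIds, askFrame) {
  const replies = await Promise.all(
    [...candidateFrameIds].map(async (frameId) => {
      try {
        return { frameId, focused: (await askFrame(frameId)) === true };
      } catch (err) {
        // The frame was removed or did not answer; treat it as not focused.
        return { frameId, focused: false };
      }
    })
  );
  const hit = replies.find((r) => r.focused);
  return hit ? hit.frameId : null;
}

// Hypothetical wiring (the promise form of tabs.sendMessage assumes Manifest V3):
// CS: chrome.runtime.onMessage.addListener((msg, sender, sendResponse) => {
//       if (msg.type === "has-focus?") sendResponse(hasUserFocus(document));
//     });
// BG: const askFrame = (frameId) =>
//       chrome.tabs.sendMessage(tabId, { type: "has-focus?" }, { frameId });
```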

Step 3. Repeat step 2 recursively.

If you can find the focused child A of a given frame, you can also find the focused child B of that focused child A.

Repeating step 2 starting from root document will lead you to the deepest focused child. The recursion stops when there is no further focused child.

This is the end, you now have the frame ID of the deepest focused child. You can now send a message directly to this frame.
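
The descent can be sketched as a loop over depth levels. This assumes the reportedFrameDepths list from step 1 and a hypothetical async helper isFrameFocused(frameId), which in practice would be backed by tabs.sendMessage. It also assumes the top-level frame has frame ID 0 and does not check whether the tab itself has focus.

```javascript
// Background page: walk down one depth level at a time, following the focused frame.
async function findDeepestFocusedFrame(reportedFrameDepths, isFrameFocused) {
  let deepest = 0; // frame ID 0 is the top-level frame
  for (let depth = 1; depth < reportedFrameDepths.length; depth += 1) {
    let focusedId = null;
    for (const frameId of reportedFrameDepths[depth] || []) {
      if (await isFrameFocused(frameId)) {
        focusedId = frameId;
        break; // only one frame per level should hold focus
      }
    }
    if (focusedId === null) break; // no focused child at this level: stop descending
    deepest = focusedId;
  }
  return deepest;
}
```

The returned frame ID can then be passed as the frameId option of tabs.sendMessage to reach that frame directly.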

Gotchas

This is not a trivial solution to implement, due to:

  1. The amount of asynchronicity. There is a long messaging chain across multiple CS and the BG. Ensure your code can cope if the messaging chain is interrupted midway because some other part of the code crashed.
  2. Tabs and frames can reload, navigate away or destroy themselves. Make sure your implementation handles these cases. Be especially wary of caches and data stores, as they can become obsolete.

That said, I have implemented the solution for a similar use case and it works reliably and fast enough (the extra overhead I observed is less than 5ms). The exact implementation will vary depending on your product's needs, and the above explanation should serve as a good reference.

Divergence answered 11/8, 2022 at 16:18 Comment(0)
