Failure to catch dynamic element (image) with Puppeteer in Node.js

I load the page, wait until it has finished loading (page.goto() with waitUntil: 'networkidle2'), and the selector is found, yet I fail to get hold of the already loaded img: I always receive Promise { <pending> }.
How do I get the image URL?

const puppeteer = require('puppeteer');
var $url = "https://twitter.com/astro_greek/status/1754801971725312352";

const $selector_timeout =  5000; // default "selector wait" timeout 
const img_selector = 'img'; // [alt="Image"]

(async () => {
    // Launch the browser and open a new blank page
    const browser = await puppeteer.launch({headless: false});
    const page = await browser.newPage();
    await page.goto($url , { waitUntil: 'networkidle2'}); // all dynamic items are loaded        
     
    let image2 = await page.waitForSelector(img_selector, {timeout: $selector_timeout} )
      .then(() => {     
        console.log("Success with selector.");
        let image = page.$(img_selector)
        .then(()=>{
            console.log("Image is caught." );
            page.$eval( img_selector, el => el.scrollIntoView());
            let body = page.$(img_selector )
            .then((el)=>{  
                console.log("typeof el: " + typeof el);
                console.log("Element outerHtml: " + el.getProperty('outerHTML'));
                console.log("Element innerHtml: " + el.getProperty('innerHTML'));
                console.log("Element href: " + el.getProperty('href')); 
            });
            console.log("Body: " );
            console.log(body);    
        }).catch((err)=>{ console.log(err);});
        console.log("Image: ");
        console.log(image);
    }).catch((err)=> { 
        console.log("Failure to get tax info HTML element. ERR:", err);                                
    } );
    console.log("Image 2: " + image2);      
 
    await browser.close();
})();

In all three cases (image, body and the properties of el) the result is Promise { <pending> } or [object Promise]:

Success with selector.
Image:
Promise { <pending> }
Image 2: undefined
Image is caught.
Body:
Promise { <pending> }
Element: JSHandle@node
Promise { <pending> }
typeof el: object
Element outerHtml: [object Promise]
Element innerHtml: [object Promise]
Element href: [object Promise]

Update 1

When I try the following after page.goto():

 let img = await page.$("img")
    .then((response)=> {  
        //console.dir(response); 
        console.log("Image properties: "  + response.getProperties()); 
        //let href = response.getProperty('href').jsonValue();
        //console.log("Image href: "  + href); 
    });

the result is:

Image properties: [object Promise]
Galiot answered 8/2, 2024 at 14:8 Comment(4)
Does this answer your question? How do I return the response from an asynchronous call? – Gurdwara
@KrzysztofKrzeszewski, it does not. See Update 1. – Galiot
This nested code combining then and await is an antipattern. Mixing the two leads to unreadable, buggy code. page.$eval( img_selector, el => el.scrollIntoView()); was never awaited, and both of your page.$(img_selector ) calls are disconnected from the main promise chain, creating race conditions. The code will close the browser in parallel with these selections, because all happen simultaneously after the waitForSelector call. If you remove all thens and strictly use await in front of every Puppeteer call, you should be OK. – Imitable
As far as getting the actual src property of the image: page.$eval("img", el => el.src) or page.$eval("img", el => el.getAttribute("src")), but that's sort of beside the point here--the first order of business is writing correct code to isolate the element before bothering to pull anything from it. – Imitable

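After some trial and error, awaiting each step separately works (assuming img is the ElementHandle returned by await page.$("img"), without a trailing .then()):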

/*
let img_all_properties = await img.getProperties();
console.log("img all props type: " + (typeof img_all_properties)  );
for (const [key, value] of img_all_properties) {  
    console.log(`The value for key ${key} is ${value}`);
}
*/
let img_src_prop = await img.getProperty('src');
let img_src = await img_src_prop.jsonValue();
console.log("img src: " + img_src );

with result:

img src: https://pbs.twimg.com/profile_images/1470781968971358211/2Qd3T5VJ_normal.jpg
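
A small sketch of why both awaits are needed, and where the earlier [object Promise] output came from. The fakeImg object is a hypothetical stand-in that only mimics the shape of an ElementHandle (getProperty() and jsonValue() both return Promises); it is not Puppeteer itself.

```javascript
// Hypothetical stub with the same async shape as an ElementHandle.
const fakeImg = {
    getProperty: async (name) => ({
        jsonValue: async () => `value of ${name}`,
    }),
};

// Concatenating an un-awaited Promise stringifies the Promise object itself.
const unawaited = 'img src: ' + fakeImg.getProperty('src');

// Awaiting both steps yields the underlying value.
async function awaited() {
    const prop = await fakeImg.getProperty('src');
    return 'img src: ' + (await prop.jsonValue());
}

console.log(unawaited);      // img src: [object Promise]
awaited().then(console.log); // img src: value of src
```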
Galiot answered 8/2, 2024 at 14:57 Comment(0)
