I've written a script in php
to get the html content or source code from a webpage but I could not succeed. When I execute my script, it opens the page itself. How can I get the html element or source code?
This is the script:
<?php
include "simple_html_dom.php";
function get_source($url)
{
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$htmlContent = curl_exec($ch);
curl_close($ch);
$dom = new simple_html_dom();
$dom->load($htmlContent);
return $dom;
}
$scraped_page = get_source("https://stackoverflow.com/questions/tagged/web-scraping");
echo $scraped_page;
?>
Currently I'm getting like this:
My expected output is something like:
Btw, echoing $htmlContent
also gives me what you can see in image 1.
echo $scraped_page;
displays the document you've loaded, so you should be able to use this to extract the data instead. – Enterostomyecho '<pre>';
before andecho '</pre>';
after the echo. Or view the source in your browser. – Enterostomy__toString()
function that just returns the bare source. If you want to do something else you need to do something else. – EssieessingerPossible duplicate
flag when the question there is totally different from what I've asked here. Thanks anyway. – Fixateecho $scraped_page
not showing what you expected? What is it showing? What did you expect? if the curl request succeeded, it should be showing you some HTML. If it isn't, you probably need to find out why the request failed, or what else went wrong with your script. "Didn't succeed" as a description of your problem doesn't really give us much to go on. What do you mean by "opens the page itself"? Which page? Opens how, exactly? You're just echoing the result of the curl request, that's all. We would really like to help, but we need you to be more specific about your problem. Thankyou. – Khalilahkhalin$htmlContent
instead rather than echoing $dom, which it seems is likely to be an object. – Khalilahkhalin