Fatal error: Uncaught Error: Call to a member function find() on string in C:\xampp\

saversites · May 20, 2018

Php Buddies,

What I am trying to do is learn to build a simple web crawler.
So at first, I will feed it a url to start with.
It will then fetch that page and extract all the links into a single array.
Then it will fetch each of those links pages and extract all their links into a single array likewise. It will do this until it reaches it's max link deep level.
Here is how I coded it:

<?php 
	include('simple_html_dom.php'); 
	$current_link_crawling_level = 0; 
$link_crawling_level_max = 2
	if($current_link_crawling_level == $link_crawling_level_max)
{
    exit(); 
}
else
{
    $url = 'https://www.yahoo.com'; 
    $curl = curl_init($url); 
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1); 
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1); 
    curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0); 
    curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0); 
    $html = curl_exec($curl); 
    
    $current_link_crawling_level++;    
	    //to fetch all hyperlinks from the webpage 
    $links = array(); 
    foreach($html->find('a') as $a) 
    { 
        $links[] = $a->href; 
        echo "Value: $value<br />\n"; 
        print_r($links); 
        
        $url = '$value'; 
        $curl = curl_init($value); 
        curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1); 
        curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1); 
        curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0); 
        curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0); 
        $html = curl_exec($curl); 
	        //to fetch all hyperlinks from the webpage 
        $links = array(); 
        foreach($html->find('a') as $a) 
        { 
            $links[] = $a->href; 
            echo "Value: $value<br />\n";
            print_r($links); 
            $current_link_crawling_level++;
        } 
    echo "Value: $value<br />\n";
    print_r($links);  
}
	?>

I have a feeling I got confused and messed it up in the foreach loops. Nestled too much. Is that the case ? Hint where I went wrong.

Unable to test the script as I have to first sort out this error:
Fatal error: Uncaught Error: Call to a member function find() on string in C:\xampp\h

After that, I will be able to test it. Anyway, just looking at the script, you think I got it right or what ?

Thanks

saversites · May 20, 2018

I just replaced:

//$html = file_get_html('http://nimishprabhu.com');

with:

$url = 'https://www.yahoo.com'; 
$curl = curl_init($url); 
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1); 
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0); 
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0); 
$html = curl_exec($curl);

That is all!
That should not result in that error! :eek:

saversites · May 20, 2018

UPDATE:

I have been given this sample code just now ...

Possible solution with str_get_html:
	$url = 'https://www.yahoo.com'; 
$curl = curl_init($url); 
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1); 
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0); 
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0); 
$response_string = curl_exec($curl); 
	$html = str_get_html($response_string);
//to fetch all hyperlinks from a webpage 
$links = array(); 
foreach($html->find('a') as $a) { 
    $links[] = $a->href; 
} 
print_r($links); 
echo "<br />";

Gonna experiment with it.
Just sharing it here for other future newbies!

saversites · May 20, 2018

I am told:
"file_get_html is a special function from simple_html_dom library. If you open source code for simple_html_dom you will see that file_get_html() does a lot of things that your curl replacement does not. That's why you get your error."

Anyway, folks, I really don't wanna be using this limited capacity file_get_html() and so let's replace it with cURL. I tried my best in giving a shot at cURL here. What-about you ? Care to show how to fix this thingY ?

saversites · May 20, 2018

I did a search on the php manual for str_get_html to be sure what the function does. But, I am shown no results.
And so, I ask: Just what does it do ?

saversites · May 20, 2018

Php Buddies,

Look at these 2 updates. They both succeed in fetching the php manual page but fail to fetch the yahoo homepage. Why is that ?
The 2nd script is like the 1st one except a small change. Look at the commented-out parts in script 2 to see the difference. The added code comes after the commented-out code part.

SCRIPT 1

<?php 
	//HALF WORKING
	include('simple_html_dom.php'); 
	$url = 'http://php.net/manual-lookup.php?pattern=str_get_html&scope=quickref'; // WORKS ON URL
//$url = 'https://yahoo.com'; // FAILS ON URL
	$curl = curl_init($url); 
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1); 
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0); 
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0); 
$response_string = curl_exec($curl); 
	$html = str_get_html($response_string);
	//to fetch all hyperlinks from a webpage 
$links = array(); 
foreach($html->find('a') as $a) { 
    $links[] = $a->href; 
} 
print_r($links); 
echo "<br />"; 

?>

SCRIPT 2

<?php 
	//HALF WORKING
	include('simple_html_dom.php'); 
	$url = 'http://php.net/manual-lookup.php?pattern=str_get_html&scope=quickref'; // WORKS ON URL
//$url = 'https://yahoo.com'; // FAILS ON URL
$curl = curl_init($url); 
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1); 
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, 0); 
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0); 
$response_string = curl_exec($curl); 
	$html = str_get_html($response_string);
	/*
//to fetch all hyperlinks from a webpage 
$links = array(); 
foreach($html->find('a') as $a) { 
    $links[] = $a->href; 
} 
print_r($links); 
echo "<br />"; 
*/
	// Hide HTML warnings
libxml_use_internal_errors(true);
$dom = new DOMDocument;
if($dom->loadHTML($html, LIBXML_NOWARNING)){
    // echo Links and their anchor text
    echo '<pre>';
    echo "Link\tAnchor\n";
    foreach($dom->getElementsByTagName('a') as $link) {
        $href = $link->getAttribute('href');
        $anchor = $link->nodeValue;
        echo $href,"\t",$anchor,"\n";
    }
    echo '</pre>';
}else{
    echo "Failed to load html.";
	}
	?>

Don't forget my previous post!

Cheers!

Sign In

Fatal error: Uncaught Error: Call to a member function find() on string in C:\xampp\

Recommended Posts

saversites

Link to comment

Share on other sites

saversites

Link to comment

Share on other sites

saversites

Link to comment

Share on other sites

saversites

Link to comment

Share on other sites

saversites

Link to comment

Share on other sites

saversites

Link to comment

Share on other sites

Join the conversation

Browse

Activity

Store

Support