Extract links from a specific page/URL
This is a simple function to extract the links from a specific page/URL:
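<?php
function extract_url($main_url){
    $output = array();
    // Build the scheme://host prefix used to resolve relative links
    $cek_url = parse_url($main_url);
    $prefix_url = $cek_url['scheme'].'://'.$cek_url['host'];
    // Read the whole page; requires allow_url_fopen to be enabled
    $f = @fopen($main_url,"r");
    if ($f === false){
        return $output;
    }
    $inputStream = stream_get_contents($f);
    fclose($f);
    // NOTE: the original pattern was lost from this post; this href pattern is a reconstruction
    if (preg_match_all('/<a\s[^>]*href=["\']?([^"\'\s>]+)/i', $inputStream, $matches)){
        foreach($matches[1] as $link){
            // Skip links that do not point to a web page
            if(!preg_match('/mailto:|javascript:|ymsgr:/i',$link)){
                if(preg_match('#^https?://#i',$link)){
                    $url = $link;
                }
                else{
                    // Relative link: prepend scheme and host
                    $url = $prefix_url.'/'.ltrim($link,'/');
                }
                // Strip the session ID and the "?" or "&" in front of it
                if(strpos($url,'PHPSESSID') !== false){
                    $url = explode('PHPSESSID',$url);
                    $url = substr($url[0],0,-1);
                }
                $output[] = $url;
            }
        }
    }
    return array_unique($output);
}
?>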
And here is a sample of how to use the function:
<?php
$start = time();
print_r(extract_url("http://dev.sandalian.com/"));
$end = time();
echo 'done in '.($end-$start).' seconds';
?>
This script only works when allow_url_fopen is enabled in php.ini. Otherwise you will need to use cURL.
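For servers where remote fopen is disabled, the cURL route could look roughly like the sketch below. It assumes the PHP cURL extension is installed; fetch_page_curl and its $timeout parameter are hypothetical names chosen for illustration, not part of the original script.

```php
<?php
// Hypothetical helper (not from the original post): fetch a page body with
// cURL instead of fopen(), for hosts where allow_url_fopen is disabled.
// Returns the body as a string, or false on failure.
function fetch_page_curl($url, $timeout = 10)
{
    $ch = curl_init($url);
    if ($ch === false) {
        return false;
    }
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);      // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);      // follow HTTP redirects
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);  // seconds allowed for connecting
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);         // seconds allowed for the whole request
    // The handle is released automatically when $ch goes out of scope.
    return curl_exec($ch);
}
```

The returned string could stand in for $inputStream in extract_url(), with a check for false before the preg_match_all() call.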
August 30th, 2007 at 7:28 am
Looks good, but it can get more complicated than that
when it comes to extracting URLs from a whole domain (subpages), etc.
August 31st, 2007 at 4:56 am
What is the purpose of this? It would be better to make an email address extractor, sir.
August 31st, 2007 at 5:47 am
@ Jamal Soueidan
Yup, but this is the basic function for doing all that stuff.
@ Rizky
Well, you have to start with this first, sir, so that you can follow the links and then extract the email addresses.
September 1st, 2007 at 8:08 pm
@ Rizky
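Yeh, kan dimulai dari ini dulu Pak, biar bisa ngikutin link baru di extract alamat emailnya [in English: "Well, you have to start with this first, sir, so that you can follow the links and then extract the email addresses."]
Please translate it to English.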
September 2nd, 2007 at 11:35 pm
Why not use cURL, since you know that fopen is more likely to be disabled?
*sorry, my English is all over the place
September 3rd, 2007 at 4:48 am
@pangsit
Because I was too lazy to write a cURL script :p
March 17th, 2008 at 12:30 am
Do you know how I can extract all URLs from a string?