php - How can one detect if a server/script is accessing their site through cURL/file_get_contents()? (excluding user-agents and IP addresses)
I've come across a question from a user who was having difficulties accessing an image through a script (using cURL/file_get_contents()).
The image link seems to return a 403 error when file_get_contents() is used to request it. With cURL, a more detailed error is returned:
You are denied access to the system. Turn off the engine or surf proxy, fake IP if you want access. Proxy or not accepted from the web tools, intrusion prevention system.
Binh Minh Online Data Services @ 2008 - 2012
I also failed to access the same image after fiddling around with the cURL request myself. I tried changing the user-agent to the exact user-agent of my browser, which can access the image. I've also tried the script on my personal local server, which (obviously) uses the same IP address as my browser. So, as far as I know, user-agents and IP addresses are out of the picture.
How else can one detect that a script is performing the request?
BTW, it's not for anything crazy. I'm just curious xD
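It is indeed a cookie, set by JavaScript before a redirect to the original image. The problem is that cURL/file_get_contents() won't parse the HTML and set the cookie; only cookies set by the server are stored by cURL in its cookie jar.
This is the code served before the redirect. It makes a cookie via JavaScript with no name and location.href as the value: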
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <head>
    <title>http://phim.xixam.com/thumb/giotdang.jpeg</title>
    <meta http-equiv="refresh" content="0;url=http://phim.xixam.com/thumb/giotdang.jpeg">
    </head>
    <script type="text/javascript">
    window.onload = function checknow() {
        var today = new Date();
        var expires = 3600000*1*1;
        var expires_date = new Date(today.getTime() + (expires));
        var ua = navigator.userAgent.toLowerCase();
        if ( ua.indexOf( "safari" ) != -1 ) {
            document.cookie = "location.href";
        } else {
            document.cookie = "location.href;expires=" + expires_date.toGMTString();
        }
    }
    </script>
    <body>
    </body></html>

But all is not lost, because pre-setting/forging the cookie can circumvent this security measure (which is one reason why using cookies for any kind of security is a bad idea).
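The same forged cookie can, in principle, also be sent with file_get_contents() by attaching a Cookie header through a stream context. The following is only a minimal sketch: build_cookie_context() is a hypothetical helper name, and it assumes the server checks nothing beyond the cookie, user-agent and referer. This variant was not tested against the site.

```php
<?php
// Hypothetical helper (not part of the original answer): builds an HTTP
// stream context that sends a pre-set cookie plus browser-like headers.
function build_cookie_context($cookie, $userAgent, $referer)
{
    $headers = array(
        'Cookie: ' . $cookie,
        'User-Agent: ' . $userAgent,
        'Referer: ' . $referer,
    );

    return stream_context_create(array(
        'http' => array(
            'method'  => 'GET',
            // Header lines are joined with CRLF and terminated by one.
            'header'  => implode("\r\n", $headers) . "\r\n",
            'timeout' => 30,
        ),
    ));
}

// Usage (performs a real request, so it is left commented out):
// $context = build_cookie_context(
//     'location.href',
//     'Mozilla/5.0 (Windows NT 5.1; rv:5.0) Gecko/20100101 Firefox/5.0',
//     'http://xixam.com/forum.php'
// );
// $image = file_get_contents('http://phim.xixam.com/thumb/giotdang.jpeg', false, $context);
```

The cookie string is passed through as-is, so it mirrors what the page's JavaScript assigns to document.cookie.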
cookie.txt:

    # Netscape HTTP Cookie File
    # http://curl.haxx.se/rfc/cookie_spec.html
    # This file was generated by libcurl! Edit at your own risk.

    phim.xixam.com	FALSE	/thumb/	FALSE	1338867990	location.href

So the finished cURL script looks like:
    <?php
    function curl_get($url) {
        if (!function_exists('curl_init')) {
            die('cURL must be installed!');
        }

        // Forge the cookie (Netscape cookie-file format, tab-separated fields)
        $expire = time() + 3600000 * 1 * 1;
        $cookie = <<<COOKIE
    # Netscape HTTP Cookie File
    # http://curl.haxx.se/rfc/cookie_spec.html
    # This file was generated by libcurl! Edit at your own risk.

    phim.xixam.com\tFALSE\t/thumb/\tFALSE\t{$expire}\tlocation.href

    COOKIE;
        file_put_contents(dirname(__FILE__) . '/cookie.txt', $cookie);

        // Browser-masquerade cURL request
        $curl = curl_init();
        $header = array();
        $header[0]  = "Accept: text/xml,application/xml,application/json,application/xhtml+xml,";
        $header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
        $header[] = "Cache-Control: max-age=0";
        $header[] = "Connection: keep-alive";
        $header[] = "Keep-Alive: 300";
        $header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
        $header[] = "Accept-Language: en-us,en;q=0.5";
        $header[] = "Pragma: ";

        // Read the forged cookie from, and write any server-set cookies to, cookie.txt
        curl_setopt($curl, CURLOPT_COOKIEJAR, dirname(__FILE__) . '/cookie.txt');
        curl_setopt($curl, CURLOPT_COOKIEFILE, dirname(__FILE__) . '/cookie.txt');
        curl_setopt($curl, CURLOPT_URL, $url);
        curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 5.1; rv:5.0) Gecko/20100101 Firefox/5.0 Firefox/5.0');
        curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
        curl_setopt($curl, CURLOPT_HEADER, 0);
        // Pass the referer check
        curl_setopt($curl, CURLOPT_REFERER, 'http://xixam.com/forum.php');
        curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate');
        curl_setopt($curl, CURLOPT_AUTOREFERER, true);
        curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($curl, CURLOPT_TIMEOUT, 30);
        curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);

        $html = curl_exec($curl);
        curl_close($curl);
        return $html;
    }

    $image = curl_get('http://phim.xixam.com/thumb/giotdang.jpeg');
    file_put_contents('test.jpg', $image);
    ?>

The only way to stop a crawler is to log all visitors' IPs in a database and increment a value based on visits per IP. Then, once a week or so, look at the top hits by IP, do a reverse lookup of each IP, and see if it belongs to a hosting provider; if so, block it at the firewall or in .htaccess. Other than that, you can't really stop a request for a resource if it is publicly available, as any hurdle can be overcome.
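The per-IP counting and hosting-provider check described above could be sketched roughly as follows. This is an illustration under assumptions, not a complete implementation: top_ips() and is_hosting_hostname() are hypothetical helper names, the logged IPs are passed in as a plain array rather than read from a database, and the hostname suffix list is something you would have to maintain yourself.

```php
<?php
// Hypothetical helper: count hits per IP and return the busiest ones first.
function top_ips(array $loggedIps, $limit)
{
    $counts = array_count_values($loggedIps); // IP => number of hits
    arsort($counts);                          // highest count first, keys kept
    return array_slice($counts, 0, $limit, true);
}

// Hypothetical helper: does a reverse-DNS hostname end with one of the
// suffixes you associate with hosting providers? (The list is an assumption.)
function is_hosting_hostname($hostname, array $hostingSuffixes)
{
    $hostname = strtolower($hostname);
    foreach ($hostingSuffixes as $suffix) {
        $suffix = strtolower($suffix);
        if ($suffix !== '' && substr($hostname, -strlen($suffix)) === $suffix) {
            return true;
        }
    }
    return false;
}

// Usage (the reverse lookup needs DNS, so it is left commented out):
// foreach (top_ips($ipsFromDatabase, 10) as $ip => $hits) {
//     $host = gethostbyaddr($ip); // returns the IP unchanged if the lookup fails
//     if ($host !== false && is_hosting_hostname($host, array('.hosting.example'))) {
//         // candidate for a firewall or .htaccess block
//     }
// }
```

Matching on hostname suffixes is only a heuristic; a reverse lookup can fail or return a generic name, so the result is best reviewed by hand before blocking anything.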
Hope it helps.