python - Scraping an HTML file saved on the local system


For example, I have a site "www.example.com" and want to scrape its HTML after saving it on my local system. For testing, I saved the page on my desktop as example.html.

I have written the spider code below:

    from scrapy.spider import BaseSpider
    from scrapy.selector import HtmlXPathSelector

    class ExampleSpider(BaseSpider):
        name = "example"
        start_urls = ["example.html"]

        def parse(self, response):
            print response
            hxs = HtmlXPathSelector(response)

But when I run the above code, I get the error below:

    ValueError: Missing scheme in request url: example.html

Ultimately, my intention is to scrape the example.html file, which contains the HTML of www.example.com saved on my local system.

Can anyone suggest how to assign that example.html file in start_urls?

Thanks in advance.

You can crawl a local file using a URL of the following form, where the part after file:// is the absolute path to the file:

    file:///path/to/file.html

It doesn't require an HTTP server to be installed on your machine.
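
If typing the absolute path by hand is awkward, one way to build such a URL is with Python's standard pathlib module. This is only a sketch: local_file_url is a hypothetical helper name, not part of Scrapy, and it assumes Python 3 and that example.html sits in the current working directory.

```python
from pathlib import Path


def local_file_url(path):
    """Return a file:// URL for a local path (hypothetical helper, not a Scrapy API)."""
    # resolve() turns a relative path into an absolute one; as_uri() needs an
    # absolute path and percent-encodes characters such as spaces.
    return Path(path).expanduser().resolve().as_uri()


if __name__ == "__main__":
    # "example.html" is the saved page from the question, looked up relative
    # to the current working directory (an assumption of this sketch).
    print(local_file_url("example.html"))
```

The returned string includes the file scheme, so it could be used as the entry in start_urls, for example start_urls = [local_file_url("example.html")], which should avoid the "Missing scheme" error.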

