Python - scraping an HTML file saved on the local system
For example, suppose I have a site "www.example.com" and want to scrape its HTML after saving it on my local system. For testing, I saved the page on my desktop as example.html.
I have written the spider code below:
    from scrapy.spider import BaseSpider
    from scrapy.selector import HtmlXPathSelector

    class ExampleSpider(BaseSpider):
        name = "example"
        start_urls = ["example.html"]

        def parse(self, response):
            print response
            hxs = HtmlXPathSelector(response)

But when I run the code above, I get the error below:
    ValueError: Missing scheme in request url: example.html

Ultimately, my intention is to scrape the example.html file, which contains the HTML of www.example.com saved on my local system.
Can anyone suggest how to assign the example.html file in start_urls?
Thanks in advance.
You can crawl a local file using a URL of the following form:
    file:///path/to/file.html
Here /path/to/file.html is the absolute path of the file on disk. It doesn't require an HTTP server to be installed on your machine.
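If typing the absolute path by hand is error-prone, one possible way to build such a URL is with Python's standard pathlib module. This is only a minimal sketch: local_file_url is a hypothetical helper name, and it assumes example.html sits in the directory the script is run from.

```python
from pathlib import Path


def local_file_url(path):
    """Return a file:// URL for a local path (hypothetical helper)."""
    # resolve() makes the path absolute; as_uri() only accepts absolute paths
    return Path(path).resolve().as_uri()


# File name taken from the question; it is assumed to live in the current directory.
start_urls = [local_file_url("example.html")]
print(start_urls[0])
```

The resulting list could then be assigned to start_urls in the spider class, so the request URL carries the file scheme that the ValueError says is missing.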