Crawling sitemaps with Python
This is a basic script I created to crawl an XML sitemap file (nested sitemaps are not supported). For each URL in the sitemap, it reports whether the server processed the request successfully or returned an error.

#!/usr/bin/env python
from sys import argv
from re import findall
from socket import setdefaulttimeout
from urllib2 [...]
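Since the excerpt above is cut off, here is a minimal sketch of what such a crawler might look like. It is not the original script: it assumes Python 3 and `urllib.request` in place of `urllib2`, and the function names (`extract_urls`, `check_url`, `crawl`, `main`) are made up for illustration. Like the original, it only handles flat sitemaps and pulls URLs out of `<loc>` elements with a regular expression.

```python
#!/usr/bin/env python3
# Sketch only: hypothetical helper names, Python 3 stand-in for urllib2.
import re
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def extract_urls(sitemap_xml):
    """Return the text of every <loc> element in a flat (non-nested) sitemap."""
    return re.findall(r"<loc>\s*(.*?)\s*</loc>", sitemap_xml, flags=re.DOTALL)


def check_url(url, timeout=10):
    """Request a URL and return the HTTP status code, or an error description."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return str(response.status)
    except HTTPError as error:
        # The server answered, but with an error status such as 404 or 500.
        return str(error.code)
    except (URLError, OSError) as error:
        # DNS failure, refused connection, timeout and similar problems.
        return "error: {}".format(error)


def crawl(sitemap_xml, check=check_url):
    """Check every URL listed in the sitemap and return (url, result) pairs."""
    return [(url, check(url)) for url in extract_urls(sitemap_xml)]


def main(sitemap_url, timeout=10):
    """Download the sitemap at sitemap_url and print one result line per URL."""
    with urlopen(sitemap_url, timeout=timeout) as response:
        sitemap_xml = response.read().decode("utf-8", errors="replace")
    for url, result in crawl(sitemap_xml):
        print(result, url)
```

To try it, call `main()` with the address of a sitemap, for example `main("https://example.com/sitemap.xml")` (a placeholder address). Passing the checking function into `crawl` as a parameter makes it easy to swap in a stub when testing without network access.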
