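import time

import requests

# Maps a URL path to the page text, or to 1 / 2 as failure markers.
urlcache = {}

def get_page(url):
    # Fetch if the URL is new or if an earlier attempt failed.
    if (url not in urlcache) or (urlcache[url] == 1) or (urlcache[url] == 2):
        time.sleep(1)
        try:
            r = requests.get("http://en.wikipedia.org%s" % url)
            if r.status_code == 200:
                urlcache[url] = r.text
            else:
                urlcache[url] = 1
        except:
            urlcache[url] = 2
    return urlcache[url]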

If I understand this code correctly, the cached value is set to the page text when the request returns status code 200 (which seems to be the case for all of my URLs), and to 1 when the status code is anything other than 200 (some error with the page).
But I don’t understand under what circumstances the value would be 2.
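
One way I could probe this without hitting Wikipedia is sketched below. It copies the branching from get_page into a helper that takes the HTTP call as an argument, so each path can be triggered on purpose. The names `FakeResponse`, `classify`, `ok`, `missing`, and `broken` are hypothetical and not part of the original code, and the sketch uses `except Exception` where the original has a bare `except`.

```python
class FakeResponse:
    """Hypothetical stand-in for a response object (only the fields used here)."""

    def __init__(self, status_code, text=""):
        self.status_code = status_code
        self.text = text


def classify(fetch, url):
    """Same branching as get_page, with the HTTP call passed in as `fetch`."""
    try:
        r = fetch(url)
        if r.status_code == 200:
            return r.text
        else:
            return 1
    except Exception:  # assumption: narrower than the bare except in the original
        return 2


def ok(url):
    # Simulates a page that exists.
    return FakeResponse(200, "<html>page</html>")


def missing(url):
    # Simulates a response that arrives with a non-200 status.
    return FakeResponse(404)


def broken(url):
    # Assumption: simulates a request that fails before any response exists.
    raise ConnectionError("simulated network failure")


for name, fetch in [("ok", ok), ("missing", missing), ("broken", broken)]:
    print(name, repr(classify(fetch, "/wiki/Example")))
```

In this sketch the 2 only appears for `broken`, where the call raises instead of returning a response, so it may be that the real code only stores 2 when `requests.get` itself throws.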