Checking if a website is up via Python


Using Python, how can I check whether a website is up? From what I've read, I need to send an HTTP HEAD request and check for a "200 OK" status code, but how do I do that?



Cheers





Duplicate: stackoverflow.com/questions/107405/…
– Daniel Roseman
Dec 22 '09 at 21:43




11 Answers



You could try to do this with getcode() from urllib:

>>> import urllib
>>> print urllib.urlopen("http://www.stackoverflow.com").getcode()
200



EDIT: For more modern Python, i.e. Python 3, use:

import urllib.request
print(urllib.request.urlopen("http://www.stackoverflow.com").getcode())
# 200





Follow-up question: does urlopen.getcode fetch the entire page or not?
– OscarRyz
Dec 22 '09 at 23:13







As far as I know, getcode retrieves the status from the response that is sent back
– Anthony Forloney
Dec 23 '09 at 0:38







@Oscar, there's nothing in urllib to indicate it uses HEAD instead of GET, but the duplicate question referenced by Daniel above shows how to do the former.
– Peter Hansen
Dec 23 '09 at 2:38
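As that comment says, plain urlopen issues a GET. One common way to make urllib-style code send a HEAD request instead (a sketch using Python 3's urllib.request; the HeadRequest class and head_status helper names are illustrative, not from the answers here) is to subclass Request and override get_method:

```python
import urllib.request

class HeadRequest(urllib.request.Request):
    """Request subclass that sends HEAD instead of GET,
    so only the headers come back, not the page body."""
    def get_method(self):
        return "HEAD"

def head_status(url, timeout=5):
    """Return the HTTP status code for a HEAD request, or None on failure."""
    try:
        return urllib.request.urlopen(HeadRequest(url), timeout=timeout).getcode()
    except Exception:
        return None
```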





+1 didn't even read the answer :)
– Dead account
Jan 26 '10 at 16:56





@l1zard like so: req = urllib.request.Request(url, headers = headers) resp = urllib.request.urlopen(req)
– jamescampbell
Jan 14 '16 at 17:23






I think the easiest way to do it is by using the Requests module.


import requests

def url_ok(url):
    r = requests.head(url)
    return r.status_code == 200





This does not work here for url = "http://foo.example.org/". I would expect a 404, but get a crash.
– Jonas Stein
Jun 2 '13 at 0:11








This returns False for any response code other than 200 (OK). So you wouldn't know if it's a 404. It only checks if the site is up and available to the public.
– caisah
Jun 3 '13 at 20:42







@caisah, did you test it? Jonas is right; I get an exception; raise ConnectionError(e) requests.exceptions.ConnectionError: HTTPConnectionPool(host='nosuch.org2', port=80): Max retries exceeded with url: / (Caused by <class 'socket.gaierror'>: [Errno 8] nodename nor servname provided, or not known)
– AnneTheAgile
Nov 14 '13 at 16:27





I tested it before posting it. The thing is, this checks if a site is up but doesn't handle the situation when the host name is invalid or other things go wrong. You should think of those exceptions and catch them.
– caisah
Nov 17 '13 at 13:56
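A version of the url_ok idea with the exception handling caisah describes might look like this (a sketch; the timeout value is an illustrative choice, and requests groups its connection and timeout errors under RequestException):

```python
import requests

def url_ok(url, timeout=5):
    # An unreachable or invalid host raises rather than returning a
    # status code, so catch requests' exceptions and report False.
    try:
        r = requests.head(url, timeout=timeout)
    except requests.exceptions.RequestException:
        return False
    return r.status_code == 200
```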






Pretty cool. Very HTTP.
– SDsolar
May 5 at 20:29



You can use httplib


import httplib
conn = httplib.HTTPConnection("www.python.org")
conn.request("HEAD", "/")
r1 = conn.getresponse()
print r1.status, r1.reason



prints


200 OK



Of course, only if www.python.org is up.







This only checks the domain; something as efficient as this is needed for individual web pages.
– User
Jan 9 '14 at 21:59


import httplib
import socket
import re

def is_website_online(host):
    """ This function checks to see if a host name has a DNS entry by checking
        for socket info. If the website gets something in return,
        we know it's available to DNS.
    """
    try:
        socket.gethostbyname(host)
    except socket.gaierror:
        return False
    else:
        return True


def is_page_available(host, path="/"):
    """ This function retrieves the status code of a website by requesting
        HEAD data from the host. This means that it only requests the headers.
        If the host cannot be reached or something else goes wrong, it returns
        False.
    """
    try:
        conn = httplib.HTTPConnection(host)
        conn.request("HEAD", path)
        # Treat any 2xx or 3xx status as available
        return bool(re.match(r"^[23]\d\d$", str(conn.getresponse().status)))
    except StandardError:
        return False





is_website_online just tells you if a host name has a DNS entry, not whether a website is online.
– Craig McQueen
Dec 22 '09 at 23:38
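As that comment notes, a DNS entry alone doesn't prove a server is answering. One way to check further (a sketch using only the standard library; the function name is illustrative) is to actually open a TCP connection:

```python
import socket

def tcp_reachable(host, port=80, timeout=5):
    """True if a TCP connection to host:port succeeds, i.e. something
    is actually listening, not merely that the name resolves."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers gaierror, timeout, connection refused
        return False
```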





The HTTPConnection object from the httplib module in the standard library will probably do the trick for you. BTW, if you start doing anything advanced with HTTP in Python, be sure to check out httplib2; it's a great library.




from urllib.request import Request, urlopen
from urllib.error import URLError, HTTPError

req = Request("http://stackoverflow.com")
try:
    response = urlopen(req)
except HTTPError as e:
    print("The server couldn't fulfill the request.")
    print('Error code: ', e.code)
except URLError as e:
    print('We failed to reach a server.')
    print('Reason: ', e.reason)
else:
    print('Website is working fine')



Works on Python 3



If by up, you simply mean "the server is serving", then you could use cURL, and if you get a response then it's up.



I can't give you specific advice because I'm not a python programmer, however here is a link to pycurl http://pycurl.sourceforge.net/.



If the server is down, on Python 2.7 x86 Windows urllib has no timeout and the program deadlocks. So use urllib2:


import urllib2
import socket

def check_url(url, timeout=5):
    try:
        return urllib2.urlopen(url, timeout=timeout).getcode() == 200
    except urllib2.URLError as e:
        return False
    except socket.timeout as e:
        return False


print check_url("http://google.fr")    # True
print check_url("http://notexist.kc")  # False
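For Python 3, a rough equivalent (a sketch; urllib2 was split into urllib.request and urllib.error) could be:

```python
import socket
import urllib.request
import urllib.error

def check_url(url, timeout=5):
    # urlopen's timeout keyword avoids the hang described above for
    # plain urllib on Python 2.
    try:
        return urllib.request.urlopen(url, timeout=timeout).getcode() == 200
    except (urllib.error.URLError, socket.timeout):
        return False
```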



Hi, these functions can run an up/down test and a speed test for your web page:


from urllib.request import urlopen
from socket import socket
import time


def tcp_test(server_info):
    cpos = server_info.find(':')
    try:
        sock = socket()
        sock.connect((server_info[:cpos], int(server_info[cpos+1:])))
        sock.close()
        return True
    except Exception as e:
        return False


def http_test(server_info):
    try:
        # TODO : we can use this data later to find sub-url up or down results
        startTime = time.time()
        data = urlopen(server_info).read()
        endTime = time.time()
        speed = endTime - startTime
        return {'status': 'up', 'speed': str(speed)}
    except Exception as e:
        return {'status': 'down', 'speed': str(-1)}


def server_test(test_type, server_info):
    if test_type.lower() == 'tcp':
        return tcp_test(server_info)
    elif test_type.lower() == 'http':
        return http_test(server_info)



Here's my solution using PycURL and validators


import pycurl
import validators


def url_exists(url):
    """
    Check if the given URL really exists
    :param url: str
    :return: bool
    """
    if validators.url(url):
        c = pycurl.Curl()
        c.setopt(pycurl.NOBODY, True)
        c.setopt(pycurl.FOLLOWLOCATION, False)
        c.setopt(pycurl.CONNECTTIMEOUT, 10)
        c.setopt(pycurl.TIMEOUT, 10)
        c.setopt(pycurl.COOKIEFILE, '')
        c.setopt(pycurl.URL, url)
        try:
            c.perform()
            response_code = c.getinfo(pycurl.RESPONSE_CODE)
            c.close()
            return response_code < 400
        except pycurl.error as err:
            errno, errstr = err.args
            raise OSError('An error occurred: {}'.format(errstr))
    else:
        raise ValueError('"{}" is not a valid url'.format(url))



You may use the requests library to find out if the website is up, i.e. whether it returns a status code of 200:




import requests

url = "https://www.google.com"
page = requests.get(url)
print(page.status_code)
# 200






