WHOIS Domain Availability Checker
A WHOIS domain availability checker that reads domains from a file and checks them in batches.
This script implements a WHOIS client that connects to a WHOIS server and checks the availability of domain names. It’s designed to efficiently check multiple domains by maintaining a single connection and reusing it for all queries, which reduces overhead and respects server resources.
The script includes robust error handling for network issues and automatic reconnection capabilities to handle server-side disconnections.
Features:
- Batch processing of domains from a file
- Persistent connection reuse
- Automatic reconnection on failure
- Comprehensive error handling
- Detailed logging of operations
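As a rough illustration of the query exchange the script builds on, here is a minimal sketch of a single WHOIS lookup. It assumes the behaviour described in RFC 3912, where the server closes the connection once its output is finished, so the client reads until end of stream. The helper names `build_query` and `whois_query` are hypothetical and not part of the script below. The default server, port, and timeout simply mirror the script's defaults.

```python
import socket


def build_query(domain: str) -> bytes:
    """Encode a WHOIS query: the domain name terminated by CRLF."""
    return f"{domain.strip()}\r\n".encode()


def whois_query(domain: str, server: str = "whois.dns.pl",
                port: int = 43, timeout: float = 10.0) -> str:
    """Send one query and return the raw response text (hypothetical helper)."""
    # create_connection() resolves the host, connects, and applies the timeout
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(build_query(domain))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # empty read: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="ignore")
```

Because the server ends each exchange by closing the connection, a client that wants to reuse one socket across queries has to be ready to reconnect, which is the role of the reconnection logic listed above.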
#!/usr/bin/env python3
"""
A WHOIS domain availability checker that reads domains from a file and checks them in batches.
This script implements a WHOIS client that connects to a WHOIS server and checks the
availability of domain names. It's designed to efficiently check multiple domains by
maintaining a single connection and reusing it for all queries, which reduces overhead
and respects server resources.
The script includes robust error handling for network issues and automatic reconnection
capabilities to handle server-side disconnections.
Features:
- Batch processing of domains from a file
- Persistent connection reuse
- Automatic reconnection on failure
- Comprehensive error handling
- Detailed logging of operations
"""
import socket  # Network communication with the WHOIS server
import time  # Delays between requests


class WhoisChecker:
    """
    A class to handle WHOIS queries to check domain availability.
    Uses a single persistent connection to efficiently check multiple domains.

    WHOIS Protocol Overview:
    - WHOIS is a query and response protocol that's widely used to query databases
    - These databases store registered users or assignees of domain names
    - The standard WHOIS port is 43
    - Clients connect to the server, send a query (domain name), and receive a response
    """

    def __init__(self, server="whois.dns.pl", port=43, timeout=10):
        """
        Initialize the WhoisChecker with server details.

        Args:
            server (str): The WHOIS server to connect to (default: "whois.dns.pl")
                          whois.dns.pl is the Polish domain registry WHOIS server
            port (int): The port to connect to (default: 43)
                        Port 43 is the standard WHOIS port
            timeout (int): Connection timeout in seconds (default: 10)
                           Prevents hanging connections if the server doesn't respond
        """
        # WHOIS server address that queries are sent to
        self.server = server
        # WHOIS services typically operate on port 43
        self.port = port
        # Timeout that keeps the program from hanging if the server doesn't respond
        self.timeout = timeout
        # No connection yet; the socket is created in connect()
        self.sock = None

    def connect(self):
        """
        Establish a connection to the WHOIS server.

        This method creates a TCP socket connection to the WHOIS server specified
        during initialization. It sets a timeout to prevent hanging connections
        and handles any exceptions that might occur during the connection process.

        Socket Programming Concepts:
        - AF_INET: Address Family for IPv4 addresses
        - SOCK_STREAM: Socket type for TCP connections (reliable, ordered delivery)
        - settimeout(): Sets a timeout for blocking socket operations

        Returns:
            bool: True if connection successful, False otherwise
        """
        try:
            # Create a TCP socket: AF_INET selects IPv4, SOCK_STREAM selects TCP
            self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            # Set a timeout so we don't wait forever if the server doesn't respond
            self.sock.settimeout(self.timeout)
            # Connect to the WHOIS server at the specified address and port
            self.sock.connect((self.server, self.port))
            return True
        except Exception as e:
            print(f"Failed to connect to {self.server}:{self.port} - {e}")
            # Discard the unconnected socket so that later checks of self.sock
            # do not mistake it for a live connection
            if self.sock:
                self.sock.close()
                self.sock = None
            return False

    def disconnect(self):
        """
        Close the connection to the WHOIS server if it exists.

        This method properly closes the socket connection to free up system resources.
        It's important to close connections when done to prevent resource leaks.

        Resource Management:
        - Closing sockets releases network resources
        - Setting the reference to None prevents accidental reuse
        """
        # If sock is None, no connection exists, so there is nothing to close
        if self.sock:
            # Release the network connection and associated system resources
            self.sock.close()
            # Clear the reference to prevent accidental use of the closed socket
            self.sock = None

    def check_domain(self, domain, _retry=True):
        """
        Check if a .pl domain is available via WHOIS using the existing connection.

        This method sends a domain name to the WHOIS server and analyzes the response
        to determine if the domain is available or already registered. It handles
        the low-level socket communication and response parsing.

        WHOIS Response Analysis:
        - Available domains typically return messages indicating "not found" or similar
        - Registered domains return detailed registration information
        - Different WHOIS servers may use slightly different response formats

        Args:
            domain (str): The domain name to check (e.g., "example.pl")
            _retry (bool): Internal flag; when True, one reconnect-and-retry is
                           attempted after a failure

        Returns:
            bool or None: True if domain is available, False if registered, None if error
        """
        # Without a connection, we can't send the query
        if not self.sock:
            print("Not connected to WHOIS server. Call connect() first.")
            return None

        try:
            # Send the domain name followed by CRLF, the standard WHOIS query format.
            # encode() converts the string to bytes; sendall() keeps sending until
            # every byte has been written.
            self.sock.sendall(f"{domain}\r\n".encode())

            # Collect the response, which may arrive in multiple chunks
            response = b""
            while True:
                # Receive up to 4096 bytes, a common buffer size for network reads
                data = self.sock.recv(4096)
                # An empty read means the server closed the connection, which is
                # how a WHOIS server signals the end of its response
                if not data:
                    break
                response += data

            # No data at all means the connection was already closed when we
            # queried. Treat that as a connection error; otherwise the empty
            # text would fall through and be reported as a registered domain.
            if not response:
                raise ConnectionError("empty response from server")

            # Decode the bytes to text, ignoring any undecodable characters
            response_text = response.decode('utf-8', errors='ignore')

            # Check whether the response indicates the domain is NOT registered,
            # i.e. available. Different WHOIS servers use different phrases:
            # - "No information available"
            # - "not found" (case-insensitive)
            # - "not registered" (case-insensitive)
            # - "No data is found"
            if ("No information available" in response_text or
                    "not found" in response_text.lower() or
                    "not registered" in response_text.lower() or
                    "No data is found" in response_text):
                # Domain is available (not found in the registry)
                return True
            else:
                # Domain is registered; the response contains registration information
                return False

        # Connection resets, broken pipes, timeouts, and any other error are
        # handled the same way: clean up, reconnect, and retry exactly once
        except Exception as e:
            print(f"Error checking {domain}: {e}")
            self.disconnect()
            # The _retry flag limits this to one retry and prevents unbounded
            # recursion when the query keeps failing after a reconnect
            if _retry and self.connect():
                return self.check_domain(domain, _retry=False)
            # Unknown status due to connection issues
            return None

    def check_domains_batch(self, domains, delay=1):
        """
        Check multiple domains using a single connection to be efficient.

        This method implements batch processing of domain availability checks.
        Rather than establishing a new connection for each domain, it reuses
        the same connection for all domains in the list, which is much more
        efficient and respectful to the WHOIS server.

        Efficiency Benefits:
        - Reduces connection overhead (TCP handshake, etc.)
        - Faster overall processing time
        - Less load on the WHOIS server
        - Better compliance with rate limiting

        Args:
            domains (list): List of domain names to check
            delay (int): Delay in seconds between requests to avoid rate limiting
                         Helps prevent overwhelming the server with too many requests

        Returns:
            dict: Dictionary mapping domain names to their availability status
                  {domain_name: True/False/None}
                  True = Available, False = Registered, None = Error/Unknown
        """
        # If we're not connected, try to connect first.
        # An empty dictionary indicates that no checks were performed.
        if not self.sock and not self.connect():
            return {}

        # Maps domain names to their availability status
        results = {}

        for i, domain in enumerate(domains):
            # check_domain handles the actual WHOIS query
            status = self.check_domain(domain)
            results[domain] = status

            # Print the result for immediate feedback
            print(f"Checked {domain}: {'AVAILABLE' if status else 'REGISTERED' if status is not None else 'ERROR'}")

            # Pause between requests to avoid being rate limited or blocked,
            # but don't delay after the last request
            if i < len(domains) - 1:
                time.sleep(delay)

        return results

    def check_domains_batch_with_reconnect(self, domains, delay=1, max_retries=3):
        """
        Check multiple domains with automatic reconnection if connection drops.

        This method enhances the basic batch checking by adding robust reconnection
        logic. WHOIS servers sometimes close connections unexpectedly, especially
        when processing multiple requests. This method handles such cases by
        automatically reconnecting and retrying failed checks.

        Robustness Features:
        - Automatic reconnection when connection is lost
        - Configurable retry attempts per domain
        - Proper cleanup of failed connections
        - Detailed logging of retry attempts

        Args:
            domains (list): List of domain names to check
            delay (int): Delay in seconds between requests
                         Helps prevent overwhelming the server with too many requests
            max_retries (int): Maximum number of reconnection attempts per domain
                               Prevents infinite retry loops on persistent failures

        Returns:
            dict: Dictionary mapping domain names to their availability status
                  {domain_name: True/False/None}
                  True = Available, False = Registered, None = Error/Unknown
        """
        # Holds the final status for each domain
        results = {}

        for i, domain in enumerate(domains):
            # Track the number of retry attempts for this domain
            retry_count = 0
            # Status stays None (unknown) until a check succeeds
            status = None

            # Continue trying until we get a result or exceed max retries
            while retry_count <= max_retries:
                try:
                    # Reconnect first if the connection was dropped
                    if not self.sock:
                        if not self.connect():
                            # We can't proceed without a connection
                            print(f"Could not reconnect to check {domain}")
                            status = None
                            break

                    status = self.check_domain(domain)

                    # A None result typically indicates a connection or query error;
                    # anything else is a definitive answer
                    if status is not None:
                        break
                    else:
                        retry_count += 1
                        if retry_count <= max_retries:
                            print(f"Retrying {domain} ({retry_count}/{max_retries})...")
                            # Brief pause before retry to allow server recovery
                            time.sleep(2)
                except Exception as e:
                    print(f"Exception while checking {domain}: {e}")
                    retry_count += 1
                    if retry_count <= max_retries:
                        print(f"Retrying {domain} ({retry_count}/{max_retries})...")
                        # Disconnect so the next attempt starts with a fresh connection
                        self.disconnect()
                        # Brief pause before retry to allow server recovery
                        time.sleep(2)
                    else:
                        # Retries exhausted: status is unknown
                        status = None

            # Store and report the final result for this domain
            results[domain] = status
            print(f"Final result for {domain}: {'AVAILABLE' if status else 'REGISTERED' if status is not None else 'ERROR'}")

            # Pause between requests, but not after the last domain
            if i < len(domains) - 1:
                time.sleep(delay)

        return results

    def check_domains_from_file(self, filename, delay=1):
        """
        Check domains from a file using a single connection.

        This method provides a convenient way to check domain availability by reading
        domain names from a text file. Each line in the file should contain a single
        domain name. Empty lines are automatically filtered out.

        File Format Expected:
        - One domain name per line
        - Leading and trailing whitespace on each line is stripped
        - Empty lines are ignored
        - UTF-8 encoding is assumed

        Args:
            filename (str): Path to the file containing domain names (one per line)
            delay (int): Delay in seconds between requests
                         Helps prevent overwhelming the server with too many requests

        Returns:
            dict: Dictionary mapping domain names to their availability status
                  {domain_name: True/False/None}
                  True = Available, False = Registered, None = Error/Unknown
        """
        try:
            # 'with' ensures the file is closed even if an error occurs
            with open(filename, "r", encoding="utf-8") as f:
                # Strip whitespace from each line and filter out empty lines
                domains = [line.strip() for line in f if line.strip()]
            # Reuse the connection for all domains in the file
            return self.check_domains_batch(domains, delay)
        except FileNotFoundError:
            print(f"File {filename} not found.")
            # An empty dictionary indicates no checks were performed
            return {}
        except Exception as e:
            # Any other failure while reading the file or running the checks
            print(f"Error processing file {filename}: {e}")
            return {}


# Test the code when this script is run directly (not imported).
# The __name__ == "__main__" guard ensures this code does not run on import.
if __name__ == "__main__":
    # Create a checker with the default server settings
    checker = WhoisChecker()

    # Establish the initial network connection
    if checker.connect():
        print("Connected to WHOIS server")

        # Read domains from file and check them with reconnection capability
        try:
            # domains.txt should contain one domain name per line
            with open("domains.txt", "r", encoding="utf-8") as f:
                # Strip whitespace from each line and filter out empty lines
                domains = [line.strip() for line in f if line.strip()]
            # This method handles connection drops and retries automatically
            results = checker.check_domains_batch_with_reconnect(domains, delay=2, max_retries=2)
        except FileNotFoundError:
            print("domains.txt file not found. Using default test domains.")
            # Fall back to a predefined list of test domains
            results = checker.check_domains_batch_with_reconnect(
                ["xke.pl", "abc.pl", "nonexistentdomain12345.pl"],
                delay=2,
                max_retries=2
            )

        print("\nResults:")
        # ✓ for available domains
        # ✗ for registered domains
        # ? for domains that couldn't be checked
        for domain, status in results.items():
            if status is True:
                print(f"✓ {domain} is AVAILABLE")
            elif status is False:
                print(f"✗ {domain} is REGISTERED")
            else:
                print(f"? Could not check {domain}")

        # Always disconnect when done to free up resources
        checker.disconnect()
    else:
        # The initial connection to the WHOIS server failed
        print("Failed to connect to WHOIS server")
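
The response analysis in `check_domain` can also be isolated into a pure function, which makes it testable without a network connection. The following is a minimal sketch under a few assumptions: the names `AVAILABLE_MARKERS` and `classify_response` are hypothetical, the phrases are the ones the script already looks for, and, unlike the script, all of them are matched case-insensitively here. An empty response is reported as unknown rather than as a registered domain.

```python
from typing import Optional

# Phrases copied from the script's check_domain method, lowercased so that
# every marker is matched case-insensitively (a deliberate simplification).
AVAILABLE_MARKERS = (
    "no information available",
    "not found",
    "not registered",
    "no data is found",
)


def classify_response(response_text: str) -> Optional[bool]:
    """Return True if available, False if registered, None if unknown."""
    # An empty or whitespace-only response carries no information
    if not response_text.strip():
        return None
    lowered = response_text.lower()
    # Any "not registered" style phrase means the domain is available
    return any(marker in lowered for marker in AVAILABLE_MARKERS)
```

The three-valued result matches the `True`/`False`/`None` convention used by the checker's batch methods, so a function like this could replace the inline condition if the phrase list needs to grow for other registries.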