Python 30‑by‑30 Course
Welcome to your final module! This is where we put theory into practice by dissecting a real-world script. We'll examine `whatchan_amended.py`, the script that powers the daily football listings. By understanding how it works, you'll see how all the concepts you've learned—from variables to web scraping—come together to create something genuinely useful.
In Module 6, we learned how to apply Python to practical, real-world tasks:

- The `pathlib` and `datetime` modules.
- The `requests` library to download static web pages and `BeautifulSoup` to parse the HTML and extract data.
- The `Selenium` library to handle JavaScript-loaded content.

The `whatchan_amended.py` script is the engine behind the daily football listings page on whatchan.co.uk. Specifically, it's a fully automated content generation tool that creates the whatchan.co.uk/today/ page. Its job is to visit a TV listings website, find all of today's live football matches, grab some related news from the BBC, and then build a clean, user-friendly webpage with all that information. It uses Selenium to handle the modern, dynamic listings site and BeautifulSoup for the simpler BBC news page. Finally, it assembles an HTML file, a JSON file for other programs, and a plain text summary, ready to be published.
Below, you can view the entire script in your browser. If you'd prefer to view it in your own code editor or on a different screen, you can download the file directly.
whatchan_amended.py (Right-click and select "Save Link As..." to download)
```python
# whatchan_amended.py
# Scrapes football listings, generates HTML, JSON and optional image.
import os
import re
import json
import time
import shutil
import tempfile
from typing import List, Optional, Tuple
from datetime import datetime, date, timedelta, timezone

# Third-party imports - ensure these are installed
import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service as ChromeService
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from webdriver_manager.chrome import ChromeDriverManager

# Optional imports
try:
    from PIL import Image, ImageDraw, ImageFont
    HAS_PILLOW = True
except ImportError:
    HAS_PILLOW = False

try:
    from zoneinfo import ZoneInfo
except ImportError:
    from backports.zoneinfo import ZoneInfo

# --- Constants and Configuration ---
SITE_NAME = "Whatchan"
BASE_URL = "https://whatchan.co.uk"
OUTPUT_DIR = "whatchan_today_assets"
GENERATE_IMAGE = HAS_PILLOW  # Set to False to disable image generation
FTA_CHANNELS = {"BBC", "ITV", "Channel 4", "Channel 5", "S4C"}
TV_URL = "https://www.live-footballontv.com/"
GOSSIP_URL = "https://www.bbc.co.uk/sport/football/gossip"
GOSSIP_KEYWORDS = {"transfer", "deal", "contract", "move", "signing", "bid", "talks"}
TIMEZONE = ZoneInfo("Europe/Sofia")

# ... (The rest of the script code is here but omitted for brevity) ...

if __name__ == "__main__":
    main()
```
Let’s break down the script into its core components. Below, you'll find explanations for each major part of the code.
The script starts by importing all the tools it needs from Python's standard library and the third-party libraries you've learned about. After the imports, it defines a set of global constants (e.g., `BASE_URL`, `FTA_CHANNELS`) that hold important configuration values, making the script easy to update.
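The `TIMEZONE` constant matters because the script has to decide what "today" means, independent of the server's local clock. Here is a minimal sketch of how such a constant is typically used (illustrative only, not lifted from the script itself):

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # standard library since Python 3.9

TIMEZONE = ZoneInfo("Europe/Sofia")

# "Today" according to the configured timezone, not the machine's locale
now_local = datetime.now(TIMEZONE)
today = now_local.date()

# A human-readable heading for the generated page, e.g. "Saturday 01 March 2025"
heading = now_local.strftime("%A %d %B %Y")
print(heading)
```

Passing a timezone to `datetime.now()` is what keeps the "today" page correct even if the script runs on a server in a different region.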
This section defines a custom blueprint for our data: the `Fixture` class. Each `Fixture` object holds all the important information for one match. It also has helpful methods, like `is_fta()`, which tells you if the match is free-to-air. This Object-Oriented approach bundles the data and its related actions together. Alongside the class are several small helper functions that perform single, specific tasks like formatting dates.
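Since the class body is omitted from the listing above, here is a hypothetical sketch of what a `Fixture` class with an `is_fta()` method might look like. The field names and the channel-matching logic are assumptions for illustration, not the script's actual code:

```python
from dataclasses import dataclass, field

# From the script's constants: channels considered free-to-air
FTA_CHANNELS = {"BBC", "ITV", "Channel 4", "Channel 5", "S4C"}

@dataclass
class Fixture:
    """One match's details, bundled with its related behaviour.

    Hypothetical sketch -- the real script's fields may differ.
    """
    home: str
    away: str
    kickoff: str                 # e.g. "15:00"
    competition: str
    channels: list = field(default_factory=list)

    def is_fta(self) -> bool:
        """True if any broadcasting channel is free-to-air.

        Matches either the full channel name ("Channel 4") or its
        first word ("BBC One" -> "BBC").
        """
        return any(
            ch in FTA_CHANNELS or ch.split(" ")[0] in FTA_CHANNELS
            for ch in self.channels
        )

match = Fixture("Arsenal", "Chelsea", "17:30", "Premier League",
                ["Sky Sports Main Event"])
print(match.is_fta())  # → False: Sky Sports is not free-to-air
```

A `dataclass` keeps the boilerplate down: Python generates `__init__` and `__repr__` for us, and the method lives right next to the data it operates on.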
This is the main data-gathering step. The function uses Selenium to control a real Chrome browser because the listings website is dynamic and uses JavaScript to load its content. The script navigates to the URL, waits for the fixture list to appear, and then carefully loops through the HTML elements to collect all the match details for today.
This function performs a second, simpler scrape. The BBC news site is static, so it uses the faster `requests` library to download the HTML. Then, it uses `BeautifulSoup` to parse the HTML and find the paragraphs containing transfer rumours. This shows how you can choose the right tool for the job.
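The keyword-filtering step can be sketched without the network request. In the sketch below the paragraphs are plain strings standing in for the text BeautifulSoup would extract (e.g. via `p.get_text()`), and `filter_gossip` is a hypothetical helper name:

```python
# From the script's constants: words that suggest a transfer rumour
GOSSIP_KEYWORDS = {"transfer", "deal", "contract", "move", "signing", "bid", "talks"}

def filter_gossip(paragraphs):
    """Keep only paragraphs mentioning at least one transfer keyword.

    Uses a simple case-insensitive substring check, so it also matches
    inflected forms like "deals" (and, imperfectly, "movement").
    """
    return [
        text.strip()
        for text in paragraphs
        if any(keyword in text.lower() for keyword in GOSSIP_KEYWORDS)
    ]

sample = [
    "Arsenal have opened contract talks with their captain.",
    "Match report: a goalless draw at Anfield.",
]
print(filter_gossip(sample))  # keeps only the first paragraph
```

This is why a keyword set is defined as a constant at the top of the script: tweaking what counts as "gossip" is a one-line change.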
This function is the report generator. It takes the list of `Fixture` objects and assembles the final webpage by building up a large string of HTML. It uses f-strings to neatly insert the data into a template. This function also creates the interactive filter buttons and generates the JSON-LD structured data, a hidden block that helps search engines like Google understand the page content.
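A stripped-down sketch of this build step is shown below, using f-strings for the template and `json.dumps` for the JSON-LD block. `build_page` is a hypothetical helper name, and the real function produces far richer markup (filter buttons, styling, full schema.org fields):

```python
import json
from html import escape

def build_page(fixtures):
    """Assemble a minimal HTML page from (home, away, kickoff, channel)
    tuples. Simplified sketch of the report-generation idea only.
    """
    rows = "\n".join(
        f"<li>{escape(home)} v {escape(away)} "
        f"({escape(kickoff)}, {escape(channel)})</li>"
        for home, away, kickoff, channel in fixtures
    )
    # JSON-LD structured data: a hidden block that helps search
    # engines understand what the page lists.
    json_ld = json.dumps({
        "@context": "https://schema.org",
        "@type": "ItemList",
        "numberOfItems": len(fixtures),
    })
    return (
        "<!DOCTYPE html>\n<html>\n<body>\n"
        f"<ul>\n{rows}\n</ul>\n"
        f'<script type="application/ld+json">{json_ld}</script>\n'
        "</body>\n</html>"
    )

page = build_page([("Arsenal", "Chelsea", "17:30", "Sky Sports")])
```

Note the use of `html.escape()` on every scraped value: anything pulled from an external site should be escaped before being embedded in your own HTML.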
The `if __name__ == "__main__":` block is the script's entry point. The `main()` function acts as the conductor of the orchestra. It calls all the other functions in the correct order: scrape, fetch, build, and then write the output files to a directory.
If this free 30-by-30 course has helped you on your Python journey, please consider a small donation. Your support helps cover server costs and allows me to create more free, high-quality learning materials for everyone. Thank you!
Buy Me a Coffee ☕

You've completed the course! Let's test your knowledge with a final quiz. This will cover concepts from all the modules. Choose the best answer for each question and submit at the end to see your score and get detailed feedback.
This course is an example of the practical, hands-on training materials I love to create. With over a decade of experience as a Learning & Development Specialist, I can help you or your business develop engaging, effective eLearning solutions for any topic.
Whether you need to onboard new hires, upskill your current team, or create a public-facing educational resource, I can design and build a custom course tailored to your specific needs.
Let's talk about your project.