Adrian Dane

Python 30‑by‑30 Course

Module 4: Files, Parsing, and Basic Web I/O

Your programs have lived in their own little world so far. This week, we open the doors. You'll learn how to make your scripts read and write files, understand common data formats like CSV and JSON, and even fetch live data from the internet.

Contents

  • Quick Recap of Module 3
  • Day 16: Giving Your Program a Notepad (Reading & Writing Files)
  • Day 17: Speaking Spreadsheet (Working with CSV)
  • Day 18: The Language of the Web (JSON and APIs)
  • Day 19: Building Your Own Command-Line Tools
  • Day 20: Turning Data into a Story (Generating Reports)
  • Further resources

Quick Recap of Module 3

In Module 3, we learned how to structure and organize our code:

  • Functions: We learned to create reusable blocks of code with def, pass in parameters, and get results back with return.
  • Modules: We organized our functions into separate .py files (modules) and used import to use them in other scripts. We also learned about Python's huge standard library.
  • Nested Data & Recursion: We worked with more complex data, like lists inside lists, and learned the concept of a function that calls itself to solve problems.
  • Comprehensions: We discovered Python's elegant, one-line shortcuts for creating lists, sets, and dictionaries from other sequences.
  • Virtual Environments: We learned the best practice of creating isolated project environments with venv and managing packages with pip.

Day 16: Giving Your Program a Notepad (Reading & Writing Files)

Objectives

Most useful programs need to save information or read it from somewhere. That's where files come in. Python makes it easy to work with files using the built-in open() function. The modern, safe way to do this is with a with statement. This is great because it automatically handles closing the file for you, even if your program runs into an error.

When you open a file, you need to tell Python your intention by providing a mode. The most common ones are 'r' to read, 'w' to write (this will erase the file if it already exists!), and 'a' to append (which adds new content to the end of the file). For reading, a great technique is to loop directly over the file object, which processes it one line at a time. This is very efficient because it doesn't load the whole file into memory at once.

To write to a file, you just open it in 'w' or 'a' mode and use the .write() method. Just remember to add your own newline characters (\n) if you want to write on separate lines! This skill is the foundation for everything from saving user settings to creating log files.
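
Here is a minimal sketch of all three modes in action (notes.txt is just an example file name):

# 'w' creates the file (or wipes it if it already exists)
with open('notes.txt', 'w') as f:
    f.write("first line\n")

# 'a' adds to the end without touching what is already there
with open('notes.txt', 'a') as f:
    f.write("second line\n")

# 'r' reads; looping over the file gives you one line at a time
with open('notes.txt', 'r') as f:
    for line in f:
        print(line.strip())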

🤯 This `with open(...)` thing is new. What's the idea?

Working with files is like borrowing a book from the library. 📚

  • First, you have to open() the book to start using it.
  • The with statement is like your library card. It's a promise that when you're done with the book (i.e., when your with block ends), it gets automatically and safely returned (the file is closed for you).
  • The mode ('r', 'w', 'a') is just you telling the librarian your plan. "I'm just here to read this book," or "I want to take this blank notebook and write in it."

The with statement is a safety feature that prevents you from accidentally leaving a file open, which can cause problems.

Practice ✍️

Create a simple to-do list program. Write a script that asks the user for a task and appends it as a new line to a file called todo.txt. Then, write a second script that reads todo.txt and prints each task as a numbered list.

Click to see a sample script
# add_task.py
task = input("What task do you want to add? ")
# 'a' mode appends to the end of the file
with open('todo.txt', 'a') as f:
    f.write(task + '\n')
print("Task added!")

# ---------------------------------------------
# read_tasks.py
print("\n--- Your To-Do List ---")
try:
    with open('todo.txt', 'r') as f:
        for i, line in enumerate(f, start=1):
            # .strip() removes whitespace and newlines
            print(f"{i}. {line.strip()}")
except FileNotFoundError:
    print("No tasks yet. Run add_task.py to add one!")

Day 17: Speaking Spreadsheet (Working with CSV)

Objectives

CSV (Comma-Separated Values) is a super common file format for storing table data, like you'd find in a spreadsheet. While you could try to read a .csv file and split each line by commas, you'll run into trouble as soon as your data itself contains commas! The better way is to use Python's built-in csv module, which is designed to handle all these tricky edge cases for you.

The csv.reader object lets you loop through a CSV file, and it gives you each row as a list of strings. This is good, but it's even better if your file has a header row with column names. For that, you can use csv.DictReader. This magical tool gives you each row as a dictionary, where the keys are the column headers. This makes your code much more readable, as you can access data with row['UserName'] instead of the more cryptic row[0].

Writing CSV files is just as easy with csv.writer and csv.DictWriter. This module is a must-know for any data-related task, from processing sales reports to analyzing user data exported from a web service.
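
As a quick sketch, writing with csv.DictWriter mirrors DictReader: you tell it the column names up front (the file name and rows here are just examples):

import csv

rows = [
    {'player': 'Zelda', 'score': 8500},
    {'player': 'Mario', 'score': 9500},
]

with open('scores.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=['player', 'score'])
    writer.writeheader()    # writes the header row from fieldnames
    writer.writerows(rows)  # each dictionary becomes one row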

🤯 Why not just `.split(',')`? What's the `csv` module for?

The csv module is like a professional translator for spreadsheet data. 🌐

Imagine your data is: John Doe,"New York, NY",35

  • If you just split by the comma, you'd get ['John Doe', '"New York', ' NY"', '35']. That's a mess!
  • The csv module is a smart translator that understands the "grammar" of CSVs. It knows that the comma inside the quotes is part of the city name and shouldn't be used as a separator.
  • It will correctly give you: ['John Doe', 'New York, NY', '35'].

It handles all the tricky rules for you, so you can focus on working with your clean data.
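
You can check this yourself with a tiny sketch (it parses the example line from an in-memory string using io.StringIO, so no file is needed):

import csv
import io

line = 'John Doe,"New York, NY",35'

# Naive split: the quoted city gets chopped in half
print(line.split(','))                      # ['John Doe', '"New York', ' NY"', '35']

# csv.reader understands the quoting rules
print(next(csv.reader(io.StringIO(line))))  # ['John Doe', 'New York, NY', '35']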

Practice ✍️

Create a simple CSV file named scores.csv with the headers player,score. Add a few rows of data. Write a Python script that reads this file and finds the player with the highest score.

Click to view a sample script
# csv_highscore.py
import csv

# First, let's create the dummy CSV file for the example
with open('scores.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['player', 'score'])
    writer.writerow(['Zelda', '8500'])
    writer.writerow(['Mario', '9500'])
    writer.writerow(['Luigi', '9000'])

highest_score = -1
best_player = ""

with open('scores.csv', 'r', newline='') as f:
    # Use DictReader for easy access by column name
    reader = csv.DictReader(f)
    for row in reader:
        score = int(row['score'])
        if score > highest_score:
            highest_score = score
            best_player = row['player']

print(f"The high score holder is {best_player} with {highest_score} points!")

Day 18: The Language of the Web (JSON and APIs)

Objectives

Most of the data you get from the internet comes in a format called JSON (JavaScript Object Notation). It's become the universal language for web services. The best part? It looks almost exactly like Python dictionaries and lists! This makes it incredibly easy to work with in Python. A JSON "object" is just like a Python dictionary, and a JSON "array" is just like a Python list.

Python's built-in json module is your translator. You use json.load() to read a JSON file and turn it into a Python dictionary or list, and json.dump() to write a Python dictionary or list out to a JSON file. It's that simple!
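
For a quick feel for the round trip, here is a minimal sketch; it uses json.dumps() and json.loads(), the string versions of dump() and load(), so it runs without any files (the data is made up):

import json

settings = {"theme": "dark", "font_size": 12, "plugins": ["linter", "spellcheck"]}

# dumps() turns Python data into a JSON string (dump() writes it to a file instead)
text = json.dumps(settings, indent=4)
print(text)

# loads() turns a JSON string back into Python data (load() reads from a file instead)
restored = json.loads(text)
print(restored["plugins"][0])  # linter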

So where do you get this JSON data? From an API (Application Programming Interface). Think of an API as a "menu" that a website provides for computer programs. Instead of scraping a web page, you can make a clean request to an API endpoint (a specific URL) and get back structured JSON data. For this, you'd typically use a library like requests to handle the web communication. This lets you pull live data into your scripts, like weather forecasts, stock prices, or social media posts.
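
A minimal sketch of what that usually looks like is below; note that requests is a third-party package (pip install requests), and the URL is a made-up placeholder, not a real endpoint:

import requests

# Hypothetical endpoint -- check the API's documentation for the real URL
url = "https://api.example.com/weather?city=London"

response = requests.get(url, timeout=10)
response.raise_for_status()  # stop here if the server returned an error status
data = response.json()       # parse the JSON body into dictionaries and lists
print(data)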

🤯 What's an API? And what is JSON?

An API is like a restaurant waiter for data. 👨‍🍳

  • You don't go into the restaurant's kitchen (the database) yourself. That would be messy and unsafe.
  • Instead, you look at a menu (the API documentation) which tells you what you can order.
  • You give your order ("I'd like today's weather for London") to the waiter (the API URL).
  • The waiter goes to the kitchen and brings you back your food, neatly arranged on a plate.

JSON is the plate. It's the standard, predictable format that the waiter uses to serve you your data, making it easy for your program to "eat" (process).

Practice ✍️

Create a Python dictionary that represents a user profile (e.g., with keys for name, username, and email). Use the `json` module to write this dictionary to a file called profile.json in a nicely formatted way (hint: use the `indent` argument).

Click to see a sample script
# json_profile_writer.py
import json

user_profile = {
    "name": "Ada Lovelace",
    "username": "ada_lovelace_1815",
    "email": "ada@example.com",
    "active": True,
    "courses": ["Mathematics", "Computer Science"]
}

# 'w' mode to write a new file
with open('profile.json', 'w') as f:
    # indent=4 makes the JSON file human-readable
    json.dump(user_profile, f, indent=4)

print("Profile successfully written to profile.json!")

Day 19: Building Your Own Command-Line Tools

Objectives

So far, we've used `input()` to make our scripts interactive. But professional tools are often run from the command line with arguments, like python my_script.py input.txt --verbose. This makes them much easier to automate. The basic way to access these is with `sys.argv`, which gives you a simple list of strings: the script name first, followed by whatever the user typed after it.
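
For example, a tiny sketch of the raw approach (the script name is just for illustration):

# show_args.py
import sys

# sys.argv is a list of strings: the script name, then the arguments
print(sys.argv)

# Running: python show_args.py input.txt --verbose
# Prints:  ['show_args.py', 'input.txt', '--verbose']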

However, parsing this list yourself is tedious. A much better way is to use the standard library's argparse module. This amazing module lets you define the arguments your script expects. You can specify positional arguments (like a filename) and optional flags (like --verbose or -o output.txt). `argparse` then does all the hard work for you: it parses the user's input, converts values to the right type (like numbers), and even automatically generates a helpful --help message that explains how to use your tool.
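
To make the type-conversion and automatic --help points concrete, here is a small sketch (the argument names are just examples; the practice below builds a fuller tool):

import argparse

parser = argparse.ArgumentParser(description="Show the first lines of a file.")
parser.add_argument("filename", help="The file to read.")
parser.add_argument("--lines", type=int, default=10, help="How many lines to show.")
args = parser.parse_args()

print(f"Would show the first {args.lines} lines of {args.filename}")

# python head_sketch.py report.txt --lines 50   -> args.lines is the integer 50
# python head_sketch.py --help                  -> argparse prints usage automatically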

Learning `argparse` is a huge step toward writing scripts that are not just for you, but are powerful, reusable tools that others can easily use.

🤯 `argparse` seems like a lot. Why not just use `sys.argv`?

argparse is like creating a professional order form for your script. 📝

Imagine your script needs a filename and an optional number.

  • Using `sys.argv` is like the user shouting their order at you: "report.txt 50!". You have to figure out which is which and hope they got the order right.
  • Using `argparse` is like giving them a form: It has a required field labeled "Input File" and an optional field labeled "Line Count" (e.g., --lines 50).

argparse takes the user's command and neatly fills out the form for you. It tells you if they missed a required field and even provides a --help button to show them how to fill out the form correctly. It's much more robust and user-friendly.

Practice ✍️

Write a script greeter.py that takes a person's name as a required command-line argument and an optional --greeting argument that defaults to "Hello". The script should then print the greeting followed by the name. For example, python greeter.py World --greeting "Hi there".

Click to view a sample CLI tool
# greeter.py
import argparse

# 1. Create the parser
parser = argparse.ArgumentParser(description="A friendly greeter program.")

# 2. Add the arguments you expect
parser.add_argument("name", help="The name of the person to greet.")
parser.add_argument("--greeting", default="Hello", help="The greeting to use.")

# 3. Parse the arguments from the command line
args = parser.parse_args()

# 4. Use the parsed arguments in your code
print(f"{args.greeting}, {args.name}!")

# To run from your terminal:
# python greeter.py Alice
# python greeter.py Bob --greeting "Good morning"

Day 20: Turning Data into a Story (Generating Reports)

Objectives

This is where it all comes together! A very common task for a programmer is to take a source of raw data—like a CSV file or an API response—and transform it into a clean, human-readable summary. This is a two-step process: aggregation and presentation.

First, you need to aggregate your data. This means calculating summary statistics like totals, averages, maximums, or counting occurrences. You might loop through your data, adding values to a dictionary to group them by category (like sales per month). Once you have your summary numbers, the next step is presentation. You need to format this information into a clear report. F-strings are your best friend here, allowing you to easily embed your calculated values into a multi-line string.
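
Here is a minimal sketch of that two-step workflow (the sales numbers are made up):

# 1. Aggregation: group totals by category in a dictionary
sales = [("Jan", 120.0), ("Jan", 80.5), ("Feb", 200.0), ("Feb", 50.0)]

totals = {}
for month, amount in sales:
    totals[month] = totals.get(month, 0) + amount

# 2. Presentation: embed the summary values with f-strings
for month, total in totals.items():
    print(f"Sales for {month}: ${total:.2f}")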

You can then print this report to the console or, even better, write it to a file (.txt, .html, etc.) using the file I/O skills you learned on Day 16. The ability to automate the transformation of messy data into a clean summary is one of the most powerful and practical skills you can have.

🤯 This just sounds like a mix of old skills. What's the new idea?

Generating a report is like a news anchor delivering the nightly news. 📰

You're right, there isn't a single "new" command here. The skill is combining everything you've learned into a single, powerful workflow.

  1. The Reporter (File/API Reader): Your code first goes out and gathers the raw facts from a file or a web API.
  2. The Analyst (Aggregation): Back at the newsroom, you analyze the raw numbers, calculating totals and finding trends. This is your for loop or dictionary logic.
  3. The News Anchor (Presentation): Finally, you don't just show the audience the raw data. You use f-strings to present a clear, formatted story: "Total sales this month were $X, an increase of Y%!"

The new skill is seeing how these individual pieces fit together to create a complete, automated solution that turns data into insight.

Practice ✍️

Using the scores.csv file from Day 17's practice, expand your script. Instead of just finding the high score, generate a full text report in a file named report.txt. The report should include the total number of players, the average score, and the name of the player with the highest score.

Click to reveal a sample script
# report_generator.py
import csv

# Assuming scores.csv from Day 17 exists
# Player,Score
# Zelda,8500
# Mario,9500
# Luigi,9000

players = []
scores = []

try:
    with open('scores.csv', 'r', newline='') as f:
        reader = csv.DictReader(f)
        for row in reader:
            players.append(row['player'])
            scores.append(int(row['score']))

    # 1. Aggregation
    player_count = len(players)
    average_score = sum(scores) / player_count
    high_score = max(scores)
    # Find the player who corresponds to the high score
    best_player_index = scores.index(high_score)
    best_player = players[best_player_index]

    # 2. Presentation
    report_content = (
        f"--- Gaming Session Report ---\n"
        f"Date: 2025-08-29\n"
        f"-----------------------------\n"
        f"Total Players: {player_count}\n"
        f"Average Score: {average_score:.0f}\n"
        f"High Score: {best_player} with {high_score} points!\n"
    )
    
    # 3. Write to file
    with open('report.txt', 'w') as f_out:
        f_out.write(report_content)
    
    print("Report successfully generated to report.txt")

except FileNotFoundError:
    print("Error: scores.csv not found.")
except (ValueError, ZeroDivisionError):
    print("Error: Could not process data in scores.csv. Is it empty or corrupt?")

Further resources