Lesson 13: File I/O - Working with Files (Part 1)

Learning Objectives

After completing this lesson, you will be able to:

  • ✅ Open and close files with open()
  • ✅ Read files using the different read methods
  • ✅ Write data to files
  • ✅ Use the with statement
  • ✅ Understand file modes and encoding

What Is File I/O?

File I/O (Input/Output) means reading from and writing to files on disk.

Why do we need File I/O?

  • Data Persistence - Keep data after the program exits
  • Configuration - Read settings
  • Data Processing - Process large datasets
  • Logging - Write logs
  • Data Exchange - Share data between programs
    # Without files - data lost
    user_data = {"name": "Alice", "age": 25}
    # Program exits → data gone!

    # With files - data persisted
    import json

    with open('user.json', 'w') as f:
        json.dump(user_data, f)
    # Data saved, can use later!

Opening Files - open()

Basic Syntax

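    # Syntax: open(filename, mode, encoding=...)
    # Note: pass encoding as a keyword argument; the third
    # positional parameter of open() is buffering, not encoding.

    # Open file for reading
    file = open('data.txt', 'r')

    # Use file
    content = file.read()
    print(content)

    # ⚠️ Always close file
    file.close()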

File Modes

    # Reading modes
    'r'   # Read (default) - error if file doesn't exist
    'r+'  # Read and write

    # Writing modes
    'w'   # Write - creates new or overwrites existing
    'w+'  # Write and read - overwrites

    # Appending modes
    'a'   # Append - creates new or adds to existing
    'a+'  # Append and read

    # Binary modes
    'rb'  # Read binary
    'wb'  # Write binary
    'ab'  # Append binary

    # Example
    file = open('data.txt', 'r')   # Read only
    file = open('data.txt', 'w')   # Write (overwrite)
    file = open('data.txt', 'a')   # Append
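The combined modes ('r+', 'w+', 'a+') are easiest to understand by trying them. The following is a minimal sketch rather than part of the original lesson: demo_modes and the file name modes_demo.txt are hypothetical names chosen for illustration, and seek(0) is used to jump back to the start of the file before reading.

```python
import os


def demo_modes(directory):
    """Exercise 'r', 'w+', 'a+' and 'r+' on one scratch file (hypothetical helper)."""
    path = os.path.join(directory, 'modes_demo.txt')
    results = {}

    # 'r' on a file that does not exist raises FileNotFoundError
    try:
        open(path, 'r', encoding='utf-8')
        results['r_missing'] = 'opened'
    except FileNotFoundError:
        results['r_missing'] = 'FileNotFoundError'

    # 'w+' creates (or truncates) the file and also allows reading
    with open(path, 'w+', encoding='utf-8') as f:
        f.write('first\n')
        f.seek(0)  # go back to the start before reading
        results['w_plus'] = f.read()

    # 'a+' keeps existing content, writes at the end, and allows reading
    with open(path, 'a+', encoding='utf-8') as f:
        f.write('second\n')
        f.seek(0)
        results['a_plus'] = f.read()

    # 'r+' keeps existing content and starts writing at the beginning
    with open(path, 'r+', encoding='utf-8') as f:
        f.write('FIRST')  # overwrites the first five characters
        f.seek(0)
        results['r_plus'] = f.read()

    return results
```

Calling demo_modes on an empty scratch directory shows that 'w+' starts from an empty file, 'a+' preserves what was there, and 'r+' overwrites in place instead of inserting.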

Encoding

    # UTF-8 encoding (recommended for Vietnamese)
    file = open('data.txt', 'r', encoding='utf-8')

    # Different encodings
    file = open('data.txt', 'r', encoding='ascii')
    file = open('data.txt', 'r', encoding='latin-1')

    # Default encoding (platform dependent)
    file = open('data.txt', 'r')  # May cause issues with Vietnamese

    # Best practice - always specify UTF-8
    file = open('data.txt', 'r', encoding='utf-8')
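To see why the encoding matters, the sketch below writes Vietnamese text as UTF-8 and then reads it back with three different encodings. The function name encoding_demo and the sample sentence are assumptions made for illustration only.

```python
def encoding_demo(path):
    """Write Vietnamese text as UTF-8, then read it back three ways."""
    text = "Xin chào, Việt Nam!"

    with open(path, 'w', encoding='utf-8') as f:
        f.write(text)

    # Same encoding for reading and writing: text comes back unchanged
    with open(path, 'r', encoding='utf-8') as f:
        utf8_result = f.read()

    # ASCII cannot decode the accented characters, so reading fails
    try:
        with open(path, 'r', encoding='ascii') as f:
            f.read()
        ascii_result = 'decoded'
    except UnicodeDecodeError:
        ascii_result = 'UnicodeDecodeError'

    # latin-1 never raises, but silently produces garbled text
    with open(path, 'r', encoding='latin-1') as f:
        latin1_result = f.read()

    return text, utf8_result, ascii_result, latin1_result
```

The latin-1 case is the more dangerous one in practice: no exception is raised, so the corrupted text can go unnoticed.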

Reading Files

read() - Read All Content

    # Read entire file as string
    file = open('data.txt', 'r', encoding='utf-8')
    content = file.read()
    print(content)
    file.close()

    # Read limited characters
    file = open('data.txt', 'r', encoding='utf-8')
    content = file.read(100)  # Read first 100 characters
    print(content)
    file.close()
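Because read(size) continues from where the previous call stopped and returns an empty string at the end of the file, it can be called in a loop to process a file in fixed-size pieces. A minimal sketch, assuming a hypothetical helper named read_in_chunks:

```python
def read_in_chunks(filename, chunk_size=100):
    """Yield successive pieces of at most chunk_size characters."""
    with open(filename, 'r', encoding='utf-8') as file:
        while True:
            chunk = file.read(chunk_size)
            if not chunk:  # '' means end of file
                break
            yield chunk
```

This is useful for large files that have no line breaks, where iterating line by line would still load everything at once.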

readline() - Read One Line

    # Read single line
    file = open('data.txt', 'r', encoding='utf-8')

    line1 = file.readline()
    print(line1)  # First line with \n

    line2 = file.readline()
    print(line2)  # Second line

    file.close()

    # Read all lines one by one
    file = open('data.txt', 'r', encoding='utf-8')
    line = file.readline()
    while line:
        print(line.strip())  # Remove \n
        line = file.readline()
    file.close()

readlines() - Read All Lines as List

    # Read all lines into list
    file = open('data.txt', 'r', encoding='utf-8')
    lines = file.readlines()
    file.close()

    print(lines)  # ['line1\n', 'line2\n', 'line3\n']

    # Process each line
    for line in lines:
        print(line.strip())  # Remove \n

Iterating Over File

    # Most Pythonic way - iterate directly
    file = open('data.txt', 'r', encoding='utf-8')

    for line in file:
        print(line.strip())

    file.close()

    # Memory efficient - processes line by line
    # Good for large files

Writing Files

write() - Write String

    # Write to file (overwrites)
    file = open('output.txt', 'w', encoding='utf-8')

    file.write("Hello, World!\n")
    file.write("This is line 2\n")
    file.write("This is line 3\n")

    file.close()

    # File content:
    # Hello, World!
    # This is line 2
    # This is line 3

writelines() - Write List of Strings

    # Write multiple lines
    file = open('output.txt', 'w', encoding='utf-8')

    lines = [
        "Line 1\n",
        "Line 2\n",
        "Line 3\n"
    ]

    file.writelines(lines)
    file.close()

    # ⚠️ writelines doesn't add \n automatically
    lines = ["Line 1", "Line 2", "Line 3"]
    file = open('output.txt', 'w', encoding='utf-8')
    file.writelines(lines)  # Line 1Line 2Line 3 (no newlines!)
    file.close()

    # ✅ Add \n manually
    lines = [f"{line}\n" for line in ["Line 1", "Line 2", "Line 3"]]
    file = open('output.txt', 'w', encoding='utf-8')
    file.writelines(lines)
    file.close()

Append Mode

    # Write mode - overwrites
    file = open('data.txt', 'w', encoding='utf-8')
    file.write("First write\n")
    file.close()

    file = open('data.txt', 'w', encoding='utf-8')
    file.write("Second write\n")  # First content gone!
    file.close()
    # File content: Second write

    # Append mode - adds to end
    file = open('data.txt', 'a', encoding='utf-8')
    file.write("First append\n")
    file.close()

    file = open('data.txt', 'a', encoding='utf-8')
    file.write("Second append\n")  # Added to end
    file.close()
    # File content (the existing line is kept):
    # Second write
    # First append
    # Second append

with Statement - Context Manager

Best practice: use with so that the file is closed automatically.

Why Use with?

    # ❌ Without with - easy to forget close
    file = open('data.txt', 'r')
    content = file.read()
    # ... lots of code
    # Forgot to close! File handle leaks

    # ❌ Without with - error handling issues
    file = open('data.txt', 'r')
    content = file.read()
    # Error occurs here!
    file.close()  # Never executed!

    # ✅ With statement - auto closes
    with open('data.txt', 'r', encoding='utf-8') as file:
        content = file.read()
        # Use content
    # File automatically closed here, even if error occurs!
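The claim that with closes the file even when an error occurs can be checked through the file object's closed attribute. The sketch below is illustrative only; closed_after_with is a hypothetical name and the ValueError is raised on purpose to simulate a failure.

```python
def closed_after_with(path):
    """Report file.closed inside a with block, after it, and after an error."""
    with open(path, 'w', encoding='utf-8') as f:
        f.write('x')
        inside = f.closed        # still open inside the block
    after = f.closed             # closed as soon as the block ends

    try:
        with open(path, 'r', encoding='utf-8') as g:
            raise ValueError("simulated failure")
    except ValueError:
        after_error = g.closed   # closed even though the block raised

    return inside, after, after_error
```

The function returns (False, True, True): the file is open only while the with block is running.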

Using with

    # Reading with 'with'
    with open('data.txt', 'r', encoding='utf-8') as file:
        content = file.read()
        print(content)
    # File closed automatically

    # Writing with 'with'
    with open('output.txt', 'w', encoding='utf-8') as file:
        file.write("Hello, World!\n")
        file.write("Line 2\n")
    # File closed and flushed automatically

    # Multiple files
    with open('input.txt', 'r', encoding='utf-8') as infile, \
         open('output.txt', 'w', encoding='utf-8') as outfile:
        content = infile.read()
        outfile.write(content.upper())
    # Both files closed automatically

File Methods Inside with

    # Read all
    with open('data.txt', 'r', encoding='utf-8') as file:
        content = file.read()
        print(content)

    # Read line by line
    with open('data.txt', 'r', encoding='utf-8') as file:
        for line in file:
            print(line.strip())

    # Read all lines as list
    with open('data.txt', 'r', encoding='utf-8') as file:
        lines = file.readlines()
        for line in lines:
            print(line.strip())

    # Write multiple lines
    with open('output.txt', 'w', encoding='utf-8') as file:
        file.write("Line 1\n")
        file.write("Line 2\n")
        file.write("Line 3\n")

File Operations Best Practices

Check File Exists

    import os

    # Check if file exists
    if os.path.exists('data.txt'):
        with open('data.txt', 'r', encoding='utf-8') as file:
            content = file.read()
    else:
        print("File not found!")

    # Check that the path exists and is a file (not a directory)
    if os.path.isfile('data.txt'):
        print("File exists")

    # Check if directory
    if os.path.isdir('my_folder'):
        print("Directory exists")

Error Handling

    # Handle file not found
    try:
        with open('data.txt', 'r', encoding='utf-8') as file:
            content = file.read()
    except FileNotFoundError:
        print("File not found!")
    except PermissionError:
        print("No permission to read file!")
    except Exception as e:
        print(f"Error: {e}")

    # Safer approach
    def read_file(filename):
        """
        Safely read file.

        Returns:
            str: File content or None if error
        """
        try:
            with open(filename, 'r', encoding='utf-8') as file:
                return file.read()
        except FileNotFoundError:
            print(f"File not found: {filename}")
            return None
        except Exception as e:
            print(f"Error reading file: {e}")
            return None

    content = read_file('data.txt')
    if content is not None:  # an empty file returns '', which is falsy
        print(content)

File Paths

    import os

    # Absolute path
    file = open('/Users/username/data.txt', 'r')

    # Relative path (from current directory)
    file = open('data.txt', 'r')
    file = open('./data.txt', 'r')
    file = open('../data.txt', 'r')  # Parent directory

    # Join paths (cross-platform)
    path = os.path.join('folder', 'subfolder', 'file.txt')
    # Windows: folder\subfolder\file.txt
    # Unix: folder/subfolder/file.txt

    with open(path, 'r', encoding='utf-8') as file:
        content = file.read()

    # Get current directory
    current_dir = os.getcwd()
    print(current_dir)

    # Build path from current directory
    filepath = os.path.join(current_dir, 'data', 'input.txt')

Practical Examples

1. Read and Process Log File

    """
    Read log file and count error messages.

    Log format:
    2025-10-27 10:30:15 INFO: Application started
    2025-10-27 10:30:20 ERROR: Connection failed
    2025-10-27 10:30:25 INFO: Retrying...
    """

    def analyze_log(filename):
        """
        Analyze log file.

        Returns:
            dict: Statistics
        """
        stats = {
            'total_lines': 0,
            'info': 0,
            'error': 0,
            'warning': 0
        }

        try:
            with open(filename, 'r', encoding='utf-8') as file:
                for line in file:
                    stats['total_lines'] += 1

                    if 'INFO' in line:
                        stats['info'] += 1
                    elif 'ERROR' in line:
                        stats['error'] += 1
                    elif 'WARNING' in line:
                        stats['warning'] += 1

            return stats

        except FileNotFoundError:
            print(f"Log file not found: {filename}")
            return None

    # Usage
    stats = analyze_log('app.log')
    if stats:
        print(f"Total lines: {stats['total_lines']}")
        print(f"INFO: {stats['info']}")
        print(f"ERROR: {stats['error']}")
        print(f"WARNING: {stats['warning']}")

2. File Copy Utility

    """
    Copy file content from source to destination.
    """

    def copy_file(source, destination):
        """
        Copy file.

        Args:
            source (str): Source file path
            destination (str): Destination file path

        Returns:
            bool: True if success
        """
        try:
            with open(source, 'r', encoding='utf-8') as src:
                content = src.read()

            with open(destination, 'w', encoding='utf-8') as dst:
                dst.write(content)

            print(f"Copied {source} to {destination}")
            return True

        except FileNotFoundError:
            print(f"Source file not found: {source}")
            return False
        except Exception as e:
            print(f"Error copying file: {e}")
            return False

    # Usage
    copy_file('original.txt', 'backup.txt')

    # Copy line by line (memory efficient for large files)
    def copy_file_efficient(source, destination):
        """Copy file line by line."""
        try:
            with open(source, 'r', encoding='utf-8') as src, \
                 open(destination, 'w', encoding='utf-8') as dst:

                for line in src:
                    dst.write(line)

            return True
        except Exception as e:
            print(f"Error: {e}")
            return False

3. Word Counter

    """
    Count words in text file.
    """

    def count_words(filename):
        """
        Count total words in file.

        Args:
            filename (str): File path

        Returns:
            int: Word count or None if error
        """
        try:
            with open(filename, 'r', encoding='utf-8') as file:
                content = file.read()
                words = content.split()
                return len(words)

        except FileNotFoundError:
            print(f"File not found: {filename}")
            return None

    def word_frequency(filename):
        """
        Count frequency of each word.

        Returns:
            dict: Word frequencies
        """
        try:
            with open(filename, 'r', encoding='utf-8') as file:
                content = file.read().lower()
                words = content.split()

                freq = {}
                for word in words:
                    # Remove punctuation
                    word = word.strip('.,!?;:"\'')
                    if word:
                        freq[word] = freq.get(word, 0) + 1

                return freq

        except FileNotFoundError:
            print(f"File not found: {filename}")
            return None

    # Usage
    total = count_words('document.txt')
    if total is not None:  # None means the file could not be read
        print(f"Total words: {total}")

    freq = word_frequency('document.txt')
    if freq:
        # Top 10 most common words
        sorted_words = sorted(freq.items(), key=lambda x: x[1], reverse=True)
        print("\nTop 10 words:")
        for word, count in sorted_words[:10]:
            print(f"{word}: {count}")

4. Simple Database (Text File)

    """
    Simple user database using text file.

    Format: username|email|age
    """

    def add_user(filename, username, email, age):
        """Add user to database."""
        try:
            with open(filename, 'a', encoding='utf-8') as file:
                file.write(f"{username}|{email}|{age}\n")
            return True
        except Exception as e:
            print(f"Error: {e}")
            return False

    def get_all_users(filename):
        """Get all users from database."""
        users = []

        try:
            with open(filename, 'r', encoding='utf-8') as file:
                for line in file:
                    parts = line.strip().split('|')
                    if len(parts) == 3:
                        user = {
                            'username': parts[0],
                            'email': parts[1],
                            'age': int(parts[2])
                        }
                        users.append(user)

            return users

        except FileNotFoundError:
            return []

    def find_user(filename, username):
        """Find user by username."""
        users = get_all_users(filename)
        for user in users:
            if user['username'] == username:
                return user
        return None

    # Usage
    DB_FILE = 'users.txt'

    # Add users
    add_user(DB_FILE, 'alice', '[email protected]', 25)
    add_user(DB_FILE, 'bob', '[email protected]', 30)

    # Get all users
    users = get_all_users(DB_FILE)
    print(f"Total users: {len(users)}")
    for user in users:
        print(user)

    # Find specific user
    user = find_user(DB_FILE, 'alice')
    if user:
        print(f"Found: {user}")

5. Configuration File Manager

    """
    Manage configuration file.

    Format: key=value
    """

    def load_config(filename):
        """
        Load configuration from file.

        Returns:
            dict: Configuration settings
        """
        config = {}

        try:
            with open(filename, 'r', encoding='utf-8') as file:
                for line in file:
                    line = line.strip()

                    # Skip empty lines and comments
                    if not line or line.startswith('#'):
                        continue

                    # Parse key=value
                    if '=' in line:
                        key, value = line.split('=', 1)
                        config[key.strip()] = value.strip()

            return config

        except FileNotFoundError:
            print(f"Config file not found: {filename}")
            return {}

    def save_config(filename, config):
        """Save configuration to file."""
        try:
            with open(filename, 'w', encoding='utf-8') as file:
                for key, value in config.items():
                    file.write(f"{key}={value}\n")
            return True
        except Exception as e:
            print(f"Error saving config: {e}")
            return False

    def get_config(config, key, default=None):
        """Get config value with default."""
        return config.get(key, default)

    # Usage
    CONFIG_FILE = 'app.config'

    # Load config
    config = load_config(CONFIG_FILE)

    # Get values
    app_name = get_config(config, 'APP_NAME', 'My App')
    debug = get_config(config, 'DEBUG', 'False') == 'True'
    port = int(get_config(config, 'PORT', '8000'))

    print(f"App: {app_name}")
    print(f"Debug: {debug}")
    print(f"Port: {port}")

    # Update config
    config['VERSION'] = '2.0.0'
    config['DEBUG'] = 'True'
    save_config(CONFIG_FILE, config)

Common Pitfalls

    # ❌ Forgetting to close file
    file = open('data.txt', 'r')
    content = file.read()
    # Missing: file.close()

    # ✅ Use with statement
    with open('data.txt', 'r') as file:
        content = file.read()

    # ❌ Wrong encoding
    with open('vietnamese.txt', 'r') as file:  # May fail!
        content = file.read()

    # ✅ Specify UTF-8
    with open('vietnamese.txt', 'r', encoding='utf-8') as file:
        content = file.read()

    # ❌ Reading large file at once
    with open('large.txt', 'r') as file:
        content = file.read()  # Memory issue!

    # ✅ Process line by line
    with open('large.txt', 'r') as file:
        for line in file:
            process(line)  # process() stands for your own per-line logic

    # ❌ Not handling errors
    with open('data.txt', 'r') as file:
        content = file.read()  # Crashes if file not found!

    # ✅ Handle exceptions
    try:
        with open('data.txt', 'r') as file:
            content = file.read()
    except FileNotFoundError:
        print("File not found!")

Practice Exercises

Exercise 1: File Reader

Write a function read_file_safe(filename):

  • Read the file with error handling
  • Return the content, or None on error
  • Print descriptive error messages

Exercise 2: File Statistics

Write a function file_stats(filename):

  • Count lines, words, characters
  • Return a dict with the statistics
  • Handle empty files
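
One possible approach to this exercise is sketched below; try it yourself before reading on. The dictionary keys ('lines', 'words', 'characters') and the choice to return None for a missing file are assumptions, since the exercise does not fix them.

```python
def file_stats(filename):
    """Count lines, words, and characters; return None if the file is missing."""
    stats = {'lines': 0, 'words': 0, 'characters': 0}
    try:
        with open(filename, 'r', encoding='utf-8') as file:
            for line in file:
                stats['lines'] += 1
                stats['words'] += len(line.split())
                stats['characters'] += len(line)  # includes the trailing \n
    except FileNotFoundError:
        print(f"File not found: {filename}")
        return None
    return stats
```

An empty file never enters the loop, so it yields all-zero counts without any special handling.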

Exercise 3: File Filter

Write a function filter_lines(input_file, output_file, keyword):

  • Read input file
  • Write lines containing keyword to output
  • Count filtered lines
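
A minimal sketch of one way to solve this, assuming a case-sensitive substring match and a return value of None when the input file is missing (neither is specified by the exercise):

```python
def filter_lines(input_file, output_file, keyword):
    """Copy lines containing keyword to output_file; return how many matched."""
    count = 0
    try:
        with open(input_file, 'r', encoding='utf-8') as src, \
             open(output_file, 'w', encoding='utf-8') as dst:
            for line in src:
                if keyword in line:  # case-sensitive match (assumption)
                    dst.write(line)
                    count += 1
    except FileNotFoundError:
        print(f"Input file not found: {input_file}")
        return None
    return count
```

Both files are opened in one with statement and the input is read line by line, so the function also works for large files.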

Exercise 4: File Merge

Write a function merge_files(file_list, output_file):

  • Read multiple files
  • Merge content into one file
  • Add separator between files
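
The following sketch shows one possible solution. The separator parameter, its default value, and the decision to skip missing input files are assumptions added for illustration; the output file is assumed not to appear in file_list.

```python
def merge_files(file_list, output_file, separator="-" * 40 + "\n"):
    """Merge the readable files in file_list into output_file.

    Returns the number of files that were merged.
    """
    merged = 0
    with open(output_file, 'w', encoding='utf-8') as dst:
        for filename in file_list:
            try:
                with open(filename, 'r', encoding='utf-8') as src:
                    content = src.read()
            except FileNotFoundError:
                print(f"Skipping missing file: {filename}")
                continue

            if merged > 0:
                dst.write(separator)  # only between files, not before the first
            dst.write(content)
            merged += 1
    return merged
```

Writing the separator only when at least one file has already been merged avoids a stray separator at the start or end of the output.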

Exercise 5: Todo List Manager

Create todo list app:

  • Add todo to file
  • List all todos
  • Mark todo as done
  • Delete todo
  • Use text file for storage

Summary

✅ open(filename, mode, encoding=...): Open a file
✅ Modes: 'r' (read), 'w' (write), 'a' (append)
✅ read(): Read all content
✅ readline(): Read one line
✅ readlines(): Read all lines as list
✅ write(string): Write string
✅ with statement: Auto close file
✅ Always use encoding='utf-8' for Vietnamese

Next Lesson

Lesson 13.2: File I/O (Part 2) - CSV files, JSON files, binary files, and file system operations.


Remember:

  • Always use with statement!
  • Specify encoding='utf-8'
  • Handle exceptions (FileNotFoundError)
  • Close files or use context manager
  • Process large files line by line!