Lesson 13: File I/O - Working With Files (Part 1)
Lesson Objectives
After completing this lesson, you will be able to:
- ✅ Open and close files with open()
- ✅ Read files with different methods
- ✅ Write data to files
- ✅ Use the with statement
- ✅ Understand file modes and encoding
What Is File I/O?
File I/O (Input/Output) means reading from and writing to files on disk.
Why do you need File I/O?
- ✅ Data Persistence - Store data long-term
- ✅ Configuration - Read settings
- ✅ Data Processing - Process large datasets
- ✅ Logging - Write logs
- ✅ Data Exchange - Share data between programs
# Without files - data lost
user_data = {"name": "Alice", "age": 25}
# Program exits → data gone!

# With files - data persisted
import json
with open('user.json', 'w') as f:
    json.dump(user_data, f)
# Data saved, can use later!
Opening Files - open()
Basic Syntax
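# Syntax: open(filename, mode, encoding=...)
# Pass encoding as a keyword argument: the third positional parameter of open() is buffering.

# Open file for reading
file = open('data.txt', 'r')

# Use file
content = file.read()
print(content)

# ⚠️ Always close file
file.close()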
File Modes
# Reading modes
'r'   # Read (default) - error if file doesn't exist
'r+'  # Read and write

# Writing modes
'w'   # Write - creates new or overwrites existing
'w+'  # Write and read - overwrites

# Appending modes
'a'   # Append - creates new or adds to existing
'a+'  # Append and read

# Binary modes
'rb'  # Read binary
'wb'  # Write binary
'ab'  # Append binary

# Example
file = open('data.txt', 'r')  # Read only
file = open('data.txt', 'w')  # Write (overwrite)
file = open('data.txt', 'a')  # Append
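The difference between 'r', 'w', and 'a' is easiest to see by running them against the same file. The following is a minimal sketch, not part of the original lesson: it works in a temporary directory, and the file name modes_demo.txt is a hypothetical name chosen for illustration.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "modes_demo.txt")  # hypothetical file name

# 'r' on a missing file raises FileNotFoundError.
try:
    open(path, "r", encoding="utf-8")
    missing_raised = False
except FileNotFoundError:
    missing_raised = True

# 'w' creates the file, and truncates it again on every later open.
with open(path, "w", encoding="utf-8") as f:
    f.write("first\n")
with open(path, "w", encoding="utf-8") as f:
    f.write("second\n")  # "first" is gone

# 'a' keeps the existing content and writes at the end.
with open(path, "a", encoding="utf-8") as f:
    f.write("third\n")

with open(path, "r", encoding="utf-8") as f:
    result = f.read()

print(missing_raised)  # True
print(repr(result))    # 'second\nthird\n'
```

Only the last 'w' write survives, while the 'a' write is added after it.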
Encoding
# UTF-8 encoding (recommended for Vietnamese)
file = open('data.txt', 'r', encoding='utf-8')

# Different encodings
file = open('data.txt', 'r', encoding='ascii')
file = open('data.txt', 'r', encoding='latin-1')

# Default encoding (platform dependent)
file = open('data.txt', 'r')  # May cause issues with Vietnamese

# Best practice - always specify UTF-8
file = open('data.txt', 'r', encoding='utf-8')
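To see why the encoding matters, the sketch below writes a Vietnamese string as UTF-8 and then reads the same bytes back with three different encodings. It is an illustration under assumptions: the sample text and the file name vietnamese_demo.txt are made up for this example.

```python
import os
import tempfile

text = "Xin chào, Việt Nam!"  # sample text, chosen for illustration
path = os.path.join(tempfile.mkdtemp(), "vietnamese_demo.txt")  # hypothetical name

with open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Reading with the same encoding round-trips the text.
with open(path, "r", encoding="utf-8") as f:
    same = f.read()

# ASCII cannot decode the accented characters and raises an error.
try:
    with open(path, "r", encoding="ascii") as f:
        f.read()
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# latin-1 accepts any byte, so it does not fail, but the text comes back garbled.
with open(path, "r", encoding="latin-1") as f:
    garbled = f.read()

print(same == text)     # True
print(ascii_failed)     # True
print(garbled == text)  # False
```

The latin-1 case is the more dangerous one in practice, because the read succeeds silently and the corruption only shows up later.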
Reading Files
read() - Read All Content
# Read entire file as string
file = open('data.txt', 'r', encoding='utf-8')
content = file.read()
print(content)
file.close()

# Read limited characters
file = open('data.txt', 'r', encoding='utf-8')
content = file.read(100)  # Read first 100 characters
print(content)
file.close()
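Each call to read(n) continues from where the previous call stopped, and read() returns an empty string once the end of the file is reached. A small sketch of that behavior, using a temporary file with a hypothetical name and made-up content:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "read_demo.txt")  # hypothetical name

with open(path, "w", encoding="utf-8") as f:
    f.write("abcdefghij")

with open(path, "r", encoding="utf-8") as f:
    first = f.read(4)   # first 4 characters
    second = f.read(4)  # next 4 characters, not the first 4 again
    rest = f.read()     # everything that is left
    at_end = f.read()   # nothing left: empty string

print(first, second, rest, repr(at_end))  # abcd efgh ij ''
```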
readline() - Read One Line
# Read single line
file = open('data.txt', 'r', encoding='utf-8')

line1 = file.readline()
print(line1)  # First line with \n

line2 = file.readline()
print(line2)  # Second line

file.close()

# Read all lines one by one
file = open('data.txt', 'r', encoding='utf-8')
line = file.readline()
while line:
    print(line.strip())  # Remove \n
    line = file.readline()
file.close()
readlines() - Read All Lines as List
# Read all lines into list
file = open('data.txt', 'r', encoding='utf-8')
lines = file.readlines()
file.close()

print(lines)  # ['line1\n', 'line2\n', 'line3\n']

# Process each line
for line in lines:
    print(line.strip())  # Remove \n
Iterating Over File
# Most Pythonic way - iterate directly
file = open('data.txt', 'r', encoding='utf-8')

for line in file:
    print(line.strip())

file.close()

# Memory efficient - processes line by line
# Good for large files
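When iterating, two small variations are often useful: enumerate() gives line numbers, and rstrip("\n") removes only the trailing newline, whereas strip() also removes leading spaces such as indentation. A minimal sketch, with a hypothetical file name and made-up content:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "iterate_demo.txt")  # hypothetical name

with open(path, "w", encoding="utf-8") as f:
    f.write("  indented\nplain\n")

numbered = []
with open(path, "r", encoding="utf-8") as f:
    # enumerate(..., start=1) yields (line_number, line) pairs
    for number, line in enumerate(f, start=1):
        numbered.append((number, line.rstrip("\n")))  # keep leading spaces

print(numbered)  # [(1, '  indented'), (2, 'plain')]
```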
Writing Files
write() - Write String
# Write to file (overwrites)
file = open('output.txt', 'w', encoding='utf-8')

file.write("Hello, World!\n")
file.write("This is line 2\n")
file.write("This is line 3\n")

file.close()

# File content:
# Hello, World!
# This is line 2
# This is line 3
writelines() - Write List of Strings
# Write multiple lines
file = open('output.txt', 'w', encoding='utf-8')

lines = [
    "Line 1\n",
    "Line 2\n",
    "Line 3\n"
]

file.writelines(lines)
file.close()

# ⚠️ writelines doesn't add \n automatically
lines = ["Line 1", "Line 2", "Line 3"]
file = open('output.txt', 'w', encoding='utf-8')
file.writelines(lines)  # Line 1Line 2Line 3 (no newlines!)
file.close()

# ✅ Add \n manually
lines = [f"{line}\n" for line in ["Line 1", "Line 2", "Line 3"]]
file = open('output.txt', 'w', encoding='utf-8')
file.writelines(lines)
file.close()
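Besides adding "\n" to each item, there are two other common ways to write a list of strings as separate lines: joining the items with "\n".join(), or calling print() with its file argument, which appends a newline itself. The sketch below, which uses hypothetical file names in a temporary directory, checks that both produce the same content:

```python
import os
import tempfile

items = ["Line 1", "Line 2", "Line 3"]
workdir = tempfile.mkdtemp()
path_join = os.path.join(workdir, "join_demo.txt")    # hypothetical name
path_print = os.path.join(workdir, "print_demo.txt")  # hypothetical name

# Option 1: join the items with newlines, then add the final newline.
with open(path_join, "w", encoding="utf-8") as f:
    f.write("\n".join(items) + "\n")

# Option 2: print() writes the item followed by a newline.
with open(path_print, "w", encoding="utf-8") as f:
    for item in items:
        print(item, file=f)

with open(path_join, "r", encoding="utf-8") as f:
    joined = f.read()
with open(path_print, "r", encoding="utf-8") as f:
    printed = f.read()

print(joined == printed)  # True
```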
Append Mode
# Write mode - overwrites
file = open('data.txt', 'w', encoding='utf-8')
file.write("First write\n")
file.close()

file = open('data.txt', 'w', encoding='utf-8')
file.write("Second write\n")  # First content gone!
file.close()
# File content: Second write

# Append mode - adds to end
file = open('data.txt', 'a', encoding='utf-8')
file.write("First append\n")
file.close()

file = open('data.txt', 'a', encoding='utf-8')
file.write("Second append\n")  # Added to end
file.close()
# File content (the existing line is kept, new lines are added after it):
# Second write
# First append
# Second append
with Statement - Context Manager
Best practice: use with so the file is closed automatically.
Why Use with?
# ❌ Without with - easy to forget close
file = open('data.txt', 'r')
content = file.read()
# ... lots of code
# Forgot to close! File handle leaks

# ❌ Without with - error handling issues
file = open('data.txt', 'r')
content = file.read()
# Error occurs here!
file.close()  # Never executed!

# ✅ With statement - auto closes
with open('data.txt', 'r', encoding='utf-8') as file:
    content = file.read()
    # Use content
# File automatically closed here, even if error occurs!
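For a file object, the with statement behaves roughly like a try/finally block that always calls close(). The sketch below shows the manual version and then checks the closed attribute after an exception is raised inside a with block. The file name and the simulated ValueError are assumptions made for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")  # hypothetical name
with open(path, "w", encoding="utf-8") as f:
    f.write("hello\n")

# Manual equivalent: close() runs in finally, whether or not read() fails.
file = open(path, "r", encoding="utf-8")
try:
    manual_content = file.read()
finally:
    file.close()

# with closes the file even when the block raises.
try:
    with open(path, "r", encoding="utf-8") as f:
        raise ValueError("simulated failure")  # stands in for any error
except ValueError:
    pass
closed_after_error = f.closed

print(file.closed)         # True
print(closed_after_error)  # True
```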
Using with
# Reading with 'with'
with open('data.txt', 'r', encoding='utf-8') as file:
    content = file.read()
    print(content)
# File closed automatically

# Writing with 'with'
with open('output.txt', 'w', encoding='utf-8') as file:
    file.write("Hello, World!\n")
    file.write("Line 2\n")
# File closed and flushed automatically

# Multiple files
with open('input.txt', 'r', encoding='utf-8') as infile, \
     open('output.txt', 'w', encoding='utf-8') as outfile:
    content = infile.read()
    outfile.write(content.upper())
# Both files closed automatically
File Methods Inside with
# Read all
with open('data.txt', 'r', encoding='utf-8') as file:
    content = file.read()
    print(content)

# Read line by line
with open('data.txt', 'r', encoding='utf-8') as file:
    for line in file:
        print(line.strip())

# Read all lines as list
with open('data.txt', 'r', encoding='utf-8') as file:
    lines = file.readlines()
    for line in lines:
        print(line.strip())

# Write multiple lines
with open('output.txt', 'w', encoding='utf-8') as file:
    file.write("Line 1\n")
    file.write("Line 2\n")
    file.write("Line 3\n")
File Operations Best Practices
Check File Exists
import os

# Check if file exists
if os.path.exists('data.txt'):
    with open('data.txt', 'r', encoding='utf-8') as file:
        content = file.read()
else:
    print("File not found!")

# Check that the path exists and is a file (not a directory)
if os.path.isfile('data.txt'):
    print("File exists")

# Check if directory
if os.path.isdir('my_folder'):
    print("Directory exists")
Error Handling
# Handle file not found
try:
    with open('data.txt', 'r', encoding='utf-8') as file:
        content = file.read()
except FileNotFoundError:
    print("File not found!")
except PermissionError:
    print("No permission to read file!")
except Exception as e:
    print(f"Error: {e}")

# Safer approach
def read_file(filename):
    """
    Safely read file.

    Returns:
        str: File content or None if error
    """
    try:
        with open(filename, 'r', encoding='utf-8') as file:
            return file.read()
    except FileNotFoundError:
        print(f"File not found: {filename}")
        return None
    except Exception as e:
        print(f"Error reading file: {e}")
        return None

content = read_file('data.txt')
if content is not None:  # An empty file returns '', which is not an error
    print(content)
File Paths
import os

# Absolute path
file = open('/Users/username/data.txt', 'r')

# Relative path (from current directory)
file = open('data.txt', 'r')
file = open('./data.txt', 'r')
file = open('../data.txt', 'r')  # Parent directory

# Join paths (cross-platform)
path = os.path.join('folder', 'subfolder', 'file.txt')
# Windows: folder\subfolder\file.txt
# Unix: folder/subfolder/file.txt

with open(path, 'r', encoding='utf-8') as file:
    content = file.read()

# Get current directory
current_dir = os.getcwd()
print(current_dir)

# Build path from current directory
filepath = os.path.join(current_dir, 'data', 'input.txt')
Practical Examples
1. Read and Process Log File
"""
Read log file and count error messages.

Log format:
2025-10-27 10:30:15 INFO: Application started
2025-10-27 10:30:20 ERROR: Connection failed
2025-10-27 10:30:25 INFO: Retrying...
"""

def analyze_log(filename):
    """
    Analyze log file.

    Returns:
        dict: Statistics
    """
    stats = {
        'total_lines': 0,
        'info': 0,
        'error': 0,
        'warning': 0
    }

    try:
        with open(filename, 'r', encoding='utf-8') as file:
            for line in file:
                stats['total_lines'] += 1

                if 'INFO' in line:
                    stats['info'] += 1
                elif 'ERROR' in line:
                    stats['error'] += 1
                elif 'WARNING' in line:
                    stats['warning'] += 1

        return stats
    except FileNotFoundError:
        print(f"Log file not found: {filename}")
        return None

# Usage
stats = analyze_log('app.log')
if stats:
    print(f"Total lines: {stats['total_lines']}")
    print(f"INFO: {stats['info']}")
    print(f"ERROR: {stats['error']}")
    print(f"WARNING: {stats['warning']}")
2. File Copy Utility
"""
Copy file content from source to destination.
"""

def copy_file(source, destination):
    """
    Copy file.

    Args:
        source (str): Source file path
        destination (str): Destination file path

    Returns:
        bool: True if success
    """
    try:
        with open(source, 'r', encoding='utf-8') as src:
            content = src.read()

        with open(destination, 'w', encoding='utf-8') as dst:
            dst.write(content)

        print(f"Copied {source} to {destination}")
        return True
    except FileNotFoundError:
        print(f"Source file not found: {source}")
        return False
    except Exception as e:
        print(f"Error copying file: {e}")
        return False

# Usage
copy_file('original.txt', 'backup.txt')

# Copy line by line (memory efficient for large files)
def copy_file_efficient(source, destination):
    """Copy file line by line."""
    try:
        with open(source, 'r', encoding='utf-8') as src, \
             open(destination, 'w', encoding='utf-8') as dst:
            for line in src:
                dst.write(line)
        return True
    except Exception as e:
        print(f"Error: {e}")
        return False
3. Word Counter
"""
Count words in text file.
"""

def count_words(filename):
    """
    Count total words in file.

    Args:
        filename (str): File path

    Returns:
        int: Word count or None if error
    """
    try:
        with open(filename, 'r', encoding='utf-8') as file:
            content = file.read()
            words = content.split()
            return len(words)
    except FileNotFoundError:
        print(f"File not found: {filename}")
        return None

def word_frequency(filename):
    """
    Count frequency of each word.

    Returns:
        dict: Word frequencies
    """
    try:
        with open(filename, 'r', encoding='utf-8') as file:
            content = file.read().lower()
            words = content.split()

            freq = {}
            for word in words:
                # Remove punctuation
                word = word.strip('.,!?;:"\'')
                if word:
                    freq[word] = freq.get(word, 0) + 1

            return freq
    except FileNotFoundError:
        print(f"File not found: {filename}")
        return None

# Usage
total = count_words('document.txt')
print(f"Total words: {total}")

freq = word_frequency('document.txt')
if freq:
    # Top 10 most common words
    sorted_words = sorted(freq.items(), key=lambda x: x[1], reverse=True)
    print("\nTop 10 words:")
    for word, count in sorted_words[:10]:
        print(f"{word}: {count}")
4. Simple Database (Text File)
"""
Simple user database using text file.

Format: username|email|age
"""

def add_user(filename, username, email, age):
    """Add user to database."""
    try:
        with open(filename, 'a', encoding='utf-8') as file:
            file.write(f"{username}|{email}|{age}\n")
        return True
    except Exception as e:
        print(f"Error: {e}")
        return False

def get_all_users(filename):
    """Get all users from database."""
    users = []
    try:
        with open(filename, 'r', encoding='utf-8') as file:
            for line in file:
                parts = line.strip().split('|')
                if len(parts) == 3:
                    user = {
                        'username': parts[0],
                        'email': parts[1],
                        'age': int(parts[2])
                    }
                    users.append(user)
        return users
    except FileNotFoundError:
        return []

def find_user(filename, username):
    """Find user by username."""
    users = get_all_users(filename)
    for user in users:
        if user['username'] == username:
            return user
    return None

# Usage
DB_FILE = 'users.txt'

# Add users
add_user(DB_FILE, 'alice', '[email protected]', 25)
add_user(DB_FILE, 'bob', '[email protected]', 30)

# Get all users
users = get_all_users(DB_FILE)
print(f"Total users: {len(users)}")
for user in users:
    print(user)

# Find specific user
user = find_user(DB_FILE, 'alice')
if user:
    print(f"Found: {user}")
5. Configuration File Manager
"""
Manage configuration file.

Format: key=value
"""

def load_config(filename):
    """
    Load configuration from file.

    Returns:
        dict: Configuration settings
    """
    config = {}
    try:
        with open(filename, 'r', encoding='utf-8') as file:
            for line in file:
                line = line.strip()

                # Skip empty lines and comments
                if not line or line.startswith('#'):
                    continue

                # Parse key=value
                if '=' in line:
                    key, value = line.split('=', 1)
                    config[key.strip()] = value.strip()

        return config
    except FileNotFoundError:
        print(f"Config file not found: {filename}")
        return {}

def save_config(filename, config):
    """Save configuration to file."""
    try:
        with open(filename, 'w', encoding='utf-8') as file:
            for key, value in config.items():
                file.write(f"{key}={value}\n")
        return True
    except Exception as e:
        print(f"Error saving config: {e}")
        return False

def get_config(config, key, default=None):
    """Get config value with default."""
    return config.get(key, default)

# Usage
CONFIG_FILE = 'app.config'

# Load config
config = load_config(CONFIG_FILE)

# Get values
app_name = get_config(config, 'APP_NAME', 'My App')
debug = get_config(config, 'DEBUG', 'False') == 'True'
port = int(get_config(config, 'PORT', '8000'))

print(f"App: {app_name}")
print(f"Debug: {debug}")
print(f"Port: {port}")

# Update config
config['VERSION'] = '2.0.0'
config['DEBUG'] = 'True'
save_config(CONFIG_FILE, config)
Common Pitfalls
# ❌ Forgetting to close file
file = open('data.txt', 'r')
content = file.read()
# Missing: file.close()

# ✅ Use with statement
with open('data.txt', 'r') as file:
    content = file.read()

# ❌ Wrong encoding
with open('vietnamese.txt', 'r') as file:  # May fail!
    content = file.read()

# ✅ Specify UTF-8
with open('vietnamese.txt', 'r', encoding='utf-8') as file:
    content = file.read()

# ❌ Reading large file at once
with open('large.txt', 'r') as file:
    content = file.read()  # Memory issue!

# ✅ Process line by line
with open('large.txt', 'r') as file:
    for line in file:
        process(line)

# ❌ Not handling errors
with open('data.txt', 'r') as file:
    content = file.read()  # Crashes if file not found!

# ✅ Handle exceptions
try:
    with open('data.txt', 'r') as file:
        content = file.read()
except FileNotFoundError:
    print("File not found!")
Practice Exercises
Exercise 1: File Reader
Write a function read_file_safe(filename):
- Read the file with error handling
- Return the content or None
- Print descriptive error messages
Exercise 2: File Statistics
Write a function file_stats(filename):
- Count lines, words, characters
- Return a dict with the statistics
- Handle empty files
Exercise 3: File Filter
Write a function filter_lines(input_file, output_file, keyword):
- Read input file
- Write lines containing keyword to output
- Count filtered lines
Exercise 4: File Merge
Write a function merge_files(file_list, output_file):
- Read multiple files
- Merge content into one file
- Add separator between files
Exercise 5: Todo List Manager
Create a todo list app:
- Add todo to file
- List all todos
- Mark todo as done
- Delete todo
- Use text file for storage
Summary
✅ open(filename, mode, encoding=...): Open a file
✅ Modes: 'r' (read), 'w' (write), 'a' (append)
✅ read(): Read all content
✅ readline(): Read one line
✅ readlines(): Read all lines as list
✅ write(string): Write string
✅ with statement: Auto close file
✅ Always use encoding='utf-8' for Vietnamese
Next Lesson
Lesson 13.2: File I/O (Part 2) - CSV files, JSON files, binary files, and file system operations.
Remember:
- Always use the with statement!
- Specify encoding='utf-8'
- Handle exceptions (FileNotFoundError)
- Close files or use context manager
- Process large files line by line!