Lesson 3: Generators and Iterators - Iterators
Learning Objectives
After completing this lesson, you will be able to:
- ✅ Understand the iteration protocol in Python
- ✅ Create custom iterators
- ✅ Use iter() and next()
- ✅ Distinguish iterables from iterators
- ✅ Apply iterator patterns
- ✅ Handle the StopIteration exception
Iteration Protocol
The iteration protocol is how Python defines iteration: iter() obtains an iterator from an object, and next() pulls values from that iterator until it raises StopIteration.
Iterables vs Iterators
    # Iterable: an object whose __iter__() method returns an iterator
    # Iterator: an object with both __iter__() and __next__() methods

    # A list is an iterable
    numbers = [1, 2, 3, 4, 5]

    # Get an iterator from the iterable
    iterator = iter(numbers)

    # Use the iterator
    print(next(iterator))  # 1
    print(next(iterator))  # 2
    print(next(iterator))  # 3

    # A for loop uses the iteration protocol automatically
    for num in numbers:
        print(num)  # 1, 2, 3, 4, 5
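The distinction above can also be checked at runtime. The following is a minimal sketch using the abstract base classes in the standard library's collections.abc module; the variable names are chosen only for illustration.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]
iterator = iter(numbers)

# A list is iterable, but it is not itself an iterator (it has no __next__).
list_is_iterable = isinstance(numbers, Iterable)
list_is_iterator = isinstance(numbers, Iterator)

# The object returned by iter() has both __iter__ and __next__.
it_is_iterable = isinstance(iterator, Iterable)
it_is_iterator = isinstance(iterator, Iterator)

print(list_is_iterable, list_is_iterator)  # True False
print(it_is_iterable, it_is_iterator)      # True True
```

Every iterator is also an iterable, which is why an iterator can be used directly in a for loop.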
How For Loop Works
    iterable = ['a', 'b', 'c']  # any iterable works here

    # This for loop:
    for item in iterable:
        print(item)

    # Is equivalent to:
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
            print(item)
        except StopIteration:
            break
Custom Iterators
Basic Iterator
    class CountUp:
        """Iterator that counts from start to end (inclusive)."""

        def __init__(self, start, end):
            self.current = start
            self.end = end

        def __iter__(self):
            """Return iterator object (self)."""
            return self

        def __next__(self):
            """Return next item or raise StopIteration."""
            if self.current > self.end:
                raise StopIteration
            value = self.current
            self.current += 1
            return value

    # Usage
    counter = CountUp(1, 5)
    for num in counter:
        print(num)  # 1, 2, 3, 4, 5

    # Manual iteration
    counter2 = CountUp(10, 12)
    print(next(counter2))  # 10
    print(next(counter2))  # 11
    print(next(counter2))  # 12
    # print(next(counter2))  # raises StopIteration
Iterator with State
    class Fibonacci:
        """Iterator for the Fibonacci sequence."""

        def __init__(self, max_count):
            self.max_count = max_count
            self.count = 0
            self.a = 0
            self.b = 1

        def __iter__(self):
            return self

        def __next__(self):
            if self.count >= self.max_count:
                raise StopIteration
            # Current value
            value = self.a
            # Update state
            self.a, self.b = self.b, self.a + self.b
            self.count += 1
            return value

    # Usage
    fib = Fibonacci(10)
    for num in fib:
        print(num, end=' ')  # 0 1 1 2 3 5 8 13 21 34
    print()

    # Convert to list
    fib_list = list(Fibonacci(8))
    print(fib_list)  # [0, 1, 1, 2, 3, 5, 8, 13]
Infinite Iterator
    class InfiniteCounter:
        """Iterator that counts infinitely (never raises StopIteration)."""

        def __init__(self, start=0, step=1):
            self.current = start
            self.step = step

        def __iter__(self):
            return self

        def __next__(self):
            value = self.current
            self.current += self.step
            return value

    # Usage with break
    counter = InfiniteCounter(0, 2)
    for num in counter:
        if num > 10:
            break
        print(num, end=' ')  # 0 2 4 6 8 10
    print()

    # Usage with itertools.islice
    import itertools

    counter2 = InfiniteCounter(1, 3)
    first_five = list(itertools.islice(counter2, 5))
    print(first_five)  # [1, 4, 7, 10, 13]
Iterable Class
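An iterable differs from an iterator: an iterable creates a new iterator each time it is iterated, so it can be looped over repeatedly, whereas an iterator is consumed once.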
Iterable vs Iterator
    class Range:
        """Iterable that creates a new iterator each time."""

        def __init__(self, start, end):
            self.start = start
            self.end = end

        def __iter__(self):
            """Return a NEW iterator each time."""
            return RangeIterator(self.start, self.end)

    class RangeIterator:
        """Iterator for Range (end is exclusive)."""

        def __init__(self, start, end):
            self.current = start
            self.end = end

        def __iter__(self):
            return self

        def __next__(self):
            if self.current >= self.end:
                raise StopIteration
            value = self.current
            self.current += 1
            return value

    # Usage - can iterate multiple times!
    my_range = Range(1, 5)

    # First iteration
    for num in my_range:
        print(num, end=' ')  # 1 2 3 4
    print()

    # Second iteration works!
    for num in my_range:
        print(num, end=' ')  # 1 2 3 4
    print()

    # Compare with an iterator (CountUp from the Basic Iterator section)
    counter = CountUp(1, 5)
    list(counter)  # [1, 2, 3, 4, 5]
    list(counter)  # [] - exhausted!
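The same behavior can be observed with built-in types. The sketch below, using illustrative variable names, shows that each iter() call on a list returns an independent iterator, while calling iter() on an iterator returns the iterator itself.

```python
numbers = [1, 2, 3]

# Each iter() call on an iterable produces a fresh, independent iterator.
first = iter(numbers)
second = iter(numbers)
independent = first is not second

next(first)  # advance only `first`
second_unaffected = next(second) == 1  # `second` still starts at the beginning

# Calling iter() on an iterator returns that same object.
same_object = iter(first) is first

print(independent, second_unaffected, same_object)  # True True True
```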
Separate Iterator Class
    class Book:
        """Book with iterable chapters."""

        def __init__(self, title):
            self.title = title
            self.chapters = []

        def add_chapter(self, chapter):
            self.chapters.append(chapter)

        def __iter__(self):
            """Return a new iterator over the chapters."""
            return BookIterator(self.chapters)

    class BookIterator:
        """Iterator for book chapters."""

        def __init__(self, chapters):
            self.chapters = chapters
            self.index = 0

        def __iter__(self):
            return self

        def __next__(self):
            if self.index >= len(self.chapters):
                raise StopIteration
            chapter = self.chapters[self.index]
            self.index += 1
            return chapter

    # Usage
    book = Book("Python Advanced")
    book.add_chapter("Decorators")
    book.add_chapter("Generators")
    book.add_chapter("Context Managers")

    for chapter in book:
        print(f"Reading: {chapter}")
    # Reading: Decorators
    # Reading: Generators
    # Reading: Context Managers

    # Can iterate again
    for chapter in book:
        print(f"Review: {chapter}")
Built-in Functions with Iterators
iter() with a Sentinel
    # iter(callable, sentinel): call the callable until it returns the sentinel
    with open('data.txt', 'w') as f:
        f.write("Line 1\nLine 2\nLine 3\n")

    # Read the file line by line; readline() returns '' at end of file
    with open('data.txt', 'r') as f:
        for line in iter(f.readline, ''):
            print(line.strip())
    # Line 1
    # Line 2
    # Line 3

    # Another example: read until a specific value
    def make_input_source():
        """Simulate user input with a fixed sequence of values."""
        values = iter([1, 2, 3, 4, 5, -1])
        return lambda: next(values)

    get_input = make_input_source()

    # Read until -1
    for value in iter(get_input, -1):
        print(value)  # 1, 2, 3, 4, 5

    # Simpler example: a callable object
    class Counter:
        def __init__(self):
            self.count = 0

        def __call__(self):
            self.count += 1
            if self.count > 5:
                return -1
            return self.count

    counter = Counter()
    for num in iter(counter, -1):
        print(num)  # 1, 2, 3, 4, 5
next() with a Default
    # next(iterator, default): return default instead of raising StopIteration
    numbers = iter([1, 2, 3])

    print(next(numbers, 'No more'))  # 1
    print(next(numbers, 'No more'))  # 2
    print(next(numbers, 'No more'))  # 3
    print(next(numbers, 'No more'))  # No more (no exception)
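When calling next() by hand, StopIteration can be handled either by catching it or by passing a default. The sketch below shows both; the helper names first_item and first_item_or, and the _MISSING sentinel, are hypothetical and exist only for this illustration. A unique sentinel object is used because None might be a legitimate item.

```python
_MISSING = object()  # hypothetical sentinel that cannot collide with real items

def first_item(iterable):
    """Return the first item, converting StopIteration into ValueError."""
    try:
        return next(iter(iterable))
    except StopIteration:
        raise ValueError("iterable is empty") from None

def first_item_or(iterable, fallback):
    """Return the first item, or fallback if the iterable is empty."""
    value = next(iter(iterable), _MISSING)
    return fallback if value is _MISSING else value

print(first_item([10, 20]))           # 10
print(first_item_or([], 'empty'))     # empty
print(first_item_or([None], 'empty')) # None (a real item, not the fallback)
```

Catching StopIteration explicitly fits cases where an empty input is an error; the default form fits cases where a fallback value is acceptable.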
Iterator Patterns
1. Reversed Iterator
    class ReversedIterator:
        """Iterate a sequence in reverse."""

        def __init__(self, sequence):
            self.sequence = sequence
            self.index = len(sequence)

        def __iter__(self):
            return self

        def __next__(self):
            if self.index == 0:
                raise StopIteration
            self.index -= 1
            return self.sequence[self.index]

    # Usage
    numbers = [1, 2, 3, 4, 5]
    for num in ReversedIterator(numbers):
        print(num, end=' ')  # 5 4 3 2 1
    print()

    # Python built-in
    for num in reversed(numbers):
        print(num, end=' ')  # 5 4 3 2 1
2. Batch Iterator
    class BatchIterator:
        """Iterate a sequence in batches of the specified size."""

        def __init__(self, data, batch_size):
            self.data = data
            self.batch_size = batch_size
            self.index = 0

        def __iter__(self):
            return self

        def __next__(self):
            if self.index >= len(self.data):
                raise StopIteration
            # Get batch (the last one may be shorter)
            batch = self.data[self.index:self.index + self.batch_size]
            self.index += self.batch_size
            return batch

    # Usage
    data = list(range(1, 11))
    for batch in BatchIterator(data, 3):
        print(batch)
    # [1, 2, 3]
    # [4, 5, 6]
    # [7, 8, 9]
    # [10]
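BatchIterator relies on len() and slicing, so it only accepts sequences. As a possible variation, the sketch below batches any iterable (including other iterators) by pulling items with itertools.islice. The class name BatchAnyIterator is hypothetical and not part of the lesson's code.

```python
from itertools import islice

class BatchAnyIterator:
    """Hypothetical variant: batch any iterable, not only sequences."""

    def __init__(self, iterable, batch_size):
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self.iterator = iter(iterable)
        self.batch_size = batch_size

    def __iter__(self):
        return self

    def __next__(self):
        # islice takes up to batch_size items from the underlying iterator.
        batch = list(islice(self.iterator, self.batch_size))
        if not batch:
            raise StopIteration
        return batch

# Works on a one-shot iterator, which has no len() and cannot be sliced.
batches = list(BatchAnyIterator(iter(range(1, 11)), 3))
print(batches)  # [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
```

On Python 3.12 and later, itertools.batched offers similar behavior as a built-in, yielding tuples instead of lists.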
3. Filtering Iterator
    class FilterIterator:
        """Iterator that yields only items matching a predicate."""

        def __init__(self, iterable, predicate):
            self.iterator = iter(iterable)
            self.predicate = predicate

        def __iter__(self):
            return self

        def __next__(self):
            # StopIteration from the inner next() propagates and ends iteration
            while True:
                value = next(self.iterator)
                if self.predicate(value):
                    return value

    # Usage
    numbers = range(1, 11)
    evens = FilterIterator(numbers, lambda x: x % 2 == 0)

    for num in evens:
        print(num, end=' ')  # 2 4 6 8 10
    print()

    # Python built-in filter
    evens2 = filter(lambda x: x % 2 == 0, range(1, 11))
    print(list(evens2))  # [2, 4, 6, 8, 10]
4. Zip Iterator
    class ZipIterator:
        """Zip multiple iterables; stops at the shortest one."""

        def __init__(self, *iterables):
            self.iterators = [iter(it) for it in iterables]

        def __iter__(self):
            return self

        def __next__(self):
            values = []
            for iterator in self.iterators:
                try:
                    values.append(next(iterator))
                except StopIteration:
                    raise StopIteration
            return tuple(values)

    # Usage
    names = ['Alice', 'Bob', 'Charlie']
    ages = [25, 30, 35]
    cities = ['NYC', 'LA', 'SF']

    for name, age, city in ZipIterator(names, ages, cities):
        print(f"{name}, {age}, {city}")
    # Alice, 25, NYC
    # Bob, 30, LA
    # Charlie, 35, SF

    # Python built-in
    for item in zip(names, ages, cities):
        print(item)
5. Chain Iterator
    class ChainIterator:
        """Chain multiple iterables into one sequence."""

        def __init__(self, *iterables):
            self.iterables = iterables
            self.current_iterable_index = 0
            self.current_iterator = iter(self.iterables[0]) if iterables else iter([])

        def __iter__(self):
            return self

        def __next__(self):
            while True:
                try:
                    return next(self.current_iterator)
                except StopIteration:
                    # Current iterable is exhausted; move on to the next one
                    self.current_iterable_index += 1
                    if self.current_iterable_index >= len(self.iterables):
                        raise StopIteration
                    self.current_iterator = iter(self.iterables[self.current_iterable_index])

    # Usage
    list1 = [1, 2, 3]
    list2 = [4, 5, 6]
    list3 = [7, 8, 9]

    for num in ChainIterator(list1, list2, list3):
        print(num, end=' ')  # 1 2 3 4 5 6 7 8 9
    print()

    # Python built-in
    from itertools import chain
    for num in chain(list1, list2, list3):
        print(num, end=' ')  # 1 2 3 4 5 6 7 8 9
Real-world Examples
1. File Line Iterator
    class FileLineIterator:
        """Iterate file lines with line numbers."""

        def __init__(self, filename):
            self.filename = filename
            self.file = None
            self.line_number = 0

        def __iter__(self):
            self.file = open(self.filename, 'r')
            self.line_number = 0
            return self

        def __next__(self):
            line = self.file.readline()
            if not line:
                self.file.close()
                raise StopIteration
            self.line_number += 1
            return (self.line_number, line.strip())

        def __del__(self):
            # Fallback cleanup if iteration stopped before the end of the file
            if self.file:
                self.file.close()

    # Create test file
    with open('test.txt', 'w') as f:
        f.write("First line\nSecond line\nThird line\n")

    # Usage
    for line_num, content in FileLineIterator('test.txt'):
        print(f"{line_num}: {content}")
    # 1: First line
    # 2: Second line
    # 3: Third line
2. Database Result Iterator
    class DatabaseResults:
        """Iterator for database query results."""

        def __init__(self, query):
            self.query = query
            self.results = self._execute_query()
            self.index = 0

        def _execute_query(self):
            """Simulate database query."""
            # In a real app, this would execute the actual query
            return [
                {'id': 1, 'name': 'Alice', 'age': 25},
                {'id': 2, 'name': 'Bob', 'age': 30},
                {'id': 3, 'name': 'Charlie', 'age': 35},
            ]

        def __iter__(self):
            return self

        def __next__(self):
            if self.index >= len(self.results):
                raise StopIteration
            result = self.results[self.index]
            self.index += 1
            return result

    # Usage
    results = DatabaseResults("SELECT * FROM users")

    for user in results:
        print(f"User {user['id']}: {user['name']}, {user['age']}")
    # User 1: Alice, 25
    # User 2: Bob, 30
    # User 3: Charlie, 35
3. Paginated API Iterator
    class PaginatedAPI:
        """Iterator for paginated API results."""

        def __init__(self, base_url, page_size=10):
            self.base_url = base_url
            self.page_size = page_size
            self.current_page = 1
            self.current_items = []
            self.item_index = 0

        def _fetch_page(self, page):
            """Simulate API call."""
            # In a real app, this would make an HTTP request
            if page > 3:  # Simulate 3 pages
                return []
            start = (page - 1) * self.page_size
            return [
                {'id': i, 'value': f'Item {i}'}
                for i in range(start, start + self.page_size)
            ]

        def __iter__(self):
            return self

        def __next__(self):
            # Load next page if needed
            if self.item_index >= len(self.current_items):
                self.current_items = self._fetch_page(self.current_page)
                self.item_index = 0
                self.current_page += 1
                if not self.current_items:
                    raise StopIteration
            item = self.current_items[self.item_index]
            self.item_index += 1
            return item

    # Usage
    api = PaginatedAPI('/api/items', page_size=5)

    count = 0
    for item in api:
        print(item)
        count += 1
        if count >= 12:  # Limit output
            break
    # {'id': 0, 'value': 'Item 0'}
    # {'id': 1, 'value': 'Item 1'}
    # ...
4. CSV Reader Iterator
    class CSVReader:
        """Custom CSV reader iterator."""

        def __init__(self, filename, delimiter=','):
            self.filename = filename
            self.delimiter = delimiter
            self.file = None
            self.headers = None

        def __iter__(self):
            self.file = open(self.filename, 'r')
            # Read headers
            header_line = self.file.readline().strip()
            self.headers = header_line.split(self.delimiter)
            return self

        def __next__(self):
            line = self.file.readline()
            if not line:
                self.file.close()
                raise StopIteration
            values = line.strip().split(self.delimiter)
            # Return dict with headers as keys
            return dict(zip(self.headers, values))

        def __del__(self):
            if self.file:
                self.file.close()

    # Create test CSV
    with open('data.csv', 'w') as f:
        f.write("name,age,city\n")
        f.write("Alice,25,NYC\n")
        f.write("Bob,30,LA\n")
        f.write("Charlie,35,SF\n")

    # Usage
    for row in CSVReader('data.csv'):
        print(row)
    # {'name': 'Alice', 'age': '25', 'city': 'NYC'}
    # {'name': 'Bob', 'age': '30', 'city': 'LA'}
    # {'name': 'Charlie', 'age': '35', 'city': 'SF'}
5. Range with Step Pattern
    from datetime import datetime, timedelta

    class DateRange:
        """Iterator for date ranges (end_date is inclusive)."""

        def __init__(self, start_date, end_date, step_days=1):
            self.current = start_date
            self.end_date = end_date
            self.step = timedelta(days=step_days)

        def __iter__(self):
            return self

        def __next__(self):
            if self.current > self.end_date:
                raise StopIteration
            value = self.current
            self.current += self.step
            return value

    # Usage
    start = datetime(2025, 1, 1)
    end = datetime(2025, 1, 10)

    for date in DateRange(start, end, step_days=2):
        print(date.strftime('%Y-%m-%d'))
    # 2025-01-01
    # 2025-01-03
    # 2025-01-05
    # 2025-01-07
    # 2025-01-09
Best Practices
    # 1. Always implement both __iter__ and __next__
    class GoodIterator:
        def __iter__(self):
            return self

        def __next__(self):
            # Return the next item or raise StopIteration
            # (placeholder: this skeleton behaves as an empty iterator)
            raise StopIteration

    # 2. Raise StopIteration when exhausted
    def __next__(self):
        if self.index >= len(self.data):
            raise StopIteration  # Proper way to stop
        value = self.data[self.index]
        self.index += 1  # Advance, otherwise the same item repeats forever
        return value

    # 3. Make the iterable reusable (separate iterator class)
    class Iterable:
        def __iter__(self):
            # Iterator stands for your own iterator class
            return Iterator(self.data)  # New iterator each time

    # 4. Clean up resources
    class FileIterator:
        def __next__(self):
            line = self.file.readline()
            if not line:
                self.file.close()  # Clean up
                raise StopIteration
            return line

    # 5. Use next() with a default to avoid exceptions
    iterator = iter([])
    value = next(iterator, None)  # Returns None if exhausted
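Closing the file only when the last line is reached (practice 4) leaves it open if the loop exits early. One possible way to cover that case is to make the iterator a context manager as well. The sketch below is an assumption-laden illustration: the class name ManagedLineIterator and the sample file are hypothetical, and a temporary directory is used so the example leaves no files behind.

```python
import os
import tempfile

class ManagedLineIterator:
    """Hypothetical line iterator that is also a context manager."""

    def __init__(self, filename):
        self.filename = filename
        self.file = None

    def __enter__(self):
        self.file = open(self.filename, 'r')
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs on normal exit, early exit, and exceptions alike.
        self.file.close()
        return False  # do not suppress exceptions

    def __iter__(self):
        return self

    def __next__(self):
        line = self.file.readline()
        if not line:
            raise StopIteration
        return line.rstrip('\n')

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sample.txt')
    with open(path, 'w') as f:
        f.write("a\nb\nc\n")

    with ManagedLineIterator(path) as lines:
        first = next(lines)  # stop early on purpose

    closed_after_exit = lines.file.closed

print(first, closed_after_exit)  # a True
```

The with statement guarantees that the file is closed even though only one line was consumed, without relying on __del__.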
Practice Exercises
Exercise 1: Prime Number Iterator
Create an iterator that generates prime numbers indefinitely.
Exercise 2: Window Iterator
Create an iterator that returns a sliding window (n consecutive items).
Exercise 3: Cycle Iterator
Create an iterator that cycles through its items infinitely.
Exercise 4: Tree Traversal Iterator
Create an iterator that traverses a binary tree (in-order, pre-order, post-order).
Exercise 5: Merge Iterator
Create an iterator that merges multiple sorted iterables into a single sorted sequence.
Summary
✅ Iterable: an object whose __iter__() returns an iterator
✅ Iterator: an object with both __iter__() and __next__()
✅ StopIteration: raised when an iterator is exhausted
✅ iter(): converts an iterable to an iterator, or wraps a callable plus a sentinel
✅ next(): gets the next item and accepts an optional default value
✅ Patterns: reversed, batch, filter, zip, chain
✅ Real-world: file reading, database, API, CSV
Next Lesson
Lesson 3.2: Generators and Iterators - Generators with yield, generator expressions, and lazy evaluation! 🚀
Remember:
- Iterator = __iter__ + __next__
- Raise StopIteration when done
- Separate iterator from iterable for reusability
- Clean up resources properly
- Use built-in functions when possible! 🎯