Lesson 3: Generators and Iterators - Iterators

Learning Objectives

After completing this lesson, you will be able to:

  • ✅ Understand the iteration protocol in Python
  • ✅ Create custom iterators
  • ✅ Use iter() and next()
  • ✅ Distinguish iterables from iterators
  • ✅ Apply iterator patterns
  • ✅ Handle the StopIteration exception

Iteration Protocol

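The iteration protocol is how Python defines iteration: iter() asks an object for an iterator by calling its __iter__() method, and next() pulls items from that iterator by calling __next__() until it raises StopIteration.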

Iterables vs Iterators

# Iterable: an object whose __iter__() method returns an iterator
# Iterator: an object with both __iter__() and __next__() methods

# A list is an iterable
numbers = [1, 2, 3, 4, 5]

# Get an iterator from the iterable
iterator = iter(numbers)

# Use the iterator
print(next(iterator))  # 1
print(next(iterator))  # 2
print(next(iterator))  # 3

# A for loop uses the iteration protocol automatically
for num in numbers:
    print(num)  # 1, 2, 3, 4, 5
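The distinction can also be checked at runtime. The sketch below is not part of the original lesson: it assumes the abstract base classes Iterable and Iterator from the standard collections.abc module, and the variable names are illustrative.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]
iterator = iter(numbers)

# A list is iterable, but it is not itself an iterator (it has no __next__).
list_is_iterable = isinstance(numbers, Iterable)   # True
list_is_iterator = isinstance(numbers, Iterator)   # False

# The object returned by iter() supports both halves of the protocol.
iter_is_iterable = isinstance(iterator, Iterable)  # True
iter_is_iterator = isinstance(iterator, Iterator)  # True

# Calling iter() on an iterator returns the same object.
same_object = iter(iterator) is iterator           # True

print(list_is_iterable, list_is_iterator)
print(iter_is_iterable, iter_is_iterator, same_object)
```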

How For Loop Works

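# This for loop:
for item in iterable:
    print(item)

# Is equivalent to:
iterator = iter(iterable)
while True:
    try:
        item = next(iterator)
    except StopIteration:
        break
    # The loop body runs outside the try block, so only the
    # StopIteration raised by next() ends the loop.
    print(item)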
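One consequence of this equivalence: because the for loop calls iter() first, looping over an object that is already an iterator continues from wherever that iterator currently stands. A minimal sketch, with illustrative variable names:

```python
letters = iter(["a", "b", "c", "d"])

first = next(letters)  # consume one item manually

# The for loop calls iter(letters), gets the same iterator back,
# and therefore continues from the second item.
remaining = []
for letter in letters:
    remaining.append(letter)

print(first)      # a
print(remaining)  # ['b', 'c', 'd']
```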

Custom Iterators

Basic Iterator

class CountUp:
    """Iterator that counts from start to end."""

    def __init__(self, start, end):
        self.current = start
        self.end = end

    def __iter__(self):
        """Return iterator object (self)."""
        return self

    def __next__(self):
        """Return next item or raise StopIteration."""
        if self.current > self.end:
            raise StopIteration

        value = self.current
        self.current += 1
        return value

# Usage
counter = CountUp(1, 5)

for num in counter:
    print(num)  # 1, 2, 3, 4, 5

# Manual iteration
counter2 = CountUp(10, 12)
print(next(counter2))  # 10
print(next(counter2))  # 11
print(next(counter2))  # 12
# print(next(counter2))  # StopIteration

Iterator with State

class Fibonacci:
    """Iterator for Fibonacci sequence."""

    def __init__(self, max_count):
        self.max_count = max_count
        self.count = 0
        self.a = 0
        self.b = 1

    def __iter__(self):
        return self

    def __next__(self):
        if self.count >= self.max_count:
            raise StopIteration

        # Current value
        value = self.a

        # Update state
        self.a, self.b = self.b, self.a + self.b
        self.count += 1

        return value

# Usage
fib = Fibonacci(10)
for num in fib:
    print(num, end=' ')  # 0 1 1 2 3 5 8 13 21 34
print()

# Convert to list
fib_list = list(Fibonacci(8))
print(fib_list)  # [0, 1, 1, 2, 3, 5, 8, 13]

Infinite Iterator

class InfiniteCounter:
    """Iterator that counts infinitely."""

    def __init__(self, start=0, step=1):
        self.current = start
        self.step = step

    def __iter__(self):
        return self

    def __next__(self):
        # Never raises StopIteration, so the caller must stop the loop.
        value = self.current
        self.current += self.step
        return value

# Usage with break
counter = InfiniteCounter(0, 2)
for num in counter:
    if num > 10:
        break
    print(num, end=' ')  # 0 2 4 6 8 10
print()

# Usage with itertools.islice
import itertools

counter2 = InfiniteCounter(1, 3)
first_five = list(itertools.islice(counter2, 5))
print(first_five)  # [1, 4, 7, 10, 13]

Iterable Class

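An iterable differs from an iterator: an iterable creates a new iterator each time it is iterated, whereas an iterator is consumed as it goes and cannot be restarted.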

Iterable vs Iterator

class Range:
    """Iterable that creates new iterator each time."""

    def __init__(self, start, end):
        self.start = start
        self.end = end

    def __iter__(self):
        """Return NEW iterator each time."""
        return RangeIterator(self.start, self.end)

class RangeIterator:
    """Iterator for Range."""

    def __init__(self, start, end):
        self.current = start
        self.end = end

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.end:
            raise StopIteration
        value = self.current
        self.current += 1
        return value

# Usage - can iterate multiple times!
my_range = Range(1, 5)

# First iteration
for num in my_range:
    print(num, end=' ')  # 1 2 3 4
print()

# Second iteration works!
for num in my_range:
    print(num, end=' ')  # 1 2 3 4
print()

# Compare with iterator (CountUp is defined in the Basic Iterator section)
counter = CountUp(1, 5)
list(counter)  # [1, 2, 3, 4, 5]
list(counter)  # [] - exhausted!
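The same split shows up in Python's built-in types. As a small illustration (the variable names are placeholders), a list is a reusable iterable, while the object returned by map() is a single-use iterator:

```python
numbers = [1, 2, 3]                      # iterable: each pass gets a fresh iterator
squares = map(lambda n: n * n, numbers)  # iterator: can be consumed only once

first_pass = list(numbers)
second_pass = list(numbers)

squares_first = list(squares)
squares_second = list(squares)  # already exhausted

print(first_pass, second_pass)  # [1, 2, 3] [1, 2, 3]
print(squares_first)            # [1, 4, 9]
print(squares_second)           # []
```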

Separate Iterator Class

class Book:
    """Book with iterable chapters."""

    def __init__(self, title):
        self.title = title
        self.chapters = []

    def add_chapter(self, chapter):
        self.chapters.append(chapter)

    def __iter__(self):
        """Return iterator for chapters."""
        return BookIterator(self.chapters)

class BookIterator:
    """Iterator for book chapters."""

    def __init__(self, chapters):
        self.chapters = chapters
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.index >= len(self.chapters):
            raise StopIteration

        chapter = self.chapters[self.index]
        self.index += 1
        return chapter

# Usage
book = Book("Python Advanced")
book.add_chapter("Decorators")
book.add_chapter("Generators")
book.add_chapter("Context Managers")

for chapter in book:
    print(f"Reading: {chapter}")
# Reading: Decorators
# Reading: Generators
# Reading: Context Managers

# Can iterate again
for chapter in book:
    print(f"Review: {chapter}")

Built-in Functions with Iterators

iter() with Sentinel

# iter(callable, sentinel): call callable until it returns the sentinel value
with open('data.txt', 'w') as f:
    f.write("Line 1\nLine 2\nLine 3\n")

# Read file line by line
with open('data.txt', 'r') as f:
    for line in iter(f.readline, ''):
        print(line.strip())
# Line 1
# Line 2
# Line 3

# Another example: read until specific value
def get_input():
    """Simulate user input by returning the next canned value."""
    values = [1, 2, 3, 4, 5, -1]
    value = values[get_input.index] if get_input.index < len(values) else -1
    get_input.index += 1
    return value

get_input.index = 0

# Read until -1
for value in iter(get_input, -1):
    print(value)  # 1, 2, 3, 4, 5

# Same idea with a callable object
class Counter:
    def __init__(self):
        self.count = 0

    def __call__(self):
        self.count += 1
        if self.count > 5:
            return -1
        return self.count

counter = Counter()
for num in iter(counter, -1):
    print(num)  # 1, 2, 3, 4, 5
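A common use of the two-argument form is reading a binary stream in fixed-size chunks. The sketch below is an illustration, not part of the original lesson: it uses io.BytesIO as an in-memory stand-in for a file opened in 'rb' mode, and functools.partial to turn stream.read(4) into a zero-argument callable.

```python
import io
from functools import partial

# In-memory stand-in for a binary file opened with open(path, 'rb').
stream = io.BytesIO(b"abcdefghij")

chunks = []
# stream.read(4) returns b'' at end of data, which serves as the sentinel.
for chunk in iter(partial(stream.read, 4), b""):
    chunks.append(chunk)

print(chunks)  # [b'abcd', b'efgh', b'ij']
```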

next() with Default

# next(iterator, default): return default instead of raising StopIteration
numbers = iter([1, 2, 3])

print(next(numbers, 'No more'))  # 1
print(next(numbers, 'No more'))  # 2
print(next(numbers, 'No more'))  # 3
print(next(numbers, 'No more'))  # No more (no exception)
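The default argument also gives a compact way to fetch the first item matching a condition, or a fallback when nothing matches. A minimal sketch combining next() with the built-in filter(); the sample data and names are illustrative:

```python
numbers = [3, 7, 10, 13, 16]

# filter() returns an iterator; next() pulls only the first match from it.
first_even = next(filter(lambda n: n % 2 == 0, numbers), None)

# No item matches here, so the default is returned instead of an exception.
first_big = next(filter(lambda n: n > 100, numbers), None)

print(first_even)  # 10
print(first_big)   # None
```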

Iterator Patterns

1. Reversed Iterator

class ReversedIterator:
    """Iterate sequence in reverse."""

    def __init__(self, sequence):
        self.sequence = sequence
        self.index = len(sequence)

    def __iter__(self):
        return self

    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index -= 1
        return self.sequence[self.index]

# Usage
numbers = [1, 2, 3, 4, 5]
for num in ReversedIterator(numbers):
    print(num, end=' ')  # 5 4 3 2 1
print()

# Python built-in
for num in reversed(numbers):
    print(num, end=' ')  # 5 4 3 2 1

2. Batch Iterator

class BatchIterator:
    """Iterate in batches of specified size."""

    def __init__(self, data, batch_size):
        self.data = data
        self.batch_size = batch_size
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.index >= len(self.data):
            raise StopIteration

        # Get batch
        batch = self.data[self.index:self.index + self.batch_size]
        self.index += self.batch_size

        return batch

# Usage
data = list(range(1, 11))
for batch in BatchIterator(data, 3):
    print(batch)
# [1, 2, 3]
# [4, 5, 6]
# [7, 8, 9]
# [10]
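BatchIterator relies on len() and slicing, so it only works on sequences. As a hedged variation, the sketch below batches any iterable by pulling items with itertools.islice. The class name IterableBatcher is hypothetical and not from the lesson.

```python
from itertools import islice

class IterableBatcher:
    """Hypothetical variant of BatchIterator that accepts any iterable."""

    def __init__(self, iterable, batch_size):
        self.iterator = iter(iterable)
        self.batch_size = batch_size

    def __iter__(self):
        return self

    def __next__(self):
        # islice takes at most batch_size items from the underlying iterator.
        batch = list(islice(self.iterator, self.batch_size))
        if not batch:
            raise StopIteration
        return batch

# Works on a one-shot iterator, which supports neither len() nor slicing.
batches = list(IterableBatcher(iter(range(1, 11)), 3))
print(batches)  # [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
```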

3. Filtering Iterator

class FilterIterator:
    """Iterator with filtering."""

    def __init__(self, iterable, predicate):
        self.iterator = iter(iterable)
        self.predicate = predicate

    def __iter__(self):
        return self

    def __next__(self):
        # StopIteration from the underlying iterator propagates
        # and ends this iterator as well.
        while True:
            value = next(self.iterator)
            if self.predicate(value):
                return value

# Usage
numbers = range(1, 11)
evens = FilterIterator(numbers, lambda x: x % 2 == 0)

for num in evens:
    print(num, end=' ')  # 2 4 6 8 10
print()

# Python built-in filter
evens2 = filter(lambda x: x % 2 == 0, range(1, 11))
print(list(evens2))  # [2, 4, 6, 8, 10]

4. Zip Iterator

class ZipIterator:
    """Zip multiple iterables."""

    def __init__(self, *iterables):
        self.iterators = [iter(it) for it in iterables]

    def __iter__(self):
        return self

    def __next__(self):
        values = []
        for iterator in self.iterators:
            try:
                values.append(next(iterator))
            except StopIteration:
                # Stop as soon as the shortest input is exhausted
                raise StopIteration
        return tuple(values)

# Usage
names = ['Alice', 'Bob', 'Charlie']
ages = [25, 30, 35]
cities = ['NYC', 'LA', 'SF']

for name, age, city in ZipIterator(names, ages, cities):
    print(f"{name}, {age}, {city}")
# Alice, 25, NYC
# Bob, 30, LA
# Charlie, 35, SF

# Python built-in
for item in zip(names, ages, cities):
    print(item)
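Like the built-in zip, ZipIterator stops at the shortest input. When the inputs may differ in length and no item should be dropped, the standard library's itertools.zip_longest pads the shorter ones. A small sketch with illustrative data:

```python
from itertools import zip_longest

names = ['Alice', 'Bob', 'Charlie']
ages = [25, 30]

# zip stops at the shortest input, so 'Charlie' is dropped.
short = list(zip(names, ages))

# zip_longest pads the shorter input with fillvalue instead.
padded = list(zip_longest(names, ages, fillvalue=None))

print(short)   # [('Alice', 25), ('Bob', 30)]
print(padded)  # [('Alice', 25), ('Bob', 30), ('Charlie', None)]
```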

5. Chain Iterator

class ChainIterator:
    """Chain multiple iterables."""

    def __init__(self, *iterables):
        self.iterables = iterables
        self.current_iterable_index = 0
        self.current_iterator = iter(self.iterables[0]) if iterables else iter([])

    def __iter__(self):
        return self

    def __next__(self):
        while True:
            try:
                return next(self.current_iterator)
            except StopIteration:
                # Current iterable is exhausted, move on to the next one
                self.current_iterable_index += 1
                if self.current_iterable_index >= len(self.iterables):
                    raise StopIteration
                self.current_iterator = iter(self.iterables[self.current_iterable_index])

# Usage
list1 = [1, 2, 3]
list2 = [4, 5, 6]
list3 = [7, 8, 9]

for num in ChainIterator(list1, list2, list3):
    print(num, end=' ')  # 1 2 3 4 5 6 7 8 9
print()

# Python built-in
from itertools import chain
for num in chain(list1, list2, list3):
    print(num, end=' ')  # 1 2 3 4 5 6 7 8 9

Real-world Examples

1. File Line Iterator

class FileLineIterator:
    """Iterate file lines with line numbers."""

    def __init__(self, filename):
        self.filename = filename
        self.file = None
        self.line_number = 0

    def __iter__(self):
        self.file = open(self.filename, 'r')
        self.line_number = 0
        return self

    def __next__(self):
        line = self.file.readline()
        if not line:
            self.file.close()
            raise StopIteration

        self.line_number += 1
        return (self.line_number, line.strip())

    def __del__(self):
        if self.file:
            self.file.close()

# Create test file
with open('test.txt', 'w') as f:
    f.write("First line\nSecond line\nThird line\n")

# Usage
for line_num, content in FileLineIterator('test.txt'):
    print(f"{line_num}: {content}")
# 1: First line
# 2: Second line
# 3: Third line
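FileLineIterator closes its file when the lines run out, but if the loop exits early (for example with break) the file stays open until __del__ runs. One possible way to make cleanup deterministic, sketched under assumptions, is to expose an explicit close() method and wrap the iterator in contextlib.closing. The class name ClosableLineIterator and the temporary file are hypothetical and used only for this demonstration.

```python
import os
import tempfile
from contextlib import closing

class ClosableLineIterator:
    """Hypothetical line iterator that exposes an explicit close()."""

    def __init__(self, filename):
        self.file = open(filename, 'r')

    def __iter__(self):
        return self

    def __next__(self):
        line = self.file.readline()
        if not line:
            self.close()
            raise StopIteration
        return line.rstrip('\n')

    def close(self):
        # Closing an already-closed file is a no-op, so repeating this is safe.
        self.file.close()

# Temporary file used only for this demonstration.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write("First line\nSecond line\nThird line\n")
    path = tmp.name

# closing() calls lines.close() on exit, even though the loop breaks early.
with closing(ClosableLineIterator(path)) as lines:
    for line in lines:
        first_line = line
        break

closed_after_break = lines.file.closed
os.remove(path)
print(first_line, closed_after_break)  # First line True
```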

2. Database Result Iterator

class DatabaseResults:
    """Iterator for database query results."""

    def __init__(self, query):
        self.query = query
        self.results = self._execute_query()
        self.index = 0

    def _execute_query(self):
        """Simulate database query."""
        # In real app, would execute actual query
        return [
            {'id': 1, 'name': 'Alice', 'age': 25},
            {'id': 2, 'name': 'Bob', 'age': 30},
            {'id': 3, 'name': 'Charlie', 'age': 35},
        ]

    def __iter__(self):
        return self

    def __next__(self):
        if self.index >= len(self.results):
            raise StopIteration

        result = self.results[self.index]
        self.index += 1
        return result

# Usage
results = DatabaseResults("SELECT * FROM users")

for user in results:
    print(f"User {user['id']}: {user['name']}, {user['age']}")
# User 1: Alice, 25
# User 2: Bob, 30
# User 3: Charlie, 35

3. Paginated API Iterator

class PaginatedAPI:
    """Iterator for paginated API results."""

    def __init__(self, base_url, page_size=10):
        self.base_url = base_url
        self.page_size = page_size
        self.current_page = 1
        self.current_items = []
        self.item_index = 0

    def _fetch_page(self, page):
        """Simulate API call."""
        # In real app, would make HTTP request
        if page > 3:  # Simulate 3 pages
            return []

        start = (page - 1) * self.page_size
        return [
            {'id': i, 'value': f'Item {i}'}
            for i in range(start, start + self.page_size)
        ]

    def __iter__(self):
        return self

    def __next__(self):
        # Load next page if needed
        if self.item_index >= len(self.current_items):
            self.current_items = self._fetch_page(self.current_page)
            self.item_index = 0
            self.current_page += 1

            if not self.current_items:
                raise StopIteration

        item = self.current_items[self.item_index]
        self.item_index += 1
        return item

# Usage
api = PaginatedAPI('/api/items', page_size=5)

count = 0
for item in api:
    print(item)
    count += 1
    if count >= 12:  # Limit output
        break
# {'id': 0, 'value': 'Item 0'}
# {'id': 1, 'value': 'Item 1'}
# ...

4. CSV Reader Iterator

class CSVReader:
    """Custom CSV reader iterator."""

    def __init__(self, filename, delimiter=','):
        self.filename = filename
        self.delimiter = delimiter
        self.file = None
        self.headers = None

    def __iter__(self):
        self.file = open(self.filename, 'r')
        # Read headers
        header_line = self.file.readline().strip()
        self.headers = header_line.split(self.delimiter)
        return self

    def __next__(self):
        line = self.file.readline()
        if not line:
            self.file.close()
            raise StopIteration

        values = line.strip().split(self.delimiter)
        # Return dict with headers as keys
        return dict(zip(self.headers, values))

    def __del__(self):
        if self.file:
            self.file.close()

# Create test CSV
with open('data.csv', 'w') as f:
    f.write("name,age,city\n")
    f.write("Alice,25,NYC\n")
    f.write("Bob,30,LA\n")
    f.write("Charlie,35,SF\n")

# Usage
for row in CSVReader('data.csv'):
    print(row)
# {'name': 'Alice', 'age': '25', 'city': 'NYC'}
# {'name': 'Bob', 'age': '30', 'city': 'LA'}
# {'name': 'Charlie', 'age': '35', 'city': 'SF'}

5. Range with Step Pattern

from datetime import datetime, timedelta

class DateRange:
    """Iterator for date ranges."""

    def __init__(self, start_date, end_date, step_days=1):
        self.current = start_date
        self.end_date = end_date
        self.step = timedelta(days=step_days)

    def __iter__(self):
        return self

    def __next__(self):
        if self.current > self.end_date:
            raise StopIteration

        value = self.current
        self.current += self.step
        return value

# Usage
start = datetime(2025, 1, 1)
end = datetime(2025, 1, 10)

for date in DateRange(start, end, step_days=2):
    print(date.strftime('%Y-%m-%d'))
# 2025-01-01
# 2025-01-03
# 2025-01-05
# 2025-01-07
# 2025-01-09

Best Practices

# 1. Always implement both __iter__ and __next__
class GoodIterator:
    def __iter__(self):
        return self

    def __next__(self):
        # Return next item or raise StopIteration
        raise StopIteration  # Placeholder: this skeleton has no items

# 2. Raise StopIteration when exhausted
def __next__(self):
    if self.index >= len(self.data):
        raise StopIteration  # Proper way to stop
    value = self.data[self.index]
    self.index += 1  # Advance, otherwise the same item repeats forever
    return value

# 3. Make iterable reusable (separate iterator)
class Iterable:
    def __iter__(self):
        return Iterator(self.data)  # New iterator each time

# 4. Clean up resources
class FileIterator:
    def __next__(self):
        line = self.file.readline()
        if not line:
            self.file.close()  # Clean up
            raise StopIteration
        return line

# 5. Use next() with default to avoid exceptions
value = next(iterator, None)  # Returns None if exhausted

Practice Exercises

Exercise 1: Prime Number Iterator

Create an iterator that generates prime numbers indefinitely.

Exercise 2: Window Iterator

Create an iterator that returns a sliding window (n consecutive items).

Exercise 3: Cycle Iterator

Create an iterator that cycles through items infinitely.

Exercise 4: Tree Traversal Iterator

Create an iterator that traverses a binary tree (in-order, pre-order, post-order).

Exercise 5: Merge Iterator

Create an iterator that merges multiple sorted iterables into a single sorted sequence.
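For checking your own solutions, the standard library already provides reference behavior for two of these exercises. The sketch below is not a solution to the exercises: it only shows the expected output of itertools.cycle (Exercise 3) and heapq.merge (Exercise 5) on small sample inputs chosen for illustration.

```python
import heapq
from itertools import cycle, islice

# Reference for Exercise 3: cycle() repeats its input forever,
# so islice() is used to take a finite prefix.
cycled = list(islice(cycle(['A', 'B', 'C']), 7))

# Reference for Exercise 5: merge() lazily merges already-sorted inputs.
merged = list(heapq.merge([1, 4, 7], [2, 5, 8], [3, 6, 9]))

print(cycled)  # ['A', 'B', 'C', 'A', 'B', 'C', 'A']
print(merged)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Comparing your iterator's output against these on several inputs, including empty ones, is a quick way to catch off-by-one errors.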

Summary

Iterable: An object whose __iter__() returns an iterator
Iterator: An object with both __iter__() and __next__()
StopIteration: Raised when an iterator is exhausted
iter(): Converts an iterable to an iterator, or wraps a callable plus a sentinel
next(): Gets the next item, optionally with a default value
Patterns: Reversed, batch, filter, zip, chain
Real-world: File reading, database, API, CSV

Next Lesson

Lesson 3.2: Generators and Iterators - Generators with yield, generator expressions, and lazy evaluation! 🚀


Remember:

  • Iterator = __iter__ + __next__
  • Raise StopIteration when done
  • Separate iterator from iterable for reusability
  • Clean up resources properly
  • Use built-in functions when possible! 🎯