Python Iterators and Generators Tutorial

Anastasios Antoniadis

Iterators are objects that support iteration, making them a fundamental part of Python’s design—especially useful in loops and list comprehensions. In Python, any object capable of producing an iterator is called an iterable.

Creating an iterator from scratch requires several components. Every iterator must implement both the __iter__() and __next__() methods. Additionally, it must maintain its internal state and raise a StopIteration exception when no further values are available. These requirements are collectively known as the iterator protocol.
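The protocol described above can be sketched as a small class. The name Countdown and its start parameter are invented for this illustration and do not come from any library.

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        # Internal state: the next value to hand out
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__()
        return self

    def __next__(self):
        # Signal exhaustion once the count reaches zero
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
values = list(countdown)
print(values)  # Output: [3, 2, 1]
```

The class has to track its own position (self.current) and raise StopIteration itself, which is the bookkeeping that generators take care of automatically.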

Because manually building iterators can be quite involved, Python offers a simpler alternative: generators. Generators are special functions that use the yield keyword to produce an iterator, allowing values to be generated one at a time as needed.

Understanding when to implement an iterator directly versus when to use a generator can greatly enhance your programming efficiency and effectiveness in Python. In the following sections of this tutorial, we will explore the differences between iterators and generators to help you choose the best tool for various programming scenarios.

Python Iterators & Iterables

In Python, iterables are objects that can return their elements one at a time, allowing you to loop over them. Common built-in data structures—such as lists, tuples, and sets—are iterables. Other types like strings and dictionaries are also iterable; for example, a string yields its characters one by one, and you can iterate over the keys in a dictionary. In short, if an object can be used in a for-loop, it’s an iterable.

Iterables vs. Iterators

Iterables are objects that can be looped over (e.g., lists, tuples, strings, dictionaries).

Iterators are objects that yield one element at a time and are created from iterables using the iter() function.

Using next() on an iterator retrieves the next element, while trying to use next() on an iterable (without converting it) raises a TypeError.

Iterators are lazy by nature and allow only forward traversal.

When you need to produce a large amount of data lazily while keeping state between values, a generator is often simpler than a hand-written iterator.

Understanding the distinction between iterables and iterators—and knowing how and when to convert between them—is key to writing efficient, Pythonic code.

While every iterator is an iterable, not every iterable is an iterator. An iterable produces an iterator only when you begin iterating over it. Consider the following example:

# Create a list (an iterable)
list_instance = [1, 2, 3, 4]

# Convert the list to an iterator using iter()
print(iter(list_instance))

This might output something like:

<list_iterator object at 0x7fd946309e90>

Although the list is iterable, it is not an iterator by itself. Calling iter() on the list returns an iterator.

To see why not all iterables are iterators, try calling the next() function directly on the list:

list_instance = [1, 2, 3, 4]
print(next(list_instance))

This code raises a TypeError:

--------------------------------------------------------------------
TypeError                         Traceback (most recent call last)
...
TypeError: 'list' object is not an iterator

This error occurs because a list is iterable but is not itself an iterator: it has no __next__() method. It must be passed to iter() first.
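One way to see the difference is to check which protocol methods each object provides. The following is a small sketch using the built-in hasattr() function; the variable names are only illustrative.

```python
list_instance = [1, 2, 3, 4]
iterator = iter(list_instance)

# A list provides __iter__() but not __next__()
list_has_iter = hasattr(list_instance, "__iter__")
list_has_next = hasattr(list_instance, "__next__")
print(list_has_iter)  # Output: True
print(list_has_next)  # Output: False

# The iterator returned by iter() provides both methods
iterator_has_iter = hasattr(iterator, "__iter__")
iterator_has_next = hasattr(iterator, "__next__")
print(iterator_has_iter)  # Output: True
print(iterator_has_next)  # Output: True

# Calling iter() on an iterator returns the same object
same_object = iter(iterator) is iterator
print(same_object)  # Output: True
```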

Iterating Over a List with an Iterator

If you want to iterate through a list using an iterator, you first convert the iterable to an iterator. Then you can retrieve its elements one at a time:

# Create a list object
list_instance = [1, 2, 3, 4]

# Convert the list into an iterator
iterator = iter(list_instance)

# Retrieve items one by one using next()
print(next(iterator))  # Output: 1
print(next(iterator))  # Output: 2
print(next(iterator))  # Output: 3
print(next(iterator))  # Output: 4

Python, however, simplifies this process by automatically generating an iterator when you use a for-loop:

# Create a list object
list_instance = [1, 2, 3, 4]

# Iterate through the list with a for-loop
for item in list_instance:
    print(item)

This loop prints:

1
2
3
4

Once the iterator runs out of elements, it raises a StopIteration exception. The for-loop catches this exception internally and ends.
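What the for-loop does behind the scenes can be approximated by hand. The sketch below is roughly equivalent to the loop above, not a description of the interpreter's actual implementation; the collected list exists only to record what was visited.

```python
list_instance = [1, 2, 3, 4]
collected = []

# Step 1: ask the iterable for an iterator
iterator = iter(list_instance)

while True:
    try:
        # Step 2: fetch the next element
        item = next(iterator)
    except StopIteration:
        # Step 3: stop quietly when the iterator is exhausted
        break
    collected.append(item)
    print(item)
```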

It’s important to note that iterators allow you to traverse elements only in one direction—there’s no built-in method for moving backward.
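A practical consequence is that an exhausted iterator stays exhausted. The following sketch, with illustrative variable names, shows that a second pass over the same iterator yields nothing and that a new traversal needs a fresh iterator.

```python
list_instance = [1, 2, 3, 4]
iterator = iter(list_instance)

first_pass = [item for item in iterator]
second_pass = [item for item in iterator]  # the iterator is already exhausted

print(first_pass)   # Output: [1, 2, 3, 4]
print(second_pass)  # Output: []

# To traverse again, request a fresh iterator from the iterable
fresh_pass = list(iter(list_instance))
print(fresh_pass)   # Output: [1, 2, 3, 4]
```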

The Lazy Nature of Iterators

One of the key characteristics of iterators is their lazy evaluation: they do not produce elements until requested. Moreover, you can create multiple independent iterators from the same iterable. Each iterator maintains its own progress. For example:

list_instance = [1, 2, 3, 4]
iterator_a = iter(list_instance)
iterator_b = iter(list_instance)

print(f"A: {next(iterator_a)}")  # Output: A: 1
print(f"A: {next(iterator_a)}")  # Output: A: 2
print(f"A: {next(iterator_a)}")  # Output: A: 3
print(f"A: {next(iterator_a)}")  # Output: A: 4
print(f"B: {next(iterator_b)}")  # Output: B: 1

Here, iterator_a advances through the list, while iterator_b starts fresh and returns the first element.

While iterators generate elements on demand (i.e., lazily), you can force an iterator to produce all its elements at once by passing it to a container constructor such as list(), set(), or tuple(). For example:

# Create an iterable
list_instance = [1, 2, 3, 4]

# Convert the iterable to an iterator and then to a list
iterator = iter(list_instance)
print(list(iterator))

This outputs:

[1, 2, 3, 4]

Note: Keep in mind that forcing an iterator to generate all its elements at once is not advisable for large datasets, as it may consume significant memory or processing time.
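When only part of a large sequence is needed, one alternative is to take a slice of the iterator instead of materializing all of it. The sketch below uses itertools.islice from the standard library; the size of the range is arbitrary.

```python
from itertools import islice

# A large range; its iterator produces values only on request
big_iterator = iter(range(1_000_000_000))

# Take only the first five elements instead of building a full list
first_five = list(islice(big_iterator, 5))
print(first_five)  # Output: [0, 1, 2, 3, 4]

# The iterator keeps its position, so the next value is 5
next_value = next(big_iterator)
print(next_value)  # Output: 5
```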

Python Generators

A generator is a more concise way to create an iterator. Although generator functions resemble regular Python functions, they work quite differently. Instead of returning a result and terminating, a generator function uses the yield keyword to produce values one at a time, on demand. In other words, a generator is a special kind of function that relies on lazy evaluation.

Unlike typical iterables that store their entire content in memory, generators compute and yield each value only when needed. Consider the task of finding all factors of a positive integer. A traditional function might accumulate all factors in a list and return it, like so:

def factors(n):
    """Return a list of factors of n."""
    return [i for i in range(1, n + 1) if n % i == 0]

# Example usage:
print(factors(28))  # Output: [1, 2, 4, 7, 14, 28]

This function builds and returns the complete list of factors. In contrast, by using a generator, you can yield each factor one at a time:

def factors_gen(n):
    """Yield factors of n one by one."""
    for i in range(1, n + 1):
        if n % i == 0:
            yield i

# Example usage:
for factor in factors_gen(28):
    print(factor)

Here, the use of yield transforms the function into a generator, meaning it doesn’t return all the factors at once. Instead, it produces a generator object that maintains its state between calls. You can then retrieve individual elements by calling the next() function:

# Call the generator function (not the list-returning factors())
factors_of_20 = factors_gen(20)
print(next(factors_of_20))  # Output: 1
print(next(factors_of_20))  # Output: 2
print(next(factors_of_20))  # Output: 4
print(next(factors_of_20))  # Output: 5
print(next(factors_of_20))  # Output: 10
print(next(factors_of_20))  # Output: 20

Another concise way to create a generator is with a generator expression. Generator expressions use the same syntax as list comprehensions but are enclosed in parentheses instead of square brackets:

factors_gen = (val for val in range(1, 21) if 20 % val == 0)
print(factors_gen)  # Output: <generator object <genexpr> at 0x7f488aab5230>

print(next(factors_gen))  # Output: 1

print(list(factors_gen))  # Output: [2, 4, 5, 10, 20]

The above code demonstrates the behavior of generator expressions in Python. The generator expression produces all the numbers between 1 and 20 that evenly divide 20 (i.e., the factors of 20). It does not compute all values immediately; instead, it waits until values are requested.

Printing the generator does not display its values. Instead, it shows a representation of the generator object, including its address in memory.

The next() function retrieves the first value from the generator, which is 1. The generator remembers its position, so when called again, it continues from where it left off.

The list(factors_gen) call consumes the remaining values from the generator. Since next(factors_gen) already retrieved 1 earlier, the list no longer includes 1. The remaining numbers [2, 4, 5, 10, 20] are collected and printed.
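The memory difference between a list comprehension and a generator expression can be observed with sys.getsizeof(). This is a rough sketch: getsizeof() reports only the size of the container object itself, and the exact numbers vary between Python versions and platforms, so the example compares the sizes rather than printing them.

```python
import sys

# The list comprehension builds all 100,000 values up front
squares_list = [n * n for n in range(100_000)]

# The generator expression stores only its current state
squares_gen = (n * n for n in range(100_000))

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(gen_size < list_size)  # Output: True

# Both produce the same values when consumed
same_total = sum(squares_gen) == sum(squares_list)
print(same_total)  # Output: True
```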

The yield Keyword

The yield keyword is central to how generators work. Rather than terminating the function as return does, yield pauses the function and saves its current state, including local variables. This allows the function to resume from where it left off the next time next() is called on the generator object. Consider the following example:

def yield_multiple_statements():
    yield "This is the first statement"
    yield "This is the second statement"
    yield "This is the third statement"
    yield "This is the last statement. Don't call next again!"

example = yield_multiple_statements()
print(next(example))  # Output: This is the first statement
print(next(example))  # Output: This is the second statement
print(next(example))  # Output: This is the third statement
print(next(example))  # Output: This is the last statement. Don't call next again!
print(next(example))  # Raises StopIteration, as there are no more values.

In this example, the generator produces four outputs. Calling next() a fifth time raises a StopIteration exception because the generator has been exhausted.
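If an exception is not wanted, the built-in next() function accepts an optional second argument that is returned once the iterator is exhausted. The generator name and the default string below are chosen only for illustration.

```python
def yield_two_values():
    yield "first"
    yield "second"

example = yield_two_values()

# The third call returns the default instead of raising StopIteration
results = [next(example, "no more values") for _ in range(3)]
print(results)  # Output: ['first', 'second', 'no more values']
```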

Glossary

Term | Definition
Iterable | A Python object capable of being looped over, such as a list, set, tuple, dictionary, or string.
Iterator | An object that implements __iter__() and __next__() and returns its values one at a time until it is exhausted.
Generator | A special type of function that uses the yield keyword to produce an iterator, returning a series of values instead of a single result.
Lazy Evaluation | An evaluation strategy where values are computed only when needed, also known as "call-by-need."
Iterator Protocol | A set of rules that a Python object must follow to be considered an iterator, including the implementation of the __iter__() and __next__() methods.
next() | A built-in function that retrieves the next item from an iterator.
iter() | A built-in function that returns an iterator for an iterable.
yield | A Python keyword used inside a function to produce a value and pause execution. A function that contains yield returns a generator object when called instead of a single value.

Wrap-Up

To summarize:

  • Iterators are objects that return their values one at a time through next() and raise StopIteration when exhausted.
  • Generators are special functions that yield values one at a time using lazy evaluation.
  • Writing a custom iterator requires implementing the __iter__() and __next__() methods.
  • Generators simplify the creation of iterators by using the yield keyword within a function, or by using a generator expression.
  • While custom iterators might be chosen for their ability to maintain complex state or provide additional methods, generators are especially useful when dealing with large data sets because they don’t store all elements in memory at once.
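As an illustration of the last point, a custom iterator class can expose extra methods that a plain generator cannot. The PeekableIterator class and its peek() method below are hypothetical and written only for this sketch.

```python
class PeekableIterator:
    """Hypothetical iterator wrapper that adds a peek() method."""

    _EMPTY = object()  # sentinel meaning "nothing buffered"

    def __init__(self, iterable):
        self._iterator = iter(iterable)
        self._buffer = self._EMPTY

    def __iter__(self):
        return self

    def __next__(self):
        # Hand out the buffered value first, if peek() stored one
        if self._buffer is not self._EMPTY:
            value, self._buffer = self._buffer, self._EMPTY
            return value
        return next(self._iterator)

    def peek(self):
        """Return the next value without consuming it."""
        if self._buffer is self._EMPTY:
            # Raises StopIteration if nothing is left
            self._buffer = next(self._iterator)
        return self._buffer


numbers = PeekableIterator([10, 20, 30])
peeked = numbers.peek()
rest = list(numbers)
print(peeked)  # Output: 10
print(rest)    # Output: [10, 20, 30]
```

Because peek() only buffers the value, the subsequent traversal still starts at 10.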

Understanding these differences will help you choose the right approach—be it a generator or a custom iterator—for your specific programming needs.

FAQ

Q1: What is an iterator in Python?
A: An iterator is an object that implements the iterator protocol, which consists of the methods __iter__() and __next__(). It provides a way to access elements in a collection one at a time, keeping track of its state as you traverse through the data.

Q2: What is a generator in Python?
A: A generator is a special type of iterator created using a function that includes one or more yield statements. Each yield produces a value and pauses the function, resuming later when the next value is requested. This makes generators very memory efficient for processing large or infinite sequences.

Q3: How do I create an iterator?
A:

Built-in Iterators: Many built-in data types (lists, tuples, dictionaries, etc.) in Python are iterable. You can obtain an iterator from an iterable using the iter() function.

my_list = [1, 2, 3]
iterator = iter(my_list)
print(next(iterator))  # Output: 1

Custom Iterators: You can create your own iterator by defining a class that implements both __iter__() (which should return the iterator object itself) and __next__() (which returns the next value or raises StopIteration when done).

Q4: How do I create a generator?
A:

Generator Function: Define a function that uses the yield keyword to produce a sequence of values.

def count_up_to(n):
    count = 1
    while count <= n:
        yield count
        count += 1

for number in count_up_to(5):
    print(number)

Generator Expression: Use a syntax similar to list comprehensions but with parentheses.

squares = (x*x for x in range(10))
for square in squares:
    print(square)

Q5: What is the difference between a generator and an iterator?
A:

  • Iterator: Any object that implements the iterator protocol (__iter__() and __next__()).
  • Generator: A concise way to create iterators using functions and the yield keyword (or generator expressions). While all generators are iterators, not all iterators are generators. Generators simplify the creation of iterators by handling the state automatically.
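The relationship between the two can be checked with isinstance(), using the Iterator abstract base class from collections.abc and GeneratorType from the types module. The function name count_to_three is made up for this example.

```python
from collections.abc import Iterator
from types import GeneratorType

def count_to_three():
    for number in (1, 2, 3):
        yield number

generator = count_to_three()
list_iterator = iter([1, 2, 3])

# A generator is both an iterator and a generator
print(isinstance(generator, Iterator))           # Output: True
print(isinstance(generator, GeneratorType))      # Output: True

# A list iterator is an iterator, but not a generator
print(isinstance(list_iterator, Iterator))       # Output: True
print(isinstance(list_iterator, GeneratorType))  # Output: False
```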

Q6: When should I use a generator?
A: Use a generator when you want to iterate over a large dataset or potentially infinite sequence without loading everything into memory at once. They are ideal for scenarios where you process data lazily, compute values on demand, or stream data from a source.
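A potentially infinite sequence might look like the following sketch. The fibonacci() generator is an illustrative example, and itertools.takewhile from the standard library is used to stop consuming it once a value reaches 100.

```python
from itertools import takewhile

def fibonacci():
    """Yield Fibonacci numbers forever."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Only the values that are actually requested get computed
below_100 = list(takewhile(lambda n: n < 100, fibonacci()))
print(below_100)
# Output: [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
```

Calling list(fibonacci()) directly would never finish, so an infinite generator always needs some stopping condition on the consuming side.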

Q7: How does the yield keyword work?
A: When a generator function encounters a yield statement, it outputs the value specified and suspends execution, saving the function’s state. The next time the generator’s __next__() method is called, the function resumes execution immediately after the yield.
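The pause-and-resume behavior can be made visible by recording what the generator body does. In this sketch, the events list and the traced_generator() function are hypothetical names used only to trace execution order.

```python
events = []

def traced_generator():
    events.append("started")
    yield 1
    events.append("resumed after first yield")
    yield 2
    events.append("resumed after second yield")

gen = traced_generator()
events.append("generator created")  # the body has not run yet

first = next(gen)   # runs the body up to the first yield
second = next(gen)  # resumes right after the first yield

for event in events:
    print(event)
# Output:
# generator created
# started
# resumed after first yield
```

Note that calling traced_generator() runs none of the function body; execution begins only with the first next() call.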

Q8: Can I convert a generator to a list?
A: Yes, you can convert a generator into a list using the list() function:

gen = (x for x in range(5))
gen_list = list(gen)
print(gen_list)  # Output: [0, 1, 2, 3, 4]

Keep in mind that doing so will compute all values in the generator, which might negate the memory efficiency benefits if the sequence is very large.

Q9: What happens when a generator has no more values to yield?
A: When a generator function runs out of yield statements (or reaches a return statement), it automatically raises a StopIteration exception. This signals that the generator is exhausted and no further values are available.
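When a generator finishes through a return statement with a value, that value is attached to the StopIteration exception as its value attribute. A minimal sketch, with an illustrative function name:

```python
def two_then_done():
    yield 1
    yield 2
    return "finished"  # becomes the value attribute of StopIteration

gen = two_then_done()
print(next(gen))  # Output: 1
print(next(gen))  # Output: 2

try:
    next(gen)
except StopIteration as stop:
    return_value = stop.value
    print(return_value)  # Output: finished
```

A for-loop discards this value, so it is only visible when StopIteration is caught explicitly.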
