# 38. Comprehensions

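Comprehensions are compact syntax for building a list, set, or dict, or a lazy generator, from another iterable. Each comprehension has its own scope for its loop variables. Through CPython 3.11, every comprehension compiles to a separate nested code object. Since CPython 3.12 (PEP 709), list, set, and dict comprehensions are inlined into the enclosing code object, while generator expressions still get their own.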

Common forms:

```python
[x * 2 for x in xs]
{x * 2 for x in xs}
{x: x * 2 for x in xs}
(x * 2 for x in xs)
```

These correspond to:

```text
list comprehension
set comprehension
dict comprehension
generator expression
```

They look like expressions, but internally they contain loops, conditionals, local bindings, and sometimes nested code objects.

## 38.1 List Comprehension

A list comprehension builds a list eagerly.

```python
ys = [x * 2 for x in xs]
```

Conceptually:

```python
ys = []
for x in xs:
    ys.append(x * 2)
```

The result is a list object containing all generated values.

The source form is shorter, but the runtime work is still iteration, expression evaluation, and append operations.

## 38.2 Filtering

A comprehension can include an `if` clause.

```python
ys = [x * 2 for x in xs if x > 0]
```

Conceptually:

```python
ys = []
for x in xs:
    if x > 0:
        ys.append(x * 2)
```

The filter runs for each item. If the condition is false, the element expression does not run for that item.

## 38.3 Multiple `for` Clauses

Comprehensions can contain nested loops.

```python
pairs = [(x, y) for x in xs for y in ys]
```

Conceptually:

```python
pairs = []
for x in xs:
    for y in ys:
        pairs.append((x, y))
```

The order is left to right, matching nested loop order.

For:

```python
[(x, y, z) for x in xs for y in ys for z in zs]
```

the conceptual loop nest is:

```python
result = []
for x in xs:
    for y in ys:
        for z in zs:
            result.append((x, y, z))
```

## 38.4 Multiple Filters

Filters attach to the loop level where they appear.

```python
result = [x for x in xs if x > 0 if x % 2 == 0]
```

Conceptually:

```python
result = []
for x in xs:
    if x > 0:
        if x % 2 == 0:
            result.append(x)
```

With multiple loops:

```python
result = [(x, y) for x in xs if x > 0 for y in ys if y > x]
```

Conceptually:

```python
result = []
for x in xs:
    if x > 0:
        for y in ys:
            if y > x:
                result.append((x, y))
```

The order matters because later clauses can use names bound by earlier clauses.

## 38.5 Comprehension Scope

In Python 3, comprehensions have their own scope.

```python
x = 100
ys = [x for x in range(3)]
print(x)
```

Output:

```text
100
```

The `x` inside the comprehension does not overwrite the outer `x`.

Conceptually, the comprehension behaves like a small nested function:

```python
def _listcomp(iterable):
    result = []
    for x in iterable:
        result.append(x)
    return result

ys = _listcomp(range(3))
```

This is not an exact source transformation, but it explains the scope.

## 38.6 Why Comprehensions Have Their Own Code Objects

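A comprehension needs a place to store its loop variables without leaking them into the surrounding scope.

Through CPython 3.11, the compiler does this by compiling every comprehension into a nested code object that is wrapped in a function and called once. Since CPython 3.12 (PEP 709), list, set, and dict comprehensions are inlined into the enclosing code object and the compiler isolates the loop variable itself. Generator expressions still use a nested code object on every version.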

Example:

```python
def f(xs):
    return [x * 2 for x in xs]
```

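On CPython 3.11 and earlier, the outer function has one code object and the list comprehension has another, stored in the outer code object’s constants. On 3.12 and later the comprehension is inlined, so `f` has a single code object.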

You can inspect this:

```python
def f(xs):
    return [x * 2 for x in xs]

for const in f.__code__.co_consts:
    print(type(const), const)
```

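On CPython 3.11 and earlier, one constant is a nested code object named `<listcomp>`. On 3.12 and later no such constant appears for this function. A generator expression in the same position produces a nested `<genexpr>` code object on any version.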
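A version-tolerant way to check this is to list the names of any nested code objects. The sketch below is illustrative only: `f_list`, `f_gen`, and the helper `nested_code_names` are names invented for this example, not part of CPython.

```python
import types


def f_list(xs):
    return [x * 2 for x in xs]


def f_gen(xs):
    return (x * 2 for x in xs)


def nested_code_names(func):
    """Return the names of code objects stored in func's constants."""
    return [
        const.co_name
        for const in func.__code__.co_consts
        if isinstance(const, types.CodeType)
    ]


# ['<listcomp>'] on CPython 3.11 and earlier, [] on 3.12+ (inlined).
print(nested_code_names(f_list))
# A generator expression keeps its own code object on every version.
print(nested_code_names(f_gen))
```

Using `isinstance(const, types.CodeType)` is a slightly stricter filter than checking for a `co_code` attribute, but either works here.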

## 38.7 Disassembling a List Comprehension

Use `dis`:

```python
import dis

def f(xs):
    return [x * 2 for x in xs]

dis.dis(f)
```

Then inspect nested code objects:

```python
for const in f.__code__.co_consts:
    if hasattr(const, "co_code"):
        dis.dis(const)
```

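On CPython 3.11 and earlier you will see two layers:

```text
outer function:
    create comprehension function
    get iterator from xs
    call comprehension function
    return list

inner comprehension:
    build list
    iterate input
    compute x * 2
    append to list
    return list
```

On CPython 3.12 and later the list comprehension is inlined: the build, iterate, compute, and append steps appear directly in the bytecode of `f`, and there is no inner function to create or call. Generator expressions keep the two-layer shape on every version. The exact opcodes also change between releases.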

## 38.8 List Append Optimization

A list comprehension appends with a dedicated bytecode instruction instead of an ordinary method call.

Conceptually:

```python
result.append(value)
```

But CPython can avoid ordinary method lookup for every append.

Instead of doing this for each item:

```text
load result.append
call append
```

the comprehension body uses a single instruction that appends directly to the list being built:

```text
LIST_APPEND
```

This saves repeated attribute lookup and method call overhead.

This is one reason list comprehensions are often faster than equivalent Python-level loops with `append`.
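To confirm that the instruction is present without depending on one CPython version's bytecode layout, you can collect opcode names from the function and from any nested code objects. This is a minimal sketch; the helper name `opnames_including_nested` is made up for the example.

```python
import dis
import types


def f(xs):
    return [x * 2 for x in xs]


def opnames_including_nested(code):
    """Collect opcode names from a code object and its nested code objects."""
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            names |= opnames_including_nested(const)
    return names


ops = opnames_including_nested(f.__code__)
print("LIST_APPEND" in ops)  # True
```

The recursion matters on CPython 3.11 and earlier, where the instruction lives in the nested `<listcomp>` code object rather than in `f` itself.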

## 38.9 Set Comprehension

A set comprehension builds a set eagerly.

```python
unique = {x.lower() for x in words}
```

Conceptually:

```python
unique = set()
for x in words:
    unique.add(x.lower())
```

The result contains unique elements according to normal set hashing and equality.

A set comprehension uses set-add behavior internally, similar to how list comprehensions use append behavior.

## 38.10 Dict Comprehension

A dict comprehension builds a dictionary eagerly.

```python
index = {item.id: item for item in items}
```

Conceptually:

```python
index = {}
for item in items:
    index[item.id] = item
```

If duplicate keys appear, later values overwrite earlier values:

```python
d = {x % 2: x for x in range(5)}
print(d)
```

Output:

```text
{0: 4, 1: 3}
```

The comprehension follows normal dictionary assignment semantics.

## 38.11 Generator Expression

A generator expression is lazy.

```python
g = (x * 2 for x in xs)
```

It does not build a list immediately. It creates a generator object that computes each value only when it is requested.

```python
g = (x * 2 for x in range(3))

print(next(g))
print(next(g))
print(next(g))
```

Output:

```text
0
2
4
```

Once the generator is exhausted, a further `next(g)` raises `StopIteration`.
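One way to observe the laziness is to record when the element expression actually runs. In this sketch, `double` and `calls` are illustrative names. Only the outermost iterable (`range(3)` here) is evaluated when the generator expression is created; the element expression waits for `next`.

```python
calls = []


def double(x):
    calls.append(x)  # record that the element expression really ran
    return x * 2


g = (double(x) for x in range(3))
print(calls)    # [] : creating the generator ran nothing
print(next(g))  # 0
print(calls)    # [0]
print(list(g))  # [2, 4]
print(calls)    # [0, 1, 2]
```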

## 38.12 Generator Expression vs List Comprehension

Compare:

```python
[x * 2 for x in xs]
```

and:

```python
(x * 2 for x in xs)
```

| Feature | List comprehension | Generator expression |
|---|---|---|
| Evaluation | Eager | Lazy |
| Result | List | Generator object |
| Memory | Stores all results | Stores execution state |
| Iteration | Can iterate result many times | One-shot |
| Syntax | Square brackets | Parentheses |
| Use case | Need all values now | Stream values |

A list comprehension is often faster when you need the full list.

A generator expression is often better when you want to stream values or stop early.

## 38.13 One-Shot Nature of Generator Expressions

A generator expression is consumed once.

```python
g = (x for x in range(3))

print(list(g))
print(list(g))
```

Output:

```text
[0, 1, 2]
[]
```

The first `list(g)` exhausts it.

If you need reusable data, build a list or another container.

## 38.14 Early Stopping

Generator expressions are useful with consumers that stop early.

```python
first = next(x for x in xs if x > 100)
```

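This computes only until it finds the first matching element. If nothing matches, `next` raises `StopIteration`; pass a default as the second argument, as in `next(gen, None)`, to avoid that.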

A list comprehension version:

```python
first = [x for x in xs if x > 100][0]
```

builds the full list of matches before selecting the first item.

For large or infinite inputs, generator expressions are the correct model.
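The difference in work done can be made visible by counting how many input items each form pulls. The helper `counted` and the sample data are assumptions made for this sketch.

```python
def counted(iterable, seen):
    """Yield items unchanged, appending each one to `seen` as it is pulled."""
    for item in iterable:
        seen.append(item)
        yield item


xs = [5, 50, 500, 5000, 50000]

seen_lazy = []
first_lazy = next((x for x in counted(xs, seen_lazy) if x > 100), None)

seen_eager = []
matches = [x for x in counted(xs, seen_eager) if x > 100]
first_eager = matches[0] if matches else None

print(first_lazy, seen_lazy)    # 500 [5, 50, 500]
print(first_eager, seen_eager)  # 500 [5, 50, 500, 5000, 50000]

# With a default, a missing match yields None instead of StopIteration.
missing = next((x for x in xs if x > 10**6), None)
print(missing)  # None
```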

## 38.15 Comprehensions and Closures

Comprehensions can capture outer variables.

```python
def scale(xs, factor):
    return [x * factor for x in xs]
```

The comprehension uses `factor` from the outer function.

Conceptually:

```text
outer function frame:
    factor stored in local or cell

comprehension code object:
    reads factor as free variable
```

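When the comprehension has its own code object, the compiler arranges closure cells for the outer-scope variables it reads. An inlined comprehension (CPython 3.12 and later) reads `factor` directly as a local of the enclosing function.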
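The cell and free-variable bookkeeping can be inspected through code object attributes. The sketch below uses a generator-expression variant, here called `scale_lazy` (an illustrative name), because a generator expression has its own code object on every CPython version.

```python
import types


def scale_lazy(xs, factor):
    # Generator-expression counterpart of scale(); hypothetical name.
    return (x * factor for x in xs)


outer = scale_lazy.__code__
inner = next(c for c in outer.co_consts if isinstance(c, types.CodeType))

print(outer.co_cellvars)  # ('factor',) : stored in a cell by the outer function
print(inner.co_freevars)  # ('factor',) : read as a free variable inside
print(list(scale_lazy([1, 2, 3], 10)))  # [10, 20, 30]
```

Note that `xs` is not a cell variable: the outer function evaluates it and passes the resulting iterator to the generator expression as an argument.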

## 38.16 Loop Variable Binding

The loop variable belongs to the comprehension scope.

```python
def f():
    x = "outer"
    ys = [x for x in range(3)]
    return x, ys

print(f())
```

Output:

```text
('outer', [0, 1, 2])
```

Inside the comprehension, `x` is a separate binding that belongs to the comprehension's scope.

The outer `x` remains unchanged.

## 38.17 Assignment Expressions in Comprehensions

Assignment expressions can appear in comprehensions.

```python
result = [y for x in xs if (y := f(x)) > 0]
```

The binding rules are subtle. The target of an assignment expression is bound in the scope that contains the comprehension, unlike the loop variable, which stays local to the comprehension.

Example:

```python
def f(xs):
    result = [y for x in xs if (y := x * 2) > 3]
    return y, result
```

After the comprehension, `y` is visible in the containing function scope, provided the assignment ran at least once. If `xs` is empty, `y` is never bound and `return y, result` raises `UnboundLocalError`.

PEP 572 specifies this behavior so that a value captured inside a comprehension, such as the witness found by an `any(...)` call, can be used after the expression finishes.
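A runnable version of the example makes both cases concrete. The function is renamed `doubled_over_three` here (a name chosen for this sketch) to avoid clashing with the `f(x)` call used earlier in this section. It requires Python 3.8 or later.

```python
def doubled_over_three(xs):
    result = [y for x in xs if (y := x * 2) > 3]
    return y, result


# y holds the last value assigned, whether or not the filter passed.
print(doubled_over_three([1, 2, 3]))  # (6, [4, 6])
print(doubled_over_three([3, 1]))     # (2, [6])

# With no iterations, the walrus never runs and y is unbound.
try:
    doubled_over_three([])
except UnboundLocalError:
    print("y was never bound")
```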

## 38.18 Nested Comprehensions

A comprehension can contain another comprehension.

```python
matrix = [[i * j for j in range(3)] for i in range(3)]
```

Conceptually:

```python
matrix = []
for i in range(3):
    row = []
    for j in range(3):
        row.append(i * j)
    matrix.append(row)
```

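Each comprehension has its own scope.

On CPython 3.11 and earlier each one also has its own code object, so nested comprehensions create nested function calls: the inner comprehension's function is created and called once per outer iteration. On 3.12 and later both levels are inlined.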

## 38.19 Comprehensions Over Dictionaries

Iterating over a dictionary yields keys.

```python
keys = [k for k in d]
```

To use values:

```python
values = [v for v in d.values()]
```

To use key-value pairs:

```python
pairs = [(k, v) for k, v in d.items()]
```

A common transformation:

```python
inverted = {v: k for k, v in d.items()}
```

If values are duplicated, later keys overwrite earlier ones because dictionary keys must be unique.

## 38.20 Comprehensions and Evaluation Order

Comprehension clauses execute left to right.

```python
[(x, y) for x in xs for y in f(x)]
```

For each `x`, `f(x)` is evaluated to produce the inner iterable.

Conceptually:

```python
result = []
for x in xs:
    for y in f(x):
        result.append((x, y))
```

This means later clauses can depend on earlier loop variables.

The element expression runs only after all loop and filter clauses for that output value have succeeded.
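The ordering can be traced by logging each step. All four helpers below (`outer_items`, `inner_items`, `keep`, `element`) are hypothetical and exist only to record when they are called.

```python
events = []


def outer_items():
    events.append("outer")
    return [1, 2]


def inner_items(x):
    events.append(f"inner({x})")
    return [10, 20]


def keep(x, y):
    events.append(f"keep({x},{y})")
    return y == 20


def element(x, y):
    events.append(f"element({x},{y})")
    return (x, y)


result = [element(x, y) for x in outer_items() for y in inner_items(x) if keep(x, y)]

print(result)  # [(1, 20), (2, 20)]
for event in events:
    print(event)
```

The log shows `outer` once, then for each `x` one `inner(x)` call followed by a `keep` call per `y`, with `element` appearing only after a `keep` that returned true.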

## 38.21 Side Effects

Comprehensions can contain side effects, but should usually be used for producing values.

Possible but poor style:

```python
[print(x) for x in xs]
```

This builds a list of `None` values just to perform printing.

Prefer:

```python
for x in xs:
    print(x)
```

Use a comprehension when the result matters.

## 38.22 Exceptions in Comprehensions

Exceptions propagate normally.

```python
result = [10 / x for x in xs]
```

If `x` is zero, `ZeroDivisionError` propagates and the comprehension stops.

The partially built container is discarded. It was never bound to a name, so no code can observe it.

For generator expressions, exceptions occur lazily:

```python
g = (10 / x for x in xs)
```

Creating `g` does not divide. The exception happens when the problematic item is requested.
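A short sketch of the lazy case, using sample data chosen for illustration:

```python
xs = [2, 1, 0, 4]
g = (10 / x for x in xs)  # no division has happened yet

values = []
try:
    for value in g:
        values.append(value)
except ZeroDivisionError:
    print("failed after", values)  # failed after [5.0, 10.0]

# A generator that raised is finished; the trailing 4 is never reached.
leftover = list(g)
print(leftover)  # []
```

Values produced before the failure have already been handed to the consumer, which is a practical difference from the list comprehension, where nothing is returned at all.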

## 38.23 Comprehensions and `try`

Comprehensions do not allow statements such as `try` directly inside them.

Invalid:

```python
[x for x in xs try ...]
```

Use a helper function:

```python
def parse_or_none(x):
    try:
        return int(x)
    except ValueError:
        return None

values = [y for x in xs if (y := parse_or_none(x)) is not None]
```

Or use an ordinary loop when exception handling is central:

```python
values = []
for x in xs:
    try:
        values.append(int(x))
    except ValueError:
        pass
```

## 38.24 Async Comprehensions

Inside `async def`, comprehensions can use `async for`.

```python
async def collect(stream):
    return [item async for item in stream]
```

Conceptually:

```python
result = []
async for item in stream:
    result.append(item)
return result
```

They can also use `await` in the element expression or filter:

```python
async def collect(xs):
    return [await process(x) for x in xs]
```

Async comprehensions compile to async-aware bytecode and may suspend during execution.
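Both forms can be run end to end with `asyncio`. In this sketch, `ticker` is a hypothetical async source and `process` is a stand-in for the coroutine named above; neither is a library function.

```python
import asyncio


async def ticker(n):
    """Hypothetical async source: yields 0..n-1, suspending between items."""
    for i in range(n):
        await asyncio.sleep(0)
        yield i


async def process(x):
    """Hypothetical stand-in for the `process` coroutine used above."""
    await asyncio.sleep(0)
    return x * 2


async def main():
    collected = [item async for item in ticker(3)]
    processed = [await process(x) for x in collected]
    return collected, processed


collected, processed = asyncio.run(main())
print(collected)  # [0, 1, 2]
print(processed)  # [0, 2, 4]
```

Each `await` and each step of `async for` is a point where the enclosing coroutine may yield control to the event loop.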

## 38.25 Async Generator Expressions

An async generator expression can use `async for`.

```python
gen = (item async for item in stream)
```

It produces an async generator object, consumed with `async for` or the `anext()` built-in (Python 3.10 and later).

```python
async for item in gen:
    ...
```

The execution model combines comprehension scope with async iteration and coroutine suspension.

## 38.26 Comprehensions and `locals()`

Because comprehensions have their own scope, what `locals()` returns when called inside one depends on the implementation.

On CPython 3.11 and earlier it reflects the comprehension's own frame: the loop variables plus any outer names the comprehension captures. On CPython 3.12 and later, where list, set, and dict comprehensions are inlined, it also includes the locals of the enclosing function.

The important rule:

```text
loop variables in comprehensions do not leak into the surrounding scope
```

For ordinary code, rely on that rule rather than on details of `locals()` inside implementation-created frames.

## 38.27 Comprehensions and Late Binding

Closures inside comprehensions can still show late binding behavior.

```python
funcs = [lambda: x for x in range(3)]
print([f() for f in funcs])
```

Output:

```text
[2, 2, 2]
```

Each lambda closes over the same comprehension variable `x`, whose final value is `2`.

Use a default argument to capture the current value:

```python
funcs = [lambda x=x: x for x in range(3)]
print([f() for f in funcs])
```

Output:

```text
[0, 1, 2]
```

The comprehension scope prevents leakage outward, but it does not create a new binding per iteration for closures.

## 38.28 Comprehensions and Reference Lifetime

A list comprehension does not keep its frame alive after completion unless something captures it.

But a generator expression keeps its frame-like state alive while suspended.

```python
g = (x * 2 for x in range(10))
```

The generator expression holds:

```text
code object
iteration state
current iterator
locals
suspended frame state
```

If it captures a large object, that object may remain alive until the generator is exhausted or discarded.

```python
def f():
    big = bytearray(100_000_000)
    return (x for x in range(3) if big is not None)

g = f()
```

Here, `big` remains alive through the generator expression closure.
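The lifetime can be observed with a weak reference. `bytearray` does not support weak references, so this sketch substitutes a small placeholder class, `Big`; the class and the `make` helper are assumptions for the example. The immediate release after `del` relies on CPython's reference counting.

```python
import gc
import weakref


class Big:
    """Stand-in for a large object; instances support weak references."""


def make():
    big = Big()
    probe = weakref.ref(big)
    gen = (x for x in range(3) if big is not None)
    return gen, probe


g, probe = make()
alive_while_suspended = probe() is not None
print(alive_while_suspended)  # True: the generator's closure keeps `big` alive

del g         # drop the only reference to the generator
gc.collect()  # not required under CPython's refcounting, but harmless
released = probe() is None
print(released)  # True
```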

## 38.29 Comprehensions and Performance

List comprehensions are often faster than equivalent loops because CPython can use specialized internal operations.

Example:

```python
result = []
for x in xs:
    result.append(x * 2)
```

Compared with:

```python
result = [x * 2 for x in xs]
```

The comprehension can avoid repeated Python-level method lookup for `append`.

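However, the size of the gap depends on the expression, data size, and Python version. CPython 3.11 and later also specialize `append` calls in ordinary loops, which narrows it. Whether laziness matters is often more important than raw speed.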
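Rather than relying on a general claim, measure on your own interpreter. The following is a rough sketch using `timeit`; the input size, repetition count, and function names are arbitrary choices, and the printed timings will vary by machine and version.

```python
import timeit

xs = list(range(10_000))


def with_loop():
    result = []
    for x in xs:
        result.append(x * 2)
    return result


def with_comprehension():
    return [x * 2 for x in xs]


# Both forms must produce the same list before timing means anything.
same = with_loop() == with_comprehension()

loop_time = timeit.timeit(with_loop, number=200)
comp_time = timeit.timeit(with_comprehension, number=200)
print(f"same result: {same}")
print(f"loop: {loop_time:.4f}s  comprehension: {comp_time:.4f}s")
```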

General rule:

```text
use list/set/dict comprehensions when building that container directly
use generator expressions when streaming or stopping early
use explicit loops when control flow is complex
```

## 38.30 Comprehensions and Readability

Comprehensions are clearest when they fit one simple transformation.

Good:

```python
names = [user.name for user in users]
```

Good:

```python
active = [user for user in users if user.active]
```

Often too dense:

```python
result = [(a, b, c) for a in xs if p(a) for b in f(a) if q(b) for c in g(a, b) if r(c)]
```

Use explicit loops when there are many clauses, side effects, exception handling, or complex branching.

## 38.31 CPython Object Flow

For a list comprehension:

```python
[x * 2 for x in xs]
```

CPython conceptually performs:

```text
create result list
get iterator from xs
loop:
    get next item
    store item in comprehension local x
    load x
    load constant 2
    multiply
    append to result list
return result list
```

While it is being built, the list lives on the interpreter's evaluation stack rather than in a named variable.

For a dict comprehension:

```python
{x: x * 2 for x in xs}
```

the flow is:

```text
create result dict
iterate xs
compute key
compute value
store key-value pair
return dict
```

## 38.32 Comprehension Code Object Names

Comprehension code objects have internal names such as:

```text
<listcomp>
<setcomp>
<dictcomp>
<genexpr>
```

You can see them in tracebacks and introspection.

Example:

```python
def f(xs):
    return [10 / x for x in xs]

f([2, 1, 0])
```

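On CPython 3.11 and earlier the traceback includes a `<listcomp>` entry, because the exception occurs inside the comprehension's own code object and frame.

On 3.12 and later the list comprehension is inlined, so the error is reported directly in `f`. A generator expression in the same position still shows `<genexpr>`.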

## 38.33 Tracebacks in Comprehensions

If an exception occurs inside a comprehension, the traceback can include both the outer function and the comprehension.

```python
def f(xs):
    return [10 / x for x in xs]

f([1, 0])
```

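The division by zero occurs inside the comprehension code.

On CPython 3.11 and earlier, the frames are:

```text
frame f
    calls <listcomp>
        division by zero
```

This is another visible effect of comprehension code objects. On CPython 3.12 and later the list comprehension is inlined, so the traceback shows only the frame for `f`.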
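The frame names can be read programmatically with the `traceback` module, which makes the version difference easy to check. A minimal sketch:

```python
import traceback


def f(xs):
    return [10 / x for x in xs]


try:
    f([1, 0])
except ZeroDivisionError as exc:
    names = [entry.name for entry in traceback.extract_tb(exc.__traceback__)]

print(names)
# CPython 3.11 and earlier: the list ends with 'f', '<listcomp>'
# CPython 3.12 and later:   the list ends with 'f'
```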

## 38.34 Comprehension Variable Lifetime

A comprehension loop variable exists in the comprehension scope.

After completion:

```python
def f():
    result = [x for x in range(3)]
    return "x" in locals()

print(f())
```

This returns:

```text
False
```

The loop variable `x` did not become a local in `f`.

`x` existed only while the comprehension ran.

## 38.35 Generator Expression Argument Shortcut

A generator expression can be passed as the only argument to a function without extra parentheses.

```python
total = sum(x * x for x in xs)
```

This is equivalent to:

```python
total = sum((x * x for x in xs))
```

But if there are multiple arguments, parentheses are required:

```python
result = func((x for x in xs), other)
```

This is a syntax-level convenience. The runtime object is still a generator object.

## 38.36 Comprehensions and Built-ins

Comprehensions often pair with built-ins.

Examples:

```python
sum(x for x in xs)
any(x > 0 for x in xs)
all(x.valid for x in items)
max(score(x) for x in xs)
```

These use generator expressions and can stop early in some cases.

`any` stops at the first true value.

`all` stops at the first false value.

`sum` and `max` consume the whole generator.

Choosing a generator expression avoids building an unnecessary intermediate list.
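Short-circuiting can be seen by feeding `any` and `all` from an explicit iterator and checking what is left afterwards. The sample data is arbitrary.

```python
it = iter([0, 0, 5, 0, 7])
found = any(x > 0 for x in it)
left_after_any = list(it)
print(found, left_after_any)  # True [0, 7]

it = iter([1, 2, 0, 3])
all_positive = all(x > 0 for x in it)
left_after_all = list(it)
print(all_positive, left_after_all)  # False [3]
```

In both cases the items after the deciding element were never pulled from the underlying iterator.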

## 38.37 Common Misunderstandings

| Misunderstanding | Correct model |
|---|---|
| A comprehension is just syntax rewriting in the same scope | It has its own scope, and its own nested code object on CPython 3.11 and earlier (generator expressions on every version) |
| The loop variable leaks into the outer scope | In Python 3, it does not |
| A generator expression builds a tuple | It creates a generator object |
| List comprehensions are always better | Generator expressions are better for streaming and early stopping |
| Comprehensions cannot capture outer variables | They can capture through closures |
| Each lambda in a comprehension captures a different loop variable | They usually share the same comprehension variable binding |
| Exceptions happen when a generator expression is created | They happen when it is consumed |
| Dict comprehensions keep duplicate keys | Later values overwrite earlier ones |

## 38.38 Reading Strategy

Start with:

```python
def f(xs):
    return [x * 2 for x in xs if x > 0]
```

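Inspect (on CPython 3.12 and later the loop finds no nested code object for `f`, because the list comprehension is inlined):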

```python
import dis

dis.dis(f)

for const in f.__code__.co_consts:
    if hasattr(const, "co_code"):
        print(const.co_name)
        dis.dis(const)
```

Then compare with:

```python
def g(xs):
    return (x * 2 for x in xs if x > 0)
```

Track:

```text
outer function bytecode
nested comprehension code object
iteration setup
local loop variable
filter jump
append or yield operation
return value
closure variables
```

Then study set, dict, nested, and async comprehensions.

## 38.39 Chapter Summary

Comprehensions are compiled execution units for building lists, sets, dictionaries, or generator-like iterators. They combine iteration, filtering, expression evaluation, binding, and container construction in expression form.

The core model is:

```text
evaluate outer iterable
    ↓
create comprehension execution scope
    ↓
iterate
    ↓
apply filters
    ↓
compute element, key-value pair, or yielded value
    ↓
append, add, store, or yield
    ↓
return container or generator object
```

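List, set, and dict comprehensions are eager. Generator expressions are lazy. CPython implements these constructs with an isolated scope, closure handling, specialized bytecode operations, and normal exception propagation. The scope is a nested code object for generator expressions on every version and for all comprehensions through 3.11; from 3.12, list, set, and dict comprehensions are inlined into the enclosing code object.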
