shallow copy, deep copy, and the nested-dict bug
Shallow copy + nested data = surprise mutation
Now we hit the trap that bit you the moment you started learning
about copies. .copy() works perfectly for a flat list. The instant
your data has a nested structure — a list inside a dict, a dict
inside a list, anything more than one level deep — .copy() only
copies the outer layer. The inside is still shared.
This is the mutation bug from earlier in the chapter, scaled up to nested data. And nested data is what most API responses, JSON configs, and loaded database records look like. So this isn't an obscure edge case. It's the common case for any real codebase.
What "shallow" actually copies
user = {"name": "alex", "skills": ["python", "sql"]}
shallow = user.copy()
user.copy() builds a new dict. The new dict has the same keys
("name", "skills") and points each key at the same value the
original pointed at. That's fine for "name" (the string "alex"
is immutable — you can't mutate a string). It's not fine for
"skills", because the list at that key is the exact same list
object in both dicts.
Picture it:
user["skills"]    ───┐
                     ▼
             ["python", "sql"]
                     ▲
shallow["skills"] ───┘
Two dicts. Two "skills" keys. One list. If you .append() to
either of them, the list — the one shared list — grows, and both
dicts see it.
The editor on the right does exactly that. shallow["skills"].append("javascript")
reaches into the shared inner list and adds an item. Run it:
user: {'name': 'alex', 'skills': ['python', 'sql', 'javascript']}
shallow: {'name': 'alex', 'skills': ['python', 'sql', 'javascript']}
deep: {'name': 'alex', 'skills': ['python', 'sql']}
Both user and shallow got the new skill. deep didn't.
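If you'd like to confirm the sharing without mutating anything, an identity check with `is` is one way to do it. This is a minimal sketch that reuses the `user` dict from above; the names `outer_is_shared` and `inner_is_shared` are made up for illustration.

```python
user = {"name": "alex", "skills": ["python", "sql"]}
shallow = user.copy()

# The outer dicts are two separate objects...
outer_is_shared = shallow is user
# ...but the list stored under "skills" is one object reachable from both.
inner_is_shared = shallow["skills"] is user["skills"]

# Rebinding a top-level key only touches the dict you rebind it on.
shallow["name"] = "sam"

print(outer_is_shared, inner_is_shared)  # False True
print(user["name"], shallow["name"])     # alex sam
```

Rebinding `shallow["name"]` is safe because it swaps which object the key points at. The trouble only starts when you mutate an object that both dicts point at.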
The fix: copy.deepcopy
import copy
deep = copy.deepcopy(user)
copy.deepcopy walks every level of the structure and rebuilds every
container it finds. New outer dict, new inner list, and so on down.
Immutable values like the string "alex" can still be shared, which is
harmless because nothing can change them. The result shares no mutable
object with the original at any depth. Mutating any part of deep
(deep["skills"].append(...), deep["name"] = "...", anything) leaves
user untouched.
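The same thing holds for a dict inside a list. Here is a small sketch, with a made-up `users` list, that puts a shallow and a deep copy side by side:

```python
import copy

# Hypothetical sample data: a list of nested records.
users = [
    {"name": "alex", "skills": ["python", "sql"]},
    {"name": "sam", "skills": ["go"]},
]

shallow = users.copy()        # new outer list, same inner dicts
deep = copy.deepcopy(users)   # new outer list, new dicts, new inner lists

deep[0]["skills"].append("javascript")   # touches only the deep copy
shallow[1]["skills"].append("rust")      # leaks into users

print(users[0]["skills"])    # ['python', 'sql']
print(users[1]["skills"])    # ['go', 'rust']
print(deep[0] is users[0])   # False
```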
The cost: it's slower than .copy(), often by a wide margin, because it
visits every object in the structure and allocates a new container at
each level. For a few records you'll never notice. For a list of ten
thousand nested objects, you might. You're trading speed for a copy
that is independent at every depth.
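To get a feel for the gap on your own machine, you could time both with the standard-library `timeit` module. This is a rough sketch: the `records` list is invented sample data, and the numbers will vary with hardware and Python version, so treat them as a ballpark rather than a benchmark.

```python
import copy
import timeit

# Hypothetical data: 1,000 small nested records.
records = [
    {"id": i, "tags": ["a", "b"], "meta": {"active": True}}
    for i in range(1000)
]

# Time 20 runs of each copy strategy.
shallow_seconds = timeit.timeit(lambda: records.copy(), number=20)
deep_seconds = timeit.timeit(lambda: copy.deepcopy(records), number=20)

print(f"list.copy():     {shallow_seconds:.6f} s for 20 runs")
print(f"copy.deepcopy(): {deep_seconds:.6f} s for 20 runs")
```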
When AI specifically gets this wrong
Three common patterns in Cursor's nested-data code:
- "Backup before transform" with .copy().

      backup = response.copy()
      process(response)            # mutates response["users"]
      if something_failed:
          response = backup        # didn't actually back anything up

  Same shape as the previous-lesson bug, but now it bites at the nested level. The fix: backup = copy.deepcopy(response).

- Cloning a config to override a few fields.

      default_config = {"retries": 3, "headers": {"x-key": "..."}}
      user_config = default_config.copy()
      user_config["headers"]["x-key"] = "different-key"

  You just also changed default_config["headers"]["x-key"], because headers is the same dict in both. Every subsequent request that uses default_config is now using the user's key. The fix: deepcopy.

- Putting nested objects in a cache. When AI builds a memoize helper that stores results in a dict, mutating the cached value from the call site silently changes the cache. We'll see that pattern when we cover decorators and closures.
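The cache pattern can be sketched without decorators. Everything below is hypothetical (the `_cache` dict, `load_profile`, `load_profile_safe`, and the sample profile); it only illustrates how a cached object leaks, and one possible way to stop it by handing out deep copies.

```python
import copy

_cache = {}  # hypothetical module-level cache


def load_profile(user_id):
    """Build a profile once, then return the cached object itself."""
    if user_id not in _cache:
        _cache[user_id] = {"id": user_id, "skills": ["python"]}
    return _cache[user_id]


def load_profile_safe(user_id):
    """Same lookup, but the caller gets an independent copy."""
    return copy.deepcopy(load_profile(user_id))


profile = load_profile(1)
profile["skills"].append("sql")        # caller mutates "its" result...
leaked = load_profile(1)["skills"]     # ...and the cache changed too

safe = load_profile_safe(1)
safe["skills"].append("go")            # stays local to the copy
after_safe = load_profile(1)["skills"]

print(leaked)       # ['python', 'sql']
print(after_safe)   # ['python', 'sql']
```

Copying on the way out costs a deepcopy per call. Whether that's worth it depends on how much you trust every call site not to mutate what it gets back.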
The rule of thumb to internalize
Flat structure (list of numbers, dict of strings) → .copy() is fine.
Nested structure (list of dicts, dict with lists, anything with another container inside) → copy.deepcopy(thing).
When you read AI code that does .copy() on a structure that's
clearly nested (an API response, a config object, anything from
JSON), flag it. The bug is dormant; it'll surface the first time
something downstream mutates a nested field.
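If you want the rule of thumb as something you can run while reviewing, one possible shape is a small checker. `holds_mutable` and `shallow_copy_is_risky` are hypothetical helpers, not part of any library, and they only know about built-in containers. Instances of custom classes are not detected, so a False here is a hint, not a guarantee.

```python
MUTABLE_CONTAINERS = (dict, list, set, bytearray)


def holds_mutable(value):
    """True if value is a mutable built-in container or a tuple wrapping one."""
    if isinstance(value, MUTABLE_CONTAINERS):
        return True
    if isinstance(value, (tuple, frozenset)):
        return any(holds_mutable(item) for item in value)
    return False


def shallow_copy_is_risky(obj):
    """True if a shallow copy of obj would share a mutable built-in container."""
    if isinstance(obj, dict):
        return any(holds_mutable(v) for v in obj.values())
    if isinstance(obj, (list, tuple, set, frozenset)):
        return any(holds_mutable(item) for item in obj)
    return False


print(shallow_copy_is_risky([1, 2, 3]))                                 # False
print(shallow_copy_is_risky({"name": "alex"}))                          # False
print(shallow_copy_is_risky({"name": "alex", "skills": ["python"]}))    # True
print(shallow_copy_is_risky({"point": (1, [2])}))                       # True
```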
Run the editor. Watch user and shallow both grow. deep stays
clean.