Python Comprehension vs map()
The comprehension techiques for the data types list
, dict
and set
append values to new data objects. They can be used to replace loops with for
and functions like map
and filter
.
The following examples showcase the usage of comprehension versus map()
with the tokenization of a dataset for a Hugging Face pre-trained model.
Basics with lists
# loop
l = []
for n in data:
l.append(n**2)
print(l)
# comprehension
[n**2 for n in data]
>>> # loop
>>> print(l)
[1, 4, 9, 16, 25]
>>> # comprehension
>>> [n**2 for n in data]
[1, 4, 9, 16, 25]
The built-in function map(function, iterable, ...)
applies the function to every item of iterable. The returned iterator yields the results. See map().
# map
# passed to higher-function with IIFE
# i.e. immediately invoked function execution
# e.g. (lambda n: n**2)(3) # 9
list(map(lambda n: n**2, data))
>>> list(map(lambda n: n**2, data))
[1, 4, 9, 16, 25]
Generated bytecode
The following snippets use the module dis
to disassemble the CPython bytecode used by interpreter and compiler.
The input object can be ‘a module, a class, a method, a function, a generator, an asynchronous generator, a coroutine, a code object, a string of source code or a byte sequence of raw bytecode’. See dis.dis().
from dis import dis, show_code
# comprehension
dis(n**2 for n in data)
show_code(n**2 for n in data)
Expand for Output
>>> dis(n**2 for n in data) 1 0 LOAD_FAST 0 (.0) >> 2 FOR_ITER 14 (to 18) 4 STORE_FAST 1 (n) 6 LOAD_FAST 1 (n) 8 LOAD_CONST 0 (2) 10 BINARY_POWER 12 YIELD_VALUE 14 POP_TOP 16 JUMP_ABSOLUTE 2 >> 18 LOAD_CONST 1 (None) 20 RETURN_VALUE >>> show_code(n**2 for n in data) Name:Filename: Argument count: 1 Positional-only arguments: 0 Kw-only arguments: 0 Number of locals: 2 Stack size: 3 Flags: OPTIMIZED, NEWLOCALS, GENERATOR, NOFREE Constants: 0: 2 1: None Variable names: 0: .0 1: n </pre></p> </details> ```python # lambda lfun = lambda n: n**2 dis(lfun) show_code(lfun) ```Expand for Output
>>> lfun = lambda n: n**2 >>> dis(lfun) 1 0 LOAD_FAST 0 (n) 2 LOAD_CONST 1 (2) 4 BINARY_POWER 6 RETURN_VALUE >>> show_code(lfun) Name:Filename: Argument count: 1 Positional-only arguments: 0 Kw-only arguments: 0 Number of locals: 1 Stack size: 2 Flags: OPTIMIZED, NEWLOCALS, NOFREE Constants: 0: None 1: 2 Variable names: 0: n </pre></p> </details> ## Multiple inputs ```python #TODO ``` ```python #TODO ```