# An Introduction of Python Tricks and Traps

This article isn't written specifically for CTFs; it is aimed more at developing applications with Python. Everything here is discussed in terms of Python 3.X.

-----

Online Version: https://demo.codimd.org/s/ry8VVAGMB

by [@oyiadin](https://blog.b1n.top/) from **Vidar-Team**, 2019.

Published under License [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/).

**YOU ARE NOT ALLOWED TO DISTRIBUTE THIS ARTICLE WITHOUT GIVING [APPROPRIATE CREDIT](https://creativecommons.org/licenses/by-nc-sa/4.0/#).**

-----

P.S. This is an unfinished version, and I currently have no time to finish it. Even so, the unfinished section titles are still a useful list of concepts worth looking up.

-----

## Literals & Identifiers

### String Prefixes

Putting certain characters before a string literal changes how it is interpreted.

* `b` or `B`: defines a `bytes` object instead of a `str`
* `r` or `R`: defines a "raw string". As the name implies, escape sequences like `\n` are not processed; they stay exactly as written
* `u`: defines a "unicode" string. **Useless in Python 3.**
* `f` or `F`: f-string, new in 3.6

### Formatted String (or f-string)

The format string syntax:

```BNF
replacement_field ::= "{" [field_name] ["!" conversion] [":" format_spec] "}"
format_spec       ::= [[fill]align][sign][#][0][width][grouping_option][.precision][type]
fill              ::= <any character>
align             ::= "<" | ">" | "=" | "^"
sign              ::= "+" | "-" | " "
width             ::= digit+
grouping_option   ::= "_" | ","
precision         ::= digit+
type              ::= "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%"
```

Examples:

* `"{:=^10}".format("TITLE")`: centers `"TITLE"` within a minimum width of `10`, padding with `=`
* `f"{__import__('os').name!s}"`: returns `'posix'` on my Macbook. The `!s` means the result is explicitly converted to a string with `str`, while `!r` converts it with `repr` instead.

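
The prefixes and the pieces of `format_spec` described above can be checked with a few small expressions. This is only a sketch; the variable names are illustrative and not part of the original article.

```python
# Raw strings keep backslashes literally; normal strings process escapes.
assert len("\n") == 1     # a single newline character
assert len(r"\n") == 2    # a backslash followed by the letter n

# The b prefix produces bytes, not str.
assert isinstance(b"abc", bytes)

# Pieces of format_spec: fill, align, sign, "#", "0", width, grouping, precision, type.
centered = "{:=^11}".format("TITLE")  # fill "=", align "^", width 11
signed = f"{42:+d}"                   # sign "+", type "d"
grouped = f"{1234567:,}"              # grouping_option ","
rounded = f"{3.14159:.2f}"            # precision 2, type "f"
padded_hex = f"{255:#06x}"            # "#" adds the 0x prefix, "0" pads to width 6

# !r applies repr() before formatting; !s would apply str().
quoted = f"{'hi'!r}"

print(centered, signed, grouped, rounded, padded_hex, quoted)
```

Note that the width in `#06x` counts the `0x` prefix as well, which is why only two padding zeros appear.
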
### Identifiers Starting with `_`

An identifier that follows one of the patterns listed below gains some special behavior.

#### `_*`

Names of this form are not imported by the statement `from module import *`.

Besides, the special identifier `_` is used in the interactive interpreter to store the result of the last evaluation.

> The name `_` is often used in conjunction with internationalization.

#### `__*__`

These are system-defined names. Defining your own identifiers of the form `__*__` is not recommended.

#### `__*`

Class-private names. When used within the context of a class definition, they are rewritten (name-mangled) so that they **pretend** to be "private" attributes.

-----

## Operators and Statements

### `and` and `or`

`and` and `or` behave like `&&` and `||` in C and C++ (i.e. short-circuit logic, or lazy evaluation): evaluation stops as soon as the result is determined.

Do remember that `and` and `or` always return one of their operands (not a `bool`!). That is, `"a" and "b"` returns `"b"` instead of `True`.

### `is` and `==`

`==` tests value equality.

`is` tests reference equality, i.e. whether both sides are the same object (similar to comparing the results of the builtin function `id`).

### `@` (new in Python 3.5)

`A @ B` is equivalent to `A.__matmul__(B)`. It is very useful when you are dealing with matrices with the help of `numpy`.

### Where to Put `not`?

In most cases, `not` is placed before the expression to be inverted, for example `not a == b`. But in some special cases, you can write it in a more English-like way:

* `if a not in b: pass`
* `if a is not b: pass`

### 10 < x < 100

Rather than writing `10 < x and x < 100`, in Python you can write the comparison the way it looks in ordinary math: `10 < x < 100`.

### `for`-`else` and `while`-`else`

The `else` part is executed after the loop exits normally (i.e., the loop wasn't stopped by a `break`, `return`, `raise`, etc.).

A typical way of using it:

```python
for i in mylist:
    if i == target:
        break
    process(i)
else:
    print("Oops, the target doesn't exist in mylist!")
```

-----

## Datamodel

### Truth Value Testing

By default, an object is considered true unless its class defines either a `__bool__()` method that returns `False` or a `__len__()` method that returns zero, when called with the object.

Here are most of the built-in objects considered false:

* constants defined to be false: `None` and `False`.
* zero of any numeric type: `0`, `0.0`, `0j`
* empty sequences and collections: `''`, `()`, `[]`, `{}`, `set()`, `range(0)`

> Do remember that `-1` is treated as **True**!

### Type Hints (new in 3.5)

Type hints don't affect how the code is executed. But they are very useful when you are coding in an IDE: most Python IDEs can make full use of your type hints and analyze your code based on this extra information.

The basic pattern of type hints:

![](https://codimd.s3.shivering-isles.com/demo/uploads/upload_3f68e765739edf2f58794b320432aaf3.png)

![](https://codimd.s3.shivering-isles.com/demo/uploads/upload_7c628cfb23179849469d3fa89ba2c6ff.png)

> Remember to import the required types from `typing`:
> `from typing import List, Tuple, Union # etc...`

### Sequences and Mappings

Sequence objects in Python: `list`, `tuple`, `str`, `bytes`, `bytearray`.

Mapping objects in Python: `dict`.

`set` is neither: it is an unordered collection, so it cannot be indexed like a sequence.

> `set` is the best choice when you are dealing with unique elements. It supports operations like `intersection` and `union` natively.

> The order of `dict` keys: since Python 3.7, keys keep the order in which they were inserted. In earlier versions of Python, the keys could appear to be "shuffled" **(but not randomly)**.

### Immutable and Mutable

In Python, objects are usually not copied (during argument passing, etc.). If you are not familiar with this behavior, you will run into many problems that are hard to track down.

Immutable objects: _**`str`**_, `bytes`, `tuple`

Mutable objects: `list`, `set`, `dict`, `bytearray` and so on

#### Object References

In Python, objects are created and then passed around everywhere **(without being copied)** by default. So, as you can see in my previous Python quiz, something strange can happen if you are not careful enough:

```python
>>> def forEach(f, x=['a', 'b']):
...     for n, i in enumerate(x):
...         x[n] = f(x[n])
...     return x
...
>>> forEach(lambda x: x.upper())
['A', 'B']
>>> forEach(lambda x: x)
['A', 'B']
```

The default list is created once, when the function is defined, so the first call modifies it in place and the second call sees the modified list.

Here are some traps that come from this feature and the corresponding ways to avoid them:

##### 1. Don't use mutable objects as default values in function definitions.

Solution: Use `None` and some `if`s within the function.

##### 2. Don't modify mutable objects passed in from other functions.

Solution: Never modify objects that you did not create yourself. Or do an explicit `xxx = xxx.copy()` at the beginning of your function.

##### 3. Never try to modify a `str`

Solution: Create (by formatting, concatenating...) a new one yourself :)

##### 4. Don't use mutable objects as the keys of a `dict`

Solution: Never do that; it's a very bad way to design your data structure. If you insist, you could create your own type by inheriting from `list`, `set`, etc., and define your own `__hash__` magic method.

For example (it's a bad way):

![](https://codimd.s3.shivering-isles.com/demo/uploads/upload_9dafe4a1a258aa83b881905c520ee379.png)

### List / Dict Comprehension and Generator Expression

With the help of list comprehensions, our code can be shorter and more readable.

The key to understanding a list comprehension is the expression before the keyword `for`:

```python
assert ['f'+str(i) for i in range(3)] == ['f0', 'f1', 'f2']
```

In this snippet of code, `'f'+str(i)` is evaluated in every iteration, and the results together produce the whole list.

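
As a small sketch of how the parts of a comprehension fit together: the leading expression is evaluated once per iteration, and an optional trailing `if` clause filters which items are kept. The names below are illustrative and not from the original article.

```python
# Layout: [expression  for-clause  optional-if-filter]
squares_of_evens = [n * n for n in range(6) if n % 2 == 0]

# The same result built with an explicit loop.
expected = []
for n in range(6):
    if n % 2 == 0:
        expected.append(n * n)

assert squares_of_evens == expected == [0, 4, 16]
```
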
Dict comprehension is similar:

```python
assert {i: i.upper() for i in 'ab'} == {'a': 'A', 'b': 'B'}
```

But be aware that there is no tuple comprehension. A "tuple comprehension" is in fact a generator expression:

```python
In [52]: g = (str(i) for i in range(3))

In [53]: g
Out[53]: <generator object <genexpr> at 0x1105e6c00>

In [54]: g.send(None)
Out[54]: '0'

In [55]: g.send('0')
Out[55]: '1'

In [56]: g.send('1')
Out[56]: '2'

In [57]: g.send('2')
------------------------------------------------------------------------
StopIteration                          Traceback (most recent call last)
<ipython-input-57-e7d830699914> in <module>
----> 1 g.send('2')

StopIteration:
```

-----

## Builtin Functions

This section introduces some useful builtin functions.

### `enumerate`

```python
# Showing array in a PHP way:
def show(array):
    print('(', end='')
    for n, i in enumerate(array):
        print('\n {} => {}'.format(n, i), end='')
    print('\n)')

show(['obj1', 'obj2', 333])
# (
#  0 => obj1
#  1 => obj2
#  2 => 333
# )
```

### `filter`

```python
In [58]: raw = ['a', '1', 'a1']

In [59]: filter(lambda x: x.isdigit(), raw)
Out[59]: <filter at 0x1105b0ba8>

In [60]: list(_)
Out[60]: ['1']
```

### `map`

```python
In [61]: f = lambda *args: ', '.join(map(str, args))

In [62]: f(123, 456, 'aaa', None)
Out[62]: '123, 456, aaa, None'
```

### `reduce`

```python
In [65]: from functools import reduce

In [66]: cat = lambda *args: reduce(lambda a, b: a + b, args)

In [67]: cat('a', 'b', 'cc')
Out[67]: 'abcc'
```

> Do remember to import it from `functools`

### `zip`

```python
def concatenate(k, v):
    return list(map(lambda x: '{}:{}'.format(*x), zip(k, v)))

assert concatenate(
    ['key1', 'key2'],
    ['value1', 'value2']) == ['key1:value1', 'key2:value2']
```

### `sorted` and `reversed`

### `classmethod`, `staticmethod` and `property`

They are all builtin [decorators](#Decorators).

The `classmethod` decorator enables us to call a function of a class without any instance of it.

The `staticmethod`-decorated functions are just normal ones (which happen to be defined inside a class ~~ahhhhhhh~~)

The `property`-decorated functions act just like attributes (you use them without the `(...)` call)

```python
class CLASS(object):
    def normal_function(self, *args, **kwargs):
        assert isinstance(self, CLASS)

    @classmethod
    def classmethod_function(cls, *args, **kwargs):
        assert cls == CLASS

    @staticmethod
    def staticmethod_function(*args, **kwargs):
        pass

    @property
    def property_function(self):
        return self.xxx
```

-----

## Magic Methods

### `__init__` and `__new__`

### `__str__` and `__repr__`

### Rich Comparison Methods

`__lt__`, `__le__`, `__eq__`, `__ne__`, `__gt__`, `__ge__`

### `__getattr__` and `__getattribute__`

### `__getitem__` and `__setitem__`

### `__iter__`

### `__call__`

### `with`-statement

-----

## Function-Related Topics

### Function Definition (Special Arguments)

#### Iterable and Dictionary unpacking (new in 3.5)

#### `lambda`

### Decorators

#### Without Arguments

#### With Arguments

### Generator Functions (`yield`)

### Coroutine Functions

-----

## Class-Related Topics

### Class Attributes

-----

## The Import System

-----

## Builtin Modules

This section introduces some useful (but less well-known) builtin modules.

### `pprint`

### `itertools`

### `functools`

### `struct`, `pickle` and `shelve`

### `enum`

### `argparse`

-----

## PEP8

-----

## What's New in Python 3.X

### Assignment Expressions (new in 3.8)

There is new syntax (the "walrus operator", `:=`) to assign values to variables as part of an expression.

Example:

```python
if (n := len(a)) > 10:
    print(f"List is too long ({n} elements, expected <= 10)")
```

### dataclasses (new in 3.7)

The new `dataclass` decorator provides a way to declare data classes. A data class describes its attributes using class variable annotations. Its constructor and other magic methods, such as `__repr__`, `__eq__`, and `__hash__`, are generated automatically.

Example:

```python
from dataclasses import dataclass

@dataclass
class Point:
    x: float
    y: float
    z: float = 0.0

p = Point(1.5, 2.5)
print(p)   # produces "Point(x=1.5, y=2.5, z=0.0)"
```

### Positional-Only Parameters (new in 3.8)

There is new syntax `/` to indicate that some function parameters must be specified positionally (i.e., cannot be used as keyword arguments). This is the same notation as shown by `help` for functions implemented in C (produced by Larry Hastings' "Argument Clinic" tool).

Example:

```python
def pow(x, y, z=None, /):
    r = x**y
    if z is not None:
        r %= z
    return r
```
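
To see what the `/` marker changes at the call site, here is a minimal sketch that requires Python 3.8 or later. The function is a copy of the example above under the hypothetical name `power` (chosen here only to avoid shadowing the builtin `pow`), and the variable names are illustrative.

```python
def power(x, y, z=None, /):
    """Same shape as the example above; every parameter is positional-only."""
    r = x ** y
    if z is not None:
        r %= z
    return r

# Positional calls work as usual.
positional = power(2, 10)
with_modulus = power(2, 10, 1000)

# Passing the same parameters by keyword is rejected with a TypeError.
try:
    power(x=2, y=10)
except TypeError as exc:
    keyword_error = str(exc)
else:
    keyword_error = None

print(positional, with_modulus, keyword_error)
```

Because callers cannot refer to `x`, `y` or `z` by name, the parameter names can later be changed without breaking any calling code.
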