Table of Contents

  • 1  What makes code beautiful?

  • 2  Some rules of thumb

    • 2.1  Some docstring examples

  • 3  Type systems

    • 3.1  Example of strong typing

      • 3.1.1  Your turn – rewrite the cell above with a cast that makes it work

    • 3.2  Example of dynamic typing

    • 3.3  Type summary

  • 4  Flexible functions

    • 4.1  See: VanderPlas Section 6 for an explanation of *args and **kwargs

    • 4.2  What is going on under the hood

  • 5  Number 1 Python “gotcha”

    • 5.1  Note that the id of the default list is the same for each call!

    • 5.2  The preferred approach – use None as a default value

  • 6  Duck typing and type casting

    • 6.1  Your turn: in the cell below, use numpy.asarray to cast the argument to an array

Writing Python functions

What makes code beautiful?

Reference: Yukihiro Matsumoto, “Treating Code as an Essay”, Chapter 29 of Beautiful Code

  • Brevity – no unnecessary information – DRY “don’t repeat yourself”

  • Familiarity – use familiar patterns

  • Simplicity

  • Flexibility – simple things should be simple, complex things should be possible

  • Balance

Coding is a craft, like writing, cooking or furniture making. You develop a sense of balance by following master craftspeople in an apprenticeship. One of the big benefits of GitHub is that it gives you a chance to interact with very good programmers in an informal apprenticeship.

Some rules of thumb

  1. Functions in a program play the role of paragraphs in an essay. They should express a single idea clearly.

  2. That means they should not be longer than a single screen. Paging is distracting and breaks your concentration. It shouldn’t take more than 1 minute to understand what a function does.

  3. Not every function has to be documented, but you should be able to summarize any function you write in a clear, concise docstring.

  4. The best documentation is a working test case.

  5. You should think about how your function might change in the future, and design in some degree of flexibility.

  6. Functions should have a single entry and a single exit.

  7. Whenever possible, functions should be free of side effects. Exceptions to this rule include opening and writing files to disk, and modifying large arrays in place to avoid a copy.

  8. If you do modify an array that is passed as a function argument, return that array to signal the change. In Python there is no performance penalty for this, because the array is not copied. Instead, a new name is bound to the same array, so two names now refer to one object. When in doubt, use the id function to compare the identity (in CPython, the memory address) of the new name and the old name: they should be identical.
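
Rule 8 can be sketched as follows. This is only an illustration: it uses a plain list rather than a numpy array so that it runs without extra packages, and the function name scale_in_place is hypothetical. The same identity check applies to numpy arrays.

```python
def scale_in_place(values, factor):
    """Multiply every element of values by factor, modifying values in place.

    The same list object is returned to signal that it was changed.
    """
    for index, item in enumerate(values):
        values[index] = item * factor
    return values


data = [1, 2, 3]
result = scale_in_place(data, 10)

# both names refer to the same object, so no copy was made
print(result is data)
print(id(result) == id(data))
print(data)
```

Because result and data are two names for one object, the change is visible through either name.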

Type systems

In order to understand Python functions, it helps to understand how Python handles types.

Compare C and Python:

  • C: Strongly typed, statically typed

  • Python: Strongly typed, dynamically typed

Example of strong typing

The addition in the following cell raises a TypeError in Python, which the try/except catches. The equivalent assignment is also rejected or flagged by a C compiler.

[1]:
a = 5
try:
    b = a + "3"
except TypeError:
    print("caught a TypeError -- won't work")
caught a TypeError -- won't work

Your turn – rewrite the cell above with a cast that makes it work

Example of dynamic typing

The following cell will run in Python, but would fail to compile in C because it reassigns the type of a.

[2]:
a = 5
print(f"the type of a is {type(a)}")
a = "5"
print(f"now the type of a is {type(a)}")
the type of a is <class 'int'>
now the type of a is <class 'str'>

Type summary

  • Python is strongly typed, which means that it won’t coerce between unrelated types, such as a string and an integer, without an explicit cast. (“Explicit is better than implicit”)

  • Python is dynamically typed, which means that a variable name is attached to an instance of an object, but not to the object’s type, so the name can be reassigned to an instance of a different type.
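
The two points in the summary can be seen side by side in a short sketch. The variable names here are illustrative only.

```python
count = 5
label = "3"

# strong typing: str and int are only combined after an explicit conversion
as_number = count + int(label)   # integer addition
as_text = str(count) + label     # string concatenation
print(as_number, as_text)

# dynamic typing: the same name can be rebound to an object of another type
value = 5
first_type = type(value)
value = "five"
second_type = type(value)
print(first_type, second_type)
```

The conversion decides which operation is meant, which is why Python refuses to guess.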

Flexible functions

See: VanderPlas Section 6 for an explanation of *args and **kwargs

[3]:
def fibonacci(N, a=0, b=1):
    L = []
    while len(L) < N:
        a, b = b, a + b
        L.append(a)
    return L
[4]:
fibonacci(10)
[4]:
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
[5]:
fibonacci(10, b=3, a=1)
[5]:
[3, 4, 7, 11, 18, 29, 47, 76, 123, 199]

What is going on under the hood

Now rewrite this to be fully flexible using *args and **kwargs. This is roughly what the default-argument version is doing:

[6]:
def fibonacci_raw(*args, **kwargs):
    print(f"I got args={args} and kwargs={kwargs}")
    N = args[0]
    L = []
    #
    # the dictionary "get" method takes a second
    # argument which is the default value
    # that is returned when the dictionary key is missing
    #
    a = kwargs.get("a", 0)
    b = kwargs.get("b", 1)
    while len(L) < N:
        a, b = b, a + b
        L.append(a)
    return L
[7]:
fibonacci_raw(10, b=3, a=1, bummer=True)
I got args=(10,) and kwargs={'b': 3, 'a': 1, 'bummer': True}
[7]:
[3, 4, 7, 11, 18, 29, 47, 76, 123, 199]
[8]:
fibonacci_raw(10)
I got args=(10,) and kwargs={}
[8]:
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
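
One difference from the version with default arguments is worth noting: fibonacci_raw silently accepts the unused keyword bummer, whereas fibonacci(10, bummer=True) raises a TypeError. The following is a minimal sketch of how the raw version could reject unknown keywords. The name fibonacci_strict and the wording of the error message are assumptions made for illustration.

```python
def fibonacci_strict(*args, **kwargs):
    """Return the first N terms of the sequence seeded with a and b.

    Hypothetical variant of fibonacci_raw that rejects unknown keywords.
    """
    N = args[0]
    # pop removes the key from kwargs, returning the default if it is missing
    a = kwargs.pop("a", 0)
    b = kwargs.pop("b", 1)
    # anything still left in kwargs was not expected
    if kwargs:
        raise TypeError(f"unexpected keyword arguments: {sorted(kwargs)}")
    L = []
    while len(L) < N:
        a, b = b, a + b
        L.append(a)
    return L


print(fibonacci_strict(5, b=3, a=1))
try:
    fibonacci_strict(5, bummer=True)
except TypeError as err:
    print(f"caught a TypeError: {err}")
```

Using pop instead of get is the design choice that makes the leftover check possible.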

Number 1 Python “gotcha”

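As noted at the link below, there is a subtle issue with default arguments that are mutable objects, such as lists or dictionaries: the default is created once, when the function is defined, and is then shared by every call. Bottom line: do not use a mutable object as a default value.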

https://docs.python-guide.org/writing/gotchas/

Here’s an example of how you can get bitten:

[9]:
def append_to(element, to_list=[]):
    to_list.append(element)
    print(f"\ncalling with element={element}, the id of to_list is {id(to_list)}\n")
    return to_list


my_list = append_to(12)
print(f"first time I call the function I get {my_list}")

my_other_list = append_to(42)
print(f"second time I call the function I get {my_other_list}")

calling with element=12, the id of to_list is 4416911432

first time I call the function I get [12]

calling with element=42, the id of to_list is 4416911432

second time I call the function I get [12, 42]

Note that the id of the default list is the same for each call!

This is generally not what you expect, because you get different behaviour from identical inputs. It violates both “no side effects” and “familiarity”.
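
To see where the shared list lives, you can inspect the function object itself. The sketch below repeats append_to without the print call and looks at its __defaults__ attribute, which holds the default values created when the def statement ran.

```python
def append_to(element, to_list=[]):
    to_list.append(element)
    return to_list


# the default list is created once, when the def statement runs,
# and is stored on the function object
print(append_to.__defaults__)
defaults_before = append_to.__defaults__[0]

append_to(12)
append_to(42)

# the same stored list has now been modified by both calls
print(append_to.__defaults__)
print(append_to.__defaults__[0] is defaults_before)
```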

The preferred approach – use None as a default value

If you want the list to be created fresh by default, then test for None and create it.

Note that now the two lists have different ids.

[10]:
def append_to(element, to_list=None):
    if to_list is None:
        to_list = []
    to_list.append(element)
    print(f"\ncalling with element={element}, the id of to_list is {id(to_list)}\n")
    return to_list


my_list = append_to(12)
print(f"first time I call the function I get {my_list}")

my_other_list = append_to(42)
print(f"second time I call the function I get {my_other_list}")

calling with element=12, the id of to_list is 4416911880

first time I call the function I get [12]

calling with element=42, the id of to_list is 4416911176

second time I call the function I get [42]

Duck typing and type casting

Consider the following function:

[11]:
import numpy as np


def trysort(mylist):
    #
    # this assumes mylist is a "duck" with a sort method
    #
    mylist.sort()
    print(f"inside trysort, mylist is {mylist}")
    return mylist


trysort([3, 2, 1])
trysort(np.array([3, 2, 1]))
try:
    trysort((3, 2, 1))
except AttributeError:
    print("last example failed, tuple has no sort method")
inside trysort, mylist is [1, 2, 3]
inside trysort, mylist is [1 2 3]
last example failed, tuple has no sort method

This is an example of “duck typing”

If it walks like a duck, and quacks like a duck
then it's a duck

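The last call fails because a tuple has no sort method. Note also that mylist.sort() sorts in place, so the first two calls modify the list and the array that were passed in.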
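
A minimal sketch of one alternative: the built-in sorted function accepts any iterable, including a tuple, and returns a new list instead of modifying its argument. The name sorted_copy is hypothetical.

```python
def sorted_copy(items):
    """Return a new sorted list built from any iterable, leaving items unchanged."""
    return sorted(items)


original = (3, 2, 1)
print(sorted_copy(original))
print(original)
print(sorted_copy([3, 2, 1]))
```

The trade-off is that the result is always a list, whatever type was passed in.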

Your turn: in the cell below, use numpy.asarray to cast the argument to an array

[ ]:

Writing tests

pytest is an extensive testing framework for Python. It is overkill for this class, but we can capture the spirit of pytest by writing test functions that use asserts.

Example

[18]:
from numpy.testing import assert_allclose
def test_fib():
    #
    # deliberately insert a wrong result
    #
    result=fibonacci(10, b=3, a=1)
    result[0]=5
    answer=[2, 4, 7, 11, 18, 29, 47, 76, 123, 199]
    assert_allclose(result,answer)

test_fib()
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-18-c613ee92d79a> in <module>
      9     assert_allclose(result,answer)
     10
---> 11 test_fib()
     12

<ipython-input-18-c613ee92d79a> in test_fib()
      7     result[0]=5
      8     answer=[2, 4, 7, 11, 18, 29, 47, 76, 123, 199]
----> 9     assert_allclose(result,answer)
     10
     11 test_fib()

~/mini37/envs/e213/lib/python3.6/site-packages/numpy/testing/_private/utils.py in assert_allclose(actual, desired, rtol, atol, equal_nan, err_msg, verbose)
   1491     header = 'Not equal to tolerance rtol=%g, atol=%g' % (rtol, atol)
   1492     assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
-> 1493                          verbose=verbose, header=header, equal_nan=equal_nan)
   1494
   1495

~/mini37/envs/e213/lib/python3.6/site-packages/numpy/testing/_private/utils.py in assert_array_compare(comparison, x, y, err_msg, verbose, header, precision, equal_nan, equal_inf)
    817                                 verbose=verbose, header=header,
    818                                 names=('x', 'y'), precision=precision)
--> 819             raise AssertionError(msg)
    820     except ValueError:
    821         import traceback

AssertionError:
Not equal to tolerance rtol=1e-07, atol=0

Mismatch: 10%
Max absolute difference: 3
Max relative difference: 1.5
 x: array([  5,   4,   7,  11,  18,  29,  47,  76, 123, 199])
 y: array([  2,   4,   7,  11,  18,  29,  47,  76, 123, 199])

Summary for testing

When we start writing Python modules, we can use pytest to search through a file, find the functions whose names start with “test”, run them, and generate a report.
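
As a minimal sketch of what such a module could look like, the file below (the name test_fibonacci.py is an assumption) repeats the fibonacci function from above and adds two test functions that use plain assert statements. Running pytest on a file like this should collect both functions; they can also be called by hand, as shown at the end.

```python
def fibonacci(N, a=0, b=1):
    """Return the first N terms of the sequence seeded with a and b."""
    L = []
    while len(L) < N:
        a, b = b, a + b
        L.append(a)
    return L


def test_fib_default():
    assert fibonacci(10) == [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]


def test_fib_keywords():
    assert fibonacci(10, b=3, a=1) == [3, 4, 7, 11, 18, 29, 47, 76, 123, 199]


# without pytest, the tests can be run by calling them directly;
# a failing assert raises an AssertionError
test_fib_default()
test_fib_keywords()
print("all tests passed")
```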