{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "toc": true
   },
   "source": [
    "<h1>Table of Contents<span class=\"tocSkip\"></span></h1>\n",
    "<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#04---Using-numba-to-release-the-GIL\" data-toc-modified-id=\"04---Using-numba-to-release-the-GIL-1\"><span class=\"toc-item-num\">1&nbsp;&nbsp;</span>04 - Using numba to release the GIL</a></span><ul class=\"toc-item\"><li><ul class=\"toc-item\"><li><span><a href=\"#Timing-python-code\" data-toc-modified-id=\"Timing-python-code-1.0.1\"><span class=\"toc-item-num\">1.0.1&nbsp;&nbsp;</span>Timing python code</a></span></li><li><span><a href=\"#Now-try-this-with-numba\" data-toc-modified-id=\"Now-try-this-with-numba-1.0.2\"><span class=\"toc-item-num\">1.0.2&nbsp;&nbsp;</span>Now try this with numba</a></span></li><li><span><a href=\"#Make-two-identical-functions:-one-that-releases-and-one-that-holds-the-GIL\" data-toc-modified-id=\"Make-two-identical-functions:-one-that-releases-and-one-that-holds-the-GIL-1.0.3\"><span class=\"toc-item-num\">1.0.3&nbsp;&nbsp;</span>Make two identical functions: one that releases and one that holds the GIL</a></span></li><li><span><a href=\"#now-time-wait_loop_withgil\" data-toc-modified-id=\"now-time-wait_loop_withgil-1.0.4\"><span class=\"toc-item-num\">1.0.4&nbsp;&nbsp;</span>now time wait_loop_withgil</a></span></li><li><span><a href=\"#not-bad,-but-we're-only-using-one-core\" data-toc-modified-id=\"not-bad,-but-we're-only-using-one-core-1.0.5\"><span class=\"toc-item-num\">1.0.5&nbsp;&nbsp;</span>not bad, but we're only using one core</a></span></li></ul></li></ul></li></ul></div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To install the packages used in this notebook, in a shell window type:\n",
    "\n",
    "    pip install contexttimer\n",
    "    conda install numba\n",
    "    conda install joblib"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from IPython.display import Image\n",
    "import contexttimer\n",
    "import time\n",
    "import math\n",
    "from numba import jit\n",
    "from joblib import Parallel\n",
    "import logging"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 04 - Using numba to release the GIL"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Timing python code\n",
    "\n",
    "\n",
    "One easy way to tell whether you are utilizing multiple cores is to compare the wall clock time measured by [time.perf_counter](https://docs.python.org/3/library/time.html#time.perf_counter) with the total cpu time used by all threads, measured with [time.process_time](https://docs.python.org/3/library/time.html#time.process_time). If the cpu time is clearly larger than the wall time, more than one core was busy; if the two are about equal, the work ran on a single core.\n",
    "\n",
    "I'll organize these two timers using the [contexttimer](https://github.com/brouberol/contexttimer) module."
   ]
  },
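  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a minimal sketch of the wall time versus cpu time comparison, using only the standard library: `time_both` is a hypothetical helper written for this illustration (it is not part of contexttimer). The example assumes that `time.sleep` takes wall time but almost no cpu time, so the two clocks should disagree."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import time\n",
    "\n",
    "def time_both(func, *args):\n",
    "    \"\"\"\n",
    "    Hypothetical helper: call func(*args) once and return\n",
    "    (result, wall_seconds, cpu_seconds).\n",
    "    \"\"\"\n",
    "    wall_start = time.perf_counter()   # wall clock\n",
    "    cpu_start = time.process_time()    # user + system cpu time of this process\n",
    "    result = func(*args)\n",
    "    cpu_elapsed = time.process_time() - cpu_start\n",
    "    wall_elapsed = time.perf_counter() - wall_start\n",
    "    return result, wall_elapsed, cpu_elapsed\n",
    "\n",
    "# sleeping takes wall time but should take almost no cpu time\n",
    "_, sleep_wall, sleep_cpu = time_both(time.sleep, 0.2)\n",
    "print(f'sleep: wall time {sleep_wall:.3f} and cpu time {sleep_cpu:.3f}')"
   ]
  },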
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Define a function that does a lot of computation"
   ]
  },
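  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "def wait_loop(n):\n",
    "    \"\"\"\n",
    "    Function under test: burn cpu time in four nested loops.\n",
    "\n",
    "    Returns the last value computed (0.0 if the inner loop never runs).\n",
    "    \"\"\"\n",
    "    out = 0.0  # defined even when n is too small to reach the inner loop\n",
    "    for m in range(n):\n",
    "        for l in range(m):\n",
    "            for j in range(l):\n",
    "                for i in range(j):\n",
    "                    i = i + 4\n",
    "                    out = math.sqrt(i)\n",
    "                    out = out**2.\n",
    "    return out"
   ]
  },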
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### now time it with pure python"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "pure python wall time 12.900637587998062 and cpu time 12.683904\n"
     ]
    }
   ],
   "source": [
    "nloops=200\n",
    "with contexttimer.Timer(time.perf_counter) as pure_wall:\n",
    "    with contexttimer.Timer(time.process_time) as pure_cpu:\n",
    "        result=wait_loop(nloops)\n",
    "print(f'pure python wall time {pure_wall.elapsed} and cpu time {pure_cpu.elapsed}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Now try this with numba\n",
    "\n",
    "Numba is a just-in-time (JIT) compiler that translates a subset of python into machine code using the LLVM compiler infrastructure.\n",
    "\n",
    "Reference:  [Numba documentation](http://numba.pydata.org/numba-doc/dev/index.html)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Make two identical functions: one that releases and one that holds the GIL"
   ]
  },
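  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "# nogil=True: the compiled code releases the GIL while it runs\n",
    "@jit('float64(int64)', nopython=True, nogil=True)\n",
    "def wait_loop_nogil(n):\n",
    "    \"\"\"\n",
    "    Function under test: same loops as wait_loop, compiled with nogil=True.\n",
    "    \"\"\"\n",
    "    out = 0.0  # defined even when n is too small to reach the inner loop\n",
    "    for m in range(n):\n",
    "        for l in range(m):\n",
    "            for j in range(l):\n",
    "                for i in range(j):\n",
    "                    i = i + 4\n",
    "                    out = math.sqrt(i)\n",
    "                    out = out**2.\n",
    "    return out"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "# nogil=False: the compiled code holds the GIL while it runs\n",
    "@jit('float64(int64)', nopython=True, nogil=False)\n",
    "def wait_loop_withgil(n):\n",
    "    \"\"\"\n",
    "    Function under test: same loops as wait_loop, compiled with nogil=False.\n",
    "    \"\"\"\n",
    "    out = 0.0  # defined even when n is too small to reach the inner loop\n",
    "    for m in range(n):\n",
    "        for l in range(m):\n",
    "            for j in range(l):\n",
    "                for i in range(j):\n",
    "                    i = i + 4\n",
    "                    out = math.sqrt(i)\n",
    "                    out = out**2.\n",
    "    return out"
   ]
  },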
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### now time wait\\_loop\\_withgil"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "numba wall time 0.05427086600684561 and cpu time 0.051916000000000295\n",
      "numba speed-up factor 236.70834219543877\n"
     ]
    }
   ],
   "source": [
    "nloops=500\n",
    "with contexttimer.Timer(time.perf_counter) as numba_wall:\n",
    "    with contexttimer.Timer(time.process_time) as numba_cpu:\n",
    "        result=wait_loop_withgil(nloops)\n",
    "print(f'numba wall time {numba_wall.elapsed} and cpu time {numba_cpu.elapsed}')\n",
    "print(f\"numba speed-up factor {(pure_wall.elapsed - numba_wall.elapsed)/numba_wall.elapsed}\")"
   ]
  },
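  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To get a rough sense of how much work each timed run asks for, here is a sketch that counts how many times the innermost loop body runs. It assumes the loop structure of `wait_loop` above: the body runs once for every combination `i < j < l < m < n`, which is n choose 4. `inner_iterations` is a hypothetical helper for this illustration, and the count describes the python source only; an optimizing compiler is free to skip work whose result is never used."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def inner_iterations(n):\n",
    "    \"\"\"\n",
    "    Hypothetical helper: number of times the innermost loop body of\n",
    "    wait_loop(n) runs, i.e. n choose 4.\n",
    "    \"\"\"\n",
    "    # a product of four consecutive integers is always divisible by 24\n",
    "    return n*(n - 1)*(n - 2)*(n - 3)//24\n",
    "\n",
    "pure_count = inner_iterations(200)    # nloops used for the pure python run\n",
    "numba_count = inner_iterations(500)   # nloops used for the numba run\n",
    "print(f'nloops=200 asks for {pure_count} inner iterations')\n",
    "print(f'nloops=500 asks for {numba_count} inner iterations')\n",
    "print(f'ratio of requested work {numba_count/pure_count:.1f}')"
   ]
  },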
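  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### not bad, but we're only using one core\n",
    "\n",
    "The wall time and the cpu time are nearly equal, which is the signature of a single busy core. Keep in mind that the pure python run used `nloops=200` while the numba run used `nloops=500`, and that the printed factor is `(pure - numba)/numba`, so it is a rough indication rather than a like-for-like speed-up."
   ]
  }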
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.4"
  },
  "toc": {
   "nav_menu": {},
   "number_sections": true,
   "sideBar": false,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": true,
   "toc_position": {},
   "toc_section_display": "block",
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}