{ "cells": [ { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 0, "toc": true }, "source": [ "

Table of Contents

\n", "
" ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 0 }, "source": [ "# Introduction to computational methods\n", "\n", "This notebook introduces a Python program that integrates the daily average flow into and out of\n", "the tailings management facility (TMF) to compute the volume of water in the TMF through time.\n", "\n", "## Data file\n", "This notebook assumes that a comma-separated values (csv) file named `july2016-tmf-flow.csv`\n", "exists in the `data` folder in the current working directory. The file contains three columns with headers:\n", "\n", "`date` `outflow` `inflow`\n", "\n", "The first 4 lines of the file are:\n", "\n", "`date,outflow,inflow\n", "2016-07-01,0.0703,0.1181\n", "2016-07-02,0.066,0.1121\n", "2016-07-03,0.0621,0.1079`\n", "\n", "The data in the csv file is in the following format:\n", "\n", "`date` is in `yyyy-mm-dd` format\n", "\n", "`outflow` is the daily average outflow in $m^3/s$\n", "\n", "`inflow` is the daily average inflow in $m^3/s$\n", "\n", "\n", "\n", "## Algorithm\n", "The volume of water in the TMF on day $t$ of the computation is $V(t)$, the daily average outflow is $Q_{out}$\n", "and the daily average inflow is $Q_{in}$.\n", "\n", "The basic computational algorithm is:\n", "\n", "1. Read in the date, $Q_{out}$ and $Q_{in}$ data from the file `july2016-tmf-flow.csv`.\n", "2. Compute the change in volume over the day: $\\Delta V = (Q_{in} - Q_{out})\\Delta t$\n", "3. Update the volume of the TMF: $V(t+\\Delta t) = V(t) + \\Delta V$\n", "4. Repeat steps 2 and 3 for every date in the file." ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 0 }, "source": [ "## Python code\n", "We'll introduce Python concepts as we go in the course, so we don't expect you to fully understand\n", "this code yet. 
We'll therefore only provide a cursory explanation here.\n" ] }, { "cell_type": "markdown", "metadata": { "lines_to_next_cell": 0 }, "source": [ "## Python modules/libraries\n", "It is good practice to import libraries at the beginning of a program. Libraries are collections of Python code.\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "lines_to_next_cell": 2 }, "outputs": [], "source": [ "import matplotlib.dates as mdates\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `import` command tells Python to import the library. The `as` keyword specifies a user-chosen nickname for the library\n", "to be used in this program. So in this program, the `numpy` library is referred to as `np`.\n", "\n", "We are importing four libraries or library sections:\n", "\n", "`numpy` is a library of numerical routines. We'll use it a lot in the course.\n", "\n", "`pandas` is a library for manipulating tabular data, that is, data organized in rows and columns.\n", "\n", "`matplotlib` is a vast library for plotting. We don't need the whole library, only two subsections.\n", "\n", "`matplotlib.pyplot`, nicknamed `plt`, is a subsection containing the main plotting commands. We'll use it a lot.\n", "\n", "`matplotlib.dates`, which we nickname `mdates`, is a specialty subsection containing code for manipulating date formats.\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
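<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\">\n",
"      <th></th>\n",
"      <th>date</th>\n",
"      <th>outflow</th>\n",
"      <th>inflow</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <th>0</th>\n",
"      <td>2016-07-01</td>\n",
"      <td>0.0703</td>\n",
"      <td>0.1181</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>1</th>\n",
"      <td>2016-07-02</td>\n",
"      <td>0.0660</td>\n",
"      <td>0.1121</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>2</th>\n",
"      <td>2016-07-03</td>\n",
"      <td>0.0621</td>\n",
"      <td>0.1079</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>3</th>\n",
"      <td>2016-07-04</td>\n",
"      <td>0.0599</td>\n",
"      <td>0.1115</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>4</th>\n",
"      <td>2016-07-05</td>\n",
"      <td>0.0574</td>\n",
"      <td>0.1108</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>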
\n", "
" ], "text/plain": [ "         date  outflow  inflow\n", "0  2016-07-01   0.0703  0.1181\n", "1  2016-07-02   0.0660  0.1121\n", "2  2016-07-03   0.0621  0.1079\n", "3  2016-07-04   0.0599  0.1115\n", "4  2016-07-05   0.0574  0.1108" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# use pandas to read in csv data\n", "data_file = \"data/july2016-tmf-flow.csv\"\n", "df_flow = pd.read_csv(data_file)\n", "df_flow.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `pd.read_csv` command from the `pandas` library reads all the rows and columns from the file `july2016-tmf-flow.csv`.\n", "\n", "We could also have written the command as `pandas.read_csv` if we had imported the library under its full name, but we used our nickname `pd` to save some typing.\n", "\n", "Here we see the power of the pandas library. This single command opens the file, reads each line of data from the\n", "csv file, and stores that data in a `dataframe` that we named `df_flow`. You can think of a `dataframe` as rows and\n", "columns of data. In our case, the `dataframe` holds dates, outflows and inflows. The `df_flow.head()` command displays the first five rows." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Convert flow to a numpy array" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of days in file: 31\n" ] }, { "data": { "text/plain": [ "array([['2016-07-01', 0.0703, 0.1181],\n", "       ['2016-07-02', 0.066, 0.1121],\n", "       ['2016-07-03', 0.0621, 0.1079],\n", "       ['2016-07-04', 0.0599, 0.1115]], dtype=object)" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# copy the contents of the dataframe into a numpy array\n", "flow = df_flow.values\n", "\n", "# find number of days of data (number of rows in array)\n", "\n", "n = flow.shape[0]\n", "print(f\"Number of days in file: {n}\")\n", "flow[:4, :]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first line copies the contents of the dataframe `df_flow` into a numpy array that we name `flow`.\n", "We then use the `.shape` attribute, which holds the dimensions (number of rows and columns) of the array\n", "`flow`. `.shape[0]` is the number of rows, or in our case, the number of days of outflow and inflow data in the array,\n", "which we assign to the variable `n`. The last line, `flow[:4, :]`, displays the first four rows of the array.\n", "\n" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# allocate a numpy array for the volume calculations -\n", "# doing this in advance is faster than growing the array inside the loop\n", "\n", "v = np.zeros(n, float)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The function `zeros` from the numpy library creates an array of zeros. Allocating an array once, in advance, is faster\n", "than growing it element by element on the fly. Here we create an array `v` of `n` zeros of type `float`. Each element in\n", "`v` will store the volume of water in the TMF on a specific day." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": { "lines_to_next_cell": 2 }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Columns in csv file: Index(['date', 'outflow', 'inflow'], dtype='object')\n", "type of the timestamp object: \n" ] } ], "source": [ "# print the columns in the dataframe\n", "print(f\"Columns in csv file: {df_flow.columns}\")\n", "\n", "# convert column called date to pandas date format\n", "# and store in column called yearmonthday\n", "\n", "df_flow[\"yearmonthday\"] = pd.to_datetime(df_flow[\"date\"])\n", "\n", "# check the type of the first entry in the column yearmonthday\n", "print(f\"type of the timestamp object: {type(df_flow['yearmonthday'][0])}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We first print the names of the columns in the dataframe that were read in from the `csv` file.\n", "We do this using the `.columns` attribute of the `df_flow` dataframe; that is, `df_flow.columns` returns the labels of\n", "the columns in the dataframe.\n", "\n", "The command `df_flow[\"yearmonthday\"] = pd.to_datetime(df_flow[\"date\"])` uses the pandas function `pd.to_datetime`\n", "to convert the data in the column labelled `date` in the dataframe `df_flow` to a date format that pandas\n", "understands, and then stores those pandas dates in the dataframe `df_flow` in a new column labelled `yearmonthday`.\n", "\n", "We then print out the type of the first entry in the column `yearmonthday` to confirm that it is now a date." 
] }, { "cell_type": "code", "execution_count": 6, "metadata": { "lines_to_next_cell": 2 }, "outputs": [], "source": [ "# set volume in TMF at time zero to 8.1 x 10^6 m^3\n", "\n", "v[0] = 8.1e6\n", "\n", "# loop through and compute volume through time\n", "# since flows are daily average flows in m^3/s, convert to m^3 per day\n", "# by multiplying by 86400 s/d\n", "seconds_per_day = 86400.0\n", "\n", "# columns of flow are date (0), outflow (1), inflow (2)\n", "outflow = flow[:, 1]\n", "inflow = flow[:, 2]\n", "for i in range(n - 1):\n", "    v[i + 1] = v[i] + (inflow[i] - outflow[i]) * seconds_per_day\n", "\n", "df_flow[\"volume\"] = v" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is the guts of the computation. We first set the volume in the TMF at time zero to $8.1\\times 10^6$ $m^3$.\n", "\n", "We then pick the outflow and inflow columns out of the array `flow`: the outflow is in column `1` and the inflow is in column `2`,\n", "matching the column order in the csv file.\n", "\n", "Then we loop through the days and compute the change in volume over each day as described above:\n", "$V(t+\\Delta t) = V(t) + (Q_{in} - Q_{out})\\Delta t$, where $\\Delta t$ is $86400$ seconds (one day).\n", "\n", "In Python we use a `for` loop. The command `for i in range(n - 1):` sets `i` to `0` and runs the body of the loop, then\n", "sets `i` to `1` and runs it again, and so on up to `i = n-2` (`range(n - 1)` stops one short of `n-1`).\n", "On the last pass through the loop, `i = n-2`, so when we compute `v[i+1]` for the last time we are computing `v[n-1]`, the last element of `v`.\n", "\n", "The rest of the code generates a plot of outflow and inflow over time,\n", "and a second plot of the volume of water in the TMF over time." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "lines_to_next_cell": 2 }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# make some plots\n", "\n", "# format for the x axis label to be month - day eg 07-03 for July 3\n", "# format string for dates that will be plotted on the x axis\n", "\n", "myFmt = mdates.DateFormatter(\"%m-%d\")\n", "\n", "# in the plot ax, plot outflow and inflow as separate lines\n", "\n", "plt.style.use(\"ggplot\")\n", "plt.rcParams[\"figure.figsize\"] = [10, 10]\n", "plt.plot(df_flow[\"yearmonthday\"], df_flow[\"inflow\"], label=\"Inflow\")\n", "plt.plot(df_flow[\"yearmonthday\"], df_flow[\"outflow\"], label=\"Outflow\")\n", "plt.legend()\n", "ax = plt.gca()\n", "fig = plt.gcf()\n", "ax.set(ylabel=\"flow $m^3/s$\", xlabel=\"date\")\n", "ax.xaxis.set_major_formatter(myFmt)\n", "fig.autofmt_xdate()" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "lines_to_next_cell": 2 }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# # in second plot, plot the volume in TMF in m^3, which is stored in the \"volume\" column\n", "plt.plot(df_flow[\"yearmonthday\"], df_flow[\"volume\"], label=\"Volume in TMF\")\n", "plt.legend()\n", "ax = plt.gca()\n", "fig = plt.gcf()\n", "ax.set(ylabel=\"flow $m^3/s$\", xlabel=\"date\")\n", "ax.xaxis.set_major_formatter(myFmt)\n", "fig.autofmt_xdate()" ] } ], "metadata": { "jupytext": { "cell_metadata_filter": "all", "formats": "ipynb", "notebook_metadata_filter": "all", "text_representation": { "extension": ".py", "format_name": "percent", "format_version": "1.2", "jupytext_version": "0.8.6" } }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.7" }, "nbsphinx": { "execute": "never" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": true, "toc_position": {}, "toc_section_display": true, "toc_window_display": true } }, "nbformat": 4, "nbformat_minor": 2 }