<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Thabo Fisher</title>
    <description>The latest articles on Forem by Thabo Fisher (@fisherthabo).</description>
    <link>https://forem.com/fisherthabo</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F160688%2F40bcb700-a818-42de-b042-be8ba0bf58a6.jpg</url>
      <title>Forem: Thabo Fisher</title>
      <link>https://forem.com/fisherthabo</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/fisherthabo"/>
    <language>en</language>
    <item>
      <title>Three Ways to Automate Python Scripts via Jupyter Notebook</title>
      <dc:creator>Thabo Fisher</dc:creator>
      <pubDate>Thu, 25 Apr 2019 20:58:06 +0000</pubDate>
      <link>https://forem.com/fisherthabo/three-ways-to-automate-python-scripts-via-jupyter-notebook-205g</link>
      <guid>https://forem.com/fisherthabo/three-ways-to-automate-python-scripts-via-jupyter-notebook-205g</guid>
      <description>&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2F6r8lypwhzsuizuletlno.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2F6r8lypwhzsuizuletlno.png" alt="alt text of image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Whether you're sending reports, executing long-running tasks or updating a dashboard, you likely have a dozen or so notebooks that need to be run on a regular basis.  You can set yourself reminders, make sure everyone on the team has the script to run when needed (e.g. when you're on vacation) and make sure you’re logged in before the boss so you can update dashboards, but at some point the cost of your time is going to push you toward automation.  We're going to cover three ways to get this done:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Setting up a local process to run a Python script automatically in the background&lt;/li&gt;
&lt;li&gt;Using SeekWell to run notebooks automatically and remotely&lt;/li&gt;
&lt;li&gt;Setting up your own server to remotely run a notebook&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  1. Locally (on your computer)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;   Simple; No additional costs&lt;br&gt;
&lt;strong&gt;Cons:&lt;/strong&gt;   Requires your computer to be awake and connected to the internet 24/7; Time-consuming to set up and varies depending on your operating system&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A. Use &lt;code&gt;nbconvert&lt;/code&gt; to convert your notebook into a .py file&lt;/strong&gt;&lt;br&gt;
a. Navigate to the directory of your notebook via your command line&lt;br&gt;
b. Run &lt;code&gt;jupyter nbconvert --to script 'my-notebook.ipynb'&lt;/code&gt;&lt;br&gt;
c. The above command will create my-notebook.py&lt;br&gt;
d. Run &lt;code&gt;python my-notebook.py&lt;/code&gt; to test it&lt;br&gt;
e. More on &lt;code&gt;nbconvert&lt;/code&gt; can be &lt;a href="https://nbconvert.readthedocs.io/en/latest/usage.html#convert-script" rel="noopener noreferrer"&gt;found here&lt;/a&gt;&lt;/p&gt;
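
&lt;p&gt;Steps b through d can also be driven from Python, which is handy if you would rather schedule one file than two commands. This is a minimal sketch, not part of the walkthrough above: the function names are made up for illustration, and it assumes &lt;code&gt;jupyter&lt;/code&gt; is on your PATH and the notebook uses a Python kernel.&lt;/p&gt;

```python
import subprocess
import sys
from pathlib import Path


def nbconvert_command(notebook):
    """Build the nbconvert call from step b as an argument list."""
    return ["jupyter", "nbconvert", "--to", "script", str(notebook)]


def script_path(notebook):
    """For a Python notebook, nbconvert writes a .py file next to it."""
    return Path(notebook).with_suffix(".py")


def convert_and_run(notebook):
    """Convert the notebook, then run the generated script (steps b to d)."""
    subprocess.run(nbconvert_command(notebook), check=True)
    # sys.executable is the interpreter that is running this file
    return subprocess.run([sys.executable, str(script_path(notebook))], check=True)


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage: python convert_and_run.py my-notebook.ipynb
    convert_and_run(sys.argv[1])
```

&lt;p&gt;Because &lt;code&gt;check=True&lt;/code&gt; raises an error when either step fails, a scheduler that records exit codes should flag a broken run instead of silently skipping it.&lt;/p&gt;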

&lt;p&gt;&lt;strong&gt;B. Run the script on a schedule&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;(Windows) via Task Scheduler&lt;/em&gt;&lt;br&gt;
a. Click the Windows Start menu, click Control Panel &amp;gt; Administrative Tools and click Task Scheduler&lt;br&gt;
b. In the Actions pane, click on the Create Basic Task action&lt;br&gt;
c. If your script is located at "E:\My script.py", specify &lt;code&gt;C:\path\to\python\python.exe "E:\My script.py"&lt;/code&gt; in the Action section of Task Scheduler.  If you don't know your path to Python, check out &lt;a href="https://www.youtube.com/watch?v=n2Cr_YRQk7o" rel="noopener noreferrer"&gt;this video&lt;/a&gt;&lt;br&gt;
d. Navigate to the Trigger section and create a new trigger with the schedule you'd like (e.g. every hour).  &lt;a href="https://www.youtube.com/watch?v=n2Cr_YRQk7o" rel="noopener noreferrer"&gt;This video&lt;/a&gt; does a good job of walking through this part&lt;/p&gt;

&lt;p&gt;&lt;em&gt;(Mac) via LaunchControl ($15)&lt;/em&gt;&lt;br&gt;
a. Open LaunchControl and select Global Agents&lt;br&gt;
b. Find your script.  It should show green, indicating that it is executable&lt;br&gt;
c. Under Start Calendar Interval, select when and how often you want the script to run&lt;br&gt;
d. You can set additional rules under Keep Alive&lt;br&gt;
e. You can find more details &lt;a href="http://www.prelc.si/koleznik/run-scripts-automatically-on-mac-os-x/" rel="noopener noreferrer"&gt;here&lt;/a&gt;&lt;/p&gt;
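
&lt;p&gt;LaunchControl is a graphical editor for launchd job definitions, which are property list files. If you want to see roughly what such a job contains, the sketch below builds one with Python's &lt;code&gt;plistlib&lt;/code&gt;. Treat it as an illustration only: the label and both paths are hypothetical placeholders, and the exact keys LaunchControl writes for your job may differ.&lt;/p&gt;

```python
import plistlib


def build_launchd_job(label, python_path, script_path, hour, minute):
    """Return a launchd job definition as plist bytes."""
    job = {
        "Label": label,
        # Command to run: the interpreter followed by the script
        "ProgramArguments": [python_path, script_path],
        # Run once a day at the given hour and minute
        "StartCalendarInterval": {"Hour": hour, "Minute": minute},
    }
    return plistlib.dumps(job)


if __name__ == "__main__":
    # Hypothetical label and paths; replace them with your own
    plist = build_launchd_job(
        "com.example.my-notebook",
        "/usr/bin/python3",
        "/Users/me/my-notebook.py",
        7,
        30,
    )
    print(plist.decode("utf-8"))
```

&lt;p&gt;The &lt;code&gt;StartCalendarInterval&lt;/code&gt; key corresponds to the Start Calendar Interval setting in step c.&lt;/p&gt;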
&lt;h3&gt;
  
  
  2. Use SeekWell
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;   Three-click automation from within Jupyter Notebook or the desktop app; Easy and secure access to Google Sheets, Slack and SQL databases; The desktop app includes the ability to use SQL alongside Python&lt;br&gt;
&lt;strong&gt;Cons:&lt;/strong&gt;   Requires a subscription after the free 14-day trial ($49/mo)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://seekwell.io/?ref=devto" rel="noopener noreferrer"&gt;SeekWell's&lt;/a&gt; Chrome Extension and desktop app allow you to schedule a notebook to run daily, hourly or every 5 minutes with just a couple clicks.  You can also send data directly to Google Sheets or Slack without storing API keys in plain text.  This makes it easy to automatically refresh dashboards using Sheets’ or sending alerts to Slack.&lt;/p&gt;

&lt;p&gt;There are two ways to automate with SeekWell: using the Chrome Extension within Jupyter Notebook or using the desktop app.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Chrome Extension&lt;/em&gt;&lt;br&gt;
a. Add the Chrome extension &lt;a href="https://chrome.google.com/webstore/detail/seekwell/mefkdbekccdbdihhondepjimindlbpfg" rel="noopener noreferrer"&gt;here&lt;/a&gt; and create a SeekWell account &lt;a href="https://seekwell.io/create/?ref=devto" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;br&gt;
b. Open a Jupyter Notebook&lt;br&gt;
c. Click on the SeekWell Chrome Extension and select how often you’d like the notebook to run&lt;br&gt;
d. Click save and you’re done!  You can manage all your schedules from your &lt;a href="https://seekwell.io/profile" rel="noopener noreferrer"&gt;dashboard&lt;/a&gt;.&lt;br&gt;
e. (Optional) Specify a destination (e.g. Google Sheets or Slack) in the Extension using the notebook metadata. See &lt;a href="https://chrome.google.com/webstore/detail/seekwell/mefkdbekccdbdihhondepjimindlbpfg?hl=en" rel="noopener noreferrer"&gt;this video&lt;/a&gt; for more info.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;SeekWell Desktop app&lt;/em&gt;&lt;br&gt;
a. Create a SeekWell account &lt;a href="https://seekwell.io/create/?ref=devto" rel="noopener noreferrer"&gt;here&lt;/a&gt;&lt;br&gt;
b. Download the desktop app as part of the sign-up flow.  If you want to send data to Slack, be sure to add that integration too.&lt;br&gt;
c. If you need help connecting to your database, check out &lt;a href="https://intercom.help/seekwell/seekwell-desktop/getting-started-desktop-app" rel="noopener noreferrer"&gt;this article&lt;/a&gt;.  Code cells on the left default to SQL.  To switch them to Python, type &lt;code&gt;/python&lt;/code&gt; in a cell and press &lt;code&gt;enter&lt;/code&gt; or &lt;code&gt;return&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2F39znlgnwgijd30g7cs4w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2F39znlgnwgijd30g7cs4w.png" alt="alt text of image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;d. Write your code in the cells and specify a destination for the data.  For Google Sheets, navigate to ‘Sheets’ on the right-hand side, select a workbook and designate the sheet and cell location in the field just below the code cell using A1 notation (e.g., Sheet2!B10).  For Slack, specify a channel (e.g., #alerts).&lt;br&gt;
e. To create a schedule, click on the clock icon in the app.  Select how often you’d like to have it run, and the time of day if applicable.&lt;/p&gt;

&lt;p&gt;Here’s what it looks like in the app:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fo41uxctaleysmwirbewf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fthepracticaldev.s3.amazonaws.com%2Fi%2Fo41uxctaleysmwirbewf.png" alt="alt text of image"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;f. Click save and you're done!  You can manage your schedules from your &lt;a href="https://seekwell.io/profile" rel="noopener noreferrer"&gt;dashboard&lt;/a&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  3. Remotely (in the cloud)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt; Only costs computing power; Doesn't break when your computer is off / you're on vacation&lt;br&gt;
&lt;strong&gt;Cons:&lt;/strong&gt; Time-consuming and complex to set up; Requires engineering and DevOps resources to get started and maintain; May require storing passwords in plain text on a server&lt;/p&gt;

&lt;p&gt;You can set up scripts to run on a server, so they refresh whether or not you’re logged in to your machine or connected to the internet.  We're going to use Google Cloud Platform here, but it's possible to do something similar on AWS or your cloud of choice. Here are the steps:&lt;/p&gt;

&lt;p&gt;a. Set up a &lt;a href="https://cloud.google.com/storage/docs/quickstart-console" rel="noopener noreferrer"&gt;Google Cloud Storage bucket&lt;/a&gt;&lt;br&gt;
b. Load your notebook to your bucket&lt;br&gt;
c. Install the &lt;a href="https://cloud.google.com/sdk/" rel="noopener noreferrer"&gt;gcloud CLI&lt;/a&gt;&lt;br&gt;
d. Depending on your prior use of Google Cloud, you will need to &lt;a href="https://cloud.google.com/apis/docs/enable-disable-apis" rel="noopener noreferrer"&gt;enable certain APIs&lt;/a&gt; (e.g. Google Cloud Storage)&lt;br&gt;
e. Run the following commands in your terminal (bash).  Be sure to change REPLACEWITHYOURBUCKET to the name of the bucket you created in step a.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Compute Engine Instance parameters
export IMAGE_FAMILY="tf-latest-cu100" 
export ZONE="us-central1-b"
export INSTANCE_NAME="notebook-executor"
export INSTANCE_TYPE="n1-standard-8"
# Notebook parameters
export INPUT_NOTEBOOK_PATH="gs://REPLACEWITHYOURBUCKET/input.ipynb"
export OUTPUT_NOTEBOOK_PATH="gs://REPLACEWITHYOURBUCKET/output.ipynb"
export STARTUP_SCRIPT="papermill ${INPUT_NOTEBOOK_PATH} ${OUTPUT_NOTEBOOK_PATH}"

gcloud compute instances create $INSTANCE_NAME \
        --zone=$ZONE \
        --image-family=$IMAGE_FAMILY \
        --image-project=deeplearning-platform-release \
        --maintenance-policy=TERMINATE \
        --accelerator='type=nvidia-tesla-t4,count=2' \
        --machine-type=$INSTANCE_TYPE \
        --boot-disk-size=100GB \
        --scopes=https://www.googleapis.com/auth/cloud-platform \
        --metadata="install-nvidia-driver=True,startup-script=${STARTUP_SCRIPT}"

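# Clean up once output.ipynb has appeared in the bucket; deleting the instance earlier stops the notebook
gcloud --quiet compute instances delete $INSTANCE_NAME --zone $ZONE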
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;f. This should run the notebook once and place the results in your bucket as output.ipynb&lt;/p&gt;

&lt;p&gt;g. Next, we need a way to trigger this script to run automatically, which we can do with an &lt;a href="https://cloud.google.com/appengine/docs/standard/python3/scheduling-jobs-with-cron-yaml" rel="noopener noreferrer"&gt;App Engine cron job&lt;/a&gt;&lt;br&gt;
h. Follow the instructions &lt;a href="https://cloud.google.com/appengine/docs/standard/python3/quickstart" rel="noopener noreferrer"&gt;here&lt;/a&gt; to create an App Engine instance. This is a &lt;a href="http://flask.pocoo.org/" rel="noopener noreferrer"&gt;Flask&lt;/a&gt; web app.&lt;br&gt;
i. Add an endpoint to execute the bash script above (be sure to import subprocess), e.g.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def run_notebook():
   cmd = 'LONG BASH SCRIPT ABOVE'
   response = subprocess.run(cmd, stderr=subprocess.PIPE, stdout=subprocess.PIPE, shell = True)
   return 'Success!'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;j. Deploy your web app with &lt;code&gt;gcloud app deploy&lt;/code&gt;&lt;/p&gt;
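
&lt;p&gt;If pasting the whole script into one string for step i feels fragile, one possible alternative is to build the create command from step e as an argument list, which avoids shell quoting issues. The sketch below is an assumption-laden illustration, not the method used above: the function names are hypothetical, the flags are copied from step e, and it only works where the &lt;code&gt;gcloud&lt;/code&gt; CLI is installed and authenticated.&lt;/p&gt;

```python
import subprocess


def build_create_command(instance_name, zone, input_nb, output_nb):
    """Build the `gcloud compute instances create` call from step e."""
    # Same startup script as step e: papermill runs the input notebook
    startup_script = "papermill {} {}".format(input_nb, output_nb)
    return [
        "gcloud", "compute", "instances", "create", instance_name,
        "--zone=" + zone,
        "--image-family=tf-latest-cu100",
        "--image-project=deeplearning-platform-release",
        "--maintenance-policy=TERMINATE",
        "--accelerator=type=nvidia-tesla-t4,count=2",
        "--machine-type=n1-standard-8",
        "--boot-disk-size=100GB",
        "--scopes=https://www.googleapis.com/auth/cloud-platform",
        "--metadata=install-nvidia-driver=True,startup-script=" + startup_script,
    ]


def create_instance(instance_name, zone, input_nb, output_nb):
    """Run the command; raises CalledProcessError if gcloud exits non-zero."""
    cmd = build_create_command(instance_name, zone, input_nb, output_nb)
    return subprocess.run(cmd, check=True, capture_output=True, text=True)
```

&lt;p&gt;Passing a list means no &lt;code&gt;shell=True&lt;/code&gt; is needed, and each bucket path reaches gcloud as a single argument.&lt;/p&gt;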

&lt;p&gt;A little bit of legwork up front can set you and your team up with a steady flow of data without worrying about pushing a button every hour. Let me know in the comments if you run into trouble!&lt;/p&gt;

</description>
      <category>python</category>
      <category>jupyternotebooks</category>
      <category>automation</category>
      <category>productivitiy</category>
    </item>
  </channel>
</rss>
