<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Matthias Bussonnier</title>
    <description>The latest articles on Forem by Matthias Bussonnier (@carreau).</description>
    <link>https://forem.com/carreau</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F599314%2Ff8ccbc86-1ebb-4f7b-bfde-86daf024ccd2.png</url>
      <title>Forem: Matthias Bussonnier</title>
      <link>https://forem.com/carreau</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/carreau"/>
    <language>en</language>
    <item>
      <title>Rethinking Jupyter Interactive Documentation</title>
      <dc:creator>Matthias Bussonnier</dc:creator>
      <pubDate>Fri, 07 May 2021 19:45:38 +0000</pubDate>
      <link>https://forem.com/quansightlabs/rethinking-jupyter-interactive-documentation-4okm</link>
      <guid>https://forem.com/quansightlabs/rethinking-jupyter-interactive-documentation-4okm</guid>
      <description>&lt;p&gt;Jupyter Notebook first release was 8 years ago – under the IPython Notebook name at the time. Even if notebooks were not invented by Jupyter; they were definitely democratized by it. Being Web powered allowed development of many changes in the Datascience world. Objects now often expose rich representation; from Pandas dataframes with as html tables, to more recent &lt;a href="https://github.com/scikit-learn/scikit-learn/pull/14180" rel="noopener noreferrer"&gt;Scikit-learn model&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Today I want to look into a topic that has not evolved much since then, and that I believe could use an upgrade: accessing interactive documentation from within a Jupyter session, and what it could become. At the end I'll link to my current prototype if you are adventurous.&lt;/p&gt;
&lt;h1&gt;
  
  
  The current limitation for users
&lt;/h1&gt;

&lt;p&gt;The current documentation of IPython and Jupyter comes in a few forms, but they mostly share the same limitations. The typical way to reach for help is to use the &lt;code&gt;?&lt;/code&gt; operator. Depending on the frontend you are using, it will bring up a pager or a panel that displays some information about the current object.&lt;/p&gt;

&lt;p&gt;It can show some information about the current object (signature, file, sub/super classes) and the raw DocString of the object.&lt;/p&gt;

&lt;p&gt;You can scroll around, but that's about it, whether in a terminal or in a notebook.&lt;/p&gt;

&lt;p&gt;Compare it to the same documentation on the NumPy website.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flabs.quansight.org%2Fimages%2F2021%2F05%2Fnumpy-linspace-compare.png" class="article-body-image-wrapper"&gt;&lt;img alt="numpy.linspace on numpy.org" src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flabs.quansight.org%2Fimages%2F2021%2F05%2Fnumpy-linspace-compare.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;On the left is the documentation for NumPy when visiting &lt;a href="https://numpy.org" rel="noopener noreferrer"&gt;the NumPy website&lt;/a&gt;. Let's call that "rendered documentation". On the right is what you get in JupyterLab, in IPython, or in the regular Python REPL. Let's call that "help documentation", since it is typically reached via &lt;code&gt;identifier?&lt;/code&gt; or &lt;code&gt;help(identifier)&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Compared to rendered documentation, the help documentation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;is hard to read,&lt;/li&gt;
&lt;li&gt;has no navigation,&lt;/li&gt;
&lt;li&gt;leaves RST directives uninterpreted,&lt;/li&gt;
&lt;li&gt;has no inline graphs and no rendered math.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There is also no access to non-docstring based documentation: &lt;strong&gt;no narrative&lt;/strong&gt;, &lt;strong&gt;no tutorials&lt;/strong&gt;, &lt;strong&gt;no image gallery or examples&lt;/strong&gt;, no search, no syntax highlighting, and no way to interact with or modify the documentation to test the effects of parameters.&lt;/p&gt;
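
&lt;p&gt;To make the point about uninterpreted RST concrete, here is a minimal sketch. The &lt;code&gt;scale&lt;/code&gt; function and its docstring are made up for illustration; &lt;code&gt;inspect.getdoc&lt;/code&gt; from the standard library only cleans up indentation, so the roles and math markup come back as plain text, which is essentially what help documentation displays.&lt;/p&gt;

```python
import inspect


def scale(values, factor=2.0):
    """Multiply each item of `values` by `factor`.

    See :py:func:`sum` for a related builtin, and note that
    :math:`y_i = f x_i` is shown verbatim in help documentation.
    """
    return [factor * v for v in values]


# inspect.getdoc removes the common indentation and returns the
# docstring text otherwise unchanged: no links, no rendered math.
raw = inspect.getdoc(scale)
print(raw)
```

&lt;p&gt;The printed text still contains the literal role syntax, whereas a Sphinx build would typically turn the same source into a link and a typeset formula.&lt;/p&gt;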
&lt;h1&gt;
  
  
  Limitation for authors
&lt;/h1&gt;

&lt;p&gt;Because of the limitations in how Jupyter and IPython display documentation, I believe authors are often constrained in how they document functions.&lt;/p&gt;

&lt;p&gt;Syntax in docstrings is often kept simple for readability; this first version is often preferred:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You can use ``np.einsum('i-&amp;gt;', a)`` ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The longer form, which turns the reference into a link in rendered documentation, is difficult to read when shown as help documentation:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You can use :py:func:`np.einsum('i-&amp;gt;', a) &amp;lt;numpy.einsum&amp;gt;` ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This also leads to long discussions about which syntax to use in advanced areas, like formulas in &lt;a href="https://github.com/sympy/sympy/issues/14964" rel="noopener noreferrer"&gt;SymPy's docstrings&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Many projects have to implement dynamic docstrings, for example to include all the parameters a function or class would pass down using &lt;code&gt;**kwargs&lt;/code&gt; (search the Matplotlib source code for &lt;code&gt;_kwdoc&lt;/code&gt;, or look at the &lt;code&gt;pandas.DataFrame&lt;/code&gt; implementation).&lt;/p&gt;

&lt;p&gt;This can make it relatively difficult for authors and contributors to properly maintain and provide comprehensive docs.&lt;/p&gt;
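
&lt;p&gt;As a rough illustration of the dynamic docstring pattern mentioned above, here is a minimal sketch. It is not how Matplotlib or pandas implement it: &lt;code&gt;document_kwargs&lt;/code&gt;, &lt;code&gt;_LINE_KWARGS_DOC&lt;/code&gt; and &lt;code&gt;plot_line&lt;/code&gt; are hypothetical names, and the only assumption is that a shared block of parameter documentation is substituted into a placeholder when the module is imported.&lt;/p&gt;

```python
import inspect

# Hypothetical shared documentation for keyword arguments that are
# forwarded via **kwargs (illustrative names, not from any library).
_LINE_KWARGS_DOC = """color : str
    Line color.
linewidth : float
    Line width in points."""


def document_kwargs(func):
    """Fill the ``%(kwargs)s`` placeholder in the docstring of *func*."""
    if func.__doc__:  # docstrings can be stripped, e.g. with python -OO
        template = inspect.cleandoc(func.__doc__)
        func.__doc__ = template % {"kwargs": _LINE_KWARGS_DOC}
    return func


@document_kwargs
def plot_line(x, y, **kwargs):
    """Plot a line.

    Parameters
    ----------
    x, y : sequence of float
        Coordinates of the points.
    **kwargs
        Forwarded to a hypothetical line object:

    %(kwargs)s
    """
    return (x, y, kwargs)


# The docstring seen by help() now lists the forwarded parameters.
print(plot_line.__doc__)
```

&lt;p&gt;The source file only contains the placeholder, so a contributor reading the code does not see the final text, which is part of what makes such docstrings harder to maintain.&lt;/p&gt;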

&lt;p&gt;I'm not sure I can completely predict all the side effects this has on how library maintainers write docs, but I believe there is also a strong opportunity for a tool to help there. See for example &lt;a href="https://github.com/Carreau/velin" rel="noopener noreferrer"&gt;vélin&lt;/a&gt;, which attempts to automatically reformat docstrings and fix common numpydoc format mistakes and typos, but that's a subject for a future post.&lt;/p&gt;

&lt;h1&gt;
  
  
  Stuck between a Rock and a Hard place
&lt;/h1&gt;

&lt;p&gt;While Sphinx and related projects are great at offering hosted HTML documentation, extensive usage of those makes interactive documentation harder to consume.&lt;/p&gt;

&lt;p&gt;While it is possible to &lt;a href="https://github.com/spyder-ide/docrepr" rel="noopener noreferrer"&gt;run Sphinx on the fly when rendering docstrings&lt;/a&gt;, most Sphinx features only work when building a full project with the proper configuration and extensions, and such builds can be computationally intensive. This makes running Sphinx locally impractical.&lt;/p&gt;

&lt;p&gt;Hosted websites often do not reflect the locally installed version of the libraries, and they require careful linking, deprecation notices, and narrative around platform- or version-specific features.&lt;/p&gt;

&lt;h1&gt;
  
  
  This is fixable
&lt;/h1&gt;

&lt;p&gt;For the past few months I've been working on rewriting how IPython (and hence Jupyter) can display documentation. It works both in terminal (IPython) and browser context (notebook, JupyterLab, Spyder) with proper rendering, and currently understands most directives; it could be customized to understand any new ones:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flabs.quansight.org%2Fimages%2F2021%2F05%2Fpapyri-1.png" class="article-body-image-wrapper"&gt;&lt;img alt="papyri1" src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flabs.quansight.org%2Fimages%2F2021%2F05%2Fpapyri-1.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Above is the (terminal) documentation of &lt;code&gt;scipy.polynomial.lagfit&lt;/code&gt;. See how the single backticks are properly understood and refer to known parameters. It also detected that &lt;code&gt;`n`&lt;/code&gt; is incorrect, as it should have double backticks. Notice the rendering of the math, even in the terminal.&lt;/p&gt;

&lt;p&gt;Technically, the tool does not care whether the docstring is written in RST or Markdown, though I still need to implement the latter. I believe some maintainers would be quite happy to use Markdown, a syntax more users are familiar with.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flabs.quansight.org%2Fimages%2F2021%2F05%2Fpapyri-nav.gif" class="article-body-image-wrapper"&gt;&lt;img alt="papyri navigation" src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flabs.quansight.org%2Fimages%2F2021%2F05%2Fpapyri-nav.gif"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It supports navigation – here in a terminal – where clicking or pressing enter on a link would bring you to the target page. In the above gif you can see that many tokens of the code example are also automatically type-inferred (thanks &lt;a href="https://github.com/davidhalter/jedi" rel="noopener noreferrer"&gt;Jedi&lt;/a&gt;), and can also be clicked on to navigate to their corresponding page.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flabs.quansight.org%2Fimages%2F2021%2F05%2Fpapyri-terminal-fig.png" class="article-body-image-wrapper"&gt;&lt;img alt="papyri terminal-fig" src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flabs.quansight.org%2Fimages%2F2021%2F05%2Fpapyri-terminal-fig.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Images are included even in the terminal, where they are not shown inline but are replaced by a button to open them in your preferred viewer (see the &lt;code&gt;Open with quicklook&lt;/code&gt; in the above screenshot).&lt;/p&gt;

&lt;h1&gt;
  
  
  The future
&lt;/h1&gt;

&lt;p&gt;I'm working on a number of other features, in particular:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;rendering of narrative docs – for which I have a prototype,&lt;/li&gt;
&lt;li&gt;automatic indexing of all the figures and plots – working but slow right now,&lt;/li&gt;
&lt;li&gt;proper cross-library referencing and indexing without the need for intersphinx.
For example, it is possible from the &lt;code&gt;numpy.linspace&lt;/code&gt; page to see all pages that
reference it, or use &lt;code&gt;numpy.linspace&lt;/code&gt; in their example section
(see previous image).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And many others, like showing a graph of the local references between functions, search, and preference configurability. I think this could also support many other desirable features, like user preferences (hide/show type annotation, deprecated directives, and custom color/syntax highlighting) - though I haven't started working on these. I do have some ideas on how this could be used to provide translations as well.&lt;/p&gt;

&lt;p&gt;Right now, it is not as fast and efficient as I would like (though it is faster than running Sphinx on the fly), and it requires some ahead-of-time processing. It still crashes in many places, but it can render most of the documentation of SciPy, NumPy, xarray, IPython and scikit-image.&lt;/p&gt;

&lt;p&gt;I encourage you to think about what features you are missing when using documentation from within Jupyter and let me know. I hope this could become a nice addition to Sphinx when consulting documentation from within Jupyter.&lt;/p&gt;

&lt;p&gt;For now I've submitted a &lt;a href="https://docs.google.com/document/d/1hk-Ww7pUwnoHINNhDeP9UOPvNEemAFe-pohK5dCtZPs/edit?usp=sharing" rel="noopener noreferrer"&gt;Letter of intent to CZI EOSS 4&lt;/a&gt; in an attempt to get some of that work funded to land in IPython, and if you have any interest in contributing or want something like that for your library, feel free to reach out.&lt;/p&gt;

&lt;p&gt;You can find the repository &lt;a href="https://github.com/Carreau/papyri" rel="noopener noreferrer"&gt;on my GitHub account&lt;/a&gt;; it's still in a pre-alpha stage. It is still quite unstable, with too many hard-coded values for my taste, and needs some polish to be considered usable for production. I've focused my effort for now mostly on terminal rendering; a Jupyter notebook or JupyterLab extension would be welcome. So if you are adventurous and like to work from the cutting (or even bleeding) edge, please feel free to try it out and open issues or pull requests.&lt;/p&gt;

&lt;p&gt;It also needs to be better documented (pun intended). I'm hoping to use papyri itself to document papyri, but it needs to be a bit more mature for that.&lt;/p&gt;

&lt;p&gt;Stay tuned for more news, I'll try to explain how it works in more detail in a follow-up post, and discuss some of the advantages (and drawbacks) this project has.&lt;/p&gt;

</description>
      <category>python</category>
      <category>opensource</category>
      <category>datascience</category>
      <category>jupyter</category>
    </item>
  </channel>
</rss>
