<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Dian Fay</title>
    <description>The latest articles on Forem by Dian Fay (@dmfay).</description>
    <link>https://forem.com/dmfay</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F17296%2Fd3eddff4-7ac8-4959-a5c5-5477d4cf7850.png</url>
      <title>Forem: Dian Fay</title>
      <link>https://forem.com/dmfay</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/dmfay"/>
    <language>en</language>
    <item>
      <title>ectomigo: Safer Schema Migrations</title>
      <dc:creator>Dian Fay</dc:creator>
      <pubDate>Tue, 29 Mar 2022 00:00:00 +0000</pubDate>
      <link>https://forem.com/dmfay/ectomigo-safer-schema-migrations-5g6b</link>
      <guid>https://forem.com/dmfay/ectomigo-safer-schema-migrations-5g6b</guid>
      <description>&lt;p&gt;The team I work with at my day job maintains many applications and processes interacting across a smaller number of databases. This is hardly exceptional. We are also constantly adding, subtracting, and refining not only the client programs but also the database schemas themselves. This too is hardly exceptional: business requirements change, external systems expose new information and deprecate old interfaces, von Moltke's Law ("no plan of operations remains certain once the armies have met") comes calling. Every now and again we just make a modeling or implementation mistake that manages to sneak through review and up to production. Sic semper startups.&lt;/p&gt;

&lt;p&gt;So our database schemas are continually evolving. Each of those many applications and processes has to evolve along with them, or we get paged when the renamed column or dropped table breaks something we hadn't accounted for, and instant breakage is the &lt;em&gt;best&lt;/em&gt; case. We've had schema incompatibilities lie in wait for over a month to catch us completely flatfooted. The complexities of even a single moderately-sized codebase are beyond the grasp of human memory. What hope do we have of recalling which relevant subset of database interactions appear where across two or ten or more?&lt;/p&gt;

&lt;p&gt;What we need is a distinctly &lt;em&gt;inhuman&lt;/em&gt; memory, one for which summoning up each and every reference to a changing table or view takes a moment's effort, and which cannot forget. A memory which operates at the level of the organization, rather than that of the project or of the individual developer/reviewer, who can only focus on a single target at a time. A memory we can consult when, or better yet before, code is ready to deploy -- "&lt;a href="https://en.wikipedia.org/wiki/Shift-left_testing"&gt;shifting left&lt;/a&gt;", as they say.&lt;/p&gt;

&lt;p&gt;We need a database.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://ectomigo.com"&gt;I built one&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--WoK8yjKy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://di.nmfay.com/images/ectomigo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--WoK8yjKy--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://di.nmfay.com/images/ectomigo.png" alt="a schema migration alters a table, renaming a column; ectomigo leaves a GitHub review comment pointing out references to that table in two repositories. Each reference includes the columns ectomigo has been able to identify. One reference uses the column's new name, indicating it's been updated, but another in the second repository still uses the old name and must be fixed." width="784" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;ectomigo is a continuous integration module (initially a &lt;a href="https://github.com/ectomigo/ectomigo"&gt;GitHub action&lt;/a&gt;) which parses your source files using &lt;a href="https://tree-sitter.github.io/tree-sitter/"&gt;tree-sitter&lt;/a&gt; to find data access code: SQL scripts and inline SQL in Java, JavaScript, and Python; &lt;a href="https://massivejs.org"&gt;MassiveJS&lt;/a&gt; calls; &lt;a href="https://www.sqlalchemy.org"&gt;SQLAlchemy&lt;/a&gt; definitions; and more languages, data access patterns, analysis features, and platform support on the way after launch. Everything it finds it indexes, storing database object names and the file row-column positions of each reference.&lt;/p&gt;
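&lt;p&gt;As a much-simplified sketch of the indexing step -- a naive regex scan and a hard-coded object list standing in for ectomigo's actual tree-sitter parse -- something along these lines records each reference with its position:&lt;/p&gt;

```python
import re

# Hypothetical object names to look for; ectomigo derives these from
# parsed SQL rather than a hard-coded list, and uses tree-sitter instead
# of this naive regex scan.
TABLES = ["users", "orders"]

def index_references(files):
    """Map each object name to the (file, row, column) of every reference."""
    index = {}
    for path, text in files.items():
        for row, line in enumerate(text.splitlines(), start=1):
            for name in TABLES:
                for match in re.finditer(r"\b%s\b" % re.escape(name), line):
                    index.setdefault(name, []).append((path, row, match.start() + 1))
    return index

files = {"app.py": 'cur.execute("SELECT id FROM users WHERE active")'}
print(index_references(files))
```

&lt;p&gt;The real parse distinguishes a table reference in SQL from a coincidental string match, which is exactly why tree-sitter earns its keep here.&lt;/p&gt;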

&lt;p&gt;When you submit schema changes for review, it parses &lt;em&gt;that&lt;/em&gt; code as well, and matches the targets you're altering or dropping against every codebase your organization has enabled. If it does find any matches -- in other words, you still have live references to an affected database object, in this or another repository -- it leaves review comments alerting you to each potential problem.&lt;/p&gt;
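&lt;p&gt;The matching step then amounts to a lookup: for every object a migration alters or drops, surface each surviving reference across the indexed repositories. A sketch, with a hypothetical index shape (the real storage format is ectomigo's own):&lt;/p&gt;

```python
# Hypothetical index as ectomigo might store it: object name mapped to
# reference positions across every enabled repository.
index = {
    "users": [("api/app.py", 41, 19), ("etl/load.py", 7, 30)],
    "orders": [("api/app.py", 88, 22)],
}

def check_migration(dropped, index):
    """Return one review comment per live reference to a dropped or renamed object."""
    comments = []
    for name in dropped:
        for repo_file, row, col in index.get(name, []):
            comments.append(f"{repo_file}:{row}:{col} still references '{name}'")
    return comments

for comment in check_migration({"users"}, index):
    print(comment)
```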

&lt;p&gt;ectomigo is &lt;a href="https://github.com/ectomigo/ectomigo"&gt;launching on GitHub&lt;/a&gt; free for public projects and up to two private ones, with &lt;a href="https://ectomigo.com/pricing"&gt;pricing available&lt;/a&gt; beyond that. The action code and the &lt;a href="https://gitlab.com/ectomigo/core"&gt;core&lt;/a&gt; code analysis library it integrates are open under the AGPL, should you be interested.&lt;/p&gt;

&lt;p&gt;We've been using early ectomigo builds at my workplace for a couple of months now, and it's already saved our bacon a few times with reports on database object usage in places we'd forgotten. If you're faced with migration risk yourself, I hope it can help you.&lt;/p&gt;

</description>
      <category>showdev</category>
      <category>news</category>
      <category>devops</category>
      <category>database</category>
    </item>
    <item>
      <title>Exploring Databases Visually</title>
      <dc:creator>Dian Fay</dc:creator>
      <pubDate>Sun, 04 Apr 2021 00:00:00 +0000</pubDate>
      <link>https://forem.com/dmfay/exploring-databases-visually-41j5</link>
      <guid>https://forem.com/dmfay/exploring-databases-visually-41j5</guid>
      <description>&lt;p&gt;In &lt;a href="https://gitlab.com/dmfay/dotfiles/-/blob/master/zsh/fks.zsh"&gt;"things you can do with a terminal emulator that renders images"&lt;/a&gt;:&lt;/p&gt;

&lt;p&gt;One way to look at a database's structure is as a graph of foreign key relationships among tables. Two styles of visual representation predominate: models or entity-relationship diagrams (ERDs) created as part of requirements negotiation and design, and descriptive diagrams of an extant database. The former are drawn by hand on a whiteboard or in diagramming software; the latter are often generated by database management tools with some manual cleanup and organization. Both styles usually take the complete database as their object, and whether descriptive or prescriptive, their role in the software development process is as reference material, or documentation.&lt;/p&gt;

&lt;p&gt;Documentation isn't disposable. Even though these diagrams are out of date practically as soon as they're saved off, they take effort to make, or at least to make legible -- automated tools are only so good at layout, especially as table and relationship counts grow. That effort isn't lightly discarded, and anyway a diagram that's still &lt;em&gt;mostly&lt;/em&gt; accurate remains a useful reference.&lt;/p&gt;

&lt;p&gt;Documentation isn't disposable. But documentation isn't the only tool we have for orienting ourselves in a system: we can also explore, view the system in parts and from different angles, follow individual paths through the model from concept to concept. Exploration depends on adopting a partial, mobile perspective from the inside of the model, with rapid feedback and enough context to navigate but not so much as to be overwhelmed. The view from a single point is more or less important depending on the point itself, but in order to facilitate exploration that view has to be generated and discarded on demand. Look, move, look, move.&lt;/p&gt;

&lt;p&gt;This is a partial perspective of the &lt;a href="https://github.com/devrimgunduz/pagila"&gt;pagila&lt;/a&gt; sample database, from the table &lt;code&gt;film&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--NGRRjHpT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://di.nmfay.com/images/fks-film.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--NGRRjHpT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://di.nmfay.com/images/fks-film.png" alt="the &amp;quot;film&amp;quot; table in a graph showing its dependence via foreign key on the &amp;quot;language&amp;quot; table, and other tables' dependencies on &amp;quot;film&amp;quot;. A film has corresponding records in &amp;quot;film_actor&amp;quot; and &amp;quot;film_category&amp;quot; (junction tables, to &amp;quot;actor&amp;quot; and &amp;quot;category&amp;quot; tables not shown in this partial perspective); copies of a film are in &amp;quot;inventory&amp;quot;; inventory items in turn are referenced in &amp;quot;rental&amp;quot;; and rentals turn up in a set of &amp;quot;payment&amp;quot; tables partitioned by month."&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It's generated by &lt;a href="https://gitlab.com/dmfay/dotfiles/-/blob/master/zsh/fks.zsh"&gt;this &lt;code&gt;fks&lt;/code&gt; zsh function&lt;/a&gt; which queries Postgres' catalog of foreign keys using a &lt;a href="https://www.citusdata.com/blog/2018/05/15/fun-with-sql-recursive-ctes/"&gt;recursive common table expression&lt;/a&gt; to identify and visualize everything connected in a straight line to the target. The query output is passed to the &lt;a href="https://graphviz.org"&gt;Graphviz suite's &lt;code&gt;dot&lt;/code&gt;&lt;/a&gt; with a template, rendered to png, and the png displayed with &lt;a href="https://wezfurlong.org/wezterm/"&gt;&lt;code&gt;wezterm imgcat&lt;/code&gt;&lt;/a&gt;. No files are created or harmed at any point in the process.&lt;/p&gt;
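&lt;p&gt;The real work happens in SQL and zsh, but purely to illustrate the traversal, here's a Python sketch over a hypothetical hand-written foreign key list (standing in for the &lt;code&gt;pg_constraint&lt;/code&gt; query): it walks straight lines up and down from a target table and emits the DOT text handed to Graphviz:&lt;/p&gt;

```python
# Hypothetical foreign keys as (child, parent) pairs, a stand-in for what
# fks reads out of Postgres' pg_constraint catalog for pagila.
FKS = [
    ("film", "language"), ("inventory", "film"), ("rental", "inventory"),
    ("film_actor", "film"), ("payment", "rental"), ("film_actor", "actor"),
]

def straight_line(target):
    """Collect FK edges connected to target by purely-up or purely-down paths."""
    edges = set()
    def walk(table, down):
        for child, parent in FKS:
            if down and parent == table and (child, parent) not in edges:
                edges.add((child, parent))   # a table depending on this one
                walk(child, down=True)
            if not down and child == table and (child, parent) not in edges:
                edges.add((child, parent))   # a table this one depends on
                walk(parent, down=False)
    walk(target, down=True)
    walk(target, down=False)
    return edges

def to_dot(edges):
    """Render the edge set as a DOT digraph for Graphviz."""
    lines = ["digraph fks {"] + [f'  "{c}" -> "{p}";' for c, p in sorted(edges)] + ["}"]
    return "\n".join(lines)

print(to_dot(straight_line("film")))
```

&lt;p&gt;From &lt;code&gt;film&lt;/code&gt;, this reaches &lt;code&gt;film_actor&lt;/code&gt; (one step down) but not &lt;code&gt;actor&lt;/code&gt;, which sits around the corner on the far side of the junction.&lt;/p&gt;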

&lt;p&gt;Why only a straight line, though? The graph above has obvious gaps: &lt;code&gt;film_actor&lt;/code&gt; implies an &lt;code&gt;actor&lt;/code&gt;, and &lt;code&gt;film_category&lt;/code&gt; its own table on the other side of the junction. &lt;code&gt;inventory&lt;/code&gt; probably wants a &lt;code&gt;store&lt;/code&gt;, and &lt;code&gt;rental&lt;/code&gt; and the payment tables aren't much use without a &lt;code&gt;customer&lt;/code&gt;. The view from &lt;code&gt;rental&lt;/code&gt; is markedly different, with half a dozen tables that weren't visible at all from &lt;code&gt;film&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--7KWe-s4V--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://di.nmfay.com/images/fks-rental.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--7KWe-s4V--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://di.nmfay.com/images/fks-rental.png" alt='a perspective on the pagila sample database from the "rental" table. The same "payment" tables depend on it, but upstream "inventory" is joined by "customer" and "staff", and further up "store", "address" (relating to customers, staff, and stores), "city", and "country" tables. "Film" and "language" are also present upstream from "inventory".'&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This graph is familiar in part: there's &lt;code&gt;rental&lt;/code&gt; itself, the payment tables, &lt;code&gt;inventory&lt;/code&gt;, &lt;code&gt;film&lt;/code&gt; -- the last shorn of the junctions to the still-missing &lt;code&gt;actor&lt;/code&gt; and &lt;code&gt;category&lt;/code&gt; tables. Those have passed around a metaphorical corner, since in order to get from &lt;code&gt;rental&lt;/code&gt; to &lt;code&gt;film_actor&lt;/code&gt; you must travel first &lt;em&gt;up&lt;/em&gt; foreign keys into &lt;code&gt;film&lt;/code&gt; (via &lt;code&gt;rental.inventory_id&lt;/code&gt; and &lt;code&gt;inventory.film_id&lt;/code&gt;), then &lt;em&gt;down&lt;/em&gt; by way of &lt;code&gt;film_actor.film_id&lt;/code&gt;. &lt;code&gt;language&lt;/code&gt;, meanwhile, is "upwards" of &lt;code&gt;film&lt;/code&gt; and therefore remains visible from &lt;code&gt;rental&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The reason &lt;code&gt;fks&lt;/code&gt; restricts its search to straight lines from the target table is to keep context narrow. You can get a fuller picture of the table structure by navigating and viewing the graph from multiple perspectives; what &lt;code&gt;fks&lt;/code&gt; shows is the set of tables which can affect the target, or which will be affected by changes in the target. If you delete a &lt;code&gt;store&lt;/code&gt; or a &lt;code&gt;film&lt;/code&gt;, rentals from that store or of that film are invalidated (and, unless the intermediary foreign keys are set to cascade, the delete fails). But deleting a &lt;code&gt;film_actor&lt;/code&gt; has nothing to do with &lt;code&gt;rental&lt;/code&gt;, and vice versa.&lt;/p&gt;

&lt;p&gt;There's an actual, serious problem with unrestricted traversal, too. If you recurse through &lt;em&gt;all&lt;/em&gt; relationships, you wind up mapping entire subgraphs, or clusters of related tables. And clusters grow quickly. Stuart Kauffman has a great illustration of the principle in his book &lt;em&gt;At Home in the Universe: The Search for the Laws of Self-Organization and Complexity&lt;/em&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Imagine 10,000 buttons scattered on a hardwood floor. Randomly choose two buttons and connect them with a thread. Now put this pair down and randomly choose two more buttons, pick them up, and connect them with a thread. As you continue to do this, at first you will almost certainly pick up buttons that you have not picked up before. After a while, however, you are more likely to pick at random a pair of buttons and find that you have already chosen one of the pair. So when you tie a thread between the two newly chosen buttons, you will find three buttons tied together. In short, as you continue to choose random pairs of buttons to connect with a thread, after a while the buttons start becoming interconnected into larger clusters.&lt;/p&gt;
&lt;/blockquote&gt;
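&lt;p&gt;The buttons experiment is easy to replay in a few lines -- a union-find sketch, with the button count and thread:button ratios picked for illustration:&lt;/p&gt;

```python
import random

def largest_cluster(n_buttons, n_threads, seed=1):
    """Tie n_threads random pairs of n_buttons together; return the biggest cluster."""
    parent = list(range(n_buttons))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    rng = random.Random(seed)
    for _ in range(n_threads):
        a, b = rng.randrange(n_buttons), rng.randrange(n_buttons)
        parent[find(a)] = find(b)  # thread the two buttons' clusters together
    sizes = {}
    for i in range(n_buttons):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())

for ratio in (0.25, 0.5, 0.75, 1.0):
    print(ratio, largest_cluster(10_000, int(ratio * 10_000)))
```

&lt;p&gt;Below a ratio of 0.5 the largest cluster stays negligible; past it, a supercluster containing most of the buttons appears.&lt;/p&gt;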

&lt;p&gt;When the ratio of threads to buttons, or relationships to tables, passes 0.5, there's a phase transition. Enough clusters exist that the next thread or relationship will likely connect one cluster to another, and the next, and the next. A supercluster emerges, nearly the size of the entire relationship graph. We can see what the relationship:table ratio looks like in a database by querying the system catalogs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;tbls&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;information_schema&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tables&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;table_schema&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'pg_catalog'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'information_schema'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;fks&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_constraint&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;contype&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'f'&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt;
  &lt;span class="n"&gt;fks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;tbls&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;fks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;decimal&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;tbls&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;tbls&lt;/span&gt; &lt;span class="k"&gt;CROSS&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;fks&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The lowest ratio I have in a real working database is 0.56, and it's a small one, with f=14 and t=25. Others range from 0.61 (f=78, t=126) all the way up to 1.96 (f=2171, t=1107 thanks to a heavily partitioned table with multiple foreign keys); pagila itself is in the middle at 1.08 (f=27, t=25). I don't have enough data to back this up, but I think it's reasonable to expect that the number of relationships tends to increase faster than the number of tables. Without restrictions on traversal, you might as well draw a regular ERD: superclusters are inevitable.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;fks&lt;/code&gt; will draw a regular ERD if passed only the database name, but like I said earlier, automated tools are only so good at layout (and in a terminal of limited width, even a smallish database is liable to produce an illegibly zoomed-out model). With no way to add universal render hints, Graphviz does a lot better with the smaller, more restricted graphs from local perspectives inside the database -- and so do humans. Reading a full-scale data model is hard! Tens or hundreds of nodes have to be sorted by relevance to the problem at hand; nodes and relationships which matter have to be mapped, the irrelevant actively ignored, others tagged with a mental question mark. Often a given problem involves more relevant entities than the human mind can track unaided. &lt;code&gt;fks&lt;/code&gt; doesn't resolve the issue completely, but making a database spatial and navigating that space visually goes some way to meet our limitations and those of our tools.&lt;/p&gt;

</description>
      <category>database</category>
      <category>showdev</category>
      <category>shell</category>
    </item>
    <item>
      <title>Extra-fuzzy History Searching with Mnem</title>
      <dc:creator>Dian Fay</dc:creator>
      <pubDate>Thu, 17 Sep 2020 00:00:00 +0000</pubDate>
      <link>https://forem.com/dmfay/extra-fuzzy-history-searching-with-mnem-5ccl</link>
      <guid>https://forem.com/dmfay/extra-fuzzy-history-searching-with-mnem-5ccl</guid>
      <description>&lt;p&gt;I use a lot of Rust command-line tools: &lt;a href="https://github.com/BurntSushi/ripgrep"&gt;ripgrep&lt;/a&gt;, &lt;a href="https://github.com/sharkdp/fd"&gt;fd&lt;/a&gt;, &lt;a href="https://github.com/bootandy/dust"&gt;dust&lt;/a&gt;, and more. So when I had my own idea for a better command-line mousetrap, it seemed like the way to go.&lt;/p&gt;

&lt;p&gt;Shells log the commands you enter to a history file. Bash has &lt;code&gt;.bash_history&lt;/code&gt;, zsh uses &lt;code&gt;.histfile&lt;/code&gt;. The &lt;code&gt;EXTENDED_HISTORY&lt;/code&gt; option in the latter adds timestamps, but that's about as fancy as it gets. Both shells (and presumably others) also have "reverse search" functionality which lets you look backwards and forwards through the history, one line at a time.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--_eiB_S1o--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://di.nmfay.com/images/mnem-ctrl-r.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--_eiB_S1o--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://di.nmfay.com/images/mnem-ctrl-r.gif" alt="reverse searching for rustc calls"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Functional! But not especially friendly. Only seeing one result at a time makes it difficult to evaluate multiple similar matches; matching is strictly linear, as you can see from my typos; and chronological order is only sometimes the most useful.&lt;/p&gt;

&lt;p&gt;I do a lot with the AWS CLI, SaltStack, and other complicated command-line interfaces. I want to compare invocations to see how I've combined verbs and flags in the past, and for tasks I repeat just often enough to forget how to do them, sorting by overall frequency is more useful than sorting by time.&lt;/p&gt;

&lt;p&gt;Enter &lt;a href="https://gitlab.com/dmfay/mnem"&gt;Mnem&lt;/a&gt; (regrettably, I missed getting &lt;code&gt;clio&lt;/code&gt;, the Muse of history, by a matter of weeks):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--7tdFRtpc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://di.nmfay.com/images/mnem.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--7tdFRtpc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://di.nmfay.com/images/mnem.gif" alt="mnem in use"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The idea is pretty simple: load the history file, and reduce every command to its syntactic structure. &lt;code&gt;git commit -m "some message here"&lt;/code&gt; becomes &lt;code&gt;git commit -m &amp;lt;val&amp;gt;&lt;/code&gt;; &lt;code&gt;mv "hither" "thither"&lt;/code&gt; turns into &lt;code&gt;mv &amp;lt;arg1&amp;gt; &amp;lt;arg2&amp;gt;&lt;/code&gt;. Many entries will have the same structure, especially if switches are sorted consistently, so counting up occurrences yields each structure's overall popularity.&lt;/p&gt;
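&lt;p&gt;A rough sketch of that reduction -- the placeholder spelling and the rules for deciding what counts as a value here are guesses for illustration, not Mnem's actual implementation:&lt;/p&gt;

```python
import re
from collections import Counter

# Anything quoted is definitely a value.
QUOTED = re.compile(r"""(['"]).*?\1""")

def structure(command):
    """Reduce a command line to its syntactic structure."""
    cmd = QUOTED.sub("VAL", command)
    out = []
    for i, tok in enumerate(cmd.split()):
        if i == 0 or tok.startswith("-") or tok == "VAL":
            out.append(tok)  # keep the program name, switches, and placeholders
        elif re.search(r"[/.\d]", tok):
            out.append("VAL")  # paths and numbers are values too (a guess)
        else:
            out.append(tok)  # bare words pass through as subcommands
    return " ".join(out)

history = [
    'git commit -m "some message here"',
    'git commit -m "fix the build"',
    "git status",
]
popularity = Counter(structure(cmd) for cmd in history)
print(popularity.most_common())
```

&lt;p&gt;&lt;code&gt;Counter.most_common&lt;/code&gt; then hands back exactly the frequency-ordered aggregates the first selector shows.&lt;/p&gt;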

&lt;p&gt;Picking one such aggregate yields a second selector over the original incidences, and selecting one of those prints it to stdout. This can be referenced, copied and pasted, or even &lt;code&gt;eval&lt;/code&gt;ed in the shell.&lt;/p&gt;

&lt;p&gt;So far I've released Mnem to the &lt;a href="https://aur.archlinux.org/packages/mnem/"&gt;Arch AUR&lt;/a&gt; and a Homebrew tap:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;brew tap dmfay/mnem https://gitlab.com/dmfay/homebrew-mnem.git
brew install dmfay/mnem/mnem
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



</description>
      <category>rust</category>
      <category>shell</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Should sponsors be able to hide comments on their posts?</title>
      <dc:creator>Dian Fay</dc:creator>
      <pubDate>Thu, 23 Jul 2020 15:43:53 +0000</pubDate>
      <link>https://forem.com/dmfay/should-sponsors-be-able-to-hide-comments-on-their-posts-35b1</link>
      <guid>https://forem.com/dmfay/should-sponsors-be-able-to-hide-comments-on-their-posts-35b1</guid>
      <description>&lt;p&gt;&lt;a href="https://dev.to/github/hi-we-re-github-and-we-re-excited-to-be-at-codeland-32dg"&gt;GitHub is sponsoring CodeLand&lt;/a&gt;, and have the expected friendly DevRel spokesperson, announcement thread, raffle for early access to new stuff, and the like.&lt;/p&gt;

&lt;p&gt;I'm not going to belabor &lt;a href="https://www.theverge.com/2019/10/9/20906213/github-ice-microsoft-software-email-contract-immigration-nonprofit-donation"&gt;the issues I have with GitHub&lt;/a&gt;. Suffice it to say that when I asked the friendly DevRel spokesperson &lt;a class="comment-mentioned-user" href="https://dev.to/bdougieyo"&gt;@bdougieyo&lt;/a&gt; how he felt about his company's collaboration with the US government's Immigration and Customs Enforcement agency (ICE), it was in good faith -- I'm genuinely interested in hearing how he does or doesn't grapple with his employer's ethics!&lt;/p&gt;

&lt;p&gt;Instead of answering, ignoring, or even brushing me off with patter about how donations to nonprofits absolve GitHub of responsibility, he used the post author tools to hide my comment. And &lt;em&gt;that&lt;/em&gt; I have a problem with: it crosses the line between GitHub using DEV to whitewash its reputation, and DEV providing tools to &lt;em&gt;help&lt;/em&gt; GitHub whitewash its reputation.&lt;/p&gt;

&lt;p&gt;If a comment on a sponsor post violates no rule, sponsors shouldn't be able to hide it for mere political expediency. They're here to attract attention and drum up business; if their practices mean that that attention isn't all as positive and uncritical as they'd like, that's for them to handle -- not DEV.&lt;/p&gt;

&lt;p&gt;What do you think?&lt;/p&gt;

</description>
      <category>meta</category>
      <category>discuss</category>
      <category>codeland</category>
    </item>
    <item>
      <title>Plex: A Life</title>
      <dc:creator>Dian Fay</dc:creator>
      <pubDate>Fri, 06 Sep 2019 00:00:00 +0000</pubDate>
      <link>https://forem.com/dmfay/plex-a-life-1n9a</link>
      <guid>https://forem.com/dmfay/plex-a-life-1n9a</guid>
      <description>&lt;p&gt;A little while back I got my hands on a copy of &lt;em&gt;Software Development and Reality Construction&lt;/em&gt;, the output of a conference held in Berlin in 1988. Among a variety of other more or less philosophical treatments of the theory and practice of software development, Don Knuth analyzes errors he made in his work on TeX; Kristen "SIMULA" Nygaard reviews his collaboration with labor unions to ensure that software meant to coordinate and control work does not wind up controlling the workers as well, a rather grim read in the era of Uber and Amazon; Heinz Klein and Kalle Lyytinen embark on a discussion of data modeling as production rather than as interpretation or hermeneutics. In all, it's some of the most insightful writing about programming and software engineering I've encountered.&lt;/p&gt;

&lt;p&gt;This isn't about those contributions.&lt;/p&gt;

&lt;p&gt;There's an entry fairly early on from one Douglas T. Ross, called "From Scientific Practice to Epistemological Discovery". Ross, who died in 2007, was a computer scientist and engineer most remembered today for the influential APT machine tools programming language and for coining the term "computer-aided design".&lt;/p&gt;

&lt;p&gt;This isn't about the things Doug Ross is remembered for.&lt;/p&gt;

&lt;p&gt;Doug Ross had a &lt;em&gt;system&lt;/em&gt;. The system began its public life as an early software engineering methodology in the Cambrian explosion of such methodologies enabled by the spread of high-level programming languages in the 60s and 70s. The system went by a few names. Ross's company, SofTech Inc., called it the Structured Analysis and Design Technique or SADT. The US Air Force, never wont to use merely one acronym where two will do, called it IDEF0: ICAM (Integrated Computer Aided Manufacturing) DEFinition for function modeling.&lt;/p&gt;

&lt;p&gt;To Doug Ross, the system was Plex. And Plex was everything. When the Department of Defense cut the Structured Analysis data modeling approach from IDEF0 in favor of a simpler methodology to be developed by SofTech subcontractors and named IDEF1, Ross decried the decision as destroying the "mathematical elegance and symmetric completeness of SADT [...] IDEF0 became merely the best of a competing zoo of other software development CASE tools, none of which were scientifically founded". He saw his career, and, indeed, his life, as drawing him inevitably toward the discovery and promulgation of his "philosophy of problem-solving", and furthering Plex's development became more and more important to him as time went on. In the mid-80s, he stopped drawing a salary at SofTech and went back to MIT, lecturing part-time on electrical engineering in order to focus more of his efforts on Plex.&lt;/p&gt;

&lt;p&gt;But even MIT was, in Ross's own words, "not yet ready for [Structured Analysis] much less Plex". A graduate seminar on Plex itself was briefly offered in 1984, but was canceled due to lack of student interest. In "From Scientific Practice" Ross bemoans his inability to gain traction for Plex, writing of feeling "an intolerable burden of responsibility to still be the only person in the world (to my knowledge) pursuing it". His only recourse was to turn inward and "generate book after book on Plex in my office at home, in order that Plex will be ready when the world is ready for it!"&lt;/p&gt;

&lt;p&gt;At this point, Doug Ross might be sounding a little bit like a crank. Let me be clear: Douglas T. Ross, computer science pioneer, was &lt;em&gt;absolutely&lt;/em&gt; a crank of the first water. This is just as absolutely to his credit; any fool can make it from the sublime to the ridiculous, but it takes real talent to go in the other direction. And Plex &lt;em&gt;is&lt;/em&gt; sublime, if in its own dry, academic way. Ross is not the celestial paranoiac Francis E. Dec, ranting and raving about the &lt;a href="http://www.bentoandstarchky.com/dec/intro.htm"&gt;Worldwide Deadly Gangster Communist Computer God&lt;/a&gt; and lunar brain storage depots; nor is Plex the gonzo experience of &lt;a href="https://timecube.2enp.com/"&gt;Nature's Harmonic Simultaneous 4-Day Time Cube&lt;/a&gt;. That said, Ross never devolves into the racist vituperations Dec and Time Cube's Gene Ray were sometimes given to, either. So it goes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;⁂&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Plex itself is a sprawling, incoherent metaphysics built, according to Ross, on the foundation of a single pun (or, more properly, double entendre): "nothing can be left out". Thus inspired, Ross embarks upon the classic Cartesian thought experiment. But where Descartes discards every proposition except the cogito ("I think, therefore I am"), Ross's buck stops at "nothing doesn't exist".&lt;/p&gt;

&lt;p&gt;Or, in Ross's own framing:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Nothing doesn't exist&lt;/strong&gt;. That is &lt;em&gt;the&lt;/em&gt; &lt;strong&gt;First Definition&lt;/strong&gt; of Plex -- a scientific philosophy whose aim is &lt;em&gt;understanding our understanding of the nature of nature&lt;/em&gt;. Plex does not attempt to understand nature &lt;em&gt;itself&lt;/em&gt;, but only our &lt;em&gt;understanding&lt;/em&gt; of it. We are &lt;em&gt;included&lt;/em&gt; in nature as we do "our understanding", both scientific and informal, so we must understand &lt;em&gt;ourselves&lt;/em&gt; as well -- not just what we &lt;em&gt;think&lt;/em&gt; we are, but as we &lt;em&gt;really&lt;/em&gt; are, as &lt;em&gt;integral, natural &lt;strong&gt;beings&lt;/strong&gt; of nature&lt;/em&gt;. &lt;em&gt;How&lt;/em&gt; one "understand"s and even who "we" &lt;em&gt;are&lt;/em&gt; as we &lt;em&gt;do&lt;/em&gt; "our understanding" necessarily is left completely open, for all that must arise &lt;em&gt;naturally&lt;/em&gt; from the very &lt;em&gt;nature&lt;/em&gt; of nature.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;All emphasis -- all of it, I assure you -- original. Ross's dedication to bold and italic text wavers from work to work and page to page, but on balance "From Scientific Practice to Epistemological Discovery" is in fine form. Early entries in what he refers to as his "thousands of C-pages" (that is, "chronological working pages", all of which may or may not have been lost) and the &lt;a href="https://groups.csail.mit.edu/mac/projects/studentaut/lecture4/plex_lectures_book_ok.htm"&gt;lecture notes he prepared in 1985&lt;/a&gt; sometimes switch between up to eight colors every few words. The lecture notes are of particular interest compared to the other extant materials, comprising a "study of an SADT Data Model which expresses all aspects of any object which obeys laws of physical cause and effect" delivered as a dialogue between Ross and a genie reminiscent of &lt;em&gt;Gödel, Escher, Bach&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Having arrived at the First Definition, Ross next attempts to deduce everything else from it, claiming that Plex need make no assumptions. "Nothing doesn't exist" leads, expanded this way and that, to "Only that which is &lt;em&gt;known by definition&lt;/em&gt; is &lt;strong&gt;known&lt;/strong&gt; -- by definition", as, "&lt;em&gt;without&lt;/em&gt; a definition for something, we only can know it as Nothing". Within the space of a few paragraphs, he's slammed what appears to be his own misinterpretation of Stephen Hawking and (unknowingly?) reinvented Spinoza's pantheism, on the grounds that "Nothing &lt;strong&gt;isn't&lt;/strong&gt; ; Plex is what Nothing &lt;strong&gt;isn't&lt;/strong&gt;". And for what it's worth, this is all still in the first two pages of "From Scientific Practice".&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;⁂&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In another instance, Plex guides Ross to enlightenment regarding questions of information theory. It turns out that a single bit actually requires 3/2 binary digits for encoding, "because the value of the &lt;em&gt;half-bit&lt;/em&gt; is 3/4 !!!".&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;-- which ultimately results from the fact that in &lt;em&gt;actuality&lt;/em&gt;, when you don't have something, it is &lt;em&gt;not&lt;/em&gt; the case that you &lt;em&gt;have&lt;/em&gt; it &lt;em&gt;but&lt;/em&gt; it is Nothing -- it is that you &lt;strong&gt;don't have&lt;/strong&gt; it; whereas when you &lt;em&gt;do&lt;/em&gt; have something, that is because you &lt;strong&gt;don't have&lt;/strong&gt; what it &lt;em&gt;isn't&lt;/em&gt;!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;On closer reading, this isn't necessarily the gibberish it might seem at first blush. Plex's foundation in "Nothing" makes &lt;code&gt;zero&lt;/code&gt; the default state. But &lt;code&gt;one&lt;/code&gt; is only understandable when there's an understood meaning for &lt;code&gt;one&lt;/code&gt;. The elaboration about nothings and somethings makes it seem like Ross is counting this other &lt;code&gt;one&lt;/code&gt; -- that is, half a bit -- towards the cost of encoding any other bit. In semiotic terms, this is the &lt;em&gt;interpretant&lt;/em&gt; or subjective value Charles Sanders Peirce sees implicit in signification. But if Ross ever investigated the ways logicians and linguists had already been exploring this territory, there's no indication that he attached any significance (as it were) to their work. And while including the interpretant for half the possible values may yield the same final figure, it does not account for the 3/4 half-bit; so in the face of storage hardware design as practiced, Ross's insistence on 3/2 seems more mystical than scientific.&lt;/p&gt;
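&lt;p&gt;For comparison, the orthodox arithmetic is easy to state: a maximally uncertain binary choice carries exactly one bit of Shannon entropy, and one binary digit stores it. A quick sketch in plain JavaScript -- illustrative only, and nothing here comes from Ross:&lt;/p&gt;

```javascript
// Shannon entropy in bits: H = -sum over outcomes of p * log2(p).
// A fair coin (two equiprobable outcomes) comes out to exactly 1 bit,
// i.e. one binary digit per bit -- the textbook figure, not Ross's 3/2.
function entropyBits(probabilities) {
  let h = 0;
  for (const p of probabilities) {
    if (p > 0) {
      h -= p * Math.log2(p);
    }
  }
  return h;
}

console.log(entropyBits([0.5, 0.5]));   // 1
console.log(entropyBits([0.75, 0.25])); // ~0.811: a biased bit carries less
```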

&lt;p&gt;I have no idea how &lt;em&gt;au courant&lt;/em&gt; Ross was with the humanities in general, but it seems likely that the answer is "not very". He was, of course, quite well-versed in math and engineering. Even deep in the mire of Plex, one can find him struggling to accommodate the realization that he was, in essence, defining formal systems backwards (he settles this with the ingenious maneuver of declaring the distinction akin to chirality), but the only philosopher he mentions is Plato. His efforts at deductive logic too seem thoroughly warped, as evinced by his "proof that every point is the whole world". For reference, an object's "identity" is tautologically defined as above: the set of "that" which "this" isn't.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  I n = 1: A world of one point is the whole world.
 II Assume the theorem is true for (n - 1) points. (n &amp;gt; 1),
     i.e., for any collection of (n - 1) points, every point is the whole world.
     [ed: remember, Plex needs no assumptions, let alone "assume the theorem is true"]
III To prove the theorem for n points given its truth for (n - 1) points
     (n &amp;gt; 1)
     (a) The identity of any one point, p, in the collection is a collection of (n -
         1) points, each of which is the whole world, by II.
     (b) The identity of any other point, q, i.e., a point of the identity of p, is
         a collection of (n - 1) points, each of which is the whole world, by II.
     (c) The identity of p and the identity of q are identical except that where
         the identity of p has q the identity of q has p. In any case p is the
         whole world by (b) and q is the whole world by (a).
     (d) Hence both p and q are the whole world, as are all the other points (if
         any) in their respective identities (and shared between them).
     (e) Hence all n points are the whole world.
 IV For n = 2, I is used (via II) in IIIa and IIIb, q.e.d.
  V Q.E.D. by natural induction.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;As mentioned, Ross generated a wealth of C-pages, lecture notes, and other writings on Plex, but except for a small fraction apparently hosted on &lt;a href="https://groups.csail.mit.edu/mac/projects/studentaut/index.htm"&gt;his last MIT faculty/program page&lt;/a&gt;, I have no idea where most of this collection ended up. If you're interested in reading further in Ross's own words, the best places to start are probably "From Scientific Practice to Epistemological Discovery" in &lt;a href="https://www.researchgate.net/publication/242530010_Software_Development_and_Reality_Construction"&gt;&lt;em&gt;Software Development and Reality Construction&lt;/em&gt;&lt;/a&gt; or &lt;a href="https://groups.csail.mit.edu/mac/projects/studentaut/The%20Plex%20Tract.htm"&gt;The Plex Tract&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;Coda&lt;/h2&gt;

&lt;p&gt;Doug Ross himself remains a rather cryptic figure. There's some biographical information out there, but after his birth to missionary parents in what's now Guangdong and childhood homecoming to the Finger Lakes region of New York it mostly concerns where, when, with whom, and on what he was working. In his writings he comes off somewhat full of himself, as tends to be the case with esoteric philosophers and visionaries for whom the world is not yet and will never be ready. But when Ross talks about the necessary perfection, or perfect necessity, of his marriage to his wife Pat, herself a human computer at MIT's Lincoln Laboratory, it's still a little bit charming. And when he writes, with complete seriousness, that "being a pioneer came naturally" to him, I can't exactly say otherwise.&lt;/p&gt;

&lt;p&gt;I wonder what it was like in that conference hall in 1988. I don't know whether the attendees or the organizers knew what they were in for when Ross got up to talk about this beautiful, all-consuming nonsense that was driving him to desperation. But sense isn't everything; and as a project of &lt;em&gt;reality construction&lt;/em&gt; Plex is a monumental accomplishment. And the reality we ourselves have collectively constructed, in which points are points, a bit corresponds to a single binary digit, and genies obstinately refuse to appear no matter how we manipulate bottles, is the richer for its existence.&lt;/p&gt;

</description>
      <category>history</category>
      <category>philosophy</category>
      <category>biography</category>
    </item>
    <item>
      <title>JOIN Semiotics and MassiveJS v6</title>
      <dc:creator>Dian Fay</dc:creator>
      <pubDate>Tue, 13 Aug 2019 00:00:00 +0000</pubDate>
      <link>https://forem.com/dmfay/join-semiotics-and-massivejs-v6-44he</link>
      <guid>https://forem.com/dmfay/join-semiotics-and-massivejs-v6-44he</guid>
      <description>&lt;p&gt;&lt;a href="https://massivejs.org"&gt;MassiveJS&lt;/a&gt; version 6 is imminent. This next release closes the widest remaining gap between Massive-generated APIs and everyday SQL, not to mention other higher-level data access libraries: &lt;code&gt;JOIN&lt;/code&gt;s.&lt;/p&gt;

&lt;p&gt;This is something of a reversal for Massive, which until now has had very limited functionality for working with multiple database entities at once. I've even &lt;a href="https://dev.to/dmfay/centralize-your-query-logic-5bhh"&gt;written about this as a constraint not without benefits&lt;/a&gt; (and, for the record, I think that still -- ad-hoc joins are a tool to be used judiciously in application code!).&lt;/p&gt;

&lt;p&gt;But the main reason for this lack was always that I'd never come up with any solution that didn't fit awkwardly into an already-awkward options object. &lt;a href="https://massivejs.org/docs/persistence#deep-insert"&gt;Deep insert&lt;/a&gt; and &lt;a href="https://massivejs.org/docs/resultset-decomposition"&gt;resultset decomposition&lt;/a&gt; were quite enough to keep track of. I am naturally loath to concede any inherent advantages to constructing models, but this really seemed like one for the longest time.&lt;/p&gt;

&lt;p&gt;There are, however, ways. Here's what Massive joins look like, if we invade the imaginary privacy of an imaginary library system's imaginary patrons:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;whoCheckedOutCalvino&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;libraries&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;join&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;books&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="na"&gt;library_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;patron_books&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;LEFT OUTER&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;pk&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;patron_id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;book_id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="na"&gt;book_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;books.id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;omit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;who_checked_out&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;LEFT OUTER&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;relation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;patrons&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;patron_books.patron_id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="nx"&gt;find&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;EV&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;books.author ILIKE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;calvino, %&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;(&lt;code&gt;relation&lt;/code&gt; in this sense indicates a table or view.)&lt;/p&gt;

&lt;p&gt;And the output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="p"&gt;[{&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;name&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;East Virginia State U&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;state&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;EV&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;books&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;author&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Calvino, Italo&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;library_id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;title&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Cosmicomics&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;who_checked_out&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;name&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Lauren Ipsum&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;}]&lt;/span&gt;
  &lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;name&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Neitherfolk Public Library&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;state&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;EV&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;books&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;author&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Calvino, Italo&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;library_id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;title&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Cosmicomics&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;who_checked_out&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;name&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Daler S. Ahmet&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;}]&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;author&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Calvino, Italo&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;library_id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;title&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Invisible Cities&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;who_checked_out&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
  &lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Or in other words, exactly what you'd hope it would look like -- and what, if you use Massive, you may previously have needed a view and a decomposition schema to achieve. This is a moderately complex example, and between defaults (e.g. &lt;code&gt;type&lt;/code&gt; defaulting to &lt;code&gt;INNER&lt;/code&gt;) and introspection, declaring a join can be as simple as naming the target: &lt;code&gt;db.libraries.join('books')&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The join schema is something of an evolution on the decomposition schema, sharing the same structure but inferring column lists, table primary keys, and even some &lt;code&gt;on&lt;/code&gt; conditions where unambiguous foreign key relationships exist. It's more concise, less fragile, and still only defined exactly when and where it's needed. Even better, compound entities created from tables can use persistence methods, meaning that &lt;code&gt;join()&lt;/code&gt; can replace many if not most existing usages of deep insert and resultset decomposition.&lt;/p&gt;
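&lt;p&gt;The decomposition step itself is easy to picture outside Massive. Here's a hand-rolled sketch -- not Massive's actual implementation, and the flat row shape and key names are invented for illustration -- of folding joined rows, one per library/book pair, into the nested output shown above:&lt;/p&gt;

```javascript
// Fold flat JOIN results (one row per library/book pairing) into the
// nested object tree a join() or decomposition schema produces.
// Row shape is hypothetical: {library_id, library_name, book_id, title}.
function nestLibraries(rows) {
  const byId = new Map();
  for (const row of rows) {
    if (!byId.has(row.library_id)) {
      byId.set(row.library_id, {
        id: row.library_id,
        name: row.library_name,
        books: []
      });
    }
    // A null book_id means the LEFT OUTER JOIN found no match.
    if (row.book_id !== null && row.book_id !== undefined) {
      byId.get(row.library_id).books.push({ id: row.book_id, title: row.title });
    }
  }
  return Array.from(byId.values());
}

const flat = [
  { library_id: 3, library_name: 'Neitherfolk Public Library', book_id: 2, title: 'Cosmicomics' },
  { library_id: 3, library_name: 'Neitherfolk Public Library', book_id: 4, title: 'Invisible Cities' }
];
console.log(JSON.stringify(nestLibraries(flat), null, 2));
```

&lt;p&gt;The real thing keys on primary keys rather than a single id column -- which is why &lt;code&gt;patron_books&lt;/code&gt; needed its &lt;code&gt;pk&lt;/code&gt; spelled out in the earlier example -- but the shape of the work is the same.&lt;/p&gt;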

&lt;p&gt;It might seem a little unconventional to just invent ersatz database entities out of whole cloth. There's some precedent -- Massive already treats scripts like database functions -- but the compound entities created by &lt;code&gt;Readable.join()&lt;/code&gt; are a good bit more complex than that. There's a method to this madness though, and its origins date back to before Ted Codd came up with the idea of the relational database itself.&lt;/p&gt;

&lt;h2&gt;Semiotics from 30,000 Feet&lt;/h2&gt;

&lt;p&gt;Semiotics is, briefly, the study of meaning-making, with 19th-century roots in both linguistics and formal logic. It's also a sprawling intellectual tradition in dialogue with multifarious other sprawling intellectual traditions, so I am not remotely going to do it justice here. The foundational idea is credited on the linguistics side to Ferdinand de Saussure: meaning is produced in the relation of a &lt;em&gt;signifier&lt;/em&gt; to a &lt;em&gt;signified&lt;/em&gt;, or taken together a &lt;em&gt;sign&lt;/em&gt;. Smoke to fire, letter to sound, and so forth. Everything else proceeds from that relationship. There is, of course, a lot more of that everything else, and like so many other foundational ideas the original Saussurean dyad is something of a museum piece.&lt;/p&gt;

&lt;p&gt;But the idea of theorizing meaning itself in almost algebraic terms would outlive de Saussure. The logician Charles Sanders Peirce had already come to similar conclusions, and had realized to boot that the interpreted value of the signifier's relationship to its signified is as important as the other two. Peirce, following this line of reasoning, understood this "interpretant" itself to be a sign comprising its own signifier and signified which in turn yield their own interpretant, in infinite chains of signification. Louis Hjelmslev, meanwhile, reimagined de Saussure's dyad as a relation of &lt;em&gt;expression&lt;/em&gt; to &lt;em&gt;content&lt;/em&gt;, and added a second dimension of &lt;em&gt;form&lt;/em&gt; and &lt;em&gt;substance&lt;/em&gt;. To Hjelmslev, a sign is a function, in the mathematical sense, mapping the "form of expression" to the "form of content", naming as the "substance of expression" and "substance of content" the raw materials formed into the sign.&lt;/p&gt;

&lt;p&gt;The use of the term "substance" sounds kind of like some sort of philosophically-détourned jargon, but there are no tricks here: it's just &lt;em&gt;stuff&lt;/em&gt;. There's no more specific designation than the likes of "substance" for "that which has been made into a sign"; the category includes everything from physical materials to light, gesture, positioning, electricity, and more, in endless combinations. A sign is created by these matters being selected and formed into content and expression: fuel, oxygen, and heat organized into fire and smoke, or sounds uttered in an order corresponding to a known linguistic quantity. It should be said also that consciousness need not enter into it: anything can make a sign, and even a plant can interpret one.&lt;/p&gt;

&lt;p&gt;This all is to say: there's stuff out there, and what it has in common is that it is made to mean things. Most stuff, in fact, is constantly meaning many things at the same time, as long as there's an interpreting process -- and there's always &lt;em&gt;something&lt;/em&gt;. The philosopher-psychologist tag team of Gilles Deleuze and Felix Guattari envisioned the primordial soup of matters-awaiting-further-formation as a spatial dimension: the &lt;em&gt;plane of consistency&lt;/em&gt; or &lt;em&gt;plane of immanence&lt;/em&gt;. Signification, as they proposed in &lt;em&gt;1000 Plateaus&lt;/em&gt;, happens on and above the plane of consistency, as matters are selected and drawn up from it to become substance and sign. The recursive nature of signification means that these signs are then selected into the substance of yet other signs, becoming layers or strata on the plane in a fashion they compare to the formation of sedimentary rock.&lt;/p&gt;

&lt;h2&gt;Signs and Databases&lt;/h2&gt;

&lt;p&gt;A database management system, like any other program, is an immensely complex system of signs. However, what sets DBMSs (and some other categories of software, like ledgers and version control systems) apart is that they're designed to manage &lt;em&gt;other&lt;/em&gt; systems of signs. Thanks to this recursive aspect, a database can be imagined as a plane of consistency, a space from which any combination of unformed bytes might be drawn up into column-signs and row-signs which in turn are gathered into table-signs and view-signs and query-signs.&lt;/p&gt;

&lt;p&gt;And if tables and views and queries are all still signs at base, where exactly do the differences come in? Tables store persistent data and are therefore mutable, while views and queries do not and are not, and must be constituted from tables themselves and (in the case of views) from each other. Tables constitute a lower stratum of signs, with views forming table- and view-substance into signs on higher strata, and queries higher still, at a sufficient remove from the plane of consistency that they're no longer stored in the database itself.&lt;/p&gt;

&lt;p&gt;This is, of course, arriving at inheritance the long way around. In Massive terms, database entities are first instances of a base &lt;code&gt;Entity&lt;/code&gt; class, after which they inherit a second prototype: one of &lt;code&gt;Sequence&lt;/code&gt;, &lt;code&gt;Executable&lt;/code&gt;, or &lt;code&gt;Readable&lt;/code&gt;. Some of the latter may be further articulated as &lt;code&gt;Writable&lt;/code&gt;s, as well; there are no &lt;code&gt;Writable&lt;/code&gt;s which are not also &lt;code&gt;Readable&lt;/code&gt;s.&lt;/p&gt;
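&lt;p&gt;That layering is straightforward to picture in plain JavaScript. A toy sketch only, not Massive's source -- the class names come from the paragraph above, everything else is invented:&lt;/p&gt;

```javascript
// Toy model of the Entity -> Readable -> Writable layering described above.
class Entity {
  constructor(name) {
    this.name = name;
  }
}

// Views and tables alike can be queried...
class Readable extends Entity {
  find() {
    return `SELECT * FROM ${this.name}`;
  }
}

// ...but only tables persist data. Writable extends Readable, so there
// can be no Writable which is not also a Readable.
class Writable extends Readable {
  insert(record) {
    const columns = Object.keys(record).join(', ');
    return `INSERT INTO ${this.name} (${columns}) ...`;
  }
}

const view = new Readable('recent_checkouts');
const table = new Writable('books');
console.log(view.find());               // SELECT * FROM recent_checkouts
console.log(table instanceof Readable); // true
```

&lt;p&gt;The prototype chain encodes the strata directly: a &lt;code&gt;Readable&lt;/code&gt; with no &lt;code&gt;insert&lt;/code&gt; is the view case, and a &lt;code&gt;Writable&lt;/code&gt; inherits querying for free.&lt;/p&gt;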

&lt;p&gt;But there's more than one thing happening here, and the ordering of tables, views, and database functions into class-strata is the second step -- matters must be chosen before they can be formed into signs. It's in this first step of stratification that Massive adds script files to the API system of signs, treating them (almost) identically to functions and procedures.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Readable.join()&lt;/code&gt; takes the same idea further to expand on the database's relations: before, a &lt;code&gt;Readable&lt;/code&gt; mapped one-to-one with a single table or view. But as long as SQL can be generated to suit, there's no reason one &lt;code&gt;Readable&lt;/code&gt; couldn't map to multiple relations. &lt;code&gt;Writable&lt;/code&gt;s too, for that matter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;librariesWithBooks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;libraries&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;books&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;libraryMembers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;patrons&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;libraries&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// inserts work exactly like deep insert, persisting an&lt;/span&gt;
&lt;span class="c1"&gt;// entire object tree&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;newLibrary&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;librariesWithBooks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Lichfield Public Library&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;EV&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;books&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
    &lt;span class="na"&gt;library_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Jurgen: A Comedy of Justice&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Cabell, James Branch&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;library_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;If On a Winter&lt;/span&gt;&lt;span class="se"&gt;\'&lt;/span&gt;&lt;span class="s1"&gt;s Night a Traveller&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;author&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Calvino, Italo&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
  &lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// updates make changes in the origin table, based on&lt;/span&gt;
&lt;span class="c1"&gt;// criteria which can reference the joined tables&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;withCabell&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;librariesWithBooks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;update&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;books.author ilike&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cabell, %&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;has_cabell&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// deletes, like updates, affect the origin table only&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;iplPatrons&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;libraryMembers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;destroy&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;libraries.name ilike&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Imaginary Public Library&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h2&gt;
  
  
  Try it Out!
&lt;/h2&gt;

&lt;p&gt;The first v6 prerelease is available now: &lt;code&gt;npm i massive@next&lt;/code&gt;. There's now a &lt;a href="https://massivejs.org/docs/prerelease"&gt;prerelease section of the docs&lt;/a&gt; going over what's new and different in detail. But to sum up the other changes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Node &amp;lt; 7.6 is no longer supported.&lt;/li&gt;
&lt;li&gt;Implicit ordering has been dropped.&lt;/li&gt;
&lt;li&gt;Resultset decomposition now yields arrays instead of objects by default. The &lt;code&gt;array&lt;/code&gt; schema field is no longer recognized, and you'll need to remove it from your existing decomposition schemas. To yield objects, set &lt;code&gt;decomposeTo: 'object'&lt;/code&gt; instead.&lt;/li&gt;
&lt;li&gt;JSON and JSONB properties are now sorted as their original type instead of being processed as text.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;type&lt;/code&gt; property of the &lt;code&gt;order&lt;/code&gt; option has been deprecated in favor of Postgres-style &lt;code&gt;field::type&lt;/code&gt; casting as used elsewhere. It will continue to work through the 6.x lifecycle but may be removed in a subsequent major release.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is a feature I've been wishing I could make happen somehow ever since I first published the original resultset decomposition Gist more than two years ago. It's involved extensive changes to table loading, criteria parsing, and statement generation. I've endeavored &lt;em&gt;not&lt;/em&gt; to break these areas, and have informally experimented by dropping pre-prerelease versions into an existing codebase. Results have been good, but should you find an issue with this or any other Massive functionality, please &lt;a href="https://gitlab.com/dmfay/massive-js/issues"&gt;let me know&lt;/a&gt;!&lt;/p&gt;

&lt;p&gt;I'm really excited to see just how far joins expand Massive's capabilities, but in truth there's just one thing I think I and most other Massive users will get the most mileage out of: plain old query predicate generation with criteria objects, without having to define and manage a plethora of views to cover basic &lt;code&gt;JOIN&lt;/code&gt;s. Stratification is a useful way to think about the production of meaning -- but strata themselves can also be dead weight.&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>showdev</category>
      <category>sql</category>
      <category>database</category>
    </item>
    <item>
      <title>A Self-Sourcing Cassandra Cluster with SaltStack and EC2</title>
      <dc:creator>Dian Fay</dc:creator>
      <pubDate>Wed, 27 Mar 2019 00:00:00 +0000</pubDate>
      <link>https://forem.com/dmfay/a-self-sourcing-cassandra-cluster-with-saltstack-and-ec2-48kd</link>
      <guid>https://forem.com/dmfay/a-self-sourcing-cassandra-cluster-with-saltstack-and-ec2-48kd</guid>
      <description>&lt;p&gt;Anybody doing something interesting to a production Cassandra cluster is generally advised, for a host of excellent reasons, to try it out in a test environment first. Here's how to make those environments effectively disposable.&lt;/p&gt;

&lt;p&gt;The something interesting we're trying to do to our Cassandra cluster is actually two somethings: upgrading from v2 to v3, while also factoring Cassandra itself out from the group of EC2 servers that currently run Cassandra-and-also-some-other-important-stuff. We have a "pets" situation and want a "cattle" situation, per Bill Baker: pets have names and you care deeply about each one's welfare, while cattle are, not to put too fine a point on it, fungible. If we can bring new dedicated nodes into the cluster, start removing the original nodes as replication takes its course, and finally upgrade this Database of Theseus, that'll be some significant progress -- and without downtime, even! But it's going to take a lot of testing, to say nothing of managing the new nodes for real.&lt;/p&gt;

&lt;p&gt;We already use SaltStack to monitor and manage other areas of our infrastructure besides the data pipeline, and SaltStack includes a "salt-cloud" module which can work with EC2. I'd rather have a single infra-as-code solution, so that part's all good. What isn't: the &lt;a href="https://github.com/salt-formulas/salt-formula-cassandra"&gt;official Cassandra formula&lt;/a&gt; is geared more towards single-node instances or some-assembly-required clusters, and provisioning is a separate concern. I expect to be creating and destroying clusters with abandon, so I need this to be as automatic as possible.&lt;/p&gt;

&lt;h2&gt;
  
  
  Salt-Cloud Configuration
&lt;/h2&gt;

&lt;p&gt;The first part of connecting salt-cloud is to set up a provider and profile. On the Salt master, these are in /etc/cloud.providers.d and /etc/cloud.profiles.d. We keep everything in source control and symlink these directories.&lt;/p&gt;

&lt;p&gt;Our cloud stuff is hosted on AWS, so we're using the &lt;a href="https://docs.saltstack.com/en/latest/topics/cloud/aws.html"&gt;EC2 provider&lt;/a&gt;. That part is basically stock, but in profiles we do need to define a template for the Cassandra nodes themselves.&lt;/p&gt;
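&lt;p&gt;For reference, a minimal provider definition looks something like this (abridged, with placeholder credentials and paths; see the linked salt-cloud documentation for the full set of options):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;my-ec2-provider:
  driver: ec2
  id: 'YOUR_AWS_ACCESS_KEY'
  key: 'YOUR_AWS_SECRET_KEY'
  keyname: your-keypair-name
  private_key: /etc/salt/your-keypair.pem
  location: us-west-2
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The top-level name (&lt;code&gt;my-ec2-provider&lt;/code&gt; here) is what the profile's &lt;code&gt;provider&lt;/code&gt; field refers back to.&lt;/p&gt;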

&lt;h3&gt;
  
  
  etc/cloud.profiles.d/ec2.conf
&lt;/h3&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cassandra_node:
  provider: [your provider from etc/cloud.providers.d/ec2.conf]
  image: ami-abc123
  ssh_interface: private_ips
  size: m5.large
  securitygroup:
    - default
    - others
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  cassandra-test.map
&lt;/h3&gt;

&lt;p&gt;With the &lt;code&gt;cassandra_node&lt;/code&gt; template defined in the profile configuration, we can establish the cluster layout in a &lt;em&gt;map file&lt;/em&gt;. The filename doesn't matter; mine is cassandra-test.map. One important thing to note is that we're establishing a naming convention for our nodes: &lt;code&gt;cassandra-*&lt;/code&gt;. Each node is also defined as &lt;code&gt;t2.small&lt;/code&gt; size, overriding the default &lt;code&gt;m5.large&lt;/code&gt; -- we don't need all that horsepower while we're just testing! &lt;code&gt;t2.micro&lt;/code&gt; instances, however, did prove to be too underpowered to run Cassandra.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cassandra_node:
  - cassandra-1:
      size: t2.small
      cassandra-seed: true
  - cassandra-2:
      size: t2.small
      cassandra-seed: true
  - cassandra-3:
      size: t2.small
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;code&gt;cassandra-seed&lt;/code&gt; (and &lt;code&gt;size&lt;/code&gt;, for that matter) is a &lt;em&gt;grain&lt;/em&gt;, a fact each Salt-managed "minion" knows about itself. When Cassandra comes up in a multi-node configuration, each node looks for help joining the cluster from a list of "seed" nodes. Without seeds, nothing can join the cluster; however, only non-seeds will bootstrap data from the seeds on joining so it's not a good idea to make everything a seed. And the seed layout needs to toposort: if A has B and C for seeds, B has A and C, and C has A and B, it's the same situation as no seeds. If two instances know that they're special somehow, we can use grain matching to target them specifically.&lt;/p&gt;
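&lt;p&gt;Once minions are up, grain matching lets us address the seeds directly; for example, a quick sanity check that both seed nodes are responding:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo salt -G 'cassandra-seed:true' test.ping
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;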

&lt;h2&gt;
  
  
  Pillar and Mine
&lt;/h2&gt;

&lt;p&gt;The Salt "pillar" is a centralized configuration database stored on the master. Minions make local copies on initialization, and their caches can be updated with &lt;code&gt;salt minion-name saltutil.refresh_pillar&lt;/code&gt;. Pillars can target nodes based on name, grains, or other criteria, and are commonly used to store configuration. We have a lot of configuration, and most of it will be the same for all nodes, so using pillars is a natural fit.&lt;/p&gt;

&lt;h3&gt;
  
  
  srv/salt/pillar/top.sls
&lt;/h3&gt;

&lt;p&gt;Like the &lt;code&gt;top.sls&lt;/code&gt; for Salt itself, the Pillar &lt;code&gt;top.sls&lt;/code&gt; defines a &lt;em&gt;highstate&lt;/em&gt; or default state for new minions. First, we declare that the pillars we're adding appertain to minions whose names match the pattern &lt;code&gt;cassandra-*&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;base:
  'cassandra-*':
    - system-user-ubuntu
    - mine-network-info
    - java
    - cassandra
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  srv/salt/pillar/system-user-ubuntu.sls
&lt;/h3&gt;

&lt;p&gt;Nothing special here, just a user so we can ssh in and poke things. The private key for the user is defined in the cloud provider configuration.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;system:
  user: ubuntu
  home: /home/ubuntu
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  srv/salt/pillar/mine-network-info.sls
&lt;/h3&gt;

&lt;p&gt;The Salt "mine" is another centralized database, this one storing grain information so minions can retrieve facts about other minions from the master instead of dealing with peer-to-peer communication. Minions use a &lt;code&gt;mine_functions&lt;/code&gt; pillar (or salt-minion configuration, but we're sticking with the pillar) to determine whether and what to store. For Cassandra nodes, we want internal network configuration and the public DNS name, which latter each node has to get by asking AWS where it is with &lt;code&gt;curl&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mine_functions:
  network.interfaces: [eth0]
  network.ip_addrs: [eth0]
  # ask amazon's network config what we're public as
  public_dns:
    - mine_function: cmd.run
    - 'curl -s http://169.254.169.254/latest/meta-data/public-hostname'
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  srv/salt/pillar/java.sls
&lt;/h3&gt;

&lt;p&gt;Cassandra requires Java 8 to be installed (&lt;a href="https://issues.apache.org/jira/browse/CASSANDRA-9608"&gt;prospective Java 9 support became prospective Java 11 support&lt;/a&gt; and is due with Cassandra 4). This pillar sets up the &lt;a href="https://github.com/saltstack-formulas/sun-java-formula"&gt;official Java formula&lt;/a&gt; accordingly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;java:
  # vitals
  release: '8'
  major: '0'
  minor: '202'
  development: false

  # tarball
  prefix: /usr/share/java # unpack here
  version_name: jdk1.8.0_202 # root directory name
  source_url: https://download.oracle.com/otn-pub/java/jdk/8u202-b08/1961070e4c9b4e26a04e7f5a083f551e/server-jre-8u202-linux-x64.tar.gz
  #source_hash: sha256=9efb1493fcf636e39c94f47bacf4f4324821df2d3aeea2dc3ea1bdc86428cb82
  source_hash: sha256=61292e9d9ef84d9702f0e30f57b208e8fbd9a272d87cd530aece4f5213c98e4e
  dl_opts: -b oraclelicense=accept-securebackup-cookie -L
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  srv/salt/pillar/cassandra.sls
&lt;/h3&gt;

&lt;p&gt;Finally, the Cassandra pillar defines properties common to all nodes in the cluster. My upgrade plan is to bring everything up on 2.2.12, switch the central pillar definition over, and then supply the new version number to each minion by refreshing its pillar as part of the upgrade process.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cassandra:
  version: '2.2.12'
  cluster_name: 'Test Cluster'
  authenticator: 'AllowAllAuthenticator'
  endpoint_snitch: 'Ec2Snitch'
  twcs_jar:
    '2.2.12': 'TimeWindowCompactionStrategy-2.2.5.jar'
    '3.0.8': 'TimeWindowCompactionStrategy-3.0.0.jar'
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;twcs_jar&lt;/code&gt; dictionary gets into one of the reasons I'm not using the official formula: we're using the &lt;a href="http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html"&gt;TimeWindowCompactionStrategy&lt;/a&gt;. TWCS was integrated into Cassandra starting in 3.0.8 or 3.8, but it has to be compiled and installed separately for earlier versions. Pre-integration versions of TWCS also have a different package name (&lt;code&gt;com.jeffjirsa&lt;/code&gt; instead of &lt;code&gt;org.apache&lt;/code&gt;). 3.0.8 is the common point, having the &lt;code&gt;org.apache&lt;/code&gt; TWCS built in but also being a valid compilation target for the &lt;code&gt;com.jeffjirsa&lt;/code&gt; TWCS. After upgrading to 3.0.8 I'll be able to &lt;code&gt;ALTER TABLE&lt;/code&gt; to apply the &lt;code&gt;org.apache&lt;/code&gt; version before proceeding.&lt;/p&gt;
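&lt;p&gt;That &lt;code&gt;ALTER TABLE&lt;/code&gt; is a one-statement change per affected table, along these lines (the keyspace, table, and window settings here are hypothetical):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ALTER TABLE metrics.events
  WITH compaction = {
    'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': 1
  };
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;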

&lt;p&gt;With the provider, profile, map file, and pillars set up, we can now spin up a barebones cluster of Ubuntu VMs and retrieve the centrally-stored network information from the Salt mine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo salt-cloud -m cassandra-test.map

sudo salt 'cassandra-1' 'mine.get' '*' 'public_dns'
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;We can't do much else, since we don't have anything installed on the nodes yet, but it's progress!&lt;/p&gt;

&lt;h2&gt;
  
  
  The Cassandra State
&lt;/h2&gt;

&lt;p&gt;The state definition includes everything a Cassandra node &lt;em&gt;has&lt;/em&gt; to have in order to be part of the cluster: the installed binaries, a &lt;code&gt;cassandra&lt;/code&gt; group and user, a config file, a data directory, and a running SystemD unit. The definition itself is sort of an ouroboros of YAML and Jinja:&lt;/p&gt;

&lt;h3&gt;
  
  
  srv/salt/cassandra/defaults.yaml
&lt;/h3&gt;

&lt;p&gt;First, there's a perfectly ordinary YAML file with some defaults. These could easily be in the pillar we set up above (or the pillar config could all be in this file); the principal distinction seems to be in whether you want to propagate changes by refreshing pillars with &lt;code&gt;saltutil.refresh_pillar&lt;/code&gt; or by (re)applying the Cassandra state either directly or via highstate. This is definitely more complicated than it needs to be right now, but given that this is my first major SaltStack project, I don't yet know enough to land on one side or the other, or if combining a defaults file with the pillar configuration will eventually be necessary.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cassandra:
  dc: dc1
  rack: rack1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  srv/salt/cassandra/map.jinja
&lt;/h3&gt;

&lt;p&gt;The map template loads the defaults file and merges them with the pillar, creating a &lt;code&gt;server&lt;/code&gt; dictionary with all the Cassandra parameters we're setting.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{% import_yaml "cassandra/defaults.yaml" as default_settings %}

{% set server = salt['pillar.get']('cassandra', default=default_settings.cassandra, merge=True) %}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
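&lt;p&gt;The &lt;code&gt;merge=True&lt;/code&gt; argument matters here: pillar values override the defaults key by key rather than replacing the whole dictionary. A minimal Python sketch of that behavior (an illustration of the semantics, not Salt's actual implementation):&lt;/p&gt;

```python
def merge(defaults, overrides):
    """Layer overrides on top of defaults, recursing into nested
    dictionaries so overrides win key by key."""
    result = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = value
    return result

# defaults.yaml contents alongside (a subset of) the cassandra pillar
defaults = {"dc": "dc1", "rack": "rack1"}
pillar = {"version": "2.2.12", "cluster_name": "Test Cluster"}

server = merge(defaults, pillar)
```

&lt;p&gt;So every node ends up with &lt;code&gt;dc&lt;/code&gt; and &lt;code&gt;rack&lt;/code&gt; from the defaults file unless a pillar explicitly overrides them.&lt;/p&gt;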



&lt;h3&gt;
  
  
  srv/salt/cassandra/init.sls
&lt;/h3&gt;

&lt;p&gt;Finally, the Cassandra state entrypoint init.sls is another Jinja template that happens to look a lot like a YAML file and renders a YAML file, which for SaltStack is good enough. Jinja is required here since values from the &lt;code&gt;server&lt;/code&gt; dictionary, like the server version or the TWCS JAR filename, need to be interpolated at the time the state is applied.&lt;/p&gt;

&lt;p&gt;When the Cassandra state is applied to a fresh minion:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;wget&lt;/code&gt; will be installed&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;CASSANDRA_VERSION&lt;/code&gt; environment variable will be set to the value defined in the pillar&lt;/li&gt;
&lt;li&gt;A user and group named &lt;code&gt;cassandra&lt;/code&gt; will be created&lt;/li&gt;
&lt;li&gt;A script named &lt;code&gt;install.sh&lt;/code&gt; will download and extract Cassandra itself, once the above three conditions are met&lt;/li&gt;
&lt;li&gt;A node configuration file named &lt;code&gt;cassandra.yaml&lt;/code&gt; will be generated from a Jinja template and installed to &lt;code&gt;/etc/cassandra&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;If necessary, the TWCS jar will be added to the Cassandra lib directory&lt;/li&gt;
&lt;li&gt;The directory &lt;code&gt;/var/lib/cassandra&lt;/code&gt; will be created and chowned to the &lt;code&gt;cassandra&lt;/code&gt; user&lt;/li&gt;
&lt;li&gt;A SystemD unit for Cassandra will be installed and started once all its prerequisites are in order
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{% from "cassandra/map.jinja" import server with context %}

wget:
  pkg.installed

cassandra:
  environ.setenv:
    - name: CASSANDRA_VERSION
    - value: {{ server.version }}

  cmd.script:
    - require:
      - pkg: wget
      - user: cassandra
      - environ: CASSANDRA_VERSION
    - source: salt://cassandra/files/install.sh
    - user: root
    - cwd: ~

  group.present: []

  user.present:
    - require:
      - group: cassandra
    - gid_from_name: True
    - createhome: False

  service.running:
    - enable: True
    - require:
      - file: /etc/cassandra/cassandra.yaml
      - file: /etc/systemd/system/cassandra.service
{%- if server.twcs_jar[server.version] %}
      - file: /opt/cassandra/lib/{{ server.twcs_jar[server.version] }}
{%- endif %}

# Main configuration
/etc/cassandra/cassandra.yaml:
  file.managed:
    - source: salt://cassandra/files/{{ server.version }}/cassandra.yaml
    - template: jinja
    - makedirs: True
    - user: cassandra
    - group: cassandra
    - mode: 644

# Load TWCS jar if necessary
{%- if server.twcs_jar[server.version] %}
/opt/cassandra/lib/{{ server.twcs_jar[server.version] }}:
  file.managed:
    - require:
      - user: cassandra
      - group: cassandra
    - source: salt://cassandra/files/{{ server.version }}/{{ server.twcs_jar[server.version] }}
    - user: cassandra
    - group: cassandra
    - mode: 644
{%- endif %}

# Data directory
/var/lib/cassandra:
  file.directory:
    - user: cassandra
    - group: cassandra
    - mode: 755

# SystemD unit
/etc/systemd/system/cassandra.service:
  file.managed:
    - source: salt://cassandra/files/cassandra.service
    - user: root
    - group: root
    - mode: 644
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  srv/salt/cassandra/files/install.sh
&lt;/h3&gt;

&lt;p&gt;This script downloads and extracts the target version of Cassandra and points the symlink &lt;code&gt;/opt/cassandra&lt;/code&gt; to it. If the target version already exists, it just updates the symlink since everything else is already set up.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#!/bin/bash

update_symlink() {
  rm /opt/cassandra
  ln -s "/opt/apache-cassandra-$CASSANDRA_VERSION" /opt/cassandra

  echo "Updated symlink"
}

# already installed?
if [ -d "/opt/apache-cassandra-$CASSANDRA_VERSION" ]; then
  echo "Cassandra $CASSANDRA_VERSION is already installed!"

  update_symlink

  exit 0
fi

# download and extract
wget "https://archive.apache.org/dist/cassandra/$CASSANDRA_VERSION/apache-cassandra-$CASSANDRA_VERSION-bin.tar.gz"
tar xf "apache-cassandra-$CASSANDRA_VERSION-bin.tar.gz"
rm "apache-cassandra-$CASSANDRA_VERSION-bin.tar.gz"

# install to /opt and link /opt/cassandra
mv "apache-cassandra-$CASSANDRA_VERSION" /opt
update_symlink

# create log directory
mkdir -p /opt/cassandra/logs

# set ownership
chown -R cassandra:cassandra "/opt/apache-cassandra-$CASSANDRA_VERSION"
chown cassandra:cassandra /opt/cassandra
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;It's probably possible to do most of this, at least the symlink juggling and directory management, with "pure" Salt (and the environment variable could be eliminated by rendering &lt;code&gt;install.sh&lt;/code&gt; as a Jinja template with the &lt;code&gt;server&lt;/code&gt; dictionary), but the script does what I want it to and it's already idempotent and centrally managed.&lt;/p&gt;

&lt;h3&gt;
  
  
  srv/salt/cassandra/files/cassandra.service
&lt;/h3&gt;

&lt;p&gt;This is a basic SystemD unit, with some system limits customized to give Cassandra enough room to run. It starts whatever Cassandra executable it finds at /opt/cassandra, so all that's necessary to resume operations after the symlink changes during the upgrade is to restart the service.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Unit]
Description=Apache Cassandra database server
Documentation=http://cassandra.apache.org
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=forking
User=cassandra
Group=cassandra
ExecStart=/opt/cassandra/bin/cassandra -Dcassandra.config=file:///etc/cassandra/cassandra.yaml
LimitNOFILE=100000
LimitNPROC=32768
LimitMEMLOCK=infinity
LimitAS=infinity

[Install]
WantedBy=multi-user.target
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  srv/salt/cassandra/files/2.2.12/cassandra.yaml
&lt;/h3&gt;

&lt;p&gt;The full &lt;code&gt;cassandra.yaml&lt;/code&gt; is enormous, so I won't reproduce it here in full. The interesting parts are where values are being automatically interpolated by Salt. Like the Cassandra state, this is actually a Jinja template which &lt;em&gt;renders&lt;/em&gt; a YAML file.&lt;/p&gt;

&lt;p&gt;First, we get a list of internal IP addresses corresponding to &lt;code&gt;cassandra-seed&lt;/code&gt; minions from the Salt mine and build a list of &lt;code&gt;known_seeds&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{%- from 'cassandra/map.jinja' import server with context -%}
{% set known_seeds = [] %}
{% for minion, ip_array in salt['mine.get']('cassandra-seed:true', 'network.ip_addrs', 'grain').items() if ip_array is not sameas false and known_seeds|length &amp;lt; 2 %}
{% for ip in ip_array %}
{% do known_seeds.append(ip) %}
{% endfor %}
{% endfor %}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;This becomes the list of seeds the node looks for when trying to join the cluster.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "{{ known_seeds|unique|join(',') }}"
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
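&lt;p&gt;Taken together, the loop and filter above collect at most two seed minions' addresses and deduplicate them into the &lt;code&gt;seeds&lt;/code&gt; string. The same logic in plain Python, with an assumed shape for the mine return value:&lt;/p&gt;

```python
# What mine.get('cassandra-seed:true', 'network.ip_addrs', 'grain')
# might return; the shape is assumed for illustration
mine = {
    "cassandra-1": ["10.0.0.11"],
    "cassandra-2": ["10.0.0.12"],
    "cassandra-3": False,  # a minion whose mine call failed
}

known_seeds = []
for minion, ip_array in mine.items():
    # mirror `is not sameas false`: skip failed returns
    if ip_array is False:
        continue
    # mirror the known_seeds|length filter: stop once we have two
    if min(len(known_seeds), 2) == 2:
        break
    known_seeds.extend(ip_array)

# the template's |unique|join(',') step, preserving order
seeds = ",".join(dict.fromkeys(known_seeds))
```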



&lt;p&gt;Listen and broadcast addresses are configured per node. The broadcast addresses are a little special due to our network configuration needs: each node has to get its public dns name from the Salt mine. This is perhaps a bit overcomplicated compared to a custom grain or capturing the output from running the Salt modules at render time, but it's there and it works and at this point messing with it isn't a great use of time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;listen_address: {{ grains['fqdn'] }}
broadcast_address: {{ salt['mine.get'](grains['id'], 'public_dns').items()[0][1] }}
rpc_address: {{ grains['fqdn'] }}
broadcast_rpc_address: {{ salt['mine.get'](grains['id'], 'public_dns').items()[0][1] }}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The cluster name and other central settings are interpolated from the pillar+defaults &lt;code&gt;server&lt;/code&gt; dictionary.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cluster_name: "{{ server.cluster_name }}"
...
authenticator: "{{ server.authenticator }}"
...
endpoint_snitch: "{{ server.endpoint_snitch }}"
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The changes to the Cassandra 3.0.8 configuration are identical.&lt;/p&gt;

&lt;h3&gt;
  
  
  srv/salt/cassandra/files/2.2.12/TimeWindowCompactionStrategy-2.2.5.jar
&lt;/h3&gt;

&lt;p&gt;See &lt;a href="http://thelastpickle.com/blog/2017/01/10/twcs-part2.html"&gt;this post on TheLastPickle&lt;/a&gt; for directions on building the TWCS jar.&lt;/p&gt;

&lt;h2&gt;
  
  
  Highstate
&lt;/h2&gt;

&lt;p&gt;Finally, the Salt highstate needs to ensure that our &lt;code&gt;cassandra-*&lt;/code&gt; nodes have the Java and Cassandra states applied. Since Salt-Cloud minions come configured, however, we have to ensure the default &lt;code&gt;salt.minion&lt;/code&gt; state is excluded from our Cassandra nodes since otherwise a highstate will blow away the cloud-specific configuration.&lt;/p&gt;

&lt;h3&gt;
  
  
  srv/salt/top.sls changes
&lt;/h3&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;base:
  'not cassandra-*':
    - match: compound
    - salt.minion
  'cassandra-*':
    - sun-java
    - sun-java.env
    - cassandra
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
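&lt;p&gt;With the top file in place, converging the whole cluster onto its expected state is a single command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo salt 'cassandra-*' state.highstate
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;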



&lt;h2&gt;
  
  
  Startup!
&lt;/h2&gt;

&lt;p&gt;Set the Salt config dir to &lt;code&gt;etc&lt;/code&gt; with &lt;code&gt;-c&lt;/code&gt; and pass in the map file with &lt;code&gt;-m&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo salt-cloud -c etc -m cassandra-test.map
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;To clean up:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo salt-cloud -d cassandra-1 cassandra-2 cassandra-3
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



</description>
      <category>cassandra</category>
      <category>devops</category>
      <category>saltstack</category>
      <category>aws</category>
    </item>
    <item>
      <title>Automatic Node Deploys to Elastic Beanstalk</title>
      <dc:creator>Dian Fay</dc:creator>
      <pubDate>Mon, 08 Oct 2018 00:00:00 +0000</pubDate>
      <link>https://forem.com/dmfay/automatic-node-deploys-to-elastic-beanstalk-d6h</link>
      <guid>https://forem.com/dmfay/automatic-node-deploys-to-elastic-beanstalk-d6h</guid>
      <description>&lt;p&gt;One of my favorite good ideas to ignore is the maxim that you should have your deployment pipeline ready to go before you start writing code. There's always some wrinkle you couldn't have anticipated anyway, so while it sounds good on paper I just don't think it's the best possible use of time. But with anything sufficiently complicated, there's a point where you just have to buckle down and automate rather than waste time repeating the same steps yet again (or, worse, forgetting one). I hit that point recently: the application isn't in production yet, so I'd been "deploying" by means of pulling the repo on an EC2 server, installing dependencies and building in-place, then killing and restarting the node process with &lt;code&gt;nohup&lt;/code&gt;. Good enough for demos, not sustainable long-term. Also, I might have in fact missed a step Friday before last and not realized things were mostly broken until the following Monday.&lt;/p&gt;

&lt;p&gt;I'd been using CircleCI to build and test the application already, so I wanted to stick with it for deployment as well. However, this precluded using the same EC2 instance: the build container would need to connect to it to run commands over SSH, &lt;em&gt;but&lt;/em&gt; this connection would be coming from any of a huge possible range of build container IP addresses. I didn't want to open the server up to the whole world to accommodate the build system. Eventually I settled on Elastic Beanstalk, which can be controlled through the AWS command-line interface with the proper credentials instead of the morass of VPCs and security groups. Just upload a zip file!&lt;/p&gt;

&lt;p&gt;The cost of using Elastic Beanstalk, it turned out, was that while it made difficult things easy it also made easy things difficult. How do you deploy the same application to different environments? You don't. Everything has to be in that zip file, and if that includes any per-environment configuration then the right config files had better be where they're expected to be. This is less than ideal, but at least it can be scripted. Here's the whole thing (assuming &lt;code&gt;awscli&lt;/code&gt; has already been installed):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# what time is it?&lt;/span&gt;
&lt;span class="nv"&gt;TIMESTAMP&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d%H%M%S&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# work around Elastic Beanstalk permissions for node-gyp (bcrypt)&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"unsafe-perm=true"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; .npmrc

&lt;span class="c"&gt;# generate artifacts&lt;/span&gt;
npm run build

&lt;span class="c"&gt;# download config&lt;/span&gt;
aws s3 &lt;span class="nb"&gt;cp &lt;/span&gt;s3://elasticbeanstalk-bucket-name/app/development.config.json &lt;span class="nb"&gt;.&lt;/span&gt;

&lt;span class="c"&gt;# zip everything up&lt;/span&gt;
zip &lt;span class="nt"&gt;-r&lt;/span&gt; app-dev.zip &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--exclude&lt;/span&gt; &lt;span class="s2"&gt;"node_modules/*"&lt;/span&gt; &lt;span class="s2"&gt;".git/*"&lt;/span&gt; &lt;span class="s2"&gt;"coverage/*"&lt;/span&gt; &lt;span class="s2"&gt;".nyc_output/*"&lt;/span&gt; &lt;span class="s2"&gt;"test/*"&lt;/span&gt; &lt;span class="s2"&gt;".circleci/*"&lt;/span&gt;

&lt;span class="c"&gt;# upload to s3&lt;/span&gt;
aws s3 &lt;span class="nb"&gt;mv&lt;/span&gt; ./app-dev.zip s3://elasticbeanstalk-bucket-name/app/app-dev-&lt;span class="nv"&gt;$TIMESTAMP&lt;/span&gt;.zip

&lt;span class="c"&gt;# create new version&lt;/span&gt;
aws elasticbeanstalk create-application-version &lt;span class="nt"&gt;--region&lt;/span&gt; us-west-2 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--application-name&lt;/span&gt; app &lt;span class="nt"&gt;--version-label&lt;/span&gt; development-&lt;span class="nv"&gt;$TIMESTAMP&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--source-bundle&lt;/span&gt; &lt;span class="nv"&gt;S3Bucket&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;elasticbeanstalk-bucket-name,S3Key&lt;span class="o"&gt;=&lt;/span&gt;app/app-dev-&lt;span class="nv"&gt;$TIMESTAMP&lt;/span&gt;.zip

&lt;span class="c"&gt;# deploy to dev environment&lt;/span&gt;
&lt;span class="c"&gt;# --application-name app is not specified because apt installs&lt;/span&gt;
&lt;span class="c"&gt;# an older version of awscli which doesn't accept that option&lt;/span&gt;
aws elasticbeanstalk update-environment &lt;span class="nt"&gt;--region&lt;/span&gt; us-west-2 &lt;span class="nt"&gt;--environment-name&lt;/span&gt; app-dev &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--version-label&lt;/span&gt; development-&lt;span class="nv"&gt;$TIMESTAMP&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;TIMESTAMP&lt;/code&gt; ensures the build can be uniquely identified later. The &lt;code&gt;.npmrc&lt;/code&gt; setting is for AWS reasons: as detailed in &lt;a href="https://stackoverflow.com/questions/46001516/beanstalk-node-js-deployment-node-gyp-fails-due-to-permission-denied"&gt;this StackOverflow answer&lt;/a&gt;, the unfortunately-acronymed &lt;code&gt;node-gyp&lt;/code&gt; runs as the instance's ec2-user account and doesn't have the permissions it needs to compile bcrypt. If you're not using bcrypt (or another project that involves a &lt;code&gt;node-gyp&lt;/code&gt; step on install), you don't need that line.&lt;/p&gt;

&lt;p&gt;The zip is assembled in three steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;npm run build&lt;/code&gt; compiles stylesheets, dynamic Pug templates, frontend JavaScript, and so forth.&lt;/li&gt;
&lt;li&gt;The appropriate environment config is downloaded from an S3 bucket.&lt;/li&gt;
&lt;li&gt;Everything is rolled together in the zip file, minus the detritus of source control and test results.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Finally, the Elastic Beanstalk deploy happens in two stages:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;aws elasticbeanstalk create-application-version&lt;/code&gt; does what it sounds like: each timestamped zip file becomes a new "version". Since the bundle bakes in per-environment configuration, these don't map exactly to versions as more commonly understood, so naming them for the target environment and including the timestamp helps identify them.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;aws elasticbeanstalk update-environment&lt;/code&gt; actually deploys the newly-created "version" to the destination environment.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Obviously, when it comes time to roll the project out to production, I'll factor the environment out into a variable to download and upload the appropriate artifacts. But even in its current state, this one small script has almost made deployment continuous: every pushed commit gets deployed to Elastic Beanstalk with no manual intervention, unless there are database changes. That's next.&lt;/p&gt;
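&lt;p&gt;As a sketch of that factoring (nothing here actually talks to AWS; the bucket, application, and environment names simply mirror the script above, with the environment passed as a positional argument):&lt;/p&gt;

```shell
# Hypothetical parameterization of the deploy script: the target environment
# comes in as the first argument, defaulting to dev.
ENVIRONMENT="${1:-dev}"
TIMESTAMP=$(date +%Y%m%d%H%M%S)

# Derive all environment-specific names in one place.
BUNDLE="app-$ENVIRONMENT.zip"
S3_KEY="app/app-$ENVIRONMENT-$TIMESTAMP.zip"
VERSION_LABEL="$ENVIRONMENT-$TIMESTAMP"

# The config download and aws calls then use the variables, e.g.:
# aws s3 cp "s3://elasticbeanstalk-bucket-name/app/$ENVIRONMENT.config.json" .
# aws s3 mv "./$BUNDLE" "s3://elasticbeanstalk-bucket-name/$S3_KEY"
# aws elasticbeanstalk create-application-version --region us-west-2 \
#   --application-name app --version-label "$VERSION_LABEL" \
#   --source-bundle S3Bucket=elasticbeanstalk-bucket-name,S3Key="$S3_KEY"
# aws elasticbeanstalk update-environment --region us-west-2 \
#   --environment-name "app-$ENVIRONMENT" --version-label "$VERSION_LABEL"
```

&lt;p&gt;CircleCI (or any CI system) would then invoke it as &lt;code&gt;./deploy.sh dev&lt;/code&gt; or &lt;code&gt;./deploy.sh prod&lt;/code&gt;, with &lt;code&gt;deploy.sh&lt;/code&gt; being a hypothetical name for the script.&lt;/p&gt;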

</description>
      <category>node</category>
      <category>cloud</category>
      <category>bash</category>
      <category>circleci</category>
    </item>
    <item>
      <title>What's your atomic bomb?</title>
      <dc:creator>Dian Fay</dc:creator>
      <pubDate>Mon, 10 Sep 2018 10:11:42 +0000</pubDate>
      <link>https://forem.com/dmfay/whats-your-atomic-bomb-20ob</link>
      <guid>https://forem.com/dmfay/whats-your-atomic-bomb-20ob</guid>
      <description>&lt;p&gt;Inspired by &lt;a href="https://dev.to/bertilmuth/software-that-helps-33ge"&gt;Software that helps&lt;/a&gt;, except I disagree with Bertil's implicit assertion that all software is helpful in the long run :) The development of the atomic bomb is the classic example of how diffusion of responsibility through a team and/or management structure helps people rationalize working on all kinds of stuff. Likely none of us have reached the ethical depths plumbed by Oppenheimer and his team, but what &lt;em&gt;aren't&lt;/em&gt; you proud of?&lt;/p&gt;

&lt;p&gt;I worked in advertising for a few years, making it easier to waste paper and annoy people worldwide with those mailers from car dealerships you throw away without looking at. I'm never working in advertising again.&lt;/p&gt;

</description>
      <category>discuss</category>
    </item>
    <item>
      <title>Surrealist Remixes with Markov Chains</title>
      <dc:creator>Dian Fay</dc:creator>
      <pubDate>Sun, 05 Aug 2018 00:00:00 +0000</pubDate>
      <link>https://forem.com/dmfay/surrealist-remixes-with-markov-chains-22pp</link>
      <guid>https://forem.com/dmfay/surrealist-remixes-with-markov-chains-22pp</guid>
      <description>&lt;p&gt;There's a new button at the bottom of this (and each) post. Try clicking it! (If you're reading this on &lt;a href="https://dev.to"&gt;dev.to&lt;/a&gt; or an RSS reader, you'll need to visit &lt;a href="https://di.nmfay.com/markov-remix"&gt;di.nmfay.com&lt;/a&gt; to see it)&lt;/p&gt;

&lt;p&gt;By now everyone's run into Twitter bots and automated text generators that combine words in ways that &lt;em&gt;almost&lt;/em&gt; compute. There's even a &lt;a href="https://www.reddit.com/r/SubredditSimulator/"&gt;subreddit&lt;/a&gt; that runs the user-generated content of other subreddits through individual accounts which make posts that seem vaguely representative of their sources, but either defy comprehension or break through into a sublime silliness.&lt;/p&gt;

&lt;p&gt;People have engaged in wordplay (and word-work) for as long as we've communicated with words. Taking language apart and putting it back together in novel ways has been the domain of poets, philosophers, and magicians alike for eons, to say nothing of puns, dad jokes, glossolalia, and word salad.&lt;/p&gt;

&lt;p&gt;In the early 20th century, artists associated with the Surrealist movement played a game, variously for entertainment and inspiration, called "exquisite corpse". Each player writes a word (in this version, everyone is assigned a part of speech ahead of time) or draws on an exposed section of paper, then folds the sheet over to obscure their work from the next player. Once everyone's had a turn, the full sentence or picture is revealed. The game takes its name from its first recorded result: &lt;em&gt;le cadavre exquis boira le vin nouveau&lt;/em&gt;, or "the exquisite corpse shall drink the new wine".&lt;/p&gt;

&lt;p&gt;The Surrealist seeds fell on fertile ground and their ideas spread throughout the artistic and literary world, just as they themselves had been informed by earlier avant-garde movements like Symbolism and Dada. In the mid-century, writers and occultists like Brion Gysin and William Burroughs used similar techniques to discover new meanings in old texts. The only real difference in our modern toys is that they run on their own -- it's a little bit horror movie ouija board, except you can see the workings for yourself.&lt;/p&gt;

&lt;p&gt;There are a variety of ways to implement this kind of functionality. On the more primitive side, you have "mad libs" algorithms which select random values to insert into known placeholders, as many Twitter bots such as &lt;a href="https://twitter.com/godtributes"&gt;@godtributes&lt;/a&gt; or &lt;a href="https://twitter.com/bottest_takes"&gt;@bottest_takes&lt;/a&gt; do. This method runs up against obvious limitations fairly quickly: the set of substitutions is finite, and the structure they're substituted into likewise becomes predictable.&lt;/p&gt;

&lt;p&gt;More advanced text generators are predictive, reorganizing words or phrases from a body of text or &lt;em&gt;corpus&lt;/em&gt; in ways which reflect the composition of the corpus itself: words aren't simply jumbled up at random, but follow each other in identifiable sequences. Many generators like these run on Markov chains, probabilistic state machines where the next state is a function only of the current state.&lt;/p&gt;

&lt;h2&gt;Implementing a Textual Markov Chain&lt;/h2&gt;

&lt;p&gt;The first order of business in using a Markov chain to generate text is to break up the original corpus. Regular expressions matching whitespace make that easy enough, turning it into an array of words. The next step is to establish the links between states, which is where things start getting a little complex.&lt;/p&gt;

&lt;p&gt;Textual Markov chains have one important parameter: the prefix length, which defines how many previous states (words) comprise the current state and must be evaluated to find potential next states. Prefixes must comprise at least one word, but for the purposes of natural-seeming text generation the sweet spot tends to be between two and four words depending on corpus length. With too short a prefix length, the output tends to be simply garbled; too long a prefix or too short a corpus, and there may be too few potential next states for the chain to diverge from the original text.&lt;/p&gt;

&lt;p&gt;Mapping prefixes to next states requires a sliding window on the array. This is more easily illustrated. Here's a passage from &lt;em&gt;Les Chants de Maldoror&lt;/em&gt;, a 19th-century prose poem rediscovered and given new fame (or infamy) by the Surrealists, who identified in its obscene grandeur a deconstruction of language and the still-developing format of the modern novel that prefigured their own artistic ideology:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;He is as fair as the retractility of the claws of birds of prey; or again, as the uncertainty of the muscular movements in wounds in the soft parts of the lower cervical region; or rather, as that perpetual rat-trap always reset by the trapped animal, which by itself can catch rodents indefinitely and work even when hidden under straw; and above all, as the chance meeting on a dissecting-table of a sewing-machine and an umbrella!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Assuming a prefix length of 2, the mapping might start to take this shape:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"He is": ["as"],
"is as": ["fair"],
"as fair": ["as"],
"fair as": ["the"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Starting from the first prefix ("He is"), there is only one next state possible since the words "He is" only appear once in the corpus. Upon reaching the next state, the active prefix is now "is as", which likewise has only one possible next state, and so forth. But when the current state reaches "as the", the next word to be added may be "retractility", "uncertainty", or "chance", and what happens after that depends on the route taken. Multiple next states introduce the potential for divergence; this is also why having too long a prefix length, or too short a corpus, results in uninteresting output!&lt;/p&gt;
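&lt;p&gt;Building that mapping can be sketched in a few lines of JavaScript (&lt;code&gt;buildChain&lt;/code&gt; is a hypothetical name; the real implementation lives in &lt;code&gt;remix.js&lt;/code&gt;):&lt;/p&gt;

```javascript
// Build a prefix -> next-words mapping with a sliding window over the corpus.
// A prefixLength of 2 matches the example above.
function buildChain(corpus, prefixLength = 2) {
  const words = corpus.trim().split(/\s+/);
  const chain = new Map();

  // Stop while there's still a next word beyond the window.
  for (let i = 0; i + prefixLength < words.length; i++) {
    // Stringifying the prefix array makes it usable as a Map key.
    const prefix = JSON.stringify(words.slice(i, i + prefixLength));
    const next = words[i + prefixLength];

    if (!chain.has(prefix)) { chain.set(prefix, []); }

    chain.get(prefix).push(next);
  }

  return chain;
}
```

&lt;p&gt;A prefix that recurs in the corpus, like "as the" above, accumulates one entry in its next-state array per occurrence.&lt;/p&gt;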

&lt;p&gt;Because the prefix is constantly losing its earliest word and appending the next, it's stored as a stringified array rather than as a concatenated string. The order of operations goes like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Select one of the potential next states for the current stringified prefix array.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;shift&lt;/code&gt; the earliest word out of the prefix array and &lt;code&gt;push&lt;/code&gt; the selected next word onto the end.&lt;/li&gt;
&lt;li&gt;Stringify the new prefix array.&lt;/li&gt;
&lt;li&gt;Repeat until bored, or until there's no possible next state.&lt;/li&gt;
&lt;/ol&gt;
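&lt;p&gt;The steps above can be sketched as follows, with a &lt;code&gt;chooseNext&lt;/code&gt; callback standing in for whatever selects among multiple next states (by default it just takes the first candidate, which makes this sketch deterministic rather than probabilistic):&lt;/p&gt;

```javascript
// Walk a prefix -> next-words Map from a starting prefix, following the
// shift/push/stringify loop described above.
function walk(chain, startPrefix, maxSteps, chooseNext = (options) => options[0]) {
  const prefix = [...startPrefix];
  const output = [...startPrefix];

  for (let i = 0; i < maxSteps; i++) {
    const options = chain.get(JSON.stringify(prefix)); // current state's successors
    if (!options) { break; }                           // no possible next state

    const next = chooseNext(options); // 1. select a next state
    prefix.shift();                   // 2. drop the earliest word...
    prefix.push(next);                //    ...and append the selected one
    output.push(next);                // (3. the JSON.stringify above re-keys it)
  }

  return output.join(' ');
}
```

&lt;p&gt;Swapping &lt;code&gt;chooseNext&lt;/code&gt; for a random pick gives the classic generator; prompting the reader instead gives the interactive version described below.&lt;/p&gt;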

&lt;h2&gt;Remix!&lt;/h2&gt;

&lt;p&gt;If you're interested in the actual code, it's &lt;code&gt;remix.js&lt;/code&gt; in devtools, or you can find it in &lt;a href="https://github.com/dmfay/blog/blob/master/assets/remix.js"&gt;source control&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Markov chain generators aren't usually interactive; that's where the "probabilistic" part of "probabilistic state machine" comes into play. This makes the implementation here incomplete by design. Where only one possible next state exists, the state machine advances on its own, but where there are multiple, it allows the user to choose how to proceed. This, along with starting from the beginning instead of selecting a random opening prefix, gives it more of an exploratory direction than if it simply restructured the entire corpus at the push of a button. The jury's still out on whether any great insights lie waiting to be unearthed, as the more mystically-minded practitioners of aleatory editing hoped, but in the meantime, the results are at least good fun.&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>surrealism</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Summer 2018: Massive, Twice Over</title>
      <dc:creator>Dian Fay</dc:creator>
      <pubDate>Mon, 30 Jul 2018 00:00:00 +0000</pubDate>
      <link>https://forem.com/dmfay/summer-2018-massive-twice-over-14g0</link>
      <guid>https://forem.com/dmfay/summer-2018-massive-twice-over-14g0</guid>
      <description>&lt;p&gt;&lt;a href="https://vimeo.com/281409168"&gt;NDC talks are up&lt;/a&gt;!&lt;/p&gt;

&lt;p&gt;There's also the &lt;a href="https://skillsmatter.com/skillscasts/12008-database-as-api-with-postgresql-and-massive-js"&gt;FullStack London&lt;/a&gt; version which is slightly condensed for a shorter timeslot, if you have a SkillsMatter account and want to get right to the fun parts.&lt;/p&gt;

&lt;p&gt;If you've read (almost) anything I've written, text or code, odds are you've run into &lt;a href="https://github.com/dmfay/massive-js"&gt;Massive.js&lt;/a&gt;. On the off chance you haven't, the elevator pitch is that PostgreSQL exclusivity lets you get a lot more mileage out of your database (as long as it's Postgres) and JavaScript being a dynamically typed, functional-ish language lets you get away with it really easily.&lt;/p&gt;

&lt;p&gt;This talk goes over Massive in much more depth: first laying out a case for alternatives to the dominant object-relational mapping data access technique, in general and especially in JavaScript; and then diving into the architecture of Massive itself with plenty of examples. Also, there's some trivia about early 20th century Russian avant-garde art and another bit poking fun at French modernist architect Le Corbusier.&lt;/p&gt;

&lt;p&gt;It's the second talk I've done, and overall I was pretty happy with how it went in Oslo and London both! I'm the furthest thing from a natural public speaker but I covered what I wanted to cover, finished at a reasonable time, and didn't screw anything up too badly -- so that's a success in my book. And after all, the only way to improve this particular skill is to keep doing it.&lt;/p&gt;

</description>
      <category>database</category>
      <category>javascript</category>
      <category>node</category>
      <category>speaking</category>
    </item>
    <item>
      <title>Centralize Your Query Logic!</title>
      <dc:creator>Dian Fay</dc:creator>
      <pubDate>Wed, 25 Jul 2018 00:00:00 +0000</pubDate>
      <link>https://forem.com/dmfay/centralize-your-query-logic-5bhh</link>
      <guid>https://forem.com/dmfay/centralize-your-query-logic-5bhh</guid>
      <description>&lt;p&gt;At a talk I gave earlier this month, an audience member asked if &lt;a href="https://github.com/dmfay/massive-js"&gt;Massive&lt;/a&gt; supported joining information from multiple tables together. It's come up on the issue tracker before as well. Massive does not currently have this functionality, and while I'm open to suggestions it's not on my own radar.&lt;/p&gt;

&lt;p&gt;The central reason for this is that join logic can be tricky to manage from the application architecture side. The ability to correlate and combine what you need when you need it is certainly powerful, but it also embeds assumptions about your database layout in client code. As the database and application evolve, these assumptions can easily fall out of date and out of sync with each other. In real terms, if your application's "model" (whether implicit or explicit) of a user loaded from the database includes only the user record itself sometimes, but other times looks for information in a separate profile table, adds current statistics, et cetera, and you have functionality that operates on A User, either you understand that users come in different shapes and handle them accordingly across the board or you are living on borrowed uptime.&lt;/p&gt;

&lt;p&gt;Some application architectures approach this scenario by grouping the query logic together. In the enterprise world, &lt;em&gt;n-tier&lt;/em&gt; applications frequently pull related queries into "services" or Data Access Objects (DAOs) so there's at least some kind of organizational schema. This reduces the maintenance overhead somewhat, but it's an imperfect solution, not least because there's nothing but fallible code reviews (if that) standing in the way of someone dropping data access code somewhere else.&lt;/p&gt;

&lt;p&gt;Fortunately, there's already part of the application-database ecosystem dedicated to organizing things -- the database itself! And as an organizing principle, it already has its own way to manage complex queries. Sure, it'll involve writing a little SQL, but let's face it: you were going to wind up writing SQL eventually anyway.&lt;/p&gt;

&lt;p&gt;If you've only scratched the surface of working with databases, you might not be familiar with views. The good news is they're pretty straightforward: a view is a stored SQL query with a name, given life with the statement &lt;code&gt;CREATE VIEW myview AS SELECT...&lt;/code&gt;. You can &lt;code&gt;SELECT&lt;/code&gt; from a view just like you can a table, optionally with &lt;code&gt;JOIN&lt;/code&gt;s and a &lt;code&gt;WHERE&lt;/code&gt; clause and all the other trimmings, whereupon the database executes the query. Results are not stored so the information you get out of a view is always current, unless you intentionally sacrifice realtime data for speed by creating a &lt;em&gt;materialized&lt;/em&gt; view which does persist results and has to be manually refreshed.&lt;/p&gt;
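&lt;p&gt;To make that concrete, here's a self-contained sketch using Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; (SQLite rather than Postgres purely so it runs anywhere; the tables and the &lt;code&gt;user_details&lt;/code&gt; view are hypothetical):&lt;/p&gt;

```python
import sqlite3

# Hypothetical schema: user records split across two tables, unified by a
# view so that "what information comprises a user" is defined in one place.
db = sqlite3.connect(":memory:")
db.executescript("""
  CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
  CREATE TABLE profiles (user_id INTEGER REFERENCES users (id), display_name TEXT);

  CREATE VIEW user_details AS
    SELECT u.id, u.email, p.display_name
    FROM users u
    JOIN profiles p ON p.user_id = u.id;

  INSERT INTO users VALUES (1, 'dmfay@example.com');
  INSERT INTO profiles VALUES (1, 'Dian');
""")

# SELECT from the view as though it were a table; the join runs on demand.
row = db.execute("SELECT email, display_name FROM user_details WHERE id = 1").fetchone()
```

&lt;p&gt;Insert another profile row and the same &lt;code&gt;SELECT&lt;/code&gt; reflects it immediately -- no refresh step, because the view stores the query, not the results.&lt;/p&gt;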

&lt;p&gt;The reason views are underrated and underutilized in application development has mostly to do with the frameworks developers use to communicate with databases. When you have to provide a concrete implementation of a unary &lt;code&gt;User&lt;/code&gt; model, odds are you only care about things you can both read &lt;em&gt;and&lt;/em&gt; write to, so you back it up with tables instead of using views to shape data for your needs. There's little room for views in object/relational mapping, and when I've had to use O/RMs I've really only been able to take advantage of views to streamline the raw SQL queries you have to write anyway when you use O/RMs.&lt;/p&gt;

&lt;p&gt;If you're not stuck with an object-relational mapper, though, you can really get your money's worth out of views! Retrieving user records from a view, or building more complex user-inclusive results by joining it into other views, ensures that you have a consistent definition of &lt;em&gt;what information comprises a user&lt;/em&gt; built into your database. You can't always stop other developers from winging it, naturally, but having that central definition to point to eliminates at least one major potential ambiguity. Massive's omission of the join feature encourages developers using it to center their thinking on the database and the tools it offers for organizing information.&lt;/p&gt;

&lt;p&gt;As with anything, there are tradeoffs. Here, it's flexibility. Views may be ephemeral stored queries, but they're still part of the database schema for all that, and the schema takes more planning and effort to change than does application code. But it's a good idea to be thinking carefully about this stuff in the first place.&lt;/p&gt;

</description>
      <category>database</category>
      <category>architecture</category>
    </item>
  </channel>
</rss>
