<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Daniel Westheide</title>
    <description>The latest articles on Forem by Daniel Westheide (@dwestheide).</description>
    <link>https://forem.com/dwestheide</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2851%2Fbig_1KddbUgv.jpg</url>
      <title>Forem: Daniel Westheide</title>
      <link>https://forem.com/dwestheide</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/dwestheide"/>
    <language>en</language>
    <item>
      <title>Introducing kontextfrei</title>
      <dc:creator>Daniel Westheide</dc:creator>
      <pubDate>Thu, 09 Nov 2017 13:56:28 +0000</pubDate>
      <link>https://forem.com/dwestheide/introducing-kontextfrei-7kb</link>
      <guid>https://forem.com/dwestheide/introducing-kontextfrei-7kb</guid>
      <description>&lt;p&gt;&lt;em&gt;This article was originally posted on &lt;a href="http://danielwestheide.com/blog/2017/10/31/introducing-kontextfrei.html"&gt;Daniel Westheide's blog&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;For the past 15 months, I have been working on and off on a new library. So far, I have mostly kept quiet about it, because I didn't feel it was ready for a wider audience, even though we had been using it successfully in production for a while. However, I already broke my silence back in April, when I gave a talk about it at this year's ScalarConf in Warsaw, so a blog post explaining what this library does and why I set out to write it in the first place is overdue.&lt;/p&gt;

&lt;p&gt;Last year, I was involved in a project that required my team to implement a few Spark applications. For most of them, the business logic was rather complex, so we tried to implement it in a test-driven way, using property-based tests.&lt;/p&gt;

&lt;h2&gt;The pain of unit-testing Spark applications&lt;/h2&gt;

&lt;p&gt;At first glance, this looks like a great match. When it comes down to it, a Spark application consists of IO stages (reading from and writing to data sources) and transformations of data sets. The latter constitute our business logic and are relatively easy to separate from the IO parts, as they are mostly built from pure functions. Functions like these are a perfect fit for test-driven development as well as for property-based testing.&lt;/p&gt;
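&lt;p&gt;&lt;em&gt;As a rough illustration&lt;/em&gt; (the names here are made up for the sketch, not code from the project): a typical transformation is a pure function, so a property can be checked against many generated inputs without any Spark machinery at all.&lt;/p&gt;

```scala
// Hypothetical sketch: a pure transformation plus a hand-rolled property
// check, standing in for what ScalaCheck would generate for us.
def wordPairs(line: String): Seq[(String, Long)] =
  line.split(" ").toSeq.filter(_.nonEmpty).map(word => (word, 1L))

// Property: every emitted pair carries a count of exactly 1, whatever the input.
val rnd = new scala.util.Random(42)
val lines = Seq.fill(100) {
  Seq.fill(rnd.nextInt(10))(rnd.alphanumeric.take(5).mkString).mkString(" ")
}
assert(lines.forall(line => wordPairs(line).forall(_._2 == 1L)))
println("property held for 100 random lines")
```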

&lt;p&gt;However, all was not great. It may be old news to you if you have been working with Apache Spark for a while, but it turns out that writing real unit tests is not well supported by Spark, and as a result, it can be quite painful. The problem is that in order to create an &lt;code&gt;RDD&lt;/code&gt;, we always need a &lt;code&gt;SparkContext&lt;/code&gt;, and the most lightweight way of getting one is to create a local &lt;code&gt;SparkContext&lt;/code&gt;. Doing so means starting up a server, which takes a few seconds, so testing our properties with lots of different generated input data takes a really long time. We certainly lose the fast feedback loop we are used to from developing web applications, for example.&lt;/p&gt;

&lt;h2&gt;Abstracting over RDDs with kontextfrei&lt;/h2&gt;

&lt;p&gt;Now, we could confine ourselves to only unit-testing the functions that we pass to &lt;code&gt;RDD&lt;/code&gt; operators, so that our unit tests do not have any dependency on Spark and can be verified as quickly as we are used to. However, this leaves quite a lot of business logic uncovered. Instead, at a Scala hackathon last May, I started to experiment with the idea of abstracting over Spark's &lt;code&gt;RDD&lt;/code&gt;, and &lt;em&gt;kontextfrei&lt;/em&gt; was born.&lt;/p&gt;

&lt;p&gt;The idea is the following: by abstracting over &lt;code&gt;RDD&lt;/code&gt;, we can write business logic that has no dependency on the &lt;code&gt;RDD&lt;/code&gt; type. This means that we can also write test properties that are Spark-agnostic. Any Spark-agnostic code like this can either be executed on an &lt;code&gt;RDD&lt;/code&gt; (which you would do in your actual Spark application and in your integration tests), or on a local and fast Scala collection (which is really great for unit tests that you continuously run locally during development).&lt;/p&gt;
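&lt;p&gt;&lt;em&gt;To sketch the underlying pattern&lt;/em&gt; (with made-up names, not kontextfrei's actual API): the business logic only demands a typeclass instance for &lt;code&gt;F[_]&lt;/code&gt;, so the same function runs on a plain Scala collection in tests and could run on an &lt;code&gt;RDD&lt;/code&gt; in production, given an instance for it.&lt;/p&gt;

```scala
import scala.language.higherKinds

// Miniature version of the idea (hypothetical names): a typeclass capturing
// the operations the business logic needs, with no mention of Spark.
trait MiniOps[F[_]] {
  def flatMapF[A, B](fa: F[A])(f: A => Seq[B]): F[B]
}

// Business logic written only against F[_].
def tokenize[F[_]](text: F[String])(implicit ops: MiniOps[F]): F[String] =
  ops.flatMapF(text)(line => line.split(" ").toSeq)

// Instance for a plain Scala collection: fast, SparkContext-free unit tests.
// A production instance would wrap RDD's flatMap instead.
implicit val listOps: MiniOps[List] = new MiniOps[List] {
  def flatMapF[A, B](fa: List[A])(f: A => Seq[B]): List[B] = fa.flatMap(f)
}

println(tokenize(List("hello world", "spark free")))
```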

&lt;h2&gt;Obtaining the library&lt;/h2&gt;

&lt;p&gt;It's probably easier to show how this works than to describe it in words alone, so let's look at a really minimalistic example, the traditional &lt;em&gt;word count&lt;/em&gt;. First, we need to add the necessary dependencies to our SBT build file. &lt;em&gt;kontextfrei&lt;/em&gt; consists of two modules, &lt;code&gt;kontextfrei-core&lt;/code&gt; and &lt;code&gt;kontextfrei-scalatest&lt;/code&gt;. The former is what you need to abstract over &lt;code&gt;RDD&lt;/code&gt; in your main code base; the latter gives you some additional support for writing your RDD-independent tests using ScalaTest with ScalaCheck. Let's add them to our &lt;code&gt;build.sbt&lt;/code&gt; file, together with the usual&lt;br&gt;
Spark dependency you would need anyway:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight scala"&gt;&lt;code&gt;&lt;span class="n"&gt;resolvers&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="s"&gt;"dwestheide"&lt;/span&gt; &lt;span class="n"&gt;at&lt;/span&gt; &lt;span class="s"&gt;"https://dl.bintray.com/dwestheide/maven"&lt;/span&gt;
&lt;span class="n"&gt;libraryDependencies&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="s"&gt;"com.danielwestheide"&lt;/span&gt; &lt;span class="o"&gt;%%&lt;/span&gt; &lt;span class="s"&gt;"kontextfrei-core-spark-2.2.0"&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="s"&gt;"0.6.0"&lt;/span&gt;
&lt;span class="n"&gt;libraryDependencies&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="s"&gt;"com.danielwestheide"&lt;/span&gt; &lt;span class="o"&gt;%%&lt;/span&gt; &lt;span class="s"&gt;"kontextfrei-scalatest-spark-2.2.0"&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="s"&gt;"0.6.0"&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="s"&gt;"test,it"&lt;/span&gt;
&lt;span class="n"&gt;libraryDependencies&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="s"&gt;"org.apache.spark"&lt;/span&gt; &lt;span class="o"&gt;%%&lt;/span&gt; &lt;span class="s"&gt;"spark-core"&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="s"&gt;"2.2.0"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Please note that in this simple example, we create a Spark application that you can execute in a self-contained way. In the real world, you would add &lt;code&gt;spark-core&lt;/code&gt; as a &lt;code&gt;provided&lt;/code&gt; dependency and create an assembly JAR that you pass to &lt;code&gt;spark-submit&lt;/code&gt;.&lt;/p&gt;
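&lt;p&gt;For reference, that production setup would differ only in one line of &lt;code&gt;build.sbt&lt;/code&gt;:&lt;/p&gt;

```scala
// In a real deployment, Spark is supplied by the cluster, so it is marked
// "provided" and left out of the assembly JAR passed to spark-submit:
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.0" % "provided"
```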

&lt;h2&gt;Implementing the business logic&lt;/h2&gt;

&lt;p&gt;Now, let's see how we can implement the business logic of our word count application using &lt;em&gt;kontextfrei&lt;/em&gt;. In our example, we define all of our business logic in a trait called &lt;code&gt;WordCount&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight scala"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.danielwestheide.kontextfrei.wordcount&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.danielwestheide.kontextfrei.DCollectionOps&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.danielwestheide.kontextfrei.syntax.SyntaxSupport&lt;/span&gt;

&lt;span class="k"&gt;trait&lt;/span&gt; &lt;span class="nc"&gt;WordCount&lt;/span&gt; &lt;span class="k"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;SyntaxSupport&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;counts&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;F&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="kt"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;DCollectionOps&lt;/span&gt;&lt;span class="o"&gt;](&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="k"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;F&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;String&lt;/span&gt;&lt;span class="o"&gt;])&lt;/span&gt;&lt;span class="k"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;F&lt;/span&gt;&lt;span class="o"&gt;[(&lt;/span&gt;&lt;span class="kt"&gt;String&lt;/span&gt;, &lt;span class="kt"&gt;Long&lt;/span&gt;&lt;span class="o"&gt;)]&lt;/span&gt; &lt;span class="k"&gt;=&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt;
      &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="py"&gt;flatMap&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="py"&gt;split&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;" "&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
      &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="py"&gt;map&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1L&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
      &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="py"&gt;reduceByKey&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
      &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="py"&gt;sortBy&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;_&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="py"&gt;_2&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ascending&lt;/span&gt; &lt;span class="k"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;formatted&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;F&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="kt"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;DCollectionOps&lt;/span&gt;&lt;span class="o"&gt;](&lt;/span&gt;&lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="k"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;F&lt;/span&gt;&lt;span class="o"&gt;[(&lt;/span&gt;&lt;span class="kt"&gt;String&lt;/span&gt;, &lt;span class="kt"&gt;Long&lt;/span&gt;&lt;span class="o"&gt;)])&lt;/span&gt;&lt;span class="k"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;F&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;String&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="k"&gt;=&lt;/span&gt;
    &lt;span class="nv"&gt;counts&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="py"&gt;map&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
      &lt;span class="nf"&gt;case&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="s"&gt;"$word,$count"&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first thing you'll notice is that the implementations of &lt;code&gt;counts&lt;/code&gt; and &lt;code&gt;formatted&lt;/code&gt; look exactly the same as they would if you were programming against Spark's &lt;code&gt;RDD&lt;/code&gt; type. You could literally copy and paste &lt;code&gt;RDD&lt;/code&gt;-based code into a program written with &lt;em&gt;kontextfrei&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;The second thing you notice is that the method signatures of &lt;code&gt;counts&lt;/code&gt; and &lt;code&gt;formatted&lt;/code&gt; contain a type constructor, declared as &lt;code&gt;F[_]&lt;/code&gt;, which is constrained by a context bound: For any concrete type constructor we pass in here, there must be an instance of kontextfrei's &lt;code&gt;DCollectionOps&lt;/code&gt; typeclass. In our business logic, we do not care what concrete type constructor is used for &lt;code&gt;F&lt;/code&gt;, as long as the operations defined in &lt;code&gt;DCollectionOps&lt;/code&gt; are supported for it. This way, we are liberating our business logic from any dependency on Spark, and specifically on the annoying &lt;code&gt;SparkContext&lt;/code&gt;.&lt;/p&gt;
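&lt;p&gt;In case the context bound syntax is unfamiliar: it is just sugar for an implicit parameter. The following two signatures are equivalent (using a stub typeclass here, purely for the sketch):&lt;/p&gt;

```scala
import scala.language.higherKinds

// Stub standing in for kontextfrei's typeclass, just to show the desugaring.
trait DCollectionOps[F[_]]

// Context-bound form, as used in WordCount ...
def countsA[F[_]: DCollectionOps](text: F[String]): F[String] = text

// ... desugars to an explicit implicit parameter list.
def countsB[F[_]](text: F[String])(implicit ops: DCollectionOps[F]): F[String] = text

// Either compiles only if an instance for the concrete F is in scope.
implicit val listInstance: DCollectionOps[List] = new DCollectionOps[List] {}
println(countsA(List("ok")) == countsB(List("ok")))
```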

&lt;p&gt;In order to be able to use the familiar syntax we know from the &lt;code&gt;RDD&lt;/code&gt; type, we mix in kontextfrei's &lt;code&gt;SyntaxSupport&lt;/code&gt; trait, but you could just as well use an import instead, if that's more to your liking.&lt;/p&gt;

&lt;h2&gt;Plugging our business logic into the Spark application&lt;/h2&gt;

&lt;p&gt;At the end of the day, we want to end up with a runnable Spark application. To achieve that, we must plug our Spark-agnostic business logic together with the Spark-dependent IO parts of our application. Here is what this looks like in our word count example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight scala"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.danielwestheide.kontextfrei.wordcount&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.danielwestheide.kontextfrei.rdd.RDDOpsSupport&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.apache.spark.SparkContext&lt;/span&gt;

&lt;span class="k"&gt;object&lt;/span&gt; &lt;span class="nc"&gt;Main&lt;/span&gt; &lt;span class="k"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;App&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;WordCount&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;RDDOpsSupport&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

  &lt;span class="k"&gt;implicit&lt;/span&gt; &lt;span class="k"&gt;val&lt;/span&gt; &lt;span class="nv"&gt;sparkContext&lt;/span&gt;&lt;span class="k"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;SparkContext&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;SparkContext&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"local[1]"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"word-count"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;

  &lt;span class="k"&gt;val&lt;/span&gt; &lt;span class="nv"&gt;inputFilePath&lt;/span&gt; &lt;span class="k"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;args&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;val&lt;/span&gt; &lt;span class="nv"&gt;outputFilePath&lt;/span&gt; &lt;span class="k"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;args&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;

  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="k"&gt;val&lt;/span&gt; &lt;span class="nv"&gt;textFile&lt;/span&gt;   &lt;span class="k"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;sparkContext&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="py"&gt;textFile&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inputFilePath&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;minPartitions&lt;/span&gt; &lt;span class="k"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;val&lt;/span&gt; &lt;span class="nv"&gt;wordCounts&lt;/span&gt; &lt;span class="k"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;counts&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;textFile&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;formatted&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wordCounts&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="py"&gt;saveAsTextFile&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outputFilePath&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;

  &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;sparkContext&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="py"&gt;stop&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
  &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Our &lt;code&gt;Main&lt;/code&gt; object mixes in our &lt;code&gt;WordCount&lt;/code&gt; trait as well as &lt;em&gt;kontextfrei&lt;/em&gt;'s &lt;code&gt;RDDOpsSupport&lt;/code&gt;, which proves to the compiler that we have an instance of the &lt;code&gt;DCollectionOps&lt;/code&gt; typeclass for the &lt;code&gt;RDD&lt;/code&gt; type constructor. In order to prove this, we also need an implicit &lt;code&gt;SparkContext&lt;/code&gt;. Again, instead of mixing in this trait, we can also use an import.&lt;/p&gt;
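&lt;p&gt;The shape of that arrangement can be sketched with made-up names: a typeclass instance that itself demands an implicit dependency, just as the &lt;code&gt;RDD&lt;/code&gt; instance demands a &lt;code&gt;SparkContext&lt;/code&gt;:&lt;/p&gt;

```scala
import scala.language.higherKinds

// Hypothetical sketch, not kontextfrei's real API.
trait CollOps[F[_]] { def countF[A](fa: F[A]): Long }

// Stand-in for SparkContext in this sketch.
final case class FakeContext(appName: String)

// Mixing in this trait provides the instance, but only if an implicit
// FakeContext is in scope, mirroring RDDOpsSupport's requirement.
trait FakeOpsSupport {
  implicit def listOpsFor(implicit ctx: FakeContext): CollOps[List] =
    new CollOps[List] { def countF[A](fa: List[A]): Long = fa.size.toLong }
}

object WordCountMain extends FakeOpsSupport {
  implicit val ctx: FakeContext = FakeContext("word-count")
  def run(): Long = implicitly[CollOps[List]].countF(List("a", "b", "c"))
}

println(WordCountMain.run())
```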

&lt;p&gt;Now, our &lt;code&gt;Main&lt;/code&gt; object is all about doing some IO and integrating our business logic into it.&lt;/p&gt;

&lt;h2&gt;Writing Spark-agnostic tests&lt;/h2&gt;

&lt;p&gt;So far so good. We have liberated our business logic from any dependency on Spark, but what do we gain from this? Well, now we are able to write our unit tests in a Spark-agnostic way as well. First, we define a &lt;code&gt;BaseSpec&lt;/code&gt; which inherits from kontextfrei's &lt;code&gt;KontextfreiSpec&lt;/code&gt; and mixes in a few other goodies from &lt;em&gt;kontextfrei-scalatest&lt;/em&gt; and from ScalaTest itself:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight scala"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.danielwestheide.kontextfrei.wordcount&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.danielwestheide.kontextfrei.scalatest.KontextfreiSpec&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.danielwestheide.kontextfrei.syntax.DistributionSyntaxSupport&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.scalactic.anyvals.PosInt&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.scalatest.prop.GeneratorDrivenPropertyChecks&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.scalatest.&lt;/span&gt;&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="nc"&gt;MustMatchers&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;PropSpecLike&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;trait&lt;/span&gt; &lt;span class="nc"&gt;BaseSpec&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;F&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="o"&gt;]]&lt;/span&gt;
    &lt;span class="nc"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;KontextfreiSpec&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;F&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;DistributionSyntaxSupport&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;PropSpecLike&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;GeneratorDrivenPropertyChecks&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;MustMatchers&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

  &lt;span class="k"&gt;implicit&lt;/span&gt; &lt;span class="k"&gt;val&lt;/span&gt; &lt;span class="nv"&gt;config&lt;/span&gt;&lt;span class="k"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;PropertyCheckConfiguration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
    &lt;span class="nc"&gt;PropertyCheckConfiguration&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;minSuccessful&lt;/span&gt; &lt;span class="k"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PosInt&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;BaseSpec&lt;/code&gt;, like our &lt;code&gt;WordCount&lt;/code&gt; trait, takes a type constructor, which it simply passes along to the &lt;code&gt;KontextfreiSpec&lt;/code&gt; trait. We will get back to that one in a minute.&lt;/p&gt;

&lt;p&gt;Our actual test properties can now be implemented for any type constructor &lt;code&gt;F[_]&lt;/code&gt; for which there is an instance of &lt;code&gt;DCollectionOps&lt;/code&gt;. We define them in a trait &lt;code&gt;WordCountProperties&lt;/code&gt;, which also has to be parameterized by a type constructor:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight scala"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.danielwestheide.kontextfrei.wordcount&lt;/span&gt;

&lt;span class="k"&gt;trait&lt;/span&gt; &lt;span class="nc"&gt;WordCountProperties&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;F&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="k"&gt;_&lt;/span&gt;&lt;span class="o"&gt;]]&lt;/span&gt; &lt;span class="nc"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;BaseSpec&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;F&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;WordCount&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

  &lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;collection.immutable._&lt;/span&gt;

  &lt;span class="nf"&gt;property&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"sums word counts across lines"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;forAll&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wordA&lt;/span&gt;&lt;span class="k"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;String&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt;
      &lt;span class="nf"&gt;whenever&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;wordA&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="py"&gt;nonEmpty&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;val&lt;/span&gt; &lt;span class="nv"&gt;wordB&lt;/span&gt; &lt;span class="k"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;wordA&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="py"&gt;reverse&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;wordA&lt;/span&gt;
        &lt;span class="k"&gt;val&lt;/span&gt; &lt;span class="nv"&gt;result&lt;/span&gt; &lt;span class="k"&gt;=&lt;/span&gt;
          &lt;span class="nf"&gt;counts&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Seq&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="s"&gt;"$wordB $wordA $wordB"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;wordB&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="py"&gt;distributed&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="py"&gt;collectAsMap&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
        &lt;span class="nf"&gt;assert&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;result&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wordB&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
      &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
  &lt;span class="o"&gt;}&lt;/span&gt;

  &lt;span class="nf"&gt;property&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"does not have duplicate keys"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;forAll&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wordA&lt;/span&gt;&lt;span class="k"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;String&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt;
      &lt;span class="nf"&gt;whenever&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;wordA&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="py"&gt;nonEmpty&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;val&lt;/span&gt; &lt;span class="nv"&gt;wordB&lt;/span&gt; &lt;span class="k"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;wordA&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="py"&gt;reverse&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;wordA&lt;/span&gt;
        &lt;span class="k"&gt;val&lt;/span&gt; &lt;span class="nv"&gt;result&lt;/span&gt; &lt;span class="k"&gt;=&lt;/span&gt;
          &lt;span class="nf"&gt;counts&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Seq&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="s"&gt;"$wordA $wordB"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="s"&gt;"$wordB $wordA"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="py"&gt;distributed&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;assert&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
          &lt;span class="nv"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="py"&gt;keys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="py"&gt;distinct&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="py"&gt;collect&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="py"&gt;toList&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nv"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="py"&gt;keys&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="py"&gt;collect&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="py"&gt;toList&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
      &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
  &lt;span class="o"&gt;}&lt;/span&gt;

&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We want to be able to test our Spark-agnostic properties both against fast Scala collections and against &lt;code&gt;RDD&lt;/code&gt;s in a local Spark cluster. To get there, we need to define two test classes, one in the &lt;code&gt;test&lt;/code&gt; sources directory, the other in the &lt;code&gt;it&lt;/code&gt; sources directory. Here is the unit test:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight scala"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.danielwestheide.kontextfrei.wordcount&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.danielwestheide.kontextfrei.scalatest.StreamSpec&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;WordCountSpec&lt;/span&gt; &lt;span class="k"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;BaseSpec&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;Stream&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
  &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;StreamSpec&lt;/span&gt;
  &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;WordCountProperties&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;Stream&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We mix in &lt;code&gt;BaseSpec&lt;/code&gt; and pass it the &lt;code&gt;Stream&lt;/code&gt; type constructor. &lt;code&gt;Stream&lt;/code&gt; has the same shape as &lt;code&gt;RDD&lt;/code&gt;, but it is a Scala collection. The &lt;code&gt;KontextfreiSpec&lt;/code&gt; trait extended by &lt;code&gt;BaseSpec&lt;/code&gt; defines an abstract implicit &lt;code&gt;DCollectionOps&lt;/code&gt; for its type constructor. By mixing in &lt;code&gt;StreamSpec&lt;/code&gt;, we get an instance of &lt;code&gt;DCollectionOps&lt;/code&gt; for &lt;code&gt;Stream&lt;/code&gt;. While implementing our business logic, we can run the &lt;code&gt;WordCountSpec&lt;/code&gt; test and get instantaneous feedback. We can use SBT's triggered execution to run our unit tests upon every detected source change, using &lt;code&gt;~test&lt;/code&gt;, and it will be really fast.&lt;/p&gt;
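&lt;p&gt;A typical workflow then looks like this at the SBT prompt (assuming the integration tests live in the &lt;code&gt;it&lt;/code&gt; configuration, as set up above):&lt;/p&gt;

```
> ~test      // re-runs the fast Stream-based WordCountSpec on every source change
> it:test    // runs WordCountIntegrationSpec against a local SparkContext on demand
```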

&lt;p&gt;In order to make sure that none of the typical bugs that you would only notice in a Spark cluster have sneaked in, we also define an integration test, which tests exactly the same properties:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight scala"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;com.danielwestheide.kontextfrei.wordcount&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.danielwestheide.kontextfrei.scalatest.RDDSpec&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.apache.spark.rdd.RDD&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;WordCountIntegrationSpec&lt;/span&gt; &lt;span class="k"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;BaseSpec&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;RDD&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
  &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;RDDSpec&lt;/span&gt;
  &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;WordCountProperties&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;RDD&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This time, we mix in &lt;code&gt;RDDSpec&lt;/code&gt;, because we parameterize &lt;code&gt;BaseSpec&lt;/code&gt; with the &lt;code&gt;RDD&lt;/code&gt; type constructor.&lt;/p&gt;

&lt;h2&gt;Design goals&lt;/h2&gt;

&lt;p&gt;It was an explicit design goal to stick as closely as possible to the existing Spark API. This allows people with existing Spark code bases to switch to &lt;em&gt;kontextfrei&lt;/em&gt; smoothly, or even to migrate only parts of their application without too much hassle – with the benefit of now being able to cover their business logic with the tests it has been missing, without the usual pain.&lt;/p&gt;

&lt;p&gt;An alternative, of course, would have been to build this library on the ever-popular interpreter pattern. To be honest, I wish Spark itself were using this pattern – other libraries like Apache Crunch have successfully shown that it can help tremendously with enabling developers to write tests for the business logic of their applications. If Spark were built on those very principles, there wouldn't be any reason for &lt;em&gt;kontextfrei&lt;/em&gt; to exist at all.&lt;/p&gt;

&lt;h2&gt;
  
  
  Limitations
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;kontextfrei&lt;/em&gt; is still a young library, and while we have been using it in production in one project, I do not know of any other adopters. One of its limitations is that it doesn't yet support all operations defined on the &lt;code&gt;RDD&lt;/code&gt; type – but we are getting closer. In addition, I have yet to find a clever way to support broadcast variables and accumulators. And of course, who is using &lt;code&gt;RDD&lt;/code&gt;s anyway in 2017? While I do think that there is still room for &lt;code&gt;RDD&lt;/code&gt;-based Spark applications, I am aware that many people have long moved on to &lt;code&gt;Dataset&lt;/code&gt;s and to Spark Streaming. It would be nice to create a similar typeclass-based abstraction for datasets and for streaming applications, but I haven't had the time to look deeper into what would be necessary to implement either of those.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;kontextfrei&lt;/em&gt; is a Scala library that aims to provide developers with a faster feedback loop when developing Apache Spark applications. To achieve that, it enables you to write the business logic of your Spark application, as well as your test code, against an abstraction over Spark’s RDD.&lt;/p&gt;
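&lt;p&gt;The core idea can be sketched in a few lines – this is an illustration of the approach, not &lt;em&gt;kontextfrei&lt;/em&gt;'s actual API: business logic is written against an abstract type constructor, and the same code runs on &lt;code&gt;Stream&lt;/code&gt; in unit tests and on &lt;code&gt;RDD&lt;/code&gt; in production.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight scala"&gt;&lt;code&gt;// Illustration of the approach only; the real DCollectionOps
// typeclass mirrors the RDD API much more closely.
trait WordOps[DColl[_]] {
  def flatMapColl[A, B](as: DColl[A])(f: A =&amp;gt; Iterable[B]): DColl[B]
  def countByValue[A](as: DColl[A]): Map[A, Long]
}

// Business logic, oblivious to whether DColl is Stream or RDD.
def wordCounts[DColl[_]](lines: DColl[String])(
    implicit ops: WordOps[DColl]): Map[String, Long] =
  ops.countByValue(ops.flatMapColl(lines)(_.split("\\s+")))

// The instance used in fast unit tests: DColl = Stream.
implicit val streamOps: WordOps[Stream] = new WordOps[Stream] {
  def flatMapColl[A, B](as: Stream[A])(f: A =&amp;gt; Iterable[B]): Stream[B] =
    as.flatMap(f)
  def countByValue[A](as: Stream[A]): Map[A, Long] =
    as.groupBy(identity).map { case (a, occ) =&amp;gt; (a, occ.size.toLong) }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Because the abstraction sticks to the shape of the &lt;code&gt;RDD&lt;/code&gt; API, the &lt;code&gt;wordCounts&lt;/code&gt; logic and its properties can be exercised against &lt;code&gt;Stream&lt;/code&gt; without ever starting a &lt;code&gt;SparkContext&lt;/code&gt;.&lt;/p&gt;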

&lt;p&gt;I would love to hear your thoughts on this approach. Do you think it's worth defining the biggest typeclass ever and reimplementing the &lt;code&gt;RDD&lt;/code&gt; logic for Scala collections for test purposes? Please, if this looks interesting, do try it out. I am always interested in feedback and in contributions of all kinds.&lt;/p&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dwestheide.github.io/kontextfrei/index.html"&gt;kontextfrei project website&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/dwestheide/kontextfrei"&gt;kontextfrei GitHub repo&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/dwestheide/kontextfrei-wordcount"&gt;kontextfrei wordcount example&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>spark</category>
      <category>scala</category>
      <category>showdev</category>
      <category>tdd</category>
    </item>
    <item>
      <title>The Empathic Programmer</title>
      <dc:creator>Daniel Westheide</dc:creator>
      <pubDate>Tue, 07 Feb 2017 08:42:20 +0000</pubDate>
      <link>https://forem.com/dwestheide/the-empathic-programmer</link>
      <guid>https://forem.com/dwestheide/the-empathic-programmer</guid>
      <description>&lt;p&gt;&lt;em&gt;This article was originally posted on &lt;a href="http://danielwestheide.com/blog/2017/01/16/the-empathic-programmer.html"&gt;Daniel Westheide's blog&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In 1999, Andrew Hunt and Dave Thomas, in their seminal book, demanded that programmers be &lt;a href="https://www.goodreads.com/book/show/4099.The_Pragmatic_Programmer"&gt;pragmatic&lt;/a&gt;. Ten years later, Chad Fowler, in his excellent book on career development, asked programmers to be &lt;a href="https://www.goodreads.com/book/show/6399113-the-passionate-programmer"&gt;passionate&lt;/a&gt;. Even today, I still consider a lot of the advice in both of these books to be incredibly valuable, especially Fowler's book that helped me a lot, personally.&lt;/p&gt;

&lt;p&gt;Nevertheless, in recent years, I have witnessed again and again that one other quality in programmers is at least as important and that it hasn't even seen a fraction of the attention it deserves. The programmer we should all strive to be is &lt;em&gt;the empathic programmer&lt;/em&gt;. Of course, I am not the only one, let alone the first one, to realize that. For starters, in my bubble, Benjamin Reitzimmer wrote an excellent post about what he considers to be &lt;a href="http://squeakyvessel.com/2015/05/12/mature-developers/"&gt;important qualities of a mature developer&lt;/a&gt; a while ago, and empathy is one of them. I consider a lack of empathy to be the root cause for some of the biggest problems in our industry and in the tech community. In this post, I want to share some observations on how a lack of empathy leads to problems. Consider it a call to strive for more empathy.&lt;/p&gt;

&lt;p&gt;So what is empathy? Here is a &lt;a href="https://www.merriam-webster.com/dictionary/empathy"&gt;definition from Merriam-Webster&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;the action of understanding, being aware of, being sensitive to, and vicariously experiencing the feelings, thoughts, and experience of another of either the past or present without having the feelings, thoughts, and experience fully communicated in an objectively explicit manner; also: the capacity for this&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Empathy at the workplace
&lt;/h2&gt;

&lt;p&gt;It shouldn't come as a surprise that the ability to show empathy can come in handy in any kind of job that involves working with other people, including the job as a programmer. This is true even if you work remotely – the other messages you see in your Slack channels are not &lt;em&gt;all&lt;/em&gt; coming from bots. There are actual human beings behind them.&lt;/p&gt;

&lt;p&gt;One of the situations where we often forget to think about that is code reviews. Just writing down what is wrong with a pull request without thinking about tone can easily leave its creator feeling personally offended. April Wensel has some &lt;a href="http://engineering.usertesting.com/2016/02/3-common-code-review-pitfalls/"&gt;good advice&lt;/a&gt; on code reviews. What's crucial is to develop some sensitivity for how your words will be perceived by the receiver, which requires you to put yourself in their shoes, see through their eyes, and reflect on how they will feel. The better you know the person, the easier this is; otherwise, you will have to make some assumptions – but even that is far better than not reflecting at all on how the other person will feel.&lt;/p&gt;

&lt;p&gt;Another workplace situation where I have often seen a lack of empathy is when members of two different teams need to collaborate to solve a problem or get a feature done. In some companies, I have seen an odd, competitive "us versus them" attitude between teams. This phenomenon has been &lt;a href="https://en.wikipedia.org/wiki/Ingroups_and_outgroups"&gt;explored by social and evolutionary psychologists&lt;/a&gt;, and while such behaviour might still be in our nature, that doesn't mean we cannot try to overcome it. A variant of "us versus them" is "developers versus managers". We developers have a hard time understanding why managers do what they do, but frankly, we often don't try very hard. I have often seen developers adopt a very defensive stance towards managers, and of course, the relationship between managers and developers in these cases was rather chilly. Getting to know "the other side" would certainly help us empathize with managers. Understanding why they act in a specific way is absolutely necessary in order to get to a healthy relationship with them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Empathy in the tech community
&lt;/h2&gt;

&lt;p&gt;Empathy is not only important at your workplace, but also very much so when you are interacting with others in our community – be it on mailing lists, at conferences, or when communicating with users of your open source library or with developers of an open source library you are using. In some of these situations, a lack of empathy can strengthen exclusion, ultimately leading to a closed community that is perceived as elitist and arrogant.&lt;/p&gt;

&lt;p&gt;As a developer using an open source library, empathize with the developers of the library before you start complaining about a bug – or, better yet, about a missing feature. Sam Halliday wrote an interesting post called &lt;a href="https://medium.com/@fommil/the-open-source-entitlement-complex-bcb718e2326d#.qtqmmnul7"&gt;The Open Source Entitlement Complex&lt;/a&gt;. It's hard to believe, but apparently, many users of open source libraries have the attitude that the developers of these libraries are some kind of service provider, happily working for free to do exactly what you want. That is not how it works. Just as wording and tone are important in code reviews, try to empathize with the developers who spend their free time on the library you use. Serving you and helping you out because you didn't read the documentation is probably not their highest priority in life, so don't treat them as if it were.&lt;/p&gt;

&lt;p&gt;On the other hand, when presenting your open source library to potential users, consider how these people will feel about that presentation. Does it make them feel respected? Does it make them feel welcome? I am sorry to disappoint you, but I think that a foo bar "for reasonable people" does not have that effect. Personally, I find this to be very condescending and think it will intimidate a lot of people and turn them away. It implies that any other way than yours is &lt;em&gt;not&lt;/em&gt; reasonable, and that, hence, people who have not used your library yet, but some different approach, are unreasonable people. As library authors, let's show some empathy as well towards our potential users. As always in tech, there is no silver bullet, and there are trade-offs. There are probably perfectly good reasons why someone has been using a different library so far, and maybe even after looking at your library, there will still be good reasons not to use yours. Even if you are convinced that your library is so much better, you aren't exactly creating an open and welcoming atmosphere by basically telling people visiting your project page that they are unreasonable for using anything else.&lt;/p&gt;

&lt;p&gt;If you are at a tech conference and you ask women whether they are developers at the very beginning of a conversation, but don't do the same with men, you are probably not doing that out of malice, but because you don't see many women at tech conferences who are actually developers. Nevertheless, to the receiver, this seemingly harmless and neutral question doesn't come across like that at all. She has probably heard it many times, and constantly hearing doubts about whether you are really a programmer doesn't exactly make you feel welcome, or confident. Show some empathy when you talk to other people at tech conferences. Imagine what it would be like to constantly be doubted, for example. If you don't see a need for being inclusive, that's probably because you had no problem being included in the community. This likely means that you are a man, and probably white. Since most people around you are like you, chances are you don't even know any women or other less privileged people who are developers. The problem with being privileged is that you don't notice it. Talk to women at conferences and let them tell you about their experiences. By showing empathy, you can create a more welcoming environment of inclusion and foster diversity.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;These are my two cents about empathy, and the lack thereof, in the tech community, and how it relates to inclusion and diversity. Empathy is important not only at the workplace, when interacting with co-workers, but also when we participate in the tech community as conference visitors, open source developers, and users of open source libraries. Only by showing empathy can we create an inclusive and open community. Let's try to be more aware of the effects we have on each other, and act accordingly. Thanks!&lt;/p&gt;

</description>
      <category>diversity</category>
      <category>inclusion</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
