<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Anna Geller</title>
    <description>The latest articles on Forem by Anna Geller (@annageller).</description>
    <link>https://forem.com/annageller</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F627480%2Fc5d597be-c700-46d0-b557-f38d341275a7.jpeg</url>
      <title>Forem: Anna Geller</title>
      <link>https://forem.com/annageller</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/annageller"/>
    <language>en</language>
    <item>
      <title>Here is What Happens If You Decouple Your BI Stack</title>
      <dc:creator>Anna Geller</dc:creator>
      <pubDate>Fri, 07 May 2021 22:05:59 +0000</pubDate>
      <link>https://forem.com/annageller/here-is-what-happens-if-you-decouple-your-bi-stack-51nd</link>
      <guid>https://forem.com/annageller/here-is-what-happens-if-you-decouple-your-bi-stack-51nd</guid>
      <description>&lt;p&gt;The field of Business Intelligence has significantly evolved over the last decades. While in the 1980s, it was considered an umbrella term for all data-driven decision-making activities, these days, it’s most commonly understood as solely the “visualization” and analytics part of the data lifecycle. Therefore, the term “headless BI” seems to be an oxymoron: how can something which inherently serves visualization be headless? The answer is thanks to the API layer. This article will demonstrate a decoupled headless BI stack that can be deployed to a Kubernetes cluster or even just to a Docker container on your local machine.&lt;/p&gt;

&lt;p&gt;The ultimate goal of Business Intelligence is to leverage data to guide business decisions and measure performance. Therefore, it’s essential to approach it strategically by following engineering best practices and building future-proof data ecosystems.&lt;/p&gt;




&lt;h3&gt;
  
  
  Decoupling
&lt;/h3&gt;

&lt;p&gt;Historically, monolithic, tightly coupled applications often proved difficult to scale, develop, redeploy, and maintain. Changing a single element of the system meant redeploying a new version of the whole application, which affected all other components.&lt;/p&gt;

&lt;p&gt;Microservices emerged as a popular design choice for breaking such monolithic architectures into independent services. A loosely coupled &lt;strong&gt;microservice architecture&lt;/strong&gt; is a collection of individual &lt;strong&gt;autonomous components&lt;/strong&gt; that are potentially unaware of each other. Each component typically performs a small amount of work and contributes to a larger overall output. By using decoupled components, we try to &lt;strong&gt;minimize dependencies&lt;/strong&gt; in the architecture: each element of the system should be able to work by itself and to communicate with other components by means of standardized protocols.&lt;/p&gt;

&lt;p&gt;By applying the same principles to Business Intelligence, we may, for instance, break large dashboards into &lt;strong&gt;individual charts, metrics, and insights&lt;/strong&gt; that can be (re)used in various reports. Several BI vendors approached it by introducing a &lt;a href="https://en.wikipedia.org/wiki/Semantic_layer" rel="noopener noreferrer"&gt;semantic layer&lt;/a&gt; that provides a common &lt;strong&gt;definition of metrics&lt;/strong&gt; that can be &lt;strong&gt;shared&lt;/strong&gt; across reports. What are the benefits of that approach?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;facilitating &lt;strong&gt;reuse of components&lt;/strong&gt; → the same insight can be reused across many different dashboards,&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;eliminating&lt;/strong&gt; &lt;strong&gt;update anomalies&lt;/strong&gt; → the insight or KPI needs to be defined and updated only once,&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;preventing duplicate efforts&lt;/strong&gt; → if somebody already created a specific metric and shared it, we can avoid building it for the second time,&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;eliminating conflicting KPI definitions&lt;/strong&gt; → sharing the same metric definitions provides a single source of truth for analytics and reduces the risk that the same KPI is defined differently in several places.&lt;/li&gt;
&lt;/ul&gt;
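
&lt;p&gt;The benefits listed above can be illustrated with a minimal sketch of a shared metric registry. This is not how any particular BI vendor implements its semantic layer; the &lt;code&gt;Metric&lt;/code&gt; and &lt;code&gt;SemanticLayer&lt;/code&gt; classes, the field names, and the formula strings are all hypothetical and exist only to show the idea of defining a KPI once and referencing it everywhere.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Metric:
    """A single, shared KPI definition (hypothetical structure)."""
    metric_id: str
    title: str
    expression: str  # illustrative formula text, not a real query language


@dataclass
class SemanticLayer:
    """Holds every metric exactly once; dashboards only reference metric ids."""
    metrics: dict = field(default_factory=dict)

    def define(self, metric):
        # Refuse a second, possibly conflicting definition of the same KPI.
        if metric.metric_id in self.metrics:
            raise ValueError(f"metric {metric.metric_id!r} is already defined")
        self.metrics[metric.metric_id] = metric

    def update(self, metric):
        # One update here is visible to every dashboard that references the id.
        if metric.metric_id not in self.metrics:
            raise KeyError(metric.metric_id)
        self.metrics[metric.metric_id] = metric

    def resolve(self, metric_ids):
        return [self.metrics[metric_id] for metric_id in metric_ids]


layer = SemanticLayer()
layer.define(Metric("revenue", "Revenue", "SUM(order_amount)"))

# Dashboards store metric ids, never their own copy of the formula.
sales_dashboard = ["revenue"]
finance_dashboard = ["revenue"]

# The KPI is redefined once and both dashboards pick up the change.
layer.update(Metric("revenue", "Net revenue", "SUM(order_amount) - SUM(refund_amount)"))
print([metric.title for metric in layer.resolve(sales_dashboard)])
print([metric.title for metric in layer.resolve(finance_dashboard)])
```

&lt;p&gt;Because both dashboards hold only the id &lt;code&gt;revenue&lt;/code&gt;, there is no second copy of the formula that could drift out of sync.&lt;/p&gt;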




&lt;h3&gt;
  
  
  API-first approach
&lt;/h3&gt;

&lt;p&gt;As mentioned in the introduction, the term headless BI seems to be an oxymoron at first. But as long as the BI architecture is built in the API-first way, headless BI can be accomplished. At first, you may shrug and say: well then, everything is now behind an API, so what? Nothing changes, right? Not quite. By building BI software on top of a well-designed API, you are opening doors for all sorts of &lt;strong&gt;automation, versioning, easier backups, access control, and programmatically scaling your architecture&lt;/strong&gt; to new customers and domains. For instance, you can:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;a) Share specific analytical applications with external stakeholders&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Imagine that you need to share some dashboards with your external partners. The API-based approach allows you to assign specific fine-grained access permissions and limit the scope of what those users can do with the analytical application (&lt;em&gt;for instance, seeing only specific areas or having read-only access&lt;/em&gt;).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;b) Easily move from development to production&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;declarative definition&lt;/strong&gt; of the analytical application allows you to build your charts and metrics in the development environment. Once everything is thoroughly tested, you can move to production by exporting the declarative definition file and importing it into a new environment with just a few API calls.&lt;/p&gt;
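
&lt;p&gt;As a rough sketch of what such an export and import could look like, the code below builds the two HTTP requests with Python's standard library. The host names, the tokens, the workspace id, and in particular the &lt;code&gt;/api/layout/workspaces/{workspace_id}&lt;/code&gt; path are assumptions made for illustration; check them against the API reference of the platform and version you actually run.&lt;/p&gt;

```python
import json
import urllib.request

# Assumption for illustration only: the path of the declarative definition.
# Verify it against the API reference of your own deployment.
LAYOUT_PATH = "/api/layout/workspaces/{workspace_id}"


def export_request(base_url, token, workspace_id):
    """Build a GET request that reads a workspace's declarative definition."""
    url = base_url.rstrip("/") + LAYOUT_PATH.format(workspace_id=workspace_id)
    return urllib.request.Request(
        url, method="GET", headers={"Authorization": f"Bearer {token}"}
    )


def import_request(base_url, token, workspace_id, definition):
    """Build a PUT request that writes a definition into another environment."""
    url = base_url.rstrip("/") + LAYOUT_PATH.format(workspace_id=workspace_id)
    return urllib.request.Request(
        url,
        data=json.dumps(definition).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def promote(dev_url, prod_url, dev_token, prod_token, workspace_id):
    """Copy a tested definition from development to production."""
    export = export_request(dev_url, dev_token, workspace_id)
    with urllib.request.urlopen(export) as response:
        definition = json.load(response)
    upload = import_request(prod_url, prod_token, workspace_id, definition)
    with urllib.request.urlopen(upload) as response:
        return response.status
```

&lt;p&gt;Nothing is sent until &lt;code&gt;promote&lt;/code&gt; is called, so the request builders can be inspected or unit-tested without a running server.&lt;/p&gt;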

&lt;p&gt;&lt;strong&gt;c) Apply GitOps and Infrastructure as Code to your BI applications&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Just as you would approach any software development project, an API-based BI stack allows you to version-control your dashboards and KPI definitions for reliability. If you notice that a new KPI declaration seems to be wrong at some point, you can &lt;strong&gt;roll back to the previous version&lt;/strong&gt;. This way, you can additionally &lt;strong&gt;track&lt;/strong&gt; how your metric and dashboard definitions &lt;strong&gt;change over time&lt;/strong&gt;. Finally, if somebody accidentally changed or deleted a chart or dashboard, you can &lt;strong&gt;recover&lt;/strong&gt; from it by recreating the most recent version from a Git repository.&lt;/p&gt;
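
&lt;p&gt;To make the idea of tracking changes concrete, here is a small sketch that compares two versions of a definition the way a Git diff would. The KPI dictionaries and their field names are hypothetical; in practice, the two versions would be files exported from the BI platform and committed to a repository.&lt;/p&gt;

```python
import difflib
import json


def canonical(definition):
    """Serialize with sorted keys so diffs show real changes, not key order."""
    return json.dumps(definition, indent=2, sort_keys=True).splitlines()


def definition_diff(old, new):
    """Return a unified diff between two versions of a definition."""
    return list(
        difflib.unified_diff(
            canonical(old),
            canonical(new),
            fromfile="previous",
            tofile="current",
            lineterm="",
        )
    )


# Hypothetical KPI definitions; the field names are for illustration only.
previous = {"id": "revenue", "title": "Revenue", "expression": "SUM(amount)"}
current = {"id": "revenue", "title": "Revenue", "expression": "AVG(amount)"}

for line in definition_diff(previous, current):
    print(line)
```

&lt;p&gt;Serializing with sorted keys keeps the diff stable across exports, and rolling back amounts to re-importing the &lt;code&gt;previous&lt;/code&gt; version.&lt;/p&gt;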

&lt;p&gt;&lt;strong&gt;d) Build analytical applications that can be deployed as a service&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you want to build a custom front-end application, the microservice approach allows you to do that. An example of how this can be accomplished: &lt;strong&gt;&lt;a href="https://www.gooddata.com/developers/cloud-native/doc/1.0/getting-started/create-ui-sdk-app/" rel="noopener noreferrer"&gt;Embed Analytics into Your Application&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To sum up this section, with the API-first approach, the sky’s the limit. Almost everything can be automated, versioned, and extended.&lt;/p&gt;




&lt;h3&gt;
  
  
  How can we put it to practice?
&lt;/h3&gt;

&lt;p&gt;Hopefully, by now, you know why a decoupled, API-based BI stack may be attractive for many use cases. It’s time to see it in practice. GoodData has recently released its Cloud Native platform, called &lt;strong&gt;&lt;a href="https://www.gooddata.com/developers/cloud-native/?utm_source=mediumcom&amp;amp;utm_medium=referral&amp;amp;utm_campaign=gdcn&amp;amp;utm_content=anna-decouple" rel="noopener noreferrer"&gt;GoodData.CN&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Overall, GoodData has been offering BI capabilities for nearly two decades and has learned what is needed to provide analytics at scale while satisfying the needs of analysts, business users, and front-end application developers. Their newly released &lt;strong&gt;cloud-native platform&lt;/strong&gt; offers a fully-fledged analytical engine in a &lt;strong&gt;single Docker container image&lt;/strong&gt;. You can use it to test the platform &lt;strong&gt;on your local machine&lt;/strong&gt;, in an &lt;strong&gt;on-prem data center&lt;/strong&gt;, or &lt;strong&gt;in the cloud&lt;/strong&gt;. The associated Helm chart helps with a deployment to a Kubernetes cluster of your choice.&lt;/p&gt;

&lt;p&gt;The main benefit of the cloud-native platform is that you can host this BI architecture &lt;strong&gt;close to your data&lt;/strong&gt;. If your Redshift or Snowflake data warehouse resides on AWS in the &lt;code&gt;us-east-1&lt;/code&gt; region, you can deploy the GoodData.CN Helm chart to a Kubernetes cluster in the same region, &lt;strong&gt;minimizing latency&lt;/strong&gt;. This approach of &lt;strong&gt;moving your BI stack to your data&lt;/strong&gt; rather than the other way around is a considerable improvement compared to the older BI tools that required you to first install some client application on your computer and then import your data into it. This is possible thanks to several microservice components that communicate over a REST API. The API-based semantic layer defines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.gooddata.com/developers/cloud-native/doc/1.0/getting-started/connect-data/#connect-your-own-database?utm_source=mediumcom&amp;amp;utm_medium=referral&amp;amp;utm_campaign=gdcn&amp;amp;utm_content=anna-decouple" rel="noopener noreferrer"&gt;from where to read the data&lt;/a&gt;,&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.gooddata.com/developers/cloud-native/doc/1.0/concepts/logical-data-model/?utm_source=mediumcom&amp;amp;utm_medium=referral&amp;amp;utm_campaign=gdcn&amp;amp;utm_content=anna-decouple" rel="noopener noreferrer"&gt;how this data is structured&lt;/a&gt; (the logical data model),&lt;/li&gt;
&lt;li&gt;how to &lt;a href="https://medium.com/gooddata-developers/high-concurrency-analytics-with-gooddata-7d5ca988966f" rel="noopener noreferrer"&gt;optimize the query&lt;/a&gt; for the best possible performance,&lt;/li&gt;
&lt;li&gt;and how to apply &lt;a href="https://community.gooddata.com/dashboards-and-reports-56/how-does-gooddata-cache-reports-128?utm_source=mediumcom&amp;amp;utm_medium=referral&amp;amp;utm_campaign=gdcn&amp;amp;utm_content=anna-decouple" rel="noopener noreferrer"&gt;intelligent caching&lt;/a&gt; with Redis under the hood when reading data.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Demo: All-In-One Docker container
&lt;/h3&gt;

&lt;p&gt;All components required to use the cloud-native platform are available through a &lt;a href="https://hub.docker.com/r/gooddata/gooddata-cn-ce" rel="noopener noreferrer"&gt;Docker container&lt;/a&gt; that already contains a Postgres database with sample sales data. We first pull the image from Docker Hub:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker pull gooddata/gooddata-cn-ce:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we can start the container. Note that this image is &lt;strong&gt;not meant to be used in production&lt;/strong&gt;. Therefore, you need to accept the Non-Production License Agreement. You can do that either:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;by running the image with the interactive flags (&lt;code&gt;-i -t&lt;/code&gt;) and then accepting the license agreement via the console:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker run --name gooddata -i -t -p 3000:3000 -p 5432:5432 gooddata/gooddata-cn-ce:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;or by passing the environment variable &lt;code&gt;LICENSE_AND_PRIVACY_POLICY_ACCEPTED=YES&lt;/code&gt;:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker run --name gooddata -p 3000:3000 -p 5432:5432 \
  -e LICENSE_AND_PRIVACY_POLICY_ACCEPTED=YES gooddata/gooddata-cn-ce:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As shown in the command above, the GoodData UI will be served via port 3000 and Postgres via port 5432. Overall, you will see a lot of logs printed in your console. Once you see the following output, your setup is complete:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/===== All services of GoodData.CN are ready ======\
   |
   | Navigate your browser to http://localhost:3000/
   |
   | You can log in as user demo@example.com with password demo123
   | To access API, use Bearer token YWRtaW46Ym9vdHN0cmFwOmFkbWluMTIz
   |
   \======== All services of GoodData.CN are ready ====/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can now navigate to &lt;strong&gt;&lt;a href="http://localhost:3000/" rel="noopener noreferrer"&gt;http://localhost:3000/&lt;/a&gt;&lt;/strong&gt; in our browser and log in using email: &lt;code&gt;demo@example.com&lt;/code&gt; and password: &lt;code&gt;demo123&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1280%2F1%2Aky_n74JOhQr7Hu5-6Ei0EQ.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1280%2F1%2Aky_n74JOhQr7Hu5-6Ei0EQ.png" alt="https://cdn-images-1.medium.com/max/1280/1*ky_n74JOhQr7Hu5-6Ei0EQ.png"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;GoodData cloud-native UI — image by the author&lt;/p&gt;

&lt;p&gt;Once we are logged in, we can create a &lt;code&gt;demo&lt;/code&gt; workspace, connect to data from our database, and start creating insights and dashboards, as shown below.&lt;/p&gt;

&lt;p&gt;First steps in the GoodData cloud-native UI — image by the author&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1920%2F1%2AXBVPLP8sdYD9HVNZH4DV_g.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1920%2F1%2AXBVPLP8sdYD9HVNZH4DV_g.gif" alt="https://cdn-images-1.medium.com/max/1920/1*XBVPLP8sdYD9HVNZH4DV_g.gif"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To create our first simple dashboard, we can drag and drop the relevant metrics into the visualization canvas. Once we are done, we can &lt;code&gt;Save &amp;amp; Publish&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Creating a first dashboard — image by the author&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1920%2F1%2A8U9ZuHt0IV5Mb_eH1tnb0g.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1920%2F1%2A8U9ZuHt0IV5Mb_eH1tnb0g.gif" alt="https://cdn-images-1.medium.com/max/1920/1*8U9ZuHt0IV5Mb_eH1tnb0g.gif"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Finally, to demonstrate the power of the underlying API, you can look at the &lt;a href="https://www.gooddata.com/developers/cloud-native/doc/1.0/apidocs/api_reference_all/#/entities/getAllEntities%40Metrics%3Futm_source%3Dmediumcom%26utm_medium%3Dreferral%26utm_campaign%3Dgdcn%26utm_content%3Danna-decouple" rel="noopener noreferrer"&gt;OpenAPI definition&lt;/a&gt; to explore the kinds of automation available. Here is Python code that uses the API to get information about available workspaces, dashboards, and metrics (&lt;em&gt;for raw code, see &lt;a href="https://gist.github.com/5d7fba7b65a2093bf44bfc6681479954" rel="noopener noreferrer"&gt;the following Gist&lt;/a&gt;&lt;/em&gt;):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1920%2F1%2AGtgB6_UP4g0NjCuWHpzOuw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1920%2F1%2AGtgB6_UP4g0NjCuWHpzOuw.png" alt="https://cdn-images-1.medium.com/max/1920/1*GtgB6_UP4g0NjCuWHpzOuw.png"&gt;&lt;/a&gt;&lt;/p&gt;
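
&lt;p&gt;Since the code above is only available as a screenshot, here is a rough, standard-library-only sketch of the same idea. It is not the code from the Gist. It assumes that the demo container is running on &lt;code&gt;localhost:3000&lt;/code&gt;, that the bearer token is the one printed in the startup banner, that the entity paths (&lt;code&gt;/api/entities/workspaces&lt;/code&gt;, &lt;code&gt;.../analyticalDashboards&lt;/code&gt;, &lt;code&gt;.../metrics&lt;/code&gt;) match your version of the API reference, and that responses carry their items in a &lt;code&gt;data&lt;/code&gt; list.&lt;/p&gt;

```python
import json
import urllib.request

# Assumptions: the container from the demo is running locally, the token is
# the one printed in its startup banner, and the entity paths used below
# match the API reference of the version you run.
BASE_URL = "http://localhost:3000"
TOKEN = "YWRtaW46Ym9vdHN0cmFwOmFkbWluMTIz"


def api_get(path):
    """Send an authenticated GET request and decode the JSON response."""
    request = urllib.request.Request(
        BASE_URL + path, headers={"Authorization": f"Bearer {TOKEN}"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def entity_ids(payload):
    """Extract ids from a payload shaped like {"data": [{"id": ...}]}."""
    return [item["id"] for item in payload.get("data", [])]


def main():
    """Print the dashboards and metrics of every workspace."""
    for workspace_id in entity_ids(api_get("/api/entities/workspaces")):
        prefix = f"/api/entities/workspaces/{workspace_id}"
        dashboards = entity_ids(api_get(prefix + "/analyticalDashboards"))
        metrics = entity_ids(api_get(prefix + "/metrics"))
        print(workspace_id, "dashboards:", dashboards, "metrics:", metrics)


if __name__ == "__main__":
    try:
        main()
    except (OSError, ValueError) as error:
        # Raised when the container is not reachable or returns non-JSON data.
        print("Could not query the API:", error)
```

&lt;p&gt;Keeping the parsing in &lt;code&gt;entity_ids&lt;/code&gt; separate from the network call makes it easy to adapt the sketch if the response shape differs in your deployment.&lt;/p&gt;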




&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;In this article, we looked at the benefits of bringing microservice architecture to a BI stack. We looked at the &lt;strong&gt;semantic layer&lt;/strong&gt; in the data modeling process and how &lt;strong&gt;decoupling&lt;/strong&gt; can help make analytical processes more resilient. Finally, we used the community edition of the brand new cloud-native &lt;a href="https://www.gooddata.com/developers/cloud-native-community-edition?utm_source=mediumcom&amp;amp;utm_medium=referral&amp;amp;utm_campaign=gdcn&amp;amp;utm_content=anna-decouple" rel="noopener noreferrer"&gt;GoodData.CN&lt;/a&gt; platform that provides a fully-fledged BI platform in a single Docker container. This setup is great for development, non-production, and evaluation processes, but if you want to use it for production, you should look at the &lt;a href="https://www.gooddata.com/developers/cloud-native/#plans?utm_source=mediumcom&amp;amp;utm_medium=referral&amp;amp;utm_campaign=gdcn&amp;amp;utm_content=anna-decouple" rel="noopener noreferrer"&gt;Helm chart Kubernetes deployment&lt;/a&gt;. Or, if you are considering a fully hosted solution instead, look at the &lt;a href="https://www.gooddata.com/technical-overview?utm_source=mediumcom&amp;amp;utm_medium=referral&amp;amp;utm_campaign=gdcn&amp;amp;utm_content=anna-decouple" rel="noopener noreferrer"&gt;technical overview&lt;/a&gt; of the GoodData platform.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Thank you for reading!&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>analytics</category>
      <category>bigdata</category>
      <category>microservices</category>
      <category>architecture</category>
    </item>
  </channel>
</rss>
