<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Adolfo Estevez</title>
    <description>The latest articles on Forem by Adolfo Estevez (@aestevezjimenez).</description>
    <link>https://forem.com/aestevezjimenez</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F356605%2F9f4eb813-5a86-42b4-a0d1-cfd60dc8410a.jpg</url>
      <title>Forem: Adolfo Estevez</title>
      <link>https://forem.com/aestevezjimenez</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/aestevezjimenez"/>
    <language>en</language>
    <item>
      <title>GCP Professional Data Engineer Guide - September 2020</title>
      <dc:creator>Adolfo Estevez</dc:creator>
      <pubDate>Tue, 06 Oct 2020 07:01:11 +0000</pubDate>
      <link>https://forem.com/aestevezjimenez/gcp-professional-data-engineer-guide-september-2020-7lp</link>
      <guid>https://forem.com/aestevezjimenez/gcp-professional-data-engineer-guide-september-2020-7lp</guid>
      <description>&lt;p&gt;I have recently recalled &lt;strong&gt;my first experience with GCP&lt;/strong&gt;. It was in London, shortly before the 2012 Olympics, in an &lt;strong&gt;online gaming project&lt;/strong&gt;, initially thought for &lt;em&gt;AWS&lt;/em&gt;, that was migrated to App Engine -  &lt;em&gt;&lt;strong&gt;PAAS platform that would evolve to the current GCP&lt;/strong&gt;&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My initial impression was good,&lt;/strong&gt; although the platform &lt;strong&gt;imposed a number of development limitations&lt;/strong&gt;, which would later be reduced with the release of &lt;strong&gt;&lt;em&gt;App Engine Flexible&lt;/em&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Coinciding with the launch of &lt;em&gt;&lt;strong&gt;TensorFlow&lt;/strong&gt;&lt;/em&gt; as an open-source framework in 2015, I was lucky enough to &lt;strong&gt;attend a workshop on neural networks&lt;/strong&gt; - given by one of the AI scientists from &lt;em&gt;Google Seattle&lt;/em&gt; - where I had my second experience with the platform. I was very surprised by the &lt;strong&gt;simplicity of configuration and deployment&lt;/strong&gt;, the &lt;strong&gt;NoOps concept and a Machine Learning / AI offering&lt;/strong&gt; without competition at the time.&lt;/p&gt;

&lt;p&gt;&lt;iframe src="https://player.vimeo.com/video/132700334" width="710" height="399"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;Do Androids Dream of Electric Sheep? Philip K. Dick would have "hallucinated" with the electric dreams of neural networks - powered by TensorFlow.&lt;/p&gt;

&lt;h2&gt;Exam&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;structure of the exam&lt;/strong&gt; is the usual one in GCP exams: &lt;em&gt;&lt;strong&gt;2 hours and 50 questions&lt;/strong&gt;&lt;/em&gt;, in a format oriented toward scenario-type questions, mixing &lt;strong&gt;questions of high difficulty&lt;/strong&gt; with simpler ones of &lt;strong&gt;medium-low difficulty&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In general, &lt;strong&gt;to choose the correct answer&lt;/strong&gt;, you &lt;em&gt;&lt;strong&gt;have to apply both technical and business criteria&lt;/strong&gt;&lt;/em&gt;. You therefore need deep knowledge of the services from a technological point of view, as well as the skill and experience to apply business criteria contextually, depending on the question, type of environment, sector, application, etc.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmetanube.files.wordpress.com%2F2020%2F08%2Fbd-data-lake.png%3Fw%3D698" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmetanube.files.wordpress.com%2F2020%2F08%2Fbd-data-lake.png%3Fw%3D698" alt=""&gt;&lt;/a&gt;Image #1, Data Lake, the ubiquitous architecture - Image owned by GCP&lt;/p&gt;

&lt;p&gt;We can group the relevant services according to the states (and substates) of the data cycle:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;Management, Storage, Transformation and Analysis.&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Ingestion Batch&lt;/em&gt;&lt;/strong&gt; / &lt;strong&gt;&lt;em&gt;Data Lake&lt;/em&gt;&lt;/strong&gt;: Cloud Storage.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Ingestion&lt;/strong&gt;&lt;/em&gt;&lt;strong&gt;&lt;em&gt; Streaming&lt;/em&gt;&lt;/strong&gt;: Kafka, Pub/Sub, Computing Services, Cloud IoT Core.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Migrations:&lt;/strong&gt; &lt;/em&gt;Transfer Appliance, Transfer Service, Interconnect, gsutil.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Transformations&lt;/strong&gt;&lt;/em&gt;: Dataflow, Dataproc, Cloud Dataprep, Hadoop, Apache Beam.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Computing: &lt;/em&gt;&lt;/strong&gt;Kubernetes Engine, Compute Instances, Cloud Functions, App Engine.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Storage&lt;/em&gt;&lt;/strong&gt;: Cloud SQL, Cloud Spanner, Datastore / Firebase, BigQuery, BigTable, HBase, MongoDB, Cassandra.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Cache&lt;/em&gt;&lt;/strong&gt;: Cloud Memorystore, Redis.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Analysis / Data Operations:&lt;/strong&gt; &lt;/em&gt;BigQuery, Cloud Datalab, Data Studio, DataPrep, Cloud Composer, Apache Airflow.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Machine Learning:&lt;/em&gt;&lt;/strong&gt; AI Platform, BigQuery ML, Cloud AutoML, TensorFlow, Cloud Text-to-Speech API, Cloud Speech-to-Text, Cloud Vision API, Cloud Video Intelligence, Translation, Recommendations API, Cloud Inference API, Natural Language, Dialogflow, Spark MLlib.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;IoT&lt;/em&gt;&lt;/strong&gt;: Cloud IoT Core, Cloud IoT Edge. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Security &amp;amp; encryption&lt;/em&gt;&lt;/strong&gt;: IAM, Roles, Encryption, KMS, Data Loss Prevention API, Compliance ... &lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Operations:&lt;/strong&gt;&lt;/em&gt; Kubeflow, AI Platform, Cloud Deployment Manager ... &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Monitoring:&lt;/em&gt;&lt;/strong&gt; Stackdriver Logging, Stackdriver Monitoring.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Optimization:&lt;/em&gt;&lt;/strong&gt; Cost control, Autoscaling, Preemptible instances ...&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Pre-requisites and recommendations&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;At this level of certification, the questions do not, in general, refer to a single topic&lt;/strong&gt;. That is, a question from the Analytics domain may require more or less advanced knowledge of &lt;em&gt;Computing, Security, Networking or DevOps&lt;/em&gt; to solve it successfully. I'd recommend holding the &lt;em&gt;GCP Associate Cloud Engineer&lt;/em&gt; certification or having equivalent knowledge.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;GCP experience at the architectural level.&lt;/em&gt;&lt;/strong&gt; The exam focuses, in part, on solution architecture, the design and deployment of data pipelines, and the selection of technologies to solve business problems - and, to a lesser extent, on development. I'd recommend studying as many &lt;a href="https://cloudblog.withgoogle.com/products/application-development/13-popular-application-architectures-for-google-cloud/amp/" rel="noopener noreferrer"&gt;reference architectures&lt;/a&gt; as possible, such as the ones I show in this guide.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;GCP experience at the development level.&lt;/em&gt;&lt;/strong&gt; Although no explicit programming questions appeared in my question set, or in the mock test, the exam requires technical knowledge of services and APIs: &lt;em&gt;SQL, Python, REST, algorithms, MapReduce, Spark, Apache Beam (Dataflow)&lt;/em&gt; …&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;GCP experience at the security level.&lt;/strong&gt;&lt;/em&gt; A domain that appears transversally in all certifications - I'd recommend knowledge at the level of &lt;em&gt;Associate Cloud Engineer&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;GCP experience at the networking level.&lt;/em&gt;&lt;/strong&gt; Another domain that appears transversally - I'd recommend knowledge at the level of Associate Cloud Engineer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Knowledge of Data Analytics.&lt;/em&gt;&lt;/strong&gt; It's a no-brainer, but some domain knowledge is essential. Otherwise, I'd recommend studying books like &lt;em&gt;“Data Analytics with Hadoop”&lt;/em&gt; or taking courses like the specialized program &lt;em&gt;&lt;a href="https://www.coursera.org/specializations/gcp-data-machine-learning" rel="noopener noreferrer"&gt;Data Engineering, Big Data and ML on Google Cloud on Coursera&lt;/a&gt;.&lt;/em&gt; Likewise, practicing with labs or pet projects is essential to gain some hands-on experience.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Knowledge of the Hadoop - Spark ecosystem&lt;/em&gt;&lt;/strong&gt;. Connected with the previous point. High-level knowledge of the ecosystem is necessary: &lt;em&gt;MapReduce, Spark, Hive, HDFS, Pig&lt;/em&gt; …&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Knowledge of Machine Learning and IoT.&lt;/em&gt;&lt;/strong&gt; Advanced knowledge of &lt;em&gt;Data Science and Machine Learning&lt;/em&gt; is essential, on top of specific knowledge of GCP products. There are questions exclusively about this domain - at the level of certifications like &lt;em&gt;AWS Machine Learning&lt;/em&gt; or higher. IoT appears on the exam in a lighter form, but it is essential to know the reference architecture and services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;DevOps experience.&lt;/em&gt;&lt;/strong&gt; Concepts such as CI/CD and infrastructure or configuration as code are of great importance today, and this is reflected in the exam, although they do not carry great specific weight.&lt;/li&gt;
&lt;/ul&gt;
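&lt;p&gt;To make the MapReduce model mentioned above concrete, here is a minimal, framework-free sketch in Python - the classic word count. It only illustrates the map / shuffle / reduce phases; the function names are my own and no Hadoop is involved.&lt;/p&gt;

```python
from collections import defaultdict
from itertools import chain

# Minimal MapReduce-style word count (model illustration only, no Hadoop):
# map emits (key, 1) pairs, shuffle groups pairs by key, reduce sums per key.

def map_phase(document):
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    return {key: sum(values) for key, values in groups.items()}

documents = ["big data on GCP", "data pipelines on GCP"]
mapped = chain.from_iterable(map_phase(doc) for doc in documents)
counts = reduce_phase(shuffle_phase(mapped))
print(counts["data"])  # 2
print(counts["gcp"])   # 2
```

The same three-phase shape scales from this toy to a cluster: only the partitioning of the shuffle and the distribution of workers change.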



&lt;h3&gt;Standard questions&lt;/h3&gt;

&lt;p&gt;A question representative of the difficulty level of the &lt;a href="https://cloud.google.com/certification/practice-exam/data-engineer" rel="noopener noreferrer"&gt;exam&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmetanube.files.wordpress.com%2F2020%2F08%2Fcaptura-de-pantalla-2020-08-29-a-las-16.19.35.png%3Fw%3D1024" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmetanube.files.wordpress.com%2F2020%2F08%2Fcaptura-de-pantalla-2020-08-29-a-las-16.19.35.png%3Fw%3D1024" alt=""&gt;&lt;/a&gt;Image property of GCP&lt;/p&gt;

&lt;p&gt;A practical migration-scenario question that includes cloud services and the &lt;em&gt;Hadoop ecosystem&lt;/em&gt;, as well as concepts from the &lt;em&gt;Analytics domain&lt;/em&gt;.&lt;/p&gt;

&lt;h4&gt;&lt;strong&gt;Services to study in detail&lt;/strong&gt;&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmetanube.files.wordpress.com%2F2020%2F08%2Fretail-rt-inventory.png%3Fw%3D996" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmetanube.files.wordpress.com%2F2020%2F08%2Fretail-rt-inventory.png%3Fw%3D996" alt=""&gt;&lt;/a&gt;Image #2 - property of GCP&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Cloud Storage&lt;/em&gt;&lt;/strong&gt; - Core service that appears consistently in all certifications and is central to &lt;em&gt;&lt;strong&gt;Data Lake&lt;/strong&gt;&lt;/em&gt; systems. I'd recommend studying it in detail at the architectural level - see Image 1 -, its configuration according to data temperature, and its role as an integration / storage element between the different services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;BigQuery&lt;/em&gt;&lt;/strong&gt; - Core service in the &lt;em&gt;GCP Analytics&lt;/em&gt; domain as a BI and storage element. Extremely important in the exam, so it has to be studied in detail: architecture, configuration, backups, export / import, streaming, batch, security, partitioning, sharding, projects, datasets, views, integration with other services, cost, queries and SQL optimization (legacy and standard) at the level of tables, keys …&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Pub/Sub&lt;/em&gt;&lt;/strong&gt; - Core service as an ingestion and integration element. Its in-depth study is highly recommended: use cases, architecture, configuration, API, security and integration with other services (e.g. &lt;em&gt;Dataflow, Cloud Storage&lt;/em&gt;). It is the cloud-native counterpart of Kafka.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Dataflow&lt;/em&gt;&lt;/strong&gt; - Core service in the &lt;em&gt;GCP Analytics domain&lt;/em&gt; as a processing and transformation element. It is an implementation of &lt;em&gt;Apache Beam&lt;/em&gt;, which you need to know at a high level, along with pipeline design. Use cases, architecture, configuration, API and integration with other services.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Dataproc&lt;/strong&gt;&lt;/em&gt; - Core service in the &lt;em&gt;GCP Analytics domain&lt;/em&gt; as a processing and transformation element. It is a Hadoop-based service and, therefore, the go-to service for migrating Hadoop workloads to the cloud. In this case, knowledge is required not only of Dataproc but also of the native services - &lt;em&gt;Spark, HDFS, HBase, Pig&lt;/em&gt; … - plus use cases, architecture, configuration, import / export, reliability, optimization, cost, API and integration with other services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Cloud SQL, Cloud Spanner&lt;/em&gt;&lt;/strong&gt; - Cloud native relational databases. Use cases, architecture, configuration, security, performance, reliability, cost and optimization: clusters, transactionality, disaster recovery, backups, export / import, SQL performance and optimization, tables, queries, keys and debugging. Integration with other services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Cloud Bigtable&lt;/em&gt;&lt;/strong&gt; - Low-latency managed &lt;em&gt;NoSQL&lt;/em&gt; database, suitable for time series, IoT … and ideal to replace an on-premises &lt;em&gt;HBase&lt;/em&gt; installation. Use cases, architecture, configuration, security, performance, reliability and optimization: clusters, CAP, backups, export / import, partitioning, and the performance and optimization of tables, queries and keys. Integration with other services. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Machine Learning&lt;/em&gt;&lt;/strong&gt; - One of the strengths of the certification is the domain "&lt;em&gt;Operationalizing machine learning models&lt;/em&gt;". It is much denser and more complex than it may seem at first, since it covers not only the operation and knowledge of the relevant GCP services, but also Data Science fundamentals: algorithm selection, optimization, metrics … The difficulty of the questions varies, but it is comparable to that of specific certifications such as &lt;em&gt;AWS Certified Machine Learning - Specialty&lt;/em&gt;. Most important services: &lt;em&gt;BigQuery ML, Cloud Vision API, Cloud Video Intelligence, Cloud AutoML, TensorFlow, Dialogflow, GPUs, TPUs&lt;/em&gt; …&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Security&lt;/em&gt;&lt;/strong&gt; - Security is a transversal concern across all domains and appears consistently in all certifications. In this case, it appears as an independent technical topic, a crosscutting concern or a business requirement: &lt;em&gt;KMS, IAM, Policies, Roles, Encryption, Data Loss Prevention API …&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;
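&lt;p&gt;To illustrate the fan-out behaviour that makes Pub/Sub so useful as an integration element, here is a toy in-memory sketch of the pattern. It is not the google-cloud-pubsub API - the class and topic names are invented for the example - but it shows why several consumers, say a Dataflow pipeline and an archiver, each receive their own copy of a published message.&lt;/p&gt;

```python
from collections import defaultdict

# Toy in-memory model of the Pub/Sub pattern (illustration only, NOT the
# google-cloud-pubsub client API): publishers push messages to a topic and
# every subscription attached to that topic receives its own copy (fan-out).

class Broker:
    def __init__(self):
        self.topics = defaultdict(list)  # topic name -> list of subscription queues

    def subscribe(self, topic):
        queue = []
        self.topics[topic].append(queue)
        return queue

    def publish(self, topic, message):
        for queue in self.topics[topic]:
            queue.append(message)  # one copy per subscription

broker = Broker()
dataflow_sub = broker.subscribe("clickstream")  # e.g. a streaming pipeline
storage_sub = broker.subscribe("clickstream")   # e.g. an archiving consumer
broker.publish("clickstream", {"user": "u1", "page": "/home"})

print(len(dataflow_sub), len(storage_sub))  # 1 1
```

The real service adds the parts worth studying for the exam: durable storage, acknowledgements and redelivery, push vs. pull delivery, and at-least-once semantics.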
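&lt;p&gt;One Bigtable design point worth practicing is row-key construction for time series: keys are stored in lexicographic order, so leading with a raw timestamp concentrates writes on a single node. A sketch of promoting the device id and reversing the timestamp so the newest readings sort first - the separator and id format are my own choices for the illustration, not a GCP prescription:&lt;/p&gt;

```python
# Bigtable-style row-key design for time series (illustrative naming, not
# the client API). Leading with the device id spreads writes across the
# key space; a reversed timestamp makes the most recent reading sort first
# within each device's range.

MAX_TS = 10**13  # upper bound for millisecond timestamps in this sketch

def row_key(device_id, timestamp_ms):
    reversed_ts = MAX_TS - timestamp_ms
    return f"{device_id}#{reversed_ts:013d}"

keys = sorted(
    row_key("sensor-042", ts) for ts in (1_600_000_000_000, 1_600_000_060_000)
)
# The later reading sorts first within the sensor's key range:
print(keys[0] == row_key("sensor-042", 1_600_000_060_000))  # True
```

The same trade-off appears in exam scenarios: sequential keys give fast range scans but hotspot writes, so the key is designed around the dominant query pattern.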

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmetanube.files.wordpress.com%2F2020%2F09%2F12_2fk8rre.max-1100x1100-1.png%3Fw%3D1024" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmetanube.files.wordpress.com%2F2020%2F09%2F12_2fk8rre.max-1100x1100-1.png%3Fw%3D1024" alt=""&gt;&lt;/a&gt;Image #3, IoT Reference Architecture - owned by GCP&lt;/p&gt;



&lt;h4&gt;&lt;strong&gt;Very important services to consider&lt;/strong&gt;&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Networking&lt;/em&gt;&lt;/strong&gt; - Cross-domain topic that can appear as separate technical questions, crosscutting concerns, or business requirements: &lt;em&gt;VPC, Dedicated Interconnect, Multi-Region / Zone, Hybrid connectivity, Firewall rules, Load Balancing, Network Security, Container Networking, API Access (private / public) …&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Hadoop -&lt;/em&gt;&lt;/strong&gt; The exam covers ecosystems and third-party services like &lt;em&gt;Hadoop, Spark, HDFS, Hive, Pig&lt;/em&gt; … use cases, architecture, functionality, integration and migration to GCP.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Apache Kafka&lt;/em&gt;&lt;/strong&gt; - Alternative service to &lt;em&gt;Pub / Sub&lt;/em&gt;, so it is advisable to study it at a high level: use cases, operational characteristics, configuration, migration and integration with GCP - plugins, connectors.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;IoT&lt;/em&gt;&lt;/strong&gt; - It can appear in various questions at the architectural level: use cases, reference architecture and integration with other services.&lt;em&gt; IoT core, Edge Computing.&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Datastore / Firebase&lt;/em&gt;&lt;/strong&gt; - Document database. Use cases, configuration, performance, entity model, keys and index optimization, transactions, backups, export / import and integration with other services. It doesn't carry as much weight as the other data repositories.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Cloud Memorystore / Redis&lt;/em&gt;&lt;/strong&gt; - Structured data cache repository. Use cases, architecture, configuration, performance, reliability and optimization: clusters, backups, export / import and integration with other services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Cloud Dataprep &lt;/em&gt;&lt;/strong&gt;- Use cases, console and general operation, supported formats, and Dataflow integration.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Cloud Stackdriver&lt;/em&gt;&lt;/strong&gt; - Use cases, monitoring and logging, both at the system and application level: &lt;em&gt;Cloud Stackdriver Logging, Cloud Stackdriver Monitoring, Stackdriver Agent and plugins.&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;&lt;strong&gt;Other services&lt;/strong&gt;&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;MongoDB, Cassandra&lt;/em&gt;&lt;/strong&gt; - &lt;em&gt;NoSQL&lt;/em&gt; databases that can appear in different scenarios. Use cases, architecture and integration with other services.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Cloud Composer &lt;/em&gt;&lt;/strong&gt;- Use cases, general operation and web console, configuration of diagram types, supported formats, import / export, integration with other services, connectors.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Cloud Data Studio&lt;/em&gt;&lt;/strong&gt; - Use cases, configuration, networking, security, general operation and environment, and integration with other services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Cloud Datalab -&lt;/em&gt;&lt;/strong&gt; Use cases, general operation and web console, types of diagrams, supported formats, import / export and integration with other services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Kubernetes  Engine - &lt;/em&gt;&lt;/strong&gt;Use cases, architecture, clustering and integration with other services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Kubeflow - &lt;/em&gt;&lt;/strong&gt;Use cases, architecture, environment configuration, Kubernetes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Apache Airflow&lt;/em&gt;&lt;/strong&gt;  - Use cases, architecture and general operation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Cloud Functions &lt;/em&gt;&lt;/strong&gt; -&lt;strong&gt;&lt;em&gt; &lt;/em&gt;&lt;/strong&gt;Use cases, architecture, configuration and integration with other services - such as &lt;em&gt;Cloud Storage and Pub / Sub, in Push / Pull mode.&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Compute Engine &lt;/em&gt;&lt;/strong&gt;- Use cases, architecture, configuration, high availability, reliability and integration with other services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;App Engine -&lt;/em&gt;&lt;/strong&gt; Use cases, architecture and integration with other services.&lt;/li&gt;
&lt;/ul&gt;



&lt;h2&gt;Bibliography &amp;amp; essential resources&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Google provides&lt;/strong&gt; a large number of &lt;strong&gt;resources for the preparation of this certification&lt;/strong&gt;, in the form of courses, an official guide, documentation and mock exams. These resources are &lt;strong&gt;highly recommended&lt;/strong&gt; and, in some cases, I would say essential.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://www.coursera.org/learn/preparing-cloud-professional-data-engineer-exam" rel="noopener noreferrer"&gt;Certification Preparation Course&lt;/a&gt;, contained in the &lt;a href="https://www.coursera.org/specializations/gcp-data-machine-learning" rel="noopener noreferrer"&gt;Data Engineering Specialized Program&lt;/a&gt;, includes an extra exam, lots of additional tips and materials and labs - using the external Qwik Labs tool.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;&lt;a href="https://cloud.google.com/certification/guides/data-engineer" rel="noopener noreferrer"&gt;GCP Certification Guide&lt;/a&gt;&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;a href="//cloud.google.com/docs#section-9"&gt;Google Doc&lt;/a&gt;&lt;/em&gt;&lt;a href="//cloud.google.com/docs#section-9"&gt;&lt;em&gt;s&lt;/em&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&lt;a href="https://cloud.google.com/certification/practice-exam/data-engineer" rel="noopener noreferrer"&gt;Practice exam&lt;/a&gt;&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;a href="https://www.coursera.org/learn/preparing-cloud-professional-data-engineer-exam" rel="noopener noreferrer"&gt;Readiness course&lt;/a&gt; &lt;/em&gt;– &lt;em&gt;highly recommended, includes an additional practice test.&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&lt;a href="https://www.coursera.org/specializations/gcp-data-machine-learning" rel="noopener noreferrer"&gt;Data Engineering, Big Data and ML on Google Cloud&lt;/a&gt;&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cloudblog.withgoogle.com/products/application-development/13-popular-application-architectures-for-google-cloud/amp/" rel="noopener noreferrer"&gt;&lt;em&gt;Thirteen GCP Reference Architectures&lt;/em&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmetanube.files.wordpress.com%2F2020%2F08%2F20200827_090027.jpg%3Fw%3D1024" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmetanube.files.wordpress.com%2F2020%2F08%2F20200827_090027.jpg%3Fw%3D1024" alt=""&gt;&lt;/a&gt;&lt;br&gt;Bibliography (selection) that I have used for the preparation of the certification&lt;/p&gt;

&lt;p&gt;As I have previously indicated, &lt;strong&gt;I find the Google courses on Coursera to be excellent&lt;/strong&gt;, as they combine a series of &lt;strong&gt;short videos, reading material, labs, and test questions&lt;/strong&gt;, creating &lt;strong&gt;a very dynamic experience&lt;/strong&gt;. In any case, they &lt;strong&gt;should only be considered a starting point&lt;/strong&gt;: you then need to go deeper into each of the domains - according to your experience - using, for instance, &lt;em&gt;the excellent GCP documentation&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;But you should not limit yourself to online courses. I can't hide the fact that I love books in general, and IT books in particular. In fact, I have a huge collection of books dating back to the 80s, which at some point I will donate to a local Cervantina bookstore.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Books provide&lt;/strong&gt; a deeper, more engaging &lt;strong&gt;experience than videos&lt;/strong&gt;, which can get a bit monotonous if they are too long - as well as being a much more passive experience, like watching TV. The ideal is a &lt;strong&gt;combination of audiovisual and written media&lt;/strong&gt;, &lt;strong&gt;creating your own learning path&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;Laboratories&lt;/h3&gt;

&lt;ul id="block-0cc6ede6-3159-4db7-82fc-d6154af6ef29"&gt;
&lt;li&gt;&lt;a href="https://www.qwiklabs.com/quests/43" rel="noopener noreferrer"&gt;&lt;em&gt;Data Science on GCP Quest&lt;/em&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.qwiklabs.com/quests/25" rel="noopener noreferrer"&gt;&lt;em&gt;Data Engineering Quest&lt;/em&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmetanube.files.wordpress.com%2F2020%2F08%2F6_5xbiker.0999056719991134.max-1100x1100-1.png%3Fw%3D1024" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmetanube.files.wordpress.com%2F2020%2F08%2F6_5xbiker.0999056719991134.max-1100x1100-1.png%3Fw%3D1024" alt=""&gt;&lt;/a&gt;Image #4  - Data Lake based upon Cloud Storage - owned by GCP&lt;/p&gt;

&lt;p&gt;Part of the job as a &lt;strong&gt;&lt;em&gt;Data Engineer&lt;/em&gt;&lt;/strong&gt; consists of creating, integrating, deploying and maintaining &lt;strong&gt;data pipelines&lt;/strong&gt;, both in batch and streaming mode.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://www.qwiklabs.com/quests/25" rel="noopener noreferrer"&gt;Data Engineering Quest&lt;/a&gt; contains several labs that introduce &lt;strong&gt;the creation of different data transformation,  IoT, and Machine Learning pipelines&lt;/strong&gt;, so I find them excellent exercises - and not just for certification.&lt;/p&gt;

&lt;h2&gt;Is it worth it?&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The certification level is advanced&lt;/strong&gt; and, in general, it should not be the first cloud certification you obtain. It &lt;strong&gt;covers a large amount of material and many domains&lt;/strong&gt;, so tackling it without a certain level of prior knowledge &lt;strong&gt;can be quite a complex task.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If we compare it with the counterpart certification on the AWS platform, it covers almost twice as much material, mainly due to the &lt;strong&gt;inclusion of questions on the Machine Learning / Data Science domain&lt;/strong&gt; - which AWS removed and moved into a certification of its own. It is, therefore, like taking two certifications in one.&lt;/p&gt;

&lt;p&gt;Is it worth it? Of course, but not as a first certification - depending on your prior experience.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Certifications&lt;/strong&gt; are a good way not only to &lt;strong&gt;validate knowledge externally&lt;/strong&gt;, but also to &lt;strong&gt;collect up-to-date information,&lt;/strong&gt; &lt;strong&gt;validate good practices and consolidate knowledge&lt;/strong&gt; with (almost) real practical cases.&lt;/p&gt;

&lt;p&gt;Good luck to you all!&lt;/p&gt;

</description>
      <category>googlecloud</category>
      <category>career</category>
      <category>datascience</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>An AWS Summer: EFS &amp; Lambda + Serverless Framework</title>
      <dc:creator>Adolfo Estevez</dc:creator>
      <pubDate>Mon, 05 Oct 2020 09:56:18 +0000</pubDate>
      <link>https://forem.com/aestevezjimenez/an-aws-summer-efs-lambda-serverless-framework-489</link>
      <guid>https://forem.com/aestevezjimenez/an-aws-summer-efs-lambda-serverless-framework-489</guid>
      <description>&lt;p&gt;The autumn equinox has just passed, which is a perfect moment to look back, and review some of the features released in this last summer by AWS - in no particular order, just because I think they are cool - and useful :)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Mb237jWt--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://upload.wikimedia.org/wikipedia/commons/8/8b/North_season.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Mb237jWt--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://upload.wikimedia.org/wikipedia/commons/8/8b/North_season.jpg" alt="" width="771" height="415"&gt;&lt;/a&gt;&lt;/p&gt;



&lt;h2&gt;Serverless challenges &lt;/h2&gt;

&lt;p&gt;If you've been developing serverless applications for a while, you have surely faced a few challenges, apart from the &lt;em&gt;old cold start thing&lt;/em&gt; - which has been solved to a great extent by the &lt;em&gt;Provisioned Concurrency&lt;/em&gt; feature. &lt;/p&gt;

&lt;p&gt;For instance, let's say you need to load large rule files consumed by a Lambda function that implements a rules engine, or you need to keep data files produced dynamically by the function between invocations. Lambda provides some local ephemeral space - 512 MB in /tmp - that you may use, but it's small and ephemeral, so it's not useful for those kinds of scenarios.&lt;/p&gt;

&lt;p&gt;Other solutions come to mind - storing in databases: &lt;em&gt;RDS, DynamoDB, S3&lt;/em&gt; ... but they come at a high price in development effort, performance and cost. What would happen if we had peaks of several hundred - or thousand - requests per second, loading big files at startup and writing files to a data store concurrently? &lt;/p&gt;

&lt;p&gt;Well, at the very least, we could take a big performance hit, depending on the size of the files, the latency of retrieving them at startup plus the cold start of the Lambdas - &lt;em&gt;enter Provisioned Concurrency&lt;/em&gt; - plus the latency of storing the intermediate files in the data stores - storing and retrieving from S3 is not the same as from DynamoDB.&lt;/p&gt;

&lt;p&gt;So, no alternative? Well, we are in luck, as AWS released &lt;a href="https://aws.amazon.com/es/efs/"&gt;EFS&lt;/a&gt; support for Lambda in June!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ZMdN3Zdc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/09/image.png%3Fw%3D1024" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ZMdN3Zdc--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/09/image.png%3Fw%3D1024" alt="" class="wp-image-537" width="743" height="216"&gt;&lt;/a&gt;Image property of AWS&lt;/p&gt;

&lt;p&gt;Amazon &lt;a href="https://aws.amazon.com/es/efs/"&gt;EFS&lt;/a&gt; is widely known, so I'm not going to delve deeply into the service; suffice it to say that &lt;em&gt;Amazon&lt;/em&gt; &lt;em&gt;Elastic File System&lt;/em&gt; provides an NFS file system that scales on demand, with high throughput and low latency. It's very useful when shared storage and parallel access from multiple services are needed.&lt;/p&gt;
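&lt;p&gt;With an EFS file system mounted, a function can keep state between invocations using ordinary file I/O. A minimal sketch - the mount path, file name and the mount_path parameter are illustrative additions for local testing, not part of the Lambda API:&lt;/p&gt;

```python
import json
import os

# Sketch of a Lambda handler that keeps a counter on an EFS mount between
# invocations. In a real function, an EFS access point is attached under
# /mnt/... via the function's file-system configuration; the mount_path
# parameter here only exists so the sketch can be exercised locally.

def handler(event, context=None, mount_path="/mnt/state"):
    counter_file = os.path.join(mount_path, "invocations.json")
    count = 0
    if os.path.exists(counter_file):
        with open(counter_file) as f:
            count = json.load(f)["count"]
    count += 1
    with open(counter_file, "w") as f:
        json.dump({"count": count}, f)  # survives warm AND cold starts
    return {"count": count}
```

Unlike /tmp, the file lives on shared, durable storage, so every concurrent execution environment sees the same data.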



&lt;h3&gt;Configuration &amp;amp; Considerations&lt;/h3&gt;

&lt;p&gt;"With power comes responsibility", or in our case with powerful features come some configuration constraints. EFS runs in different subnets within a VPC, which means that our Lambda functions have to run within a VPC as well. That comes with a price: IP directioning, possible performance hit, loss of connection to AWS global services, therefore a &lt;em&gt;NAT Gateway or Private Links / Gateway&lt;/em&gt; might need to be used, depending on the use case.&lt;/p&gt;

&lt;p&gt;That constraint was vastly improved last year when &lt;a href="https://aws.amazon.com/blogs/compute/announcing-improved-vpc-networking-for-aws-lambda-functions/"&gt;Hyperplane ENI for Lambda&lt;/a&gt; was released, so that just a few ENIs - and therefore a few IPs - are enough to handle a large number of Lambda invocations, decoupling function scaling from ENI provisioning.&lt;/p&gt;

&lt;h4&gt;Configuration - Serverless Framework&lt;/h4&gt;

&lt;p&gt;The configuration of a Lambda function running within a VPC can be fairly simple - if it only needs to access VPC resources - as shown in the image below, under the vpc key:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--v2QwQleM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/09/image-5.png%3Fw%3D938" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--v2QwQleM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/09/image-5.png%3Fw%3D938" alt="" class="wp-image-548" width="569" height="381"&gt;&lt;/a&gt;Serverless framework YAML - Image  MNube.org&lt;/p&gt;

&lt;p&gt;A security group is needed for the Lambda function, along with the IDs of the subnet(s) where the ENI(s) will be placed, and permissions to &lt;em&gt;create, delete, and describe network interfaces&lt;/em&gt;.&lt;/p&gt;
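&lt;p&gt;As a sketch - all IDs below are hypothetical placeholders, not taken from the project - that part of the &lt;em&gt;serverless.yml&lt;/em&gt; could look like this:&lt;/p&gt;

```yaml
# serverless.yml (fragment) - security group and subnet IDs are placeholders
provider:
  name: aws
  runtime: python3.8
  vpc:
    securityGroupIds:
      - sg-0123456789abcdef0          # security group for the Lambda function
    subnetIds:
      - subnet-0123456789abcdef0      # subnets where the ENIs will be placed
      - subnet-0fedcba9876543210
  iamRoleStatements:                  # permissions to manage network interfaces
    - Effect: Allow
      Action:
        - ec2:CreateNetworkInterface
        - ec2:DeleteNetworkInterface
        - ec2:DescribeNetworkInterfaces
      Resource: "*"
```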

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--YNcIHRn4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/10/vpc-1.png%3Fw%3D1024" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--YNcIHRn4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/10/vpc-1.png%3Fw%3D1024" alt="" class="wp-image-590" width="1008" height="288"&gt;&lt;/a&gt;VPC Lambda - Image MNube.org&lt;/p&gt;



&lt;p&gt;The Lambda function is running within our VPC now, &lt;em&gt;with an ENI placed in each selected subnet&lt;/em&gt;, but in order to access the EFS instance a few permissions will need to be granted:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--5_bQZQpA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/09/image-9.png%3Fw%3D868" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--5_bQZQpA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/09/image-9.png%3Fw%3D868" alt="" class="wp-image-558" width="538" height="357"&gt;&lt;/a&gt;Role permissións EFS, Lambda - Image MNube.org&lt;/p&gt;

&lt;p&gt;Now the EFS file system can be created within the VPC. To do that, the console, CloudFormation, Serverless, the AWS CLI, the AWS SDK, etc. could be used.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--MmtZBAq2--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/10/vpc-2.png%3Fw%3D1024" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--MmtZBAq2--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/10/vpc-2.png%3Fw%3D1024" alt="" class="wp-image-592"&gt;&lt;/a&gt;EFS instance - Image MNube.org&lt;/p&gt;

&lt;p&gt;After creating the instance, an &lt;em&gt;access point&lt;/em&gt; needs to be provided to allow applications access. This is a new resource type: &lt;em&gt;"AWS::EFS::AccessPoint"&lt;/em&gt;. It can be created from the console or through a CloudFormation file - we will need to supply the EFS ID: ${self.provider}.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--9DhtPnhq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/09/image-4.png%3Fw%3D992" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--9DhtPnhq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/09/image-4.png%3Fw%3D992" alt="" class="wp-image-545" width="571" height="353"&gt;&lt;/a&gt;Serverless framework YAML - Image MNube.org&lt;/p&gt;
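&lt;p&gt;A minimal sketch of that resource in the &lt;em&gt;resources&lt;/em&gt; section - the file system ID, POSIX user and root directory here are illustrative assumptions:&lt;/p&gt;

```yaml
# serverless.yml resources section (fragment) - IDs and paths are placeholders
resources:
  Resources:
    EfsAccessPoint:
      Type: AWS::EFS::AccessPoint
      Properties:
        FileSystemId: fs-0123456789abcdef0   # the EFS instance created above
        PosixUser:
          Uid: "1000"
          Gid: "1000"
        RootDirectory:
          Path: /lambda                      # directory exposed through the access point
          CreationInfo:
            OwnerUid: "1000"
            OwnerGid: "1000"
            Permissions: "755"
```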

&lt;p&gt;Finally, we link the file system to the Lambda function, providing the ARN of the EFS file system, the ARN of the access point, and the local mount path - as shown in the image below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--XJ3Tu9ok--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/10/vpc-3-1.png%3Fw%3D1024" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--XJ3Tu9ok--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/10/vpc-3-1.png%3Fw%3D1024" alt="" class="wp-image-595"&gt;&lt;/a&gt;Image MNube.org&lt;/p&gt;
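&lt;p&gt;In &lt;em&gt;serverless.yml&lt;/em&gt; terms, the link might be expressed as follows - the function name, account ID and access point ARN are placeholders:&lt;/p&gt;

```yaml
# serverless.yml (fragment) - ARN and account ID are placeholders
functions:
  listRules:
    handler: handler.handler
    fileSystemConfig:
      localMountPath: /mnt/efs    # path the function code will see
      arn: arn:aws:elasticfilesystem:eu-west-1:123456789012:access-point/fsap-0123456789abcdef0
```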

&lt;p&gt;The EFS instance is ready to be accessed by the Lambda function :)&lt;/p&gt;



&lt;h3&gt;Solution&lt;/h3&gt;

&lt;p&gt;I have used the &lt;em&gt;&lt;strong&gt;&lt;a href="https://www.serverless.com/open-source/"&gt;Serverless framework&lt;/a&gt;&lt;/strong&gt;&lt;/em&gt; to produce the solution - although &lt;strong&gt;&lt;em&gt;AWS SAM with Cloud9&lt;/em&gt;&lt;/strong&gt;, the official alternative, could have been used instead. I have quite a lot of experience with Serverless, having introduced it to a few companies - &lt;em&gt;including &lt;a href="https://www.everis.com/global/en"&gt;Everis&lt;/a&gt;&lt;/em&gt; - with great success.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--bsfaTLaT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/10/aws-lambda-efs-support.png%3Fw%3D1024" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--bsfaTLaT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/10/aws-lambda-efs-support.png%3Fw%3D1024" alt="" class="wp-image-605"&gt;&lt;/a&gt;Architecture - MNube.org&lt;/p&gt;

&lt;p&gt;Let's create - or transfer -  a rules file that can be accessed from the Lambda function :)&lt;/p&gt;

&lt;p&gt;Different services could be used to transfer the files, like &lt;em&gt;AWS DataSync&lt;/em&gt;, an &lt;em&gt;EC2&lt;/em&gt; instance, or even creating the files from code. Files transferred from an EC2 instance are accessible from the Lambda functions, so we'll use this method.&lt;/p&gt;

&lt;p&gt;After the EC2 instance has been created - &lt;em&gt;a t2.micro is enough&lt;/em&gt; - in one of the subnets of the VPC that has access to the &lt;em&gt;EFS ENIs&lt;/em&gt;, a directory will be needed - &lt;strong&gt;/&lt;em&gt;efs&lt;/em&gt;&lt;/strong&gt;. That directory has no link to the EFS instance yet, so we'll need to mount the file system on it.&lt;/p&gt;

&lt;p&gt;One way to do it is using the &lt;em&gt;EFS tools&lt;/em&gt;:&lt;/p&gt;

&lt;pre class="wp-block-code"&gt;&lt;code&gt;                     sudo yum install -y amazon-efs-utils&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;An &lt;em&gt;access point&lt;/em&gt; was created previously that we can use to mount the directory. It's easy to get the command line needed from the web console. Just go to the &lt;em&gt;Amazon EFS &amp;gt; Access Point &amp;gt; id&lt;/em&gt; link, and press the &lt;em&gt;&lt;strong&gt;Attach&lt;/strong&gt;&lt;/em&gt; button:&lt;/p&gt;
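&lt;p&gt;The command the console produces follows this general shape - the file system and access point IDs here are placeholders, not real values:&lt;/p&gt;

```shell
# Mount the EFS file system on /efs through the access point, using TLS
sudo mkdir -p /efs
sudo mount -t efs -o tls,accesspoint=fsap-0123456789abcdef0 fs-0123456789abcdef0:/ /efs
```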

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--0VMlB64C--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/10/vpc-4.png%3Fw%3D1024" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--0VMlB64C--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/10/vpc-4.png%3Fw%3D1024" alt="" class="wp-image-597"&gt;&lt;/a&gt;EFS Mount - Image MNube.org&lt;/p&gt;

&lt;p&gt;After mounting the directory - &lt;em&gt;in green&lt;/em&gt; - the files can be transferred to the &lt;em&gt;&lt;strong&gt;/efs&lt;/strong&gt;&lt;/em&gt; directory:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--afIXZ9_6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/10/vpc-5.png%3Fw%3D1024" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--afIXZ9_6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/10/vpc-5.png%3Fw%3D1024" alt="" class="wp-image-599"&gt;&lt;/a&gt;Mounting and creating files - Image MNube.org&lt;/p&gt;

&lt;p&gt;At this point, the Lambda function should be able to access the directory. I have coded a minimal Lambda function that lists the files contained in the directory:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--ltVc4lmW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/10/image-7.png%3Fw%3D970" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--ltVc4lmW--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/10/image-7.png%3Fw%3D970" alt="" class="wp-image-587" width="536" height="387"&gt;&lt;/a&gt;Lambda function - Image MNube.org&lt;/p&gt;
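&lt;p&gt;A minimal handler along those lines - the mount path and function layout are assumptions of mine and must match the &lt;em&gt;localMountPath&lt;/em&gt; configured for the function:&lt;/p&gt;

```python
import json
import os

# Hypothetical mount path - must match the localMountPath configured for the function
EFS_PATH = "/mnt/efs"

def handler(event, context, path=EFS_PATH):
    # List the files stored in the mounted EFS directory
    files = sorted(os.listdir(path))
    return {
        "statusCode": 200,
        "body": json.dumps({"files": files}),
    }
```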

&lt;p&gt;The solution is now ready to be deployed. Keep in mind that I have only shown parts of the &lt;em&gt;serverless.yml&lt;/em&gt;, equivalent to the CloudFormation file you might use to provision the infrastructure - I will leave that to you as an exercise.&lt;/p&gt;

&lt;pre class="wp-block-syntaxhighlighter-code"&gt;                serverless deploy --stage dev --region eu-west-1&lt;/pre&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--LcC4sdTR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/10/vpc-6.png%3Fw%3D1024" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--LcC4sdTR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/10/vpc-6.png%3Fw%3D1024" alt="" class="wp-image-600"&gt;&lt;/a&gt;Serverless Stack - Image MNube.org&lt;/p&gt;

&lt;p&gt;A URL is provided by the framework, as I created an API Gateway endpoint that invokes the Lambda function:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--2o7R_Enk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/10/image-6.png%3Fw%3D1024" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--2o7R_Enk--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/10/image-6.png%3Fw%3D1024" alt="" class="wp-image-586" width="1100" height="261"&gt;&lt;/a&gt;Cloudwatch Logs - Image from MNUBE.org&lt;/p&gt;

&lt;p&gt;I have captured the request trace from the CloudWatch Logs, where we can see the files in /efs - &lt;em&gt;&lt;strong&gt;test.txt and rules.txt&lt;/strong&gt;&lt;/em&gt; - and the low latency of the request.&lt;/p&gt;



&lt;h3&gt;Other Use Cases&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Loading big libraries that Lambda layers can't handle.&lt;/li&gt;
&lt;li&gt;Files that are updated regularly.&lt;/li&gt;
&lt;li&gt;Files that need locks for concurrent access. &lt;/li&gt;
&lt;li&gt;Access to big files - zip / unzip.&lt;/li&gt;
&lt;li&gt;Using different computing architectures - &lt;em&gt;EC2, ECS&lt;/em&gt; -  to process the same files.&lt;/li&gt;
&lt;/ul&gt;



</description>
      <category>aws</category>
      <category>serverless</category>
      <category>tutorial</category>
      <category>productivity</category>
    </item>
    <item>
      <title>AWS Data Analytics Certification, is it worth it?</title>
      <dc:creator>Adolfo Estevez</dc:creator>
      <pubDate>Sun, 17 May 2020 08:13:48 +0000</pubDate>
      <link>https://forem.com/aestevezjimenez/aws-data-analytics-certification-is-it-worth-it-2887</link>
      <guid>https://forem.com/aestevezjimenez/aws-data-analytics-certification-is-it-worth-it-2887</guid>
      <description>&lt;p&gt;On April 13, the journey of the new AWS Data Analytics Specialty certification officially began - prior to the beta phase in December 2019 / January 2020. It coincided in time with the AWS Database Specialty Beta, which forced me to choose between the two. Finally, I decided on taking the Databases Specialty, as I had recently tested from AWS Big Data.&lt;/p&gt;

&lt;p&gt;The “Beta exam” experience is very different from the “standard” one: &lt;strong&gt;85 questions and 4 hours long&lt;/strong&gt; - that is, &lt;strong&gt;20 more questions and one more hour&lt;/strong&gt; - a really intense experience. I recommend taking a 5-minute break - they are allowed in the test centers - since after the third hour it is very difficult to stay focused.&lt;/p&gt;

&lt;p&gt;The certification is the new version of the AWS Big Data Specialty, an exam that will be withdrawn in June 2020. I will not go into much depth on the differences; suffice it to say that &lt;strong&gt;the Machine Learning domain has been eliminated&lt;/strong&gt;, while the remaining domains have been expanded and updated in depth. But beware: Machine Learning and IoT still appear integrated into the other domains, so you need to know them at an architectural level, at the very least.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--mclQ3trp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/05/kinesisanalytics_lp_iot.2d8e10d5cf377dad4453aedb6ccbd8f2efee612a.png%3Fw%3D760" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--mclQ3trp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/05/kinesisanalytics_lp_iot.2d8e10d5cf377dad4453aedb6ccbd8f2efee612a.png%3Fw%3D760" alt="" class="wp-image-103" width="760" height="266"&gt;&lt;/a&gt;Image from aws.amazon.com&lt;/p&gt;

&lt;h2&gt;Prerequisites and recommendations&lt;/h2&gt;

&lt;p&gt;I will not repeat the information that is already available &lt;a href="https://aws.amazon.com/training/path-data-analytics/"&gt;on the AWS website&lt;/a&gt;; instead, I am going to give my personal recommendations and observations, as I consider the Learning Path that AWS suggests to be somewhat light for the current level of the exam.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;AWS experience at the architectural level&lt;/strong&gt;. &lt;/em&gt;The exam is largely focused on advanced solution architecture - the 5 pillars - and to a lesser extent on development, which is present mainly in services such as Kinesis and Glue. I recommend holding the &lt;em&gt;AWS Architect Solutions Pro&lt;/em&gt; certification, or alternatively the &lt;em&gt;AWS Architect Associate + AWS Security Specialty&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Advanced AWS security experience.&lt;/em&gt;&lt;/strong&gt; It is a full domain of the exam, but it also appears - &lt;em&gt;cross-domain&lt;/em&gt; - in many questions. If you hold the &lt;em&gt;AWS Architect Solutions Pro&lt;/em&gt;, general security knowledge may be sufficient - without the service-by-service depth of the specific certification. Otherwise, the &lt;em&gt;AWS Security Specialty&lt;/em&gt; is a good option, or equivalent knowledge of certain services - which I will indicate later on.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Analytics knowledge&lt;/em&gt;&lt;/strong&gt;. If you lack it, I'd recommend studying books such as “&lt;em&gt;Data Analytics with Hadoop” - O'Reilly 2016&lt;/em&gt;, or taking the courses indicated in the &lt;em&gt;AWS Learning Path&lt;/em&gt;. Likewise, do labs or pet projects to gain some practical experience.&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;&lt;strong&gt;Knowledge of the Hadoop ecosystem&lt;/strong&gt;&lt;/em&gt;. Connected to the previous point. High-level, architectural knowledge of the ecosystem is a must: Hive, Presto, Pig, …&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Knowledge of Machine Learning and IoT - AWS ecosystem&lt;/em&gt;&lt;/strong&gt;. &lt;em&gt;SageMaker and core IoT&lt;/em&gt; services at the architectural level.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--fKkc00Z6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/05/learning-paths_data-analytics.aff5d6194bc5842c971873f069d13b615302f893.png%3Fw%3D760" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--fKkc00Z6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/05/learning-paths_data-analytics.aff5d6194bc5842c971873f069d13b615302f893.png%3Fw%3D760" alt="" class="wp-image-48"&gt;&lt;/a&gt;&lt;br&gt;Image from aws.amazon.com&lt;/p&gt;

&lt;h2&gt;Exam&lt;/h2&gt;

&lt;p&gt;The questions follow the style of other certifications such as the &lt;em&gt;AWS Pro Architect or the Security or Database Specialties&lt;/em&gt;. They are all “scenario based”, and most of them are long and complex. You are not going to find many simple questions. Certainly, between 5% and 10% of “easy” questions appeared, but all in a “&lt;em&gt;scenario&lt;/em&gt;” format.&lt;/p&gt;

&lt;p&gt;Let's look at an example taken from the &lt;a href="https://d1.awsstatic.com/training-and-certification/docs-data-analytics-specialty/AWS-Certified-Data-Analytics-Specialty_Sample-Questions.pdf"&gt;AWS sample questions:&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--83Dm_ymM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/05/captura-de-pantalla-2020-05-09-a-las-11.46.07.png%3Fw%3D1024" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--83Dm_ymM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/05/captura-de-pantalla-2020-05-09-a-las-11.46.07.png%3Fw%3D1024" alt="" class="wp-image-52"&gt;&lt;/a&gt;Image from aws.amazon.com&lt;br&gt;&lt;/p&gt;

&lt;p&gt;I'd classify this question as &lt;em&gt;&lt;strong&gt;"intermediate"&lt;/strong&gt;&lt;/em&gt; in difficulty. If you have taken the &lt;em&gt;Architect PRO&lt;/em&gt;, or a specialty such as &lt;em&gt;Security or Big Data&lt;/em&gt;, you will know what I am talking about. Certainly, the level of the questions is much higher and deeper than in the previous version of the exam.&lt;/p&gt;

&lt;p&gt;I'd recommend taking the new specialty directly, as the old one contains questions about already deprecated services - or outdated information.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;&lt;em&gt;Services to know in depth&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--asFyb_9d--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/05/product-page-diagram_kinesis-data-analytics-real-time-log-analytics.d577a64060cc1e594c3c5c4a66feb1cc6e26a397.png%3Fw%3D760" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--asFyb_9d--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://metanube.files.wordpress.com/2020/05/product-page-diagram_kinesis-data-analytics-real-time-log-analytics.d577a64060cc1e594c3c5c4a66feb1cc6e26a397.png%3Fw%3D760" alt="" class="wp-image-110"&gt;&lt;/a&gt;Image from aws.amazon.com&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;AWS Kinesis&lt;/strong&gt;&lt;/em&gt; - in its three modalities,&lt;em&gt; Data Streams, Firehose and Analytics&lt;/em&gt;. Architecture, dimensioning, configuration, integration with other services, security, troubleshooting, metrics, optimization and development. Questions of various levels, some of them very complex and of great depth.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;AWS Glue&lt;/em&gt;&lt;/strong&gt; - in depth for ETL and data discovery - an integral part of the exam. Questions of different levels - I did not find them to be the most difficult.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;AWS Redshift - &lt;/em&gt;&lt;/strong&gt;architecture, design, dimensioning, integration, security, ETL, backups … a large number of questions, some of them very complex.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;AWS EMR / Spark&lt;/em&gt;&lt;/strong&gt; - architecture, sizing, configuration, performance, integration with other services, security, integration with the Hadoop ecosystem - very important, though not as much as the previous three services. Very complex questions that require advanced, transversal knowledge of all domains and the Hadoop ecosystem: &lt;em&gt;Hive, HBase, Presto, Sqoop, Pig&lt;/em&gt; …&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;Security&lt;/strong&gt;&lt;/em&gt; - KMS encryption, AWS CloudHSM, federation, Active Directory, IAM, policies, roles, etc. … in general and for each service in particular. Questions that cut across the other domains, of high difficulty.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Very important services to consider&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;AWS S3&lt;/em&gt;&lt;/strong&gt; - the core storage service (storage, security, rules) and new features like &lt;em&gt;AWS S3 Select&lt;/em&gt;. It appears consistently across all certifications, which is why I'd assume it's known in depth except for the new features.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;AWS Athena&lt;/em&gt;&lt;/strong&gt; - architecture, configuration, integration, performance, use cases. It appears consistently and as an alternative to other services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;AWS Managed Kafka&lt;/em&gt;&lt;/strong&gt; - alternative to Kinesis, architecture, configuration, dimensioning, performance, integration, use cases.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;AWS Quicksight&lt;/em&gt;&lt;/strong&gt; - subscription formats, service features, different ways of viewing, use cases. Alternative to other services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;AWS Elasticsearch and Kibana (ELK)&lt;/em&gt;&lt;/strong&gt; - architecture, configuration, dimensioning, performance, integration, use cases. An alternative to other services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;AWS Lambda&lt;/em&gt;&lt;/strong&gt; - architecture, integration, use cases.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;AWS StepFunctions&lt;/em&gt;&lt;/strong&gt; - architecture, integration, use cases.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;AWS DMS&lt;/em&gt;&lt;/strong&gt;  - architecture, integration, use cases.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;AWS DataPipeline&lt;/em&gt;&lt;/strong&gt; - architecture, integration, use cases.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Ga8hecYp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://metanube.files.wordpress.com/2020/05/near_real_time_streaming_1.gif%3Fw%3D657" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Ga8hecYp--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_880/https://metanube.files.wordpress.com/2020/05/near_real_time_streaming_1.gif%3Fw%3D657" alt="" class="wp-image-115"&gt;&lt;/a&gt;Image from aws.amazon.com&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Other services&lt;/em&gt; &lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;AWS Networking&lt;/em&gt;&lt;/strong&gt; - basic network architectures and knowledge: VPC, security groups, Direct Connect, VPN, Regions, Zones … network configuration of each particular service.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;AWS DynamoDB, ElasticCache&lt;/em&gt;&lt;/strong&gt;  - architecture, integration, use case knowledge. These services, which appeared very prominently in the previous version of the exam, have much less weight in the current one.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;AWS CloudWatch, Events, Log&lt;/em&gt;&lt;/strong&gt; - architecture, configuration, integration, use case knowledge.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;AWS RDS and Aurora&lt;/em&gt;&lt;/strong&gt; - architecture, configuration, integration, use case knowledge.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;EC2, Autoscaling&lt;/em&gt;&lt;/strong&gt; - knowledge of architecture, integration, use cases.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;SQS, SNS&lt;/em&gt;&lt;/strong&gt; - knowledge of architecture, integration, use cases.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;AWS Cloudformation&lt;/em&gt;&lt;/strong&gt; - knowledge of architecture, use cases, devops.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;SageMaker and AWS IoT Core&lt;/em&gt;&lt;/strong&gt; - knowledge of architecture, integration, use cases.&lt;/li&gt;
&lt;/ul&gt;



&lt;h2&gt;Essential Resources&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/es/training/path-data-analytics/"&gt;AWS Certification Website.&lt;/a&gt; &lt;/li&gt;
&lt;li&gt;
&lt;a href="https://d1.awsstatic.com/training-and-certification/docs-data-analytics-specialty/AWS-Certified-Data-Analytics-Specialty_Sample-Questions.pdf"&gt;Example questions&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.aws.training/Details/eLearning?id=46612"&gt;Readiness Course&lt;/a&gt; - a must, packed with information and resources - including a 20 question test.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://aws.amazon.com/es/blogs/big-data/big-data-analytics-options-on-aws-updated-white-paper/"&gt;AWS Whitepapers&lt;/a&gt; - Big Data Analytics Options on AWS.&lt;/li&gt;
&lt;li&gt;AWS FAQs for every service - especially Kinesis, Glue, Redshift, EMR.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://aws.amazon.com/es/blogs/big-data/"&gt;AWS Big Data Blog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Practice Exam - a must, quite challenging and very representative of the actual exam.&lt;/li&gt;
&lt;/ul&gt;



&lt;h2&gt;Is it worth it, then?&lt;/h2&gt;

&lt;p&gt;Let's see :)&lt;/p&gt;

&lt;p&gt;&lt;em&gt;AWS Data Analytics Specialty&lt;/em&gt; is a complex and difficult certification; it is expensive (300 euros) and requires a very significant investment of time - even for someone with experience in analytics and AWS. Therefore, it is not a decision to be taken lightly.&lt;/p&gt;

&lt;p&gt;In my personal case, I found it very worthwhile, since I have been working on several projects of that kind - &lt;em&gt;fast data, IoT&lt;/em&gt; - on AWS in recent times - apart from it being the only certification I needed to complete the full set of thirteen - if &lt;em&gt;Big Data&lt;/em&gt; is included.&lt;/p&gt;

&lt;p&gt;Certifications are a good way not only to &lt;strong&gt;&lt;em&gt;validate knowledge externally&lt;/em&gt;&lt;/strong&gt;, but also to &lt;strong&gt;&lt;em&gt;collect updated information&lt;/em&gt;&lt;/strong&gt;, &lt;strong&gt;&lt;em&gt;validate good practices&lt;/em&gt;&lt;/strong&gt; and &lt;strong&gt;&lt;em&gt;consolidate knowledge&lt;/em&gt;&lt;/strong&gt; with real (or almost real) practical cases.&lt;/p&gt;

&lt;p&gt;For those interested in the analytics field, or with professional experience in it, who want to make the leap to the cloud, my recommendation is to first obtain an AWS Architect certification - preferably PRO - and optionally the Security Specialty or equivalent knowledge, at least in the services that I have mentioned above.&lt;/p&gt;

&lt;p&gt;For those who already have AWS certifications but no professional experience in the specific field, it may be a good way to start, but it will not be an easy or short path. I recommend doing labs or pet projects, in order to gain the practical experience necessary to pass the exam.&lt;/p&gt;

&lt;p&gt;So is it worth it? &lt;strong&gt;Absolutely&lt;/strong&gt;, but not as a first certification. It is especially aimed at people with advanced knowledge of AWS architecture who want to delve deeper into analytics in the cloud.&lt;/p&gt;

&lt;p&gt;Good luck to you all!&lt;/p&gt;




</description>
      <category>aws</category>
      <category>architecture</category>
      <category>career</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>The Reference Architecture Disappointment</title>
      <dc:creator>Adolfo Estevez</dc:creator>
      <pubDate>Tue, 12 May 2020 05:13:57 +0000</pubDate>
      <link>https://forem.com/aestevezjimenez/the-reference-architecture-disappointment-476m</link>
      <guid>https://forem.com/aestevezjimenez/the-reference-architecture-disappointment-476m</guid>
      <description>&lt;p&gt;There is a "phenomenon" that I have experienced through my career that I like to call the "Reference Architecture Disappointment". &lt;/p&gt;

&lt;p&gt;It's a similar effect to what some people experience when they go to the doctor's office with several symptoms, only to find out that they may have a common cold. No frenzy at the hospital, no frantic consultations, no House MD TV scenes. Just paracetamol, water and rest!&lt;/p&gt;

&lt;p&gt;So many years of medical school just to prescribe that? 
Well, yes. The MD was able to recognize a common cold among dozens of illnesses with the same set of symptoms, and prescribed the simplest and best treatment. The question is, would you be able to do it?&lt;/p&gt;

&lt;p&gt;The same thing happens when a Solutions Architect deals with a set of requirements. The "Architect" will select the best architecture that solves the business problem, in the simplest and most efficient manner possible. That means - sometimes - using the "Reference Architecture" for that particular problem, with the necessary changes.&lt;/p&gt;

&lt;p&gt;Those architectures emerge from practical experience and encompass patterns and best practices. Usually, reinventing the wheel is just not a good idea. 

Keep it simple and Rock On!&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>career</category>
      <category>productivity</category>
      <category>aws</category>
    </item>
  </channel>
</rss>
