<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Nidhi Thakore</title>
    <description>The latest articles on Forem by Nidhi Thakore (@nidhi_0105).</description>
    <link>https://forem.com/nidhi_0105</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3476921%2F8fa72548-42ab-47d2-a9e8-5ed8856dfb5e.png</url>
      <title>Forem: Nidhi Thakore</title>
      <link>https://forem.com/nidhi_0105</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/nidhi_0105"/>
    <language>en</language>
    <item>
      <title>How I Used AWS Glue and Athena for Serverless Data Analytics</title>
      <dc:creator>Nidhi Thakore</dc:creator>
      <pubDate>Mon, 06 Oct 2025 08:33:28 +0000</pubDate>
      <link>https://forem.com/nidhi_0105/how-i-used-aws-glue-and-athena-for-serverless-data-analytics-44p7</link>
      <guid>https://forem.com/nidhi_0105/how-i-used-aws-glue-and-athena-for-serverless-data-analytics-44p7</guid>
      <description>&lt;p&gt;As someone who loves building &lt;strong&gt;data pipelines&lt;/strong&gt;, I’ve always been fascinated by how serverless architectures simplify analytics.&lt;/p&gt;

&lt;p&gt;Recently, I worked on a project where I built a fully serverless data analytics pipeline using &lt;strong&gt;AWS Glue and Amazon Athena&lt;/strong&gt; — no servers, no EC2, no clusters, and no headaches.&lt;/p&gt;

&lt;p&gt;In this blog, I’ll take you through how I used these two &lt;strong&gt;AWS powerhouses&lt;/strong&gt; to go from raw S3 data → cleaned data → analytical insights — all without managing a single server.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8qxc6yq90houi0uoymec.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8qxc6yq90houi0uoymec.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Store Raw Data in Amazon S3&lt;/strong&gt;&lt;br&gt;
I started with raw e-commerce transaction data — product sales, customers, and timestamps — stored in S3.&lt;br&gt;
my-ecommerce-analytics/&lt;br&gt;
  raw/&lt;br&gt;
    sales01.csv&lt;br&gt;
    sales02.csv&lt;br&gt;
    customers.csv&lt;br&gt;
  transformed/&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2: Crawling Data Using AWS Glue&lt;/strong&gt;&lt;br&gt;
Next, I created an AWS Glue Crawler and pointed it at my S3 bucket.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;What’s amazing about Glue Crawlers is that they automatically detect schema and data types and create tables inside the AWS Glue Data Catalog.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;After running the crawler, I had two tables, sales_data and customers_data, in a database called ecommerce_analytics.&lt;/p&gt;

&lt;p&gt;You can also schedule crawlers to run daily or hourly — perfect for continuously updated S3 data.&lt;/p&gt;
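
&lt;p&gt;As a rough sketch of the crawler setup, the snippet below builds the arguments one might pass to the boto3 Glue &lt;code&gt;create_crawler&lt;/code&gt; call. The crawler naming scheme, the IAM role ARN, and the cron schedule are hypothetical placeholders and not values from my project; only the bucket, the raw/ prefix, and the database name come from the steps above.&lt;/p&gt;

```python
def build_crawler_request(bucket, prefix, database, role_arn, schedule=None):
    """Build keyword arguments for the Glue create_crawler API call."""
    request = {
        # Hypothetical naming scheme: database, prefix, then "crawler".
        "Name": database + "-" + prefix.strip("/") + "-crawler",
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": "s3://" + bucket + "/" + prefix}]},
    }
    if schedule is not None:
        # Glue schedules use cron syntax, e.g. "cron(0 2 * * ? *)" for 02:00 UTC daily.
        request["Schedule"] = schedule
    return request


def create_crawler(request):
    """Send the request to AWS. Needs boto3 and valid credentials; not called here."""
    import boto3  # imported lazily so the builder above works without AWS access

    glue = boto3.client("glue")
    glue.create_crawler(**request)


request = build_crawler_request(
    bucket="my-ecommerce-analytics",
    prefix="raw/",
    database="ecommerce_analytics",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueCrawlerRole",  # placeholder ARN
    schedule="cron(0 2 * * ? *)",  # assumed daily run
)
print(request["Targets"]["S3Targets"][0]["Path"])
```

&lt;p&gt;Keeping the request builder separate from the AWS call makes it easy to review the configuration before anything is created in your account.&lt;/p&gt;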

&lt;p&gt;&lt;strong&gt;Step 3: Exploring Data with Amazon Athena&lt;/strong&gt;&lt;br&gt;
With the Glue Catalog ready, I moved to Amazon Athena, which lets you run SQL queries directly on S3 data without loading it into a database. There I explored my sales data, aggregated revenue numbers, and filtered out invalid or duplicate records.&lt;/p&gt;

&lt;p&gt;You can write your own queries to perform these operations — it feels just like using a normal SQL database.&lt;/p&gt;
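
&lt;p&gt;To make that concrete, here is a minimal sketch of an exploratory revenue query and the arguments for the boto3 Athena &lt;code&gt;start_query_execution&lt;/code&gt; call. The column names (&lt;code&gt;order_id&lt;/code&gt;, &lt;code&gt;product_id&lt;/code&gt;, &lt;code&gt;amount&lt;/code&gt;) and the results location are assumptions for illustration; your crawled schema will likely differ.&lt;/p&gt;

```python
# Hypothetical column names (order_id, product_id, amount); adjust to the crawled schema.
REVENUE_BY_PRODUCT = """
SELECT product_id,
       SUM(amount) AS revenue,
       COUNT(DISTINCT order_id) AS orders
FROM sales_data
WHERE amount IS NOT NULL
GROUP BY product_id
ORDER BY revenue DESC
LIMIT 10
""".strip()


def build_query_request(sql, database, output_location):
    """Build keyword arguments for the Athena start_query_execution API call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(request):
    """Submit the query. Needs boto3 and valid credentials; not called here."""
    import boto3  # imported lazily so the builder above works without AWS access

    athena = boto3.client("athena")
    response = athena.start_query_execution(**request)
    return response["QueryExecutionId"]


request = build_query_request(
    REVENUE_BY_PRODUCT,
    database="ecommerce_analytics",
    output_location="s3://my-ecommerce-analytics/athena-results/",  # assumed prefix
)
print(request["QueryString"])
```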

&lt;p&gt;&lt;strong&gt;Step 4: Transforming Data with AWS Glue Jobs&lt;/strong&gt;&lt;br&gt;
Raw data is rarely perfect, so I used AWS Glue ETL Jobs to clean and transform it. I created a Glue job in Python to remove duplicates, standardize timestamp formats, and join sales data with customer information.&lt;/p&gt;

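&lt;p&gt;Once transformed, I stored the cleaned data back in S3, this time in Parquet format. Parquet is columnar, so Athena reads only the columns a query needs, and because Athena bills by the amount of data scanned, future queries become both faster and cheaper.&lt;/p&gt;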

&lt;p&gt;If you’re implementing this, you can write your own ETL logic inside Glue Studio or the Glue Job editor.&lt;/p&gt;
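
&lt;p&gt;The three transformations can be sketched in plain Python, independent of any Glue API. This is not my actual Glue job: the field names, the timestamp layouts, and the sample rows are all made up for illustration, and a real job would apply the same logic to Spark DataFrames and write Parquet at the end.&lt;/p&gt;

```python
from datetime import datetime

# Timestamp layouts assumed to appear in the raw files (hypothetical).
KNOWN_FORMATS = ("%Y-%m-%d %H:%M:%S", "%d/%m/%Y %H:%M", "%Y-%m-%dT%H:%M:%S")


def standardize_timestamp(value):
    """Return the timestamp in ISO 8601 form, or None if no known format matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).isoformat()
        except ValueError:
            continue
    return None


def clean_sales(sales, customers):
    """Deduplicate on order_id, normalize timestamps, and attach customer names."""
    customers_by_id = {c["customer_id"]: c for c in customers}
    seen = set()
    cleaned = []
    for row in sales:
        if row["order_id"] in seen:
            continue  # keep only the first occurrence of each order
        seen.add(row["order_id"])
        customer = customers_by_id.get(row["customer_id"], {})
        cleaned.append({
            "order_id": row["order_id"],
            "customer_name": customer.get("name"),
            "ordered_at": standardize_timestamp(row["ordered_at"]),
            "amount": float(row["amount"]),
        })
    return cleaned


# Made-up sample rows: one exact duplicate and two different timestamp layouts.
sales = [
    {"order_id": "o1", "customer_id": "c1", "ordered_at": "2025-01-05 10:30:00", "amount": "19.99"},
    {"order_id": "o1", "customer_id": "c1", "ordered_at": "2025-01-05 10:30:00", "amount": "19.99"},
    {"order_id": "o2", "customer_id": "c2", "ordered_at": "06/01/2025 08:15", "amount": "5.00"},
]
customers = [{"customer_id": "c1", "name": "Asha"}, {"customer_id": "c2", "name": "Ben"}]

cleaned = clean_sales(sales, customers)
for row in cleaned:
    print(row)
```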

&lt;p&gt;&lt;strong&gt;Step 5: Querying Transformed Data with Athena&lt;/strong&gt;&lt;br&gt;
After transformation, I returned to Athena to query the cleaned data. This is where you can perform your own analytical queries like finding top-selling products, analyzing sales patterns, or identifying high-value customers.&lt;/p&gt;

&lt;p&gt;Athena makes it effortless: just write standard SQL queries, and it processes everything directly from S3.&lt;/p&gt;

&lt;p&gt;If you’re exploring AWS as a student, data engineer, or cloud enthusiast, I highly recommend trying this out with your own dataset. The true power of serverless analytics becomes clear once you query data sitting in S3, in seconds, without spinning up a single machine.&lt;/p&gt;

</description>
      <category>dataengineering</category>
      <category>analytics</category>
      <category>serverless</category>
      <category>aws</category>
    </item>
    <item>
      <title>Event-Driven Architectures on AWS: Beyond Lambda</title>
      <dc:creator>Nidhi Thakore</dc:creator>
      <pubDate>Fri, 05 Sep 2025 08:45:10 +0000</pubDate>
      <link>https://forem.com/nidhi_0105/event-driven-architectures-on-aws-beyond-lambda-295a</link>
      <guid>https://forem.com/nidhi_0105/event-driven-architectures-on-aws-beyond-lambda-295a</guid>
      <description>&lt;p&gt;When most people hear event-driven architecture on AWS, they instantly think Lambda.&lt;br&gt;
And yes, Lambda is amazing — serverless, pay-per-use, and perfect for quick triggers.&lt;/p&gt;

&lt;p&gt;But here’s the catch → &lt;strong&gt;event-driven systems are much bigger than just Lambda.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A question might arise in your mind:&lt;br&gt;
&lt;strong&gt;Why Event-Driven?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Traditional architectures rely on polling or batch jobs. Event-driven systems flip the script:&lt;br&gt;
    -&amp;gt; You don’t ask if something happened.&lt;br&gt;
    -&amp;gt; You react instantly when it does.&lt;br&gt;
This makes applications faster, cheaper, and more resilient.&lt;/p&gt;
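
&lt;p&gt;The difference is easiest to see in a toy example. Below is a tiny in-memory publish/subscribe bus, purely illustrative and not an AWS API; the &lt;code&gt;EventBus&lt;/code&gt; class and the &lt;code&gt;order.placed&lt;/code&gt; event name are hypothetical. Handlers run the moment an event is published, so nothing has to poll for changes.&lt;/p&gt;

```python
from collections import defaultdict


class EventBus:
    """Tiny in-memory publish/subscribe bus, for illustration only."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        # Every subscriber reacts as soon as the event is published; nobody polls.
        for handler in self._handlers[event_type]:
            handler(payload)


shipped = []
emails = []

bus = EventBus()
bus.subscribe("order.placed", lambda order: shipped.append(order["id"]))
bus.subscribe("order.placed", lambda order: emails.append("Receipt for " + order["id"]))

bus.publish("order.placed", {"id": "o-42"})
print(shipped, emails)
```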

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8lpwjo9c3ayd30qjoxb6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8lpwjo9c3ayd30qjoxb6.png" alt=" " width="800" height="1200"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;AWS Building Blocks for Event-Driven Systems&lt;/h2&gt;

&lt;p&gt;Think of AWS event-driven architecture as a team of specialists, each with a unique role:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;EventBridge&lt;/strong&gt; → The Traffic Controller&lt;br&gt;
Decides where the event should go. Perfect for connecting apps, services, and even third-party SaaS without tight coupling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SNS&lt;/strong&gt; (Simple Notification Service) → The Broadcaster&lt;br&gt;
Shouts the event out to many listeners at once: email, SMS, Lambda, or other apps. Great for fan-out patterns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SQS&lt;/strong&gt; (Simple Queue Service) → The Reliable Mailbox&lt;br&gt;
Holds events safely until a consumer is ready to process them, so messages are buffered rather than dropped during traffic spikes (within the queue’s retention period).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step Functions&lt;/strong&gt; → The Workflow Manager&lt;br&gt;
Coordinates multi-step processes. Adds retries, error handling, and parallel execution to keep business workflows smooth.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lambda&lt;/strong&gt; → The Quick Responder&lt;br&gt;
Executes business logic instantly. Serverless, auto-scaling, and cost-effective — but just one piece of the puzzle.&lt;/p&gt;
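
&lt;p&gt;To show how the broadcaster, the mailbox, and the responder fit together, here is a small in-memory simulation of the fan-out pattern. It uses no AWS SDK calls: the &lt;code&gt;Topic&lt;/code&gt; class stands in for SNS, plain &lt;code&gt;deque&lt;/code&gt; objects stand in for SQS queues, and the &lt;code&gt;drain&lt;/code&gt; function plays the part of a Lambda consumer. All names are hypothetical.&lt;/p&gt;

```python
from collections import deque


class Topic:
    """Stands in for an SNS topic: copies each message to every subscribed queue."""

    def __init__(self):
        self.queues = []

    def subscribe(self, queue):
        self.queues.append(queue)

    def publish(self, message):
        for queue in self.queues:
            queue.append(message)


def drain(queue, handler):
    """Stands in for a consumer (for example a Lambda) reading an SQS-like queue."""
    results = []
    while queue:
        results.append(handler(queue.popleft()))
    return results


billing_queue = deque()
analytics_queue = deque()

orders = Topic()
orders.subscribe(billing_queue)
orders.subscribe(analytics_queue)

# A burst of events is buffered in the queues until consumers are ready.
for order_id in ("o-1", "o-2", "o-3"):
    orders.publish({"order_id": order_id})

pending = len(billing_queue)
billed = drain(billing_queue, lambda message: "billed " + message["order_id"])
print(pending, billed, len(analytics_queue))
```

&lt;p&gt;Each queue receives its own copy of every message, so the billing consumer can finish its work while the analytics queue is still waiting, and neither side needs to know the other exists.&lt;/p&gt;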

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Event-driven architectures&lt;/em&gt;&lt;/strong&gt; are no longer “nice to have” — they’re becoming the default way to design modern applications. Businesses want systems that are:&lt;/p&gt;

&lt;p&gt;Real-time → responding instantly to customer actions&lt;/p&gt;

&lt;p&gt;Scalable → handling unpredictable workloads with ease&lt;/p&gt;

&lt;p&gt;Cost-efficient → paying only when something actually happens&lt;/p&gt;

&lt;p&gt;Resilient → loosely coupled so failures don’t cascade&lt;/p&gt;

&lt;p&gt;AWS gives us the perfect toolkit to achieve this: EventBridge, SNS, SQS, Step Functions, and Lambda — each playing a distinct role but working together seamlessly.&lt;/p&gt;

&lt;p&gt;The real shift for engineers and architects is moving away from thinking in terms of servers and cron jobs to thinking in terms of events and reactions.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;And with AWS, event-driven design means building apps that don’t just exist in the cloud — they listen, react, and scale with the world around them.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>blog</category>
      <category>aws</category>
      <category>dataengineering</category>
      <category>lambda</category>
    </item>
  </channel>
</rss>
