<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Nandhini D</title>
    <description>The latest articles on Forem by Nandhini D (@nandhini_d_3172ea1ec82a9e).</description>
    <link>https://forem.com/nandhini_d_3172ea1ec82a9e</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3460946%2F06c842e0-47df-46e4-97ed-ab81e4c27034.png</url>
      <title>Forem: Nandhini D</title>
      <link>https://forem.com/nandhini_d_3172ea1ec82a9e</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/nandhini_d_3172ea1ec82a9e"/>
    <language>en</language>
    <item>
      <title>DATA IN SKY(Data in cloud)</title>
      <dc:creator>Nandhini D</dc:creator>
      <pubDate>Mon, 06 Oct 2025 18:53:34 +0000</pubDate>
      <link>https://forem.com/nandhini_d_3172ea1ec82a9e/data-in-skydata-in-cloud-2dn9</link>
      <guid>https://forem.com/nandhini_d_3172ea1ec82a9e/data-in-skydata-in-cloud-2dn9</guid>
      <description>&lt;p&gt;&lt;strong&gt;Introduction&lt;/strong&gt;&lt;br&gt;
      In a world where everything is connected, from smart homes to self-driving cars, one invisible force powers it all: data. But the real magic isn’t just in the data itself; it’s in where it lives.&lt;br&gt;
Welcome to the cloud, where information floats freely, accessible anytime, anywhere.&lt;br&gt;
When we say “data in the sky,” it doesn’t mean our files are literally floating above us.&lt;br&gt;
Instead, they’re stored in large data centers around the world, managed by cloud providers who work to keep your data available whenever you need it.&lt;br&gt;
What is &lt;strong&gt;The Cloud&lt;/strong&gt; Really?&lt;br&gt;
     The term cloud often sounds like something abstract, but it’s actually a network of powerful servers distributed across the globe.&lt;br&gt;
When you upload a photo, stream a movie, or collaborate on a document in Google Drive, your data is stored, processed, and served from these cloud data centers.&lt;br&gt;
In short:&lt;br&gt;
The cloud 🡪 someone else’s computers, working for you over the internet.&lt;br&gt;
Let’s look at six data formats commonly used in data analytics and cloud systems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data Formats in Cloud Analytics&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every time you store, share, or query data in the cloud, you’re likely dealing with one of these six formats:&lt;/p&gt;

&lt;p&gt;CSV – Simple text-based, comma-separated data&lt;/p&gt;

&lt;p&gt;SQL – Relational, structured data tables&lt;/p&gt;

&lt;p&gt;JSON – Lightweight, flexible key-value data&lt;/p&gt;

&lt;p&gt;Parquet – Efficient, columnar storage for big data&lt;/p&gt;

&lt;p&gt;XML – Markup-based hierarchical data&lt;/p&gt;

&lt;p&gt;Avro – Binary, schema-driven data for streaming&lt;/p&gt;

&lt;p&gt;To make it easy to understand, let’s take a small dataset and represent it in all six formats.&lt;/p&gt;

&lt;p&gt;&lt;u&gt;Sample Dataset&lt;/u&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Name&lt;/th&gt;
&lt;th&gt;Register_No&lt;/th&gt;
&lt;th&gt;Subject&lt;/th&gt;
&lt;th&gt;Marks&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Aadhiran&lt;/td&gt;
&lt;td&gt;101&lt;/td&gt;
&lt;td&gt;AI&lt;/td&gt;
&lt;td&gt;92&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kavin&lt;/td&gt;
&lt;td&gt;102&lt;/td&gt;
&lt;td&gt;ML&lt;/td&gt;
&lt;td&gt;88&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rekha&lt;/td&gt;
&lt;td&gt;103&lt;/td&gt;
&lt;td&gt;DBMS&lt;/td&gt;
&lt;td&gt;95&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;1. CSV (Comma-Separated Values)&lt;/strong&gt;&lt;br&gt;
       CSV is one of the simplest and most widely used data formats. Each line in a CSV file represents a record, and the values within it are separated by commas.&lt;br&gt;
It’s easy to read, portable, and supported by almost every data tool, from Excel and Google Sheets to Python.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Name,Register_No,Subject,Marks
Aadhiran,101,AI,92
Kavin,102,ML,88
Rekha,103,DBMS,95
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;✅ Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Human-readable and easy to edit&lt;/li&gt;
&lt;li&gt;Works with almost every data tool&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;⚠️ Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No schema or type information; every value is read as text&lt;/li&gt;
&lt;li&gt;Inefficient for large-scale analytics, since a query has to scan whole rows&lt;/li&gt;
&lt;/ul&gt;
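
&lt;p&gt;As a minimal sketch, assuming the sample above is held in a string rather than a file (the names CSV_TEXT and load_students are illustrative only), Python’s standard csv module can parse it. Every value comes back as text, which is what the missing schema means in practice:&lt;/p&gt;

```python
import csv
import io

# The sample dataset from above, held in memory instead of a file.
CSV_TEXT = """Name,Register_No,Subject,Marks
Aadhiran,101,AI,92
Kavin,102,ML,88
Rekha,103,DBMS,95
"""


def load_students(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


students = load_students(CSV_TEXT)

# CSV carries no types: Marks arrives as the string "92", not the integer 92,
# so numeric columns have to be converted explicitly.
average = sum(int(row["Marks"]) for row in students) / len(students)
print(students[0]["Marks"], type(students[0]["Marks"]).__name__)
print(round(average, 2))
```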

&lt;p&gt;&lt;strong&gt;2. SQL (Structured Query Language)&lt;/strong&gt;&lt;br&gt;
      SQL is the language used to define and query relational data, which is stored in tables with defined columns and data types.&lt;br&gt;
It is the standard interface to databases like MySQL, PostgreSQL, and Oracle.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE TABLE Students (
  Name VARCHAR(50),
  Register_No INT,
  Subject VARCHAR(20),
  Marks INT
);

INSERT INTO Students VALUES 
('Aadhiran', 101, 'AI', 92),
('Kavin', 102, 'ML', 88),
('Rekha', 103, 'DBMS', 95);


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;✅ Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Structured, relational, and queryable&lt;/li&gt;
&lt;li&gt;Ideal for joins, filters, and aggregations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;⚠️ Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rigid schema&lt;/li&gt;
&lt;li&gt;Not suited for nested or semi-structured data&lt;/li&gt;
&lt;/ul&gt;
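
&lt;p&gt;To try the statements above without installing a database server, one option is Python’s built-in sqlite3 module. This is only a sketch: SQLite is used here as a stand-in for the databases named earlier, and the filter in the final query is an illustrative choice.&lt;/p&gt;

```python
import sqlite3

# In-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students ("
    "Name VARCHAR(50), Register_No INT, Subject VARCHAR(20), Marks INT)"
)
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?)",
    [
        ("Aadhiran", 101, "AI", 92),
        ("Kavin", 102, "ML", 88),
        ("Rekha", 103, "DBMS", 95),
    ],
)

# Filtering and sorting happen inside the database, not in application code.
top = conn.execute(
    "SELECT Name, Marks FROM Students "
    "WHERE Marks BETWEEN 90 AND 100 ORDER BY Marks DESC"
).fetchall()
print(top)
```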

&lt;p&gt;&lt;strong&gt;3. JSON (JavaScript Object Notation)&lt;/strong&gt;&lt;br&gt;
      JSON is the most common format for web APIs and the document model behind many NoSQL databases.&lt;br&gt;
It’s lightweight, flexible, and well suited to hierarchical or nested data structures.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[
  {"Name": "Aadhiran", "Register_No": 101, "Subject": "AI", "Marks": 92},
  {"Name": "Kavin", "Register_No": 102, "Subject": "ML", "Marks": 88},
  {"Name": "Rekha", "Register_No": 103, "Subject": "DBMS", "Marks": 95}
]

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;✅ Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Easy to use and parse&lt;/li&gt;
&lt;li&gt;Excellent for APIs and web applications&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;⚠️ Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No enforced schema&lt;/li&gt;
&lt;li&gt;Verbose for big datasets, since the keys are repeated in every record&lt;/li&gt;
&lt;/ul&gt;
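
&lt;p&gt;A short sketch with Python’s standard json module shows two of the points above: numbers keep their type after parsing, and nested data can be added without changing any schema. The Contact field and its email address are hypothetical, added only for illustration.&lt;/p&gt;

```python
import json

JSON_TEXT = """[
  {"Name": "Aadhiran", "Register_No": 101, "Subject": "AI", "Marks": 92},
  {"Name": "Kavin", "Register_No": 102, "Subject": "ML", "Marks": 88},
  {"Name": "Rekha", "Register_No": 103, "Subject": "DBMS", "Marks": 95}
]"""

students = json.loads(JSON_TEXT)

# JSON keeps basic types: Marks is parsed as an int, not a string.
best = max(students, key=lambda s: s["Marks"])
print(best["Name"], best["Marks"])

# Nesting needs no schema change; this extra field is hypothetical.
students[0]["Contact"] = {"email": "aadhiran@example.com"}
print(json.dumps(students[0], indent=2))
```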

&lt;p&gt;&lt;strong&gt;4. Parquet (Columnar Storage Format)&lt;/strong&gt;&lt;br&gt;
       Apache Parquet is designed for big data analytics.&lt;br&gt;
It stores data column-wise instead of row-wise, which improves compression (similar values sit together) and query performance (a query reads only the columns it needs).&lt;br&gt;
It’s widely supported by tools like Apache Spark, Amazon Athena, and Google BigQuery.&lt;/p&gt;

&lt;p&gt;Conceptual View:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Columns:
Name: ["Aadhiran", "Kavin", "Rekha"]
Register_No: [101, 102, 103]
Subject: ["AI", "ML", "DBMS"]
Marks: [92, 88, 95]

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;✅ Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Highly compressed and efficient&lt;/li&gt;
&lt;li&gt;Great for analytical queries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;⚠️ Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Not human-readable&lt;/li&gt;
&lt;li&gt;Requires tools to read/write (e.g., PyArrow, Spark)&lt;/li&gt;
&lt;/ul&gt;
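
&lt;p&gt;The column-wise idea can be sketched in plain Python. This is not Parquet’s actual on-disk layout, which also involves row groups, encodings, and compression; the helper to_columns is a hypothetical name that only pivots row records into per-column lists, to show why an aggregate over one column does not need the others.&lt;/p&gt;

```python
# Row-oriented records, as they would appear in CSV or JSON.
rows = [
    {"Name": "Aadhiran", "Register_No": 101, "Subject": "AI", "Marks": 92},
    {"Name": "Kavin", "Register_No": 102, "Subject": "ML", "Marks": 88},
    {"Name": "Rekha", "Register_No": 103, "Subject": "DBMS", "Marks": 95},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregate over one column touches only that column's values;
# the other three columns are never read.
average_marks = sum(columns["Marks"]) / len(columns["Marks"])
print(columns["Marks"])
print(round(average_marks, 2))
```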

&lt;p&gt;&lt;strong&gt;5. XML (Extensible Markup Language)&lt;/strong&gt;&lt;br&gt;
     XML is a markup language that uses tags to define data structure.&lt;br&gt;
It’s often used in web services, configuration files, and document exchange.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;Students&amp;gt;
  &amp;lt;Student&amp;gt;
    &amp;lt;Name&amp;gt;Aadhiran&amp;lt;/Name&amp;gt;
    &amp;lt;Register_No&amp;gt;101&amp;lt;/Register_No&amp;gt;
    &amp;lt;Subject&amp;gt;AI&amp;lt;/Subject&amp;gt;
    &amp;lt;Marks&amp;gt;92&amp;lt;/Marks&amp;gt;
  &amp;lt;/Student&amp;gt;
  &amp;lt;Student&amp;gt;
    &amp;lt;Name&amp;gt;Kavin&amp;lt;/Name&amp;gt;
    &amp;lt;Register_No&amp;gt;102&amp;lt;/Register_No&amp;gt;
    &amp;lt;Subject&amp;gt;ML&amp;lt;/Subject&amp;gt;
    &amp;lt;Marks&amp;gt;88&amp;lt;/Marks&amp;gt;
  &amp;lt;/Student&amp;gt;
  &amp;lt;Student&amp;gt;
    &amp;lt;Name&amp;gt;Rekha&amp;lt;/Name&amp;gt;
    &amp;lt;Register_No&amp;gt;103&amp;lt;/Register_No&amp;gt;
    &amp;lt;Subject&amp;gt;DBMS&amp;lt;/Subject&amp;gt;
    &amp;lt;Marks&amp;gt;95&amp;lt;/Marks&amp;gt;
  &amp;lt;/Student&amp;gt;
&amp;lt;/Students&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;✅ Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Self-descriptive and structured&lt;/li&gt;
&lt;li&gt;Ideal for hierarchical data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;⚠️ Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Verbose, since every value is wrapped in an opening and a closing tag&lt;/li&gt;
&lt;li&gt;Typically slower to parse than JSON&lt;/li&gt;
&lt;/ul&gt;
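
&lt;p&gt;As a rough sketch, Python’s standard xml.etree.ElementTree module can both produce and read the structure shown above. The tree is built programmatically here instead of being pasted in as text, and the variable names are illustrative only.&lt;/p&gt;

```python
import xml.etree.ElementTree as ET

records = [
    ("Aadhiran", 101, "AI", 92),
    ("Kavin", 102, "ML", 88),
    ("Rekha", 103, "DBMS", 95),
]

# Build the same Students/Student tree as the example above.
root = ET.Element("Students")
for name, register_no, subject, marks in records:
    student = ET.SubElement(root, "Student")
    ET.SubElement(student, "Name").text = name
    ET.SubElement(student, "Register_No").text = str(register_no)
    ET.SubElement(student, "Subject").text = subject
    ET.SubElement(student, "Marks").text = str(marks)

# Serialize to text, as it would be stored or sent over the network.
xml_text = ET.tostring(root, encoding="unicode")

# Parse the text back and read values by tag name.
# Element text is always a string, so Marks is converted explicitly.
parsed = ET.fromstring(xml_text)
marks_by_name = {
    s.findtext("Name"): int(s.findtext("Marks")) for s in parsed.findall("Student")
}
print(marks_by_name)
```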

&lt;p&gt;&lt;strong&gt;6. Avro (Row-Based Storage Format)&lt;/strong&gt;&lt;br&gt;
       Apache Avro is a binary, row-based format for data serialization. It is well suited to streaming and messaging systems like Apache Kafka.&lt;br&gt;
Every Avro data file stores its schema alongside the data, so readers can always decode it, and the schema can evolve over time without breaking older data.&lt;/p&gt;

&lt;p&gt;Schema Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "type": "record",
  "name": "Student",
  "fields": [
    {"name": "Name", "type": "string"},
    {"name": "Register_No", "type": "int"},
    {"name": "Subject", "type": "string"},
    {"name": "Marks", "type": "int"}
  ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;✅ Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Compact binary format&lt;/li&gt;
&lt;li&gt;Schema evolution supported&lt;/li&gt;
&lt;li&gt;Excellent for data streaming&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;⚠️ Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Not human-readable&lt;/li&gt;
&lt;li&gt;Requires Avro libraries to use&lt;/li&gt;
&lt;/ul&gt;
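
&lt;p&gt;Reading and writing real Avro files needs a library such as fastavro or the official avro package, and the sketch below does not do that. It only illustrates, in plain Python, what “schema-driven” means: the hypothetical helper conforms checks a record against the schema above, covering just the two primitive types that schema uses.&lt;/p&gt;

```python
import json

SCHEMA = json.loads("""
{
  "type": "record",
  "name": "Student",
  "fields": [
    {"name": "Name", "type": "string"},
    {"name": "Register_No", "type": "int"},
    {"name": "Subject", "type": "string"},
    {"name": "Marks", "type": "int"}
  ]
}
""")

# Only the two primitive types used by this schema are covered.
PYTHON_TYPES = {"string": str, "int": int}


def conforms(record, schema):
    """Return True if record has exactly the schema's fields with matching types."""
    fields = {f["name"]: PYTHON_TYPES[f["type"]] for f in schema["fields"]}
    if set(record) != set(fields):
        return False
    return all(isinstance(record[name], kind) for name, kind in fields.items())


good = {"Name": "Kavin", "Register_No": 102, "Subject": "ML", "Marks": 88}
bad = {"Name": "Kavin", "Register_No": "102", "Subject": "ML", "Marks": 88}
print(conforms(good, SCHEMA), conforms(bad, SCHEMA))
```

&lt;p&gt;A real Avro writer performs this kind of check during serialization, which is why malformed records are rejected before they reach a stream.&lt;/p&gt;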

&lt;p&gt;&lt;strong&gt;Conclusion:&lt;/strong&gt;&lt;br&gt;
     Each data format serves a different purpose in the data ecosystem, so the right choice depends on the use case.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use Case&lt;/strong&gt;&lt;br&gt;
Simple exports or logs 🡪 CSV&lt;br&gt;
Relational storage 🡪 SQL&lt;br&gt;
API responses or nested data 🡪 JSON&lt;br&gt;
Cloud-scale analytics 🡪 Parquet&lt;br&gt;
Hierarchical or document data 🡪 XML&lt;br&gt;
Data pipelines or streaming 🡪 Avro&lt;/p&gt;

</description>
      <category>cloud</category>
      <category>aws</category>
      <category>database</category>
    </item>
    <item>
      <title>Getting Started with MongoDB: Essential Queries</title>
      <dc:creator>Nandhini D</dc:creator>
      <pubDate>Tue, 26 Aug 2025 17:42:12 +0000</pubDate>
      <link>https://forem.com/nandhini_d_3172ea1ec82a9e/getting-started-with-mongodb-essential-queries-j0b</link>
      <guid>https://forem.com/nandhini_d_3172ea1ec82a9e/getting-started-with-mongodb-essential-queries-j0b</guid>
      <description>&lt;p&gt;&lt;strong&gt;Introduction&lt;/strong&gt;&lt;br&gt;
I recently started learning MongoDB, a NoSQL database, and wanted to share some basic queries I practiced. MongoDB stores data in flexible, JSON-like documents, and that flexible schema, combined with its querying capabilities, makes it a good fit for modern applications.&lt;/p&gt;

&lt;p&gt;Data design:&lt;br&gt;
For this, I designed a collection of student records with the following fields:&lt;br&gt;
     {&lt;br&gt;
        "_id": "68a8360214d18e05a6850e6e",&lt;br&gt;
        "name": "John Doe",&lt;br&gt;
        "number_courses": 3,&lt;br&gt;
        "time_study": 4.508,&lt;br&gt;
        "marks": 19.202&lt;br&gt;
     }&lt;br&gt;
DB name: &lt;strong&gt;Review&lt;/strong&gt; (the queries below access the collection as db.Review)&lt;/p&gt;

&lt;p&gt;CRUD operations and queries:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Create (C):&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; db.Review.insertOne({
     "name": "Jane Smith",
     "number_courses": 4,
     "time_study": 5.0,
     "marks": 21.5
 });
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbjhclav19766300io7iv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbjhclav19766300io7iv.png" alt=" " width="800" height="449"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Top 5 students with highest marks:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    db.Review.find().sort({ marks: -1 }).limit(5);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbuaessgwa1117c5jisa6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbuaessgwa1117c5jisa6.png" alt=" " width="800" height="449"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Query to find students with marks greater than 40:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    db.Review.find({ marks: { $gt: 40 } })
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzoa0rwz5wynw5x34donm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzoa0rwz5wynw5x34donm.png" alt=" " width="518" height="262"&gt;&lt;/a&gt;&lt;/p&gt;
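
&lt;p&gt;To make the semantics of the sort/limit and $gt queries above concrete, here is a plain-Python analogue that runs on an in-memory list instead of a MongoDB server. The helper names are hypothetical, and only the first two documents reuse values from this post; the others are made up for illustration. The last line shows that field names are case-sensitive, so filtering on Marks finds nothing when the documents store marks.&lt;/p&gt;

```python
# Hypothetical documents shaped like the collection's schema.
# The first two reuse values from this post; the rest are made up.
docs = [
    {"name": "John Doe", "number_courses": 3, "time_study": 4.508, "marks": 19.202},
    {"name": "Jane Smith", "number_courses": 4, "time_study": 5.0, "marks": 21.5},
    {"name": "Student A", "number_courses": 6, "time_study": 7.1, "marks": 45.0},
    {"name": "Student B", "number_courses": 8, "time_study": 7.9, "marks": 53.4},
]


def top_by_marks(documents, n):
    """Analogue of find().sort({ marks: -1 }).limit(n)."""
    return sorted(documents, key=lambda d: d["marks"], reverse=True)[:n]


def marks_greater_than(documents, threshold, field="marks"):
    """Analogue of find({ marks: { $gt: threshold } }); field names are case-sensitive."""
    return [d for d in documents if field in d and d[field] > threshold]


print([d["name"] for d in top_by_marks(docs, 2)])
print([d["name"] for d in marks_greater_than(docs, 40)])
print(marks_greater_than(docs, 40, field="Marks"))
```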

&lt;p&gt;&lt;strong&gt;4. Query to get the document with a specific ID:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;   db.Review.find({_id:ObjectId('68a83923181c0b16f38c0ec6')})
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fet67ee1q4jrs35uhy793.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fet67ee1q4jrs35uhy793.png" alt=" " width="709" height="260"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>mongodb</category>
      <category>beginners</category>
      <category>nosql</category>
      <category>database</category>
    </item>
  </channel>
</rss>
