<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Naveena</title>
    <description>The latest articles on Forem by Naveena (@naveena_davay).</description>
    <link>https://forem.com/naveena_davay</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3696869%2Fd46b9dbb-e237-45ab-9064-cfe9870686ce.png</url>
      <title>Forem: Naveena</title>
      <link>https://forem.com/naveena_davay</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/naveena_davay"/>
    <language>en</language>
    <item>
      <title>From CMDB to Clarity: Mapping Products to Applications Without Losing Your Mind</title>
      <dc:creator>Naveena</dc:creator>
      <pubDate>Mon, 06 Apr 2026 20:01:53 +0000</pubDate>
      <link>https://forem.com/naveena_davay/from-cmdb-to-clarity-mapping-products-to-applications-without-losing-your-mind-3efl</link>
      <guid>https://forem.com/naveena_davay/from-cmdb-to-clarity-mapping-products-to-applications-without-losing-your-mind-3efl</guid>
      <description>&lt;p&gt;When working in large enterprises, especially in healthcare or financial systems, knowing what you own isn’t enough.&lt;br&gt;
You also need to know:&lt;br&gt;
&lt;em&gt;What business product does this application actually support?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This article walks through how to move from a messy CMDB in ServiceNow to a usable product-to-application mapping—so you can make safer decisions, reduce technical debt, and stop guessing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;The Problem&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
You’re asked a simple question:&lt;br&gt;
“Can we retire this application?”&lt;br&gt;
You check your CMDB:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;App exists ✅&lt;/li&gt;
&lt;li&gt;Owner exists (maybe outdated) ✅&lt;/li&gt;
&lt;li&gt;Infrastructure mapped ✅&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But…&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which product uses it? ❌&lt;/li&gt;
&lt;li&gt;Is it still critical? ❌&lt;/li&gt;
&lt;li&gt;What breaks if you remove it? ❌&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Meanwhile, your architecture looks like this:&lt;br&gt;
App A → DB1 &lt;br&gt;
App B → DB2 &lt;br&gt;
App C → ??? &lt;br&gt;
No product context. No clarity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;The Approach&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
Define Product Mapping&lt;br&gt;
Start by explicitly mapping applications to business products.&lt;br&gt;
CREATE TABLE product_app_map (&lt;br&gt;
   product_name STRING,&lt;br&gt;
   application_name STRING&lt;br&gt;
);&lt;br&gt;
Even a simple mapping table changes everything.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Backfill Using What You Already Know&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
Pull data from:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CMDB (applications, owners)&lt;/li&gt;
&lt;li&gt;Business docs (products)&lt;/li&gt;
&lt;li&gt;Logs / usage patterns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;INSERT INTO product_app_map&lt;br&gt;
SELECT&lt;br&gt;
   'Claims Processing' AS product_name,&lt;br&gt;
   app_name&lt;br&gt;
FROM cmdb_applications&lt;br&gt;
WHERE tag = 'claims';&lt;/p&gt;

&lt;p&gt;It won’t be perfect—but it’s a start.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Handle “Unknowns” Explicitly&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
You will find apps like this:&lt;br&gt;
App X → Owner: Unknown → Usage: Unknown&lt;br&gt;
Don’t ignore them. Flag them.&lt;/p&gt;

&lt;p&gt;SELECT application_name&lt;br&gt;
FROM product_app_map&lt;br&gt;
WHERE product_name IS NULL;&lt;br&gt;
&lt;em&gt;These are your hidden technical debt hotspots.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Validate with Real Usage Data&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
Augment mapping using logs or query patterns:&lt;br&gt;
def infer_product(app_logs):&lt;br&gt;
   if "claims" in app_logs:&lt;br&gt;
       return "Claims Processing"&lt;br&gt;
   elif "billing" in app_logs:&lt;br&gt;
       return "Billing"&lt;br&gt;
   else:&lt;br&gt;
       return "Unknown"&lt;br&gt;
Not perfect—but better than tribal knowledge.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;Enforce It in Governance&lt;/strong&gt;&lt;/em&gt;&lt;br&gt;
Make mapping mandatory:&lt;br&gt;
if (!application.product) {&lt;br&gt;
 throw new Error("Application not mapped to product");&lt;br&gt;
}&lt;br&gt;
No mapping → no deployment, no change approval.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;Why This Works&lt;/strong&gt;&lt;/em&gt;&lt;br&gt;
You shift from assets → context&lt;br&gt;
Unknown systems become visible risks&lt;br&gt;
Impact analysis becomes trivial&lt;br&gt;
Instead of:&lt;br&gt;
“I think this app is used somewhere…”&lt;br&gt;
You get:&lt;br&gt;
SELECT product_name&lt;br&gt;
FROM product_app_map&lt;br&gt;
WHERE application_name = 'App A';&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;The Payoff&lt;/strong&gt;&lt;/em&gt;&lt;br&gt;
Safer decommissioning&lt;br&gt;
Faster incident resolution&lt;br&gt;
Clear ownership and accountability&lt;br&gt;
Real visibility into technical debt&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Final Thought&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
Most enterprises don’t lack data.&lt;br&gt;
They lack meaningful relationships between that data.&lt;br&gt;
Your CMDB knows everything you have.&lt;br&gt;
It just doesn’t know why you have it.&lt;/p&gt;

</description>
      <category>assetmanagement</category>
      <category>datagovernance</category>
      <category>architecture</category>
      <category>servicenow</category>
    </item>
    <item>
      <title>🚨 Why Your SQL Window Functions Betray You in Cloud SSMS vs Snowflake</title>
      <dc:creator>Naveena</dc:creator>
      <pubDate>Sat, 21 Feb 2026 18:16:20 +0000</pubDate>
      <link>https://forem.com/naveena_davay/why-your-sql-window-functions-betray-you-in-cloud-ssms-vs-snowflake-147l</link>
      <guid>https://forem.com/naveena_davay/why-your-sql-window-functions-betray-you-in-cloud-ssms-vs-snowflake-147l</guid>
      <description>&lt;p&gt;You run the same query.&lt;br&gt;
 Same data.&lt;br&gt;
 Same logic.&lt;br&gt;
But your numbers don’t match.&lt;br&gt;
Welcome to the sneaky world of window functions — where defaults quietly change your results between Microsoft SQL Server (Cloud SSMS) and Snowflake.&lt;br&gt;
Let’s break down the drama.&lt;/p&gt;

&lt;p&gt;🎭 The Silent Villain: Default Window Frames&lt;br&gt;
Here’s a classic trap:&lt;br&gt;
LAST_VALUE(amount) OVER (&lt;br&gt;
   PARTITION BY customer_id&lt;br&gt;
   ORDER BY order_date&lt;br&gt;
)&lt;br&gt;
In Snowflake, the default frame is:&lt;br&gt;
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW&lt;br&gt;
Translation?&lt;br&gt;
 LAST_VALUE() returns the current row’s value, not the actual last value in the partition.&lt;br&gt;
In SQL Server, the default frame is the same, so the trap travels with you; if your numbers ever looked right, your test data just happened to hide it.&lt;br&gt;
💥 Result: mismatched reports, confused stakeholders, late-night debugging.&lt;br&gt;
Pro Tip:&lt;br&gt;
 Always define your frame explicitly:&lt;br&gt;
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING&lt;br&gt;
If it matters, don’t leave it to defaults.&lt;/p&gt;
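&lt;p&gt;The trap is easy to reproduce in any engine with SQL-standard window semantics. A minimal sketch using SQLite as a neutral stand-in (the table and column names are illustrative, not from either production system):&lt;/p&gt;

```python
# Reproduce the LAST_VALUE default-frame trap. SQLite is only a stand-in
# engine here; its default frame is the same SQL-standard
# RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE orders (customer_id INT, order_date TEXT, amount INT);
    INSERT INTO orders VALUES
        (1, '2024-01-01', 100),
        (1, '2024-01-02', 200),
        (1, '2024-01-03', 300);
""")

# Default frame: LAST_VALUE just echoes the current row's value.
default_frame = [r[0] for r in con.execute("""
    SELECT LAST_VALUE(amount) OVER (
        PARTITION BY customer_id ORDER BY order_date)
    FROM orders ORDER BY order_date
""")]

# Explicit frame: LAST_VALUE is the real last value in the partition.
explicit_frame = [r[0] for r in con.execute("""
    SELECT LAST_VALUE(amount) OVER (
        PARTITION BY customer_id ORDER BY order_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
    FROM orders ORDER BY order_date
""")]

print(default_frame)   # [100, 200, 300]
print(explicit_frame)  # [300, 300, 300]
```

&lt;p&gt;Same function, same data, two different answers, and only the frame changed.&lt;/p&gt;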

&lt;p&gt;🧨 NULLs: The Chaos Agents&lt;br&gt;
Sorting rules differ:&lt;br&gt;
Snowflake → treats NULL as the highest value, so NULLs sort last in ascending order&lt;br&gt;
SQL Server → treats NULL as the lowest value, so NULLs sort first&lt;/p&gt;

&lt;p&gt;Now imagine using:&lt;br&gt;
ROW_NUMBER() OVER (PARTITION BY dept ORDER BY bonus)&lt;br&gt;
If bonus has NULLs?&lt;br&gt;
Your row numbers shift.&lt;br&gt;
Your rankings change.&lt;br&gt;
Your dashboard breaks.&lt;/p&gt;
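&lt;p&gt;A portable fix is to sort on an explicit NULL flag, which behaves the same on every engine instead of depending on NULLS FIRST/LAST support. A sketch with SQLite (names illustrative), which happens to sort NULLs first in ascending order, like SQL Server:&lt;/p&gt;

```python
# Show how NULLs move ROW_NUMBER() around, and a portable fix:
# ordering by (bonus IS NULL) first pins NULLs to the end on any engine.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE emp (dept TEXT, name TEXT, bonus INT);
    INSERT INTO emp VALUES
        ('eng', 'ann', 100),
        ('eng', 'bob', NULL),
        ('eng', 'cid', 50);
""")

# Engine default (NULL lowest here): bob, the NULL, ranks first.
default_order = [r[0] for r in con.execute("""
    SELECT name FROM (
        SELECT name,
               ROW_NUMBER() OVER (PARTITION BY dept ORDER BY bonus) AS rn
        FROM emp)
    ORDER BY rn
""")]

# Explicit NULL handling: real bonuses first, NULLs pinned last.
explicit_order = [r[0] for r in con.execute("""
    SELECT name FROM (
        SELECT name,
               ROW_NUMBER() OVER (
                   PARTITION BY dept ORDER BY (bonus IS NULL), bonus) AS rn
        FROM emp)
    ORDER BY rn
""")]

print(default_order)   # ['bob', 'cid', 'ann']
print(explicit_order)  # ['cid', 'ann', 'bob']
```

&lt;p&gt;On an engine with the opposite default, the first query flips and the second one doesn't. That's the point.&lt;/p&gt;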

&lt;p&gt;🎲 Ties = Non-Deterministic Madness&lt;br&gt;
This one burns teams during migrations.&lt;br&gt;
ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary)&lt;br&gt;
If two employees earn the same salary:&lt;br&gt;
SQL Server might return one order.&lt;br&gt;
Snowflake (distributed engine) might return another.&lt;br&gt;
Re-run it? You might get a different result again.&lt;/p&gt;

&lt;p&gt;Because neither engine guarantees deterministic ordering unless you make it deterministic.&lt;br&gt;
🔥 Add a tie-breaker:&lt;br&gt;
ORDER BY salary, employee_id&lt;br&gt;
Always.&lt;/p&gt;
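&lt;p&gt;The tie-breaker fix is easy to verify: once a unique column joins the ORDER BY, the ranking is fully determined and re-runs always agree. A sketch with illustrative names:&lt;/p&gt;

```python
# Two employees tie on salary; adding employee_id to the ORDER BY makes
# the ROW_NUMBER() assignment fully deterministic across re-runs.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE emp (employee_id INT, dept TEXT, salary INT);
    INSERT INTO emp VALUES
        (7, 'eng', 90),
        (3, 'eng', 90),
        (5, 'eng', 80);
""")

query = """
    SELECT employee_id,
           ROW_NUMBER() OVER (
               PARTITION BY dept ORDER BY salary, employee_id) AS rn
    FROM emp ORDER BY rn
"""
run1 = con.execute(query).fetchall()
run2 = con.execute(query).fetchall()

print(run1)          # [(5, 1), (3, 2), (7, 3)] -- tie broken by id
print(run1 == run2)  # True: stable on every run
```

&lt;p&gt;Without employee_id in the ORDER BY, nothing in the SQL standard forces either engine to put 3 before 7.&lt;/p&gt;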

&lt;p&gt;🧬 Collation &amp;amp; Case Sensitivity&lt;br&gt;
SQL Server respects database-level collation.&lt;br&gt;
 Snowflake handles text comparisons differently.&lt;br&gt;
If you're partitioning by strings, grouping may not match exactly after migration.&lt;br&gt;
Subtle. Painful. Real.&lt;/p&gt;

&lt;p&gt;☁️ Engine vs. Cloud Architecture&lt;br&gt;
SQL Server executes in a traditional engine model.&lt;br&gt;
Snowflake distributes computation across clusters.&lt;br&gt;
Distributed systems expose sloppy ordering assumptions fast.&lt;br&gt;
What “worked fine” before?&lt;br&gt;
 Was probably relying on physical storage order.&lt;br&gt;
Snowflake doesn’t care about your assumptions.&lt;/p&gt;

&lt;p&gt;🛡️ The Migration Survival Checklist&lt;br&gt;
If you want consistent results:&lt;br&gt;
✅ Explicit ROWS BETWEEN&lt;br&gt;
 ✅ Deterministic ORDER BY&lt;br&gt;
 ✅ Explicit NULL handling&lt;br&gt;
 ✅ Test ties&lt;br&gt;
 ✅ Test edge cases&lt;br&gt;
 ✅ Never trust defaults&lt;/p&gt;

&lt;p&gt;🎯 Bottom Line&lt;br&gt;
Window functions aren’t broken.&lt;br&gt;
Your assumptions are.&lt;br&gt;
When moving from SQL Server to Snowflake, make everything explicit.&lt;br&gt;
Because in analytics…&lt;br&gt;
“Almost the same result” is not the same result.&lt;/p&gt;

</description>
      <category>sql</category>
      <category>snowflake</category>
      <category>cloud</category>
      <category>analytics</category>
    </item>
    <item>
      <title>From Parquet to Snowflake: Query Smart, Load Fast</title>
      <dc:creator>Naveena</dc:creator>
      <pubDate>Tue, 06 Jan 2026 18:05:05 +0000</pubDate>
      <link>https://forem.com/naveena_davay/from-parquet-to-snowflake-query-smart-load-fast-1gm6</link>
      <guid>https://forem.com/naveena_davay/from-parquet-to-snowflake-query-smart-load-fast-1gm6</guid>
      <description>&lt;p&gt;When working with large volumes of financial data, querying efficiently and loading the results into a data warehouse like Snowflake is crucial. This article walks through how an analyst can handle millions of records stored as Parquet files in AWS S3 and export processed data to Snowflake.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Problem&lt;/strong&gt;&lt;br&gt;
The task is to generate daily metrics (like total transaction volume, active customers, and average balances) from 3 TB of Parquet data. The data is partitioned by transaction_date in S3, but older partitions have inconsistent column names. The results must then be loaded into Snowflake for further analysis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Approach&lt;/strong&gt;&lt;br&gt;
Efficiently Query the Data&lt;br&gt;
 Instead of scanning the entire dataset, you only read the last 30 days of data by using partition pruning. This saves both time and cost.&lt;/p&gt;

&lt;p&gt;Handle Schema Evolution&lt;br&gt;
 As the schema has changed over time (e.g., different column names for balance), you use SQL functions like COALESCE to handle missing or differently named columns, ensuring consistency.&lt;/p&gt;

&lt;p&gt;Aggregate Metrics&lt;br&gt;
 You aggregate data by region to calculate total transaction volume, count active customers, and find the average account balance.&lt;/p&gt;

&lt;p&gt;Load Data into Snowflake&lt;br&gt;
 After processing the data, you use Snowflake’s COPY INTO method for efficient, large-scale ingestion, moving the results from a CSV file into your warehouse.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why This Works&lt;/strong&gt;&lt;br&gt;
Partition pruning ensures that only the relevant data is queried, making it fast and cost-efficient.&lt;/p&gt;

&lt;p&gt;Schema handling with functions like COALESCE allows for seamless integration across different data partitions.&lt;/p&gt;

&lt;p&gt;Snowflake’s optimized loading mechanisms allow for fast and reliable data transfer.&lt;/p&gt;

&lt;p&gt;This approach makes working with large, partitioned datasets in cloud storage manageable, while ensuring efficient data processing and loading into Snowflake.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Solution in Python (DuckDB + SQL)&lt;/strong&gt;&lt;br&gt;
Read the last 30 days of Parquet using partition pruning:&lt;/p&gt;

&lt;p&gt;import duckdb&lt;br&gt;
import datetime&lt;/p&gt;

&lt;p&gt;end = datetime.date.today()&lt;br&gt;
start = end - datetime.timedelta(days=30)&lt;/p&gt;

&lt;p&gt;con = duckdb.connect()&lt;br&gt;
df = con.execute(f"""&lt;br&gt;
    SELECT *&lt;br&gt;
    FROM read_parquet('s3://bank-lake/transactions/*/*.parquet', hive_partitioning=true)&lt;br&gt;
    WHERE transaction_date BETWEEN DATE '{start}' AND DATE '{end}'&lt;br&gt;
""").fetchdf()&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Aggregate metrics, handling schema differences:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;result = con.execute("""&lt;br&gt;
    SELECT&lt;br&gt;
        region,&lt;br&gt;
        SUM(transaction_amount) AS total_tx,&lt;br&gt;
        COUNT(DISTINCT customer_id) AS active_customers,&lt;br&gt;
        AVG(COALESCE(account_balance, acct_balance)) AS avg_balance&lt;br&gt;
    FROM df&lt;br&gt;
    GROUP BY region&lt;br&gt;
""").fetchdf()&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Load the results into Snowflake:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;import snowflake.connector&lt;/p&gt;

&lt;p&gt;result.to_csv("daily.csv", index=False)&lt;/p&gt;

&lt;p&gt;conn = snowflake.connector.connect(&lt;br&gt;
    user='YOUR_USER',&lt;br&gt;
    password='YOUR_PASSWORD',&lt;br&gt;
    account='YOUR_ACCOUNT'&lt;br&gt;
)&lt;br&gt;
cur = conn.cursor()&lt;br&gt;
# The connector runs one statement per execute() call&lt;br&gt;
cur.execute("PUT file://daily.csv @%DAILY_REGION_METRICS")&lt;br&gt;
cur.execute("""&lt;br&gt;
    COPY INTO DAILY_REGION_METRICS&lt;br&gt;
    FROM @%DAILY_REGION_METRICS&lt;br&gt;
    FILE_FORMAT=(TYPE=CSV SKIP_HEADER=1 FIELD_OPTIONALLY_ENCLOSED_BY='"')&lt;br&gt;
""")&lt;/p&gt;

</description>
      <category>snowsql</category>
      <category>awsdatalake</category>
      <category>analytics</category>
      <category>snowflake</category>
    </item>
  </channel>
</rss>
