<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Steven Hur</title>
    <description>The latest articles on Forem by Steven Hur (@jongwan93).</description>
    <link>https://forem.com/jongwan93</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3477497%2F97adaa1e-50c9-4e82-bfc0-f9bce82a4c1e.jpeg</url>
      <title>Forem: Steven Hur</title>
      <link>https://forem.com/jongwan93</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/jongwan93"/>
    <language>en</language>
    <item>
      <title>Escaping Localhost</title>
      <dc:creator>Steven Hur</dc:creator>
      <pubDate>Fri, 12 Dec 2025 22:12:18 +0000</pubDate>
      <link>https://forem.com/jongwan93/escaping-localhost-3kc</link>
      <guid>https://forem.com/jongwan93/escaping-localhost-3kc</guid>
      <description>&lt;p&gt;For a long time, my development life existed within the predictable world of my local machine. I wrote code, it ran, and that was the extent of my world. &lt;/p&gt;

&lt;p&gt;A few months ago, I had the chance to step outside my comfort zone and dive into the world of open source. If I had to describe the feeling of that first moment, I would point to a specific scene from the Disney movie "Ralph Breaks the Internet".&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2lwr6au3308je841hu2x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2lwr6au3308je841hu2x.png" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3y1y4e0kgnbtpdbonrjl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3y1y4e0kgnbtpdbonrjl.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;A scene from the movie "Ralph Breaks the Internet"&lt;/em&gt;&lt;br&gt;
&lt;em&gt;Ralph and Vanellope walk into the world of the internet for the first time&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Just as Ralph and Vanellope stood on that balcony, gazing wide-eyed at the endless, futuristic skyline, I felt completely small. In the movie, the Internet is depicted as a sprawling, infinite metropolis, bustling with flying vehicles and towering skyscrapers that represent the giants of the web.&lt;/p&gt;

&lt;p&gt;Coming from the quiet, controlled environment of my local machine, the Open Source ecosystem felt like that futuristic city. The towering buildings weren't Amazon or Google, but massive repositories with millions of lines of code. The flying cars weren't just traffic. They were the steady stream of &lt;code&gt;Pull Requests&lt;/code&gt;, &lt;code&gt;Issues&lt;/code&gt;, and &lt;code&gt;Discussions&lt;/code&gt; happening in real time across the world. People were continuously building, rebuilding, breaking, and fixing projects.&lt;/p&gt;

&lt;p&gt;It was terrifying, yes. But just like Ralph looking out at that horizon, I realized the potential of this limitless world.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My Contribution Highlights&lt;/strong&gt;&lt;br&gt;
Driven by this excitement, I didn't want to just be a tourist in this new city. It was intimidating, but I am incredibly proud to say that I have successfully contributed to some of the foundational pillars of the Python data ecosystem.&lt;/p&gt;

&lt;p&gt;I have had PRs merged into:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Scikit-learn&lt;/li&gt;
&lt;li&gt;NumPy&lt;/li&gt;
&lt;li&gt;Pandas&lt;/li&gt;
&lt;li&gt;Dagster&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Seeing my code become part of tools that millions of developers rely on was an exciting experience.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why I Fell for &lt;code&gt;Dagster&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;
This realization explains why I fell so deeply for &lt;code&gt;Dagster&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;While exploring it, I was amazed by its core philosophy of &lt;code&gt;Software-defined Assets&lt;/code&gt;. The concept of treating data not just as a byproduct but as a &lt;code&gt;first-class asset&lt;/code&gt; was very interesting. Treating data as &lt;code&gt;assets&lt;/code&gt; shifts the focus from managing execution tasks to maintaining the freshness of the actual data products. This approach automatically generates clear lineage graphs, allowing you to easily understand dependencies and track how data flows through the system. As a result, debugging and collaboration become significantly more efficient because you are interacting with defined data outcomes rather than abstract code logic.&lt;/p&gt;

&lt;p&gt;Reading the &lt;code&gt;Dagster&lt;/code&gt; source code didn't feel like studying. I found myself mentally visualizing the entire process: how the data flows, how the assets are materialized, and how the engine handles dependencies. Simulating these complex data journeys in my head was incredibly fun and engaging.&lt;/p&gt;

&lt;p&gt;Stepping out of my &lt;code&gt;local machine&lt;/code&gt; and jumping into the &lt;code&gt;open-source&lt;/code&gt; world brought lots of changes. It helped me realize my passion for data management systems. This was a fantastic and fun experience, and I will continue this journey.&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>devjournal</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Continuous Journey through Dagster - bugs and testing</title>
      <dc:creator>Steven Hur</dc:creator>
      <pubDate>Tue, 09 Dec 2025 21:25:14 +0000</pubDate>
      <link>https://forem.com/jongwan93/continuous-journey-through-dagster-bugs-and-testing-4d5b</link>
      <guid>https://forem.com/jongwan93/continuous-journey-through-dagster-bugs-and-testing-4d5b</guid>
      <description>&lt;p&gt;Lately, I've been diving deep into open-source contributions for &lt;code&gt;Dagster&lt;/code&gt;. I think I am getting bit more comfortable with their codebases which hastened my working process(placebo?). Today, I want to share the issues I've tackled recently and talk about a significant roadblock I'm currently facing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My Recent Contributions&lt;/strong&gt;&lt;br&gt;
I focused on fixing several bugs and improving stability across different parts of &lt;code&gt;Dagster&lt;/code&gt;. Here is a breakdown of the issues I worked on:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Fixing ECS Pipes Client Execution&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The Issue: Users were encountering an &lt;code&gt;IndexError&lt;/code&gt; when launching tasks using the &lt;code&gt;PipesECSClient&lt;/code&gt;. This caused pipelines to crash unexpectedly in ECS environments.&lt;/p&gt;

&lt;p&gt;The Fix: I added proper exception handling and bounds checking to ensure the client launches tasks smoothly without crashing on index errors.&lt;/p&gt;
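
&lt;p&gt;As a rough illustration of the kind of bounds checking described above (this is not the actual Dagster patch), the sketch below checks for an empty task list before indexing into it. The helper name &lt;code&gt;first_task_arn&lt;/code&gt; is hypothetical, and the response is assumed to be a plain dict with the &lt;code&gt;tasks&lt;/code&gt; and &lt;code&gt;failures&lt;/code&gt; keys that an ECS &lt;code&gt;run_task&lt;/code&gt; call returns.&lt;/p&gt;

```python
def first_task_arn(response: dict) -> str:
    """Return the ARN of the first launched task, or raise a descriptive error."""
    tasks = response.get("tasks") or []
    if not tasks:
        # Indexing tasks[0] here would raise a bare IndexError, so report
        # why nothing was launched instead.
        failures = response.get("failures") or []
        reasons = ", ".join(f.get("reason", "unknown") for f in failures) or "no reason given"
        raise RuntimeError(f"ECS did not launch any task: {reasons}")
    return tasks[0]["taskArn"]


# Hypothetical response used only for illustration.
ok = {"tasks": [{"taskArn": "arn:aws:ecs:example"}], "failures": []}
print(first_task_arn(ok))
```

&lt;p&gt;The point of the guard is that the caller gets an error that names the failure reason rather than an opaque index error.&lt;/p&gt;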

&lt;p&gt;&lt;a href="https://github.com/dagster-io/dagster/issues/32936" rel="noopener noreferrer"&gt;Issue #32936&lt;/a&gt;&lt;/p&gt;

&lt;ol start="2"&gt;
&lt;li&gt;Resolving Asset Specs Mapping Dependencies&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The Issue: There was a logic error in &lt;code&gt;AssetsDefinition.map_asset_specs&lt;/code&gt; that caused failures when attempting to add dependencies while input definitions were already set.&lt;/p&gt;

&lt;p&gt;The Fix: I adjusted the core logic to correctly handle the mapping of asset specs even when inputs are pre-configured.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/dagster-io/dagster/issues/32913" rel="noopener noreferrer"&gt;Issue #32913&lt;/a&gt;&lt;/p&gt;

&lt;ol start="3"&gt;
&lt;li&gt;[WIP] Correcting Asset Sensor Event Processing&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The Issue: The &lt;code&gt;asset_sensor&lt;/code&gt; had a critical bug where it would only process the last materialization event if multiple partitions materialized simultaneously. This issue stems from a &lt;code&gt;race condition&lt;/code&gt;, making it notoriously difficult to reproduce and debug in a local environment.&lt;/p&gt;

&lt;p&gt;The Fix: This is still a work in progress. My initial change modifies the sensor logic so that every materialization event is captured and processed, regardless of concurrency. Further progress will require a precise approach and careful testing.&lt;/p&gt;
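
&lt;p&gt;To make the symptom concrete, here is a toy model of the difference between looking only at the newest event and processing every event past a cursor. It does not use Dagster at all: the &lt;code&gt;Event&lt;/code&gt; class and both functions are hypothetical stand-ins, and the real sensor logic is more involved.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # Hypothetical stand-in for a materialization event record.
    event_id: int
    partition: str


def latest_only(events, cursor):
    """Problematic pattern: keep only the newest event past the cursor."""
    new = sorted((e for e in events if e.event_id > cursor), key=lambda e: e.event_id)
    return new[-1:]


def all_since(events, cursor):
    """Safer pattern: return every event past the cursor, oldest first."""
    return sorted((e for e in events if e.event_id > cursor), key=lambda e: e.event_id)


events = [Event(1, "2025-12-01"), Event(2, "2025-12-02"), Event(3, "2025-12-03")]
print([e.partition for e in latest_only(events, 0)])
print([e.partition for e in all_since(events, 0)])
```

&lt;p&gt;With three partitions materialized between two sensor ticks, the first function reports one partition and silently drops the other two, while the second reports all three.&lt;/p&gt;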

&lt;p&gt;&lt;a href="https://github.com/dagster-io/dagster/issues/32853" rel="noopener noreferrer"&gt;Issue #32853&lt;/a&gt;&lt;/p&gt;

&lt;ol start="4"&gt;
&lt;li&gt;[WIP] Implementing Merge Support for Polars &amp;amp; Delta Lake&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The Use Case: Currently, the &lt;code&gt;dagster-deltalake&lt;/code&gt; I/O manager allows writing data, but it lacks out-of-the-box support for the merge operation when using Polars.&lt;/p&gt;

&lt;p&gt;The Implementation: I am working on updating &lt;code&gt;dagster_deltalake/handler.py&lt;/code&gt; to support merge mode. The logic checks whether the write mode is set to merge. If so, instead of calling the standard &lt;code&gt;write_deltalake()&lt;/code&gt; function, it creates a &lt;code&gt;DeltaTable&lt;/code&gt; object and executes the merge operation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/dagster-io/dagster/issues/32644?reload=1" rel="noopener noreferrer"&gt;Issue #32644&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The CI&lt;/strong&gt;&lt;br&gt;
While fixing the code was satisfying, getting the Pull Requests (PRs) merged has been a different story. I am currently stuck in a loop regarding &lt;code&gt;CI&lt;/code&gt; tests.&lt;/p&gt;

&lt;p&gt;The Situation:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;I run the unit tests in my local environment, and everything passes perfectly.&lt;/li&gt;
&lt;li&gt;I push the code to GitHub, and the &lt;code&gt;CI&lt;/code&gt; pipeline fails.&lt;/li&gt;
&lt;li&gt;Because of this, I can't get a proper code review from the maintainers.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;It is frustrating because I cannot reproduce the errors locally. It could be an environment configuration mismatch, a &lt;code&gt;linting&lt;/code&gt; rule that is enforced only in &lt;code&gt;CI&lt;/code&gt;, or a hidden dependency issue.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Next Steps&lt;/strong&gt;&lt;br&gt;
I plan to reach out to the &lt;code&gt;Dagster&lt;/code&gt; team and the community for guidance. I need to understand how their &lt;code&gt;CI&lt;/code&gt; environment differs from a standard local setup so I can replicate the failure and fix it. Sometimes, reading thousands of lines of code and fixing errors is easier than testing.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>opensource</category>
      <category>testing</category>
    </item>
    <item>
      <title>Deepening My Roots in the Data Ecosystem - Choosing Depth Over Breadth</title>
      <dc:creator>Steven Hur</dc:creator>
      <pubDate>Thu, 04 Dec 2025 23:17:53 +0000</pubDate>
      <link>https://forem.com/jongwan93/deepening-my-roots-in-the-data-ecosystem-choosing-depth-over-breadth-322</link>
      <guid>https://forem.com/jongwan93/deepening-my-roots-in-the-data-ecosystem-choosing-depth-over-breadth-322</guid>
      <description>&lt;p&gt;&lt;strong&gt;The Original Plan vs. Reality&lt;/strong&gt;&lt;br&gt;
In my previous post, I planned to step into LLM orchestration by contributing to &lt;code&gt;LangChain&lt;/code&gt; or to dive into full-stack development with &lt;code&gt;Django&lt;/code&gt;. However, digging into the &lt;code&gt;LangChain&lt;/code&gt; codebase made me realize a distinct difference in engineering styles.&lt;/p&gt;

&lt;p&gt;The library relies heavily on abstraction layers to wrap various LLMs. While this is architecturally impressive, I found that I didn't get the same satisfaction as I did when working with &lt;code&gt;scikit-learn&lt;/code&gt; or &lt;code&gt;Dagster&lt;/code&gt;. It wasn't just about complexity; it was about the nature of the code. I realized that I prefer the logic of data pipelines and algorithms over the integration-heavy nature of LLM wrappers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rediscovering the Joy of Data Engineering&lt;/strong&gt;&lt;br&gt;
Naturally, I shifted my focus back to &lt;code&gt;Dagster&lt;/code&gt;. Scanning the issue tab, I found myself drawn to problems that dealt with strict data flow and orchestration logic.&lt;/p&gt;

&lt;p&gt;It wasn't just because &lt;code&gt;Dagster&lt;/code&gt; was familiar; it was because the challenges were genuinely more stimulating. For instance, working on a feature that required learning &lt;code&gt;Polars&lt;/code&gt; was exciting, even though it was a completely new library for me. This confirmed my preference:&lt;/p&gt;

&lt;p&gt;"I enjoy the process when working on the concrete logic of data processing rather than the abstraction layers of LLM applications."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choosing Depth Over Breadth&lt;/strong&gt;&lt;br&gt;
I made a strategic decision. Instead of making surface-level contributions in a new repository, I decided to double down on &lt;code&gt;Dagster&lt;/code&gt;. This allowed me to move beyond minor patches and focus on high-impact work.&lt;br&gt;
I focused on:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Resolving Core Issues: Diving deep into the internal logic to fix bugs that were blocking other users.&lt;/li&gt;
&lt;li&gt;Expanding Functionality: Implementing new features that enhance the tool's usability.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Leveraging my previous experience with the codebase allowed me to use the time more efficiently. I could navigate the source code with intuition, enabling me to tackle complex architectural problems that would have been out of my reach just a few months ago.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Finding My Path&lt;/strong&gt;&lt;br&gt;
This journey took an unexpected turn, but it taught me a valuable lesson. Being a skilled developer isn't about following the latest trends; it's about recognizing your strengths and doubling down on them. Instead of spreading myself thin, I chose to deepen my expertise in the data ecosystem.&lt;/p&gt;

</description>
      <category>career</category>
      <category>dataengineering</category>
      <category>devjournal</category>
    </item>
    <item>
      <title>Stepping Out of the Comfort Zone - Plan for the Final Stretch</title>
      <dc:creator>Steven Hur</dc:creator>
      <pubDate>Fri, 28 Nov 2025 22:10:21 +0000</pubDate>
      <link>https://forem.com/jongwan93/stepping-out-of-the-comfort-zone-plan-for-the-final-stretch-3h18</link>
      <guid>https://forem.com/jongwan93/stepping-out-of-the-comfort-zone-plan-for-the-final-stretch-3h18</guid>
      <description>&lt;p&gt;&lt;strong&gt;The Journey So Far&lt;/strong&gt;&lt;br&gt;
Over the past few months, my journey through open-source development has been a deep dive into the &lt;code&gt;Python&lt;/code&gt; data ecosystem. In previous releases (0.1 through 0.3), I focused heavily on data engineering and machine learning libraries. I had the opportunity to contribute to &lt;code&gt;Dagster&lt;/code&gt;, &lt;code&gt;scikit-learn&lt;/code&gt;, and &lt;code&gt;NumPy&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;These experiences were invaluable. I learned how to navigate complex C-extensions in &lt;code&gt;NumPy&lt;/code&gt;, understood the orchestration logic in &lt;code&gt;Dagster&lt;/code&gt;, and worked through the strict code standards of &lt;code&gt;scikit-learn&lt;/code&gt;. However, I felt it was time to step outside the box once more and push myself into a new world.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bridging Data and Application&lt;/strong&gt;&lt;br&gt;
One of my main goals is to propose or contribute a new feature.&lt;/p&gt;

&lt;p&gt;Before jumping into anything, I asked myself: Where do I want to be as a developer?&lt;/p&gt;

&lt;p&gt;I have some background in data processing, but I want to strengthen my skills in building the applications that use this data. I want to bridge the gap between "backend logic" and "user-facing functionality." Therefore, for this final step, I plan to move toward LLM (Large Language Model) orchestration or the web framework domain.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Target Project: &lt;code&gt;LangChain&lt;/code&gt;, &lt;code&gt;Django&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;
After researching potential projects, I have found two interesting open source projects, &lt;code&gt;LangChain&lt;/code&gt; and &lt;code&gt;Django&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Why &lt;code&gt;LangChain&lt;/code&gt;? With the explosion of Generative AI, &lt;code&gt;LangChain&lt;/code&gt; has become the framework for building LLM applications. Since I have already contributed to &lt;code&gt;scikit-learn&lt;/code&gt; and understand the fundamentals of &lt;code&gt;ML&lt;/code&gt; pipelines, moving into LLM orchestration feels like the natural next step. It allows me to apply my &lt;code&gt;Python&lt;/code&gt; skills to a high-impact technology.&lt;/p&gt;

&lt;p&gt;Why &lt;code&gt;Django&lt;/code&gt;? &lt;code&gt;Django&lt;/code&gt; is one of the most robust web frameworks in existence. While my previous contributions were in data libraries, I want to explore the world of &lt;code&gt;Full Stack&lt;/code&gt; development. Contributing to &lt;code&gt;Django&lt;/code&gt; will give me a chance to tackle different kinds of challenges, such as ORM optimization and security, which are crucial for my career growth.&lt;/p&gt;

&lt;p&gt;Moving from scientific libraries like &lt;code&gt;NumPy&lt;/code&gt; to application frameworks like &lt;code&gt;LangChain&lt;/code&gt; and &lt;code&gt;Django&lt;/code&gt; is a shift in mindset. It’s a move from optimizing calculation to architecting functionality. It makes me nervous, but that’s exactly why I need to do it.&lt;/p&gt;

&lt;p&gt;I am giving my final push to close out my 3 years of study. Stay tuned for my progress update next week.&lt;/p&gt;

</description>
      <category>python</category>
      <category>devjournal</category>
      <category>datascience</category>
      <category>opensource</category>
    </item>
    <item>
      <title>My First Python Package Release on PyPI: repo-code-packager</title>
      <dc:creator>Steven Hur</dc:creator>
      <pubDate>Fri, 21 Nov 2025 22:49:23 +0000</pubDate>
      <link>https://forem.com/jongwan93/my-first-python-package-release-on-pypi-repo-code-packager-9c0</link>
      <guid>https://forem.com/jongwan93/my-first-python-package-release-on-pypi-repo-code-packager-9c0</guid>
      <description>&lt;p&gt;For OSD600 Lab 9, I took on the challenge of releasing my open-source project to the world. My goal was to take my code and package it so that anywhere I can install it with a single command.&lt;/p&gt;

&lt;p&gt;I chose to package my Python project, &lt;code&gt;repo-code-packager&lt;/code&gt;, and publish it to &lt;code&gt;PyPI&lt;/code&gt; (Python Package Index).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Tools&lt;/strong&gt;&lt;br&gt;
Since I am working within the &lt;code&gt;Python&lt;/code&gt; ecosystem, I used the standard industry tools for packaging:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PyPI: The official third-party software repository for Python.&lt;/li&gt;
&lt;li&gt;build: A standard tool to create distribution packages.&lt;/li&gt;
&lt;li&gt;twine: A utility for publishing Python packages to &lt;code&gt;PyPI&lt;/code&gt; securely.&lt;/li&gt;
&lt;li&gt;pyproject.toml: The modern configuration file for defining package metadata and build system requirements.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The Process&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Preparing the Package&lt;br&gt;
The first step was organizing my project structure and creating the &lt;code&gt;pyproject.toml&lt;/code&gt; file. This file is the heart of the package, containing the name, version, author info, and dependencies. I had to ensure my source code was properly structured in a &lt;code&gt;src&lt;/code&gt; directory with &lt;code&gt;__init__.py&lt;/code&gt; files to make it importable.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Building and Tagging&lt;br&gt;
I used the &lt;code&gt;python -m build&lt;/code&gt; command to generate the distribution artifacts. Before releasing, I practiced using Git Tags, marking my repository with &lt;code&gt;v0.9.0&lt;/code&gt; to simulate a pre-release state. Once I was ready for the official launch, I bumped the version to &lt;code&gt;v1.0.0&lt;/code&gt; and pushed the tags to &lt;code&gt;GitHub&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Publishing to PyPI&lt;br&gt;
Uploading was surprisingly straightforward using &lt;code&gt;twine&lt;/code&gt;. I generated an API Token from &lt;code&gt;PyPI&lt;/code&gt; for security and used it to authenticate during the upload process.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python -m twine upload dist/*
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Seeing my package live on &lt;code&gt;PyPI&lt;/code&gt; for the first time was an exciting moment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Unexpected Challenges&lt;/strong&gt;&lt;br&gt;
However, the road to a stable release wasn't smooth. I learned that publishing is easy, but publishing correctly is hard.&lt;/p&gt;

&lt;p&gt;The Case Sensitivity Trap&lt;br&gt;
My biggest problem came from directory naming. My source folder was named &lt;code&gt;Repo_Code_Packager&lt;/code&gt;, but I used &lt;code&gt;repo_code_packager&lt;/code&gt; as the project name. When I released the package, I realized that users had to import it exactly as the folder was named:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# correct
from Repo_Code_Packager.content_packager import ContentPackager

# wrong
from repo_code_packager.content_packager import ContentPackager
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This taught me the importance of adhering to naming conventions before starting a project. For this release, I updated the documentation to clearly instruct users to use the capitalized import.&lt;/p&gt;
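
&lt;p&gt;The behaviour is easy to reproduce in isolation. The sketch below builds a throwaway package in a temporary directory and tries both spellings. The package name &lt;code&gt;Demo_Packager&lt;/code&gt; is a hypothetical stand-in chosen so the example cannot clash with an installed package; it is not the real project layout.&lt;/p&gt;

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package whose folder name uses mixed case.
# "Demo_Packager" is a made-up name for illustration only.
root = Path(tempfile.mkdtemp())
pkg = root / "Demo_Packager"
pkg.mkdir()
(pkg / "__init__.py").write_text("NAME = 'Demo_Packager'\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# The import name has to match the folder name exactly.
module = importlib.import_module("Demo_Packager")
print(module.NAME)

# The lowercase spelling is treated as a different module name.
try:
    importlib.import_module("demo_packager")
    lowercase_found = True
except ModuleNotFoundError:
    lowercase_found = False
print(lowercase_found)
```

&lt;p&gt;Because &lt;code&gt;ModuleNotFoundError&lt;/code&gt; is a subclass of &lt;code&gt;ImportError&lt;/code&gt;, this is the same failure a user sees when typing the lowercase name.&lt;/p&gt;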

&lt;p&gt;Missing Dependencies and Class Structures&lt;br&gt;
In my initial &lt;code&gt;v1.0.0&lt;/code&gt; release, I missed declaring &lt;code&gt;Pygments&lt;/code&gt; as a dependency in &lt;code&gt;pyproject.toml&lt;/code&gt;. This meant users could install my package, but it crashed immediately when they ran it. I also realized my code was a collection of loose functions, which was hard for users to integrate. I quickly refactored the code into a proper &lt;code&gt;ContentPackager&lt;/code&gt; class, added the missing dependency, and released patches &lt;code&gt;v1.0.1&lt;/code&gt; and &lt;code&gt;v1.0.2&lt;/code&gt; to fix these issues.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User Testing&lt;/strong&gt;&lt;br&gt;
To verify my release, I asked my cousin, who is also a software developer, to test the package. This session was incredibly valuable.&lt;/p&gt;

&lt;p&gt;I provided him with the &lt;code&gt;PyPI&lt;/code&gt; link and my &lt;code&gt;README.md&lt;/code&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Install: He successfully installed it using &lt;code&gt;pip install repo-code-packager&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The Confusion: He instinctively tried to import it using lowercase and hit an &lt;code&gt;ImportError&lt;/code&gt;. I had to point out that the directory name required uppercase.&lt;/li&gt;
&lt;li&gt;The Fix: Seeing him struggle with the &lt;code&gt;import&lt;/code&gt; confirmed that I needed to update my documentation immediately to highlight the case-sensitive import statement.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It was a great reminder that documentation is just as important as the code itself.&lt;/p&gt;

&lt;p&gt;This lab taught me that software release is an iterative process. It's rare to get &lt;code&gt;v1.0.0&lt;/code&gt; perfect on the first try, and that's okay. Tools like twine and semantic versioning allow us to fix and improve our packages continuously.&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>python</category>
      <category>tutorial</category>
      <category>beginners</category>
    </item>
    <item>
      <title>How I Fixed a Confusing Bug in NumPy</title>
      <dc:creator>Steven Hur</dc:creator>
      <pubDate>Fri, 21 Nov 2025 18:09:15 +0000</pubDate>
      <link>https://forem.com/jongwan93/how-i-fixed-a-confusing-bug-in-numpy-1gkj</link>
      <guid>https://forem.com/jongwan93/how-i-fixed-a-confusing-bug-in-numpy-1gkj</guid>
      <description>&lt;p&gt;Contributing to a massive open-source project like NumPy can feel intimidating. You imagine complex &lt;code&gt;C&lt;/code&gt; code, advanced math, and scary build processes. But sometimes, a bug is just a simple logic error hiding in plain sight.&lt;/p&gt;

&lt;p&gt;I just submitted a Pull Request to &lt;code&gt;NumPy&lt;/code&gt; to fix a bug that was causing misleading error messages in &lt;code&gt;numpy.convolve&lt;/code&gt;. Here’s the story of the bug, the fix, and how I verified it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"Wait, What?"&lt;/strong&gt;&lt;br&gt;
Imagine you are using &lt;code&gt;numpy.convolve&lt;/code&gt;. You accidentally pass an empty array as your first argument, but your second argument is perfectly fine.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import numpy as np

a = np.array([])      # Empty!
v = np.array([1, 2])  # Not empty!

np.convolve(a, v)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You would expect an error saying &lt;code&gt;a cannot be empty&lt;/code&gt;, right? Instead, NumPy screams at you:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ValueError: v cannot be empty
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wait... what? I know &lt;code&gt;v&lt;/code&gt; isn't empty. I just double-checked it! This is the kind of error message that sends developers down a rabbit hole for an hour, debugging the wrong variable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Keep Calm, just Find the Bug&lt;/strong&gt;&lt;br&gt;
I searched through the NumPy source code in &lt;code&gt;numpy/_core/numeric.py&lt;/code&gt; to see what was happening under the hood. The logic looked something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# The original buggy logic
def convolve(a, v, mode='full'):
    # ...

    if (len(v) &amp;gt; len(a)):
        a, v = v, a  # &amp;lt;--- The SWAP happens here!

    # Validation
    if len(a) == 0:
        raise ValueError('a cannot be empty')
    if len(v) == 0:
        raise ValueError('v cannot be empty') # &amp;lt;--- The error triggers here
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Do you see the problem?&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The function sees that &lt;code&gt;v&lt;/code&gt; is longer than &lt;code&gt;a&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;It decides to swap them for performance reasons.&lt;/li&gt;
&lt;li&gt;Now, internally, variable &lt;code&gt;v&lt;/code&gt; holds the empty &lt;code&gt;array&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The check &lt;code&gt;if len(v) == 0&lt;/code&gt; triggers, raising &lt;code&gt;ValueError: v cannot be empty&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The function was swapping the contents of the variables, but the error message was hardcoded to the variable name. It was basically gaslighting the user.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Check First, Optimize Later&lt;/strong&gt;&lt;br&gt;
The fix was simple. We just needed to ensure the input validation happens before any internal swapping takes place.&lt;/p&gt;

&lt;p&gt;I changed the order of operations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# The fixed logic
def convolve(a, v, mode='full'):
    # ...

    # 1. Check for empty inputs FIRST
    if len(a) == 0:
        raise ValueError('a cannot be empty')
    if len(v) == 0:
        raise ValueError('v cannot be empty')

    # 2. THEN perform the optimization swap
    if (len(v) &amp;gt; len(a)):
        a, v = v, a
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, if &lt;code&gt;a&lt;/code&gt; is empty, it gets caught immediately, and the user gets the correct error message, &lt;code&gt;a cannot be empty&lt;/code&gt;.&lt;/p&gt;
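
&lt;p&gt;The same ordering can be tried without NumPy installed. The function below is a pure-Python sketch of a "full" convolution that validates before swapping. The name &lt;code&gt;convolve_full&lt;/code&gt; is hypothetical and the loop is only an illustration, not NumPy's implementation.&lt;/p&gt;

```python
def convolve_full(a, v):
    """Pure-Python 'full' convolution that validates inputs before swapping."""
    # 1. Validate while the names still refer to the caller's arguments.
    if len(a) == 0:
        raise ValueError("a cannot be empty")
    if len(v) == 0:
        raise ValueError("v cannot be empty")

    # 2. Only then swap, so the shorter sequence drives the inner loop.
    if len(v) > len(a):
        a, v = v, a

    out = [0] * (len(a) + len(v) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(v):
            out[i + j] += x * y
    return out


print(convolve_full([1, 2, 3], [0, 1]))
try:
    convolve_full([], [1, 2])
except ValueError as exc:
    print(exc)
```

&lt;p&gt;Since convolution is commutative, the swap never changes the result. It only changes which name an error message would blame, which is why the checks have to come first.&lt;/p&gt;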

&lt;p&gt;This was a small change, just moving a few lines of code, but it significantly improves the developer experience. No one likes misleading error messages.&lt;/p&gt;

&lt;p&gt;It was a great reminder that you don't need to be a math genius to contribute to libraries like &lt;code&gt;NumPy&lt;/code&gt;. Sometimes, you just need to spot a logic bug and move some if statements around.&lt;/p&gt;

&lt;p&gt;My PR is up! Fingers crossed for the merge.&lt;/p&gt;

</description>
      <category>python</category>
      <category>opensource</category>
      <category>devjournal</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Debugging Windows Race Conditions in Dagster</title>
      <dc:creator>Steven Hur</dc:creator>
      <pubDate>Tue, 18 Nov 2025 21:13:07 +0000</pubDate>
      <link>https://forem.com/jongwan93/debugging-windows-race-conditions-in-dagster-278b</link>
      <guid>https://forem.com/jongwan93/debugging-windows-race-conditions-in-dagster-278b</guid>
      <description>&lt;p&gt;Okay, another PR on &lt;code&gt;Dagster&lt;/code&gt;. I tackled a deceptively complex issue. Specifically, I focused on the &lt;code&gt;dagster-dbt&lt;/code&gt; integration.&lt;/p&gt;

&lt;p&gt;At first glance, the Pull Request might look small. However, getting to that one line required diving deep into &lt;code&gt;Windows&lt;/code&gt; filesystem internals and race conditions.&lt;/p&gt;

&lt;p&gt;Here is the story of how I diagnosed and fixed a nondeterministic crash that was haunting &lt;code&gt;Windows&lt;/code&gt; users.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"It works, until it doesn't"&lt;/strong&gt;&lt;br&gt;
The issue appeared simple. When a user reloads their &lt;code&gt;dbt&lt;/code&gt; project definitions in &lt;code&gt;Dagster&lt;/code&gt; on &lt;code&gt;Windows&lt;/code&gt;, the process crashes with a &lt;code&gt;FileExistsError: [WinError 183]&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The traceback pointed to this logic in &lt;code&gt;dbt_project_manager.py&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# The original code
shutil.rmtree(local_dir, ignore_errors=True)
local_dir.mkdir()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On paper, this logic seems flawless: it deletes the directory and then creates it. So why was &lt;code&gt;Python&lt;/code&gt; complaining that the file already exists right after deleting it?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Root Cause - The Windows Race Condition&lt;/strong&gt;&lt;br&gt;
This is where the complexity lies. Unlike &lt;code&gt;Linux&lt;/code&gt; or &lt;code&gt;macOS&lt;/code&gt;, the &lt;code&gt;Windows&lt;/code&gt; filesystem behaves differently with regard to file locking and deletion latency.&lt;/p&gt;

&lt;p&gt;When &lt;code&gt;shutil.rmtree()&lt;/code&gt; is called, it requests the OS to delete the directory. However, on &lt;code&gt;Windows&lt;/code&gt;, if a file inside that directory is briefly locked, the deletion doesn't happen instantaneously.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;Python&lt;/code&gt; executes &lt;code&gt;rmtree&lt;/code&gt;. Often &lt;code&gt;Windows&lt;/code&gt; starts deleting but lags slightly due to a lock.&lt;/li&gt;
&lt;li&gt;Because &lt;code&gt;ignore_errors=True&lt;/code&gt; was set, &lt;code&gt;rmtree&lt;/code&gt; returns silently without finishing the job.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Python&lt;/code&gt; immediately executes &lt;code&gt;mkdir()&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CRASH!&lt;/strong&gt;: The directory is essentially a "zombie" - it’s flagged for deletion but still technically exists.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is a classic race condition. The code assumed instant deletion, but &lt;code&gt;Windows&lt;/code&gt; proved otherwise.&lt;/p&gt;
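&lt;p&gt;The four steps above can be imitated on any OS with a small simulation. This is only a sketch: &lt;code&gt;lagging_delete&lt;/code&gt; and &lt;code&gt;recreate_strict&lt;/code&gt; are hypothetical names, and the no-op delete merely stands in for a removal that returns before the directory is actually gone.&lt;/p&gt;

```python
import tempfile
from pathlib import Path


def lagging_delete(path):
    """Hypothetical stand-in for a delete that returns while the directory still exists."""
    return None


def recreate_strict(local_dir, delete):
    """Mirror the original two-step logic: delete, then a strict mkdir()."""
    delete(local_dir)
    local_dir.mkdir()  # raises FileExistsError if the directory is still there


with tempfile.TemporaryDirectory() as tmp:
    local_dir = Path(tmp) / "local"
    local_dir.mkdir()
    try:
        recreate_strict(local_dir, lagging_delete)
        outcome = "recreated"
    except FileExistsError:
        outcome = "FileExistsError"
```

&lt;p&gt;Because the simulated delete leaves the directory in place, the strict &lt;code&gt;mkdir()&lt;/code&gt; fails in the same way the traceback described.&lt;/p&gt;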

&lt;p&gt;&lt;strong&gt;Aligning Intent with Reality&lt;/strong&gt;&lt;br&gt;
I didn't want to simply suppress the error. I needed to make the creation step robust.&lt;/p&gt;

&lt;p&gt;The original author used &lt;code&gt;ignore_errors=True&lt;/code&gt; for deletion, implying a design philosophy of "Availability over Atomicity": if cleanup fails, the program should try to continue rather than crash. However, the &lt;code&gt;mkdir()&lt;/code&gt; step was strict, breaking this philosophy.&lt;/p&gt;

&lt;p&gt;This is my suggestion:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;local_dir.mkdir(parents=True, exist_ok=True)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By adding &lt;code&gt;exist_ok=True&lt;/code&gt;, I ensured that even if the "zombie" directory lingers due to &lt;code&gt;OS&lt;/code&gt; latency, the program proceeds gracefully. The subsequent &lt;code&gt;sync()&lt;/code&gt; operation then handles the data consistency by overwriting files, ensuring that no stale data causes issues.&lt;/p&gt;
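&lt;p&gt;A minimal sketch of that tolerance, using a temporary directory and a hypothetical &lt;code&gt;project/local&lt;/code&gt; path rather than anything from the Dagster codebase:&lt;/p&gt;

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    local_dir = Path(tmp) / "project" / "local"  # hypothetical path for illustration

    # First call creates the missing parent and the directory itself
    local_dir.mkdir(parents=True, exist_ok=True)

    # Second call finds the directory already present and does not raise
    local_dir.mkdir(parents=True, exist_ok=True)

    still_there = local_dir.is_dir()
```

&lt;p&gt;Note that &lt;code&gt;exist_ok=True&lt;/code&gt; only tolerates an existing directory. If the path exists as a regular file, &lt;code&gt;mkdir()&lt;/code&gt; still raises &lt;code&gt;FileExistsError&lt;/code&gt;.&lt;/p&gt;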

&lt;p&gt;&lt;strong&gt;The Dilemma - Logic over Observation&lt;/strong&gt;&lt;br&gt;
This presented a significant engineering dilemma for me. As a developer, I crave the validation of seeing a test fail before I fix it. I wanted to witness the crash with my own eyes to confirm the bug. &lt;/p&gt;

&lt;p&gt;However, my local test environment worked too well. Because I was testing with a relatively small dataset, the &lt;code&gt;rmtree&lt;/code&gt; operation on my &lt;code&gt;Windows&lt;/code&gt; machine finished instantaneously, beating the race condition every time. No matter how many times I reloaded, the crash wouldn't trigger.&lt;/p&gt;

&lt;p&gt;I decided to trust my static analysis of the code over my local observation. I looked closely at the existing code's intention:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;shutil.rmtree(local_dir, ignore_errors=True)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The original code explicitly ignores errors during deletion. This suggests that the original authors anticipated that deletion might fail or be incomplete. They designed the system to tolerate a messy cleanup.&lt;/p&gt;

&lt;p&gt;However, the very next line, &lt;code&gt;mkdir()&lt;/code&gt;, was strict and intolerant of pre-existing directories. This was a logical contradiction in the code's design philosophy.&lt;/p&gt;

&lt;p&gt;I concluded that my fix, adding &lt;code&gt;exist_ok=True&lt;/code&gt;, was necessary to align the creation logic with the deletion logic. Trusting this architectural logic, I submitted the Pull Request.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/dagster-io/dagster/pull/32841" rel="noopener noreferrer"&gt;PR, issue-32841&lt;/a&gt;&lt;/p&gt;

</description>
      <category>devjournal</category>
      <category>dataengineering</category>
      <category>python</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Setting up CI/CD with GitHub Actions</title>
      <dc:creator>Steven Hur</dc:creator>
      <pubDate>Thu, 13 Nov 2025 21:09:56 +0000</pubDate>
      <link>https://forem.com/jongwan93/setting-up-cicd-with-github-actions-1ap9</link>
      <guid>https://forem.com/jongwan93/setting-up-cicd-with-github-actions-1ap9</guid>
      <description>&lt;p&gt;Welcome to my reflection on my CI/CD experience, where the core objective was to move beyond local testing and integrate a CI pipeline into my project using GitHub Actions. This lab was an interesting introduction to managing project complexity and to a fundamental practice in collaborative software development.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. The &lt;code&gt;ci.yml&lt;/code&gt; Blueprint&lt;/strong&gt;&lt;br&gt;
The first step was setting up the automation pipeline. Since my project is based on &lt;code&gt;Python&lt;/code&gt; and uses &lt;code&gt;Pytest&lt;/code&gt; for testing, I configured a workflow to automatically run tests whenever code was pushed or a &lt;code&gt;Pull Request (PR)&lt;/code&gt; was opened.&lt;/p&gt;

&lt;p&gt;-The GitHub Actions Workflow-&lt;br&gt;
The configuration was defined in the &lt;code&gt;.github/workflows/ci.yml&lt;/code&gt; file, ensuring a consistent testing environment.&lt;/p&gt;

&lt;p&gt;What does the YAML file do?&lt;br&gt;
This workflow automates four key steps:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. Checkout the code
2. Set up the required Python environment
3. Install project dependencies
4. Execute all unit tests using pytest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This process helps ensure that new changes do not break existing, tested functionality before they are merged.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Mastering the CI Cycle: Pass, Fail, Pass&lt;/strong&gt;&lt;br&gt;
The most important part of setting up CI was running the full test cycle. This proved that the CI system itself works as expected.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Initial Pass (Success): I created a PR, and the CI successfully ran all existing tests, resulting in a green checkmark.&lt;/li&gt;
&lt;li&gt;Intentional Fail (Failure): I then committed a change that caused a test to fail. The CI automatically re-ran and immediately reported a red X.&lt;/li&gt;
&lt;li&gt;Final Pass (Recovery): After reverting the breaking change, the CI successfully ran again, showing a green checkmark.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This cycle confirmed that the CI is functioning properly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. The Cross-Project Collaboration Challenge&lt;/strong&gt;&lt;br&gt;
Afterwards, I had to find a partner and contribute a new test case to their repository. My partner's project was also Python-based, utilizing &lt;code&gt;Pytest&lt;/code&gt; and a similar &lt;code&gt;src/tests&lt;/code&gt; directory structure.&lt;/p&gt;

&lt;p&gt;The experience of writing tests for external code highlighted the need for strong documentation and well-encapsulated functions. I chose to write tests for the &lt;code&gt;ArgParser&lt;/code&gt; class, focusing on its custom logic for loading &lt;code&gt;TOML&lt;/code&gt; configuration files.&lt;/p&gt;

&lt;p&gt;The main challenge I faced was an unexpected &lt;code&gt;ModuleNotFoundError&lt;/code&gt; when trying to run tests locally, even though the file structure was clear. This was because &lt;code&gt;Python&lt;/code&gt;'s default import path does not automatically include the sibling &lt;code&gt;src/&lt;/code&gt; directory.&lt;/p&gt;

&lt;p&gt;The solution required manually inserting the &lt;code&gt;src&lt;/code&gt; folder's absolute path into the system's path configuration within the test file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# The Fix for ModuleNotFoundError:
import sys
import os
src_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), '..', 'src'))
sys.path.insert(0, src_dir)
from arg_parser import ArgParser
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
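&lt;p&gt;To see why the path insertion matters, here is a self-contained sketch that builds a throwaway &lt;code&gt;src/&lt;/code&gt; folder and imports from it. The module name &lt;code&gt;demo_settings&lt;/code&gt; and its contents are made up for illustration and are not part of my partner's project.&lt;/p&gt;

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as project_root:
    # Build a tiny src/ folder containing one hypothetical module
    src_dir = Path(project_root) / "src"
    src_dir.mkdir()
    (src_dir / "demo_settings.py").write_text("DEFAULT_NAME = 'demo'\n")

    # Without this line, the import below raises ModuleNotFoundError
    sys.path.insert(0, str(src_dir))
    try:
        demo_settings = importlib.import_module("demo_settings")
        loaded_name = demo_settings.DEFAULT_NAME
    finally:
        # Undo the global changes so other code is not affected
        sys.path.remove(str(src_dir))
        sys.modules.pop("demo_settings", None)
```

&lt;p&gt;The cleanup in &lt;code&gt;finally&lt;/code&gt; is a choice for this demo only. In a test file the inserted path usually stays in place for the whole run.&lt;/p&gt;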



&lt;p&gt;This experience was a reminder that while &lt;code&gt;CI&lt;/code&gt; automates testing, basic project setup is essential for any developer joining the project.&lt;/p&gt;

&lt;p&gt;Having successfully set up CI and used it in a real-world scenario, I now strongly believe that CI is indispensable.&lt;/p&gt;

&lt;p&gt;Before this lab, running tests felt optional. However, automated testing reduces human error and ensures that every code change, no matter how small, is immediately validated against the entire suite of existing tests.&lt;/p&gt;

&lt;p&gt;CI is not just about testing. It's about reducing integration friction and providing immediate feedback, which is key to maintaining a stable and scalable codebase.&lt;/p&gt;

</description>
      <category>cicd</category>
      <category>testing</category>
      <category>python</category>
      <category>github</category>
    </item>
    <item>
      <title>Adding Automated Testing to My Project</title>
      <dc:creator>Steven Hur</dc:creator>
      <pubDate>Thu, 06 Nov 2025 23:23:03 +0000</pubDate>
      <link>https://forem.com/jongwan93/adding-automated-testing-to-my-project-1mkj</link>
      <guid>https://forem.com/jongwan93/adding-automated-testing-to-my-project-1mkj</guid>
      <description>&lt;p&gt;For Lab 7, I added automated testing to my &lt;code&gt;Repo Code Packager&lt;/code&gt; project. This tool analyzes &lt;code&gt;Git&lt;/code&gt; repositories and generates formatted output for sharing with &lt;code&gt;LLMs&lt;/code&gt;. Before this lab, I had no automated tests, which made it risky to add new features or refactor code. This lab taught me how to set up a testing framework and write test cases.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Choosing a Testing Framework&lt;/strong&gt;&lt;br&gt;
Since my project is written in Python, I researched several testing &lt;code&gt;frameworks&lt;/code&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;unittest: Python's built-in testing framework&lt;/li&gt;
&lt;li&gt;nose: An extension of unittest (but less actively maintained)&lt;/li&gt;
&lt;li&gt;pytest: The most popular modern Python testing framework&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I chose pytest for several reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Simple syntax: Uses plain assert statements instead of &lt;code&gt;self.assertEqual()&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Great documentation: Easy to find examples and tutorials&lt;/li&gt;
&lt;li&gt;Powerful features: parametrization and well-written error messages&lt;/li&gt;
&lt;/ul&gt;
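&lt;p&gt;To make the syntax difference concrete, here is the same check written in both styles. This is a sketch with a made-up &lt;code&gt;add&lt;/code&gt; function. It runs without the pytest runner, so it only compares how the tests are written, not the runners themselves.&lt;/p&gt;

```python
import unittest


def add(a, b):
    """Hypothetical function under test."""
    return a + b


class AddTestCase(unittest.TestCase):
    # unittest style: a TestCase class plus an assertion method
    def test_add(self):
        self.assertEqual(add(2, 3), 5)


# pytest style: a plain function with a bare assert
def test_add_plain():
    assert add(2, 3) == 5


# Run the unittest version programmatically and call the plain function directly
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddTestCase)
unittest_result = unittest.TestResult()
suite.run(unittest_result)
test_add_plain()
```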

&lt;p&gt;&lt;strong&gt;Setting Up the Testing Environment&lt;/strong&gt;&lt;br&gt;
Following the lab instructions, I created a testing branch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git checkout -b testing
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Then I added a &lt;code&gt;tests/&lt;/code&gt; directory with the following layout:&lt;/p&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;├── tests/
│   ├── __init__.py
│   ├── test_file_utils.py
│   ├── test_content_packager.py
│   └── test_git_utils.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;tests/__init__.py&lt;/code&gt; file is empty; its presence marks the directory as a regular &lt;code&gt;Python&lt;/code&gt; package.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Writing My First Tests&lt;/strong&gt;&lt;br&gt;
I started with &lt;code&gt;file_utils.py&lt;/code&gt; because it contains pure functions with no external dependencies, which makes it perfect for learning unit testing.&lt;br&gt;
The &lt;code&gt;is_recently_modified()&lt;/code&gt; function checks if a file was modified within a specified time window. Here's what I tested:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def test_nonexistent_file_returns_false(self):
    """Non-existent file should return False"""
    result = is_recently_modified("nonexistent_file.txt")
    assert result == False

def test_recently_created_file_returns_true(self, tmp_path):
    """Recently created file should return True"""
    test_file = tmp_path / "recent.txt"
    test_file.write_text("test content")

    result = is_recently_modified(str(test_file), days=7)
    assert result == True
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I discovered &lt;code&gt;pytest&lt;/code&gt;'s &lt;code&gt;tmp_path&lt;/code&gt; fixture, which creates a temporary directory for each test. This was incredibly useful because tests don't leave files on my system and each test is isolated.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Testing Edge Cases&lt;/strong&gt;&lt;br&gt;
The most interesting test was checking old files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def test_file_modified_beyond_time_window(self, tmp_path):
    """File modified beyond the time window should return False"""
    test_file = tmp_path / "old.txt"
    test_file.write_text("old content")

    # Set modification time to 10 days ago
    ten_days_ago = time.time() - (10 * 86400)
    os.utime(str(test_file), (ten_days_ago, ten_days_ago))

    result = is_recently_modified(str(test_file), days=7)
    assert result == False
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I learned about &lt;code&gt;os.utime()&lt;/code&gt;, which allows you to set a file's access and modification timestamps. It is very useful for testing time-based functionality.&lt;/p&gt;
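&lt;p&gt;The back-dating trick also works outside pytest. The sketch below uses &lt;code&gt;tempfile&lt;/code&gt; instead of &lt;code&gt;tmp_path&lt;/code&gt;, and the &lt;code&gt;is_recently_modified()&lt;/code&gt; shown here is a simplified stand-in I wrote for this example, not the actual implementation from my project.&lt;/p&gt;

```python
import os
import tempfile
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def is_recently_modified(path, days=7):
    """Simplified stand-in: True if path exists and changed within the last `days` days."""
    try:
        modified_at = os.path.getmtime(path)
    except OSError:
        return False  # missing or unreadable file
    cutoff = time.time() - days * SECONDS_PER_DAY
    return modified_at >= cutoff


with tempfile.TemporaryDirectory() as tmp:
    test_file = Path(tmp) / "old.txt"
    test_file.write_text("old content")
    fresh_result = is_recently_modified(test_file, days=7)

    # Back-date both access and modification time by 10 days
    ten_days_ago = time.time() - 10 * SECONDS_PER_DAY
    os.utime(test_file, (ten_days_ago, ten_days_ago))
    old_result = is_recently_modified(test_file, days=7)

missing_result = is_recently_modified("nonexistent_file.txt")
```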

&lt;p&gt;&lt;strong&gt;Bugs Discovered Through Testing&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Bug #1&lt;/strong&gt;: Missing Error Handler&lt;br&gt;
When I wrote tests for &lt;code&gt;git_utils.py&lt;/code&gt;, I discovered a bug:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;NotADirectoryError: [WinError 267] The directory name is invalid
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The problem was in my exception handling. The original code only caught &lt;code&gt;subprocess.CalledProcessError&lt;/code&gt; and &lt;code&gt;FileNotFoundError&lt;/code&gt;, but on Windows an invalid path raises &lt;code&gt;NotADirectoryError&lt;/code&gt;.&lt;br&gt;
Fix:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;except (subprocess.CalledProcessError, FileNotFoundError, NotADirectoryError, IndexError, OSError):
    return "Not a git repository"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This was actually my first time discovering a cross-platform issue through a formal testing procedure. When &lt;code&gt;subprocess&lt;/code&gt; tries to run a git command with an invalid path on Windows, it raises &lt;code&gt;NotADirectoryError&lt;/code&gt; instead. This wasn't being caught, causing the test to crash. Different operating systems can raise different exceptions for the same error condition. By adding &lt;code&gt;NotADirectoryError&lt;/code&gt; and the more general &lt;code&gt;OSError&lt;/code&gt;, my code now handles edge cases better. (Both &lt;code&gt;FileNotFoundError&lt;/code&gt; and &lt;code&gt;NotADirectoryError&lt;/code&gt; are subclasses of &lt;code&gt;OSError&lt;/code&gt;, so listing &lt;code&gt;OSError&lt;/code&gt; alone would also cover them.)&lt;/p&gt;
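&lt;p&gt;Here is a small sketch of the idea that catching &lt;code&gt;OSError&lt;/code&gt; covers the platform-specific variants. The helper &lt;code&gt;run_in_directory&lt;/code&gt; is hypothetical and runs a trivial Python command instead of git, so it does not depend on git being installed.&lt;/p&gt;

```python
import subprocess
import sys
import tempfile


def run_in_directory(cwd):
    """Hypothetical helper: run a trivial command in cwd and report what happened."""
    try:
        subprocess.run([sys.executable, "-c", "pass"], cwd=cwd, check=True)
    except (subprocess.CalledProcessError, OSError) as error:
        # OSError is the common base class of the path-related failures
        return f"failed: {type(error).__name__}"
    return "ok"


# Pass a regular file where a directory is expected
with tempfile.NamedTemporaryFile() as not_a_directory:
    outcome = run_in_directory(not_a_directory.name)

# Both specific exceptions inherit from OSError
covered = issubclass(NotADirectoryError, OSError) and issubclass(FileNotFoundError, OSError)
```

&lt;p&gt;The exact exception name in &lt;code&gt;outcome&lt;/code&gt; may vary by operating system, which is exactly why catching the shared base class is the safer choice.&lt;/p&gt;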

</description>
      <category>learning</category>
      <category>python</category>
      <category>testing</category>
    </item>
    <item>
      <title>Open Source Journey</title>
      <dc:creator>Steven Hur</dc:creator>
      <pubDate>Sat, 01 Nov 2025 16:09:32 +0000</pubDate>
      <link>https://forem.com/jongwan93/open-source-journey-29cm</link>
      <guid>https://forem.com/jongwan93/open-source-journey-29cm</guid>
      <description>&lt;p&gt;Looking back at my previous blog posts feels incredibly humbling. I can see how much I've grown through this journey and honestly, it's been one of the most fun experiences I've had in my academic career. Why? Because I got to browse through some absolutely brilliant and amazing projects that are actually used in production.&lt;br&gt;
Let me take you through the key lessons I learned from contributing to four different open source projects.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Communication Over Confidence&lt;/strong&gt;&lt;br&gt;
Project: &lt;a href="https://github.com/StanfordVL/BEHAVIOR-1K" rel="noopener noreferrer"&gt;BEHAVIOR-1K&lt;/a&gt;&lt;br&gt;
My first contribution taught me the most fundamental lesson of open source.&lt;br&gt;
I spent a full three days just setting up the project and understanding the codebase. When I finally identified the issue, I faced a dilemma. There was a line of code that seemed very important, but I had to remove it to fix the issue. The function returned &lt;code&gt;False&lt;/code&gt; if it identified anything other than &lt;code&gt;True&lt;/code&gt; in a list, but there was also an &lt;code&gt;assert all(...)&lt;/code&gt;, &lt;code&gt;child_values has NoneTypes&lt;/code&gt; line checking for &lt;code&gt;NoneType&lt;/code&gt; values.&lt;br&gt;
Should I remove it or keep it?&lt;br&gt;
Instead of making assumptions, I created a Pull Request with a &lt;code&gt;[WIP]&lt;/code&gt; tag to open a conversation with the reviewers. This turned out to be the right call. In open source, especially as a newcomer, communication is the golden key. Nobody expects you to be perfect, but they do expect you to be thoughtful. Don't be afraid to ask questions. Maintainers would much rather answer your questions than deal with a poor PR.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start Simple, Build Confidence&lt;/strong&gt;&lt;br&gt;
Project: &lt;a href="https://github.com/scikit-learn/scikit-learn" rel="noopener noreferrer"&gt;Scikit-learn&lt;/a&gt;&lt;br&gt;
After the intense first experience with &lt;code&gt;BEHAVIOR-1K&lt;/code&gt;, I needed something more approachable. I went straight to &lt;code&gt;Scikit-learn&lt;/code&gt;'s &lt;code&gt;good first issue&lt;/code&gt; label and found a task that seemed manageable: changing relative imports to absolute imports in &lt;code&gt;Cython&lt;/code&gt; files.&lt;br&gt;
From this&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from ...utils._typedefs cimport float64_t, float32_t
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To this&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from sklearn.utils._typedefs cimport float64_t, float32_t
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Was it a simple task? Yes. But I learned something out of it.&lt;br&gt;
This was my first real encounter with &lt;code&gt;Cython&lt;/code&gt;, and I discovered how Python libraries achieve C-level performance. I learned what &lt;code&gt;cimport&lt;/code&gt; means, why &lt;code&gt;float64_t&lt;/code&gt; exists, and how type definitions help optimize the code. Even a simple task in a well structured project teaches you something new.&lt;br&gt;
Simple contributions are not lesser contributions. They're opportunities to learn the project's architecture and tooling. Furthermore, they build your confidence for tackling harder issues later.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Embrace Challenges&lt;/strong&gt;&lt;br&gt;
Project: &lt;a href="https://github.com/dagster-io/dagster" rel="noopener noreferrer"&gt;Dagster&lt;/a&gt;&lt;br&gt;
After building confidence with &lt;code&gt;Scikit-learn&lt;/code&gt;, I wanted something more challenging. &lt;code&gt;Dagster&lt;/code&gt;, a data orchestration platform used by real companies in production, had an interesting bug.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;callable objects with custom signatures were crashing the type hints resolution system.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The problem was technical:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import inspect

class MyWrapper:
    def __init__(self, fn):
        self.__signature__ = inspect.signature(fn)

    def __call__(self, **kwargs):
        ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This would crash with &lt;code&gt;TypeError: &amp;lt;callable object&amp;gt; is not a module, class, method, or function&lt;/code&gt;.&lt;br&gt;
At first, I thought this was too complex for me, but I kept trying and managed to find the solution. Instead of passing the object to &lt;code&gt;typing.get_type_hints()&lt;/code&gt;, extract the type information directly from the &lt;code&gt;__signature__&lt;/code&gt; object.&lt;br&gt;
I learned a couple of things while working on this issue.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Python's signature protocol and the &lt;code&gt;__signature__&lt;/code&gt; attribute&lt;/li&gt;
&lt;li&gt;The importance of comprehensive testing in production systems&lt;/li&gt;
&lt;li&gt;How to read and understand complex codebases with decorator systems and dependency injection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Don't be afraid to tackle issues that seem slightly beyond your current skill level. The struggle is where the real learning happens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Contributing to Tools You Use&lt;/strong&gt;&lt;br&gt;
Project: &lt;a href="https://github.com/optuna/optuna" rel="noopener noreferrer"&gt;Optuna&lt;/a&gt;&lt;br&gt;
By my fourth contribution, I felt much more comfortable with the open source process. I chose &lt;code&gt;Optuna&lt;/code&gt;, a hyperparameter optimization framework which I've heard about while studying Machine Learning. I found an issue asking to modernize the code by replacing &lt;code&gt;.format()&lt;/code&gt; with &lt;code&gt;f-strings&lt;/code&gt;.&lt;br&gt;
Old way&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"{cls}({kwargs})".format(cls=..., kwargs=...)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;New way&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;f"{cls}({kwargs})"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It was a fairly easy issue, but I wanted to contribute to a tool I actually use and understand. Working on &lt;code&gt;Optuna&lt;/code&gt; felt much more comfortable than my first &lt;code&gt;Scikit-learn&lt;/code&gt; contribution because I had context about what the library does and why it matters.&lt;br&gt;
Contributing to projects you actually use is always a good way to get going. Not only does being part of an amazing project feel good, but it also makes you feel proud that you contributed to a big community.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Joy of Exploration&lt;/strong&gt;&lt;br&gt;
One of the most unexpected pleasures of this journey was simply browsing through brilliant projects. Each repository I explored, whether I contributed to it or not, taught me something about software architecture, testing, or documentation. It's like getting a &lt;code&gt;behind-the-scenes&lt;/code&gt; tour of how professional software is built.&lt;/p&gt;

&lt;p&gt;I'm having a lot of fun doing this, and I hope that comes through in my writing. Open source contribution isn't just about adding lines to your resume. It's about being part of a big community and contributing to tools that developers around the world rely on.&lt;br&gt;
Looking back at these four contributions, I'm proud of what I've accomplished. Looking forward, I'm excited about what comes next. Each contribution has taught me new technologies and new possibilities.&lt;br&gt;
If you're thinking about contributing to open source, just start. Find a project you use, look for a &lt;code&gt;good first issue&lt;/code&gt;, and take that first step. The open source community is very welcoming and, who knows, you might just have fun doing it.&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>learning</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Optuna f-string Refactoring</title>
      <dc:creator>Steven Hur</dc:creator>
      <pubDate>Wed, 29 Oct 2025 05:38:58 +0000</pubDate>
      <link>https://forem.com/jongwan93/optuna-f-string-refactoring-2043</link>
      <guid>https://forem.com/jongwan93/optuna-f-string-refactoring-2043</guid>
      <description>&lt;p&gt;Hello! Just submitted my 4th PR to open source. This time it's &lt;code&gt;Optuna&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is Optuna?&lt;/strong&gt;&lt;br&gt;
Optuna is a hyperparameter optimization framework for machine learning. Basically when you're training ML models, you have tons of parameters to tune - learning rate, batch size, number of layers, etc. Optuna automates this process using smart algorithms instead of random guessing.&lt;br&gt;
What makes it interesting is the &lt;code&gt;define-by-run&lt;/code&gt; API. You can dynamically construct search spaces, which is way more flexible than traditional grid search or random search. It's used by a lot of ML practitioners and has integrations with &lt;code&gt;PyTorch&lt;/code&gt;, &lt;code&gt;TensorFlow&lt;/code&gt;, &lt;code&gt;XGBoost&lt;/code&gt;, and basically every major ML library.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I Did&lt;/strong&gt;&lt;br&gt;
Found this issue asking to replace old &lt;code&gt;.format()&lt;/code&gt; calls with &lt;code&gt;f-strings&lt;/code&gt;. A somewhat simple refactoring.&lt;br&gt;
&lt;a href="https://github.com/optuna/optuna/issues/6305" rel="noopener noreferrer"&gt;issue-6305&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;They wanted this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Old way (ugly)
"{cls}({kwargs})".format(cls=..., kwargs=...)

# New way (clean)
f"{cls}({kwargs})"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The Code Change&lt;/strong&gt;&lt;br&gt;
Changed this older &lt;code&gt;.format()&lt;/code&gt; style:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def __repr__(self) -&amp;gt; str:
    return "{cls}({kwargs})".format(
        cls=self.__class__.__name__,
        kwargs=", ".join(
            "{field}={value}".format(
                field=field if not field.startswith("_") else field[1:],
                value=repr(getattr(self, field)),
            )
            for field in self.__dict__
        )
        + ", value=None",
    )
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Into this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def __repr__(self) -&amp;gt; str:
    kwargs = ", ".join(
        f"{field if not field.startswith('_') else field[1:]}={getattr(self, field)!r}"
        for field in self.__dict__
    ) + ", value=None"
    return f"{self.__class__.__name__}({kwargs})"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key Points&lt;/strong&gt;&lt;br&gt;
The main change was replacing all &lt;code&gt;.format()&lt;/code&gt; calls with &lt;code&gt;f-strings&lt;/code&gt;, which have been the modern Python way since they were introduced in Python 3.6. I also used &lt;code&gt;!r&lt;/code&gt; instead of calling &lt;code&gt;repr()&lt;/code&gt; directly because that's more pythonic in f-strings. The issue specifically asked for one file per PR to make reviews easier, so I only touched this single file.&lt;/p&gt;
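&lt;p&gt;A quick way to convince yourself that the two versions are equivalent is to put both on one small class and compare the strings. The &lt;code&gt;Sample&lt;/code&gt; class and its fields below are made up for this check and are not Optuna code.&lt;/p&gt;

```python
class Sample:
    """Hypothetical class used only to compare the two formatting styles."""

    def __init__(self, number, state):
        self._number = number  # leading underscore is stripped in the output
        self.state = state

    def repr_with_format(self):
        return "{cls}({kwargs})".format(
            cls=self.__class__.__name__,
            kwargs=", ".join(
                "{field}={value}".format(
                    field=field if not field.startswith("_") else field[1:],
                    value=repr(getattr(self, field)),
                )
                for field in self.__dict__
            )
            + ", value=None",
        )

    def repr_with_fstring(self):
        kwargs = ", ".join(
            f"{field if not field.startswith('_') else field[1:]}={getattr(self, field)!r}"
            for field in self.__dict__
        ) + ", value=None"
        return f"{self.__class__.__name__}({kwargs})"


sample = Sample(number=3, state="COMPLETE")
old_style = sample.repr_with_format()
new_style = sample.repr_with_fstring()
```

&lt;p&gt;Both calls produce &lt;code&gt;Sample(number=3, state='COMPLETE', value=None)&lt;/code&gt;, which shows that &lt;code&gt;!r&lt;/code&gt; applies &lt;code&gt;repr()&lt;/code&gt; just like the explicit call did.&lt;/p&gt;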

&lt;p&gt;&lt;strong&gt;Why This is Easy&lt;/strong&gt;&lt;br&gt;
This is just syntax conversion with no logic changes at all. The output stays exactly the same, just written differently. The issue had clear examples showing exactly what they wanted, so there was zero guesswork involved. Best part is tests won't break because the functionality is identical, just cleaner code.&lt;/p&gt;

&lt;p&gt;This issue was somewhat easier than what I've been doing, but I wanted to contribute to this project because &lt;code&gt;Optuna&lt;/code&gt; is a framework that I've been studying recently. It does feel much more comfortable compared to the first contribution that I made to &lt;code&gt;Scikit-learn&lt;/code&gt;. I guess I am improving in some way through this process.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>opensource</category>
      <category>python</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Fixing Type Hints for Callable Objects with Custom Signatures in Dagster</title>
      <dc:creator>Steven Hur</dc:creator>
      <pubDate>Tue, 28 Oct 2025 20:49:06 +0000</pubDate>
      <link>https://forem.com/jongwan93/fixing-type-hints-for-callable-objects-with-custom-signatures-in-dagster-3j73</link>
      <guid>https://forem.com/jongwan93/fixing-type-hints-for-callable-objects-with-custom-signatures-in-dagster-3j73</guid>
      <description>&lt;p&gt;So... it's been an interesting week. After my last contribution to Scikit-learn (which was honestly pretty straightforward), I wanted to find something a bit more challenging. Something that would actually make me think, maybe?&lt;/p&gt;

&lt;p&gt;I've been getting more into Machine Learning (ML) lately, especially pipelines and orchestration stuff. That's when I found &lt;code&gt;Dagster&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is Dagster?&lt;/strong&gt;&lt;br&gt;
If you're not familiar, &lt;code&gt;Dagster&lt;/code&gt; is a data orchestration platform. Think of it like this. When you're building ML pipelines or data workflows, you need something to coordinate all the different steps such as &lt;code&gt;fetching data&lt;/code&gt;, &lt;code&gt;transforming it&lt;/code&gt;, &lt;code&gt;training models&lt;/code&gt;, &lt;code&gt;deploying them&lt;/code&gt;, and so on. &lt;code&gt;Dagster&lt;/code&gt; helps you organize all of that massive work into something manageable.&lt;br&gt;
What caught my attention is that it is actually used in production by real companies. This isn't some hobby project. Plus, it has a really active community and the codebase is actually pretty readable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Finding the Issue&lt;/strong&gt;&lt;br&gt;
While browsing through their GitHub issues, I found Issue &lt;code&gt;#32574: "Callable object custom signatures are resolved incorrectly."&lt;/code&gt;&lt;br&gt;
&lt;a href="https://github.com/dagster-io/dagster/issues/32574" rel="noopener noreferrer"&gt;Issue-32574&lt;/a&gt;&lt;br&gt;
At first glance, I thought "Oh cool, this looks easy." But then I read the details and realized this was actually pretty interesting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Problem&lt;/strong&gt;&lt;br&gt;
Here's the deal: &lt;code&gt;Python&lt;/code&gt; has this cool feature where you can create callable objects (basically, classes with a &lt;code&gt;__call__&lt;/code&gt; method) that act like functions. You can even give them custom signatures using the &lt;code&gt;__signature__&lt;/code&gt; attribute. This is super useful for decorators and wrappers that need to preserve type information.&lt;br&gt;
But Dagster's &lt;code&gt;get_type_hints()&lt;/code&gt; function wasn't handling this correctly. When you had something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import inspect

class MyWrapper:
    def __init__(self, fn):
        # Set custom signature
        self.__signature__ = inspect.signature(fn)

    def __call__(self, **kwargs):
        # Generic signature
        ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The code would crash with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TypeError: &amp;lt;callable object&amp;gt; is not a module, class, method, or function.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why? Because the code was trying to pass the callable instance directly to Python's &lt;code&gt;typing.get_type_hints()&lt;/code&gt;, which doesn't know how to handle arbitrary objects. It only works with actual functions, classes, and modules.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Solution&lt;/strong&gt;&lt;br&gt;
The fix was actually straightforward once I understood the problem. Instead of passing the object to &lt;code&gt;typing.get_type_hints()&lt;/code&gt;, you extract the type information directly from the &lt;code&gt;__signature__&lt;/code&gt; object.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Early branch inside get_type_hints(fn)
if hasattr(fn, "__signature__"):
    sig = fn.__signature__
    hints = {}
    for param_name, param in sig.parameters.items():
        # Keep only parameters that actually carry an annotation
        if param.annotation is not inspect.Parameter.empty:
            hints[param_name] = param.annotation
    if sig.return_annotation is not inspect.Signature.empty:
        hints['return'] = sig.return_annotation
    return hints  # Return immediately!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The signature object already has all the type information you need which means that you can simply extract it.&lt;/p&gt;
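&lt;p&gt;If you want to try the idea outside of Dagster, here is a self-contained sketch. The helper name &lt;code&gt;hints_from_signature&lt;/code&gt;, the &lt;code&gt;scale&lt;/code&gt; function, and the &lt;code&gt;Wrapper&lt;/code&gt; class are hypothetical names I made up for this example, and returning an empty dict when there is no custom signature is an assumption. The real function does its normal lookup in that case.&lt;/p&gt;

```python
import inspect
from typing import Any, Callable, Dict


def hints_from_signature(fn: Callable[..., Any]) -> Dict[str, Any]:
    """Hypothetical helper: collect type hints from a custom __signature__."""
    sig = getattr(fn, "__signature__", None)
    if not isinstance(sig, inspect.Signature):
        # No usable custom signature. A real implementation would fall back
        # to its usual lookup here; this sketch just returns nothing.
        return {}
    hints = {
        name: param.annotation
        for name, param in sig.parameters.items()
        if param.annotation is not inspect.Parameter.empty
    }
    if sig.return_annotation is not inspect.Signature.empty:
        hints["return"] = sig.return_annotation
    return hints


def scale(value: float, factor: int, label=None) -> float:
    """Illustrative function; label is deliberately left unannotated."""
    return value * factor


class Wrapper:
    def __init__(self, fn):
        self.fn = fn
        self.__signature__ = inspect.signature(fn)

    def __call__(self, **kwargs):
        return self.fn(**kwargs)


print(hints_from_signature(Wrapper(scale)))
```

&lt;p&gt;The unannotated &lt;code&gt;label&lt;/code&gt; parameter is skipped, so only annotated parameters and the return type end up in the result.&lt;/p&gt;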

&lt;p&gt;&lt;strong&gt;Testing&lt;/strong&gt;&lt;br&gt;
One of the most important parts of contributing to open source is &lt;code&gt;testing&lt;/code&gt;. I created &lt;code&gt;test_sensor_invocation_resources_callable_with_custom_signature()&lt;/code&gt;, which reproduces exactly the scenario the issue described.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Creates a callable object with a custom `__signature__`
Verifies that `Dagster` can now correctly read the type hints
Confirms that resources are properly recognized.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the code passed the test, I ran the entire test suite to make sure I didn't break anything. All 52 tests in &lt;code&gt;test_sensor_invocation.py&lt;/code&gt; passed. That's always a good feeling.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F99nodir12kv7rjckok4b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F99nodir12kv7rjckok4b.png" alt=" " width="784" height="220"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I Learned&lt;/strong&gt;&lt;br&gt;
This contribution taught me way more than just "fix this bug".&lt;/p&gt;

&lt;p&gt;Python's Signature Protocol: I had no idea &lt;code&gt;Python&lt;/code&gt; had such a sophisticated system for custom signatures. The &lt;code&gt;__signature__&lt;/code&gt; attribute is honored by the standard library's &lt;code&gt;inspect.signature()&lt;/code&gt; and is designed for cases like this.&lt;br&gt;
Testing is Critical: In a production system like &lt;code&gt;Dagster&lt;/code&gt;, you can't just fix it and call it good to go. I had to make sure my change didn't break any existing functionality. The &lt;code&gt;test suite&lt;/code&gt; is your safety net.&lt;br&gt;
Reading Complex Codebases: This required understanding how &lt;code&gt;Dagster's&lt;/code&gt; decorator system works, how it resolves resources, and how the whole dependency injection mechanism functions. It was challenging but super rewarding.&lt;/p&gt;

&lt;p&gt;I'm really enjoying this open source contribution journey. Each project teaches me something new. If you're thinking about contributing to open source, my advice is this: don't be afraid to tackle issues that seem a bit over your head. You'll realize you are not as stupid as you think you are. Just make sure you understand the problem and the project before you start coding.&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>python</category>
      <category>dataengineering</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
