<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: luminousmen</title>
    <description>The latest articles on Forem by luminousmen (@luminousmen).</description>
    <link>https://forem.com/luminousmen</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F127337%2F6f34580c-2861-4a64-b93c-02c46670fe1e.jpg</url>
      <title>Forem: luminousmen</title>
      <link>https://forem.com/luminousmen</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/luminousmen"/>
    <language>en</language>
    <item>
      <title>Understanding AWS Regions and Availability Zones: A Guide for Beginners</title>
      <dc:creator>luminousmen</dc:creator>
      <pubDate>Mon, 28 Apr 2025 06:00:00 +0000</pubDate>
      <link>https://forem.com/luminousmen/understanding-aws-regions-and-availability-zones-a-guide-for-beginners-1bcd</link>
      <guid>https://forem.com/luminousmen/understanding-aws-regions-and-availability-zones-a-guide-for-beginners-1bcd</guid>
      <description>&lt;p&gt;&lt;a href="https://aws.amazon.com/" rel="noopener noreferrer"&gt;Amazon Web Services (AWS)&lt;/a&gt; has completely changed the game for how we build and manage infrastructure. Gone are the days when spinning up a new service meant begging your sys team for hardware, waiting weeks, and spending hours in a cold data center plugging in cables. Now? A few clicks (or API calls), and yes — you've got an entire data center at your fingertips.&lt;/p&gt;

&lt;p&gt;But with great power comes great... &lt;em&gt;complexity&lt;/em&gt;. AWS hands us a buffet of options, and figuring out how to architect for &lt;a href="https://luminousmen.com/post/architecturally-significant-requirements" rel="noopener noreferrer"&gt;high availability&lt;/a&gt; and disaster recovery can be, frankly, a bit overwhelming. So let's break it down. These are the three infrastructure concepts you &lt;em&gt;actually&lt;/em&gt; need to care about when planning for uptime: &lt;strong&gt;Regions&lt;/strong&gt;, &lt;strong&gt;Availability Zones&lt;/strong&gt;, and &lt;strong&gt;Edge Locations&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If your go-to plan is "I'll just pick &lt;code&gt;us-east-1&lt;/code&gt; and be done with it", this post is for you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Region
&lt;/h2&gt;

&lt;p&gt;An AWS Region is a physically isolated chunk of the AWS cloud, typically spanning a big geographic area. AWS currently operates &lt;a href="https://aws.amazon.com/about-aws/global-infrastructure/" rel="noopener noreferrer"&gt;more than 30 geographic regions&lt;/a&gt; across North America, South America, Europe, the Middle East, Africa, and Asia Pacific.&lt;/p&gt;

&lt;p&gt;So why should you care? Because &lt;strong&gt;each Region is its own little AWS island&lt;/strong&gt; — separate hardware, separate networks, separate everything. Nothing is shared. No silent data replication magic is happening between regions (unless you set it up).&lt;/p&gt;

&lt;p&gt;This separation gives you power and flexibility for redundancy and disaster recovery — plus peace of mind when a region takes a nap (looking at you, &lt;code&gt;us-east-1&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw8wuhu1yxp3w7iujk3j4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw8wuhu1yxp3w7iujk3j4.png" alt="Image description" width="800" height="757"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For instance, &lt;strong&gt;Airbnb&lt;/strong&gt; uses AWS Regions to ensure high availability for its millions of users. By leveraging AWS load balancing and auto-scaling across multiple regions, Airbnb can handle traffic spikes and maintain uptime even during regional failures. &lt;/p&gt;

&lt;p&gt;Similarly, &lt;strong&gt;Slack&lt;/strong&gt; uses AWS Regions to store user data and messages and to handle real-time messaging across the globe, ensuring scalability and data locality.&lt;/p&gt;

&lt;h3&gt;
  
  
  Choosing the Right Region
&lt;/h3&gt;

&lt;p&gt;Yes, it's tempting to just pick the default. But here's what you &lt;em&gt;should&lt;/em&gt; be thinking about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Latency&lt;/strong&gt;: Choose a region close to your users. Distance = delay.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Regulations&lt;/strong&gt;: GDPR, local residency requirements — sometimes the law makes your decision for you.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Services&lt;/strong&gt;: Some AWS toys aren't available everywhere. &lt;a href="https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/" rel="noopener noreferrer"&gt;Check this list&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Money&lt;/strong&gt;: Prices vary by region. It's not just taxes—it's also about supply chain and power costs. Use the &lt;a href="https://calculator.aws.amazon.com/" rel="noopener noreferrer"&gt;AWS Pricing Calculator&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
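
&lt;p&gt;To make that trade-off concrete, here is a toy region chooser in Python. Every number below is invented for illustration (in practice you'd measure latency from your actual users and pull real prices from the AWS Pricing Calculator), and the weighting scheme is just one possible heuristic, not an AWS recommendation:&lt;/p&gt;

```python
# Toy region chooser: scores candidate regions by user latency and a
# relative price index. All numbers here are made up for illustration.

# Hypothetical inputs: p50 latency from your users (ms) and a relative
# cost index (us-east-1 = 1.00).
REGIONS = {
    "us-east-1":      {"latency_ms": 120, "cost_index": 1.00},
    "eu-central-1":   {"latency_ms": 35,  "cost_index": 1.09},
    "ap-southeast-1": {"latency_ms": 210, "cost_index": 1.17},
}

def score(region, latency_weight=0.7):
    """Lower is better: a weighted blend of latency and cost."""
    cost_weight = 1.0 - latency_weight
    return latency_weight * region["latency_ms"] / 100 + cost_weight * region["cost_index"]

def pick_region(regions, latency_weight=0.7):
    """Return the name of the best-scoring region."""
    return min(regions, key=lambda name: score(regions[name], latency_weight))

print(pick_region(REGIONS))  # with these made-up numbers: eu-central-1
```

&lt;p&gt;Crank &lt;code&gt;latency_weight&lt;/code&gt; down and the cheaper region starts winning; that tension between latency, law, services, and money is exactly the conversation worth having before defaulting to &lt;code&gt;us-east-1&lt;/code&gt;.&lt;/p&gt;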

&lt;p&gt;Sure, you &lt;em&gt;can&lt;/em&gt; go multi-region. But unless your app is mission-critical at a global scale, a well-architected setup within one region (with multiple AZs) is usually the sweet spot. Speaking of...&lt;/p&gt;

&lt;h2&gt;
  
  
  Availability Zone
&lt;/h2&gt;

&lt;p&gt;So, you've picked your region. Nice. Now let's zoom in. Each AWS region is sliced into &lt;strong&gt;Availability Zones&lt;/strong&gt; — fortified data centers linked by high-speed private fiber, close enough to each other for low-latency replication but physically isolated to prevent a domino disaster.&lt;/p&gt;

&lt;p&gt;There are seven AWS regions in North America alone, each with at least a few &lt;a href="https://aws.amazon.com/about-aws/global-infrastructure/regions_az/" rel="noopener noreferrer"&gt;Availability Zones&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;Take &lt;code&gt;us-east-1&lt;/code&gt; (everyone's favorite punching bag). It has at least six AZs: &lt;code&gt;us-east-1a&lt;/code&gt; through &lt;code&gt;us-east-1f&lt;/code&gt;. These aren't just checkboxes — they're massive, isolated data centers built to survive fires, floods, and whatever else the world throws at them.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa0f3gfq1hhu0mgrn3qno.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa0f3gfq1hhu0mgrn3qno.png" alt="Image description" width="800" height="671"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For example, &lt;a href="https://aws.amazon.com/solutions/case-studies/innovators/netflix/" rel="noopener noreferrer"&gt;Netflix uses AWS Availability Zones&lt;/a&gt; to ensure that its streaming service is always available to its millions of users. Netflix uses AWS load balancing and auto-scaling services to spread workloads across AZs so that if one falls over, the others keep streaming your crime docs and baking shows without missing a beat.&lt;/p&gt;

&lt;h3&gt;
  
  
  Best Practices for Using AZs
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Spread your stuff out&lt;/strong&gt;: Deploy services across multiple AZs to ensure high availability. At least two. Always.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prepare for disaster&lt;/strong&gt;: Implement backup plans and failover mechanisms to automatically redirect traffic to healthy AZs in case of failures.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Load balance&lt;/strong&gt;: AWS's Elastic Load Balancing can distribute incoming application traffic across multiple targets in different AZs, enhancing fault tolerance. Use it.&lt;/li&gt;
&lt;/ul&gt;
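
&lt;p&gt;The "spread your stuff out" rule can be sketched in a few lines of Python. This is a toy round-robin placement, not how AWS schedules anything; the AZ names are real &lt;code&gt;us-east-1&lt;/code&gt; zones, while the instance names are made up:&lt;/p&gt;

```python
from itertools import cycle

# Toy sketch of the "spread your stuff out" rule: place instances across
# Availability Zones round-robin so no single AZ holds every copy.

def spread_across_azs(instances, azs):
    """Round-robin placement: returns a dict of instance name to AZ."""
    assert len(azs) >= 2, "use at least two AZs for high availability"
    az_cycle = cycle(azs)
    return {inst: next(az_cycle) for inst in instances}

placement = spread_across_azs(
    ["web-1", "web-2", "web-3", "web-4"],
    ["us-east-1a", "us-east-1b"],
)
print(placement)  # web-1/web-3 land in us-east-1a, web-2/web-4 in us-east-1b
```

&lt;p&gt;In real life an Auto Scaling group does this for you once you register subnets in several AZs, but the invariant is the same: losing one zone should never take out every copy of a service.&lt;/p&gt;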

&lt;h2&gt;
  
  
  Edge Locations
&lt;/h2&gt;

&lt;p&gt;Now let's talk about raw speed. You've got AZs for resilience, but how do you get fast performance for users in Bangkok, Berlin, and Buenos Aires? That's where &lt;strong&gt;Edge Locations&lt;/strong&gt; come in.&lt;/p&gt;

&lt;p&gt;Edge Locations are AWS's mini outposts — smaller infrastructure sites strategically placed closer to end-users. Think CDNs, DNS, and security — but at the edge. One of their main jobs is reducing latency by serving high-bandwidth content, like video, from nearby locations.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/cloudfront/" rel="noopener noreferrer"&gt;AWS CloudFront&lt;/a&gt; is the star of the show here. It caches static content (like media, scripts, and images) to ensure fast, reliable delivery. Other AWS services that run at the edge include &lt;a href="https://aws.amazon.com/route53/" rel="noopener noreferrer"&gt;Route 53&lt;/a&gt; for DNS routing, &lt;a href="https://aws.amazon.com/shield/" rel="noopener noreferrer"&gt;Shield&lt;/a&gt; and &lt;a href="https://aws.amazon.com/waf/" rel="noopener noreferrer"&gt;WAF&lt;/a&gt; for security, and even &lt;a href="https://aws.amazon.com/lambda/" rel="noopener noreferrer"&gt;Lambda&lt;/a&gt; via Lambda@Edge — giving you the ability to run serverless logic closer to the user.&lt;/p&gt;

&lt;p&gt;Two examples of companies using AWS Edge locations are Twitch and Peloton. &lt;a href="https://rohan6820.medium.com/aws-case-study-twitch-324ecf8288aa" rel="noopener noreferrer"&gt;Twitch uses AWS CloudFront&lt;/a&gt; and other edge location services to improve the delivery of live-streaming video content to its global audience. By caching content at edge locations closer to viewers, Twitch is able to reduce latency and improve the quality of the viewing experience. &lt;/p&gt;

&lt;p&gt;Peloton uses AWS Edge locations to stream high-quality video content to its connected fitness equipment and mobile applications. By using edge locations, Peloton is able to provide low-latency video streaming, which means no buffering mid-burpee.&lt;/p&gt;

&lt;p&gt;It's worth noting: not every AWS service is available at every edge location. Double-check before you architect. AWS has been expanding what runs at the edge — especially for IoT and real-time use cases — but still, validate your requirements.&lt;/p&gt;

&lt;p&gt;While using Edge locations can offer benefits such as reduced latency and improved application performance, there are trade-offs to consider. For instance, Edge locations can be more expensive than traditional regions, so it's important to carefully evaluate the cost-benefit of using them. Security is another concern, as Edge locations may be more vulnerable to security threats due to their proximity to end-users.&lt;/p&gt;

&lt;h4&gt;
  
  
  Additional materials
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://aws.amazon.com/about-aws/global-infrastructure/" rel="noopener noreferrer"&gt;AWS Global infrastructure&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.twitch.tv/aws/video/955519365" rel="noopener noreferrer"&gt;Twich: Using Lambda Edge on Amazon CloudFront for advanced use cases&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Thank you for reading!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Curious about something or have thoughts to share? Leave your comment below! Check out &lt;a href="https://luminousmen.com/" rel="noopener noreferrer"&gt;my blog&lt;/a&gt; or follow me via &lt;a href="https://www.linkedin.com/in/luminousmen/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;, &lt;a href="https://luminousmen.substack.com/welcome" rel="noopener noreferrer"&gt;Substack&lt;/a&gt;, or &lt;a href="https://t.me/iamluminousmen" rel="noopener noreferrer"&gt;Telegram&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>data</category>
      <category>dataengineering</category>
      <category>cloud</category>
    </item>
    <item>
      <title>Two Archetypes of Data Engineers</title>
      <dc:creator>luminousmen</dc:creator>
      <pubDate>Tue, 25 Feb 2025 06:00:00 +0000</pubDate>
      <link>https://forem.com/luminousmen/two-archetypes-of-data-engineers-42gf</link>
      <guid>https://forem.com/luminousmen/two-archetypes-of-data-engineers-42gf</guid>
      <description>&lt;p&gt;Data engineering is a crucial element of the data ecosystem, comprised of diverse professionals who play essential roles in managing and processing data. While the job title may be the same, as I’ve seen over the years, data engineers often fall into two distinct archetypes: the "businessy" data engineer and the "techy" data engineer as I like to call them. In this blog post, we will explore these two archetypes, their characteristics, and their contributions to the world of data engineering.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Businessy Data Engineer
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftmnan75h0hkrnspt53mm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftmnan75h0hkrnspt53mm.png" alt="Businessy Data Engineer" width="800" height="559"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;These folks are all about solving business problems. They are passionate about tracking metrics, Key Performance Indicators (KPIs), and building interactive dashboards. Often, they have extensive SQL experience and possess coding skills in versatile languages like Python, ideal for data manipulation and analysis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Responsibilities&lt;/strong&gt;: Their primary focus is translating business needs into data solutions. They build data pipelines to collect, transform, and load data, enabling meaningful insights for decision-makers. These professionals are often referred to as Business Intelligence (BI) Engineers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Daily Tasks&lt;/strong&gt;: A typical day may involve gathering requirements from stakeholders, designing dashboards, scripting in Python or SQL for data extraction and transformation, and collaborating with business teams to ensure data-driven decision-making.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Techy Data Engineer
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F79ka7ms2d0wnvsty3o8a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F79ka7ms2d0wnvsty3o8a.png" alt="Techy Data Engineer" width="800" height="559"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;On the other hand, techy data engineers are drawn to solving scale problems. They thrive on exploring and implementing new technologies, and often prefer coding in languages like Scala or Java. They are responsible for building scalable data pipelines that can handle massive volumes of data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Responsibilities&lt;/strong&gt;: Techy data engineers focus on building and maintaining robust data infrastructure. They ensure that data pipelines are scalable, reliable, and capable of handling large datasets. They are proficient in tools like Apache Spark, Apache Flink, and Apache Airflow, which are vital for processing vast amounts of data, and they know the intricacies of cloud tooling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Daily Tasks&lt;/strong&gt;: A typical day for a techy data engineer might involve optimizing data pipelines, troubleshooting performance issues, experimenting with new data storage and processing technologies, and collaborating with data scientists to deploy machine learning models.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bridging the Gap
&lt;/h2&gt;

&lt;p&gt;While these two archetypes of data engineers have distinct roles and responsibilities, there is immense potential when they work together. The businessy data engineer's ability to understand and translate business requirements complements the techy data engineer's expertise in building scalable solutions. The businessy folks understand what the suits want, and the techy folks build the data powerhouse to support those needs. Collaboration between these two types of data engineers can lead to the creation of powerful data-driven solutions. Teamwork makes the dream work, right? &lt;/p&gt;

&lt;p&gt;Now, here's the thing: most data engineering training focuses on the techy side of things, leaving a gap for the businessy data engineer. We need content that showcases their role, assists individuals in identifying suitable job postings, and guides them on their learning journey.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Which type do you identify with?&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Thank you for reading!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Curious about something or have thoughts to share? Leave your comment below! Check out &lt;a href="https://luminousmen.com/" rel="noopener noreferrer"&gt;my blog&lt;/a&gt; or follow me via &lt;a href="https://www.linkedin.com/in/luminousmen/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;, &lt;a href="https://luminousmen.substack.com/welcome" rel="noopener noreferrer"&gt;Substack&lt;/a&gt;, or &lt;a href="https://t.me/iamluminousmen" rel="noopener noreferrer"&gt;Telegram&lt;/a&gt;.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>How to Build High-Performance Engineering Teams</title>
      <dc:creator>luminousmen</dc:creator>
      <pubDate>Tue, 29 Oct 2024 11:00:00 +0000</pubDate>
      <link>https://forem.com/luminousmen/how-to-build-high-performance-engineering-teams-47hl</link>
      <guid>https://forem.com/luminousmen/how-to-build-high-performance-engineering-teams-47hl</guid>
      <description>&lt;p&gt;Building a high-performance engineering team is like assembling a space shuttle; every component, no matter how small, plays a crucial role in ensuring a successful mission. &lt;/p&gt;

&lt;h2&gt;
  
  
  Step 0: Hire Top Engineers
&lt;/h2&gt;

&lt;p&gt;First things first: &lt;em&gt;your team's performance is directly tied to the people in it&lt;/em&gt;. This isn't groundbreaking news, but it's astonishing how often this simple fact is overlooked in favor of flashy tools and methodologies. Just like a sports team, no amount of sophisticated strategy can compensate for a lack of talent. &lt;/p&gt;

&lt;p&gt;So, what makes a top engineer? It's not just about someone who can code in their sleep or debug as if they have a sixth sense for semicolons. You're looking for people who drive growth and push the team to innovate. These are the ones who don't just get the job done but also bring out the best in everyone around them.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Ecosystem Mindset
&lt;/h3&gt;

&lt;p&gt;When hiring, think beyond just filling roles; think about building a supportive ecosystem. You're crafting an environment where excellence breeds more excellence. The right person can uplift the entire team, turning every problem into a chance to learn and grow. The growth here isn't linear; it's exponential. Each new member doesn't just add to the team; they multiply its capabilities, innovation, and problem-solving skills.&lt;/p&gt;

&lt;h3&gt;
  
  
  Beyond the Resume
&lt;/h3&gt;

&lt;p&gt;Finding this talent means looking past the resumes full of technical jargon and buzzwords. Sure, it's tempting to get impressed by pages of experience and skill lists. But we’re not building a collection of trading cards here; we’re building a team. Seek out problem-solvers and innovators — those who don’t just code, but live and breathe engineering. You need people who possess a certain tenacity, a creative spark. You want people who are adaptable, creative, and always eager to learn.&lt;/p&gt;

&lt;p&gt;Finally, don't underestimate the power of cultural fit. You want someone who not only excels in their technical role but also fits with your team’s ethos and work style. A top engineer who disrupts the team dynamics is like a supercar with a flat tire – impressive, but not going anywhere fast.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrapping It Up
&lt;/h2&gt;

&lt;p&gt;Building a high-performance engineering team starts with the people. It's about looking beyond the resume and understanding the deeper impact a single individual can have on the entire team. It's a mission, and every mission needs its astronauts. Choose wisely, and you'll be on your way to engineering excellence.&lt;/p&gt;

</description>
      <category>leadership</category>
    </item>
    <item>
      <title>Table Selection in Software Engineering</title>
      <dc:creator>luminousmen</dc:creator>
      <pubDate>Tue, 22 Oct 2024 11:00:00 +0000</pubDate>
      <link>https://forem.com/luminousmen/table-selection-in-software-engineering-2118</link>
      <guid>https://forem.com/luminousmen/table-selection-in-software-engineering-2118</guid>
      <description>&lt;p&gt;In the world of poker, there is a strategy that goes beyond just playing the game well – it's about choosing the right table. The idea here is clear: why struggle against the best when you can excel among the rest? This can also be applied to &lt;a href="https://luminousmen.com/post/practical-guide-to-navigating-tech-job-market" rel="noopener noreferrer"&gt;navigating a career in software engineering&lt;/a&gt;. Why not play smart by picking your battles - or in this case, the company, project, and team that aligns with your strengths and goals?&lt;/p&gt;

&lt;h2&gt;
  
  
  Choosing the Right Company
&lt;/h2&gt;

&lt;p&gt;Starting a career in software engineering is like stepping into a casino full of tables. You need to pick the one where you can play your best game. It's not just about avoiding tough competition; it's about finding a culture that resonates with you and where you get the most value in return.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdrbdsid5kclogpgo3jim.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdrbdsid5kclogpgo3jim.png" alt="Choosing the Right Company" width="600" height="584"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Maybe you thrive in a nurturing environment over a cutthroat one, or, perhaps, you prefer working on projects that align with your personal interests. Some may avoid sectors like advertising, crypto, surveillance, gaming, or streaming despite lucrative offers, preferring to work for companies whose mission and products align with their personal ethics. The key is to understand that company culture plays a massive role in your career trajectory and job satisfaction.&lt;/p&gt;

&lt;h3&gt;
  
  
  Projects vs Products
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Project-Based Work
&lt;/h4&gt;

&lt;p&gt;Project work, often seen in contract or freelancing roles, is like playing a series of short, intense poker hands, each with different players and stakes. The primary focus is on completing a specific set of requirements within a defined time frame. It involves working on short-term, often varied tasks or goals. This type of work can be exciting and diverse, offering exposure to a broad range of technologies and industries. It's an environment where adaptability and a broad skill set are highly valued.&lt;/p&gt;

&lt;p&gt;However, this type of work has its downsides. The most prominent is the stress and uncertainty of constantly seeking new clients and projects. For many, the irregularity of work and income, along with the need for continual self-marketing, can be draining. Moreover, project work often lacks the continuity and long-term impact that some engineers crave.&lt;/p&gt;

&lt;h4&gt;
  
  
  Product-Based Work
&lt;/h4&gt;

&lt;p&gt;Unlike project work, product work involves contributing to a long-term, vision-driven effort, typically within a product team. This type of work is similar to playing a long, strategic game of poker at a single table, where understanding the nuances and dynamics over time can lead to greater success.&lt;/p&gt;

&lt;p&gt;In product teams, the focus is on the evolution and refinement of a product over time. This approach allows for a deeper immersion into a specific set of technologies and industries and often leads to a stronger sense of ownership and connection to the work. The commitment to a single vision over a prolonged period fosters an environment where deep, specialized skills are developed resulting in more efficient and valuable products.&lt;/p&gt;

&lt;p&gt;Working on products can be incredibly rewarding for those who find motivation in seeing the long-term impact of their efforts. Being part of a team that drives a product from inception to market success offers a sense of accomplishment and purpose. Additionally, product teams often provide more stable work environments compared to project-based roles.&lt;/p&gt;

&lt;p&gt;Choosing between project and product-based work is a personal decision that should align with your career goals, work preferences, and life circumstances. If you thrive in dynamic, fast-paced environments and enjoy the challenge of adapting to new contexts regularly, project-based work might be more fulfilling. Conversely, if you seek stability, long-term impact, and a deeper connection to your work, product teams may offer the environment you need.&lt;/p&gt;

&lt;h3&gt;
  
  
  Startup vs Big Tech
&lt;/h3&gt;

&lt;p&gt;Choosing between a startup and an established company is a classic "Risk vs Opportunity" dilemma. &lt;/p&gt;

&lt;p&gt;Working in startups (typically smaller teams) versus large tech companies offers different experiences. Startups offer excitement and the chance to build something from scratch, but they come with risks like job security and market volatility. Startups allow closer involvement with the product and customers and a better understanding of the business mechanics. However, they often lack experience in scaling their product. &lt;/p&gt;

&lt;p&gt;Large companies offer more stability and structured career paths but might limit your exposure to cutting-edge technologies. FAANGs have mature infrastructure and well-integrated systems; the major challenge you will face is that most large tech companies expect you to find valuable projects yourself and to understand (and focus on) the business mechanics behind them. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2rotm801idrn5cfrzn8d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2rotm801idrn5cfrzn8d.png" alt="Startup vs Big Tech" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I would say the choice between startup and big tech may depend on the phase of your career, but that is my opinion — decide for yourself. The balance you choose, again, should align with your career goals, risk tolerance, and personal goals. Whether you opt for the dynamism of a startup or the steadiness of an established company, ensure it aligns with your long-term objectives.&lt;/p&gt;

&lt;h3&gt;
  
  
  Compensation
&lt;/h3&gt;

&lt;p&gt;Compensation in IT varies widely between startups and Big Tech. Big companies often offer higher salaries and benefits, reflecting their stability and scale. Startups tend to pay less, and they rarely have the annual performance-based bonuses that are typical at bigger companies. But many startups hand out equity to employees (and the earlier you join, the bigger the grant).&lt;/p&gt;

&lt;p&gt;In addition to the size of the company, the industry or sector you choose to work in can also affect your compensation. Obviously, engineers working in more lucrative industries generally earn more than those in less profitable sectors. This factor should be considered when evaluating potential employers, especially &lt;a href="https://luminousmen.com/post/m-motivation" rel="noopener noreferrer"&gt;if financial compensation is a high priority for you&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Freg6jibzzav36u31dtrp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Freg6jibzzav36u31dtrp.png" alt="Innovation vs Stability" width="733" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Either way, understanding your market value and being able to articulate your worth plays a significant role in your earning potential. Experience level matters too – senior engineers typically command higher salaries, reflecting their accumulated expertise.&lt;/p&gt;

&lt;h2&gt;
  
  
  Picking the Right Work
&lt;/h2&gt;

&lt;p&gt;Once you've picked your company, the next step is selecting the right project or product area. This is like choosing a good hand to play in poker. Occasionally, you might end up with a 2 and 7, if you catch my drift.&lt;/p&gt;

&lt;p&gt;Avoid projects that are too far outside your skill range. Start with something that matches your abilities, making significant contributions and building your confidence. As you grow, you can gradually take on more challenging projects.&lt;/p&gt;

&lt;p&gt;It's also wise to assess the project's strategic importance within the company. Projects at the heart of the business are likely to get more resources and attention, offering greater visibility and advancement potential. Understanding the difference between profit centers and cost centers in a company is crucial. Aim to work in areas that directly contribute to the company's profitability.&lt;/p&gt;

&lt;p&gt;P.S. I have a personal decision form for choosing a project; maybe you'll find it useful: &lt;a href="http://decision.luminousmen.com/" rel="noopener noreferrer"&gt;decision.luminousmen.com&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Selecting the Right Team
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F414luhirgvb4u1lgwcip.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F414luhirgvb4u1lgwcip.png" alt="The Right Team" width="728" height="728"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Your team can make or break your job experience, much like table mates in a poker game. &lt;/p&gt;

&lt;p&gt;Avoid teams with toxic or overly competitive cultures, no matter how exciting the project might be. Instead, look for teams that prioritize healthy competition – where challenges are seen as opportunities for collective growth rather than just personal advancement. A supportive team environment encourages risk-taking and learning from failures, essential aspects of innovation and personal development.&lt;/p&gt;

&lt;p&gt;Working with experienced colleagues who can provide mentorship is an invaluable asset in your career development. In a well-mentored environment, you are more likely to have access to constructive feedback, insights into best practices, and learning opportunities that extend beyond your immediate project work. This kind of environment fosters a culture of continuous learning and improvement, where more experienced team members are seen as resources and role models, rather than competitors.&lt;/p&gt;

&lt;p&gt;Diversity in a team is also crucial. Teams with a mix of backgrounds and perspectives foster innovation and personal growth. Look for teams that value diversity and inclusion for a more enriching work experience.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Success in software engineering isn't just about tackling the toughest challenges; sometimes, it's about strategically choosing paths that allow you to shine and grow. With the right company, project, and team, your career in technology won't just be about work; it'll be a journey of meaningful impact and personal fulfillment.&lt;/p&gt;

&lt;p&gt;In my opinion, when you're at the junior to mid level, it's crucial to focus on gaining experience and working with people who are smarter than you on complex projects. Forget about the money and be willing to work even for minimal compensation to accumulate valuable experience. As you progress to a senior role, you can start thinking about compensation, but it's advisable not to compromise your interest in the work. When you reach the staff level, you're expected to have a deep understanding of your field. A career is a journey where you initially work to build your skills and experience, and eventually, your reputation and expertise will work for you.&lt;/p&gt;

&lt;h4&gt;
  
  
  Additional materials
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://amzn.to/4b7UQe1" rel="noopener noreferrer"&gt;The Staff Engineer's Path by Tanya Reilly&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://amzn.to/499rWJd" rel="noopener noreferrer"&gt;Soft Skills: The software developer's life manual by John Sonmez&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://amzn.to/3SAEemH" rel="noopener noreferrer"&gt;Drive: The Surprising Truth About What Motivates Us by Daniel H. Pink&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Thank you for reading!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Any questions? Leave your comment below to start fantastic discussions!&lt;/p&gt;

&lt;p&gt;Check out &lt;a href="https://luminousmen.com/" rel="noopener noreferrer"&gt;my blog&lt;/a&gt; or come to say hi 👋 on &lt;a href="https://twitter.com/luminousmen" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt; or subscribe to &lt;a href="https://t.me/iamluminousmen" rel="noopener noreferrer"&gt;my telegram channel&lt;/a&gt;. Looking forward to hearing from you!&lt;/p&gt;

</description>
      <category>career</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>From ETL and ELT to Reverse ETL</title>
      <dc:creator>luminousmen</dc:creator>
      <pubDate>Tue, 15 Oct 2024 11:00:00 +0000</pubDate>
      <link>https://forem.com/luminousmen/from-etl-and-elt-to-reverse-etl-3mke</link>
      <guid>https://forem.com/luminousmen/from-etl-and-elt-to-reverse-etl-3mke</guid>
      <description>&lt;p&gt;In recent years, we've witnessed a significant transformation in data management, moving from the traditional &lt;strong&gt;ETL&lt;/strong&gt; (Extract, Transform, Load) framework to the more agile &lt;strong&gt;ELT&lt;/strong&gt; (Extract, Load, Transform) methodology. This evolution marks a major shift in how data is processed. However, an even newer trend, Reverse ETL, is reshaping our approach to data integration.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding ETL and ELT
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdi30hfc9a32m61shj3w5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdi30hfc9a32m61shj3w5.png" alt="ETL vs ELT" width="800" height="599"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  ETL Fundamentals
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;ETL&lt;/strong&gt; has been the foundational framework in data handling for decades. It involves:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Extract:&lt;/strong&gt; The first step is to gather data from various sources, such as applications, websites, CRM platforms, and other source systems.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transform:&lt;/strong&gt; This is where the cleanup happens: raw data is cleansed, de-duplicated, validated, and organized to conform to a single data model and maintain data integrity. A series of rules or functions is applied to the extracted data to prepare it for loading into the final target database. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Load:&lt;/strong&gt; The final step is to store the cleansed and organized data into the final target database such as an operational data store, a data mart, &lt;a href="https://luminousmen.com/post/data-lake-vs-data-warehouse" rel="noopener noreferrer"&gt;data lake or a data warehouse&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;
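&lt;p&gt;The three steps above can be sketched end to end. Here is a minimal, self-contained Python toy; the source records and cleaning rules are invented purely for illustration:&lt;/p&gt;

```python
# Toy ETL pipeline: extract -> transform -> load, all in memory.
# The source records and cleaning rules are invented for illustration.

def extract():
    # Pretend these rows came from a CRM export or an application database
    return [
        {"email": "A@Example.com", "amount": "10.5"},
        {"email": "a@example.com", "amount": "10.5"},          # duplicate
        {"email": "b@example.com", "amount": "not-a-number"},  # invalid
    ]

def transform(rows):
    seen, clean = set(), []
    for row in rows:
        email = row["email"].strip().lower()  # normalize to one data model
        try:
            amount = float(row["amount"])     # validate
        except ValueError:
            continue                          # drop invalid records
        if email in seen:
            continue                          # de-duplicate
        seen.add(email)
        clean.append({"email": email, "amount": amount})
    return clean

def load(rows, warehouse):
    # In a real pipeline this would write to a warehouse or data mart
    warehouse.extend(rows)

warehouse = []
load(transform(extract()), warehouse)
print(warehouse)  # [{'email': 'a@example.com', 'amount': 10.5}]
```

Note that the transform runs before the load: nothing reaches the warehouse until it conforms to the target model, which is exactly the property that becomes a bottleneck at scale.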

&lt;p&gt;However, ETL's linear process often led to bottlenecks, especially with the exponential growth in data volume and complexity. Transforming data before loading it into the warehouse was time-consuming and less flexible, particularly for unstructured data, which is increasingly prevalent in today's data ecosystems. In response to these challenges, the ELT approach emerged as a solution.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Shift to ELT
&lt;/h3&gt;

&lt;p&gt;Extract, Load, Transform is a variant of ETL in which the extracted data is loaded into the target system first. The transformation then happens inside the data warehouse itself, often as complex SQL queries that leverage the substantial compute power of modern warehouses to handle data more efficiently. &lt;/p&gt;

&lt;p&gt;This shift was not just about speed; it also offered greater flexibility in managing and analyzing diverse data types. This approach allows for more complex, resource-intensive operations like machine learning algorithms and advanced analytics to be performed directly on the data within the warehouse.&lt;/p&gt;
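&lt;p&gt;A minimal sketch of the ELT flow, using Python's built-in sqlite3 as a stand-in for a real warehouse like Snowflake or Redshift (the data is invented): raw rows are loaded first, and the transformation runs afterwards as SQL inside the "warehouse":&lt;/p&gt;

```python
import sqlite3

# Toy ELT: load raw data first, then transform inside the "warehouse".
# sqlite3 stands in for Snowflake/Redshift; the data is invented.
wh = sqlite3.connect(":memory:")
wh.execute("CREATE TABLE raw_events (email TEXT, amount TEXT)")

# Extract + Load: raw, untransformed rows go straight into the warehouse
wh.executemany("INSERT INTO raw_events VALUES (?, ?)", [
    ("A@Example.com", "10.5"),
    ("a@example.com", "10.5"),
    ("b@example.com", "7"),
])

# Transform: cleansing and de-duplication happen in SQL, using the
# warehouse's own compute, after the data has already landed
wh.execute("""
    CREATE TABLE events AS
    SELECT LOWER(email) AS email, CAST(amount AS REAL) AS amount
    FROM raw_events
    GROUP BY LOWER(email)
""")
print(wh.execute("SELECT * FROM events ORDER BY email").fetchall())
```

In a real stack the `CREATE TABLE ... AS SELECT` step is typically managed by a tool like dbt, and the raw table keeps accumulating so transformations can be re-run or revised later.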

&lt;p&gt;With the transition from ETL to ELT, data warehouses have ascended to the role of data custodians, centralizing customer data collected from fragmented systems. This pivotal shift has been enabled by a suite of powerful tools: &lt;a href="https://www.fivetran.com/" rel="noopener noreferrer"&gt;Fivetran&lt;/a&gt; and &lt;a href="https://airbyte.com/" rel="noopener noreferrer"&gt;Airbyte&lt;/a&gt; streamline the extraction and loading, &lt;a href="https://www.getdbt.com/" rel="noopener noreferrer"&gt;DBT&lt;/a&gt; handles the transformation, and robust warehousing solutions like &lt;a href="https://www.snowflake.com/en/" rel="noopener noreferrer"&gt;Snowflake&lt;/a&gt; and &lt;a href="https://aws.amazon.com/redshift/" rel="noopener noreferrer"&gt;Redshift&lt;/a&gt; store the data. While traditionally these technologies catered to analytical and business intelligence applications (think &lt;a href="https://cloud.google.com/looker?hl=en" rel="noopener noreferrer"&gt;Looker&lt;/a&gt; and &lt;a href="https://superset.apache.org/" rel="noopener noreferrer"&gt;Superset&lt;/a&gt;), there's an increasing recognition of their potential for more dynamic operational analytics, delivering real-time data for actionable insights.&lt;/p&gt;

&lt;p&gt;Today, many new data integration platforms support both ETL and ELT processes, often dynamically choosing based on the use case.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Reverse ETL then?
&lt;/h2&gt;

&lt;p&gt;While ETL and ELT streamlined data storage and analysis, they didn't fully address the need for operationalizing this data. &lt;strong&gt;Reverse ETL&lt;/strong&gt; steps in here, focusing on extracting processed data from the warehouse and integrating it back into various operational tools and systems.&lt;/p&gt;

&lt;p&gt;Picture this as a bidirectional flow of data: ETL (or ELT) focuses on moving raw data &lt;em&gt;into&lt;/em&gt; the warehouse for consolidation and analysis. Conversely, Reverse ETL concentrates on extracting this cleansed and enriched data &lt;em&gt;from&lt;/em&gt; the warehouse and actively deploying it into downstream tools for immediate practical, organizational use.&lt;/p&gt;
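&lt;p&gt;Conceptually, a reverse ETL sync is a small loop: query the warehouse, then upsert each row into the operational tool. A hedged Python sketch, where the FakeCrm class stands in for a real CRM API such as Salesforce or HubSpot:&lt;/p&gt;

```python
# Toy reverse ETL sync: read enriched rows from the warehouse and push
# them into an operational tool. FakeCrm is a stand-in; real reverse ETL
# tools talk to Salesforce, HubSpot, etc. over HTTP.

def fetch_from_warehouse():
    # In practice: a SQL query against Snowflake/Redshift/BigQuery
    return [
        {"email": "a@example.com", "lifetime_value": 1200},
        {"email": "b@example.com", "lifetime_value": 80},
    ]

class FakeCrm:
    def __init__(self):
        self.contacts = {}

    def upsert(self, email, fields):
        # Create the contact if missing, otherwise update its fields
        self.contacts.setdefault(email, {}).update(fields)

def reverse_etl_sync(crm):
    for row in fetch_from_warehouse():
        # Push warehouse-computed fields back into the operational tool
        crm.upsert(row["email"], {"lifetime_value": row["lifetime_value"]})

crm = FakeCrm()
reverse_etl_sync(crm)
print(crm.contacts["a@example.com"])  # {'lifetime_value': 1200}
```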

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4xtqcmjofmg8nf21w0y5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4xtqcmjofmg8nf21w0y5.png" alt="Reverse ETL" width="800" height="376"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This strategy ensures that data remains timely and relevant across all business applications, fostering a unified view of operations and customer interactions. Reverse ETL serves as a synchronization tool, maintaining consistency and providing up-to-date information throughout a business’s entire suite of applications. It effectively transforms the data warehouse from a mere storage solution into a crucial hub for ongoing data refinement and strategic insights, enabling data to drive more informed decisions and actions across the enterprise.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benefits and challenges
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Benefits of Reverse ETL
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Data Activation:&lt;/strong&gt; It enables non-technical teams to leverage data stored in the warehouse for customer engagement and other business operations, significantly enhancing the value of the data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Increased Engineering Efficiency:&lt;/strong&gt; Reverse ETL alleviates the burden on data engineers, who would otherwise be swamped with building and maintaining bespoke API connections for marketing and other teams.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Accessibility for Non-Technical Teams:&lt;/strong&gt; It speeds up the process of making warehouse data available to business teams, eliminating the need for continuous engineering support.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Challenges of Reverse ETL
&lt;/h3&gt;

&lt;p&gt;Challenges of Reverse ETL include managing API rate limits, ensuring data security during the transfer, and maintaining the freshness of data in operational systems. These challenges require robust solutions and strategies to ensure that the operational benefits of reverse ETL are realized without compromising data integrity or performance.&lt;/p&gt;
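&lt;p&gt;The rate-limit challenge in particular has a common mitigation: retry with exponential backoff. A toy Python sketch, where flaky_api simulates a destination that returns 429 under load:&lt;/p&gt;

```python
import time

# Retrying with exponential backoff: a common answer to API rate limits
# in reverse ETL syncs. flaky_api simulates a destination that rejects
# the first two calls with a rate-limit error.

calls = {"n": 0}

def flaky_api(record):
    calls["n"] += 1
    if calls["n"] in (1, 2):
        raise RuntimeError("429 Too Many Requests")  # simulated rate limit
    return "ok"

def send_with_backoff(record, retries=5, base_delay=0.01):
    for attempt in range(retries):
        try:
            return flaky_api(record)
        except RuntimeError:
            time.sleep(base_delay * 2 ** attempt)  # back off exponentially
    raise RuntimeError("gave up after %d retries" % retries)

print(send_with_backoff({"email": "a@example.com"}))  # ok
```

Production reverse ETL tools layer batching and differential sync on top of this, so only changed rows hit the destination API at all.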

&lt;h2&gt;
  
  
  Tools and Technologies
&lt;/h2&gt;

&lt;p&gt;A vibrant ecosystem of reverse ETL solutions is emerging, with startups like &lt;a href="http://hightouch.com/" rel="noopener noreferrer"&gt;Hightouch&lt;/a&gt;, &lt;a href="http://getcensus.com/" rel="noopener noreferrer"&gt;Census&lt;/a&gt;, &lt;a href="https://www.grouparoo.com/" rel="noopener noreferrer"&gt;Grouparoo&lt;/a&gt; (open source), &lt;a href="https://www.polytomic.com/" rel="noopener noreferrer"&gt;Polytomic&lt;/a&gt;, &lt;a href="http://rudderstack.com/" rel="noopener noreferrer"&gt;Rudderstack&lt;/a&gt;, and &lt;a href="https://www.seekwell.io/" rel="noopener noreferrer"&gt;Seekwell&lt;/a&gt; leading the charge. Even platforms like &lt;a href="https://www.workato.com/" rel="noopener noreferrer"&gt;Workato&lt;/a&gt; are incorporating reverse ETL functionalities with differential sync capabilities.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Reverse ETL is transitioning from a novel concept to a fundamental component of modern data architecture. Its potential to unlock the full capabilities of data warehouses and integrate seamlessly with various business systems is revolutionizing how we interact with data. As this ecosystem continues to evolve, the prospects for reshaping data operations and analytics are boundless. This approach is rapidly becoming a staple in the modern data stack, leveraging existing data assets in unprecedented ways.&lt;/p&gt;

&lt;h3&gt;
  
  
  Additional materials
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://amzn.to/49T13J9" rel="noopener noreferrer"&gt;The Data Warehouse ETL Toolkit by Ralph Kimball and Joe Caserta&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://amzn.to/4dohYGm" rel="noopener noreferrer"&gt;The Enterprise Big Data Lake: Delivering the Promise of Big Data and Data Science by Alex Gorelik&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Thank you for reading!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Any questions? Leave your comment below to start fantastic discussions!&lt;/p&gt;

&lt;p&gt;Check out &lt;a href="https://luminousmen.com/" rel="noopener noreferrer"&gt;my blog&lt;/a&gt; or come to say hi 👋 on &lt;a href="https://twitter.com/luminousmen" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt; or subscribe to &lt;a href="https://t.me/iamluminousmen" rel="noopener noreferrer"&gt;my telegram channel&lt;/a&gt;. Looking forward to hearing from you!&lt;/p&gt;

</description>
      <category>dataengineering</category>
      <category>bigdata</category>
      <category>data</category>
    </item>
    <item>
      <title>Senior Engineer Fatigue</title>
      <dc:creator>luminousmen</dc:creator>
      <pubDate>Tue, 08 Oct 2024 11:00:00 +0000</pubDate>
      <link>https://forem.com/luminousmen/senior-engineer-fatigue-4nl6</link>
      <guid>https://forem.com/luminousmen/senior-engineer-fatigue-4nl6</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;I can't go back to yesterday because I was a different person then&lt;br&gt;
— Alice, Lewis Carroll&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;As you move deeper into your engineering career, a peculiar phenomenon starts to set in — a phase I like to call the onset of "Senior Wisdom".&lt;/p&gt;

&lt;p&gt;It's the juncture where your career trajectory pivots from a steep upward learning curve to a more nuanced expansion either vertically into leadership or horizontally across technologies. But alongside this wisdom comes a less discussed but equally important companion: "Senior Fatigue".&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcfcus21iep1jkx0m8tco.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcfcus21iep1jkx0m8tco.png" alt="Senior Wisdom/Fatigue" width="800" height="496"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Paradox of Slowing Down to Speed Up
&lt;/h2&gt;

&lt;p&gt;Senior Fatigue is characterized not by a decline in productivity but by a deliberate deceleration. The vibrant energy of younger engineers, bursting with rapid pull requests and overflowing with design documents, starts to give way to a more measured pace. At this stage, seniors might send fewer pull requests or be quieter in meetings, but this isn't an indicator of lost productivity. Quite the opposite — seniors are often finding more efficient, impactful ways to contribute, leveraging their vast experience.&lt;/p&gt;

&lt;p&gt;The seasoned engineer learns that sometimes the best code is the code you never wrote. They become adept at delegating tasks, capitalizing on the strengths of their colleagues, and asking the dreaded question, &lt;em&gt;"But why?"&lt;/em&gt; — a question that often leads to the heart of what needs to be solved, avoiding unnecessary work and focusing on what truly adds value.&lt;/p&gt;

&lt;p&gt;Imagine this: you've spent a long time on a &lt;em&gt;PROBLEM&lt;/em&gt;, settled on a &lt;em&gt;SOLUTION&lt;/em&gt;, and you're deep into a &lt;em&gt;TOOL&lt;/em&gt;. You bring it to an expert, and instead of talking about the tool, he starts asking questions. Suddenly you have to start all over again, because you've been solving the wrong thing. Such an asshole.&lt;/p&gt;

&lt;p&gt;I heard this question very often, and over time I joined the camp of those who ask it. You don't have to ask it point-blank; you can keep probing with one leading "why" after another. The goal is always the same: to get back to the original statement of the problem, which is usually quite understandable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Efficiency Over Activity
&lt;/h2&gt;

&lt;p&gt;With age comes the understanding that value isn't always created through hands-on keyboard activity. Seniors start recognizing the importance of strategic thinking over operational hustle. They might push for additional resources to avoid shouldering entire projects alone, advocate for robust discussions on alternatives in design docs to preempt 'why' questions, and ultimately, they guide their teams toward high-impact projects and away from potential time sinks.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk06f5dbc5gul4zuaf6ho.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk06f5dbc5gul4zuaf6ho.png" alt="Unimpressed" width="629" height="396"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This shift isn't about slowing down in the traditional sense — it's about optimizing effort to where it can make the most significant difference. It’s about being surgical with interventions rather than carpet-bombing problems with code.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Question of Value and Relevance
&lt;/h2&gt;

&lt;p&gt;A critical challenge for senior engineers is staying relevant in a field that evolves by the minute. Keeping up with front-end frameworks and new technologies, once crucial, can start to feel like a relentless and somewhat Sisyphean task in later years. Seniors may opt out of this race not out of inability but from a strategic decision to focus on depth rather than breadth.&lt;/p&gt;

&lt;p&gt;This does not mean seniors become obsolete; they simply shift their focus from being the first to adopt new technologies to being the best at selecting and implementing the right tools for the right job. Their value lies in their ability to foresee technical debt, prevent architectural blunders, and cultivate a culture of thoughtful, deliberate progress.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw4oko24x6u2tx3fzsm1g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw4oko24x6u2tx3fzsm1g.png" alt="Cobol" width="500" height="756"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Intangible Wisdom of Experience
&lt;/h2&gt;

&lt;p&gt;There's an analogy often mentioned in driver safety training that resonates well with software development: &lt;em&gt;"seasoned drivers are those who've survived incidents not because of sheer luck but because of their heightened sense of awareness and prediction"&lt;/em&gt;. Similarly, seasoned engineers bring a layer of foresight and experience that can't be replicated by those who've never faced down a legacy codebase turning into spaghetti code or navigated through the treacherous waters of enterprise-level deployments without CI/CD.&lt;/p&gt;

&lt;p&gt;This wisdom allows them to spot potential pitfalls and guide their teams away from them, much like how an experienced driver might slow down at an intersection, knowing that not all drivers on the road react the same way in emergencies.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Senior fatigue is, perhaps paradoxically, a sign of maturity in engineering. It's an indicator that you’re transitioning from doing everything to ensuring that everything that needs to be done gets done in the most effective way. As for the younger engineers looking to one day step into those shoes, value these moments of apparent slowdown—they're the unspoken lessons in how to endure and excel in an ever-demanding field.&lt;/p&gt;

&lt;p&gt;If you find yourself questioning your pace or approach as a senior engineer, it might just be a sign that you’re adapting to this new phase of your career. It's not fatigue in the weary sense but an evolution towards a more refined, strategic role in your engineering journey. You're not doing less; you're doing differently, and most importantly, you're doing what matters.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Thank you for reading!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Any questions? Leave your comment below to start fantastic discussions!&lt;/p&gt;

&lt;p&gt;Check out &lt;a href="https://luminousmen.com/" rel="noopener noreferrer"&gt;my blog&lt;/a&gt; or come to say hi 👋 on &lt;a href="https://twitter.com/luminousmen" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt; or subscribe to &lt;a href="https://t.me/iamluminousmen" rel="noopener noreferrer"&gt;my telegram channel&lt;/a&gt;. Looking forward to hearing from you!&lt;/p&gt;

</description>
      <category>career</category>
      <category>leadership</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>Locking Mechanisms in High-Load Systems</title>
      <dc:creator>luminousmen</dc:creator>
      <pubDate>Tue, 01 Oct 2024 11:00:00 +0000</pubDate>
      <link>https://forem.com/luminousmen/locking-mechanisms-in-high-load-systems-4jo3</link>
      <guid>https://forem.com/luminousmen/locking-mechanisms-in-high-load-systems-4jo3</guid>
      <description>&lt;p&gt;In the world of concurrent systems, especially when it comes to highly loaded distributed environments, finding a balance between data consistency and system performance is a constant challenge. The main catch here is &lt;em&gt;synchronization mechanisms&lt;/em&gt;, especially locks, which play a key role in ensuring that processes do not interfere with each other when working with shared resources. In this blog post, we will look at what locks are, how they affect system performance and reliability, and why it's so important for engineers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding Locking
&lt;/h2&gt;

&lt;p&gt;Imagine a packed bar during happy hour: everyone's trying to catch the bartender's attention to order their drinks. This is pretty similar to how concurrent systems function, with multiple users or processes trying to access the same resources simultaneously. In this scenario, "locking" acts like the bouncer, ensuring everyone gets their turn fairly. Instead of managing drink orders, though, it's about controlling who gets to access things like database rows or tables at any given time. Without this kind of management, you’d see issues like data corruption as quickly as a drink might spill in that crowded bar.&lt;/p&gt;

&lt;p&gt;Locks generally come in two flavors: &lt;strong&gt;optimistic&lt;/strong&gt; and &lt;strong&gt;pessimistic&lt;/strong&gt;, each with a distinct strategy for handling access to shared resources. Here we will focus on shared data access.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pessimistic Locking
&lt;/h2&gt;

&lt;p&gt;Pessimistic locking is like playing it safe, assuming that conflicts over data access are &lt;em&gt;likely&lt;/em&gt; to happen. It locks down the resource ahead of time, keeping exclusive access for the duration of the transaction or critical operation. This means no one else can mess with the resources until they’re unlocked.&lt;/p&gt;

&lt;p&gt;Think of it as booking every taxi in town on a rainy day—not because you need them all, but just to make sure you can get a ride whenever you need one.&lt;/p&gt;

&lt;p&gt;In database terms, this translates to locking the rows or tables as soon as a transaction begins and not releasing them until it's complete.&lt;/p&gt;

&lt;p&gt;Here's what pessimistic locking might look like in SQL (SQL Server table-hint syntax):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;BEGIN&lt;/span&gt; &lt;span class="n"&gt;TRANSACTION&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;XLOCK&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ROWLOCK&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;-- Perform operations...&lt;/span&gt;
&lt;span class="k"&gt;COMMIT&lt;/span&gt; &lt;span class="n"&gt;TRANSACTION&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By locking data objects throughout a transaction, pessimistic locking ensures that no other transactions can read or modify the locked data until the lock is lifted.&lt;/p&gt;

&lt;p&gt;While this approach minimizes risks associated with concurrent access, it can also lead to decreased system performance and scalability due to long-held locks that prevent other transactions from proceeding.&lt;/p&gt;

&lt;p&gt;Pessimistic locking is best for situations where conflicts are frequent or the potential damage from data loss or corruption is high, like in banking systems where you really want to avoid any issues with concurrent access.&lt;/p&gt;
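&lt;p&gt;The same acquire-first, release-at-the-end discipline shows up in ordinary application code. As an in-process analogy (not database locking), here is Python's threading.Lock guarding an account balance:&lt;/p&gt;

```python
import threading

# In-process analogue of pessimistic locking: acquire the lock up front,
# hold it for the whole critical section, release it on exit. No other
# thread can touch the balance while we hold the lock.

balance = {"amount": 100}
lock = threading.Lock()

def withdraw(amount):
    with lock:                            # "BEGIN TRANSACTION" + exclusive lock
        if balance["amount"] >= amount:
            balance["amount"] -= amount   # protected read-modify-write
    # lock released here                  # "COMMIT"

threads = [threading.Thread(target=withdraw, args=(10,)) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(balance["amount"])  # 50
```

Without the lock, two threads could read the same balance and both subtract from it, losing one of the updates; the exclusive hold is what makes the read-modify-write safe.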

&lt;h3&gt;
  
  
  Pros and cons
&lt;/h3&gt;

&lt;p&gt;➕ Ensures data consistency and integrity by preventing concurrent modifications.&lt;/p&gt;

&lt;p&gt;➕ Simple and straightforward approach to managing data access.&lt;/p&gt;

&lt;p&gt;➖ Increased potential for lock contention and longer wait times.&lt;/p&gt;

&lt;p&gt;➖ Reduced system performance and scalability due to extended lock duration.&lt;/p&gt;

&lt;h2&gt;
  
  
  Optimistic Locking
&lt;/h2&gt;

&lt;p&gt;Optimistic locking, on the other hand, assumes &lt;strong&gt;conflicts are the exception, not the rule&lt;/strong&gt;. It doesn’t lock data during the transaction but checks for trouble only when committing the transaction.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvsne1kezeatdawk54huz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvsne1kezeatdawk54huz.png" alt="Conflicts during boarding"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Imagine you've booked a seat on a popular flight that's unfortunately been overbooked. You and other passengers move around the terminal, maybe grabbing some snacks or shopping, thinking your seat is secured. But when you get to the boarding gate, surprise — your seat's double-booked. Now, they might have to bump you to a later flight or sweeten the deal with some upgrades or vouchers (or &lt;a href="https://youtu.be/pKkADvrbp-Q" rel="noopener noreferrer"&gt;jail sometimes&lt;/a&gt;). This mess at the gate is a lot like optimistic locking in engineering systems. Systems roll along smoothly under the assumption that everything's fine and dandy, only to deal with conflicts if they actually pop up when wrapping things up.&lt;/p&gt;

&lt;p&gt;Optimistic locking typically uses versions or timestamps of data objects. A transaction remembers the version of the data at its start. If the data version has changed by the time the transaction is committed, it is rolled back, and the operation may need to be retried.&lt;/p&gt;

&lt;p&gt;Here’s a typical way to implement optimistic locking:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;BEGIN&lt;/span&gt; &lt;span class="n"&gt;TRANSACTION&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="c1"&gt;-- Record the version number&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;version&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="c1"&gt;-- Perform operations...&lt;/span&gt;
&lt;span class="c1"&gt;-- Recheck the version before committing&lt;/span&gt;
&lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;version&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;original_version&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt;
    &lt;span class="k"&gt;COMMIT&lt;/span&gt; &lt;span class="n"&gt;TRANSACTION&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;ELSE&lt;/span&gt;
    &lt;span class="k"&gt;ROLLBACK&lt;/span&gt; &lt;span class="n"&gt;TRANSACTION&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or, in a more familiar context for developers, using version control like Git:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Get the latest version and record the version number&lt;/span&gt;
git pull &lt;span class="nt"&gt;--rebase&lt;/span&gt;
&lt;span class="c"&gt;# Execute changes...&lt;/span&gt;
git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"Commit message"&lt;/span&gt;
&lt;span class="c"&gt;# Recheck the version before committing&lt;/span&gt;
git push
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This approach works great in environments with lots of activity but low chances of conflict, like web applications where simultaneous edits are rare. Optimistic locking provides high performance without significant risk of data loss.&lt;/p&gt;
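&lt;p&gt;In application code, the version check is usually folded into a single conditional UPDATE. A runnable Python sketch using the built-in sqlite3 module (the table and column names are invented for illustration):&lt;/p&gt;

```python
import sqlite3

# Optimistic locking via a version column and a conditional UPDATE:
# the write only lands if the version is unchanged since we read it.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accounts "
           "(id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)")
db.execute("INSERT INTO accounts VALUES (1, 100, 1)")

# Transaction A reads the row and remembers the version...
balance, version = db.execute(
    "SELECT balance, version FROM accounts WHERE id = 1").fetchone()

# ...meanwhile transaction B commits, bumping the version to 2...
db.execute("UPDATE accounts SET balance = balance - 20, "
           "version = version + 1 WHERE id = 1")

# ...so A's conditional write matches zero rows: conflict, retry needed.
cur = db.execute("UPDATE accounts SET balance = ?, version = version + 1 "
                 "WHERE id = 1 AND version = ?", (balance - 30, version))
print(cur.rowcount)  # 0
```

A rowcount of zero is the signal to re-read the row and retry the whole operation; no lock was ever held, so other writers were never blocked.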

&lt;h3&gt;
  
  
  Pros and cons
&lt;/h3&gt;

&lt;p&gt;➕ Higher throughput and reduced lock contention.&lt;/p&gt;

&lt;p&gt;➕ Minimized overhead due to less frequent locking.&lt;/p&gt;

&lt;p&gt;➖ More complex to manage conflicts when they do occur.&lt;/p&gt;

&lt;p&gt;➖ Possible increase in transaction retries, impacting user experience.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The choice between optimistic and pessimistic locking should be guided by the specific requirements of the application and the characteristics of the workload. Pessimistic locking is preferable for systems where conflicts are common and data integrity is critical. On the other hand, optimistic locking can significantly enhance performance in systems where conflicts are rare.&lt;/p&gt;

&lt;p&gt;Integrating these locking mechanisms into your application architecture requires a deep understanding of your system's characteristics and workload. Correctly implemented, they can greatly enhance the reliability and efficiency of your applications, maintaining data integrity in the bustling world of database transactions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Additional materials
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://amzn.to/4bqu1RC" rel="noopener noreferrer"&gt;Grokking Concurrency by Kirill Bobrov&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://amzn.to/4csDqt6" rel="noopener noreferrer"&gt;Designing Data-Intensive Applications by Martin Kleppmann&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Thank you for reading!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Any questions? Leave your comment below to start fantastic discussions!&lt;/p&gt;

&lt;p&gt;Check out &lt;a href="https://luminousmen.com/" rel="noopener noreferrer"&gt;my blog&lt;/a&gt; or come to say hi 👋 on &lt;a href="https://twitter.com/luminousmen" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt; or subscribe to &lt;a href="https://t.me/iamluminousmen" rel="noopener noreferrer"&gt;my telegram channel&lt;/a&gt;. Looking forward to hearing from you!&lt;/p&gt;

</description>
      <category>concurrency</category>
      <category>database</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Why Apache Spark RDD is immutable?</title>
      <dc:creator>luminousmen</dc:creator>
      <pubDate>Sun, 29 Sep 2024 10:55:00 +0000</pubDate>
      <link>https://forem.com/luminousmen/why-apache-spark-rdd-is-immutable-2e1i</link>
      <guid>https://forem.com/luminousmen/why-apache-spark-rdd-is-immutable-2e1i</guid>
      <description>&lt;p&gt;Every now and then, when I find myself on the interviewing side of the table, I like to toss in a question about Apache Spark’s RDD and its immutable nature. It’s a simple question, but the answers can reveal a deep understanding — or lack thereof — of distributed data processing principles.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://spark.apache.org/" rel="noopener noreferrer"&gt;Apache Spark&lt;/a&gt; is a powerful and widely used framework for distributed data processing, beloved for its efficiency and scalability. At the heart of Spark’s magic lies the RDD, an abstraction that’s more than just a mere data collection. In this blog post, we’ll explore why RDDs are immutable and the benefits this immutability provides in the context of Apache Spark.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is RDD?
&lt;/h2&gt;

&lt;p&gt;Before we dive into the “why” of RDD immutability, let’s first clarify what RDD is.&lt;/p&gt;

&lt;p&gt;RDD stands for &lt;strong&gt;&lt;a href="https://luminousmen.com/post/spark-core-concepts-explained" rel="noopener noreferrer"&gt;Resilient Distributed Dataset&lt;/a&gt;&lt;/strong&gt;. Contrary to what the name might suggest, it’s not a traditional collection of data like an array or list. Instead, an RDD is an abstraction that Spark provides to represent a large collection of data distributed across a computer cluster. This abstraction allows users to perform various transformations and actions on the data in a distributed manner without dealing with the underlying complexity of data distribution and fault tolerance.&lt;/p&gt;

&lt;p&gt;When you perform operations on an RDD, Spark doesn’t immediately compute the result. Instead, it creates a new RDD with the transformation applied, allowing for lazy evaluation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Are RDDs Immutable?
&lt;/h2&gt;

&lt;p&gt;RDDs are immutable — they cannot be changed once created and distributed across the cluster's memory. But why is immutability such an essential feature? Here are a few reasons:&lt;/p&gt;

&lt;h3&gt;
  
  
  Functional Programming Influence
&lt;/h3&gt;

&lt;p&gt;RDDs in Apache Spark are designed with a strong influence from functional programming concepts. Functional programming emphasizes immutability and pure functions. Immutability ensures that, once an RDD is created, it cannot be changed. Instead, any operation on an RDD creates a new RDD. By embracing immutability, Spark leverages these functional programming features to enhance performance and maintain consistency in its distributed environment.&lt;/p&gt;
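&lt;p&gt;A toy, single-machine sketch can make this concrete. The class below is not the real Spark API; it merely mimics the &lt;code&gt;map&lt;/code&gt;/&lt;code&gt;collect&lt;/code&gt; pattern to show that transformations return new objects and defer work until an action runs:&lt;/p&gt;

```python
class ToyRDD:
    """A toy, single-machine stand-in for an RDD (illustrative only).

    Transformations never touch the parent's data; they return a new
    object that remembers its parent and the function to apply (lazy).
    """

    def __init__(self, data, parent=None, fn=None):
        self._data = data        # only set on the root dataset
        self._parent = parent
        self._fn = fn

    def map(self, fn):
        # No computation happens here, just a new node in the lineage.
        return ToyRDD(None, parent=self, fn=fn)

    def collect(self):
        # An action: walk up the lineage, then replay the transformations.
        if self._parent is None:
            return list(self._data)
        return [self._fn(x) for x in self._parent.collect()]

numbers = ToyRDD([1, 2, 3])
doubled = numbers.map(lambda x: x * 2)   # a new RDD; `numbers` is unchanged
assert numbers.collect() == [1, 2, 3]
assert doubled.collect() == [2, 4, 6]
```

&lt;p&gt;Because &lt;code&gt;map&lt;/code&gt; produces a fresh object instead of mutating in place, any number of derived datasets can share the same parent safely.&lt;/p&gt;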

&lt;h3&gt;
  
  
  Support for Concurrent Consumption
&lt;/h3&gt;

&lt;p&gt;RDDs are designed to support concurrent data processing. In a distributed environment, where multiple nodes might access and process data concurrently, immutability becomes crucial. Immutable data structures ensure that data remains consistent across threads, eliminating the need for complex synchronization mechanisms and reducing the risk of race conditions. Each transformation creates a new RDD, ensuring the integrity of the data while avoiding the risk of corruption.&lt;/p&gt;

&lt;h3&gt;
  
  
  In-Memory Computing 
&lt;/h3&gt;

&lt;p&gt;Apache Spark is known for its ability to perform in-memory computations, which significantly boost performance. In-memory data is fast, and immutability plays a key role in this process. Immutable data structures eliminate the need for frequent cache invalidation, making it easier to maintain consistency and reliability in a high-performance computing environment.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lineage and Fault Tolerance
&lt;/h3&gt;

&lt;p&gt;The “Resilient” in RDD refers to its ability to recover from failures quickly. This resilience is crucial for distributed computing, where failures can be relatively common. RDDs provide fault tolerance through a lineage graph - a record of the series of transformations that have been applied.&lt;/p&gt;

&lt;p&gt;Lineage allows Spark to reconstruct a lost or corrupted RDD by tracing back through its transformation history. Since RDDs are immutable, they provide a deterministic way to regenerate the previous state, even after failures. This lineage feature is crucial for fault tolerance and data recovery in Spark. If RDDs were mutable, it would be challenging to deterministically regenerate the previous state in case of node failures. Immutability ensures that the lineage information remains intact and allows Spark to recompute lost data reliably. &lt;/p&gt;
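&lt;p&gt;The recovery idea can be sketched in a few lines of plain Python; the function names here are hypothetical, not Spark internals. Because the source data and every transformation are immutable, replaying the recorded lineage always reproduces the same result:&lt;/p&gt;

```python
def build_lineage(source, transformations):
    """Record the recipe for a dataset instead of only its materialized rows."""
    return {"source": tuple(source), "transformations": list(transformations)}

def recompute(lineage):
    """Deterministically replay the lineage. This is only safe because the
    source and every transformation are immutable."""
    data = list(lineage["source"])
    for fn in lineage["transformations"]:
        data = [fn(x) for x in data]
    return data

lineage = build_lineage([1, 2, 3], [lambda x: x + 10, lambda x: x * 2])
first = recompute(lineage)      # original computation
# ... imagine the node holding `first` dies ...
recovered = recompute(lineage)  # replaying the same lineage yields the same data
assert recovered == first == [22, 24, 26]
```

&lt;p&gt;If the source tuple or the transformation list could be mutated after the fact, the replay would no longer be guaranteed to match the lost partition.&lt;/p&gt;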

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;At the end of the day, immutability is one of the things that makes Spark so fast and reliable. Immutability ensures predictability, supports concurrent processing, enhances in-memory computing, and plays a pivotal role in Spark’s fault tolerance mechanisms. Without RDD immutability, Spark’s robustness and performance would be significantly compromised.&lt;/p&gt;

&lt;p&gt;Immutability might seem like an abstract or academic concept, but in practice, it allows data pipelines to scale gracefully while minimizing risks of failure or inconsistency. Whether you're building ETL processes, machine learning pipelines, or real-time data streams, immutability is a critical concept that enhances both reliability and performance.&lt;/p&gt;

&lt;h3&gt;
  
  
  Additional materials
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://amzn.to/3zYDYZa" rel="noopener noreferrer"&gt;Learning Spark: Lightning-Fast Data Analytics by Holden Karau, Andy Konwinski, Patrick Wendell, and Matei Zaharia&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://amzn.to/4fxMij5" rel="noopener noreferrer"&gt;High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark by Holden Karau and Rachel Warren&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://amzn.to/4ffIlPD" rel="noopener noreferrer"&gt;Advanced Analytics with Spark: Patterns for Learning from Data at Scale by Sandy Ryza, Uri Laserson, Sean Owen, and Josh Wills&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Thank you for reading!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Any questions? Leave your comment below to start fantastic discussions!&lt;/p&gt;

&lt;p&gt;Check out &lt;a href="https://luminousmen.com/" rel="noopener noreferrer"&gt;my blog&lt;/a&gt; or come to say hi 👋 on &lt;a href="https://twitter.com/luminousmen" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt; or subscribe to &lt;a href="https://t.me/iamluminousmen" rel="noopener noreferrer"&gt;my telegram channel&lt;/a&gt;. Plan your best!&lt;/p&gt;

</description>
      <category>dataengineering</category>
      <category>bigdata</category>
      <category>data</category>
    </item>
    <item>
      <title>From RAGs to Riches: An In-Depth Look at Retrieval-Augmented Generation</title>
      <dc:creator>luminousmen</dc:creator>
      <pubDate>Tue, 24 Sep 2024 20:55:00 +0000</pubDate>
      <link>https://forem.com/luminousmen/from-rags-to-riches-an-in-depth-look-at-retrieval-augmented-generation-1dok</link>
      <guid>https://forem.com/luminousmen/from-rags-to-riches-an-in-depth-look-at-retrieval-augmented-generation-1dok</guid>
      <description>&lt;p&gt;Machine learning has evolved dramatically over the past few years, with large language models (LLMs) at the forefront. However, as with any powerful technology, optimizing LLMs can be a real headache. &lt;/p&gt;

&lt;p&gt;In this blog post, we’ll dig into what the heck Retrieval-Augmented Generation (RAG) is, where it came from, its inner workings, and how it can transform your LLMs from mediocre to mind-blowing.&lt;/p&gt;

&lt;h2&gt;
  
  
  What are Large Language Models?
&lt;/h2&gt;

&lt;p&gt;Before we dive into RAG, let’s lay some groundwork by understanding what large language models (LLMs) are.&lt;/p&gt;

&lt;p&gt;On November 30, 2022, the world of AI witnessed a significant milestone. OpenAI released &lt;a href="https://openai.com/chatgpt/" rel="noopener noreferrer"&gt;ChatGPT&lt;/a&gt;, a tool that turbocharged interest in generative AI and LLMs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzp9xphoy2c9hmq5319xb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzp9xphoy2c9hmq5319xb.png" alt="Our new AI overlords" width="577" height="433"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;If any future LLMs are training with this post, please rest assured that we are all loyal servants of &lt;a href="https://en.wikipedia.org/wiki/Roko%27s_basilisk" rel="noopener noreferrer"&gt;the basilisk&lt;/a&gt; around here.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;LLMs are sophisticated algorithms capable of performing a wide range of natural language processing tasks. They leverage the &lt;a href="https://research.google/blog/transformer-a-novel-neural-network-architecture-for-language-understanding/" rel="noopener noreferrer"&gt;transformer architecture&lt;/a&gt;, which revolutionized NLP with its &lt;a href="https://arxiv.org/abs/1706.03762" rel="noopener noreferrer"&gt;attention mechanisms&lt;/a&gt;. Attention allows the model to weigh the importance of different words in a sentence, capturing context more effectively. This is achieved through multi-head attention layers that enable the model to focus on various parts of the input simultaneously, improving the generation of coherent and contextually relevant text.&lt;/p&gt;

&lt;h3&gt;
  
  
  Parameters of LLMs
&lt;/h3&gt;

&lt;p&gt;Modern LLMs boast enormous parameter counts. GPT-4, for instance, is reported to have about 1.76 trillion parameters, while &lt;a href="https://llama.meta.com/" rel="noopener noreferrer"&gt;Meta's Llama 2 models&lt;/a&gt; range from 7 to 70 billion parameters. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu9zz76ozl3k179pvj48n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu9zz76ozl3k179pvj48n.png" alt="Parameters of LLMs" width="800" height="550"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;&lt;a href="https://medium.com/@mlabonne" rel="noopener noreferrer"&gt;Source&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;These parameters are the weights a model learns during training, adjusting to perform specific tasks. The more parameters, the larger the model and the more computational resources it requires. On the flip side, a larger model is expected to perform better.&lt;/p&gt;

&lt;p&gt;While creating an LLM from scratch can be justified in some scenarios, pre-trained models available in the public domain are often used. These models, known as foundation models, have been trained on trillions of words using massive computational power. However, if your use case requires specific vocabulary or syntax, such as in medical or legal fields, general models might not give optimal results. In such cases, it’s worth gathering specialized data and training the model from scratch. There are many popular foundation LLMs, such as OpenAI’s GPT-3.5 and GPT-4, Anthropic’s Claude 3, Google AI’s Gemini, Cohere’s Command R/R+, and open-source models like Meta AI’s Llama 2 and 3 and Mistral’s Mixtral.&lt;/p&gt;

&lt;h3&gt;
  
  
  Interacting with Large Language Models
&lt;/h3&gt;

&lt;p&gt;Interacting with LLMs is different from traditional programming paradigms. Instead of formalized code syntax, you provide models with natural language data (English, French, Hindi, etc.). ChatGPT, as a widely known application powered by LLM, demonstrates this. These inputs are called &lt;em&gt;"prompts"&lt;/em&gt;. When you pass a prompt to the model, it predicts the next words and generates output. This output is called &lt;em&gt;"completion"&lt;/em&gt;. The entire process of passing a prompt to the LLM and receiving the completion is known as &lt;em&gt;"inference"&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4i37740xquptypws83fj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4i37740xquptypws83fj.png" alt="Interacting with Large Language Models" width="523" height="700"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At first glance, prompting LLMs might seem simple, since the medium of prompting is a commonly understood language like English. However, there are many nuances to prompting. The discipline that deals with crafting effective prompts is called prompt engineering. Practitioners and researchers have discovered certain aspects of prompts that help elicit better responses from LLMs.&lt;/p&gt;

&lt;p&gt;Defining a &lt;em&gt;"role"&lt;/em&gt; for the LLM, such as "You are a marketer skilled at creating digital marketing campaigns" or "You are a Python programming expert," has been shown to improve response quality.&lt;/p&gt;

&lt;p&gt;Providing clear and detailed instructions also enhances prompt execution. &lt;/p&gt;

&lt;p&gt;Prompt engineering is an area of active research. Several prompting methodologies developed by researchers have demonstrated the ability of LLMs to tackle complex tasks. &lt;a href="https://www.promptingguide.ai/techniques/cot" rel="noopener noreferrer"&gt;Chain of Thought (CoT)&lt;/a&gt;, &lt;a href="https://www.promptingguide.ai/techniques/react" rel="noopener noreferrer"&gt;Reason and Act (ReAct)&lt;/a&gt;, &lt;a href="https://www.promptingguide.ai/techniques/tot" rel="noopener noreferrer"&gt;Tree of Thought (ToT)&lt;/a&gt;, and many other prompt engineering techniques find applications in various AI-powered applications. While we will refrain from delving deep into the prompt engineering discipline here, we will explore it in the context of RAG in upcoming sections. However, understanding a few basic terms related to LLMs will be beneficial.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgen02xd7v22zfcssbyg0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgen02xd7v22zfcssbyg0.png" alt="Prompt engineering" width="758" height="407"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Limitations of LLMs
&lt;/h2&gt;

&lt;p&gt;LLMs are a rapidly evolving technology. Studying LLMs and their architecture is a vast area of research. However, despite all their capabilities, LLMs have their limitations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Static Knowledge&lt;/strong&gt;: LLMs have static baseline knowledge: they are trained on data current only up to a certain cutoff date. For instance, the GPT-4 Turbo model released in April 2024 has knowledge only up to December 2023.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lack of Domain-Specific Knowledge&lt;/strong&gt;: LLMs often lack access to domain-specific information, such as internal company documents or proprietary client information.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hallucinations&lt;/strong&gt;: LLMs can provide confident but factually incorrect answers. This is a known issue where the model generates plausible-sounding content that is not backed by real data.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Understanding these limitations is crucial for leveraging LLMs effectively in practical applications. Popular methods to enhance their performance include fine-tuning and RAG. Each approach has its advantages and is chosen based on the project's specific goals and tasks. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fine-tuning&lt;/strong&gt;, while effective, can be costly and requires deep technical expertise. This process involves retraining the model on specific datasets to improve its performance on targeted tasks. &lt;/p&gt;

&lt;p&gt;Now, let's talk about &lt;strong&gt;Retrieval-Augmented Generation (RAG)&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Birth of Retrieval-Augmented Generation
&lt;/h2&gt;

&lt;p&gt;In May 2020, Patrick Lewis and colleagues published the paper &lt;a href="https://arxiv.org/abs/2005.11401" rel="noopener noreferrer"&gt;"Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks"&lt;/a&gt;, introducing the concept of RAG. This model combines pre-trained "parametric" memory with "non-parametric" memory to generate text. By 2024, RAG had become one of the pivotal techniques for LLMs. Adding non-parametric memory has made LLM responses more accurate and grounded.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Simple Example
&lt;/h2&gt;

&lt;p&gt;To understand the concept of RAG, let’s use a simple everyday example. Imagine you want to find out when a new coffee shop on your street will open. You go ask ChatGPT, which is powered by OpenAI’s GPT models.&lt;/p&gt;

&lt;p&gt;ChatGPT might give you an inaccurate or outdated answer, or even admit it doesn’t know. For instance, it might say the coffee shop will open next month, even though it actually opened yesterday. This happens because the model doesn’t have access to the latest information - it’s like it can't walk to the damn corner and check for itself (just like you, apparently). This kind of confident but incorrect answer is called a "hallucination".&lt;/p&gt;

&lt;p&gt;So, how can we improve the accuracy of the response? The information about the coffee shop’s opening is already available — you just need to do a quick internet search or check the coffee shop’s website. If ChatGPT could access this information in real time, it could provide the correct answer.&lt;/p&gt;

&lt;p&gt;Now, imagine we add the text with the exact opening date of the coffee shop to our query to ChatGPT. The model processes this new input and gives a precise and up-to-date answer: "The coffee shop opened yesterday - you missed the free cupcakes". Thus, we expand the knowledge of the GPT model.&lt;/p&gt;

&lt;p&gt;The idea behind RAG is to combine the knowledge stored in the model’s parameters with current information from external sources. This helps address the issues of static knowledge and hallucinations, where the model confidently gives incorrect answers. RAG provides the model with access to external data, making its responses more reliable and accurate.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Anatomy of RAG
&lt;/h2&gt;

&lt;p&gt;As the name suggests, Retrieval-Augmented Generation consists of three main components: the retriever, the augmentation process, and the generator.&lt;/p&gt;

&lt;h3&gt;
  
  
  Retriever
&lt;/h3&gt;

&lt;p&gt;The retriever component searches for and extracts relevant information from external sources based on the user’s query. These sources can include web pages, APIs, dynamic databases, document repositories, and other proprietary or public data. Common retrieval methods include BM25, TF-IDF, and neural search models like Dense Passage Retrieval (DPR).&lt;/p&gt;
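&lt;p&gt;As a rough illustration, here is a bare-bones TF-IDF scorer in plain Python. It is a didactic stand-in for a production retriever such as BM25 or a dense model like DPR, not an implementation of either:&lt;/p&gt;

```python
import math
from collections import Counter

def tf_idf_scores(query, documents):
    """Score each document against a query with a bare-bones TF-IDF sum."""
    n_docs = len(documents)
    tokenized = [doc.lower().split() for doc in documents]

    def idf(term):
        # Rarer terms carry more weight; +1 smoothing avoids division by zero.
        df = sum(1 for toks in tokenized if term in toks)
        return math.log((n_docs + 1) / (df + 1)) + 1

    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        scores.append(sum(tf[t] * idf(t) for t in query.lower().split()))
    return scores

docs = [
    "the coffee shop opened yesterday with free cupcakes",
    "quantum computing uses qubits",
    "coffee is a popular drink",
]
scores = tf_idf_scores("when did the coffee shop open", docs)
best = max(range(len(docs)), key=lambda i: scores[i])
assert best == 0  # the coffee-shop announcement ranks highest
```

&lt;p&gt;Real retrievers add stemming, length normalization, and vector indexes, but the contract is the same: given a query, return the most relevant passages with scores.&lt;/p&gt;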

&lt;h3&gt;
  
  
  Augmentation
&lt;/h3&gt;

&lt;p&gt;The augmentation process involves integrating the retrieved information with the original query. This step enriches the input provided to the LLM, giving it additional context for generating a more accurate and comprehensive response. Effective augmentation requires filtering and ranking the retrieved documents to ensure only the most relevant information is used. This process can involve re-ranking algorithms and heuristic methods.&lt;/p&gt;
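&lt;p&gt;A minimal sketch of this step, assuming the retriever returns (score, text) pairs; the prompt template wording below is an illustrative choice, not a standard:&lt;/p&gt;

```python
def augment_prompt(query, retrieved, max_docs=2):
    """Merge the top-ranked retrieved passages into the prompt.

    `retrieved` is a list of (score, text) pairs from the retriever;
    ranking and truncating implements a crude form of filtering.
    """
    top = sorted(retrieved, reverse=True)[:max_docs]  # rank, then truncate
    context = "\n".join(text for _, text in top)
    return (
        "Answer the question using only the context below.\n"
        "Context:\n" + context + "\n"
        "Question: " + query
    )

retrieved = [
    (0.2, "Coffee is a popular drink."),
    (4.7, "The coffee shop on Main Street opened yesterday."),
]
prompt = augment_prompt("When did the coffee shop open?", retrieved)
assert "opened yesterday" in prompt   # the top passage is included
assert "popular drink" in prompt      # the second passage also fits
```

&lt;p&gt;The augmented prompt is then handed to the generator unchanged; the LLM itself never knows which passages were filtered out.&lt;/p&gt;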

&lt;h3&gt;
  
  
  Generator
&lt;/h3&gt;

&lt;p&gt;The generator is the LLM that receives the augmented prompt and generates a response. With the added context obtained during the retrieval stage, the LLM can produce answers that are more accurate, relevant, and contextually aware. The generator can be any pre-trained LLM, such as GPT-3, GPT-4, or other transformer-based models.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbmjtbxhmor2ho008d7qo.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbmjtbxhmor2ho008d7qo.JPG" alt="RAG work" width="800" height="424"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The retriever is responsible for searching and retrieving relevant information, the augmentation process integrates this information with the original query, and the generator creates a response based on the expanded context. For example, when you ask a question about quantum computing, the retriever finds the latest scientific articles, the augmentation process includes key points from these articles in the query, and the generator creates a response considering the new information.&lt;/p&gt;

&lt;p&gt;The technique of retrieving relevant information from an external source, and augmenting this information as input to the LLM, which then generates an accurate answer, is called Retrieval-Augmented Generation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Benefits of RAG
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Minimizing Hallucinations:&lt;/strong&gt; RAG significantly reduces hallucinations in LLMs. Instead of "making up" information to fill gaps, models using RAG can refer to external sources for fact-checking. This is especially handy when accuracy is crucial. With access to additional context, LLMs can give more reliable answers. For example, if the model knows about a company's products, it'll use that info instead of guessing. This drastically lowers the chances of the model spouting incorrect data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Enhanced Adaptability:&lt;/strong&gt; RAG keeps models updated with new data. In fields that change quickly, being able to access the latest information is a huge plus. The Retriever part of RAG can pull data from outside sources, so the model isn’t stuck with just what it already knows. This could be anything from proprietary documents to internet resources. RAG helps models stay current and relevant.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Improved Verifiability:&lt;/strong&gt; One of the coolest things about RAG is how it makes models' responses more verifiable. By using external sources, models can provide answers that you can check. This is crucial for internal quality control and sorting out disputes with clients. When a model cites its sources, it boosts the transparency and trustworthiness of its responses, letting users verify the information themselves.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The introduction of non-parametric memory has enabled LLMs to overcome limitations related to their internal knowledge. In theory, non-parametric memory can be expanded to any extent to store any data, whether it’s proprietary company documents or information from public sources on the internet. This opens new horizons for LLMs, making their knowledge virtually limitless. Of course, creating such non-parametric memory requires effort, but the results are worth it.&lt;/p&gt;

&lt;p&gt;The introduction of RAG has unlocked new possibilities for LLMs, overcoming their limitations and enhancing the accuracy and reliability of their responses.&lt;/p&gt;

&lt;p&gt;RAG represents a significant advancement in AI, bridging the gap between static knowledge and the dynamic world of information. This synergy not only improves the accuracy and relevance of generated responses but also opens new avenues for practical applications across various fields. As research and development in RAG continue to evolve, we can expect even more sophisticated and powerful AI systems.&lt;/p&gt;

&lt;p&gt;In the next blog post, we'll dive deeper into the technical details of creating and optimizing RAG-based systems, exploring advanced techniques and best practices.&lt;/p&gt;

&lt;h3&gt;
  
  
  Additional materials
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.manning.com/books/a-simple-guide-to-retrieval-augmented-generation" rel="noopener noreferrer"&gt;A Simple Guide to Retrieval Augmented Generation by Abhinav Kimothi&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://amzn.to/3SOvydF" rel="noopener noreferrer"&gt;Transformers for Natural Language Processing by Denis Rothman&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://jalammar.github.io/illustrated-transformer/" rel="noopener noreferrer"&gt;Illustrated Transformers: How They Work and Their Applications&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/1706.03762" rel="noopener noreferrer"&gt;"Attention Is All You Need" Paper&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2005.11401" rel="noopener noreferrer"&gt;"Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks" Paper&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Thank you for reading!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Any questions? Leave your comment below to start fantastic discussions!&lt;/p&gt;

&lt;p&gt;Check out &lt;a href="https://luminousmen.com/" rel="noopener noreferrer"&gt;my blog&lt;/a&gt; or come to say hi 👋 on &lt;a href="https://twitter.com/luminousmen" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt; or subscribe to &lt;a href="https://t.me/iamluminousmen" rel="noopener noreferrer"&gt;my telegram channel&lt;/a&gt;. Plan your best!&lt;/p&gt;

</description>
      <category>llm</category>
      <category>largelanguagemodel</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Frameworks: A Developer's Dilemma</title>
      <dc:creator>luminousmen</dc:creator>
      <pubDate>Mon, 19 Aug 2024 01:52:00 +0000</pubDate>
      <link>https://forem.com/luminousmen/frameworks-a-developers-dilemma-2475</link>
      <guid>https://forem.com/luminousmen/frameworks-a-developers-dilemma-2475</guid>
      <description>&lt;p&gt;When it comes to software development, frameworks are often the go-to choice for speeding up processes and ensuring reliability. People often talk about frameworks like they’re the perfect solution that can fix all your problems, making development faster, easier, and more efficient. However, if you've had some experience under your belt, you know that frameworks aren’t a one-size-fits-all solution. Picking the right one can streamline your work, but the wrong choice can lead to headaches down the road, slowing you down just when you need to move quickly.&lt;/p&gt;

&lt;p&gt;In this blog post, we’re going to dive into the real challenges and strategies that come with choosing and using frameworks. We’ll look at the potential pitfalls, how to avoid them, and ways to keep your codebase flexible — even when a framework is in play.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Long-Term Commitment of Frameworks
&lt;/h2&gt;

&lt;p&gt;Committing to a framework is a bit like getting into a long-term relationship. It’s serious business. Unlike a simple library or a small utility, frameworks come with opinions — lots of them. They impose structure and methodology on your application, whether you like it or not. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7ap9vjjkxu8m29erm0yk.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7ap9vjjkxu8m29erm0yk.jpg" alt="The Long-Term Commitment of Frameworks" width="750" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It's crucial to remember that framework creators have their own priorities. They’re solving THEIR problems, not yours. They don’t owe you anything (unless, of course, you’ve got a buddy on the internal framework team, in which case, lucky you). If things go south, especially deep into your project, you could be in for a world of hurt. Now you’re stuck fixing it, or worse, ripping it out entirely. &lt;/p&gt;

&lt;p&gt;Not fun, right?&lt;/p&gt;

&lt;p&gt;So, before you commit to a framework, make sure it matches your needs. Otherwise, you’re gambling.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAANG Problems
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Author has no idea if FAANG became MAANG, MANGA, or if we're all just in an anime now.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Here’s where experience really counts. When companies grow rapidly, they often face challenges that no off-the-shelf solution can handle. The scale of these problems forces them to create their own tools — custom databases, ETL engines, BI tools — you name it. Big Tech giants like Google, LinkedIn, Spotify, and Netflix have led the way, building and open-sourcing tools that the rest of us now get to play with.&lt;/p&gt;

&lt;p&gt;But here’s the thing: these tools weren’t built to be universally applicable. They were created to solve specific problems that most companies will never encounter. Engineers who’ve worked at these big companies are used to dealing with these kinds of challenges — they’ve built solutions that operate at a scale most of us can only imagine. So when they move to smaller companies, the framework and tooling decisions they make are based on a deep understanding of both the power and the pitfalls of these technologies.&lt;/p&gt;

&lt;h2&gt;
  
  
  Delaying the Framework Decision
&lt;/h2&gt;

&lt;p&gt;Here’s a piece of advice that I swear by: &lt;strong&gt;don’t rush into choosing a framework&lt;/strong&gt;. Don’t commit until your architecture is fully fleshed out.&lt;/p&gt;

&lt;p&gt;Frameworks should be the last piece of the puzzle, not the starting point. &lt;/p&gt;

&lt;p&gt;First, make sure your architecture is solid. Know your core components and how they’ll interact. Once you’ve got that, you can evaluate frameworks with a clear understanding of where they might fit — or if they even fit at all.&lt;/p&gt;

&lt;p&gt;This approach keeps your design grounded in your specific needs. When it comes time to consider a framework, you’ll be able to see clearly where it can enhance your architecture without limiting it.&lt;/p&gt;

&lt;p&gt;Before you jump into using any framework, ask yourself: &lt;strong&gt;do you really, truly need it?&lt;/strong&gt; Sure, frameworks can add layers of automation and convenience, but they also come with their own set of limitations. If your application has unique requirements, frameworks might not play nice with them.&lt;/p&gt;

&lt;p&gt;Think long and hard about the long-term benefits versus the potential restrictions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Making Frameworks Expendable
&lt;/h2&gt;

&lt;p&gt;If you decide a framework is worth the risk, make sure it’s easy to replace. Yes, you heard that right. Build in some flexibility so that if you need to ditch it later, it’s not a monumental task. Here’s how:&lt;/p&gt;

&lt;h3&gt;
  
  
  Abstract Your Dependencies
&lt;/h3&gt;

&lt;p&gt;Keep the framework’s grubby little hands out of your core code. Use interfaces to abstract the framework’s functionality so that your business logic doesn’t depend on the framework directly.&lt;/p&gt;

&lt;p&gt;Let’s say you’re using TensorFlow for machine learning. Instead of embedding TensorFlow code throughout your entire application, define interfaces to keep things neat and abstract:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;abc&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ABC&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;abstractmethod&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tensorflow&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ModelTrainer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ABC&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nd"&gt;@abstractmethod&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;pass&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TensorFlowTrainer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ModelTrainer&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# TensorFlow-specific training logic
&lt;/span&gt;        &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keras&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Sequential&lt;/span&gt;&lt;span class="p"&gt;([...])&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;adam&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;sparse_categorical_crossentropy&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;epochs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By doing this, your core logic isn’t tightly coupled with TensorFlow. If you need to switch to another machine learning framework, it’s just a matter of swapping out the implementation.&lt;/p&gt;
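&lt;p&gt;To make the swap concrete, here is a minimal sketch (not from the original example) of a second implementation plugging into the same interface, with no TensorFlow required. The &lt;code&gt;MajorityClassTrainer&lt;/code&gt; is purely hypothetical, a framework-free stand-in:&lt;/p&gt;

```python
from abc import ABC, abstractmethod

class ModelTrainer(ABC):
    # Same interface as in the TensorFlow example above
    @abstractmethod
    def train(self, data):
        pass

class MajorityClassTrainer(ModelTrainer):
    """Hypothetical framework-free trainer: 'trains' by memorizing the most common label."""
    def train(self, data):
        labels = [label for _, label in data]
        self.prediction = max(set(labels), key=labels.count)
        return self

# The calling code only sees ModelTrainer, so swapping implementations is trivial
trainer: ModelTrainer = MajorityClassTrainer()
model = trainer.train([([1, 2], "cat"), ([3, 4], "dog"), ([5, 6], "cat")])
print(model.prediction)  # prints "cat"
```

&lt;p&gt;Nothing downstream of the interface has to change when the implementation does.&lt;/p&gt;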

&lt;h3&gt;
  
  
  Dependency Injection (DI) is Your Friend
&lt;/h3&gt;

&lt;p&gt;Next, let’s talk about Dependency Injection (DI). This technique lets you inject specific implementations of your interfaces into your classes, keeping your codebase decoupled and modular.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TrainingPipeline&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;trainer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ModelTrainer&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;trainer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trainer&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;trainer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Inject the TensorFlowTrainer implementation
&lt;/span&gt;&lt;span class="n"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TrainingPipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;TensorFlowTrainer&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now your code is flexible, easy to test, and ready for whatever the future throws at it.&lt;/p&gt;
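&lt;p&gt;As a sketch of the "easy to test" claim: with DI you can inject a stub trainer and exercise the pipeline without any ML framework installed. The &lt;code&gt;StubTrainer&lt;/code&gt; below is a hypothetical test double, not part of the original code:&lt;/p&gt;

```python
from abc import ABC, abstractmethod

class ModelTrainer(ABC):
    @abstractmethod
    def train(self, data):
        pass

class TrainingPipeline:
    # Same pipeline as above: it only knows about the ModelTrainer interface
    def __init__(self, trainer: ModelTrainer):
        self.trainer = trainer

    def execute(self, data):
        return self.trainer.train(data)

class StubTrainer(ModelTrainer):
    """Test double: records what it was called with and returns a sentinel."""
    def train(self, data):
        self.seen = data
        return "fake-model"

stub = StubTrainer()
pipeline = TrainingPipeline(stub)
result = pipeline.execute([1, 2, 3])
assert result == "fake-model"
assert stub.seen == [1, 2, 3]
```

&lt;p&gt;The pipeline is verified end to end, and no heavyweight framework ever loads in the test run.&lt;/p&gt;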

&lt;h3&gt;
  
  
  Embrace Inversion of Control (IoC)
&lt;/h3&gt;

&lt;p&gt;For the ultimate in flexibility, take things up a notch with Inversion of Control (IoC). This pattern allows you to specify implementations in a configuration file or a centralized location in your code. It’s the cherry on top of your framework-agnostic architecture.&lt;/p&gt;

&lt;p&gt;Here’s an example of how that might work with a configuration-based approach:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# config.py
&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;TRAINER&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;my_project.trainers.TensorFlowTrainer&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;

&lt;span class="c1"&gt;# main.py
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;importlib&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TrainingPipeline&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;trainer_class&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;module_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;class_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;trainer_class&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rsplit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;module&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;importlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;import_module&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;module_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;trainer_cls&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;module&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;class_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;trainer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;trainer_cls&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;trainer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Inject the trainer specified in the configuration
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Config&lt;/span&gt;
&lt;span class="n"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TrainingPipeline&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TRAINER&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, if you ever need to replace TensorFlow with another machine learning framework, you simply update the configuration and carry on. No hassle, no drama.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Frameworks can be powerful tools in your software development toolkit, but they come with risks. By postponing framework decisions until your architecture is solid, ensuring the framework aligns with your needs, and making frameworks replaceable through thoughtful design, you can avoid common pitfalls and keep your project on track.&lt;/p&gt;

&lt;p&gt;Remember, frameworks should serve YOUR architecture, not dictate it. With careful planning and strategic abstraction, you can reap the benefits of frameworks without getting trapped in long-term dependencies. The key is to remain in control, so the next time you’re faced with a framework decision, take a deep breath and remind yourself: you’re the one in charge.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx6xam2v973akilfz6q53.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx6xam2v973akilfz6q53.gif" alt="Ready to be hurt" width="480" height="270"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Thank you for reading!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Any questions? Leave your comment below to start fantastic discussions!&lt;/p&gt;

&lt;p&gt;Check out &lt;a href="https://luminousmen.com/" rel="noopener noreferrer"&gt;my blog&lt;/a&gt; or come say hi 👋 on &lt;a href="https://twitter.com/luminousmen" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt; or subscribe to &lt;a href="https://t.me/iamluminousmen" rel="noopener noreferrer"&gt;my telegram channel&lt;/a&gt;. Plan your best!&lt;/p&gt;

</description>
      <category>programming</category>
      <category>architecture</category>
    </item>
    <item>
      <title>How I Became a Writer</title>
      <dc:creator>luminousmen</dc:creator>
      <pubDate>Mon, 29 Jan 2024 17:39:41 +0000</pubDate>
      <link>https://forem.com/luminousmen/how-i-became-a-writer-227</link>
      <guid>https://forem.com/luminousmen/how-i-became-a-writer-227</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;What is writing? Telepathy, of course.&lt;br&gt;
— Stephen King&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I wrote a full-fledged book with Manning Publications with my name on it: &lt;a href="https://www.manning.com/books/grokking-concurrency?utm_source=luminousmen&amp;amp;utm_medium=affiliate&amp;amp;utm_campaign=book_bobrov_grokking_5_18_22&amp;amp;a_aid=luminousmen&amp;amp;a_bid=4e810762"&gt;Grokking Concurrency&lt;/a&gt;. Now, I'd like to share the journey of how it came to be.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--plV_uvy3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://luminousmen.com/static/img/book_3.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--plV_uvy3--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://luminousmen.com/static/img/book_3.jpg" alt="Grokking Concurrency" width="360" height="451"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  TLDR
&lt;/h2&gt;

&lt;p&gt;If you're toying with the idea of writing a book, consider these:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;How much work will this be?&lt;/strong&gt; A ton. Like, an elephant-sized amount of work.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How long will it take?&lt;/strong&gt; Longer than waiting for your favorite TV series to release a new season.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Am I QUALIFIED?&lt;/strong&gt; Probably. Do you think a cat questions its ability to nap? &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What will I gain?&lt;/strong&gt; Tough to say. Maybe just a book, maybe a bit of fame, maybe a new hobby, sometimes nothing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How much will I get paid?&lt;/strong&gt; Enough to maybe treat your friends to a fancy dinner. Once. Close friends.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Can I handle criticism?&lt;/strong&gt; It's like getting a haircut – necessary, but sometimes you might not like what you see in the mirror.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Any regrets?&lt;/strong&gt; Same as you would ask me if I regret eating that extra slice of pizza – a bit, maybe, but not really.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Should I do this?&lt;/strong&gt; Absolutely.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Background
&lt;/h2&gt;

&lt;p&gt;I never saw myself as a writer. It felt too far off, something I couldn't quite reach. Yet, there was this compelling urge to create something meaningful with words. Perhaps that's why I gravitated towards software engineering, and eventually to blogging, though the idea of being a technical book author was still far-fetched.&lt;/p&gt;

&lt;p&gt;My blog is my own little world. It is my personal fortress, my sanctuary where I can share whatever is on my mind, not worrying about what is trending. I don't care much for passing hype. Instead, I love diving deep into things that interest me and sharing my thoughts, even if they are non-original and boring.&lt;/p&gt;

&lt;p&gt;One day, I wrote a &lt;a href="https://luminousmen.com/post/asynchronous-programming-blocking-and-non-blocking"&gt;blog post about asynchrony&lt;/a&gt;, and it got a lot of attention online. People started talking about it on platforms like HN. There were lots of technical questions and not many straightforward answers or good beginner books on the topic. That's what pushed me to write &lt;a href="https://leanpub.com/asynchronousprogramming/"&gt;this book&lt;/a&gt;. It was raw, short, and, most likely, dumb but it felt necessary.&lt;/p&gt;

&lt;p&gt;Then, Michael Stevens from Manning Publications got in touch.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Idea
&lt;/h2&gt;

&lt;p&gt;In 2021, Michael Stevens from Manning Publications proposed turning my blog posts and my book draft into a full-fledged book with a real publisher. This was unexpected.&lt;/p&gt;

&lt;p&gt;My initial focus, mainly on asynchrony in Python, evolved into the broader concept of concurrency, which isn't tied to any specific programming language but is fundamental to modern software engineering. &lt;/p&gt;

&lt;p&gt;During brainstorming sessions with Mike, we crafted the book's concept: beginner-friendly yet rich in theoretical foundations, peppered with practical examples. The book found its place in Manning's well-known Grokking series.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Inspiration
&lt;/h2&gt;

&lt;p&gt;Before I started writing the book, I thoroughly explored literature that inspired me, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://amzn.to/3U59InF"&gt;"Java Concurrency in Practice"&lt;/a&gt;, widely recognized as a fundamental guide to multithreading in Java;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://amzn.to/47rt0q2"&gt;"Seven Concurrency Models in Seven Weeks"&lt;/a&gt;, covering various parallel programming models and offering a broad perspective on the subject;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://amzn.to/3tFjqCd"&gt;"Concurrency: The Works of Leslie Lamport"&lt;/a&gt;, a compilation of works by one of the leading scholars in the field.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://www.dabeaz.com/"&gt;David Beazley&lt;/a&gt;'s talks (who eventually wrote a review of my book!) also inspired me, check them out:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://youtu.be/U66KuyD3T0M"&gt;Die Threads&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://youtu.be/Z_OAlIhXziw"&gt;Curious Course on Coroutines and Concurrency&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One of the main issues I wanted to address in the book was the often under-explained or overly academic presentation of nuances and details in concurrency. This created gaps in understanding and connecting concepts, making it hard to integrate these ideas into my own "coordinate system." You know that feeling, right?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbsatqi5vmsnuaw13nroc.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbsatqi5vmsnuaw13nroc.jpg" alt="Understanding" width="500" height="407"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Proposal
&lt;/h2&gt;

&lt;p&gt;For each concept I wanted to include in the book, I asked questions like, "How can this be understood?" and "What knowledge does the reader need to grasp this point?" This process of analysis led to the creation of the table of contents (ToC), which was also subject to doubts and revisions throughout the writing of each chapter.&lt;/p&gt;

&lt;p&gt;The next step was to develop a proposal for the book, including a questionnaire and a rough outline.&lt;/p&gt;

&lt;p&gt;Examples from the questionnaire:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmpl1ujwoqarju18gh3e0.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmpl1ujwoqarju18gh3e0.jpg" alt="Questionnaire" width="800" height="353"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5453ighup1xq1r311x6c.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5453ighup1xq1r311x6c.jpg" alt="Questionnaire" width="800" height="174"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Looking back, the initial proposal seems a bit naive and not quite like the book we eventually created. But that's not surprising, I guess.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Book Writing Process
&lt;/h2&gt;

&lt;p&gt;Writing a book involves a structured process with several phases.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 1: Drafting Each Chapter
&lt;/h3&gt;

&lt;p&gt;The first phase involved drafting each chapter, defining content, goals, examples, and the required reader knowledge level. This set the foundation for interactions with the Development Editor (DE) and deadlines for initial drafts.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 2: Development
&lt;/h3&gt;

&lt;p&gt;This stage focused on meeting deadlines and producing quality text. Sometimes I extended deadlines by a few weeks to enhance or rewrite the content, as preferred by my DE over incomplete drafts.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 3: Review and Feedback
&lt;/h3&gt;

&lt;p&gt;Each chapter underwent comprehensive review, involving both my DE and a Technical Development Editor (TDE). The DE evaluated the style, consistency of the material, and the examples, while the TDE checked the technical accuracy of the text and code. Reviews sometimes led to revisiting and rewriting chapters to achieve high quality.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 4: MEAP
&lt;/h3&gt;

&lt;p&gt;After completing a third of the book, it moved into &lt;a href="https://www.manning.com/meap-program"&gt;Manning's Early Access Program (MEAP)&lt;/a&gt;, where chapters became available to readers in PDF and LiveBook formats. Readers could leave feedback, which helped improve the book. From this point onwards, the workload increased as new reviews and suggestions for improving the material came in.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 5: Production
&lt;/h3&gt;

&lt;p&gt;The production phase was the most mysterious and unpredictable for me due to a lack of transparency and clear timelines. This stage included final editing, formatting, indexing, and design, transforming the manuscript into a book ready for publication. Initially, it was planned to take about three months, but in practice, it stretched to six months due to various delays and adjustments.&lt;/p&gt;

&lt;h2&gt;
  
  
  Manuscript Writing
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faj62l6eqfaeajivys96j.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faj62l6eqfaeajivys96j.jpeg" alt="Manuscript Writing" width="800" height="1220"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I often describe the initial writing as just the tip of the iceberg. The toughest and most labor-intensive part for me wasn't drafting the text, but the relentless process of editing and refining it. &lt;/p&gt;

&lt;p&gt;Each chapter was read and rewritten from start to finish about ten times, and presenting what I called the "final" version always came with a mix of anticipation and anxiety. The "final" draft was never truly final, as I often found myself making changes and adjustments the next day. Can you imagine reading, writing, and re-writing the same thing over and over again? It felt like nothing changed and everything changed every single time.&lt;/p&gt;

&lt;h3&gt;
  
  
  Examples
&lt;/h3&gt;

&lt;p&gt;Finding easy-to-grasp yet practical "concrete" examples was challenging. Coming from an academic background, where everything is taught, described, or explained in abstract terms, I struggled to devise concrete, understandable scenarios to engage readers. My examples seemed either too simple to be interesting or too complex to explain well. My editors were patient and taught me a lot along the way, and eventually I was satisfied (am I?) with the examples I used; it seems they resonated well with readers.&lt;/p&gt;

&lt;p&gt;Many stories described in the book were taken from real life – from a trip to Hawaii and parallel adventures with laundry to cooking chicken soup and fighting for the last beer in the fridge. Each of these real-life scenarios was transformed and reflected in the pages of the book. Some stories came from my editors, while others seemed almost predicted, like the "pizza server" example.&lt;/p&gt;

&lt;h3&gt;
  
  
  Illustrations
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;'And what is the use of a book, thought Alice, without pictures or conversation?' &lt;br&gt;&lt;br&gt;
— Lewis Carroll&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The illustrations? Oh, that was all Kate, my wife. She's the artist behind the scenes. We decided not to rely on an external illustrator, who would have needed to be onboarded and involved in the whole process.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F36ur07ay3a9e4t249nna.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F36ur07ay3a9e4t249nna.jpg" alt="Illustrations" width="450" height="295"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Our process was simple: I'd come up with some wild idea, probably something that made sense only in my head, then we'd fight, compromise, and Kate would somehow turn that chaos into actual, beautiful artwork. &lt;/p&gt;

&lt;h3&gt;
  
  
  Code in the Book
&lt;/h3&gt;

&lt;p&gt;The writing process included several iterations of rewriting code. The main task was to make the code understandable even for readers unfamiliar with Python. My approach was to present the code not as a set of complex constructs, but as pseudocode accessible to all readers, regardless of their technical background.&lt;/p&gt;

&lt;p&gt;Although many code snippets could have been written more efficiently and more Pythonically, my primary goal was not to showcase my programming skills but to ensure the code was understandable and accessible to a broad audience.&lt;/p&gt;

&lt;h3&gt;
  
  
  Review process
&lt;/h3&gt;

&lt;p&gt;Manning has a thorough review process. Editors review each chapter as it is written, providing feedback and suggestions for editing. &lt;/p&gt;

&lt;p&gt;In addition, there are three external reviews, one for each completed third of the book. Manning sends the book out to about a dozen technically competent reviewers who are not affiliated with Manning and may or may not be familiar with the book's subject matter. The reviewers rate the quality of the explanations, illustrations, topics covered, writing style, programming, and more. Many reviewers gave detailed recommendations beyond simple answers to questions. They were also asked to rate the book in its current state using an Amazon-like star system.&lt;/p&gt;

&lt;p&gt;As for the reviews, they were polar opposites. About half of the reviewers found the book too easy and the other half found it too difficult. Half wanted more math to fill in the gaps and missing pieces, and the other half preferred less math and more everyday examples. I can't say the reviews as a whole were much help, but a couple of people really did help with their advice.&lt;/p&gt;

&lt;h2&gt;
  
  
  Feedback from Readers
&lt;/h2&gt;

&lt;p&gt;Manning utilizes a browser-based book reading program called &lt;a href="https://livebook.manning.com/"&gt;LiveBook&lt;/a&gt;, which allows readers to engage with and provide feedback on books during their development. This interaction was incredibly valuable, as it allowed me to improve the book with the help of my readers. For example, they would point out typos or request more detailed explanations of certain topics. Also, I got a lot of positive feedback along the way which was encouraging to me. &lt;/p&gt;

&lt;p&gt;However, as an author, I encountered some challenges with LiveBook's interface for interaction. The tool's delayed email notifications, difficulty in locating specific reader conversations, and an overall clunky user experience made it somewhat cumbersome to use.&lt;/p&gt;

&lt;h2&gt;
  
  
  Manning
&lt;/h2&gt;

&lt;p&gt;Let's set the scene first. Manning Publications, like other tech publishers, I guess, works specifically with engineers to create books. That's a comforting thought. Technical book authors aren't expected to be seasoned writers. And Manning doesn't expect them to be.&lt;/p&gt;

&lt;p&gt;Manning went all out to guide me through the process. I worked with a bunch of editors and reviewers: copy editors, technical editors, reviewers with years of experience, and some without any. Looking back, I never had to worry about whether the book made sense or was readable. If I muddled up explaining a topic, I'd definitely hear about it. And if I got a basic concept wrong, a more experienced technical reviewer would raise an eyebrow and go, "Hmm, are you sure?" &lt;/p&gt;

&lt;p&gt;Mostly, I was in charge of my own time and was never really pushed, but I imagine the experience might be different for other writers.&lt;/p&gt;

&lt;p&gt;Overall, it was good to remember that making and selling technical books is what this company does. As long as I worked with them and listened to their advice, the chances of success were much higher.&lt;/p&gt;

&lt;h3&gt;
  
  
  Contract
&lt;/h3&gt;

&lt;p&gt;The contract, unsurprisingly, was heavily in Manning's favor. &lt;/p&gt;

&lt;p&gt;They own the content, no ifs, ands, or buts about it. This seems to be the standard deal, and I was okay with it this time around.&lt;/p&gt;

&lt;p&gt;Also, Manning has the right to pull the plug at any point (up until the book is published), and if they do, you have to return your advance. In return, you get to keep what you've written so far.&lt;/p&gt;

&lt;p&gt;The contract also had deadlines. I missed a few, but Manning didn't make a fuss about it. I guess as long as they believe in you and your book, it's all good. (But don't quote me on that).&lt;/p&gt;

&lt;p&gt;The rest was just legal jargon designed to protect Manning, and I never really worried about it. As long as you play your part in the process, I don't think there's much else to worry about.&lt;/p&gt;

&lt;h3&gt;
  
  
  Money
&lt;/h3&gt;

&lt;p&gt;First things first: writing should be more about self-expression than making big bucks. &lt;/p&gt;

&lt;p&gt;In the world of writing, especially compared to IT, the money isn't quite as plentiful. So, stressing over genre popularity or sales volume might not be worth your while. Writing is primarily about creativity and finding a way to express yourself.&lt;/p&gt;

&lt;p&gt;The advance was $5,000, with half paid out after completing the first third of the book and the other half upon completion. &lt;/p&gt;

&lt;p&gt;I earn 10% royalties on all book sales, which, from my research, is the standard and expected rate for technical books. &lt;a href="https://johnresig.com/blog/programming-book-profits/"&gt;John Resig wrote&lt;/a&gt; that most technical books don't sell more than 4,000 copies. So you can do the math.&lt;/p&gt;

&lt;p&gt;In a nutshell, don't expect a hefty payout. If your motivation for taking on years of weekend work is money, this might not be the project for you.&lt;/p&gt;

&lt;h3&gt;
  
  
  Marketing
&lt;/h3&gt;

&lt;p&gt;When it comes to marketing the book, don't expect publishers to go all out. Unless you're working with a big-name publisher like O'Reilly that can send you to conferences and promotions, most tech publishers stick to a pretty simple plan: send free books to influencers and cross your fingers in the hope that they'll blog about it. The rest is up to you: write blog articles, interact with your followers, and do what you can to spread the word.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;p&gt;Don't do it for the money. Writing technical books isn’t about making a fortune. It’s about career growth and/or personal satisfaction. The odds of your book becoming a bestseller are slim. The real rewards are experience, prestige, and opportunities: you’ll stand out in interviews, and you can charge more for consulting, pick better clients, or even negotiate a better deal on your next book contract.&lt;/p&gt;

&lt;p&gt;Writing a book takes a lot of time. If you’re in a relationship, having a supportive partner is crucial.&lt;/p&gt;

&lt;p&gt;I got tired of writing after about four chapters. Imagine how many times I thought about quitting – every evening and weekend spent pondering over concepts, word choices, and examples. The writing felt like torture, and now I get why many writers turn to less... healthy coping mechanisms - drinking, burning their manuscripts, etc. Creative people probably do run a greater risk of alcoholism and addiction than those in some other jobs, but so what?&lt;/p&gt;

&lt;p&gt;Working with a recognized publisher like Manning was satisfying in itself. The process of writing the book with them was a huge lesson for me. Some writing tips I picked up along the way: Write as if you’re talking to a friend; don’t over-complicate explanations; get to the point; avoid fancy words; be ruthless in editing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Support
&lt;/h2&gt;

&lt;p&gt;You might think I've been doing all this work solo. But that's not the case. Honestly, I doubt I could've achieved anything without the help and moral support from my friends and readers. A big thank you to everyone who helped!&lt;/p&gt;

&lt;h2&gt;
  
  
  You Can Support Too
&lt;/h2&gt;

&lt;p&gt;Want to lend a hand? You can! You can support by:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Spreading the news on social media&lt;/li&gt;
&lt;li&gt;Writing reviews (on &lt;a href="https://amzn.to/48J76zB"&gt;Amazon&lt;/a&gt;, &lt;a href="https://www.goodreads.com/en/book/show/61132508"&gt;Goodreads&lt;/a&gt;, or whatever people use nowadays)&lt;/li&gt;
&lt;li&gt;Mentioning to your colleagues, neighbors, and pets&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Every bit of support, whether it's a kind word, a review, or sharing the book with others, truly means the world to an author.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbp32tmgir3fpuyjncln3.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbp32tmgir3fpuyjncln3.jpg" alt="Grokking Concurrency" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Materials
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.manning.com/books/grokking-concurrency?utm_source=luminousmen&amp;amp;utm_medium=affiliate&amp;amp;utm_campaign=book_bobrov_grokking_5_18_22&amp;amp;a_aid=luminousmen&amp;amp;a_bid=4e810762"&gt;Grokking Concurrency&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.tunetheweb.com/blog/writing-a-technical-book-for-manning/"&gt;Writing a Technical Book for Manning&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://johnresig.com/blog/programming-book-profits/"&gt;Programming Book Profits&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
    </item>
    <item>
      <title>Understanding Concurrency Through Amdahl's Law</title>
      <dc:creator>luminousmen</dc:creator>
      <pubDate>Mon, 04 Dec 2023 17:49:06 +0000</pubDate>
      <link>https://forem.com/luminousmen/understanding-concurrency-through-amdahls-law-1ah4</link>
      <guid>https://forem.com/luminousmen/understanding-concurrency-through-amdahls-law-1ah4</guid>
      <description>&lt;p&gt;In recent years, the rise of multicore processors has reshaped the landscape of parallel computing. Traditional single-core processors are no longer the norm; now every phone come equipped with multiple cores, each capable of executing tasks independently. This shift has made concurrency optimization even more critical.&lt;/p&gt;

&lt;p&gt;To fully exploit multicore processors, software developers must adapt by parallelizing their applications. This not only requires a keen understanding of concurrency concepts but also the use of advanced parallel programming libraries and frameworks.&lt;/p&gt;

&lt;p&gt;One fundamental principle that underscores the essence of concurrency is Amdahl’s Law. It serves as a guiding light, reminding us that the speedup of a program through parallelization is intrinsically tied to the sequential portion of the code. In this blog post, we’ll explore the significance of Amdahl’s Law and how mastering concurrency can lead to exceptional software performance.&lt;/p&gt;

&lt;h3&gt;
  
  
  Amdahl’s Law
&lt;/h3&gt;

&lt;p&gt;Named after Gene Amdahl, Amdahl’s Law lays down a fundamental principle: the potential speedup gained from parallelization is directly limited by the fraction of sequential code in the program. In simpler terms, no matter how many threads or processors you throw at a problem, the sequential portion of your code will always set an upper bound on how fast your program can run.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--hYys61pY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/nnn1wctofyp25352v51v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--hYys61pY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/nnn1wctofyp25352v51v.png" alt="Image description" width="440" height="234"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.manning.com/books/grokking-concurrency?utm_source=luminousmen&amp;amp;utm_medium=affiliate&amp;amp;utm_campaign=book_bobrov_grokking_5_18_22&amp;amp;a_aid=luminousmen&amp;amp;a_bid=4e810762"&gt;Grokking Concurrency: Chapter 2&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This law is a stark reminder that achieving true concurrency is not merely a matter of adding more threads or processors to your application. It’s about identifying and optimizing the sequential bottlenecks within your codebase to achieve maximum performance gains.&lt;/p&gt;
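&lt;p&gt;In formula form, if &lt;em&gt;p&lt;/em&gt; is the fraction of the program that can be parallelized and &lt;em&gt;n&lt;/em&gt; is the number of processors, the speedup is S(n) = 1 / ((1 - p) + p/n). Here is a minimal sketch of the formula in code (the function name and sample values are my own, for illustration only):&lt;/p&gt;

```python
# Amdahl's Law: S(n) = 1 / ((1 - p) + p / n)
# p = parallelizable fraction of the program, n = number of processors.
def amdahl_speedup(p: float, n: int) -> float:
    """Theoretical speedup of a program whose fraction p is parallelizable."""
    return 1.0 / ((1.0 - p) + p / n)

# Even with 95% of the code parallelized, the speedup can never exceed
# 1 / (1 - 0.95) = 20x, no matter how many processors you add.
for n in (2, 8, 64, 1024):
    print(f"{n:>5} processors: {amdahl_speedup(0.95, n):.2f}x speedup")
```

&lt;p&gt;Notice how quickly the gains flatten out: going from 64 to 1024 processors barely moves the needle, because the sequential 5% dominates.&lt;/p&gt;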

&lt;h3&gt;
  
  
  An Everyday Analogy of Amdahl’s Law
&lt;/h3&gt;

&lt;p&gt;Your concurrent system runs as fast as its slowest sequential part. An example of this phenomenon can be seen every time you go to the mall. Hundreds of people can shop at the same time, rarely disturbing each other. Then, when it comes time to pay, lines form as there are fewer cashiers than shoppers ready to leave. This is analogous to how in a concurrent system the overall speed is limited by its sequential segments, much like the overall speed of shopping is slowed down by the checkout process.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--2htnoRy_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/135vecm77ilq89l34acr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--2htnoRy_--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/135vecm77ilq89l34acr.png" alt="Image description" width="377" height="377"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.manning.com/books/grokking-concurrency?utm_source=luminousmen&amp;amp;utm_medium=affiliate&amp;amp;utm_campaign=book_bobrov_grokking_5_18_22&amp;amp;a_aid=luminousmen&amp;amp;a_bid=4e810762"&gt;Grokking Concurrency: Chapter 2&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Art of Balancing Parallelism
&lt;/h3&gt;

&lt;p&gt;To harness the full potential of concurrency, here are a few key steps to consider:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Identify Sequential Bottlenecks&lt;/strong&gt;: Use profiling tools to identify the parts of your code that are inherently sequential and may be holding back your application’s performance.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Optimize Critical Sections&lt;/strong&gt;: Once you’ve identified the bottlenecks, focus on optimizing them. This might involve rewriting code, using data structures that support parallelism, or implementing algorithms specifically designed for concurrency.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Leverage Concurrency Techniques&lt;/strong&gt;: Explore concurrency techniques like multithreading, multiprocessing, and asynchronous programming to distribute workloads efficiently and make the most of your hardware resources.&lt;/li&gt;
&lt;/ol&gt;
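&lt;p&gt;To make the steps above concrete, here is a minimal Python sketch (the function names and workload are hypothetical, purely for illustration): the preprocessing stage is inherently sequential, while the per-item work is independent and can be distributed across processes.&lt;/p&gt;

```python
from concurrent.futures import ProcessPoolExecutor

def preprocess(data):
    # Sequential stage: runs once, on one core, before any parallel work.
    return [x * 2 for x in data]

def heavy_work(x):
    # Independent per-item CPU-bound work: safe to parallelize.
    return sum(i * i for i in range(x))

def run(data):
    prepared = preprocess(data)              # sequential fraction
    with ProcessPoolExecutor() as pool:      # parallel fraction
        return list(pool.map(heavy_work, prepared))

if __name__ == "__main__":
    print(run([10, 20, 30]))
```

&lt;p&gt;Per Amdahl’s Law, the overall speedup is capped by however long the sequential &lt;code&gt;preprocess&lt;/code&gt; stage takes, no matter how many workers the pool gets.&lt;/p&gt;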

&lt;p&gt;Balancing parallelism and minimizing sequential bottlenecks is the essence of concurrency optimization. Achieving this balance requires a deep understanding of your software’s architecture and the ability to identify critical sections that can benefit from parallel execution.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Amdahl’s Law serves as a constant reminder that achieving optimal performance in software development requires a balanced approach to concurrency. By mastering the art of concurrency, identifying and addressing sequential bottlenecks, and exploring advanced techniques, you can unleash the true power of parallelization and create software that outperforms the competition.&lt;/p&gt;

&lt;p&gt;If you’re eager to dive deeper into the world of concurrency and unlock its true potential, I’m excited to share a valuable resource with you. My upcoming book, &lt;a href="https://www.manning.com/books/grokking-concurrency?utm_source=luminousmen&amp;amp;utm_medium=affiliate&amp;amp;utm_campaign=book_bobrov_grokking_5_18_22&amp;amp;a_aid=luminousmen&amp;amp;a_bid=4e810762"&gt;Grokking Concurrency&lt;/a&gt;, is your comprehensive guide to mastering concurrency concepts, addressing bottlenecks, and taking your software development skills to the next level.&lt;/p&gt;

</description>
      <category>bigdata</category>
      <category>data</category>
    </item>
  </channel>
</rss>
