<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Brian Tarbox</title>
    <description>The latest articles on Forem by Brian Tarbox (@btarbox).</description>
    <link>https://forem.com/btarbox</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F411336%2F632c153c-4f37-4a2d-bf76-9a13870a8249.jpg</url>
      <title>Forem: Brian Tarbox</title>
      <link>https://forem.com/btarbox</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/btarbox"/>
    <language>en</language>
    <item>
      <title>The Compiler Never Used Sarcasm: Why AI Feels Unsafe to the Neurodivergent Coder</title>
      <dc:creator>Brian Tarbox</dc:creator>
      <pubDate>Mon, 02 Feb 2026 14:20:01 +0000</pubDate>
      <link>https://forem.com/btarbox/the-compiler-never-used-sarcasm-why-ai-feels-unsafe-to-the-neurodivergent-coder-30i3</link>
      <guid>https://forem.com/btarbox/the-compiler-never-used-sarcasm-why-ai-feels-unsafe-to-the-neurodivergent-coder-30i3</guid>
      <description>&lt;p&gt;I have been writing code for 45 years. I started when "memory" was something you counted in bytes, not gigabytes, and when a "bug" was nearly literal.&lt;/p&gt;

&lt;p&gt;Over four and a half decades, I have watched languages evolve from Assembly to C, to Java, to Python. But through every iteration, one fundamental truth remained constant: the machine was literal. If I told the computer to do &lt;em&gt;X&lt;/em&gt; and it did &lt;em&gt;Y&lt;/em&gt;, it was because I made a mistake in the instructions. It wasn't because the computer misunderstood my tone, or didn't like my attitude, or was having a bad day.&lt;/p&gt;

&lt;p&gt;For a neurodivergent mind, programming was more than a career. It was a sanctuary.&lt;/p&gt;

&lt;p&gt;But today, as we shift toward Generative AI and Large Language Models (LLMs), that sanctuary is dissolving. We are moving from a world of explicit instruction to a world of implicit persuasion. For the neurodivergent coder, this isn't just a technical pivot; it is the loss of the only language that ever truly made sense to us.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Black Box Problem
&lt;/h2&gt;

&lt;p&gt;We often refer to neural networks as "Black Boxes" because we don't truly know how they arrive at an answer. But here is the irony: The neurotypical mind is also a Black Box to us.&lt;/p&gt;

&lt;p&gt;We flocked to computers because they were transparent boxes. We could see the registers, trace the execution stack, and inspect the variables.&lt;/p&gt;

&lt;p&gt;By replacing explicit code with natural language models, we have essentially built a machine that mimics the neurotypical brain: it relies on context, implies rather than states, and is confidently wrong just often enough to make you doubt your own sanity.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Sanctuary of Syntax
&lt;/h2&gt;

&lt;p&gt;To understand why this shift is so jarring, we have to look at &lt;a href="https://en.wikipedia.org/wiki/Theory_of_mind" rel="noopener noreferrer"&gt;Theory of Mind&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In psychology, Theory of Mind is the ability to attribute mental states (beliefs, intents, desires, emotions) to oneself and others. It is the ability to understand that what is in my head is different from what is in your head. For many neurodivergent people, this is an exhausting, high-friction process. Navigating a dinner party requires constant, real-time calculation of social signals, subtext, and hidden agendas. If a co-worker tells me “&lt;em&gt;we’re going out for drinks&lt;/em&gt;”, is that an implied invitation or just passing on some information? If it’s just information and you say “&lt;em&gt;great, let’s go&lt;/em&gt;”, then you’re being rude and presumptuous. But if it was an invitation and you say “&lt;em&gt;well, that’s nice&lt;/em&gt;”, then you’ve been rude and unfriendly. Why can’t people just be clear in what they say?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Coding required zero Theory of Mind.&lt;/strong&gt;&lt;br&gt;
The compiler has no hidden agenda. It has no "mind" to theorize about. It operates on pure, unadulterated logic.&lt;br&gt;
&lt;strong&gt;Human interaction&lt;/strong&gt;: "I'm fine" (Could mean: I am happy, I am angry, I am tired, or go away).&lt;br&gt;
&lt;strong&gt;Computer interaction&lt;/strong&gt;: &lt;code&gt;return 0;&lt;/code&gt; (Means: The function ended successfully).&lt;/p&gt;

&lt;p&gt;For 45 years, the IDE (Integrated Development Environment) was a safe space where the rules of social engagement were suspended. The feedback was brutal, but it was honest. A syntax error isn't a judgment of your character; it is a factual statement about a missing semicolon.  I recall back in those days thinking “&lt;em&gt;well if you know the semicolon is missing why can’t you just add it?&lt;/em&gt;”  The answer of course is that the compiler had no theory of mind and didn’t “know” what I wanted.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Invasion of Ambiguity
&lt;/h2&gt;

&lt;p&gt;Enter Artificial Intelligence.&lt;br&gt;
We are told that "English is the hottest new programming language." We are told to "prompt" the machine. But prompting is not programming. &lt;strong&gt;Prompting is negotiating&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;When we write a prompt for an LLM, we are suddenly thrust back into the messy world of &lt;a href="https://en.wikipedia.org/wiki/Cooperative_principle" rel="noopener noreferrer"&gt;Grice's Maxims&lt;/a&gt;. Paul Grice, a philosopher of language, proposed that effective communication relies on the Cooperative Principle—rules regarding quantity, quality, relation, and manner.&lt;/p&gt;

&lt;p&gt;Humans violate these rules constantly. We use sarcasm (violating Quality), we ramble (violating Quantity), and we are passive-aggressive (violating Manner).&lt;br&gt;
Traditional code strictly enforced these maxims. It was succinct, truthful, and relevant. But an LLM? It hallucinates (violating Quality). It gives verbose, flowery explanations when you ask for a boolean (violating Quantity). It requires you to "massage" the input to get the right output.&lt;/p&gt;

&lt;p&gt;Suddenly, the "source code" is subject to the same linguistic ambiguity as a casual conversation. We have to guess how the model "feels" about a certain phrasing. We are essentially performing therapy on a matrix of floating-point numbers to get it to write a SQL query.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;An example where ambiguity nearly caused an accident&lt;/strong&gt;&lt;br&gt;
There was a near-accident when a pilot needing a go-around told the co-pilot "takeoff power". That is an instruction to set full (takeoff) power to help the plane gain altitude. Unfortunately, the co-pilot heard "take off power", which he interpreted as “remove power”, so he set the engines to idle. The situation was exacerbated by the fact that all air transport communications are conducted in English, which was not these pilots' primary language. It’s hard to see how guardrails or contextual grounding could have helped here.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Shannon Limit of Certainty
&lt;/h2&gt;

&lt;p&gt;I actually knew Claude Shannon. We lived in the same town and I was close friends with his daughter in high school and college.  We were also both members of the MIT Juggling Club.&lt;/p&gt;

&lt;p&gt;For those who only know the name from textbooks, Shannon was the "Father of Information Theory." He was the man who realized that all information could be represented in binary digits—bits. He gave us the fundamental unit of digital certainty.&lt;/p&gt;

&lt;p&gt;In Shannon’s world, a "bit" was a measure of the reduction of uncertainty. It was the answer to a yes/no question. It was the mathematical opposite of ambiguity.&lt;/p&gt;

&lt;p&gt;When we wrote code in C or Java for the last 45 years, we were living in the house that Shannon built. We were manipulating bits. We were resolving uncertainty. The goal of every line of code was to eliminate noise so that the signal was perfect.&lt;/p&gt;

&lt;p&gt;But LLMs operate on a different part of Shannon’s work: The Entropy of English.&lt;/p&gt;

&lt;p&gt;Shannon famously estimated the "entropy" (or unpredictability) of written English. He understood that human language is redundant and statistical. This is exactly how modern AI works—it exploits the statistical redundancy of language to predict the next word.&lt;/p&gt;
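&lt;p&gt;Shannon’s entropy measure can be sketched in a few lines. The unigram model below is a deliberate oversimplification of his actual n-gram experiments, meant only to show the idea of bits per symbol:&lt;/p&gt;

```python
from collections import Counter
from math import log2

def unigram_entropy(text: str) -> float:
    """Estimate bits per character under a naive unigram model."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * log2(n / total) for n in counts.values())

# A uniform distribution over this sentence's 27 distinct symbols would
# need log2(27), about 4.75 bits per character; skewed symbol frequencies
# bring the estimate down, and the n-gram models Shannon actually used
# bring it down much further.
print(unigram_entropy("the quick brown fox jumps over the lazy dog"))
```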

&lt;p&gt;But here is the catch: Prediction is not precision.&lt;/p&gt;

&lt;p&gt;By moving from traditional coding to Prompt Engineering, we are trading the Bit (absolute certainty) for the Token (probabilistic likelihood). We are leaving the noise-free channel of the compiler and wading back into the swamp of linguistic entropy—the very swamp Shannon helped us pave over with digital logic.&lt;/p&gt;

&lt;h2&gt;
  
  
  Determinism vs. Probability: The Anxiety of "Probably"
&lt;/h2&gt;

&lt;p&gt;The deepest friction, however, is mathematical.&lt;br&gt;
For decades, we lived in a Deterministic world.&lt;/p&gt;

&lt;p&gt;If P, then Q. This is a binary comfort. It is verifiable. It is reproducible.&lt;br&gt;
AI introduces a Probabilistic world.&lt;br&gt;
P(Q|P)&lt;br&gt;
(The probability of Q, given P).&lt;/p&gt;

&lt;p&gt;When I ask an AI to write code, it doesn't "know" the code. It predicts the next most likely token based on a massive dataset. It operates on vibes and statistical likelihoods.&lt;/p&gt;
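&lt;p&gt;The contrast can be made concrete with a toy next-token sampler. The vocabulary and weights below are invented for illustration; real models sample from tens of thousands of tokens:&lt;/p&gt;

```python
import random

# Invented next-token distribution for illustration only.
NEXT_TOKEN = {"0;": 0.55, "result;": 0.30, "null;": 0.15}

def deterministic() -> str:
    """Old world: the same input yields the same output, every time."""
    return max(NEXT_TOKEN, key=NEXT_TOKEN.get)

def probabilistic(rng: random.Random) -> str:
    """New world: the next token is sampled by its likelihood."""
    tokens, weights = zip(*NEXT_TOKEN.items())
    return rng.choices(tokens, weights=weights)[0]

rng = random.Random()
print(deterministic())  # always "0;"
print(sorted({probabilistic(rng) for _ in range(100)}))  # usually all three
```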

&lt;p&gt;For the neurodivergent thinker who finds comfort in patterns and rigid systems, this is a source of profound anxiety. We are moving from a system that is "Correct or Incorrect" to a system that is "Good Enough."&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Old World: You spent 3 hours debugging because the logic was flawed.&lt;/li&gt;
&lt;li&gt;New World: You spend 3 hours "prompt engineering", which is really just trying to figure out the magic words to persuade the black box to behave. And then you spend another 3 hours validating that the generated code is correct (you did check, right?)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Where Do We Go From Here?
&lt;/h2&gt;

&lt;p&gt;I am not a Luddite. I’m an AWS Hero and AWS Ambassador, and I hold 10 US technical patents. I use AI every day. It is a powerful tool. But I mourn the loss of the binary sanctuary.&lt;/p&gt;

&lt;p&gt;We are entering an era where "coding" will look less like architecture and more like diplomacy. It will reward those who are good at linguistic nuance and persuasion, skills that have traditionally favored the neurotypical.  On the other hand, intuitive leaps might actually favor the neurodivergent.&lt;/p&gt;

&lt;p&gt;For those of us who spent decades finding solace in the absolute truth of a compiler error, we have to learn a new skill. We have to learn to tolerate the ambiguity of the machine, just as we have learned to tolerate the ambiguity of the world.&lt;/p&gt;

&lt;p&gt;But I will miss the days when, if I said exactly what I meant, the machine did exactly what I said.&lt;/p&gt;

&lt;p&gt;Brian Tarbox holds degrees in Linguistic Philosophy and Cognitive Psychology.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>discuss</category>
      <category>llm</category>
      <category>programming</category>
    </item>
    <item>
      <title>Why Resilience Matters</title>
      <dc:creator>Brian Tarbox</dc:creator>
      <pubDate>Sat, 06 Apr 2024 20:43:45 +0000</pubDate>
      <link>https://forem.com/btarbox/why-resilience-matters-21d4</link>
      <guid>https://forem.com/btarbox/why-resilience-matters-21d4</guid>
      <description>&lt;p&gt;In today's digital landscape, where businesses heavily rely on cloud-based applications to drive their operations, ensuring the resilience and reliability of these systems is of paramount importance. Resilience refers to the ability of an application or system to withstand failures, recover quickly, and maintain continuous availability, even in the face of unexpected events or disruptions.&lt;/p&gt;

&lt;p&gt;Achieving resilience is crucial for several reasons. First and foremost, it minimizes the risk of costly downtime, which can lead to significant financial losses, damage to brand reputation, and customer dissatisfaction. Additionally, resilient systems are better equipped to handle unexpected spikes in demand, ensuring that users can access the application or service without interruptions. Furthermore, resilience contributes to overall business continuity, enabling organizations to maintain critical operations and meet their obligations, even during challenging circumstances.&lt;/p&gt;

&lt;h2&gt;
  
  
  Shared Responsibility Model
&lt;/h2&gt;

&lt;p&gt;When it comes to cloud computing, the concept of the Shared Responsibility Model is fundamental to understanding the division of responsibilities between the cloud provider and the customer. In the case of Amazon Web Services (AWS), the cloud provider is responsible for the security and availability of the underlying cloud infrastructure, including the hardware, software, networking, and facilities that run AWS Cloud services.&lt;/p&gt;

&lt;p&gt;On the other hand, customers are accountable for securing and managing their applications and data within the cloud environment. This includes tasks such as configuring security groups, implementing access controls, and ensuring the resilience of their applications through proper design and operational practices.&lt;/p&gt;

&lt;h2&gt;
  
  
  Embracing Serverless Architecture
&lt;/h2&gt;

&lt;p&gt;One effective way to shift more responsibility to the cloud provider and simplify resilience efforts is by embracing a serverless architecture. Serverless computing allows developers to focus on writing code without worrying about provisioning, scaling, or managing servers. AWS services like AWS Lambda, Amazon API Gateway, and Amazon DynamoDB enable developers to build and run applications without the need for server management, reducing the operational overhead and potential points of failure.&lt;/p&gt;

&lt;p&gt;By leveraging serverless services, organizations can offload a significant portion of the infrastructure management responsibilities to AWS, allowing them to concentrate their efforts on application logic and resilience strategies specific to their use cases.&lt;/p&gt;

&lt;h2&gt;
  
  
  Control Plane vs. Data Plane
&lt;/h2&gt;

&lt;p&gt;When discussing resilience in cloud computing, it's essential to understand the distinction between the control plane and the data plane. The control plane refers to the management and configuration of cloud resources, such as creating, modifying, or deleting instances, load balancers, or databases. The data plane, on the other hand, encompasses the actual data processing and application logic that runs on top of the cloud infrastructure.&lt;/p&gt;

&lt;p&gt;While AWS is responsible for the resilience of the control plane, ensuring the availability and reliability of the underlying cloud services, customers are accountable for the resilience of their applications and data within the data plane. This includes implementing strategies for fault tolerance, redundancy, and failover mechanisms to ensure continuous operation in the event of failures or disruptions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Infrastructure Design
&lt;/h2&gt;

&lt;p&gt;Designing a resilient infrastructure is a critical aspect of building resilient cloud applications. This involves implementing redundancy at various levels, such as networking, storage, and compute resources.&lt;/p&gt;

&lt;p&gt;Networking redundancy can be achieved by leveraging multiple Availability Zones (AZs) or even multiple AWS Regions, ensuring that if one AZ or Region experiences an outage, the application can failover to another location. Additionally, services like Amazon Route 53 can be used for DNS failover, automatically routing traffic to healthy endpoints.&lt;/p&gt;

&lt;p&gt;Monitoring, logging, and alerting are essential components of a resilient infrastructure. By implementing comprehensive monitoring solutions like Amazon CloudWatch, organizations can proactively detect and respond to potential issues before they escalate into major incidents. Centralized logging and alerting mechanisms help teams quickly identify and troubleshoot problems, minimizing downtime and ensuring timely recovery.&lt;/p&gt;

&lt;p&gt;Security is another crucial aspect of resilience. By implementing robust security measures, such as security groups, network access control lists (NACLs), and least-privileged access controls, organizations can mitigate the risk of security breaches, which can lead to significant downtime and data loss.&lt;/p&gt;

&lt;h2&gt;
  
  
  Application Design
&lt;/h2&gt;

&lt;p&gt;While infrastructure design plays a vital role in resilience, the application itself must also be designed with resilience in mind. Adhering to good design principles, such as loose coupling and high cohesion, can help minimize the impact of failures and enable easier recovery.&lt;/p&gt;

&lt;p&gt;Event-driven message passing and queuing systems like Amazon Simple Queue Service (SQS) can act as buffers, allowing applications to ride out transient errors and handle bursts of traffic without disruption. Implementing idempotent operations, where multiple identical requests have the same effect as a single request, can also enhance resilience by ensuring that duplicate requests do not cause unintended consequences.&lt;/p&gt;
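&lt;p&gt;Idempotency is usually achieved by tracking a per-request identifier and skipping the side effect on a duplicate. A minimal in-memory sketch (a production system would persist the processed keys in a durable store rather than a Python set):&lt;/p&gt;

```python
# Hypothetical idempotent consumer: SQS standard queues deliver
# at-least-once, so the same message may arrive more than once.
processed_ids: set[str] = set()
balance = 0

def apply_credit(request_id: str, amount: int) -> int:
    """Apply a credit exactly once per request_id."""
    global balance
    if request_id not in processed_ids:  # skip redelivered duplicates
        processed_ids.add(request_id)
        balance += amount
    return balance

apply_credit("req-1", 100)
apply_credit("req-1", 100)  # duplicate delivery: no double credit
print(balance)              # 100
```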

&lt;p&gt;Adopting a microservices architecture can further contribute to resilience by breaking down applications into smaller, independent components. This approach allows for more granular deployment and scaling, reducing the blast radius of failures and enabling teams to update or replace individual services without impacting the entire application.&lt;/p&gt;

&lt;p&gt;Code reviews play a crucial role in ensuring the quality and resilience of the codebase. By involving peers and subject matter experts in the review process, potential issues can be identified and addressed before deployment, reducing the risk of failures and downtime.&lt;/p&gt;

&lt;p&gt;Designing for observability is another key aspect of resilient applications. By exposing key metrics and integrating comprehensive monitoring and logging mechanisms, teams can gain valuable insights into the application's behavior, enabling proactive identification and resolution of issues.&lt;/p&gt;

&lt;p&gt;Infrastructure as Code (IaC) practices, such as using tools like AWS CloudFormation or Terraform, can significantly enhance resilience by enabling automated deployments, updates, rollbacks, and replacements, reducing the risk of human error and ensuring consistent and repeatable configurations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Operational Design
&lt;/h2&gt;

&lt;p&gt;Resilience extends beyond the application and infrastructure design; operational practices also play a crucial role in ensuring continuous availability and recovery from failures.&lt;/p&gt;

&lt;p&gt;Implementing robust backup and restore strategies is essential for protecting against data loss and enabling rapid recovery in the event of a disaster. Regular testing of backup and restore processes ensures that these mechanisms function as expected when needed.&lt;/p&gt;

&lt;p&gt;Maintaining hot, warm, or pilot light standby environments can provide additional layers of resilience, allowing for rapid failover and minimizing downtime during major incidents or planned maintenance activities.&lt;/p&gt;

&lt;p&gt;By incorporating these principles and best practices into the design and operation of cloud applications, organizations can significantly enhance the resilience and reliability of their systems, ensuring business continuity and delivering a seamless experience to their customers.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Certification is not enough</title>
      <dc:creator>Brian Tarbox</dc:creator>
      <pubDate>Sun, 31 Mar 2024 16:45:27 +0000</pubDate>
      <link>https://forem.com/aws-heroes/certification-is-not-enough-98j</link>
      <guid>https://forem.com/aws-heroes/certification-is-not-enough-98j</guid>
      <description>&lt;p&gt;Obtaining AWS certifications is a valuable step in validating one's knowledge and skills in cloud computing. However, relying solely on certifications without practical, hands-on experience can be detrimental, especially in a field as dynamic and complex as cloud engineering. The metaphor of becoming a pilot illustrates this point effectively.&lt;/p&gt;

&lt;p&gt;There are a number of different licenses, certificates and ratings one can obtain for flying.  They include Private Pilot, Instrument Rating, Commercial Single-Engine, Commercial Multi-Engine and Airline Transport Pilot. Personally I am a Private Pilot with an instrument rating.  This means I can fly into clouds and navigate solely with the in-cockpit instruments.  I'll say that getting this rating was one of the hardest things I've ever done.  I got that rating two years after getting my private license.&lt;/p&gt;

&lt;p&gt;In those years I had: a bird strike, an engine failure, a landing light failure at night, a door blowing open, and a fuel cap dislodging. We used to joke that my call sign was "&lt;em&gt;303EC no I'm not declaring an emergency&lt;/em&gt;".&lt;/p&gt;

&lt;p&gt;Now, imagine a scenario where an individual obtains all five of those licenses within a month through a flight school that promises a fast-track program. While they may have acquired the theoretical knowledge required for the certifications, they would lack the crucial practical experience of logging hundreds or thousands of hours in the air, handling various weather conditions, and dealing with unexpected situations. Would you feel comfortable entrusting your safety to such a pilot? The answer is likely no (or perhaps Hell No).&lt;/p&gt;

&lt;p&gt;Similarly, in the realm of cloud computing, certifications alone do not equip individuals with the practical experience necessary to design, deploy, and maintain robust, scalable, and secure cloud solutions. Hands-on experience is invaluable, as it allows professionals to encounter real-world challenges, troubleshoot issues, and develop problem-solving skills that cannot be fully replicated in a certification exam environment.&lt;/p&gt;

&lt;p&gt;A question I like to ask during an interview is "&lt;em&gt;tell me about a time when you were shocked by an AWS bill at work or in your personal account&lt;/em&gt;". If they've never encountered this there is a fair chance that they don't have much &lt;em&gt;actual&lt;/em&gt; experience.&lt;/p&gt;

&lt;h2&gt;The Value of Learning from Failures and Mistakes&lt;/h2&gt;

&lt;p&gt;Practical experience not only provides exposure to different technologies and services but also offers opportunities to encounter failures and make mistakes – invaluable learning experiences that cannot be gained from certifications alone.&lt;/p&gt;

&lt;p&gt;In the pilot metaphor, a seasoned pilot with thousands of hours of experience has likely faced various situations such as having to cancel or divert flights due to adverse weather conditions, dealing with mechanical issues, or navigating through unexpected airport closures. These experiences, although challenging at the time, contribute to the pilot's ability to make informed decisions, remain calm under pressure, and prioritize safety.&lt;/p&gt;

&lt;p&gt;Likewise, in cloud computing, practical experience exposes professionals to a wide range of potential failures and mistakes. For instance, a cloud engineer might encounter scenarios such as misconfigured security groups leading to data breaches, improperly sized resources resulting in performance bottlenecks, or unexpected spikes in costs due to inefficient resource management. While these situations can be frustrating, they provide invaluable learning opportunities that cannot be replicated in a certification exam.&lt;/p&gt;

&lt;p&gt;Encountering failures and making mistakes allow cloud professionals to develop critical problem-solving skills, troubleshooting techniques, and a deeper understanding of the intricacies of cloud services. They learn to anticipate potential issues, implement proactive monitoring and alerting mechanisms, and develop contingency plans to mitigate risks.&lt;/p&gt;

&lt;p&gt;Moreover, these experiences foster a mindset of continuous improvement and a commitment to adhering to best practices. Cloud professionals who have faced real-world challenges are better equipped to design and implement robust, scalable, and secure cloud solutions that can withstand various failure scenarios.&lt;/p&gt;

&lt;p&gt;In conclusion, practical experience in cloud computing is not only about gaining exposure to different technologies and services but also about encountering failures and making mistakes. These real-world challenges provide invaluable lessons that cannot be learned from certifications alone. Just as a seasoned pilot has faced and learned from various adverse situations, a cloud professional who has experienced and overcome failures and mistakes is better equipped to design, deploy, and maintain robust and resilient cloud solutions. &lt;/p&gt;

&lt;p&gt;Certifications can validate experience but they are not a substitute for it.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>certification</category>
    </item>
    <item>
      <title>Certification Tips from an AWS Hero</title>
      <dc:creator>Brian Tarbox</dc:creator>
      <pubDate>Tue, 26 Mar 2024 20:18:21 +0000</pubDate>
      <link>https://forem.com/btarbox/certification-tips-from-an-aws-hero-n16</link>
      <guid>https://forem.com/btarbox/certification-tips-from-an-aws-hero-n16</guid>
      <description>&lt;p&gt;Earning an AWS certification is a great way to validate your cloud skills and advance your career. You should keep in mind that getting a certification isn't enough to get you a job.  You need experience to do a job, and the certification should be seen as validating that experience rather than replacing it.&lt;/p&gt;

&lt;p&gt;Also, the exams are challenging and require dedicated preparation. Here are some study tips I've used to earn both AWS Professional certifications, the Data and Security Specialties, and the (now deprecated) Alexa Specialty.&lt;/p&gt;

&lt;h2&gt;
  
  
  Take an Initial Practice Exam
&lt;/h2&gt;

&lt;p&gt;Before you start studying, take an initial AWS practice exam to get a baseline score and identify which areas you need to focus on. The practice exams from sources like Tutorials Dojo closely mirror the actual certification exams in terms of question style and difficulty. Your initial score will likely be low, but that's okay - it will show you which domains need more work.&lt;/p&gt;

&lt;h2&gt;
  
  
  Take a Class from an AWS Expert
&lt;/h2&gt;

&lt;p&gt;While you can self-study using the free resources on the AWS site, many people find it valuable to take a structured course taught by an instructor who is recognized as an AWS expert. Courses from providers like Cloud Academy and Linux Academy provide in-depth coverage of the exam objectives and include hands-on labs. An expert instructor can explain complex topics and share insights from real-world experience. Personally, I'm a huge fan of classes by Adrian Cantrill, Stephane Maarek, and Neal Davis.&lt;/p&gt;

&lt;h2&gt;
  
  
  Take Notes by Hand
&lt;/h2&gt;

&lt;p&gt;As you go through your training course or other study materials, take notes by hand rather than typing them. The physical act of writing has been shown to improve learning and retention. Use a notebook or loose-leaf paper, and write down key facts, definitions, diagrams, and anything else that will help cement the concepts. I've become a huge fan of the reMarkable tablet for note taking. The reMarkable is just for note taking: you can't check email on it and it has no browser. That is its strength ... it doesn't allow you to get distracted. Writing longhand is much slower than typing, but that gives your brain more time to incorporate the concepts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Use GenAI if you are confused
&lt;/h2&gt;

&lt;p&gt;Don't be afraid to ask your favorite chat bot (mine is Perplexity.ai) to explain things to you. Sometimes just reading something that is phrased differently can make all the difference in understanding. Try prompts such as "&lt;em&gt;when should I pick Kinesis Firehose rather than Kinesis Data Streams&lt;/em&gt;".&lt;/p&gt;

&lt;h2&gt;
  
  
  Take Practice Tests in Review Mode
&lt;/h2&gt;

&lt;p&gt;One of the most effective ways to study is by taking practice tests in review mode so you get immediate feedback and explanations. Go through each practice question slowly and use both correct and incorrect answers as a learning experience.&lt;/p&gt;

&lt;h2&gt;
  
  
  Analyze Questions You Missed
&lt;/h2&gt;

&lt;p&gt;For any practice test questions you get wrong, go back and analyze why you missed them. Write out in longhand the reasons your incorrect response was wrong, and explain why the right answer is correct. This reinforces the underlying concepts. Don't just skim over missed questions - dig into them deeply.&lt;/p&gt;

&lt;h2&gt;
  
  
  Also, dig deeply into the questions you get right!
&lt;/h2&gt;

&lt;p&gt;Even on questions you answered correctly, you will get a reason why each incorrect option is wrong. You can use those factoids in other questions: if an answer is wrong because Service A can't be a source for Service B, you can keep that in mind for all following questions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Revise your notes
&lt;/h2&gt;

&lt;p&gt;In the US we say study but in the UK they use the term "revise". The key difference is that studying often just means re-reading your notes, which is a rather passive activity. Revising, by contrast, involves actually re-writing your notes. You might, for example, have gotten several questions wrong about Kinesis, with the corresponding notes spread out across multiple pages. Taking the time to revisit these notes and gather them together forms a stronger memory map than just re-reading them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Review Notes and Repeat
&lt;/h2&gt;

&lt;p&gt;Repeat this cycle of practice testing and reviewing notes until you consistently score above the passing mark.  By following this study plan using practice tests, expert-led training, handwritten notes, and focused analysis of missed questions, you'll build the knowledge required to earn your AWS certification. The effort is challenging but extremely rewarding. &lt;/p&gt;

&lt;h1&gt;
  
  
  Where to take the exam
&lt;/h1&gt;

&lt;p&gt;Keep in mind that these exams can take over three hours. If you opt for a remote testing experience you must:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;have a completely clear desk&lt;/li&gt;
&lt;li&gt;have no external monitors&lt;/li&gt;
&lt;li&gt;have no food or drink&lt;/li&gt;
&lt;li&gt;take no bathroom breaks - in fact, you can't even stand up&lt;/li&gt;
&lt;li&gt;not speak or even move your mouth - because you might be relaying exam content to a hidden recorder&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you choose this option, put a sign on your door saying no interruptions and hope that your kids and pets can read! Your local library might be a better venue for a remote exam.&lt;/p&gt;

&lt;p&gt;If there is an exam center near you, that is often a good option. You are allowed a bathroom break and can bring in water (in a clear, sealed bottle).&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Enhancing Data Security with S3 Object Lock</title>
      <dc:creator>Brian Tarbox</dc:creator>
      <pubDate>Fri, 22 Mar 2024 12:51:57 +0000</pubDate>
      <link>https://forem.com/btarbox/enhancing-data-security-with-s3-object-lock-1je4</link>
      <guid>https://forem.com/btarbox/enhancing-data-security-with-s3-object-lock-1je4</guid>
      <description>&lt;p&gt;As organizations increasingly store critical data in Amazon S3, the risk of cyber threats such as ransomware attacks escalates. According to SonicWall, there were 1,748 recorded ransomware attempts per customer during the first three quarters of 2021. Furthermore, a report from Positive Technologies states that cybercriminals can penetrate 93 percent of company networks. Ransomware attacks involving S3 data often involve stealing or encrypting the victim's data, holding it hostage until a ransom is paid.&lt;/p&gt;

&lt;p&gt;While companies traditionally employ a layered approach to protecting their S3 data, including bucket policies, IAM roles, service control policies, and permission boundaries, these measures may not be sufficient if an attacker gains access to an administrative account. This is where S3 Object Lock comes into play.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding S3 Object Lock
&lt;/h2&gt;

&lt;p&gt;S3 Object Lock is a mechanism that prevents an object version from being deleted or modified. It is only available for versioned objects and does not prevent the creation of new versions. However, it can guarantee that a specific version will not be altered or deleted.&lt;/p&gt;

&lt;p&gt;It is important to note that objects in S3 are immutable. When a PUT operation is performed on a non-versioned object, a new object is created with the same key, and the previous object is deleted. With versioned objects, the previous object gets a version tag, and the newly PUT object becomes the current version.&lt;/p&gt;

&lt;p&gt;Object Locks can exist for either a specific time period called the "retention period" or indefinitely via a "legal hold." During an object's retention period or when a legal hold is enabled, the object cannot be deleted. A retention period can be extended as needed, and a legal hold can be disabled.&lt;/p&gt;
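&lt;p&gt;As a minimal sketch of how a legal hold is placed and released, the helper below builds the &lt;code&gt;LegalHold&lt;/code&gt; argument that the &lt;code&gt;put_object_legal_hold&lt;/code&gt; API expects; the bucket, key, and version ID in the commented call are hypothetical, and running it would require AWS credentials and an Object Lock-enabled bucket.&lt;/p&gt;

```python
def legal_hold(status):
    """Build the LegalHold argument for put_object_legal_hold.

    status must be 'ON' (hold active) or 'OFF' (hold released).
    """
    if status not in ("ON", "OFF"):
        raise ValueError("status must be 'ON' or 'OFF'")
    return {"Status": status}

# Applying it (hypothetical names; needs credentials and a lock-enabled bucket):
# import boto3
# s3 = boto3.client("s3")
# s3.put_object_legal_hold(
#     Bucket="my-locked-bucket",
#     Key="litigation/evidence.pdf",
#     VersionId="3HL4kqtJvjVBH40Nrjfkd",   # the hold applies to one version
#     LegalHold=legal_hold("ON"),          # legal_hold("OFF") releases it later
# )
```

&lt;p&gt;Unlike a retention period, a legal hold has no expiry date; it stays in place until someone with the appropriate permission turns it off.&lt;/p&gt;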

&lt;h2&gt;
  
  
  Governance and Compliance Modes
&lt;/h2&gt;

&lt;p&gt;Object Lock operates in two modes: governance and compliance. In governance mode, a user with the &lt;code&gt;s3:BypassGovernanceRetention&lt;/code&gt; permission can shorten a retention period, effectively removing the lock. Similarly, a user with the &lt;code&gt;s3:PutObjectLegalHold&lt;/code&gt; permission can remove a legal hold.&lt;/p&gt;

&lt;p&gt;However, in compliance mode, no one, including AWS, can shorten the retention period. If an Object-Locked object is set to compliance mode with a five-year retention period, that object will remain locked for five years, regardless of any attempts to modify or delete it.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Governance Mode&lt;/th&gt;
&lt;th&gt;Compliance Mode&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Change Legal Hold&lt;/td&gt;
&lt;td&gt;Requires special permission&lt;/td&gt;
&lt;td&gt;Requires special permission&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Extend Retention Period&lt;/td&gt;
&lt;td&gt;Requires special permission&lt;/td&gt;
&lt;td&gt;Requires special permission&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Shorten Retention Period&lt;/td&gt;
&lt;td&gt;Requires special permission&lt;/td&gt;
&lt;td&gt;No one, including AWS, can shorten the retention period&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
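&lt;p&gt;The two modes in the table above use the same API; only the &lt;code&gt;Mode&lt;/code&gt; value differs. The sketch below builds the &lt;code&gt;Retention&lt;/code&gt; argument for &lt;code&gt;put_object_retention&lt;/code&gt;; the bucket, key, and version ID in the commented call are hypothetical placeholders.&lt;/p&gt;

```python
import datetime

def retention_settings(mode, days):
    """Build the Retention argument for put_object_retention.

    mode is 'GOVERNANCE' or 'COMPLIANCE'; days is how long this
    object version stays locked, counted from now.
    """
    until = datetime.datetime.now(datetime.timezone.utc) + datetime.timedelta(days=days)
    return {"Mode": mode, "RetainUntilDate": until}

# Applying it (hypothetical names; needs credentials and a lock-enabled bucket):
# import boto3
# s3 = boto3.client("s3")
# s3.put_object_retention(
#     Bucket="my-locked-bucket",
#     Key="trial-data.csv",
#     VersionId="3HL4kqtJvjVBH40Nrjfkd",
#     Retention=retention_settings("GOVERNANCE", 30),
# )
# Shortening a GOVERNANCE lock later requires the
# s3:BypassGovernanceRetention permission plus
# BypassGovernanceRetention=True on the request; a COMPLIANCE lock
# cannot be shortened by anyone.
```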

&lt;h2&gt;
  
  
  Comparison with Glacier Vault Lock
&lt;/h2&gt;

&lt;p&gt;While S3 Object Lock has similarities to Glacier Vault Lock, their respective use cases differ. Vault Lock is designed to protect a Glacier vault from modification, assuming that the objects stored in the vault are unlikely to be accessed frequently due to their petabyte scale, such as genomic or machine learning workloads.&lt;/p&gt;

&lt;p&gt;On the other hand, Object Lock is used to protect data that may be subject to active usage, such as drug trial data, which is heavily used initially and may be required for re-calculation by regulatory bodies like the FDA.&lt;/p&gt;

&lt;h2&gt;
  
  
  Initiating S3 Object Lock
&lt;/h2&gt;

&lt;p&gt;Object Lock only works on buckets with versioning enabled, and it can only be enabled for a bucket when it is first created. However, AWS can enable Object Lock for an existing bucket upon request.&lt;/p&gt;
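&lt;p&gt;A minimal sketch of that setup, assuming boto3 credentials: request Object Lock at bucket creation, then optionally set a bucket-wide default retention so every new object version is locked automatically. The bucket name is hypothetical.&lt;/p&gt;

```python
def default_lock_config(mode, days):
    """Bucket-level default retention: every new object version is
    locked in `mode` for `days` days unless overridden per object."""
    return {
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": mode, "Days": days}},
    }

# import boto3
# s3 = boto3.client("s3")
# s3.create_bucket(
#     Bucket="my-locked-bucket",          # hypothetical name
#     ObjectLockEnabledForBucket=True,    # must be requested at creation
# )
# s3.put_object_lock_configuration(
#     Bucket="my-locked-bucket",
#     ObjectLockConfiguration=default_lock_config("GOVERNANCE", 30),
# )
```

&lt;p&gt;Note that &lt;code&gt;ObjectLockEnabledForBucket=True&lt;/code&gt; also turns on versioning, which Object Lock requires.&lt;/p&gt;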

&lt;p&gt;To lock a large number of objects, lifecycle rules or S3 Batch Operations can be used. Lifecycle rules allow a limited set of operations to be performed on all objects in a bucket or those matching a filter pattern. S3 Batch Operations support Object Lock operations and can perform actions on a list of objects specified in a manifest, which can be a CSV file created manually or via S3 Inventory.&lt;/p&gt;
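&lt;p&gt;For the Batch Operations route, the manifest is simply a CSV of &lt;code&gt;bucket,key&lt;/code&gt; rows (optionally with a version ID per row). A small sketch of building one by hand, with hypothetical bucket and key names:&lt;/p&gt;

```python
import csv
import io

def build_manifest(bucket, keys):
    """Write a CSV manifest (one bucket,key row per object) in the
    format accepted by S3 Batch Operations."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    for key in keys:
        writer.writerow([bucket, key])
    return buf.getvalue()

# manifest = build_manifest("my-locked-bucket", ["reports/q1.csv", "reports/q2.csv"])
# Upload this manifest to S3, then reference it when creating the
# batch job that applies the Object Lock operation to every listed object.
```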

&lt;h2&gt;
  
  
  Easing into Object Lock Usage
&lt;/h2&gt;

&lt;p&gt;Before implementing Object Lock, it is essential to understand the usage patterns of your objects. If your objects are short-lived, undergo frequent updates, or are not mission-critical, Object Lock might not be appropriate.&lt;/p&gt;

&lt;p&gt;One approach is to start by setting short retention periods and using governance mode on a select group of objects. If applications encounter failures under this regime, an administrator can remove the lock or shorten the retention period until the application's behavior is understood. After a trial period in governance mode, organizations can switch to compliance mode, initially with relatively short retention periods before moving to longer periods.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;S3 Object Lock provides robust protection for critical data stored in Amazon S3, helping organizations mitigate the risks of ransomware attacks and data breaches. While not suitable for all objects, it is worth considering for sensitive data such as personal health information (PHI), personally identifiable information (PII), or other business-critical or privacy-sensitive data. By implementing Object Lock, organizations can enhance their data security posture and provide peace of mind to their CISOs and stakeholders.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Running an Inclusive and Engaging AWS User Group</title>
      <dc:creator>Brian Tarbox</dc:creator>
      <pubDate>Thu, 21 Mar 2024 22:32:15 +0000</pubDate>
      <link>https://forem.com/btarbox/running-an-inclusive-and-engaging-aws-user-group-i0i</link>
      <guid>https://forem.com/btarbox/running-an-inclusive-and-engaging-aws-user-group-i0i</guid>
      <description>&lt;p&gt;AWS User Groups are a great way for cloud enthusiasts to come together, learn from each other, and grow their skills. However, running a successful user group requires more than just technical know-how. It's essential to create an inclusive environment that welcomes people from all backgrounds and encourages active participation from everyone. Here are some tips to help you run a better AWS User Group:&lt;/p&gt;

&lt;h3&gt;
  
  
  Attract a Culturally Diverse Set of Speakers
&lt;/h3&gt;

&lt;p&gt;Diversity in speakers not only brings fresh perspectives but also makes your user group more inclusive and welcoming to a broader audience. Actively seek out speakers from underrepresented groups in tech, such as women, people of color, LGBTQ+ individuals, and those with disabilities.&lt;/p&gt;

&lt;p&gt;Reach out to local universities, coding bootcamps, and organizations that support diversity in tech to find potential speakers. You can also leverage social media platforms and online communities to connect with a diverse pool of AWS experts and enthusiasts.&lt;/p&gt;

&lt;p&gt;When inviting speakers, be mindful of their preferred pronouns and any accessibility needs they may have. Ensure that your event venue is accessible and that you provide accommodations, such as sign language interpreters or captioning, if needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Create a Safe Space for All Attendees
&lt;/h3&gt;

&lt;p&gt;Establishing a code of conduct and enforcing it consistently is crucial for creating a safe and welcoming environment for all attendees. Your code of conduct should clearly outline expected behavior, define what constitutes unacceptable conduct, and specify the consequences for violations.&lt;/p&gt;

&lt;p&gt;Encourage attendees to report any incidents or concerns they may have, and have a designated team ready to address them promptly and professionally. Consider having a quiet room or area where attendees can take a break if they feel overwhelmed or need a moment of respite.&lt;/p&gt;

&lt;h3&gt;
  
  
  Encourage Introverts to Engage
&lt;/h3&gt;

&lt;p&gt;While user groups are social events, they can be overwhelming for introverts or those who prefer quieter settings. To ensure their engagement, consider the following strategies:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Provide opportunities for written or online participation, such as Q&amp;amp;A platforms or collaborative note-taking tools.&lt;/li&gt;
&lt;li&gt;Encourage small group discussions or breakout sessions where introverts may feel more comfortable contributing.&lt;/li&gt;
&lt;li&gt;Avoid putting introverts on the spot by calling on them unexpectedly. Instead, give them time to prepare their thoughts before asking for their input.&lt;/li&gt;
&lt;li&gt;Offer virtual attendance options for those who prefer to participate remotely.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Defend Against Zoom Bombing
&lt;/h3&gt;

&lt;p&gt;With the rise of virtual events, user groups must be vigilant against disruptive behavior like Zoom bombing. Here are some preventive measures:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Require registration and approval for attendees to join the meeting.&lt;/li&gt;
&lt;li&gt;Enable the waiting room feature and have a co-organizer monitor and admit attendees.  This can be challenging if you don't have a co-organizer.&lt;/li&gt;
&lt;li&gt;Disable screen sharing for non-hosts and limit other potentially disruptive features.&lt;/li&gt;
&lt;li&gt;Have a co-organizer dedicated to monitoring the chat and removing disruptive participants if necessary.&lt;/li&gt;
&lt;li&gt;Be aware of the security options provided by your platform of choice.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Manage Meeting Dynamics
&lt;/h3&gt;

&lt;p&gt;Running an engaging and productive meeting requires careful planning and facilitation. Here are some tips:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Set clear expectations and ground rules at the beginning of the meeting, such as respecting others' opinions and avoiding interruptions... unless you actively want people to interject with questions during the talk&lt;/li&gt;
&lt;li&gt;Use a structured agenda and timekeeping to ensure that the meeting stays on track and all topics are covered.&lt;/li&gt;
&lt;li&gt;Encourage participation by asking open-ended questions and calling on quieter attendees to share their thoughts (without putting them on the spot).&lt;/li&gt;
&lt;li&gt;Be mindful of your body language and tone, and ensure that you're not inadvertently favoring or dismissing certain attendees.&lt;/li&gt;
&lt;li&gt;Assign a dedicated facilitator or moderator to manage the flow of the meeting and ensure that everyone has a chance to contribute.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Select Engaging Meetup Topics
&lt;/h3&gt;

&lt;p&gt;Choosing the right topics is crucial for keeping your user group engaged and attracting new members. Here are some tips:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Survey your members regularly to understand their interests and pain points.&lt;/li&gt;
&lt;li&gt;Stay up-to-date with the latest AWS releases, updates, and industry trends to identify relevant and timely topics.&lt;/li&gt;
&lt;li&gt;Alternate between beginner-friendly and advanced topics to cater to attendees with varying skill levels.  Some meetings are structured with two talks: an introductory-level talk followed by a deep dive.&lt;/li&gt;
&lt;li&gt;Consider inviting guest speakers from AWS or partner organizations to share their expertise on specific services or use cases.  One thing to watch out for: AWS speakers can be "too polished," so warn them that your group will want to ask questions during the talk.&lt;/li&gt;
&lt;li&gt;Encourage members to suggest topics or volunteer to present on areas they're passionate about or have experience with.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Running a successful AWS User Group requires a combination of technical expertise, organizational skills, and a commitment to creating an inclusive and welcoming environment. By following these tips, you can foster a vibrant community of AWS enthusiasts who feel valued, engaged, and empowered to learn and grow together. &lt;/p&gt;

</description>
    </item>
    <item>
      <title>Are LLMs Essentially Teenagers?</title>
      <dc:creator>Brian Tarbox</dc:creator>
      <pubDate>Thu, 22 Feb 2024 20:27:38 +0000</pubDate>
      <link>https://forem.com/aws-heroes/are-llms-essentially-teenagers-1654</link>
      <guid>https://forem.com/aws-heroes/are-llms-essentially-teenagers-1654</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Diving into the world of Large Language Models (LLMs) might feel like trying to have a heart-to-heart with a teenager. Both come with their own unique capabilities and peculiarities in their use of language, sprinkled with moments of baffling decision-making. Imagine trying to untangle the world through the eyes of a teen—full of confidence, sometimes too much, ready to take on complex conversations but occasionally tripping over their own shoelaces. This piece takes a light-hearted yet insightful stroll through the similarities between the mysterious minds of LLMs and the unpredictable nature of teenage behavior. &lt;/p&gt;

&lt;h2&gt;
  
  
  Similarities
&lt;/h2&gt;

&lt;p&gt;Just like teenagers stepping into the big, wide world without much real-life experience under their belts, Large Language Models (LLMs) navigate the vast digital universe with a blend of overconfidence and, let's say, a vivid imagination. Both are in their formative years, so to speak, learning on the go and sometimes making decisions that leave us scratching our heads. This exploration into their common ground isn't just for fun—it sheds light on the quirks and capabilities of our AI counterparts.&lt;/p&gt;

&lt;p&gt;Now, think about the last time you tried to follow the thought process of a teenager. Their decisions are shaped by a cocktail of factors: brain development, peer pressure, and personal experiences, to name a few. It's a puzzle that's tough to solve, mirroring the complexity of understanding how LLMs arrive at their conclusions. Even though feedback can guide them in new directions, peeling back the curtain to reveal the "why" behind their choices often feels like an exercise in guesswork. &lt;/p&gt;

&lt;p&gt;Diving into conversations with Large Language Models (LLMs) or teenagers can sometimes feel like talking to someone who's convinced they know exactly where you're coming from—regardless of whether they actually do. Both LLMs and teens can come across as a bit too sure of themselves, often missing the mark on gauging the other person's level of expertise. LLMs, with their impressive language skills, still haven't mastered the art of recognizing who they're chatting with, much like a teenager confidently explaining the internet to a software engineer.&lt;/p&gt;

&lt;p&gt;When it comes to learning, LLMs go through a kind of digital "growing up" that's reminiscent of human evolution but at hyper speed. They absorb vast oceans of text to get a grip on human chatter, a process that mirrors the slow, meticulous journey of human language development over millennia. This training is no small feat; it's a massive investment in understanding and mimicking the way we communicate. It highlights not just how LLMs learn to talk the talk but also puts into perspective the incredible journey of human language evolution—showing that both teenagers and AI have a lot of growing up to do, each in their own complex, sometimes overconfident way.&lt;/p&gt;

&lt;p&gt;Just as teenagers navigate the tricky waters of growth, guided by the cheers and jeers from their world, Large Language Models (LLMs) and Generative AI learn to refine their digital personas through feedback. It's a bit like how a teen lights up with a well-timed compliment or mulls over a piece of constructive criticism, adjusting their course slightly with each new piece of advice. LLMs, fed on a diet of endless data, tweak their responses and improve their chatter based on the digital applause or boos they receive. This process is akin to a teenager's journey of self-discovery and adaptation, absorbing life's lessons and evolving. Both LLMs and teens show us the power of feedback—not just in shaping AI's ability to communicate, but in reminding us of the timeless act of learning from the responses we gather in day to day communication.&lt;/p&gt;

&lt;h2&gt;
  
  
  Differences
&lt;/h2&gt;

&lt;p&gt;When it comes to solving a problem, Large Language Models (LLMs) act as digital detectives, sifting through mountains of data, applying intricate computational formulas to sniff out patterns and spit out answers. Their method is all about crunching numbers and matching patterns, which means while they often hit the nail on the head with contextually spot-on replies, figuring out the "why" behind their conclusions is a bit like trying to read tea leaves.&lt;/p&gt;

&lt;p&gt;Then there are teenagers, whose approach to problem-solving is as layered as their personalities. Imagine them navigating a maze, where each turn is influenced by a mix of sharp cognitive skills, the social compass set by their peers, and the rich tapestry of their personal experiences. Their decisions emerge from a blend of thought, education, personal growth, and social interaction—making for a problem-solving style that’s holistic and grounded in experience.&lt;/p&gt;

&lt;p&gt;While LLMs dissect problems with the precision of a computer algorithm, teenagers tackle them with a depth that comes from living through experiences, feeling every high and low, and learning from the social world around them. This distinction highlights not just the difference in how they arrive at solutions, but the contrast between the logical, pattern-based reasoning of AI and the complex, emotionally rich decision-making of human beings.&lt;/p&gt;

&lt;h2&gt;
  
  
  Healthy Skepticism of their Output
&lt;/h2&gt;

&lt;p&gt;In the world of advice-giving, both Large Language Models (LLMs) and teenagers hold a unique place. They're like eager helpers, ready to chime in with insights or solutions. However, taking their words as gospel might lead you down a rabbit hole. LLMs, for all their linguistic finesse, sometimes echo the biases and errors marbled throughout their vast training data. It's a bit like getting directions from a well-meaning friend who's never actually been to the place they're describing.&lt;/p&gt;

&lt;p&gt;Teenagers, with their boundless energy and fresh perspectives, also come with their own set of disclaimers. Their advice, while often insightful, carries the limitations of their life experiences. It's like they're seeing the world through a kaleidoscope—vibrant and full of potential, yet not always clear or accurate.&lt;/p&gt;

&lt;p&gt;Both LLMs and teens share a common trait: a confident exterior that doesn't always match up with the depth of their knowledge. This confidence, while admirable, can sometimes lead us astray, especially when it comes to sifting through the information they provide. LLMs don't always know when they're out of their depth, spinning out answers without the ability to critique their own sources. Teens, influenced by their social circles and their own budding self-assurance, might not always question their conclusions with the rigor needed.&lt;/p&gt;

&lt;p&gt;Peeling back the layers to understand why they've landed on a certain piece of advice is another challenge. With LLMs, you're dealing with a black box of algorithms and data; with teenagers, a complex web of thoughts and influences. Both can leave you puzzled, trying to trace the path from question to answer.&lt;/p&gt;

&lt;p&gt;Navigating the insights offered by both LLMs and teenagers requires a discerning eye. It's a dance of valuing their input while also recognizing the need for a pinch of skepticism and a healthy dose of follow-up questions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Making Use of This Insight
&lt;/h2&gt;

&lt;p&gt;Navigating conversations with Large Language Models (LLMs) and teenagers can sometimes feel like trying to solve a mystery without all the clues. But, just like any good detective, knowing the right questions to ask can make all the difference. Being clear and specific in your queries, such as using prompts like "How did you come up with that?" or "Explain it like I'm 10," can turn a vague answer into a treasure trove of insights. It's about encouraging a deeper dive into their thought processes, whether you're dealing with a sophisticated AI or a savvy teen.&lt;/p&gt;

&lt;p&gt;Asking for elaboration with phrases like "Can you tell me more about that?" or "Could you put that another way?" can also work wonders. These techniques don't just apply to extracting more meaningful responses; they're about fostering understanding and clarity, regardless of whether you're interpreting the output of an LLM or decoding the latest teen lingo.&lt;/p&gt;

&lt;p&gt;And then there's the lighter side of the comparison—the investment. Training an LLM can be as financially overwhelming as planning for a teenager's college education. It's a humorous but apt analogy that highlights the cost and commitment behind these endeavors. Sometimes, opting for a less intensive route—a smaller AI model or a more affordable educational path—might not just save resources but also turn out to be the smartest choice in the long run. In both scenarios, the key is to weigh the return on investment carefully, reminding us that bigger or more expensive isn't always better.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The comparison between Large Language Models (LLMs) and teenagers isn't just witty banter; it's a gateway to a deeper understanding of the complexities we face when interacting with advanced AI. Recognizing their shared traits—like how they respond to feedback, their sometimes misplaced confidence, and the opaque nature of their decision-making—can equip us with a more layered approach to engaging with LLMs. This perspective helps peel back the curtain on the enigmatic world of artificial intelligence, revealing not just its potential but also its limitations.&lt;/p&gt;

&lt;p&gt;Indeed, as we race to keep up with the breakneck pace of AI development, any tool that demystifies our "soon to be robot overlords" is invaluable. By embracing this analogy, we're not just making sense of LLMs; we're paving the way for the creation of ethical standards and effective strategies that harness the power of LLMs across various fields. This not only enhances our grasp of their behavior and skills but also ensures that as we move forward, we do so with a keen awareness of the responsibility that comes with wielding such transformative technology.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>learning</category>
    </item>
    <item>
      <title>Unraveling the Innovations of AWS Caspian, Grover and Time Sync</title>
      <dc:creator>Brian Tarbox</dc:creator>
      <pubDate>Wed, 24 Jan 2024 17:28:31 +0000</pubDate>
      <link>https://forem.com/btarbox/unraveling-the-innovations-of-aws-caspian-grover-and-time-sync-481m</link>
      <guid>https://forem.com/btarbox/unraveling-the-innovations-of-aws-caspian-grover-and-time-sync-481m</guid>
      <description>&lt;p&gt;&lt;strong&gt;Caspian, Grover and Time Sync are key features in the march towards "real" serverless.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Introduction to AWS Caspian&lt;/strong&gt;&lt;br&gt;
Caspian is a pioneering technology developed by AWS, primarily for the Aurora Serverless platform. It represents a paradigm shift in resource allocation and management for serverless databases. The technology is built upon several key innovations:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;New Hypervisor Technology&lt;/strong&gt;&lt;br&gt;
At the heart of Caspian lies a newly developed hypervisor, distinct from traditional ones like AWS's Nitro. Traditional hypervisors allocate a fixed set of resources to an instance. In contrast, the Caspian hypervisor dynamically allocates and reallocates resources based on the database's real-time needs. This flexibility ensures that databases always have the necessary resources, thereby optimizing performance and efficiency.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Advanced Heat Management System&lt;/strong&gt;&lt;br&gt;
Caspian's heat management system is a cornerstone of its innovation. It oversees the real-time allocation of physical resources, ensuring that databases have access to necessary resources when required. This system is pivotal in managing database migrations between physical hosts with minimal performance impact, allowing for a smooth and efficient scaling process.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cooperative Oversubscription Technique&lt;/strong&gt;&lt;br&gt;
A unique aspect of Caspian is its use of cooperative oversubscription. This approach allows each instance to support the maximum memory available on the host. However, physical memory allocation is based on the actual needs of the database running on the instance, not on a predetermined allocation. This technique ensures efficient resource utilization and reduces wastage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dynamic Resizing Capability&lt;/strong&gt;&lt;br&gt;
Perhaps the most striking feature of Caspian is its ability to enable Aurora Serverless databases to resize within milliseconds in response to changing workloads. This capability makes the database highly elastic, catering to the fluctuating demands of modern applications with unprecedented efficiency.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Exploring AWS Grover&lt;/strong&gt;&lt;br&gt;
While Caspian revolutionizes serverless computing, Grover is transforming the world of database storage and logging. Grover is an internal, optimized distributed storage system for Amazon Aurora. It brings several advancements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Disaggregated Storage System&lt;br&gt;
Grover introduces a disaggregated storage system, allowing Aurora to decouple the database from its storage. This separation leads to more efficient data handling and processing, enabling Aurora to manage data at scale more effectively.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Innovative Approach to Database Logs&lt;br&gt;
Traditionally, databases log locally. Grover changes this by sending each log entry to a remote system. This system ensures the durability and availability of these logs across multiple Availability Zones, enhancing data reliability and recovery capabilities.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Data Structure Replication&lt;br&gt;
One of the most innovative aspects of Grover is its ability to process the log and replicate the database's internal memory structures on a remote system. These structures can be sent back to the Aurora database as needed, significantly reducing the I/O demands and boosting overall efficiency.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Enhanced Performance and Durability&lt;br&gt;
Grover's architecture offers superior performance, scalability, and durability compared to traditional database systems. It allows Aurora to provide higher throughput and resilience, making it a robust choice for modern applications that demand reliability and speed.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Impact on Cloud Computing&lt;/strong&gt;&lt;br&gt;
The implications of Caspian and Grover on cloud computing are profound. Caspian reimagines how resources are allocated and managed in a serverless environment. Its dynamic resizing capability ensures that serverless databases can adapt to workload changes swiftly and efficiently. This innovation allows businesses to manage their databases with unprecedented agility and cost-effectiveness.&lt;/p&gt;

&lt;p&gt;Grover, on the other hand, revolutionizes data storage and logging for distributed databases. Its approach to handling database logs and the replication of data structures enhances the performance and durability of databases. The technology enables businesses to handle massive amounts of data with improved efficiency and reliability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Time Sync&lt;/strong&gt;&lt;br&gt;
Caspian and Grover depend on the ability to safely perform distributed database writes.  This requires a globally agreed upon time synchronization ability. In the past such a system was either too expensive or not accurate enough to allow for high through distributed writes.&lt;/p&gt;

&lt;p&gt;The AWS Time Sync service is a timekeeping service designed to offer both precision and accuracy in time synchronization across AWS services and instances. It achieves this through a custom chip built into Nitro (of course).  These on-board reference clocks are extremely precise, and because they are part of Nitro they are more efficient than past attempts at synchronization, ensuring that the time delivered is consistent and accurate to within microseconds globally.&lt;/p&gt;

&lt;p&gt;You may not directly interact with any of these three services but your future applications may well depend on them.&lt;/p&gt;

&lt;p&gt;At the core of the service is custom-designed infrastructure integrated with Nitro, including specialized reference clocks and a dedicated time synchronization network. This network distributes the timing pulse directly to each EC2 server, bypassing common sources of variability and ensuring ultra-precise timekeeping.&lt;/p&gt;

&lt;p&gt;The latest version of the Time Sync service, as announced, brings time synchronization to within microseconds of UTC. This level of accuracy is pivotal for applications that require ultra-precise time measurements, such as high-frequency trading platforms and scientific experiments.&lt;/p&gt;
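&lt;p&gt;From an EC2 instance, the Time Sync service is reachable as an ordinary NTP server at the link-local address &lt;code&gt;169.254.169.123&lt;/code&gt;. As a rough sketch of what happens under the hood, the snippet below hand-builds a minimal SNTP query; off EC2 the request will simply time out, and production systems should use chrony or ntpd rather than code like this.&lt;/p&gt;

```python
import socket
import struct

# Link-local Amazon Time Sync Service endpoint, reachable from EC2.
AWS_TIME_SYNC = "169.254.169.123"
NTP_TO_UNIX_EPOCH = 2208988800  # seconds between 1900-01-01 and 1970-01-01

def sntp_request_packet():
    """Build a 48-byte SNTP client request.

    First byte packs LI=0, VN=4, Mode=3 (client), i.e. 0b00100011 = 0x23;
    the remaining 47 bytes are zero.
    """
    return bytes([0x23]) + bytes(47)

def query_time(server=AWS_TIME_SYNC, timeout=2.0):
    """Return the server's transmit time as a Unix timestamp."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(sntp_request_packet(), (server, 123))
        data, _ = s.recvfrom(48)
    # The transmit-timestamp seconds field occupies bytes 40-43 of the reply.
    seconds = struct.unpack("!I", data[40:44])[0]
    return seconds - NTP_TO_UNIX_EPOCH
```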

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In conclusion, AWS's introduction of Caspian, Grover, and the Time Sync service represents a monumental stride in the realm of cloud computing, addressing key challenges in serverless computing, database management, and time-sensitive operations. Caspian and Grover, with their dynamic resource allocation, efficient data handling, and scalability, are paving the way for more robust, efficient, and cost-effective cloud solutions. Simultaneously, the Time Sync service not only strengthens AWS's existing offerings but also establishes a foundational component for emerging technologies like quantum computing. &lt;/p&gt;

</description>
    </item>
    <item>
      <title>The Future of Alexa Skills</title>
      <dc:creator>Brian Tarbox</dc:creator>
      <pubDate>Mon, 02 Oct 2023 15:52:08 +0000</pubDate>
      <link>https://forem.com/btarbox/the-future-of-alexa-skills-211j</link>
      <guid>https://forem.com/btarbox/the-future-of-alexa-skills-211j</guid>
      <description>&lt;p&gt;TL/DR: The Alexa skill ecosystem is evolving, with different outcomes for large brands vs. independent developers.&lt;/p&gt;

&lt;p&gt;When you ask Alexa for something, that request can be processed in several ways. First Party responses are those created by the Alexa team itself and are in general what you get when you ask, “what’s the weather” or “what time is it”.  Third Party responses are those handled by independent developers who write “skills” (think apps).  These are generally what you get when you mention a skill or brand name, such as “ask Uber to get me a ride”.&lt;/p&gt;

&lt;p&gt;The skill ecosystem can be thought of as similar to Apple’s App Store.  Developers create skills, get them approved (certified) and then users request them.  Until recently all skills were free to use but could include the ability to charge for things while running the skill.  You could charge for an extra sleep sound, or an extra life in a game, or for freemium features.  Recently Amazon added the ability to require a payment just to use a skill.&lt;/p&gt;

&lt;p&gt;Another similarity to the App Store is that discoverability is a challenge.  With several hundred thousand skills competing for attention, getting discovered is hard.  Amazon will promote your skill once it becomes popular, but that first step is up to you.&lt;/p&gt;

&lt;p&gt;A significant difference from the App Store is that most Alexa users are unaware that skills exist or that there actually is an Amazon Skill store.  As you can see, the vast majority of skills receive almost no user reviews.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fywnt0t6wr5i6hgcngc68.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fywnt0t6wr5i6hgcngc68.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the attempt to make first vs. third party responses transparent to the end user, Amazon neglected to educate users that skills existed at all.  Most consumers either didn’t understand skills or didn’t know they existed, and viewed the Alexa device as a black box that answers their questions.  The intricacies of first vs. third party responses were irrelevant to most users.&lt;/p&gt;

&lt;p&gt;This became more of an issue (or perhaps less of one, depending on your point of view) when Amazon added the ability to run a skill without first installing it.  It used to be that to run one of the Sleep Sounds skills (a very popular category) you had to first enable the skill in the Skill store.  Later it became possible to just say “Alexa tell sleep sounds to play” and it would transparently enable the skill.  While this reduces friction in the short term, it contributed to the lack of end user education about skills.&lt;/p&gt;

&lt;p&gt;It also gave Amazon more control over the ecosystem: there are now several hundred Sleep Sounds skills from which Amazon gets to select when a user asks for sleep sounds.  Amazon further muddied the waters via what is called a Name Free Interaction.  Rather than saying “Alexa ask Dominos to order me a pizza” you can just say “Alexa order me a pizza” which, invisibly to you, talks to the Dominos skill.  This is convenient, but it opens a question: if Pizza Hut also has a skill, which one gets the pizza order?  The answer is that skill selection is an Amazon black box.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fku4ym17b5k42ndjo88fu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fku4ym17b5k42ndjo88fu.png" alt="Image description"&gt;&lt;/a&gt;&lt;br&gt;
I have a skill that answers questions about Premier League Football (British soccer, for Americans).  You can say “Alexa ask Premier League about upcoming fixtures”.  I also have a Name Free Interaction that allows you to say, “Alexa what are the upcoming premier league games”.  However, it’s entirely up to Amazon whether they send that request to my skill, to another soccer skill, or just handle it themselves.&lt;/p&gt;

&lt;p&gt;Until last year Amazon had something called the Developer Rewards program, in which they would pay developers a nominal amount if their skill became popular.  The idea was that great skills helped the entire ecosystem, so developers should be rewarded.  My own soccer skill got over a million invocations, and I used to get $300-$400 a month from Amazon for making Alexa more helpful for all these users.  The program was cancelled last year, however, so unless your skill charges for things you make no money from it.  This left many of us wondering why we were investing time in skill development.&lt;/p&gt;

&lt;p&gt;To replace Developer Rewards, Amazon added In-Skill Purchases.  From within a skill you can make purchases, similar to in-game purchases on your phone.  Just like Apple, Amazon takes a percentage of the sale; they reduced their cut from 30% to 20% when the Developer Rewards program was eliminated.  They also added the ability to purchase physical items from the Amazon retail store from within a skill.  As an example, you can purchase various team scarves from within my soccer skill.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsed2b1aw3a0rais1ycux.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsed2b1aw3a0rais1ycux.png" alt="Image description"&gt;&lt;/a&gt;&lt;br&gt;
Some skill developers have used these new features to make significant revenue.  A hypnosis skill lets you purchase hypnosis sessions for smoking cessation, weight loss, and so on.  Leaving aside the question of whether being hypnotized by a voice assistant is a Good Thing, this skill’s developer makes a nice income.  However, for many skill developers this emphasis on monetization flies in the face of the “Delight The Customer” mantra that Amazon espouses.&lt;/p&gt;

&lt;p&gt;Just this week Amazon announced a set of ChatGPT/Large-Language-Model-inspired features to make Alexa interactions more natural and context aware.  You might not even need to say the “Alexa” wake word, because the device could detect your body language and tell that you were speaking to it.  This will be game changing in many ways, not least that it should eliminate accidental activations when another person (or a Zoom call or TV show) says “Alexa”.  Basically, the notion of having to say the wake word or a skill name seems to be going away.  Is this a Good Thing?&lt;/p&gt;

&lt;p&gt;For most users of the device it probably is a very good thing.  It should reduce some of the friction currently impeding voice interactions.  If the system gains more context, it may reduce the incidence of Smart Home errors.  For example, I have smart lights in my bedroom and the last thing I say at night is “Alexa turn lights off”.  However, the system seems to find “on” and “off” hard to distinguish, and so often believes I’ve asked for the lights to be turned on.  If the system had access to the smart home context and knew that the lights were already on, it could (perhaps) conclude that it was more likely that I said “lights off”.&lt;/p&gt;

&lt;p&gt;While I’m not sure how Dominos and Pizza Hut are going to fight it out for the “I want a pizza” utterance, I’m hard pressed to see how this helps the independent developer.  Just as the Apple App Store fills gaps in the native capabilities of the phone, skills do things that Alexa can’t do by itself.  As Alexa gains the ability to do more things, the room left for other skills almost certainly shrinks.  We must convince Alexa to pick our skill when a user says something that our skill could handle.  I know there are lots of places other than my skill where one could find soccer scores.  On the other hand, my skill does offer several custom metrics and graphs that really aren’t available elsewhere.  The trick, as always, is discovery.  If a developer created a skill that searched local pizza shop prices to find the best deal, would Alexa ever select that skill?  That’s hard to say.  I think small developers are going to have to learn new tricks to survive.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqdjtixyueotgw6yjik6f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqdjtixyueotgw6yjik6f.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Finally understanding Outposts and Local Zones</title>
      <dc:creator>Brian Tarbox</dc:creator>
      <pubDate>Wed, 13 Sep 2023 12:43:01 +0000</pubDate>
      <link>https://forem.com/btarbox/finally-understanding-outposts-and-local-zones-5d91</link>
      <guid>https://forem.com/btarbox/finally-understanding-outposts-and-local-zones-5d91</guid>
      <description>&lt;p&gt;I've been developing with AWS for over ten years, have about five certifications,I run the local Boston User Group and I've been to seven re:Invents.  I'm pretty deep into the platform.   And yet, there are still lots of services I've never touched.  For example, though I actually have written software that runs on the National Polar Orbital Observatory I've never used Ground Station.&lt;/p&gt;

&lt;p&gt;Some AWS services are just so specialized that unless you've had a customer need for them, it's unusual to have experience with them.  Until recently, Local Zones and Outposts were in that category for me.&lt;/p&gt;

&lt;p&gt;I could certainly tell you what they were, but without any real understanding.  An Outpost is an AWS-supplied rack running a subset of AWS services that you install in your on-premises location.  It has very specific requirements, as you might imagine, but basically you can install AWS locally.  You can create subnets on your Outpost and specify them when you create AWS resources such as EC2 instances, EBS volumes, ECS clusters, and RDS instances.  Instances in Outpost subnets communicate with other instances in the AWS Region using private IP addresses, all within the same VPC.  Each Outpost has a network connection back to an AWS Region as well as to the rest of your on-premises network.&lt;/p&gt;

&lt;p&gt;A Local Zone solves a similar problem by providing an AWS presence closer to end users, so as to provide single-digit millisecond latency.  While we're not supposed to know exactly where the hardware for a given AZ lives, the location of a Local Zone is the whole point.&lt;/p&gt;

&lt;p&gt;An Outpost can be a standard 42U rack or a 1U or 2U server.  Racks support more AWS services than servers do.  For example, EC2 and ECS are supported by both racks and servers, while EKS is not available on servers.&lt;/p&gt;

&lt;p&gt;Without having used either or even really understood a use case for them however, my knowledge was quite thin.&lt;/p&gt;

&lt;p&gt;Recently, however, I had the opportunity to attend the AWS/Riot Games Valorant World Championship.  A group of AWS Heroes and Community Builders were invited to attend the games and get behind-the-scenes looks at the technology behind them.&lt;/p&gt;

&lt;p&gt;If you're new to Valorant, it's a five-versus-five first person shooter game with an enormous following.  We watched the finals in an 18,000-seat arena with 189,000 viewers on Twitch.  The folks at Riot Games have an incredibly tuned system that runs at a blistering 132 frames per second, giving them about 7.5 milliseconds per frame.  Latency, especially unpredictable latency, is a game killer for Riot.  If a player has just a few tens of milliseconds of advantage they can "peek" around a corner and get back before their opponent can see them.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--qxwTu8O9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/1kh1jhkoe9ke013qcv9i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--qxwTu8O9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/1kh1jhkoe9ke013qcv9i.png" alt="Image description" width="800" height="516"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For the in-person championship, Riot ran the entire tournament on a single Outpost at the tournament site.  All ten players had latency that was as identical as can be measured.  The tournament team did say that if the single server were to go down they would be dead in the water.  They actually had a security guard preventing anyone from getting physically anywhere near the server!&lt;/p&gt;

&lt;p&gt;For general game play, Riot has a system for finding a location to run a game that provides all of the players equitable latency.  So, for example, a game between players in Boston might run in us-east-1, while a game with players in New England and Florida might run in Ohio.  These games will not have 7.5-millisecond latency, but the latency will be equitable.&lt;/p&gt;

&lt;p&gt;The Riot Games strategy is to add Outposts as needed but then encourage AWS to add Local Zones in those locations.  A Local Zone is a middle ground between "regular" AZs, which are fully AWS managed, and an Outpost, which lives at your site.&lt;/p&gt;

&lt;p&gt;An Outpost is not a cheap solution, in large part because you are paying for it 24x7 since you are not sharing it with anyone.  An EC2 configuration with 4 m5.12xlarge instances costs over five thousand dollars per month, a 12 r5.24xlarge configuration runs nearly $25,000/month, and 380 TB of S3 costs just under $40,000/month.&lt;/p&gt;

&lt;p&gt;Local Zone pricing is different because you are back to a pay-as-you-go model.  Running an m5.8xlarge instance in the Boston Local Zone costs $1.92/hour.  This is slightly higher than the normal regional cost of $1.536/hour but much lower than the roughly $6.90/hour effective cost of a similar Outpost configuration.&lt;/p&gt;
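&lt;p&gt;The Local Zone premium over the parent region is easy to quantify from those two rates (a quick illustrative check; published rates change over time):&lt;/p&gt;

```python
# Compare Boston Local Zone vs. regional on-demand pricing for m5.8xlarge,
# using the hourly rates quoted above.
local_zone_rate = 1.92   # $/hour in the Boston Local Zone
regional_rate = 1.536    # $/hour in the parent region

premium_pct = (local_zone_rate / regional_rate - 1) * 100
print(f"Local Zone premium: {premium_pct:.0f}%")  # → Local Zone premium: 25%
```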

&lt;p&gt;So, Local Zones are more expensive than "normal" regions, and Outposts are more expensive than Local Zones, but there are times when the trade off is worth it.&lt;/p&gt;

&lt;p&gt;If we've got your attention and you'd like to learn more, the Boston AWS User Group is planning an Outpost/Local Zone deep dive in mid-November.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>outpost</category>
      <category>localzone</category>
    </item>
    <item>
      <title>What Can Alexa Learn from LLMs?</title>
      <dc:creator>Brian Tarbox</dc:creator>
      <pubDate>Mon, 14 Aug 2023 14:35:05 +0000</pubDate>
      <link>https://forem.com/btarbox/what-can-alexa-learn-from-llms-ga6</link>
      <guid>https://forem.com/btarbox/what-can-alexa-learn-from-llms-ga6</guid>
      <description>&lt;p&gt;Language Learning Models (LLMs) have drastically transformed our perception of machines and their grasp over language. With models like OpenAI's GPT series and AWS’s Bedrock, we've witnessed a sea change in the capabilities of machines to comprehend and generate human-like text. An integral feature ingrained in these LLMs is their capacity to utilize context. When prodded for further dialogue or refinement, they typically yield better, more focused answers, a testament to their ability to adapt and learn. However, widely used consumer interfaces like Amazon's Alexa do not seem to make any use of context.  This may explain the limits of their adoption by consumers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Contextual Prowess of LLMs&lt;/strong&gt;&lt;br&gt;
Advanced LLMs capture the essence of context through various means:&lt;/p&gt;

&lt;p&gt;• Iterative Refinement: Upon receiving a query, LLMs produce a response based on their vast training data. If the user feels the answer isn't satisfactory and asks for more detail or clarity, the LLM can delve deeper, offering a refined and potentially more accurate answer.&lt;/p&gt;

&lt;p&gt;• Handling Ambiguity: Natural language is often ambiguous. Faced with multifaceted statements, LLMs can utilize prior interactions or seek further clarity to pinpoint the user's exact intention.&lt;/p&gt;

&lt;p&gt;• Adapting to Conversational Flow: Unlike older chatbots that treated each user input in isolation, LLMs can maintain a semblance of conversational continuity, building upon prior exchanges to ensure a cohesive and contextual dialogue.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Alexa Does Not Contextualize&lt;/strong&gt;&lt;br&gt;
One type of failure is that Alexa treats each interaction as a standalone “conversation”.  If you ask it the same question repeatedly it generally gives the same answer, even when you tell it that the answer was wrong.&lt;/p&gt;

&lt;p&gt;User: “play xxxx”&lt;br&gt;
Alexa: “playing yyyy”&lt;br&gt;
User: “no, play xxxx”&lt;br&gt;
Alexa: “playing yyyy”&lt;/p&gt;

&lt;p&gt;You can keep repeating “no” all day and Alexa will never change its response.&lt;/p&gt;

&lt;p&gt;Another way it lacks context is in Home Automation.  The words “on” and “off” can sound very similar.  So, imagine your lights are on and you say: “turn lights off”.  Alexa’s natural language understanding (NLU) can easily hear “turn lights on” and conclude there is nothing to do.  If the user repeats the instruction, Alexa simply does not have the ability to notice that if the lights are on it’s far more likely that the user said “off” rather than “on”.  So, Alexa continues to decide there is nothing to do, and the user gets more and more frustrated.&lt;/p&gt;
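&lt;p&gt;The state-aware check described here is trivial to express in code.  The sketch below is purely illustrative: Alexa exposes no such hook to developers, and the function is hypothetical.&lt;/p&gt;

```python
def disambiguate_power_command(heard: str, light_is_on: bool) -> str:
    """Given an ambiguous 'on'/'off' transcription, use the current device
    state to pick the more plausible intent: users rarely ask for the
    state a device is already in."""
    if heard not in ("on", "off"):
        return heard  # nothing to disambiguate
    if heard == "on" and light_is_on:
        return "off"  # lights already on, so "off" is far more likely
    if heard == "off" and not light_is_on:
        return "on"   # lights already off, so "on" is far more likely
    return heard

# The scenario above: the lights are on and the NLU hears "on".
print(disambiguate_power_command("on", light_is_on=True))  # → off
```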

&lt;p&gt;Now, all of this applies primarily to what we call First Party interactions, which means interactions you have with Amazon developed features.  These are the requests for things like “what is the weather”, “set a timer”, or “when is the next Red Sox game.”  No context is thought to be required for these requests as they are completely standalone.&lt;/p&gt;

&lt;p&gt;By contrast, there are over a hundred thousand “skills” (Amazon’s word for applications) written by independent developers.  These skills often do maintain context, but the extent of that context is hit and miss.  In my own Premier League skill for example (which covers UK Football), if you ask about the “Red Sox” the skill reminds you that you’re talking to a football skill and the Red Sox are a baseball team.  If this was a First Party skill Alexa would likely not respond at all.&lt;/p&gt;

&lt;p&gt;A third way that Alexa ignores context is its lack of emotive sensitivity.  While detecting the emotional content of a voice utterance isn’t a completely solved problem it can certainly be done (&lt;a href="https://towardsdatascience.com/detecting-emotions-from-voice-clips-f1f7cc5d4827"&gt;https://towardsdatascience.com/detecting-emotions-from-voice-clips-f1f7cc5d4827&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;One of the hallmarks of a bad Alexa interaction is a user becoming increasingly frustrated or even angry at Alexa and Alexa being unaware of the fact.  At Voice22 and at a Meetup presentation I spoke about how Alexa could incorporate emotion in its responses (&lt;a href="https://www.youtube.com/watch?v=4LyQy-Aq79o"&gt;https://www.youtube.com/watch?v=4LyQy-Aq79o&lt;/a&gt;).  Unfortunately, for privacy reasons Alexa is designed so that the actual voice utterance is not available to a third-party developer.   This means that there is no opportunity to analyze the interaction for emotion.  First Party skills however could use emotion detection since they are Amazon internal and could access the raw voice recording.  Additionally, many third party skills notice if a user is saying "No" a lot and offer a help message or even an apology.&lt;/p&gt;

&lt;p&gt;Until recently this wasn’t much of a concern because compared with the alternatives Alexa’s interactions were pretty good.  With LLMs raising the bar however Alexa’s interactions seem increasingly lacking.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can Alexa Learn Context?&lt;/strong&gt;&lt;br&gt;
One would certainly think so.  Context, after all, is simply having awareness of what has happened and what is happening.  Alexa certainly knows that the lights are on, that it’s nighttime, and that on each of the previous 100 days the last user interaction has been to turn off the lights.  We know that Alexa keeps track of these interactions, as you can see them all via the Alexa app.  So, it’s actually shocking that this readily available information is not being used.&lt;/p&gt;

&lt;p&gt;Many skill writers are incorporating LLMs into their responses.  It’s early days for this approach and there are several stumbling blocks … not the least of which is that Alexa only has eight seconds to create a response, and LLMs are often not that fast.  Various skill writers are designing solutions to this issue but as yet no general solution has emerged.&lt;/p&gt;

&lt;p&gt;On the other hand, a third-party skill can easily maintain its own context which can be fed into the next LLM prompt which would presumably create a higher quality response.&lt;/p&gt;
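&lt;p&gt;As a sketch of what that might look like (the session class and prompt format here are hypothetical, not any Alexa or LLM API):&lt;/p&gt;

```python
# A skill maintaining its own conversational context and folding it into
# each LLM prompt. A real skill would persist this in session attributes.
from collections import deque

class SkillSession:
    def __init__(self, max_turns: int = 5):
        # keep only the last few turns to stay within prompt-size limits
        self.history = deque(maxlen=max_turns)

    def record_turn(self, user_utterance: str, response: str):
        self.history.append((user_utterance, response))

    def build_prompt(self, user_utterance: str) -> str:
        context = "\n".join(f"User: {u}\nSkill: {r}" for u, r in self.history)
        return f"{context}\nUser: {user_utterance}\nSkill:".lstrip()

session = SkillSession()
session.record_turn("what are the upcoming fixtures",
                    "Arsenal play Chelsea on Saturday.")
# The next prompt carries the prior exchange, so the LLM can resolve
# follow-ups like this one in context.
print(session.build_prompt("who is favored to win"))
```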

&lt;p&gt;What’s really needed, however, is a large-scale change in the Alexa First Party responses.  This change will not be free or easy, and the large-scale layoffs in the Alexa division are not encouraging.  On the other hand, perhaps the division can stop creating strange products like the flying Alexa or the $1,599 Astro robot and concentrate on fixing the basics.&lt;/p&gt;

&lt;p&gt;Alexa could use the new era of LLMs to retake the lead in intelligent home automation.  I hope they do.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Exploring the Options for Billing and Capacity Reservations</title>
      <dc:creator>Brian Tarbox</dc:creator>
      <pubDate>Fri, 28 Jul 2023 13:19:48 +0000</pubDate>
      <link>https://forem.com/aws-heroes/exploring-the-options-for-billing-and-capacity-reservations-53o3</link>
      <guid>https://forem.com/aws-heroes/exploring-the-options-for-billing-and-capacity-reservations-53o3</guid>
      <description>&lt;p&gt;Although many projects have embraced serverless for the compute portion of their applications there are still many systems that are based on EC2 instances.   These EC2 instances can often be large cost drivers and so there is a whole industry around ways to reduce that spend.   In this article we will explore the relationship between Reserved Instances, Capacity Reservations and Savings Plan.&lt;/p&gt;

&lt;p&gt;For the purposes of this article, we are assuming that the compute workload is not compatible with Spot Instances or Spot Fleets.  Those two approaches are generally the most cost effective but require that the application can handle interruptions.  Many applications can do so but many others cannot.&lt;/p&gt;

&lt;p&gt;When dealing with cost optimizations and capacity issues we also tend to look at the larger instance sizes.  These instance types are the most expensive and can often be in short supply.  If you’ve ever gotten the “Insufficient Capacity” error you know what I mean.&lt;/p&gt;

&lt;p&gt;So, while you certainly can use on-demand instances, perhaps within an Auto Scaling group, that is not always the most cost-effective or resilient strategy.&lt;/p&gt;

&lt;p&gt;There are really two variables to be considered here: cost and availability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reserved Instances&lt;/strong&gt;&lt;br&gt;
The cost factor can be addressed with various forms of Reserved Instances as well as via Savings Plans.  Interestingly enough, if you go to the Reserved Instance page you now see a message saying that AWS recommends Savings Plans over RIs.&lt;/p&gt;

&lt;p&gt;When you purchase a Reserved Instance you are paying for a certain amount of EC2 usage … which is distinct from actually running any EC2s.   This is easier to explain by example.&lt;/p&gt;

&lt;p&gt;Suppose you run three m5.2xlarge instances 8 hours each day.  Normally you would be paying for 8 hours * 7 days * 3 instances each week.  If you had a single m5.2xlarge Reserved Instance, then whenever such instances were running the first instance’s cost would be covered by the RI.  So, you would be paying for two instances via on-demand and one instance via the RI.  Basically, an RI is an agreement to purchase a certain amount of instance usage.&lt;/p&gt;
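&lt;p&gt;The arithmetic in that example can be sketched as follows (the exact hourly rate doesn't matter for the hour accounting; the point is which instance-hours the RI covers):&lt;/p&gt;

```python
# Weekly instance-hours for three m5.2xlarge instances run 8 hours/day,
# with and without a single m5.2xlarge Reserved Instance.
hours_per_week = 8 * 7     # each instance runs 8 hours a day, 7 days a week
instances = 3

total_hours = hours_per_week * instances          # 168 instance-hours consumed
ri_covered_hours = hours_per_week                 # the RI covers one running
                                                  # instance whenever it is up
on_demand_hours = total_hours - ri_covered_hours  # the remaining hours are
                                                  # billed at the on-demand rate
print(total_hours, on_demand_hours)  # → 168 112
```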

&lt;p&gt;RI payments can be made via upfront payments and/or via hourly rates.  The lowest cost is achieved by agreeing to a three-year term with all of the cost paid upfront.  At the other end of the pricing spectrum, you could purchase a one-year term with zero upfront cost but a higher hourly rate.&lt;/p&gt;

&lt;p&gt;Keep in mind that this is all just about cost, not about any actual availability of an instance.  You might create an RI for a g5.32xlarge but find that there was “Insufficient Capacity” to actually create such an instance when needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Capacity Reservations – aka On Demand Capacity Reservations – aka ODCR&lt;/strong&gt;&lt;br&gt;
Capacity Reservations are an agreement with AWS that the specified instance(s) will be available when needed.  If you create a Capacity Reservation for 2 g4dn.16xlarge instances, those instances are taken out of the free pool and marked as “in use”.  Since no one else can use them, you pay for 24x7 usage even if you are not actually running the instances all the time.  While paying for something you’re not using is at some level counter to the least-cost principle of AWS, it does guarantee that when you do need the instance it will be available for you.&lt;/p&gt;

&lt;p&gt;Another very important thing to know about ODCRs is that you can cancel them at any time.  This is in sharp contrast with Reserved Instances which are locked to a one or three year term.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scheduled Instances&lt;/strong&gt;&lt;br&gt;
Scheduled Instances allow you to purchase capacity by the hour on a recurring schedule.  A use case might be a weekly batch job with large resource requirements that runs only at a certain day and time each week.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Standard vs Convertible Reserved Instances&lt;/strong&gt;&lt;br&gt;
With a Standard RI you can change the Availability Zone and instance size, but you cannot change the instance family.  So, you could go from a g5.xlarge to a g5.2xlarge but not to a g4dn.xlarge.  These reservations can also be bought and sold in the Reserved Instance Marketplace.&lt;/p&gt;

&lt;p&gt;Convertible RIs, on the other hand, let you change the instance family, but they cannot be sold in the Marketplace.  The flexibility to change instance family comes at a slight increase in cost (or decrease in savings).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Regional vs Zonal Reserved Instances&lt;/strong&gt;&lt;br&gt;
A Regional RI is independent of the Availability Zone.  Any matching instance run in the given Region will benefit from the reserved cost.  These are strictly cost-based reservations, as there is no capacity guarantee.  Regional RIs also allow for instance size substitution: if you reserve, say, the 2xlarge version of an instance type, the discount will be applied if you run a 2xlarge but also (proportionally) if you run a 4xlarge.  Note that this size flexibility only applies to Amazon Linux instances and does not apply if you are running Windows instances.&lt;/p&gt;

&lt;p&gt;Zonal RIs require the specification of a particular Availability Zone, and the discount applies only to that specific instance type and size in that AZ.  In exchange, and this is critical, Zonal RIs do guarantee capacity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ODCR vs Zonal RI&lt;/strong&gt;&lt;br&gt;
So, the bottom line is: what is the difference between an ODCR and a Zonal Reserved Instance?&lt;br&gt;
Let’s walk through an example of using each approach to assure the availability of a g5.16xlarge.&lt;/p&gt;

&lt;p&gt;If we created a one-year Standard, Windows, Zonal RI with all-upfront payment for a g5.16xlarge instance, the cost would be $46,887 for the year, an effective hourly rate of $5.352.&lt;/p&gt;

&lt;p&gt;If we created a Capacity Reservation for a Windows g5.16xlarge, the cost would be $42,801 for the year, an hourly rate of $4.886.&lt;/p&gt;
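&lt;p&gt;Both effective hourly rates follow directly from the annual figures (8,760 hours in a year):&lt;/p&gt;

```python
# Effective hourly rate = annual cost / hours in a year (365 * 24 = 8,760).
HOURS_PER_YEAR = 365 * 24

zonal_ri_annual = 46_887  # one-year, all-upfront Zonal RI (figure quoted above)
odcr_annual = 42_801      # one-year Capacity Reservation (figure quoted above)

print(round(zonal_ri_annual / HOURS_PER_YEAR, 3))  # → 5.352
print(round(odcr_annual / HOURS_PER_YEAR, 3))      # → 4.886
```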

&lt;p&gt;Savings Plans apply to the Capacity Reservation approach as well, so the actual billed cost could be lower still.  In this example the ODCR is the better approach.  All of this illustrates that finding the best pricing model for your EC2 needs is still a complex process, but hopefully we’ve shown you how to approach solving the equation.&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
