<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Mihai Chiorean</title>
    <description>The latest articles on Forem by Mihai Chiorean (@mihaichiorean).</description>
    <link>https://forem.com/mihaichiorean</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F417576%2F6b412862-3615-4764-b9f0-f0888e755342.jpg</url>
      <title>Forem: Mihai Chiorean</title>
      <link>https://forem.com/mihaichiorean</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/mihaichiorean"/>
    <language>en</language>
    <item>
      <title>From Mobile Simplicity to Pi Complexity</title>
      <dc:creator>Mihai Chiorean</dc:creator>
      <pubDate>Tue, 15 Oct 2024 18:24:39 +0000</pubDate>
      <link>https://forem.com/mihaichiorean/from-mobile-simplicity-to-pi-complexity-2fa2</link>
      <guid>https://forem.com/mihaichiorean/from-mobile-simplicity-to-pi-complexity-2fa2</guid>
      <description>&lt;p&gt;As a backend developer with a solid grounding in Linux and a fair share of experience in mobile development for both Android and iOS, I recently decided to venture into the world of Raspberry Pi 5. I naively assumed that setting up a Raspberry Pi would be as straightforward as plugging in an Android device or an iPhone and hitting the run button in my IDE. Spoiler alert: it wasn’t.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Illusion of Plug and Play
&lt;/h2&gt;

&lt;p&gt;In mobile development, we’re spoiled by the simplicity of connecting a device via USB and seamlessly deploying our code. The development environments are mature, and the tooling is designed to minimize friction. With Raspberry Pi, I quickly realized that this convenience doesn’t translate.&lt;/p&gt;

&lt;h2&gt;
  
  
  The OS Conundrum
&lt;/h2&gt;

&lt;p&gt;The first hurdle was choosing an operating system and flashing it onto an SD card. Unlike mobile devices that come with a pre-installed OS, the Raspberry Pi requires you to select and install one yourself. Options like Raspberry Pi OS, Ubuntu, and others each come with their own quirks and compatibility considerations. This step isn’t just a minor setup task: the choice of OS affects performance as well as the tools and libraries available to you, so it shapes the entire development experience.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Peripheral Predicament
&lt;/h2&gt;

&lt;p&gt;Most online resources assume you’re plugging peripherals like a mouse, keyboard, and monitor directly into the Raspberry Pi and developing on the device itself. However, this approach can be impractical for developers who don’t have these peripherals readily available. Instead, we often have powerful laptops with our development environments already set up, complete with local dotfiles, custom IDE configurations, and personalized editor settings. Recreating all of this on a single-board computer feels counterintuitive when we could leverage our existing setups.&lt;/p&gt;

&lt;h2&gt;
  
  
  The USB-C Connectivity Challenge
&lt;/h2&gt;

&lt;p&gt;Attempting to connect the Raspberry Pi 5 to my MacBook Pro via USB-C didn’t yield the immediate connection I expected. There was no automatic recognition, no prompt to trust the device, nothing. It turns out that setting up USB networking with the Raspberry Pi requires additional configuration. I stumbled upon &lt;a href="https://blog.hardill.me.uk/2023/01/07/raspberry-pi-usb-gadget-creator-update/" rel="noopener noreferrer"&gt;this guide&lt;/a&gt;, which, while helpful, was far from trivial. It involved modifying configuration files and setting up network interfaces manually.&lt;/p&gt;

&lt;h2&gt;
  
  
  SSH: A Necessary Evil?
&lt;/h2&gt;

&lt;p&gt;Once the OS is installed, connecting to the Raspberry Pi becomes the next challenge. Without a mouse, keyboard, and monitor at hand, SSH becomes the default means of interaction. However, not everyone has extensive experience with SSH, and setting it up can introduce additional complexities.&lt;/p&gt;

&lt;p&gt;Relying on SSH adds another layer of detachment from the development process. Instead of the immediate feedback loop we’re accustomed to, every code change requires building and deploying over an SSH connection — often without the safety net of an interactive debugger. This can significantly disrupt the development workflow, especially when compared to the seamless experience of mobile development environments.&lt;/p&gt;

&lt;h2&gt;
  
  
  The IDE and Development Stack Dilemma
&lt;/h2&gt;

&lt;p&gt;In mobile development, IDEs like Xcode and Android Studio provide a rich set of tools, including debugging, breakpoints, and seamless deployment. With the Raspberry Pi, there’s no out-of-the-box IDE that offers this level of integration. You’re faced with choosing a development stack that may or may not support all the features you need. Do you develop directly on the Pi, or do you set up a cross-compilation environment? Each option has its own set of complexities.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Debugging Desert
&lt;/h2&gt;

&lt;p&gt;Speaking of debugging, the lack of straightforward debugging tools was perhaps the most jarring difference. Setting breakpoints and inspecting variables in real-time is a staple of my usual workflow. With the Raspberry Pi, setting up such an environment requires significant effort, and in some cases, might not even be feasible depending on the chosen development stack.&lt;/p&gt;

&lt;h2&gt;
  
  
  Breaking Through: Achieving a Streamlined Setup
&lt;/h2&gt;

&lt;p&gt;Despite the initial challenges, I eventually succeeded in getting the Raspberry Pi recognized by macOS. After configuring USB networking on the Pi, my MacBook Pro could finally communicate with it directly. This breakthrough was significant: it meant I could leverage my existing development environment without the need for additional peripherals.&lt;/p&gt;

&lt;p&gt;But I didn’t stop there. I went the extra mile and created a custom OS image with all my configurations, dependencies, and settings pre-installed. This image could be flashed onto another Raspberry Pi and would work out of the box. The ability to replicate the setup effortlessly on multiple devices not only saved time but also opened up possibilities for scaling projects or collaborating with others who could benefit from a ready-to-use environment.&lt;/p&gt;

&lt;p&gt;To enhance the development experience further, I’m currently working on integrating Visual Studio Code into my workflow. By utilizing VSCode’s remote development capabilities — or possibly developing a custom extension — I aim to replicate the rich IDE features I’m accustomed to. This setup would allow me to edit, build, and debug code on the Raspberry Pi directly from VSCode on my MacBook Pro. It brings back the convenience of a seamless development environment, bridging the gap between the powerful tools on my laptop and the Raspberry Pi’s hardware capabilities.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;While the journey to get a simple ‘Hello World’ running on the Raspberry Pi 5 was unexpectedly convoluted, breaking through those barriers proved incredibly rewarding. By configuring the device to work seamlessly with my MacBook Pro and leveraging tools like VSCode, I’ve started to bridge the gap between powerful development environments and the Raspberry Pi’s unique capabilities. Yes, the initial setup is complex and can be a barrier to entry, but with persistence and the right resources, it’s a challenge that can be overcome. For developers willing to venture into this space, the possibilities are vast and well worth the effort.&lt;/p&gt;

&lt;h2&gt;
  
  
  Invitation to Connect
&lt;/h2&gt;

&lt;p&gt;If you’re interested in trying out this setup or if you have your own experiences with Raspberry Pi development, I’d love to hear from you! Please feel free to reach out through the comments section below or connect with me on social media. Let’s collaborate and share our insights to make the Raspberry Pi development experience better for everyone.&lt;/p&gt;

&lt;h3&gt;
  
  
  References:
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://blog.hardill.me.uk/2023/01/07/raspberry-pi-usb-gadget-creator-update/" rel="noopener noreferrer"&gt;https://blog.hardill.me.uk/2023/01/07/raspberry-pi-usb-gadget-creator-update/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is cross-posted from &lt;a href="https://medium.com/@mihaichiorean/from-mobile-simplicity-to-pi-complexity-3d18d8916fcc" rel="noopener noreferrer"&gt;https://medium.com/@mihaichiorean/from-mobile-simplicity-to-pi-complexity-3d18d8916fcc&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>raspberrypi</category>
      <category>robotics</category>
      <category>iot</category>
    </item>
    <item>
      <title>Querying CSV files in AWS S3 from Go, using SQL</title>
      <dc:creator>Mihai Chiorean</dc:creator>
      <pubDate>Fri, 10 Jul 2020 16:45:06 +0000</pubDate>
      <link>https://forem.com/mihaichiorean/querying-csv-files-in-aws-s3-from-go-56kf</link>
      <guid>https://forem.com/mihaichiorean/querying-csv-files-in-aws-s3-from-go-56kf</guid>
      <description>&lt;h2&gt;
  
  
  tl;dr
&lt;/h2&gt;

&lt;p&gt;S3 Select lets you query the data in a CSV or JSON file stored in S3 using SQL.&lt;/p&gt;

&lt;p&gt;I'm going to present an example of how I did this in Go, since I had a hard time finding clear examples while trying to work on it.&lt;/p&gt;

&lt;h4&gt;
  
  
  Update: now with command line tool
&lt;/h4&gt;

&lt;h2&gt;
  
  
  Benefits
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;only the selected records need to come over the wire, not the whole file&lt;/li&gt;
&lt;li&gt;no need to keep some datastore in sync with the latest csv to query the information&lt;/li&gt;
&lt;li&gt;reduction in code complexity on the consumer end&lt;/li&gt;
&lt;li&gt;can arguably be done synchronously, just like calling any other external API&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  S3 Select in Golang
&lt;/h2&gt;

&lt;p&gt;For this example I used a simple CSV I found online.&lt;br&gt;
I've structured it how I would in an application: a package that wraps the s3 specific code and provides some interface.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.
├── FL_insurance_sample.csv
├── README.md
├── client
│   └── csvdb
│       ├── client.go
│       └── client_test.go
├── config.yaml
├── go.mod
├── go.sum
├── main.go
└── vendor
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;First, the imports:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="o"&gt;....&lt;/span&gt;

    &lt;span class="s"&gt;"github.com/aws/aws-sdk-go/aws"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/aws/aws-sdk-go/service/s3"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/aws/aws-sdk-go/service/s3/s3iface"&lt;/span&gt;
    &lt;span class="o"&gt;....&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You'll notice this uses the first version of the Go AWS SDK. &lt;a href="https://github.com/aws/aws-sdk-go-v2"&gt;There is a v2&lt;/a&gt;; however, at the time of this writing it was still at v0.x.x.&lt;/p&gt;

&lt;p&gt;Next, I put the client struct together. It's fairly basic - wraps the S3 API interface and the bucket + key of the resource it will access.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Client represents the struct used to make queries to an s3 csv&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Client&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;s3iface&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;S3API&lt;/span&gt;
    &lt;span class="c"&gt;// S3 path - prefix + file/resource name&lt;/span&gt;
    &lt;span class="n"&gt;key&lt;/span&gt;    &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;bucket&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// NewClient instantiates a new client struct&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;NewClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="n"&gt;s3iface&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;S3API&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;S3API&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;  &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;    &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now let's move on to the interesting bits. First, we need to create the input for the S3 query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SelectObjectContentInput&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Bucket&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;         &lt;span class="n"&gt;aws&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;Key&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;            &lt;span class="n"&gt;aws&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;Expression&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;     &lt;span class="n"&gt;aws&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;ExpressionType&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;aws&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"SQL"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;InputSerialization&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;InputSerialization&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;CSV&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CSVInput&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c"&gt;// query using header names. This is a choice for this example&lt;/span&gt;
            &lt;span class="c"&gt;// many csv files do not have a header row; In that case,&lt;/span&gt;
            &lt;span class="c"&gt;// this property would not be needed and the "filters" would be&lt;/span&gt;
            &lt;span class="c"&gt;// on the column index (e.g. _1, _2, _3...)&lt;/span&gt;
            &lt;span class="n"&gt;FileHeaderInfo&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;aws&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;String&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Use"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few things here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Expression&lt;/code&gt; is the actual SQL query, which looks something like &lt;code&gt;SELECT * FROM s3object s WHERE s.column = 'value'&lt;/code&gt; (&lt;a href="https://docs.aws.amazon.com/AmazonS3/latest/dev/s3-glacier-select-sql-reference-select.html"&gt;read more here&lt;/a&gt;).
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;InputSerialization&lt;/code&gt; can be CSV or JSON. I'm focusing on CSV for this example.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;FileHeaderInfo&lt;/code&gt; should be set to "Use" if you plan on using the column names from the header row of your CSV. Otherwise you'd have to address each column by its index in your query: &lt;code&gt;s._1, s._2&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;the AWS Go SDK uses &lt;code&gt;*string&lt;/code&gt; a lot. They've made a convenience function to deal with that, hence the use of &lt;code&gt;aws.String(...)&lt;/code&gt;.
&lt;/li&gt;
&lt;/ul&gt;
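
&lt;p&gt;To make the difference between the two addressing styles concrete, here is a minimal, standalone sketch of building the expression either way. The helper names &lt;code&gt;columnRef&lt;/code&gt; and &lt;code&gt;equalsQuery&lt;/code&gt; are hypothetical and not part of the example repo, the column position is only illustrative, and doubling single quotes is standard SQL quoting that I'm assuming S3 Select follows.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "fmt"
    "strings"
)

// columnRef returns how a column is addressed in the query: by header
// name when FileHeaderInfo is "Use", otherwise by its 1-based position.
// (Hypothetical helper, for illustration only.)
func columnRef(useHeaders bool, name string, position int) string {
    if useHeaders {
        return "s." + name
    }
    return fmt.Sprintf("s._%d", position)
}

// equalsQuery builds a SELECT with a single equality filter.
// Single quotes in the value are doubled (assumed SQL-style quoting).
func equalsQuery(col, value string) string {
    quoted := strings.ReplaceAll(value, "'", "''")
    return fmt.Sprintf("SELECT * FROM s3object s WHERE %s = '%s'", col, quoted)
}

func main() {
    // The position 2 for "statecode" is an assumption for this sketch.
    fmt.Println(equalsQuery(columnRef(true, "statecode", 2), "FL"))
    fmt.Println(equalsQuery(columnRef(false, "statecode", 2), "FL"))
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With headers enabled the filter is written against &lt;code&gt;s.statecode&lt;/code&gt;; without them the same filter targets &lt;code&gt;s._2&lt;/code&gt;.&lt;/p&gt;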

&lt;p&gt;Next, let's set up the response serialization for this request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetOutputSerialization&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OutputSerialization&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;JSON&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;JSONOutput&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Similar to the input, both CSV and JSON can be used. I've used JSON here because it's easy to parse into a struct in Go. I've not done CSV parsing in Go, to be honest.&lt;/p&gt;

&lt;p&gt;Ready to call S3 now.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SelectObjectContentWithContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;stream&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetEventStream&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output of this call exposes a stream of events that we need to read to get the resulting data. This stream is implemented using a channel and sends some extra data apart from the actual rows themselves. The types of events exposed are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ContinuationEvent - sent periodically by S3 to keep the connection open&lt;/li&gt;
&lt;li&gt;EndEvent - indicates no more messages will be sent; request is completed&lt;/li&gt;
&lt;li&gt;ProgressEvent - data about the progress of an operation&lt;/li&gt;
&lt;li&gt;RecordsEvent - the actual rows&lt;/li&gt;
&lt;li&gt;StatsEvent - stats about the operation: total bytes, bytes processed etc&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/sdk-for-go/api/service/s3/#SelectObjectContentEventStreamEvent"&gt;Read more here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I'm only going to use the &lt;code&gt;RecordsEvent&lt;/code&gt; here, to keep this on point. However, you might want to look at the other events too (for example the &lt;code&gt;EndEvent&lt;/code&gt;) when building your application.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;rows&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Row&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Events&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// bind the concrete event type directly in the type switch&lt;/span&gt;
    &lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="n"&gt;rec&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;s3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RecordsEvent&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="n"&gt;Row&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Unmarshal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rec&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Wrapf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"unable to parse json: %s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rec&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Payload&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;rows&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
        &lt;span class="c"&gt;// the other event types are ignored in this example&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c"&gt;// the loop ends when the channel is closed, so check for a stream error afterwards&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As events are read from the channel, switch on the underlying event type. We're only looking at records here, so there is just one case, in which the payload of the &lt;code&gt;RecordsEvent&lt;/code&gt; is parsed into a &lt;code&gt;Row&lt;/code&gt;. Once the channel is closed and the loop ends, check &lt;code&gt;stream.Err()&lt;/code&gt; for any error that interrupted the stream and return it.&lt;/p&gt;
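
&lt;p&gt;One caveat with unmarshalling each payload on its own: as far as I understand the S3 Select documentation, a single Records message can carry several records, or only part of one. A more defensive variant, sketched here with only the standard library, concatenates the payloads and decodes one JSON document at a time. The &lt;code&gt;decodeRows&lt;/code&gt; helper and the sample payloads are made up for illustration and are not taken from the example repo.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "bytes"
    "encoding/json"
    "fmt"
)

// Row mirrors the struct used in this post.
type Row struct {
    ID        string `json:"policyID"`
    StateCode string `json:"statecode"`
    County    string `json:"county"`
}

// decodeRows reads every JSON document found in buf, one Row per document.
// (Hypothetical helper, for illustration only.)
func decodeRows(buf *bytes.Buffer) ([]*Row, error) {
    rows := []*Row{}
    dec := json.NewDecoder(buf)
    for dec.More() {
        row := new(Row)
        if err := dec.Decode(row); err != nil {
            return nil, fmt.Errorf("unable to parse json: %w", err)
        }
        rows = append(rows, row)
    }
    return rows, nil
}

func main() {
    // Made-up payloads: the second record is split across two of them.
    payloads := [][]byte{
        []byte(`{"policyID":"1","statecode":"FL","county":"CLAY COUNTY"}` + "\n" + `{"policyID":"2",`),
        []byte(`"statecode":"FL","county":"SUWANNEE COUNTY"}` + "\n"),
    }

    // In the real loop this would be buf.Write(rec.Payload) per RecordsEvent.
    buf := new(bytes.Buffer)
    for _, p := range payloads {
        buf.Write(p)
    }

    rows, err := decodeRows(buf)
    if err != nil {
        panic(err)
    }
    for _, r := range rows {
        fmt.Println(r.ID, r.StateCode, r.County)
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The trade-off is that the whole result is buffered in memory before it is parsed, which is fine for small result sets but may not suit large ones.&lt;/p&gt;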

&lt;p&gt;Here's what the &lt;code&gt;Row&lt;/code&gt; type looks like - a subset of fields (columns) from the CSV with the column name as their json tag.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Row&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ID&lt;/span&gt;        &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="s"&gt;`json:"policyID"`&lt;/span&gt;
    &lt;span class="n"&gt;StateCode&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="s"&gt;`json:"statecode"`&lt;/span&gt;
    &lt;span class="n"&gt;County&lt;/span&gt;    &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="s"&gt;`json:"county"`&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Testing considerations
&lt;/h2&gt;

&lt;p&gt;When unit testing this example, I ran into some issues. &lt;br&gt;
My intuition told me I should create a mock of &lt;code&gt;s3iface.S3API&lt;/code&gt; and mock the response from the call to &lt;code&gt;SelectObjectContentWithContext&lt;/code&gt;. However this didn't work as expected. &lt;/p&gt;

&lt;p&gt;The &lt;code&gt;s3.SelectObjectContentOutput&lt;/code&gt; contains a &lt;code&gt;SelectObjectContentEventStream&lt;/code&gt; which does not have a factory method that I could find. It has some private fields. Instantiating it when mocking proved to be problematic because those private fields are &lt;code&gt;nil&lt;/code&gt; and are used in the calls &lt;code&gt;.Err()&lt;/code&gt; and &lt;code&gt;.Close()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The route I took was to use &lt;code&gt;net/http/httptest&lt;/code&gt; to "mock" the S3 backend instead. I created an HTTP test server with a generic handler and pointed the AWS/S3 session at it, so that its requests go to the test server.&lt;/p&gt;
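
&lt;p&gt;A minimal sketch of that idea, using only the standard library, might look like the following. The names &lt;code&gt;newFakeS3&lt;/code&gt; and &lt;code&gt;fetch&lt;/code&gt; are hypothetical, and so are the header and the URL path. In a real test the AWS session's endpoint would be pointed at &lt;code&gt;srv.URL&lt;/code&gt; instead of calling &lt;code&gt;http.Post&lt;/code&gt; directly, and the handler would have to answer in the event stream encoding that S3 Select uses, which this sketch does not attempt.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;package main

import (
    "fmt"
    "io"
    "net/http"
    "net/http/httptest"
)

// newFakeS3 starts a test server whose generic handler answers every
// request with the same canned body and echoes the requested path back
// in a header, so a test can check which resource was asked for.
func newFakeS3(body string) *httptest.Server {
    handler := func(w http.ResponseWriter, r *http.Request) {
        w.Header().Set("X-Seen-Path", r.URL.Path)
        io.WriteString(w, body)
    }
    return httptest.NewServer(http.HandlerFunc(handler))
}

// fetch stands in for the S3 client: it posts to the given URL and
// returns the path the server saw together with the response body.
func fetch(url string) (string, string, error) {
    resp, err := http.Post(url, "application/xml", nil)
    if err != nil {
        return "", "", err
    }
    defer resp.Body.Close()

    data, err := io.ReadAll(resp.Body)
    if err != nil {
        return "", "", err
    }
    return resp.Header.Get("X-Seen-Path"), string(data), nil
}

func main() {
    srv := newFakeS3("canned response")
    defer srv.Close()

    path, body, err := fetch(srv.URL + "/my-bucket/data.csv")
    if err != nil {
        panic(err)
    }
    fmt.Println(path, body)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because the server listens on a local port picked at startup, nothing in such a test ever reaches AWS.&lt;/p&gt;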

&lt;h2&gt;
  
  
  Update
&lt;/h2&gt;

&lt;p&gt;After finishing this post I realized that it might be useful to have an easy way to try queries on a specific CSV with this example. The repo now has a &lt;code&gt;main&lt;/code&gt; and you can build a command line tool to run some queries.&lt;/p&gt;

&lt;h3&gt;
  
  
  Install
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/mihai-chiorean/s3-select-example.git &amp;amp;&amp;amp; cd s3-select-example/cmd/s3ql

go install
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Usage
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;~/w/s3-select-example ❯❯❯ s3ql &lt;span class="nt"&gt;--help&lt;/span&gt;
A small cli to run/test sql queries against a CSV &lt;span class="k"&gt;in &lt;/span&gt;AWS S3

Usage:
  s3ql &lt;span class="o"&gt;[&lt;/span&gt;flags]

Flags:
  &lt;span class="nt"&gt;-b&lt;/span&gt;, &lt;span class="nt"&gt;--bucket&lt;/span&gt; string   S3 bucket where the data resides
  &lt;span class="nt"&gt;-h&lt;/span&gt;, &lt;span class="nt"&gt;--help&lt;/span&gt;            &lt;span class="nb"&gt;help &lt;/span&gt;&lt;span class="k"&gt;for &lt;/span&gt;s3ql
  &lt;span class="nt"&gt;-k&lt;/span&gt;, &lt;span class="nt"&gt;--key&lt;/span&gt; string      The S3 resource - prefix/key - of the CSV file
  &lt;span class="nt"&gt;-r&lt;/span&gt;, &lt;span class="nt"&gt;--region&lt;/span&gt; string   The AWS region where the bucket is
  &lt;span class="nt"&gt;-t&lt;/span&gt;, &lt;span class="nt"&gt;--use-headers&lt;/span&gt;     Tells S3 Select that the csv has a header row and to use it &lt;span class="k"&gt;in &lt;/span&gt;the query. &lt;span class="o"&gt;(&lt;/span&gt;default &lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;s3ql &lt;span class="nt"&gt;--bucket&lt;/span&gt; &amp;lt;bucket&amp;gt; &lt;span class="nt"&gt;--key&lt;/span&gt; FL_insurance_sample.csv &lt;span class="nt"&gt;--region&lt;/span&gt; us-east-2 &lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="se"&gt;\*&lt;/span&gt; from s3object s where s.statecode &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="se"&gt;\'&lt;/span&gt;FL&lt;span class="se"&gt;\'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This prints, line by line, the JSON output of each matched record in the CSV.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/mihai-chiorean/s3-select-example"&gt;Full code for the example&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://aws.amazon.com/blogs/aws/s3-glacier-select/"&gt;https://aws.amazon.com/blogs/aws/s3-glacier-select/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/AmazonS3/latest/dev/selecting-content-from-objects.html"&gt;https://docs.aws.amazon.com/AmazonS3/latest/dev/selecting-content-from-objects.html&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/aws/aws-sdk-go-v2"&gt;https://github.com/aws/aws-sdk-go-v2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/sdk-for-go/api/service/s3/#S3.SelectObjectContentWithContext"&gt;https://docs.aws.amazon.com/sdk-for-go/api/service/s3/#S3.SelectObjectContentWithContext&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/AmazonS3/latest/dev/s3-glacier-select-sql-reference-select.html"&gt;https://docs.aws.amazon.com/AmazonS3/latest/dev/s3-glacier-select-sql-reference-select.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>go</category>
      <category>sql</category>
    </item>
  </channel>
</rss>
