<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: John Carrot</title>
    <description>The latest articles on Forem by John Carrot (@johncarrot).</description>
    <link>https://forem.com/johncarrot</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F247049%2Fb170e70a-c333-4c73-8e54-7fa8e3a5d37e.png</url>
      <title>Forem: John Carrot</title>
      <link>https://forem.com/johncarrot</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/johncarrot"/>
    <language>en</language>
    <item>
      <title>Manipulating arrays with PostgreSQL</title>
      <dc:creator>John Carrot</dc:creator>
      <pubDate>Sat, 12 Oct 2019 08:44:18 +0000</pubDate>
      <link>https://forem.com/johncarrot/manipulating-arrays-with-postgres-ak1</link>
      <guid>https://forem.com/johncarrot/manipulating-arrays-with-postgres-ak1</guid>
      <description>&lt;p&gt;Before we move on to manipulating array values stored in the database, it is useful to first learn the basics.&lt;/p&gt;

&lt;h3&gt;
  
  
  Representing a list
&lt;/h3&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT '{a,b,c}'::text[];

   text 
----------
 {a,b,c}
(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;We have to specify the type. Else postgres would not know the type since it looks like a string.&lt;/p&gt;

&lt;p&gt;Or we could also use the &lt;code&gt;ARRAY[]&lt;/code&gt; style&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# SELECT ARRAY['a','b','c'];

 array 
-------
 {a,b,c}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;code&gt;SELECT&lt;/code&gt;ing &lt;code&gt;FROM&lt;/code&gt; an array
&lt;/h3&gt;

&lt;p&gt;So why doesn’t the following work?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# SELECT * FROM ARRAY['a','b','c']

ERROR:  42601: syntax error at or near "ARRAY"
LINE 1: SELECT * FROM ARRAY['a','b','c'];
                      ^
LOCATION:  scanner_yyerror, scan.l:1134
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;code&gt;SELECT FROM&lt;/code&gt; expects that the data source is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a set of rows&lt;/li&gt;
&lt;li&gt;OR points to a set of word (table)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;ARRAY[]&lt;/code&gt; returns an atomic value. Which is why it works with just &lt;code&gt;SELECT&lt;/code&gt; but fails when &lt;code&gt;FROM&lt;/code&gt; is used.&lt;/p&gt;

&lt;h3&gt;
  
  
  Convert arrays into rows using &lt;code&gt;UNNEST&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# SELECT UNNEST(ARRAY['a','b']);

 unnest 
--------
 a
 b
(2 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  Convert rows into arrays using &lt;code&gt;ARRAY()&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Watch out. We are not using &lt;code&gt;ARRAY[]&lt;/code&gt;. We are using &lt;code&gt;array()&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT array(
  SELECT UNNEST(ARRAY['a','b'])
);

 array 
-------
 {a,b}
(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;To summarize:&lt;/p&gt;

&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;UNNEST&lt;/code&gt; converts array into a series of rows.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;array()&lt;/code&gt; converts a bunch of rows into an array.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ARRAY[]&lt;/code&gt; used to represent/define an array.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Concatenate arrays using &lt;code&gt;array_cat()&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# SELECT array_cat('{a,b}'::text[], '{b,c}'::text[]);

 array_cat 
-----------
 {a,b,b,c}
(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;We coerced the argument into text arrays to indicate to &lt;code&gt;array_cat&lt;/code&gt; what data types it was dealing with. If we used the &lt;code&gt;ARRAY[]&lt;/code&gt; initializer, this would not be required. Example below.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT array_cat(ARRAY['a','b'], ARRAY['b', 'c']);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Notice that the final array contains &lt;code&gt;b&lt;/code&gt; twice. We would love to remove duplicates. There are two ways to do it and we'll explore both below.&lt;/p&gt;

&lt;h3&gt;
  
  
  Removing duplicates from an array using &lt;code&gt;DISTINCT&lt;/code&gt;
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;This method works with integers, text, float, etc&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;code&gt;DISTINCT&lt;/code&gt; helps remove duplicates from all data types. But it requires rows as input. Not arrays.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;So first convert an array to rows using &lt;code&gt;UNNEST&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Pass it to DISTINCT&lt;/li&gt;
&lt;li&gt;And then if you still require an array as output, use &lt;code&gt;array()&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT DISTINCT UNNEST(
  '{apple, banana, cat, apple}'::text[]
);

 unnest 
--------
 banana
 cat
 apple
(3 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;





&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT DISTINCT UNNEST(
  '{1.245, 1.3, 1.5, 1.2}'::float[]
) ORDER BY 1;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;And we have columns. But we need an array back as output. So convert using &lt;code&gt;array()&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT array(
  SELECT DISTINCT UNNEST(
    '{apple, banana, cat, apple}'::text[]
  )
);

       array        
--------------------
 {banana,cat,apple}
(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What if we want it sorted? Use &lt;code&gt;ORDER BY 1&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT array(
  SELECT DISTINCT UNNEST(
    '{apple, banana, cat, apple}'::text[]
  ) ORDER BY 1
);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;ORDER BY 1&lt;/code&gt; would sort by the first column (We could starting with &lt;code&gt;1&lt;/code&gt; here). In this case we are working with on-the-fly arrays, without giving them column names. So we use column number references.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Want to try this with a float array? Let's do it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT array(
  SELECT DISTINCT UNNEST(
    '{1.2, 1.245, 2.3, 1.245, 1.246}'::text[]
  )
);

         array         
-----------------------
 {1.2,1.246,2.3,1.245}
(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  Removing duplicates from integer arrays using &lt;code&gt;sort&lt;/code&gt; and &lt;code&gt;uniq&lt;/code&gt;
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Caveat:&lt;/strong&gt; This method only works for integer arrays (yeah so no floats too).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;code&gt;sort&lt;/code&gt; and &lt;code&gt;uniq&lt;/code&gt; are functions available in the &lt;code&gt;intarray&lt;/code&gt; extension. Enable the extension in your database by running the following statement.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE EXTENSION IF NOT EXISTS intarray;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Try using the &lt;code&gt;uniq&lt;/code&gt; function.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT uniq(
  '{1,2,1,3}'::int[]
);

   uniq    
-----------
 {1,2,1,3}
(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Oops. Does not work. &lt;code&gt;uniq&lt;/code&gt;  only works on a sorted list. So use the &lt;code&gt;sort&lt;/code&gt; function to sort it first.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT uniq(
  sort(
    '{1,2,1,3}'::int[]
  )
);

  uniq   
---------
 {1,2,3}
(1 row)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;I haven't yet benchmarked in detail which method works the fastest for sorting integers. I used both methods as a part of another benchmarking to update 100k rows (coming soon). For now, all I can say is, both aren't slow. I'll try a more detailed benchmarking another day.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  What else?
&lt;/h3&gt;

&lt;p&gt;These are essentials that I find useful to know. There are a lot more array functions that Postgres supports. Go &lt;a href="https://www.postgresql.org/docs/12/functions-array.html"&gt;check it out&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I'll be sharing more from my notes on PostgreSQL. &lt;a href="https://deploymyapp.substack.com/subscribe"&gt;Subscribe to get updates&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>data</category>
      <category>sql</category>
      <category>database</category>
    </item>
    <item>
      <title>Generate dummy data with Postgres</title>
      <dc:creator>John Carrot</dc:creator>
      <pubDate>Sat, 12 Oct 2019 07:21:21 +0000</pubDate>
      <link>https://forem.com/johncarrot/generate-dummy-data-with-postgres-2cp5</link>
      <guid>https://forem.com/johncarrot/generate-dummy-data-with-postgres-2cp5</guid>
      <description>&lt;h3&gt;
  
  
  Generate 5 records
&lt;/h3&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# SELECT generate_series(1, 5);

 generate_series 
-----------------
               1
               2
               3
               4
               5
(5 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  Generate in increments of 5
&lt;/h3&gt;

&lt;p&gt;Start with 1, and generates a series by adding 5 to the previous number.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# SELECT * FROM generate_series(1,25,5)

 generate_series 
-----------------
               1
               6
              11
              16
              21
(5 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  Generate a series with date
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;RedShift, as of OCT-2019 is based on Postgres 8.0. Passing timestamps as intervals is not supported with Redshift. So datetime ranges will not work on Redshift.&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT generate_series(
  '2019-10-1',
  '2019-10-05',
  '1 day'::interval
);

      generate_series      
---------------------------
 2019-10-01 00:00:00+05:30
 2019-10-02 00:00:00+05:30
 2019-10-03 00:00:00+05:30
 2019-10-04 00:00:00+05:30
 2019-10-05 00:00:00+05:30
(5 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The third argument for step interval has to be passed, without which Postgres would not know how much to increment the step for date ranges. We got a list of timestamps. Let us coerce that into a date type instead.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT generate_series(
  '2019-10-1',
  '2019-10-05',
  '1 day'::interval
)::date;

 generate_series 
-----------------
 2019-10-01
 2019-10-02
 2019-10-03
 2019-10-04
 2019-10-05
(5 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;We could also do weekly intervals instead of days.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT generate_series(
  '2019-12-1',
  '2019-12-31',
  '1 week'::interval
);

      generate_series      
---------------------------
 2019-12-01 00:00:00+05:30
 2019-12-08 00:00:00+05:30
 2019-12-15 00:00:00+05:30
 2019-12-22 00:00:00+05:30
 2019-12-29 00:00:00+05:30
(5 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Try playing with other intervals - 5 weeks, 10 days, 3 months, etc.&lt;/p&gt;

&lt;h3&gt;
  
  
  Want more columns?
&lt;/h3&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# SELECT generate_series(1, 5), 'hello', '123';

 generate_series | ?column? | ?column? 
-----------------+----------+----------
               1 | hello    | 123
               2 | hello    | 123
               3 | hello    | 123
               4 | hello    | 123
               5 | hello    | 123
(5 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  Do more?
&lt;/h3&gt;



&lt;div class="highlight"&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT
  number,
  concat('hello', number)
FROM
  generate_series(1, 5) AS number


 number | concat 
--------+--------
      1 | hello1
      2 | hello2
      3 | hello3
      4 | hello4
      5 | hello5
(5 rows)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



</description>
      <category>postgres</category>
      <category>database</category>
      <category>sql</category>
      <category>data</category>
    </item>
  </channel>
</rss>
