<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Sandeep Kanabar</title>
    <description>The latest articles on Forem by Sandeep Kanabar (@sandeepkanabar).</description>
    <link>https://forem.com/sandeepkanabar</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F511534%2F008d1f63-21fc-4a9e-aa83-e59cb9a0b01d.png</url>
      <title>Forem: Sandeep Kanabar</title>
      <link>https://forem.com/sandeepkanabar</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/sandeepkanabar"/>
    <language>en</language>
    <item>
      <title>Shard your open-search indices like a pro!</title>
      <dc:creator>Sandeep Kanabar</dc:creator>
      <pubDate>Sun, 31 Mar 2024 18:53:28 +0000</pubDate>
      <link>https://forem.com/aws-builders/shard-your-open-search-indices-like-a-pro-khp</link>
      <guid>https://forem.com/aws-builders/shard-your-open-search-indices-like-a-pro-khp</guid>
      <description>&lt;p&gt;Ever struggled with the growing size of your OpenSearch cluster indices and wondered how you could efficiently manage them?&lt;/p&gt;

&lt;p&gt;You can definitely leverage Index State Management (ISM) &lt;a href="https://opensearch.org/docs/latest/im-plugin/ism/index/"&gt;policies&lt;/a&gt;, but knowing the intricacies of sharding goes a long way in helping you scale your cluster efficiently while keeping its performance optimal and, most importantly, keeping a check on cost ($$). &lt;/p&gt;

&lt;p&gt;One might wonder why the need to learn sharding strategies in a managed service. Isn't the service supposed to handle it for you? Well, yes and NO. One analogy that I can give is of cars. There are cars with automatic gears and cars with manual gears. A lot of people these days prefer automatic cars. They run great on highways but what if you need to navigate the car through crowded traffic streets with curvy turns? The automatic car "would" work but it would be far from efficient, performant and scalable. The same goes for managed services.&lt;/p&gt;

&lt;p&gt;So how do you shard your indices efficiently? Let's begin with the basics.&lt;/p&gt;

&lt;p&gt;First determine whether your indices can be organised as time-based indices. If so, opting for day-wise, weekly, monthly or even yearly indices may make sense. Note that you can always mix and match these granularities. Say your current data flows into day-wise indices; you could then have an ISM policy such that data older than 90 days is &lt;a href="https://opensearch.org/docs/latest/api-reference/document-apis/reindex/"&gt;re-indexed&lt;/a&gt; into monthly indices. Or you can keep monthly indices for the current year and re-index past years' monthly indices into yearly indices. &lt;/p&gt;

&lt;p&gt;You can also leverage the open-source &lt;a href="https://github.com/elastic/curator"&gt;curator&lt;/a&gt; tool to manage all this using yaml scripts, as an alternative to ISM policies. &lt;/p&gt;

&lt;p&gt;So the question that arises is: why would you begin with day-wise indices if you want to merge (aka reindex) them later into monthly ones? Why begin with monthly indices if you want to reindex them into yearly ones?&lt;/p&gt;

&lt;p&gt;The answer to this lies in "performance" and "efficiency" and also balancing indexing (write performance) and querying (read performance).&lt;/p&gt;

&lt;p&gt;Your current indices are getting live data, so you want to maximise indexing (aka write) performance. To do that, you can have more shards. More shards mean more parallel writes, which speeds up indexing. &lt;/p&gt;

&lt;p&gt;But here's the catch: the more shards there are, the more time it takes to search across all of them! In other words, query performance suffers as the shard count grows. Thus, there's a trade-off between indexing and query performance, and the trick lies in striking a balance. &lt;/p&gt;

&lt;p&gt;So how do we strike that balance? One way is to keep the current day's index with a higher number of shards, while past indices, which have no data flowing in, can be "reindexed" to reduce the number of shards and then force-merged down to a single (1) segment. When you re-index, the index name will change, but aliasing simplifies this: simply flip your alias to point to the new index name. This strategy helps to boost search/query performance while keeping indexing performance great as well. Win-win!&lt;/p&gt;
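
&lt;p&gt;The alias flip described above can be sketched as a request body for the &lt;code&gt;POST _aliases&lt;/code&gt; API, which applies all of its actions in one atomic call. This is only an illustrative sketch: the alias name &lt;code&gt;logs-search&lt;/code&gt; and both index names are assumptions, not names from a real cluster.&lt;/p&gt;

```python
import json


def alias_flip_actions(alias, old_index, new_index):
    """Build a POST _aliases body that moves `alias` from old_index to new_index."""
    return {
        "actions": [
            # Both actions are applied together, so searches never see a gap.
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Hypothetical names, used purely for illustration.
body = alias_flip_actions("logs-search", "logs-2024.03.30", "logs-2024.03.30-reindexed")
print(json.dumps(body, indent=2))
```

&lt;p&gt;The printed JSON can be sent as the body of &lt;code&gt;POST _aliases&lt;/code&gt; once the reindex has completed; the force-merge step is a separate call to &lt;code&gt;POST your-index/_forcemerge?max_num_segments=1&lt;/code&gt;.&lt;/p&gt;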

&lt;p&gt;So how many shards should you begin with?&lt;/p&gt;

&lt;p&gt;Work out how much data flows in per day. Say 30 GB. Now, how many nodes do you have? Say 3 nodes. It is tempting to configure 3 primary shards, so that each node holds 1 shard, along with 1 replica. However, in this case, the primary shard size of 10 GB is way too small. Ideally, shards should be around 30-50 GB.&lt;/p&gt;

&lt;p&gt;Say daily data flow is 16 GB. In that case, just having 1 primary shard would do. Remember that too many small shards are very detrimental to performance, as there's context switching involved with the underlying Lucene indices.&lt;/p&gt;

&lt;p&gt;Let's say the indices are monthly indices and the average data flow per day is 30 GB. Thus per month it would average 30 GB * 30 days = 900 GB. Assuming 3 nodes, configuring this index with 12 shards would mean each primary shard size is 900/12 = 75 GB. The ideal size is 30-50 GB and this exceeds it by a huge margin. So let's look at having, say, 21 shards. In this case, each primary shard would be 900/21 = ~43 GB. That's acceptable. With 24 shards, 900/24 = 37.5 GB primary shard size is also a good option. Remember that overly large shards take a long time to move between nodes and slow down cluster recovery.&lt;/p&gt;
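
&lt;p&gt;The arithmetic above can be turned into a small helper. This is a rough sketch that assumes the 30-50 GB target from this post; the function names are made up for illustration, and preferring shard counts that divide evenly across the nodes is an extra assumption, not a rule.&lt;/p&gt;

```python
import math


def shard_count_range(index_size_gb, min_shard_gb=30, max_shard_gb=50):
    """Return (fewest, most) primary shards that keep shards near the target size."""
    fewest = max(1, math.ceil(index_size_gb / max_shard_gb))
    # Small indices cannot reach the minimum size, so fall back to `fewest`.
    most = max(fewest, math.floor(index_size_gb / min_shard_gb))
    return fewest, most


def candidates(index_size_gb, nodes):
    """Shard counts in range that spread evenly over `nodes`, with the shard size in GB."""
    fewest, most = shard_count_range(index_size_gb)
    return [
        (n, round(index_size_gb / n, 1))
        for n in range(fewest, most + 1)
        if n % nodes == 0
    ]


print(shard_count_range(16))   # daily index of 16 GB
print(shard_count_range(900))  # monthly index of 900 GB
print(candidates(900, 3))      # options that balance over 3 nodes
```

&lt;p&gt;For the 900 GB monthly index this gives a range of 18 to 30 primary shards, which includes the 21 and 24 shard options discussed above, while the 16 GB daily index comes out at a single shard.&lt;/p&gt;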

&lt;p&gt;This also helps you to plan your cluster capacity and scale it accordingly. Archiving old data into cold storage (an S3 bucket, Azure Blob Storage or a GCS bucket) is a good option. Another option is to snapshot old indices and then just delete them. The snapshots are stored in the same kind of object storage, which is cheap, and they can be easily restored on demand. &lt;/p&gt;

&lt;p&gt;Let's say you have an index for a customer master and the dataset is very small, like a master inventory of less than 10 MB. In that case, a single shard and a single replica would suffice. You can also opt for 2 replicas, but it all depends. If you take regular snapshots, then you can manage fine with 1 replica. More replicas mean more storage and more $$.&lt;/p&gt;

&lt;p&gt;Time-based indices have an advantage when it comes to snapshots. Say you have monthly indices and today is 31-March-2024 and you have snapshotted data pertaining to March-2022 into snap-my_monthly_index_2022_03. In that case, you can delete the index corresponding to March 2022 from your cluster if no searches are being performed against it and save storage costs. And in case it's later needed, you can easily restore the index quickly from snapshots. &lt;/p&gt;

&lt;p&gt;Hopefully, this beginner guide helps you in your sharding journey. Good luck.&lt;/p&gt;

</description>
      <category>sharding</category>
      <category>opensearch</category>
      <category>snapshots</category>
      <category>reindex</category>
    </item>
    <item>
      <title>Benefits of time-based indices on reindex, alias, ilm</title>
      <dc:creator>Sandeep Kanabar</dc:creator>
      <pubDate>Wed, 01 Feb 2023 11:58:31 +0000</pubDate>
      <link>https://forem.com/sandeepkanabar/benefits-of-time-based-indices-on-reindex-alias-ilm-3hd3</link>
      <guid>https://forem.com/sandeepkanabar/benefits-of-time-based-indices-on-reindex-alias-ilm-3hd3</guid>
      <description>&lt;p&gt;In addition to the benefits listed &lt;a href="https://dev.to/sandeepkanabar/sizing-shards-using-time-based-indices-1ebl"&gt;here&lt;/a&gt;, using time-based indices helps in the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Avoids having to reindex entire data&lt;/li&gt;
&lt;li&gt;Efficient Deletion and application of ILM&lt;/li&gt;
&lt;li&gt;Easy to include / exclude indices based on alias&lt;/li&gt;
&lt;/ol&gt;

&lt;h5&gt;
  
  
  1. Avoids having to &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html" rel="noopener noreferrer"&gt;reindex&lt;/a&gt; entire data &lt;a&gt;&lt;/a&gt;
&lt;/h5&gt;

&lt;p&gt;If the data influx increases, we could easily set &lt;code&gt;"number_of_shards": 3&lt;/code&gt; in the &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/index-templates.html" rel="noopener noreferrer"&gt;index template&lt;/a&gt; and this would take effect for &lt;code&gt;tomorrow's&lt;/code&gt; day-wise index. The number of shards can thus be changed easily, without the need to reindex any data.&lt;/p&gt;

&lt;h5&gt;
  
  
  2. Efficient Deletion and application of ILM &lt;a&gt;&lt;/a&gt;
&lt;/h5&gt;

&lt;p&gt;Let's say we need to retain data for up to 90 days. Thus, for a day-wise index which is &lt;code&gt;older than 90 days&lt;/code&gt;, that &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/size-your-shards.html#delete-indices-not-documents" rel="noopener noreferrer"&gt;entire index&lt;/a&gt; can be purged / deleted. This is far more efficient than purging older records from within indices, which leaves them pretty un-optimised from a search perspective.&lt;br&gt;
Also, &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/set-up-lifecycle-policy.html" rel="noopener noreferrer"&gt;index lifecycle management&lt;/a&gt; becomes simplified with time-based indices.&lt;/p&gt;

&lt;h5&gt;
  
  
  3. Easy to include / exclude indices based on alias &lt;a&gt;&lt;/a&gt;
&lt;/h5&gt;

&lt;p&gt;Let's assume the cluster needs to retain 90 days of data but needs to search only the &lt;code&gt;last 60 days&lt;/code&gt; of data. &lt;strong&gt;Alias&lt;/strong&gt; to the rescue. In this case, define an &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-aliases.html" rel="noopener noreferrer"&gt;alias&lt;/a&gt; in the index template so that it gets mapped to newly created day-wise indices. As soon as a past index becomes older than 60 days, the alias is removed from that index. This ensures that at any given point of time, the alias will point to a maximum of 60 day-wise indices.&lt;/p&gt;
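
&lt;p&gt;One way to picture the alias clean-up is a small script that builds the &lt;code&gt;remove&lt;/code&gt; actions for the &lt;code&gt;POST _aliases&lt;/code&gt; API. This is a sketch under stated assumptions: the index naming scheme &lt;code&gt;logs-YYYY.MM.DD&lt;/code&gt;, the alias name &lt;code&gt;logs-last-60d&lt;/code&gt; and the helper itself are hypothetical, and the list of index names is assumed to be fetched elsewhere.&lt;/p&gt;

```python
from datetime import date, datetime, timedelta


def stale_alias_actions(index_names, alias, today, keep_days=60, prefix="logs-"):
    """Build a POST _aliases body removing `alias` from indices outside the window."""
    # The window is today plus the previous keep_days - 1 days.
    keep = {today - timedelta(days=i) for i in range(keep_days)}
    actions = []
    for name in sorted(index_names):
        # Assumed naming scheme: prefix followed by YYYY.MM.DD.
        day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        if day not in keep:
            actions.append({"remove": {"index": name, "alias": alias}})
    return {"actions": actions}


names = ["logs-2024.03.31", "logs-2024.02.01", "logs-2024.01.31"]
body = stale_alias_actions(names, "logs-last-60d", today=date(2024, 3, 31))
print(body)
```

&lt;p&gt;Run daily, a script like this keeps the alias pointing at no more than 60 day-wise indices, while the older indices stay in the cluster until the 90-day retention removes them.&lt;/p&gt;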

</description>
      <category>crypto</category>
      <category>web3</category>
      <category>blockchain</category>
      <category>offers</category>
    </item>
    <item>
      <title>Sizing shards using time-based indices</title>
      <dc:creator>Sandeep Kanabar</dc:creator>
      <pubDate>Wed, 01 Feb 2023 11:41:36 +0000</pubDate>
      <link>https://forem.com/sandeepkanabar/sizing-shards-using-time-based-indices-1ebl</link>
      <guid>https://forem.com/sandeepkanabar/sizing-shards-using-time-based-indices-1ebl</guid>
      <description>&lt;p&gt;This post lists a few advantages of making use of time-based indices (as well as &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/data-streams.html"&gt;DataStreams&lt;/a&gt;) in Elasticsearch.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Increasing / Decreasing the number of shards becomes easy&lt;/li&gt;
&lt;li&gt;Helps to plan cluster capacity and growth size&lt;/li&gt;
&lt;li&gt;Easily determine optimum number of shards&lt;/li&gt;
&lt;/ol&gt;

&lt;h5&gt;
  
  
  1. Increasing / Decreasing the number of shards becomes easy &lt;a&gt;&lt;/a&gt;
&lt;/h5&gt;

&lt;p&gt;Say an &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/index-templates.html"&gt;index template&lt;/a&gt; that makes use of &lt;code&gt;day-wise&lt;/code&gt; indices is configured with &lt;code&gt;1 shard&lt;/code&gt; in index settings. In case the indexing rate is slow or the shard size becomes too large (&amp;gt; 40-50 GB), the index template can be easily modified to increase the &lt;code&gt;number_of_shards&lt;/code&gt; to &lt;code&gt;3&lt;/code&gt; or &lt;code&gt;5&lt;/code&gt; or &lt;code&gt;n&lt;/code&gt;, and this takes effect from the &lt;strong&gt;next day&lt;/strong&gt;. Similarly, if a day-wise index pattern is configured with more shards than required (&lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/size-your-shards.html"&gt;oversharded&lt;/a&gt;), reducing the number of shards becomes pretty easy, as it's just a matter of changing the template, which takes effect from the next day's index (existing indices keep their shard count unless they are re-indexed).&lt;/p&gt;
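
&lt;p&gt;As a rough illustration, the template change can be expressed as the body of a composable index template (&lt;code&gt;PUT _index_template/NAME&lt;/code&gt;). The index pattern &lt;code&gt;logs-*&lt;/code&gt; and the helper name are assumptions made for this sketch.&lt;/p&gt;

```python
import json


def daily_index_template(pattern, shards, replicas=1):
    """Body for a composable index template that sets the shard count."""
    return {
        "index_patterns": [pattern],
        "template": {
            "settings": {
                # Applies only to indices created after the template is updated.
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        },
    }


# Hypothetical pattern matching day-wise indices such as logs-2023.02.01.
body = daily_index_template("logs-*", shards=3)
print(json.dumps(body, indent=2))
```

&lt;p&gt;Updating the template with a different &lt;code&gt;shards&lt;/code&gt; value changes only the indices created afterwards, which is why the new shard count shows up from the next day's index.&lt;/p&gt;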

&lt;h5&gt;
  
  
  2. Helps to plan cluster capacity and &lt;a href="https://www.elastic.co/blog/benchmarking-and-sizing-your-elasticsearch-cluster-for-logs-and-metrics"&gt;growth size&lt;/a&gt; &lt;a&gt;&lt;/a&gt;
&lt;/h5&gt;

&lt;p&gt;Let's say &lt;code&gt;100 events per second&lt;/code&gt; flow into an Elasticsearch cluster and &lt;code&gt;each event&lt;/code&gt; averages about &lt;code&gt;1 KB&lt;/code&gt; in size. Thus, per day, there would be: &lt;br&gt;
&lt;code&gt;86400 seconds * 100 events/second = 8,640,000&lt;/code&gt; events. &lt;/p&gt;

&lt;p&gt;Since each event averages about 1 KB, the total size of 8,640,000 events is &lt;code&gt;8,640,000 * 1 KB = 8,640,000 KB&lt;/code&gt;, and &lt;code&gt;8,640,000 KB / (1024 * 1024) = ~8.24 GB&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Thus, with a &lt;code&gt;day-wise index&lt;/code&gt;, rounding up for headroom, the index size would be &lt;code&gt;~9 GB per day&lt;/code&gt; without any replicas. Considering 1 replica, the size per day would be &lt;code&gt;~18 GB&lt;/code&gt; and the size for &lt;code&gt;30 days&lt;/code&gt; would be &lt;code&gt;~540 GB&lt;/code&gt;. This helps with capacity planning and estimating the cluster growth rate.&lt;/p&gt;
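
&lt;p&gt;The same estimate can be scripted so it is easy to re-run with different ingestion rates. This is a minimal sketch using the figures from this post; the function names are illustrative, and rounding the daily size up to a whole GB is an assumption made to leave some headroom.&lt;/p&gt;

```python
import math

SECONDS_PER_DAY = 86400


def daily_primary_gb(events_per_second, avg_event_kb):
    """Primary data per day in GB, before replicas."""
    events_per_day = events_per_second * SECONDS_PER_DAY
    return events_per_day * avg_event_kb / (1024 * 1024)


def storage_gb(daily_gb, replicas, retention_days):
    """Total storage for the retention window, including replica copies."""
    return daily_gb * (1 + replicas) * retention_days


raw = daily_primary_gb(100, 1)   # 100 events per second at about 1 KB each
planned = math.ceil(raw)         # round up to a whole GB for headroom
print(round(raw, 2), planned, storage_gb(planned, 1, 30))
```

&lt;p&gt;With 100 events per second and 1 replica, this reproduces the numbers above: about 8.24 GB of primary data per day, planned as 9 GB, and 540 GB for 30 days.&lt;/p&gt;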

&lt;h5&gt;
  
  
  3. Easily determine optimum number of shards &lt;a&gt;&lt;/a&gt;
&lt;/h5&gt;

&lt;p&gt;With a data set of about &lt;code&gt;9 GB per day&lt;/code&gt;, for a &lt;code&gt;day-wise index&lt;/code&gt;, we could start by setting &lt;code&gt;"number_of_shards" : 1&lt;/code&gt; in the &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/index-templates.html"&gt;index template&lt;/a&gt;, since each &lt;code&gt;primary shard&lt;/code&gt; would be about 9 GB, which is pretty reasonable for a single shard. Shards for &lt;code&gt;time-based&lt;/code&gt; indices can be in the range of &lt;code&gt;10-50 GB&lt;/code&gt; as mentioned &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/size-your-shards.html#shard-size-recommendation"&gt;here&lt;/a&gt;. With a bit of trial and error based on the daily ingestion rate, we can arrive at an optimum shard size that helps in stabilizing the cluster and boosting performance.&lt;/p&gt;

</description>
      <category>sharding</category>
      <category>elasticsearch</category>
    </item>
    <item>
      <title>Guido Lena Cota - Practice Mock Tests for (ECE) exam</title>
      <dc:creator>Sandeep Kanabar</dc:creator>
      <pubDate>Wed, 25 Jan 2023 12:39:25 +0000</pubDate>
      <link>https://forem.com/sandeepkanabar/guido-lena-cota-practice-mock-tests-for-ece-exam-2l9a</link>
      <guid>https://forem.com/sandeepkanabar/guido-lena-cota-practice-mock-tests-for-ece-exam-2l9a</guid>
      <description>&lt;p&gt;While I was preparing for the Elastic Certified Engineer exam, I googled about mock practice tests and chanced upon this &lt;a href="https://medium.com/kreuzwerker-gmbh/elastic-certified-engineer-exam-what-to-expect-and-how-to-rock-it-cf409ed48d7b" rel="noopener noreferrer"&gt;excellent blog series&lt;/a&gt; of 4 tests.(Scroll to the bottom of the blog for links).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F74h39wsfzzk6rxe7kz4l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F74h39wsfzzk6rxe7kz4l.png" alt="Links to my blog series of exercises:" width="800" height="472"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Attempting the exercises in these links did wonders for my confidence, as I found lapses in my preparation strategy, got a chance to correct them and also got exposure to the questions and how to answer them. &lt;/p&gt;

&lt;p&gt;As for the format, while it can be found from Rich Raposa’s &lt;a href="https://youtu.be/9UpB-s_ZfNE?t=1452" rel="noopener noreferrer"&gt;webinar&lt;/a&gt;, the webinar tells you what to expect but attempting the mock exercises actually gives you a &lt;strong&gt;&lt;em&gt;feel&lt;/em&gt;&lt;/strong&gt; of the exam which makes a difference. &lt;/p&gt;

&lt;p&gt;From the exercises, the &lt;a href="https://medium.com/kreuzwerker-gmbh/exercises-for-the-elastic-certified-engineer-exam-deploy-and-operate-a-cluster-b06741760d47" rel="noopener noreferrer"&gt;first part&lt;/a&gt; is about deploying and operating a cluster. The first exercise – configure the nodes to avoid split brain - is no longer on the agenda so that can be skipped. Same way exercise 2 can be skipped as well. Exercise 3 is about optimization and could help you with Cluster Management section e.g. diagnose shard issues and repair a cluster’s health. Exercise 4 is about snapshots, cross-cluster search which again comes under Cluster Management so both exercise 3 and 4 are helpful.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://medium.com/kreuzwerker-gmbh/exercises-for-the-elastic-certified-engineer-exam-store-data-into-elasticsearch-cbce230bcc6" rel="noopener noreferrer"&gt;second part&lt;/a&gt; is pretty relevant as it covers the topics of data management – index, templates (index and dynamic) and also data processing – reindex, ingest pipeline, painless scripting. One thing to note is that the exam can combine multiple objectives into one question. E.g. a question on re-index could also include ingest pipelines. Do attempt all the exercises in this post.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://medium.com/kreuzwerker-gmbh/exercises-for-the-elastic-certified-engineer-exam-model-data-into-elasticsearch-5eb69086cdaa" rel="noopener noreferrer"&gt;third part&lt;/a&gt; covers mapping and text analysis which would be under Data Processing Category. This is very important and would help to attempt all the exercises here. Note that this does &lt;em&gt;&lt;strong&gt;NOT&lt;/strong&gt;&lt;/em&gt; have exercises for &lt;code&gt;Runtime&lt;/code&gt; fields. &lt;/p&gt;

&lt;p&gt;The &lt;a href="https://medium.com/kreuzwerker-gmbh/exercises-for-the-elastic-certified-engineer-exam-search-and-aggregations-1eefcfb6e992" rel="noopener noreferrer"&gt;final part&lt;/a&gt; covers exercises for Search and Aggregations and forms a very important part of exam preparation. One thing to note is: Pipeline and Matrix aggregations don’t look to be in the exam agenda. Please skip whatever doesn’t conform to the agenda. In case of any doubts about whether a particular topic is on exam agenda or not, just drop an email to certification[at]elastic[dot]co. The Elastic Certification team is pretty responsive and helpful. &lt;/p&gt;

&lt;p&gt;Final Note: These exercises are excellent to boost your preparation strategy and level up your confidence but do &lt;em&gt;&lt;strong&gt;NOT&lt;/strong&gt;&lt;/em&gt; cover all the topics so you'll likely need mock exercises from other sources for the missing topics.&lt;/p&gt;

</description>
      <category>welcome</category>
      <category>softwaredevelopment</category>
      <category>startup</category>
      <category>authentication</category>
    </item>
    <item>
      <title>Practice Mock Tests for the Elastic Certified Engineer (ECE) exam</title>
      <dc:creator>Sandeep Kanabar</dc:creator>
      <pubDate>Sun, 22 Jan 2023 17:35:22 +0000</pubDate>
      <link>https://forem.com/sandeepkanabar/practice-mock-tests-for-the-elastic-certified-engineer-ece-exam-ali</link>
      <guid>https://forem.com/sandeepkanabar/practice-mock-tests-for-the-elastic-certified-engineer-ece-exam-ali</guid>
      <description>&lt;p&gt;So you've gone through the syllabus and prepared all the necessary topics for the ECE exam. You are days away from attempting the exam. &lt;em&gt;How do you get that booster dose of confidence to tackle the exam&lt;/em&gt;? How do you avoid those nervous moments on seeing the first question in the exam and feeling your heart sink? There are times when we prepare the syllabus in entirety and yet we are short of confidence in facing the exam &lt;strong&gt;&lt;em&gt;due to lack of practice of facing the questions&lt;/em&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That's where &lt;strong&gt;&lt;code&gt;Mock tests and practice exercises&lt;/code&gt;&lt;/strong&gt; can help. They help us to validate and solidify our preparation strategy. They sort of give us the much needed fillip to bolster our confidence and face the exam. Imagine preparing a question and seeing a similar one appearing in the exam. Instead of nervousness, we smile and are excited to tackle the exam.&lt;/p&gt;

&lt;p&gt;Once my preparation was done, roughly 2-3 days before the exam, I began preparing mock tests that were available for free. I didn't have much time in hand and was just looking to up my confidence. Also, &lt;em&gt;preparing is one thing but seeing the questions and forming your thoughts and answering is altogether a different thing&lt;/em&gt;. &lt;/p&gt;

&lt;p&gt;For my exam, I gained a lot of confidence by practising the mock tests and exercises available for free at &lt;a href="https://medium.com/kreuzwerker-gmbh/elastic-certified-engineer-exam-what-to-expect-and-how-to-rock-it-cf409ed48d7b" rel="noopener noreferrer"&gt;guido-lenacota&lt;/a&gt; and &lt;a href="https://acloudguru.com/hands-on-labs/ece-practice-exam-part-1" rel="noopener noreferrer"&gt;acloudguru&lt;/a&gt;; however, you can also choose to purchase the &lt;a href="https://www.elastic.co/training/subscriptions" rel="noopener noreferrer"&gt;elastic subscriptions&lt;/a&gt;. The standard and professional versions come with a practice exam attempt, which is very beneficial. Even &lt;a href="https://www.udemy.com/course/elastic-certified-engineer-exam/" rel="noopener noreferrer"&gt;Udemy&lt;/a&gt; has come up with a practice test, which is pretty affordable. &lt;/p&gt;

&lt;p&gt;Hope this post helps you in your certification journey.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Disclaimer&lt;/em&gt;&lt;/strong&gt;: This is not to suggest that mock tests and practice exercises are mandatory. &lt;strong&gt;&lt;em&gt;Even without them&lt;/em&gt;&lt;/strong&gt;, you can ace the exam and pass it with flying colours. It's just that for some of us, we feel better prepared on attempting mock tests.&lt;/p&gt;

</description>
      <category>discuss</category>
      <category>productivity</category>
      <category>career</category>
    </item>
    <item>
      <title>Elastic Certified Engineer Certification - is it worth?</title>
      <dc:creator>Sandeep Kanabar</dc:creator>
      <pubDate>Sun, 08 Jan 2023 12:04:30 +0000</pubDate>
      <link>https://forem.com/sandeepkanabar/elastic-certified-engineer-certification-is-it-worth-4l55</link>
      <guid>https://forem.com/sandeepkanabar/elastic-certified-engineer-certification-is-it-worth-4l55</guid>
      <description>&lt;p&gt;Is it worth achieving &lt;a href="https://www.elastic.co/training/elastic-certified-engineer-exam"&gt;Elastic Certified Engineer Certification&lt;/a&gt;?&lt;/p&gt;

&lt;p&gt;Oftentimes we wonder whether the &lt;strong&gt;Elastic Certified Engineer&lt;/strong&gt; exam is worth the effort. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Will it add value to our career? &lt;/li&gt;
&lt;li&gt;Will it boost our knowledge and understanding of the core concepts? &lt;/li&gt;
&lt;li&gt;Will it help us with practical implementation of ES in our day-to-day job?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I had the very &lt;em&gt;&lt;strong&gt;same&lt;/strong&gt;&lt;/em&gt; questions when I signed up for the Elastic Certified Engineer exam and it wasn't until I actually got to solid preparation that most of my above doubts were laid to rest. The remaining lingering doubts got resolved when I had a cursory glance at the excellent questions that were part of my exam. &lt;/p&gt;

&lt;p&gt;This blog post is a culmination of my thoughts on above questions. Please note that I'm sharing this entirely in my &lt;strong&gt;personal capacity and passion&lt;/strong&gt;. &lt;/p&gt;

&lt;h3&gt;
  
  
  Thoughts at Preparation Stage
&lt;/h3&gt;

&lt;p&gt;After I signed up for the exam, given that we all have a busy work life, I kept wondering &lt;em&gt;&lt;strong&gt;why did I even sign up for this&lt;/strong&gt;&lt;/em&gt;? I kept telling myself why did I unnecessarily over-burden myself. As the deadline kept approaching, these voices grew louder and a part of me just wanted to skip giving this exam. "&lt;em&gt;&lt;strong&gt;After all, no one would know if you don't give the exam. Just chuck it&lt;/strong&gt;&lt;/em&gt;" - my mind voice told me. I wasn't convinced. I was determined to give the exam since I was very passionate about ES. "&lt;em&gt;&lt;strong&gt;You've quite a few years of experience with ES. You don't need to practice much. You'll easily sail through&lt;/strong&gt;&lt;/em&gt;", the voice croaked again. For a better part I listened to that voice until it was just about 3 weeks to go for the exam. I then decided to sit down and go through the agenda first to get a gist of the topics and my heart sank. There was &lt;strong&gt;&lt;code&gt;SO MUCH I DIDN'T KNOW&lt;/code&gt;&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;In an instant, all that overconfidence vanished. All those thoughts of I-know-ES-well melted. &lt;strong&gt;Humility returned&lt;/strong&gt;. Began reading very seriously every single day for a few hours. Pored through the amazing &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/elasticsearch-intro.html"&gt;documentation&lt;/a&gt; and found there was so much to learn. So many new things. So much of un-learning and re-learning. &lt;/p&gt;

&lt;p&gt;I was pretty familiar with ES 6.x, managing multiple 6.x stacks in &lt;code&gt;production&lt;/code&gt;, but the exam back then was on version 7.13. (Right now it's on version &lt;a href="https://www.elastic.co/training/certification/faq"&gt;&lt;strong&gt;8.1&lt;/strong&gt;&lt;/a&gt;. Elastic doesn't change exam versions too frequently and that's a good thing.) There were so many improvements from 6.x to 7.x that it felt overwhelming. Went through all the breaking changes docs to familiarise myself with what's new and then came back to the exam agenda. While I was quite familiar with ingest processors, I realised that a large number of ingest processors had been added in 7.x that would make my work a breeze. What would previously take me a block of code with if and else would now be just a few lines.&lt;/p&gt;

&lt;p&gt;I had fond memories of implementing cross-cluster replication using Kafka MirrorMaker way back in 2017. It was refreshing to see &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/master/ccr-getting-started-tutorial.html"&gt;Cross-Cluster Replication (CCR)&lt;/a&gt; in action making the replication so easy and super intuitive. I was used to one way of writing queries and reading the documentation during preparation made me realise I could write it much better and more efficiently. There were so many &lt;strong&gt;&lt;code&gt;AHA moments&lt;/code&gt;&lt;/strong&gt; during preparing and I made multiple mental notes of correcting a couple of things and replacing those with their more efficient counterparts.&lt;/p&gt;

&lt;p&gt;The exam required knowledge of runtime fields and this was something very new for me. &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/runtime.html"&gt;Runtime fields&lt;/a&gt; are incredibly powerful if used the right way. As I continued preparing, I couldn't help but thank my stars that I signed up for the exam as it made me go through all these topics and get a good grasp on a number of things I wasn't aware of. &lt;/p&gt;

&lt;p&gt;I always made use of time-based indices and it was a pleasant surprise to see the wonderful engineers at Elastic introduce &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/data-streams.html"&gt;data streams&lt;/a&gt; which was a much-needed feature. &lt;/p&gt;

&lt;h3&gt;
  
  
  Thoughts during Exam time
&lt;/h3&gt;

&lt;p&gt;The first thing I did when my exam started was to have a cursory glance at the questions to gauge my preparation and confidence level. One look at the questions and I couldn't help marvelling. The questions were very practical, in fact quite a few matched what I was actually doing in my day-to-day job. I had attempted a particular re-indexing strategy at work and a question on re-index was pretty similar to what I had done. I smiled. The questions on queries, aggregation, ccr were so relevant and resembling real world scenarios. I also realised there was neither guesswork nor cramming involved here. &lt;strong&gt;&lt;em&gt;&lt;code&gt;You needed to really understand the ins and outs of concepts to clear this exam and at that point I felt really glad that I took up this exam which enriched my knowledge immensely&lt;/code&gt;&lt;/em&gt;&lt;/strong&gt;. A huge shoutout and &lt;strong&gt;respect&lt;/strong&gt; to the creators of this exam. &lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;Hopefully, this blog post makes it easier for you in case you are wondering if it's worth attempting the Elastic Certified Engineer Exam. &lt;strong&gt;&lt;code&gt;YES, IT IS. It would be the most productive use of your time and effort. Go for it.&lt;/code&gt;&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>elk</category>
      <category>elasticsearch</category>
      <category>certification</category>
      <category>attempt</category>
    </item>
    <item>
      <title>Acing Elastic Certified Engineer Exam</title>
      <dc:creator>Sandeep Kanabar</dc:creator>
      <pubDate>Sun, 08 Jan 2023 03:03:13 +0000</pubDate>
      <link>https://forem.com/sandeepkanabar/acing-elastic-certified-engineer-exam-1cm1</link>
      <guid>https://forem.com/sandeepkanabar/acing-elastic-certified-engineer-exam-1cm1</guid>
      <description>&lt;p&gt;This blog post shares some basic tips and techniques that can be helpful in acing the Elastic Certified Engineer Exam.&lt;/p&gt;

&lt;h3&gt;
  
  
  First things first
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Check the Exam Version (8.1 at the time of writing this post)&lt;/li&gt;
&lt;li&gt;Note the duration - 3 hours&lt;/li&gt;
&lt;li&gt;Check Exam Topics&lt;/li&gt;
&lt;li&gt;Read the complete FAQ&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of the above can be found at &lt;a href="https://www.elastic.co/training/elastic-certified-engineer-exam" rel="noopener noreferrer"&gt;Elastic Certified Engineer Exam page&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;Please do not miss to read the &lt;a href="https://www.elastic.co/training/certification/faq" rel="noopener noreferrer"&gt;FAQ&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Next Steps
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Without missing a beat, go through Rich Raposa's &lt;a href="https://www.youtube.com/watch?v=9UpB-s_ZfNE&amp;amp;t=1579s" rel="noopener noreferrer"&gt;webinar&lt;/a&gt;. This is extremely important as it not only gives an overview of what to expect in the exam, but also shows what the exam environment looks like. In essence, it gives some familiarity with the exam environment, which is immensely helpful. The webinar is dated July 2021 but it's still pretty relevant. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Read this excellent &lt;a href="https://www.linkedin.com/pulse/elastic-certified-engineer-exam-my-experience-how-i-surbhi-mahajan/" rel="noopener noreferrer"&gt;post&lt;/a&gt; by Surbhi Mahajan. It has links to a plethora of resources including mock tests.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Re-read the Exam Topics. This is extremely important as it's easy to get waylaid during exam preparation when curiosity gets the better of us and we start reading and trying out topics that are not on the syllabus / agenda. If you have any questions related to the topics, send an email to &lt;a href="mailto:certification@elastic.co"&gt;certification@elastic.co&lt;/a&gt;. The certification folks at Elastic are pretty empathetic and very responsive.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Preparation Strategy
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;The exam topics have 5 main sections. The questions will be distributed across them, so it makes sense to prepare from all the sections.&lt;/li&gt;
&lt;li&gt;From each section, read a few topics and gain some confidence. Confidence is the key here. If you find yourself losing confidence, take up a topic that's more familiar, tackle it, regain confidence and then take up challenging / unfamiliar topics. &lt;/li&gt;
&lt;li&gt;The exam has around 10 questions with a time duration of 3 hours (180 minutes). That works out to approximately 17 minutes per question, with a buffer of 10 minutes. During the exam, don't spend too much time on one question as it affects confidence. First attempt the questions that seem familiar and then use the remaining time to attempt the other questions. There's &lt;code&gt;partial grading&lt;/code&gt; so do attempt &lt;code&gt;all&lt;/code&gt; the questions.&lt;/li&gt;
&lt;li&gt;Be well-versed with &lt;a href="https://www.elastic.co/guide/en/elasticsearch/reference/8.5/elasticsearch-intro.html" rel="noopener noreferrer"&gt;Elastic Documentation&lt;/a&gt;. Know how to navigate it and find the relevant info. The documentation is fully available to use during the exam, so there is no need to memorise the APIs. Just remember where to look.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hopefully this helps you in your preparation for the ECE Exam. Good luck!&lt;/p&gt;

</description>
      <category>tooling</category>
      <category>discuss</category>
      <category>softwaredevelopment</category>
    </item>
    <item>
      <title>Optimum Sharding strategy in OpenSearch</title>
      <dc:creator>Sandeep Kanabar</dc:creator>
      <pubDate>Tue, 01 Nov 2022 06:56:27 +0000</pubDate>
      <link>https://forem.com/aws-builders/optimum-sharding-strategy-in-opensearch-3fa8</link>
      <guid>https://forem.com/aws-builders/optimum-sharding-strategy-in-opensearch-3fa8</guid>
      <description>&lt;p&gt;This article explores a few tips on optimum &lt;code&gt;sharding&lt;/code&gt; strategy in OpenSearch.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Use time-based indices wherever possible. They have a number of advantages, as described in &lt;a href="https://dev.to/aws-builders/advantages-of-using-time-based-indices-in-opensearch-2eie"&gt;this&lt;/a&gt; article. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If unsure, begin with &lt;code&gt;1&lt;/code&gt; shard. Time-based indices offer the flexibility of changing the number of shards for newly created indices at any time. &lt;br&gt;
E.g.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;If the event count per second is 100 and each event is 1 KB, then per day
number of events = 100 per sec * 86,400 secs in a day = 8,640,000
approx size of each event = 1 KB
size of all events per day = 1 KB * 8,640,000 = 8,640,000 KB = ~9 GB per day
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A single shard can comfortably hold around &lt;code&gt;30-50&lt;/code&gt; GB of data. In the above scenario, with a daily dataset size of &lt;code&gt;9 GB&lt;/code&gt;, a &lt;code&gt;single shard&lt;/code&gt; should suffice for day-wise indices.&lt;/p&gt;
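
&lt;p&gt;The arithmetic above can be wrapped in a small helper. This is only a rough sketch: the function names and the default target of &lt;code&gt;30&lt;/code&gt; GB per shard are assumptions chosen for illustration, and sizes are computed in decimal GB (1 GB = 1,000,000 KB).&lt;/p&gt;

```python
import math

SECONDS_PER_DAY = 86_400


def daily_index_size_gb(events_per_second: int, event_size_kb: float) -> float:
    """Approximate primary data written per day, in decimal GB."""
    return events_per_second * SECONDS_PER_DAY * event_size_kb / 1_000_000


def primary_shards_needed(daily_gb: float, target_shard_gb: float = 30.0) -> int:
    """Smallest shard count that keeps each primary shard at or under the target size.

    target_shard_gb is a hypothetical knob; the default is the lower end of
    the 30-50 GB range mentioned in the text.
    """
    return max(1, math.ceil(daily_gb / target_shard_gb))


# Scenario from the text: 100 events per second at 1 KB each.
size = daily_index_size_gb(100, 1)
print(size, primary_shards_needed(size))
```

&lt;p&gt;This considers size only. Indexing throughput may justify more shards than the size-based minimum.&lt;/p&gt;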

&lt;ol start="3"&gt;
&lt;li&gt;Consider another scenario:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;If the event count per second is 200 and each event is 2 KB, then per day
number of events = 200 per sec * 86,400 secs in a day = 17,280,000
approx size of each event = 2 KB
size of all events per day = 2 KB * 17,280,000 = 34,560,000 KB = ~34 GB per day
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here too, a single shard might suffice on size alone, but it would slow down indexing. Opting for 3 primary shards would mean each shard is ~12 GB. &lt;/p&gt;

&lt;ol start="4"&gt;
&lt;li&gt;&lt;p&gt;For the scenario discussed in point 3, shards of ~12 GB might look too small, but past indices, being read-only, could be &lt;a href="https://aws.amazon.com/premiumsupport/knowledge-center/opensearch-deleted-documents/"&gt;force-merged&lt;/a&gt; to 1 segment. Alternatively, the number of shards could be reduced for past indices by re-indexing them, e.g. reindex day-wise indices into monthly indices and then force-merge them. For example, 30 day-wise indices with 1 shard each (30 shards in total) could become a single monthly index with, say, 9 or 12 shards, depending on the shard size.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The best way is to experiment and find out what works best. Day-wise indices offer scope to experiment as the template can easily be modified to vary the number of shards for newly created indices.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Keep shards evenly sized, even across different types of indices. E.g. say the &lt;code&gt;twitter&lt;/code&gt; index has &lt;code&gt;5 shards&lt;/code&gt; of &lt;code&gt;10 GB&lt;/code&gt; each; then design the &lt;code&gt;posts&lt;/code&gt; index so that its shard size is also approximately &lt;code&gt;10-15 GB&lt;/code&gt; or &lt;code&gt;10-20 GB&lt;/code&gt;. The reason is that if a &lt;code&gt;twitter&lt;/code&gt; index shard is &lt;code&gt;10 GB&lt;/code&gt; and a &lt;code&gt;posts&lt;/code&gt; index shard is, say, &lt;code&gt;50 GB&lt;/code&gt;, disk usage across nodes might become uneven.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
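
&lt;p&gt;The reindex-then-force-merge idea for past indices could be sketched as follows. The helpers only build the request body and path for the &lt;code&gt;_reindex&lt;/code&gt; and &lt;code&gt;_forcemerge&lt;/code&gt; APIs; the index naming scheme (&lt;code&gt;prefix-YYYY.MM.DD&lt;/code&gt; for daily indices, &lt;code&gt;prefix-monthly-YYYY.MM&lt;/code&gt; for the target) and the function names are assumptions made for illustration.&lt;/p&gt;

```python
import json


def monthly_reindex_body(prefix: str, year: int, month: int) -> dict:
    """Body for POST /_reindex that copies one month of daily indices into one index.

    Assumes daily indices are named like my_index-2022.10.01 (hypothetical scheme).
    """
    month_tag = f"{year:04d}.{month:02d}"
    return {
        "source": {"index": f"{prefix}-{month_tag}.*"},
        "dest": {"index": f"{prefix}-monthly-{month_tag}"},
    }


def forcemerge_path(index: str, max_num_segments: int = 1) -> str:
    """Path for POST that force-merges an index down to the given segment count."""
    return f"/{index}/_forcemerge?max_num_segments={max_num_segments}"


body = monthly_reindex_body("my_index", 2022, 10)
print("POST /_reindex")
print(json.dumps(body, indent=2))
print("POST " + forcemerge_path(body["dest"]["index"]))
```

&lt;p&gt;The monthly index would typically be created beforehand with the desired number of shards, since the reindex request itself does not choose a shard count.&lt;/p&gt;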

&lt;p&gt;Feel free to add your questions / thoughts in the comments below. &lt;/p&gt;

</description>
      <category>opensearch</category>
      <category>aws</category>
      <category>sharding</category>
    </item>
    <item>
      <title>Advantages of using time-based indices in OpenSearch</title>
      <dc:creator>Sandeep Kanabar</dc:creator>
      <pubDate>Sat, 06 Nov 2021 22:37:17 +0000</pubDate>
      <link>https://forem.com/aws-builders/advantages-of-using-time-based-indices-in-opensearch-2eie</link>
      <guid>https://forem.com/aws-builders/advantages-of-using-time-based-indices-in-opensearch-2eie</guid>
      <description>&lt;p&gt;This post lists a few advantages of using time-based indices in OpenSearch Cluster.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Increasing / Decreasing the number of shards becomes easy&lt;/li&gt;
&lt;li&gt;Helps to plan cluster capacity and growth size&lt;/li&gt;
&lt;li&gt;Easily determine optimum number of shards&lt;/li&gt;
&lt;li&gt;Avoids having to reindex entire data&lt;/li&gt;
&lt;li&gt;Efficient Deletion and application of ISM&lt;/li&gt;
&lt;li&gt;Easy to include / exclude indices based on alias&lt;/li&gt;
&lt;li&gt;Snapshot and Restore becomes a breeze with day-wise indices&lt;/li&gt;
&lt;li&gt;Apply best_compression to day-wise indices&lt;/li&gt;
&lt;li&gt;Force-merge past indices&lt;/li&gt;
&lt;/ol&gt;

&lt;h5&gt;
  
  
  1. Increasing / Decreasing the number of shards becomes easy &lt;a&gt;&lt;/a&gt;
&lt;/h5&gt;

&lt;p&gt;Say an &lt;a href="https://opensearch.org/docs/latest/opensearch/index-templates/"&gt;index template&lt;/a&gt; that makes use of &lt;code&gt;day-wise&lt;/code&gt; indices is configured with &lt;code&gt;1 shard&lt;/code&gt; in index settings. If the indexing rate is slow or the shard size becomes too large (&amp;gt; 50 GB), the index template can easily be modified to increase &lt;code&gt;number_of_shards&lt;/code&gt; to &lt;code&gt;3&lt;/code&gt; or &lt;code&gt;5&lt;/code&gt;. The change takes effect from the &lt;strong&gt;next day&lt;/strong&gt;, when the next index is created. Similarly, if a day-wise index pattern is configured with more shards than required (&lt;a href="https://aws.amazon.com/blogs/big-data/best-practices-for-configuring-your-amazon-opensearch-service-domain/"&gt;oversharded&lt;/a&gt;), reducing the count is just as easy.&lt;/p&gt;
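
&lt;p&gt;As a rough illustration, the template body for such a change might be built like this. The template name &lt;code&gt;my_index_template&lt;/code&gt;, the pattern &lt;code&gt;my_index-*&lt;/code&gt; and the helper function are assumptions; the resulting body would be sent with &lt;code&gt;PUT _index_template/my_index_template&lt;/code&gt;.&lt;/p&gt;

```python
import json


def daily_index_template(pattern: str, number_of_shards: int,
                         number_of_replicas: int = 1) -> dict:
    """Build an index template body for day-wise indices matching `pattern`."""
    return {
        "index_patterns": [pattern],
        "template": {
            "settings": {
                "number_of_shards": number_of_shards,
                "number_of_replicas": number_of_replicas,
            }
        },
    }


# Bump tomorrow's index from 1 shard to 3; existing indices are not affected.
template = daily_index_template("my_index-*", number_of_shards=3)
print("PUT _index_template/my_index_template")
print(json.dumps(template, indent=2))
```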

&lt;h5&gt;
  
  
  2. Helps to plan cluster capacity and &lt;a href="https://docs.aws.amazon.com/opensearch-service/latest/developerguide/sizing-domains.html"&gt;growth size&lt;/a&gt; &lt;a&gt;&lt;/a&gt;
&lt;/h5&gt;

&lt;p&gt;Let's say &lt;code&gt;100 events per second&lt;/code&gt; flow into an OpenSearch cluster and &lt;code&gt;each event&lt;/code&gt; averages about &lt;code&gt;1 KB&lt;/code&gt; in size. Thus, per day, there would be: &lt;br&gt;
&lt;code&gt;86400 seconds * 100 events/second = 8,640,000&lt;/code&gt; events. &lt;/p&gt;

&lt;p&gt;Since each event averages about 1 KB, the total size of 8,640,000 events is &lt;code&gt;8,640,000 * 1 KB = 8,640,000 KB&lt;/code&gt;, and &lt;code&gt;8,640,000 KB / (1024 * 1024) = ~8.24 GB&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Thus, with a &lt;code&gt;day-wise index&lt;/code&gt; and rounding up, the index size would be &lt;code&gt;~9 GB per day&lt;/code&gt; without any replicas. With 1 replica, the size per day would be &lt;code&gt;~18 GB&lt;/code&gt; and the size for &lt;code&gt;30 days&lt;/code&gt; would be &lt;code&gt;~540 GB&lt;/code&gt;. This helps with capacity planning and estimating the cluster growth rate.&lt;/p&gt;
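
&lt;p&gt;The same estimate as a tiny helper, in case the inputs change. The function name is hypothetical, and the figure covers raw index data only (no allowance for any other overhead).&lt;/p&gt;

```python
def storage_gb(daily_primary_gb: float, replicas: int, retention_days: int) -> float:
    """Total index data for the retention window, counting primaries plus replicas."""
    copies = 1 + replicas  # one primary copy plus each replica copy
    return daily_primary_gb * copies * retention_days


# Numbers from the text: ~9 GB per day, 1 replica, 30 days.
print(storage_gb(9, replicas=1, retention_days=30))
```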
&lt;h5&gt;
  
  
  3. Easily determine optimum number of shards &lt;a&gt;&lt;/a&gt;
&lt;/h5&gt;

&lt;p&gt;With a data set of about &lt;code&gt;9 GB per day&lt;/code&gt;, for a &lt;code&gt;day-wise index&lt;/code&gt;, we could start by setting &lt;code&gt;"number_of_shards" : 1&lt;/code&gt; in the &lt;a href="https://opensearch.org/docs/latest/opensearch/index-templates/"&gt;index template&lt;/a&gt; since each &lt;code&gt;primary shard&lt;/code&gt; would be about 9 GB, which is pretty reasonable for a single shard. Shards for &lt;code&gt;time-based&lt;/code&gt; indices can be in the range of &lt;code&gt;40-50 GB&lt;/code&gt;.&lt;/p&gt;
&lt;h5&gt;
  
  
  4. Avoids having to &lt;a href="https://opensearch.org/docs/latest/opensearch/reindex-data/"&gt;reindex&lt;/a&gt; entire data &lt;a&gt;&lt;/a&gt;
&lt;/h5&gt;

&lt;p&gt;If the data influx increases, we could easily set &lt;code&gt;"number_of_shards": 3&lt;/code&gt; in the &lt;a href="https://opensearch.org/docs/latest/opensearch/index-templates/"&gt;index template&lt;/a&gt; and this would take effect for &lt;code&gt;tomorrow's&lt;/code&gt; day-wise index. The number of shards can thus be changed without the need to reindex any data.&lt;/p&gt;
&lt;h5&gt;
  
  
  5. Efficient Deletion and application of ISM &lt;a&gt;&lt;/a&gt;
&lt;/h5&gt;

&lt;p&gt;Let's say we need to retain data for up to 90 days. Any day-wise index that is &lt;code&gt;older than 90 days&lt;/code&gt; can then be purged / deleted as a whole. This is far more efficient than purging individual records from indices.&lt;br&gt;
Also, application of &lt;a href="https://docs.aws.amazon.com/opensearch-service/latest/developerguide/ism.html"&gt;index state management&lt;/a&gt; becomes simplified with time-based indices.&lt;/p&gt;
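
&lt;p&gt;A minimal sketch of how the indices past retention could be picked out, assuming daily indices are named &lt;code&gt;prefix-YYYY.MM.DD&lt;/code&gt; (the naming scheme and the function name are assumptions). Each returned name could then be removed with a single &lt;code&gt;DELETE /index_name&lt;/code&gt; call, or the same rule could be expressed as an ISM policy.&lt;/p&gt;

```python
from datetime import date, timedelta


def expired_indices(existing: list, prefix: str, today: date,
                    retention_days: int = 90) -> list:
    """Return the daily indices whose date is more than retention_days before today."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in existing:
        if not name.startswith(prefix + "-"):
            continue  # belongs to some other index family
        suffix = name[len(prefix) + 1:]
        try:
            year, month, day = (int(part) for part in suffix.split("."))
            index_day = date(year, month, day)
        except ValueError:
            continue  # suffix is not a YYYY.MM.DD date
        if cutoff > index_day:
            expired.append(name)
    return expired


names = ["my_index-2021.08.07", "my_index-2021.08.08", "my_index-2021.11.05"]
for name in expired_indices(names, "my_index", date(2021, 11, 6)):
    print("DELETE /" + name)
```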
&lt;h5&gt;
  
  
  6. Easy to include / exclude indices based on alias &lt;a&gt;&lt;/a&gt;
&lt;/h5&gt;

&lt;p&gt;Let's assume the cluster needs to retain 90 days of data but needs to search only the &lt;code&gt;last 60 days&lt;/code&gt; of data. &lt;strong&gt;Alias&lt;/strong&gt; to the rescue. In this case, define an &lt;a href="https://opensearch.org/docs/latest/opensearch/index-alias/"&gt;alias&lt;/a&gt; in the index template so that it gets mapped to newly created day-wise indices. As soon as a past index becomes older than 60 days, the alias is removed from that index. This ensures that at any given point of time, the alias points to a maximum of 60 day-wise indices.&lt;/p&gt;
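
&lt;p&gt;The daily alias clean-up might look roughly like this. The helper builds a body for &lt;code&gt;POST /_aliases&lt;/code&gt;; the alias name &lt;code&gt;my_index-search&lt;/code&gt;, the &lt;code&gt;prefix-YYYY.MM.DD&lt;/code&gt; naming scheme and the choice to drop the index dated exactly 60 days before today are assumptions for illustration.&lt;/p&gt;

```python
import json
from datetime import date, timedelta


def alias_removal_body(alias: str, prefix: str, today: date,
                       window_days: int = 60) -> dict:
    """Body for POST /_aliases that detaches the alias from the index leaving the window.

    With today's index included, keeping window_days indices means the index
    dated (today - window_days) is the one to drop.
    """
    leaving = today - timedelta(days=window_days)
    index_name = f"{prefix}-{leaving.strftime('%Y.%m.%d')}"
    return {"actions": [{"remove": {"index": index_name, "alias": alias}}]}


body = alias_removal_body("my_index-search", "my_index", date(2021, 11, 5))
print("POST /_aliases")
print(json.dumps(body, indent=2))
```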
&lt;h5&gt;
  
  
  7. Snapshot and Restore becomes a breeze with day-wise indices &lt;a&gt;&lt;/a&gt;
&lt;/h5&gt;

&lt;p&gt;Say you have an index named &lt;code&gt;my_index-2021.11.04&lt;/code&gt; created on Nov 04, 2021. On &lt;code&gt;Nov 05, 2021 at say 00:45&lt;/code&gt; hours, when data is no longer being written to &lt;code&gt;my_index-2021.11.04&lt;/code&gt;, a snapshot, &lt;code&gt;snap-my_index-2021.11.04&lt;/code&gt;, could be triggered for that index. This snapshot would contain just &lt;code&gt;my_index-2021.11.04&lt;/code&gt;. If the index is deleted and needs to be restored, it can easily be restored from the snapshot &lt;code&gt;snap-my_index-2021.11.04&lt;/code&gt;.&lt;/p&gt;
&lt;h5&gt;
  
  
  8. Apply best_compression to day-wise indices &lt;a&gt;&lt;/a&gt;
&lt;/h5&gt;

&lt;p&gt;The &lt;a href="https://opensearch.org/docs/latest/opensearch/index-templates/"&gt;index template&lt;/a&gt; can be modified to set &lt;code&gt;"codec": "best_compression"&lt;/code&gt; in index settings i.e.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    "settings": {
      "codec": "best_compression"
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Depending on the use case, this could help to &lt;code&gt;save disk space from 10% to 30%&lt;/code&gt; or even more. Your mileage may vary.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;"codec": "best_compression"&lt;/code&gt; &lt;strong&gt;CANNOT&lt;/strong&gt; be applied dynamically to existing &lt;code&gt;open&lt;/code&gt; indices. The index needs to be &lt;a href="https://opensearch.org/docs/latest/opensearch/rest-api/index-apis/close-index/"&gt;closed&lt;/a&gt; first, then the setting applied, and then the index needs to be &lt;a href="https://endpoint_name/my_index-2021.11.05/_open"&gt;opened&lt;/a&gt; again.&lt;/p&gt;
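
&lt;p&gt;The close, update, open sequence can be written down as an ordered list of requests. This sketch only builds the method, path and body for each step and does not send anything; the helper name is hypothetical and the index name is the example used above.&lt;/p&gt;

```python
import json


def compression_change_steps(index: str) -> list:
    """Ordered (method, path, body) tuples to switch an existing index to best_compression."""
    return [
        ("POST", f"/{index}/_close", None),
        ("PUT", f"/{index}/_settings", {"index": {"codec": "best_compression"}}),
        ("POST", f"/{index}/_open", None),
    ]


for method, path, body in compression_change_steps("my_index-2021.11.05"):
    line = f"{method} {path}"
    if body is not None:
        line += " " + json.dumps(body)
    print(line)
```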

&lt;h5&gt;
  
  
  9. Force-merge past indices &lt;a&gt;&lt;/a&gt;
&lt;/h5&gt;

&lt;p&gt;Since data gets written only to the current day's index, all past indices are effectively read-only as long as past data is never updated. Such indices can therefore be &lt;a href="https://docs.aws.amazon.com/opensearch-service/latest/developerguide/supported-operations.html#version_7_10"&gt;force-merged&lt;/a&gt; by setting &lt;code&gt;"max_num_segments":1&lt;/code&gt;. This can boost search speed considerably. &lt;/p&gt;

</description>
      <category>aws</category>
      <category>productivity</category>
      <category>performance</category>
      <category>opensearch</category>
    </item>
  </channel>
</rss>
