<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Forem: Quadcode team</title>
    <description>The latest articles on Forem by Quadcode team (@quadcode_team).</description>
    <link>https://forem.com/quadcode_team</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F921200%2Ff21ab34e-f33d-42dd-b670-14f8b4f02e44.png</url>
      <title>Forem: Quadcode team</title>
      <link>https://forem.com/quadcode_team</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.com/feed/quadcode_team"/>
    <language>en</language>
    <item>
      <title>Where to find Quadcode experts and news</title>
      <dc:creator>Quadcode team</dc:creator>
      <pubDate>Mon, 20 Mar 2023 09:18:59 +0000</pubDate>
      <link>https://forem.com/quadcode_team/where-to-find-quadcode-experts-and-news-1ll6</link>
      <guid>https://forem.com/quadcode_team/where-to-find-quadcode-experts-and-news-1ll6</guid>
      <description>&lt;p&gt;Hi, this is Quadcode. We will no longer blog on Dev.to, but our experts will continue to share their knowledge about fintech, IT, HR, design, and management on other platforms. &lt;/p&gt;

&lt;p&gt;Check us out on other platforms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://medium.com/@quadcode"&gt;Blog on Medium&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.instagram.com/quadcode_life/"&gt;Instagram page&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.linkedin.com/company/quadcodecareer/"&gt;LinkedIn page&lt;/a&gt;. &lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.facebook.com/profile.php?id=100064956156082"&gt;Facebook group&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Stay tuned!&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Clustered and Foreign Tables in PostgreSQL</title>
      <dc:creator>Quadcode team</dc:creator>
      <pubDate>Fri, 10 Mar 2023 15:31:52 +0000</pubDate>
      <link>https://forem.com/quadcode/clustered-and-foreign-tables-in-postgresql-3kcb</link>
      <guid>https://forem.com/quadcode/clustered-and-foreign-tables-in-postgresql-3kcb</guid>
      <description>&lt;p&gt;My name is Yakupov Azat. I'm a Data Architect, and here is the continuation of the saga of table types in PostgreSQL. In this part, we'll talk about clustered and foreign tables. Let's look at examples of their creation, areas of application, and pros and cons of their use.&lt;/p&gt;

&lt;h2&gt;
  
  
  Clustered Tables in PostgreSQL
&lt;/h2&gt;

&lt;p&gt;In PostgreSQL, a clustered table is a table whose physical order of rows has been changed to match the order of the rows in an index. This is different from the normal behavior of a table, where the physical order of the rows may not match the order of the rows as defined by any particular index.&lt;/p&gt;

&lt;p&gt;Few people like chaos; everyone likes order. Within relational databases, the concept of chaos is closely intertwined with how information is stored, because throughout its life cycle a table is constantly changing. &lt;/p&gt;

&lt;p&gt;As the DBMS works at the disk level, the contents of the table keep moving. For example, you update a row, and the new version lands on another page of the table (here we should mention &lt;a href="https://dev.to/quadcode/structure-of-heap-table-in-postgresql-52oo"&gt;FILLFACTOR&lt;/a&gt;), leaving a dead tuple in the old position. Later the autovacuum process removes the dead tuple, and the vacated slot is filled by the next incoming row. &lt;/p&gt;

&lt;p&gt;Here is a simple test you can do yourself. Run the following commands against a freshly created table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INSERT INTO test(id,name) VALUES(1, ‘Peter’);
INSERT INTO test(id,name) VALUES(2, ‘Ivan’);
INSERT INTO test(id,name) VALUES(3, ‘Sergey’);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After executing the SQL query (please note, there is no &lt;code&gt;ORDER BY&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT *
   FROM test;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You'll see the expected picture:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--IbjlPdzo--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/gtg3ac082k30m3gzdzjd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--IbjlPdzo--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/gtg3ac082k30m3gzdzjd.png" alt="Image description" width="300" height="165"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But by doing a row update&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;UPDATE test
  SET name = 'Ruslan'
WHERE id = 2;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and then by executing the same SQL, you'll get:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--q-_VZMVM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/homwgmwklnvbxcy6jvwh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--q-_VZMVM--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/homwgmwklnvbxcy6jvwh.png" alt="Image description" width="300" height="161"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The order of the rows has changed! Entropy has grown.&lt;/p&gt;

&lt;p&gt;Now imagine that you're looking for a value in the table, say the number 4. How can you do this within the chaotic green topology I drew below on the left? Only by going through record after record, comparing each value with the needed number 4. In fact, you'd have to examine every entry, because there could be more than one 4. In other words, a sequential scan is needed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--tn4msAK6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/56ltzqs5upukou6df0z4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--tn4msAK6--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/56ltzqs5upukou6df0z4.png" alt="Image description" width="880" height="421"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But when you have order, as in the table on the right, you’ll know for certain that the number 4 lies between 3 and 5. That’s the whole point of ordering and of clustered tables: they create an ordered structure out of chaos. If you probe a random position in the ordered blue table in search of the number 4, three outcomes are possible:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The number is equal to the desired one.&lt;/li&gt;
&lt;li&gt;The number is less than the desired one.&lt;/li&gt;
&lt;li&gt;The number is greater than the desired one.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This gives a great advantage in search speed. If the number you hit is greater than 4, you continue searching toward the top of the table; if it’s less, you go down. Or you can narrow the search to a range and look for the number 4 inside it. This is much faster than scanning all the data, as in the unorganized green topology: logarithmically faster, in fact.&lt;/p&gt;
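&lt;p&gt;This logarithmic search is exactly what a BTree index implements. A minimal sketch (table and index names are illustrative, and on a tiny table the planner may still prefer a sequential scan):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE TABLE test (id INTEGER, name VARCHAR);

CREATE INDEX test_id_idx ON test (id);

-- a point lookup can now descend the BTree instead of scanning every row
EXPLAIN SELECT * FROM test WHERE id = 4;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;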

&lt;p&gt;Consider an example of creating a clustered table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE TABLE test.cluster_table
(id       INTEGER,
 name VARCHAR) WITH (FILLFACTOR = 90);

CREATE INDEX id_idx ON test.cluster_table (id);

CLUSTER [VERBOSE] test.cluster_table USING id_idx;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here I created a table called &lt;code&gt;cluster_table&lt;/code&gt; and set its &lt;code&gt;FILLFACTOR&lt;/code&gt; to 90%, the page fill percentage. It doesn’t affect our clustered table in any way; it’s just an example of how you can set a property when creating a table of this type. Next, I create a BTree index (&lt;code&gt;CREATE INDEX&lt;/code&gt;) on the id column and call the &lt;code&gt;CLUSTER&lt;/code&gt; command, which clusters the table using the index that we previously created (the square brackets around &lt;code&gt;VERBOSE&lt;/code&gt; mean the keyword is optional).&lt;/p&gt;

&lt;p&gt;It’s important to know that until clustering completes, the table is locked and all concurrent transactions against it are blocked. The blocking occurs because Postgres rebuilds the table in the order you require, based on the index, and then has to save the result to another file.&lt;/p&gt;

&lt;p&gt;In fact, this is a disk-level data migration operation from one file to another, but only in the specified order. The data should be placed based on the index, in our case by the id field. I have graphically shown this in the figure below by referring to the metadata before and after the clustering of the table.&lt;/p&gt;

&lt;p&gt;Initially, the table was placed in a file with the number 45969. After the &lt;code&gt;CLUSTER&lt;/code&gt; command, the file name changed. Data was moved from one file to another. Therefore, blocking occurs, and, accordingly, incoming traffic can’t use this table until it becomes available.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--YCHUAkny--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ql3999gou6bq047lgz9x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--YCHUAkny--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ql3999gou6bq047lgz9x.png" alt="Image description" width="880" height="519"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can also create an index for subsequent clustering that contains many columns (multicolumn index), or specify a descending order for certain columns (DESC / ASC).&lt;/p&gt;
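&lt;p&gt;For example, a sketch of clustering by a multicolumn index with mixed sort directions (the index name is illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE INDEX multi_idx ON test.cluster_table (id DESC, name ASC);

CLUSTER test.cluster_table USING multi_idx;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;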

&lt;p&gt;You can optionally use the &lt;code&gt;CLUSTER VERBOSE&lt;/code&gt; command, which will return details of what PostgreSQL did, namely how many pages there were, which pages were moved, and so on.&lt;/p&gt;

&lt;h3&gt;
  
  
  Test Cases and Data Order
&lt;/h3&gt;

&lt;p&gt;Let’s do a little test:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE TABLE test.cluster_table
(id       INTEGER,
 name VARCHAR) WITH (FILLFACTOR = 90);

CREATE INDEX id_idx ON test.cluster_table (id);

INSERT INTO test.cluster_table
  SELECT (random( )*100)::INTEGER, 'test'
     FROM generate_series(1,100) AS g(i);

SELECT id
   FROM test.cluster_table;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We create a table and an index on the id column, and then generate 100 random rows using the &lt;code&gt;generate_series&lt;/code&gt; function. The result is unordered data:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--kGVbiOV8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/zkwkxqetgde0jzu64uge.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--kGVbiOV8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/zkwkxqetgde0jzu64uge.png" alt="Image description" width="180" height="379"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To get ordered output, we would need to add the &lt;code&gt;ORDER BY&lt;/code&gt; keyword. But it’s important to remember that &lt;code&gt;ORDER BY&lt;/code&gt; also consumes resources, and you pay for it on every query. Under high-load traffic every nanosecond counts, and sorting adds to the bill.&lt;/p&gt;
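&lt;p&gt;The price of sorting is visible in the query plan. A sketch (the exact plan depends on the planner and statistics; with a suitable index it may choose an index scan and avoid the Sort node):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;EXPLAIN SELECT id
   FROM test.cluster_table
  ORDER BY id;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;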

&lt;p&gt;In this case, let’s do the clustering of the table with the &lt;code&gt;CLUSTER VERBOSE&lt;/code&gt; command, using the index that I created in advance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CLUSTER VERBOSE test.cluster_table USING id_idx;

SELECT id
 FROM test.cluster_table;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Voila, the data is sorted without sorting:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--v1d_w-E4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/x14e4zhyexflr524v7ss.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--v1d_w-E4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/x14e4zhyexflr524v7ss.png" alt="Image description" width="180" height="386"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But there’s a trap here. Let’s update all the rows (in fact, changing the value of a single row would be enough).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;UPDATE test.cluster_table
   SET id = id * (random( )::INTEGER);

SELECT id
 FROM test.cluster_table;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this case, chaos will return to our clustered table:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--xijLzIV---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wbwt00dxg8y9w2klulsy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--xijLzIV---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/wbwt00dxg8y9w2klulsy.png" alt="Image description" width="180" height="283"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To restore the order, you’ll need to run the &lt;code&gt;CLUSTER&lt;/code&gt; command again. You don’t even have to specify the index this time, because it’s preserved in the PostgreSQL metadata: the database remembers which index the table is clustered on.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CLUSTER VERBOSE test.cluster_table;

SELECT id
   FROM test.cluster_table;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You’ll be able to observe the order again only after the &lt;code&gt;CLUSTER&lt;/code&gt; command. This is the Achilles heel of clustered tables: any change in the clustering key can immediately bring disorder to the data.&lt;/p&gt;

&lt;h3&gt;
  
  
  When Clustered Tables are Suitable
&lt;/h3&gt;

&lt;p&gt;Clustered tables are a good fit for reference data (or SCD, Slowly Changing Dimensions), for example, an address system. This type of table is convenient if you load new data quite rarely, say once a month.&lt;/p&gt;

&lt;p&gt;If the table changes very often and is subject to constant &lt;code&gt;INSERT&lt;/code&gt;, &lt;code&gt;UPDATE&lt;/code&gt; and &lt;code&gt;DELETE&lt;/code&gt; operations, it would have to be re-clustered continually, which is inconvenient and expensive. The purpose of clustering is to avoid an unnecessary &lt;code&gt;ORDER BY&lt;/code&gt; in frequent queries against the table by the clustered column or columns.&lt;/p&gt;

&lt;h3&gt;
  
  
  Metadata of a Clustered Table
&lt;/h3&gt;

&lt;p&gt;From the metadata of the clustered table, you can understand that it’s clustered:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT  c.oid AS “OID”,
        c.relname AS “Relation name”
  FROM pg_class c INNER JOIN pg_index i 
                  ON i.indrelid = c.oid
WHERE c.relkind = ‘r’ AND 
      c.relhasindex AND 
      i.indisclustered;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--0Lqs0Aqq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rl16bidc7df25izrfmkv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--0Lqs0Aqq--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rl16bidc7df25izrfmkv.png" alt="Image description" width="880" height="165"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A true value in the &lt;code&gt;relhasindex&lt;/code&gt; field shows that the table has indexes, and the &lt;code&gt;indisclustered&lt;/code&gt; flag marks the index that supports clustering. When we rebuild the cluster with the next &lt;code&gt;CLUSTER&lt;/code&gt; command, PostgreSQL will take that index from the metadata.&lt;/p&gt;

&lt;h2&gt;
  
  
  Foreign Tables in PostgreSQL
&lt;/h2&gt;

&lt;p&gt;Foreign tables in PostgreSQL are tables whose data is stored outside of the database: on another database server or in a file, for example. Foreign tables are useful for quickly getting data from another source, and you can even join them with local tables.&lt;/p&gt;

&lt;p&gt;In addition, with some tinkering you can implement a data lifecycle that enforces a retention policy. The following set of tools can help you here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;VIEW (virtual table).&lt;/li&gt;
&lt;li&gt;A set of regular tables separated by data retention logic (POOD design) with up-to-date data.&lt;/li&gt;
&lt;li&gt;Foreign tables that focus on files that store data outside the database on cheaper disks (here you’ll find old data that exceeded the Retention Policy metric).&lt;/li&gt;
&lt;/ul&gt;
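&lt;p&gt;A hedged sketch of how these pieces could fit together (all object names are illustrative; the archive table would be a foreign table pointing at files on cheaper storage):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE VIEW test.orders_all AS
SELECT id, name FROM test.orders_hot      -- recent rows, a regular table
UNION ALL
SELECT id, name FROM test.orders_archive; -- old rows, a foreign table
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;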

&lt;p&gt;There are many kinds of foreign tables and connection types, for example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CSV file.&lt;/li&gt;
&lt;li&gt;Connection with many other RDBMS.&lt;/li&gt;
&lt;li&gt;Connection to some NoSQL databases.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s look at an example of a foreign table based on a CSV file. The &lt;code&gt;file_fdw&lt;/code&gt; extension, built on the FDW (foreign data wrapper) mechanism, will help us with this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE EXTENSION file_fdw;

CREATE SERVER csv_log FOREIGN DATA WRAPPER file_fdw;

CREATE FOREIGN TABLE test.csv (
  id   INTEGER,
  name VARCHAR
) SERVER csv_log 
  OPTIONS (filename '/var/lib/postgresql/file.csv', 
                   delimiter ';', format 'csv');
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I create a foreign table and describe the attributes, specifying the server for fdw, which I created in advance with options for working with the file.&lt;/p&gt;

&lt;p&gt;If I run an SQL query against the foreign table, I’ll see the data from the file. Since the foreign table is registered (meaning there’s an entry in the PostgreSQL metadata), let’s test a hypothesis: is the data perhaps stored not in the external file but in a PostgreSQL data file?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT  oid AS “OID”,
        pg_relation_filepath(oid) AS “File path”,
        pg_relation_size(oid) AS “Relation Size”  
 FROM pg_class
WHERE relname = ‘csv’;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Execution result:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--SFvF7MYb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/y4g4kn875wsz1djsoiok.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--SFvF7MYb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/y4g4kn875wsz1djsoiok.png" alt="Image description" width="880" height="164"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So, the foreign table as an object is registered in the metadata (there’s an OID identifier of the object), but there’s no corresponding data file, that is, the data is presented only in an external source.&lt;/p&gt;

&lt;h3&gt;
  
  
  Queries to Foreign Tables
&lt;/h3&gt;

&lt;p&gt;How do queries to foreign tables work? Let’s take the CSV file as an example.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--rsE6xTbJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/943ejar2v0i52ta5sgsv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--rsE6xTbJ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/943ejar2v0i52ta5sgsv.png" alt="Image description" width="880" height="506"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Loading the data involves a noticeable delay, which is why old data is typically parked this way on older, cheaper disks. To get the data, PostgreSQL has to open a descriptor for the external file, copy the data into memory or into a temporary file, and only then return it to us. If we re-execute the same query a little later, there won’t be any acceleration: the process repeats from scratch.&lt;/p&gt;

&lt;p&gt;There are a great many foreign data wrapper libraries for various needs. For example, &lt;code&gt;postgres_fdw&lt;/code&gt;: with its help, we can connect to one PostgreSQL instance from another. It’s very much like a database link:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE EXTENSION postgres_fdw;

DROP FOREIGN TABLE test.csv;

CREATE SERVER pg_log FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host '192.168.56.10', port '5432', dbname 'course_db');

CREATE USER MAPPING FOR test SERVER pg_log 
OPTIONS (user 'test', password 'test');

CREATE FOREIGN TABLE test.csv (
  id   INTEGER,
  name VARCHAR
) SERVER pg_log 
  OPTIONS (schema_name 'test', table_name 'user');
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A huge &lt;a href="https://wiki.postgresql.org/wiki/Foreign_data_wrappers"&gt;number of libraries&lt;/a&gt; are available for working with external sources. For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Oracle, MySQL, SQLite, MS SQL Server, Sybase.&lt;/li&gt;
&lt;li&gt;Cassandra, MongoDB, HBase, Redis, Neo4j.&lt;/li&gt;
&lt;li&gt;Twitter, Telegram.&lt;/li&gt;
&lt;li&gt;JSON, XML, GeoFiles, LDAP.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Metadata of the Foreign Table
&lt;/h3&gt;

&lt;p&gt;As we found out, the foreign table as an object is fixed in the metadata:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT  oid AS "OID",
        relname AS “Relation name",
        CASE
         WHEN relpersistence = 'p' THEN 'Permanent'
         WHEN relpersistence = 't' THEN 'Temporary'
         ELSE 'Unlogged'
       END AS “Type”,
       relkind AS “Subtype”
 FROM pg_class
WHERE relname = ‘csv’;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--4esC4jPU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/s9tjercg3y8w41iftwsb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--4esC4jPU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/s9tjercg3y8w41iftwsb.png" alt="Image description" width="880" height="167"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It’s a permanent table (surprisingly), but its relation subtype &lt;code&gt;relkind&lt;/code&gt; is “f”, which indicates that our table is foreign, that is, external.&lt;/p&gt;
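&lt;p&gt;Foreign-table specifics live in a separate catalog. A sketch that lists foreign tables together with their servers and options:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT ft.ftrelid::regclass AS "Foreign table",
       s.srvname            AS "Server",
       ft.ftoptions         AS "Options"
  FROM pg_foreign_table ft
       INNER JOIN pg_foreign_server s ON s.oid = ft.ftserver;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;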

&lt;h2&gt;
  
  
  In the Next Series
&lt;/h2&gt;

&lt;p&gt;That’s all for today. In the following material we’ll analyze:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Partitioned tables.&lt;/li&gt;
&lt;li&gt;Inherited tables.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>postgres</category>
      <category>beginners</category>
      <category>database</category>
    </item>
    <item>
      <title>Logged, Unlogged and Temporary Tables in PostgreSQL</title>
      <dc:creator>Quadcode team</dc:creator>
      <pubDate>Fri, 17 Feb 2023 10:12:57 +0000</pubDate>
      <link>https://forem.com/quadcode/logged-unlogged-and-temporary-tables-in-postgresql-4agc</link>
      <guid>https://forem.com/quadcode/logged-unlogged-and-temporary-tables-in-postgresql-4agc</guid>
      <description>&lt;p&gt;There are many different types of tables in PostgreSQL. Each of them is designed for specific tasks. The most common and well-known is the heap table. I wrote about its structure &lt;a href="https://dev.to/quadcode/structure-of-heap-table-in-postgresql-52oo"&gt;in another article&lt;/a&gt;. The standard table allows you to store rows, update data, and make OLAP and OLTP requests.&lt;/p&gt;

&lt;p&gt;However, there are still a number of tables that are simply forgotten about. In my opinion, the interesting tables now are unlogged and temporary tables. In this article we'll talk about them and compare them with logged tables.&lt;/p&gt;

&lt;h2&gt;
  
  
  Logged Tables in PostgreSQL
&lt;/h2&gt;

&lt;p&gt;In PostgreSQL, a logged table is a type of table for which all changes to the data are recorded in the write-ahead log, which is used for crash recovery and replication.&lt;/p&gt;

&lt;p&gt;Let's see what a log is in relational databases. To do this, I'll draw an analogy with the real world. You get a key at the reception desk and sign the visitor's log. The signature is your ID, and the timestamp you enter shows when you took the desired key. Your logbook entry is a record, made in advance, of what you intend to do with the key you received. For example, you take the key from the stand and go to open the office.&lt;/p&gt;

&lt;p&gt;When we start a transaction, PostgreSQL saves events to log files. In PostgreSQL these are write-ahead log (WAL) files: the change you want to make is first written to the log, and only then is that change applied to the table itself, located in memory or on disk.&lt;/p&gt;

&lt;p&gt;Log files are needed in order to ensure the operability of relational databases in terms of ACID. Let me remind you that ACID is a set of relational database properties that guarantee, among other things, the reliability of the transaction:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Atomicity.&lt;/li&gt;
&lt;li&gt;Consistency.&lt;/li&gt;
&lt;li&gt;Isolation.&lt;/li&gt;
&lt;li&gt;Durability.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These four letters also have to do with how a relational database can recover from errors.&lt;/p&gt;

&lt;p&gt;The log files provide the possibility of data recovery if for some reason the server or process has failed. When the process is reactivated, the relational database restores itself by reading the log files. To do this, Postgres uses an algorithm of the ARIES family (Algorithms for Recovery and Isolation Exploiting Semantics), which:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Generates undo and redo lists.&lt;/li&gt;
&lt;li&gt;Replays transactions from the redo list, taking into account whether the last checkpoint completed.&lt;/li&gt;
&lt;li&gt;Rolls back transactions from the undo list to preserve data consistency.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Another reason for using log files is data replication from the master to independent standby servers. With the help of the write-ahead log files, we can replay on the replicas the events that occurred on the master database server. &lt;/p&gt;

&lt;p&gt;Moreover, log files can act as a source of an event model for reproducing changes in a corporate data warehouse, using, for example, the Debezium tool.&lt;/p&gt;
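&lt;p&gt;A minimal sketch of tapping into such an event stream directly through logical decoding (requires wal_level = logical; the slot name is illustrative):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT pg_create_logical_replication_slot('demo_slot', 'test_decoding');

-- ... perform some INSERT / UPDATE / DELETE activity ...

-- each committed change comes back as a textual event
SELECT * FROM pg_logical_slot_get_changes('demo_slot', NULL, NULL);

SELECT pg_drop_replication_slot('demo_slot');
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;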

&lt;p&gt;Let's look at a simple example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;BEGIN;
  INSERT INTO test.log VALUES (1, 'Peter');
COMMIT | ROLLBACK ;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I start the transaction with the word &lt;code&gt;BEGIN&lt;/code&gt;, then insert a test row into the pre-created logged table &lt;code&gt;test.log&lt;/code&gt;: the name Peter with 1 as the identifier. After that, I execute either &lt;code&gt;COMMIT&lt;/code&gt; or &lt;code&gt;ROLLBACK&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;As a result, our simple transaction records its every step at the log file level:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F141o689kne2jpcoops1g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F141o689kne2jpcoops1g.png" alt="Transaction records at the logged table log file" width="800" height="516"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Unlogged Tables in PostgreSQL
&lt;/h2&gt;

&lt;p&gt;An unlogged table is a type of table for which changes are not recorded in the write-ahead log. So, let's consider the same test case with an unlogged table to see the changes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;BEGIN;
  INSERT INTO test.unlog VALUES (1, 'Peter');
COMMIT | ROLLBACK ;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Unlike the previous example, data changes using &lt;code&gt;INSERT&lt;/code&gt;, &lt;code&gt;UPDATE&lt;/code&gt; or &lt;code&gt;DELETE&lt;/code&gt; operations don't get into the log files. They go directly to the data storage segment that corresponds to our table:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsk8pf573byy85o53x6v9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsk8pf573byy85o53x6v9.png" alt="Transaction records at the unlogged table" width="800" height="376"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To better understand the difference between logged and unlogged tables, let's move on to a more illustrative example.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Logged table.&lt;/strong&gt; I'm creating a standard log table, logged by default:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE TABLE test.log
(id       INTEGER, 
 name VARCHAR);

BEGIN;
   SELECT txid_current( );
   SELECT pg_current_wal_lsn( );
   -- 1/AC0AFC8

   SELECT pg_walfile_name(pg_current_wal_lsn( ));
   -- 00000001000000010000000A

   INSERT INTO test.log VALUES (1,'test1');
   INSERT INTO test.log VALUES (2,'test2');
   INSERT INTO test.log VALUES (3,'test3');

COMMIT;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first function, &lt;code&gt;txid_current&lt;/code&gt;, returns the transaction number. Next, I get the current LSN (in a code comment I recorded the number returned by &lt;code&gt;pg_current_wal_lsn&lt;/code&gt;). I then pass this LSN to the &lt;code&gt;pg_walfile_name&lt;/code&gt; function, which returns the name of the WAL file that contains this LSN record.&lt;/p&gt;

&lt;p&gt;To understand how to interpret this, let's go back to the analogy with the key and the log at the reception desk. I made an entry in the log stating that it was I who took the key; that entry is the LSN. If a lot of people take keys often, the log (like a notebook) will fill up sooner or later, and you'll need to start a new one. In the world of logged tables, a new log is a new WAL file; it appears when the current one fills up. By default, a WAL segment file in PostgreSQL is 16 MB; the size can be changed when the cluster is initialized, via the &lt;code&gt;initdb --wal-segsize&lt;/code&gt; option.&lt;/p&gt;
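&lt;p&gt;As a side note, both the segment size and the current position in the log can be checked directly from a session (a small sketch; the values in the comments are from my environment and will differ in yours):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SHOW wal_segment_size;
-- 16MB

SELECT pg_current_wal_lsn();
-- e.g. 1/AC0AFC8

SELECT pg_walfile_name(pg_current_wal_lsn());
-- e.g. 00000001000000010000000A
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;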

&lt;p&gt;Then in the test I sequentially perform simple inserts into the table, one after another, and at the end I commit the data with &lt;code&gt;COMMIT&lt;/code&gt;. Let's see what happened as a result:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frh4v9finzpby7qiypuwe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frh4v9finzpby7qiypuwe.png" alt="Image description" width="800" height="406"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Using the &lt;code&gt;pg_waldump&lt;/code&gt; utility, which is included in the PostgreSQL installation packages, we can extract the records of our session and our transaction from the WAL file. This is done with command-line options: I specify the LSN from which to start dumping this binary file, and the name of the WAL file that I obtained in my session.&lt;/p&gt;

&lt;p&gt;Then we find that the log file for our logged table contains the specified order of &lt;code&gt;INSERT&lt;/code&gt; commands from our transaction. There are three of them here:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;INSERT+INIT.&lt;/li&gt;
&lt;li&gt;INSERT with offset 2.&lt;/li&gt;
&lt;li&gt;INSERT with offset 3.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;At the end we find the &lt;code&gt;COMMIT&lt;/code&gt; record, which commits our data. Once logged and committed, the rows become visible to all users.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Unlogged table.&lt;/strong&gt; Now let's see how an unlogged table behaves. The test is the same, except that this time I create the table with the &lt;code&gt;UNLOGGED&lt;/code&gt; option, which tells the database not to write the table's changes to the WAL.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE UNLOGGED TABLE test.unlog
(id       INTEGER, 
 name VARCHAR);

BEGIN;
   SELECT txid_current( );
   SELECT pg_current_wal_lsn( );
   -- 1/AC0B0E0

   SELECT pg_walfile_name(pg_current_wal_lsn( ));
   -- 00000001000000010000000A

   INSERT INTO test.unlog VALUES (1,'test1');
   INSERT INTO test.unlog VALUES (2,'test2');
   INSERT INTO test.unlog VALUES (3,'test3');

COMMIT;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If we run the same test, dump the WAL and read it with the &lt;code&gt;pg_waldump&lt;/code&gt; utility, we'll see only the &lt;code&gt;COMMIT&lt;/code&gt; record. There are no inserts in the output:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi2yldk6cvvkbkn2jikg1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi2yldk6cvvkbkn2jikg1.png" alt="Image description" width="800" height="286"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the figure above, the dump contains no &lt;code&gt;INSERT&lt;/code&gt; records: there are simply no corresponding events in the WAL. &lt;/p&gt;

&lt;p&gt;An unlogged table is a data set that isn't controlled by a WAL file.&lt;/p&gt;
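&lt;p&gt;A quick way to check this status is the &lt;code&gt;relpersistence&lt;/code&gt; column of the &lt;code&gt;pg_class&lt;/code&gt; system catalog, which shows whether a table is permanent, unlogged, or temporary (a small sketch against the two test tables above):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT relname, relpersistence
FROM pg_class
WHERE relname IN ('log', 'unlog');
-- 'p' = permanent (logged), 'u' = unlogged, 't' = temporary
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;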

&lt;h3&gt;
  
  
  When Unlogged Tables are Suitable
&lt;/h3&gt;

&lt;p&gt;Unlogged tables are useful when the data in them isn't critical, but write speed is.&lt;/p&gt;

&lt;p&gt;The simplest example is OLAP workloads that build data marts. If the marts are filled by a repeatable process, the risk of data loss isn't frightening: there's an ETL (sometimes ELT) process, usually expressed as SQL queries, that transforms the data. You can simply restart the process and regenerate the data.&lt;/p&gt;
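&lt;p&gt;A mart-building step of this kind might look like the following sketch (the schema, table, and column names are invented for illustration):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE UNLOGGED TABLE mart.daily_totals AS
SELECT user_id,
       date_trunc('day', created_at) AS day,
       sum(amount) AS total
FROM payments
GROUP BY user_id, date_trunc('day', created_at);
-- if the server crashes, simply re-run this step to rebuild the mart
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;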

&lt;h3&gt;
  
  
  Headers of Logged and Unlogged Tables
&lt;/h3&gt;

&lt;p&gt;To continue the comparison, let's look at the headers of our two tables. To do this, I use the &lt;code&gt;pageinspect&lt;/code&gt; extension, which provides methods for working with pages within a specific table. So, let's consider the result of executing the &lt;code&gt;page_header&lt;/code&gt; function, which returns page headers number 0 for both tables.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhzvirmh6mmrn0je6wste.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhzvirmh6mmrn0je6wste.png" alt="Image description" width="800" height="439"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In a side-by-side comparison the page headers are almost identical, because the data in the tables are the same. The only difference is the first column, which stores the LSN: the logged table has a value there, while the unlogged table has none. This confirms that the unlogged table is represented by data in its own data file, but not in the WAL files.&lt;/p&gt;
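&lt;p&gt;The same comparison can be reproduced as a sketch, assuming the &lt;code&gt;pageinspect&lt;/code&gt; extension is installed and both test tables contain data:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE EXTENSION IF NOT EXISTS pageinspect;

SELECT lsn FROM page_header(get_raw_page('test.log', 0));
-- a real LSN for the logged table

SELECT lsn FROM page_header(get_raw_page('test.unlog', 0));
-- 0/0 for the unlogged table: the page has no WAL record
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;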

&lt;p&gt;If the PostgreSQL process crashes for some reason or the power goes out, that's an unclean shutdown of your PostgreSQL instance. During recovery, PostgreSQL effectively applies &lt;code&gt;TRUNCATE&lt;/code&gt; to every unlogged table. This deletes all rows from the table, which means your data will be lost. Note that after a clean stop of the PostgreSQL server, for example via &lt;code&gt;systemctl stop postgres&lt;/code&gt;, the data in unlogged tables will still be there at the next start.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;So, with unlogged tables, you need to keep the following in mind:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;There is a risk of data loss.&lt;/li&gt;
&lt;li&gt;High speed of filling / changing data.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  How to Convert a Logged Table into an Unlogged One or Vice Versa
&lt;/h3&gt;

&lt;p&gt;You can convert the logged and unlogged states by applying the following commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ALTER TABLE test.t1 SET LOGGED; 

ALTER TABLE test.t1 SET UNLOGGED;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It's important to know that when you convert an unlogged table to a logged one, the change temporarily blocks SQL traffic that uses the table for &lt;code&gt;CRUD&lt;/code&gt; operations, that is, everything a user can do with a table: &lt;code&gt;INSERT&lt;/code&gt;, &lt;code&gt;UPDATE&lt;/code&gt;, &lt;code&gt;DELETE&lt;/code&gt;, &lt;code&gt;SELECT&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;User sessions that attempt to access the table will wait until the &lt;code&gt;ALTER TABLE&lt;/code&gt; command completes. That is, you will have sessions waiting on the backend and/or frontend. Of course, if you have set up a timeout for the session, then it will wait for the set time.&lt;/p&gt;

&lt;p&gt;When you convert a table to a logged one, all of its contents are written to the WAL files. This is necessary so that the write-ahead log contains all the data in the event of a failure. If the table is large, filling the WAL files can take far longer than a couple of seconds.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;SET UNLOGGED&lt;/code&gt; operation that converts a table to unlogged is fast. It simply records at the PostgreSQL metadata level that your table is no longer logged and no longer needs to be written to the WAL files. The &lt;code&gt;SET UNLOGGED&lt;/code&gt; command still takes a brief lock on the metadata, so you need a small window in &lt;code&gt;CRUD&lt;/code&gt; traffic for the change to apply.&lt;/p&gt;
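&lt;p&gt;To keep that window under control, it can help to cap how long the &lt;code&gt;ALTER TABLE&lt;/code&gt; itself will wait for the lock (a sketch; the timeout value is arbitrary):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SET lock_timeout = '5s';          -- fail fast instead of queueing behind traffic
ALTER TABLE test.t1 SET LOGGED;   -- retry later if it times out
RESET lock_timeout;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;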

&lt;h3&gt;
  
  
  Pros and Cons of Unlogged Tables
&lt;/h3&gt;

&lt;p&gt;Let's summarize unlogged tables in PostgreSQL.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;High speed of &lt;code&gt;INSERT&lt;/code&gt;, &lt;code&gt;UPDATE&lt;/code&gt; and &lt;code&gt;DELETE&lt;/code&gt; operations. In a test on my PostgreSQL server, the TPS (transactions per second) for the unlogged table was about 9 times higher than for the logged one.&lt;/li&gt;
&lt;li&gt;Any indexes and TOAST tables will automatically be unlogged, since these are dependent objects that "inherit" the properties of the main object.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let me remind you that TOAST tables are satellite tables: they store long values by splitting them into chunks in an additional table that backs the original one. I covered this in more detail &lt;a href="https://dev.to/quadcode/toast-tables-in-postgresql-13fp"&gt;in another article&lt;/a&gt;.&lt;/p&gt;
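&lt;p&gt;The inheritance of the unlogged property is easy to observe (a sketch; the index name is invented for illustration):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE INDEX unlog_id_idx ON test.unlog (id);

SELECT relname, relpersistence
FROM pg_class
WHERE relname = 'unlog_id_idx';
-- relpersistence = 'u': the index is unlogged, like its table
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;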

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Automatic clearing of data in the table after a crash.&lt;/li&gt;
&lt;li&gt;The table content is not replicated on the StandBy server, since there are simply no events based on the data in the master server's WAL files.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Temporary Tables in PostgreSQL
&lt;/h2&gt;

&lt;p&gt;Let us turn our attention to temporary tables. Temporary tables in PostgreSQL are a type of table that exists only for the duration of a database session or transaction.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE TEMPORARY TABLE tmp
(id       INTEGER,
 name VARCHAR)
ON COMMIT DELETE ROWS
[ON COMMIT PRESERVE ROWS]
[ON COMMIT DROP];
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When creating a temporary table, you can't explicitly specify a schema for the table. All temporary tables in PostgreSQL fall into the corresponding schema named &lt;code&gt;pg_temp_xxx&lt;/code&gt;, where xxx is your session number (for example &lt;code&gt;pg_temp_8&lt;/code&gt; or &lt;code&gt;pg_temp_165&lt;/code&gt;). The schema for the session is created automatically by PostgreSQL itself.&lt;/p&gt;
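&lt;p&gt;You can see which temporary schema your session received (a sketch; the table name is invented, and the schema number will differ per session):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE TEMPORARY TABLE tmp_demo (id INTEGER);

SELECT n.nspname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = 'tmp_demo';
-- e.g. pg_temp_8
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;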

&lt;p&gt;The &lt;code&gt;CREATE&lt;/code&gt; statement lets you define the life cycle of a temporary table right in the code:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;ON COMMIT PRESERVE ROWS&lt;/code&gt; is set by default. This means that if you &lt;code&gt;COMMIT&lt;/code&gt; or &lt;code&gt;ROLLBACK&lt;/code&gt; data within transactions, the rows in the table will be saved until you end the session. Only after that will the table be automatically deleted.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;ON COMMIT DROP&lt;/code&gt; option means that if there is a &lt;code&gt;COMMIT&lt;/code&gt; or &lt;code&gt;ROLLBACK&lt;/code&gt; of data within the transaction, the table will be automatically deleted from your session immediately.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;ON COMMIT DELETE ROWS&lt;/code&gt; option means that on &lt;code&gt;COMMIT&lt;/code&gt; or &lt;code&gt;ROLLBACK&lt;/code&gt; the table structure is preserved but its rows are deleted. In any case, when the session closes, the table is dropped automatically.&lt;/li&gt;
&lt;/ul&gt;
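&lt;p&gt;For example, with &lt;code&gt;ON COMMIT DELETE ROWS&lt;/code&gt; the rows live only until the end of the transaction (a sketch):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE TEMPORARY TABLE tmp_del
(id INTEGER)
ON COMMIT DELETE ROWS;

BEGIN;
   INSERT INTO tmp_del VALUES (1);
   SELECT count(*) FROM tmp_del;
   -- 1: the row is visible inside the transaction
COMMIT;

SELECT count(*) FROM tmp_del;
-- 0: the rows were deleted at COMMIT
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;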

&lt;p&gt;The life cycle of a temporary table looks schematically as shown in the figure below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzcb67fdqse4n4f7syt0e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzcb67fdqse4n4f7syt0e.png" alt="Image description" width="800" height="265"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  When Temporary Tables are Suitable and When Not
&lt;/h3&gt;

&lt;p&gt;Temporary tables are suitable as a temporary buffer for storing your data set during a mart rebuild. They are convenient to use for analytical traffic.&lt;/p&gt;

&lt;p&gt;I've encountered temporary tables in situations where a mart that gathers a large amount of data is effectively a monolith: a single SQL query that can run for several hours. To optimize such a query, you can split it into SQL steps that prepare the data and store the intermediate results in temporary tables.&lt;/p&gt;

&lt;p&gt;You can temporarily store data within your entire pipeline (a series of processes) and fill a temporary table as part of data retrieval and transformation. At the end of the process, you can use the temporary table, for example, in &lt;code&gt;JOIN&lt;/code&gt; operations with permanent tables. After the data is committed and the session ends, all temporary tables are deleted automatically. The next time the process runs, they're created afresh, and the cycle repeats.&lt;/p&gt;

&lt;p&gt;A temporary table can also contain indexes. You can create a temporary structure, upload data there, create an index on them, and then make a query to this temporary table and get your queries accelerated.&lt;/p&gt;
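&lt;p&gt;Such a pipeline step might be sketched like this (all names are invented for illustration):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE TEMPORARY TABLE recent_orders
ON COMMIT PRESERVE ROWS AS
SELECT id, customer_id, amount
FROM orders
WHERE created_at BETWEEN now() - interval '1 day' AND now();

CREATE INDEX ON recent_orders (customer_id);
ANALYZE recent_orders;
-- give the planner fresh statistics on the temporary data

SELECT c.name, sum(r.amount)
FROM recent_orders r
JOIN customers c ON c.id = r.customer_id
GROUP BY c.name;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;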

&lt;p&gt;I don't recommend using temporary tables when they're needed on the backend as part of OLTP traffic. Examples of such traffic are concert ticket purchases or bank transactions. In other words, a fast business transaction that has to receive your data, save it, and return a response.&lt;/p&gt;

&lt;p&gt;A large, uncontrolled flow of temporary table creation, and with it temporary schemas, bloats the system catalogs and interferes with the optimizer and the statistics-rebuilding process. You have to be careful with this.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tests with Examples of Creating a Temporary Table
&lt;/h3&gt;

&lt;p&gt;Let's look at the following test with the creation of a temporary table inside a transaction:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;BEGIN;
   SELECT txid_current( );
   SELECT pg_current_wal_lsn( );
   -- 1/AC54128

   SELECT pg_walfile_name(pg_current_wal_lsn( ));
   -- 00000001000000010000000A

   CREATE TEMPORARY TABLE temp 
   (id       INTEGER,
    name VARCHAR)
   ON COMMIT PRESERVE ROWS;  

COMMIT;
-- close connection
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Before creating the table itself, I get the same information as in the previous example with the logged table: the LSN and the name of the current WAL file that corresponds to this LSN record.&lt;/p&gt;

&lt;p&gt;Let's see what's in the log file:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgjfcvciy1npc5q80zvgo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgjfcvciy1npc5q80zvgo.png" alt="Image description" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When the temporary table was created, 120 events occurred. That is, the creation itself initiates writing to the WAL file: the table's metadata is logged even though its data is not. And as soon as we close the connection to the database, 40 more events record the fact that the table itself was dropped.&lt;/p&gt;

&lt;p&gt;Now we'll preliminarily create a temporary table in the session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE TEMPORARY TABLE temp 
   (id       INTEGER,
    name VARCHAR)
   ON COMMIT PRESERVE ROWS;  

BEGIN;
   SELECT txid_current( );
   SELECT pg_current_wal_lsn( );
   -- 1/AC90368

   SELECT pg_walfile_name(pg_current_wal_lsn( ));
   -- 00000001000000010000000A

   INSERT INTO temp VALUES (1,'test1');
   INSERT INTO temp VALUES (2,'test2');
   INSERT INTO temp VALUES (3,'test3');

COMMIT;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this test, the WAL file shows the same picture as for the unlogged table:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjkrcssdfnv74i5g9q04r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjkrcssdfnv74i5g9q04r.png" alt="Image description" width="800" height="328"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Headers of Logged and Temporary Tables
&lt;/h3&gt;

&lt;p&gt;If we compare the headers of the zero pages of a regular logged table and a temporary table, it will look exactly the same as for an unlogged table:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F24xcd75gfuf5tcdgn1ye.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F24xcd75gfuf5tcdgn1ye.png" alt="Image description" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Everything is the same except the LSN. For the temporary table it's empty, and because of this the contents of the table aren't transmitted to standby servers: if there's no record in the WAL files, there's nothing to ship to dependent servers via physical replication.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pros and Cons of Temporary Tables
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Acceleration for UPDATE, DELETE, INSERT operations.&lt;/li&gt;
&lt;li&gt;Any indexes and TOAST tables will automatically be temporary.&lt;/li&gt;
&lt;li&gt;You don't need to define a schema for a temporary table: PostgreSQL creates a temporary schema and assigns it to your table automatically.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Table content is not physically replicated to StandBy servers.&lt;/li&gt;
&lt;li&gt;It's necessary to create a temporary table each time for a new session.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  In the Next Series
&lt;/h2&gt;

&lt;p&gt;We've looked at the difference between logged and unlogged tables, and also talked about cases where temporary tables can be useful. In the following material we'll analyze:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Clustered tables.&lt;/li&gt;
&lt;li&gt;Foreign tables.&lt;/li&gt;
&lt;li&gt;Partitioned tables.&lt;/li&gt;
&lt;li&gt;Inherited tables.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>career</category>
      <category>discuss</category>
      <category>workplace</category>
    </item>
    <item>
      <title>A Retrospective of Working with Bare Metal Kubernetes, or To There and Back</title>
      <dc:creator>Quadcode team</dc:creator>
      <pubDate>Mon, 09 Jan 2023 14:36:28 +0000</pubDate>
      <link>https://forem.com/quadcode/a-retrospective-of-working-with-bare-metal-kubernetes-or-to-there-and-back-1lb7</link>
      <guid>https://forem.com/quadcode/a-retrospective-of-working-with-bare-metal-kubernetes-or-to-there-and-back-1lb7</guid>
      <description>&lt;p&gt;The Kubernetes Platform team in Quadcode implements, supports and maintains Kubernetes and all processes around it. For five and a half years, our clusters and approaches have been changing and evolving. In the article we'll tell you how we started, what we came to, and how we managed to make the maintenance of bare metal clusters comfortable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Our Clusters and Team
&lt;/h2&gt;

&lt;p&gt;Now there are 5 people in our team. But for the entire existence of Kubernetes in the company, about 17 engineers have worked with it. &lt;/p&gt;

&lt;p&gt;We have three clusters in three environments: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Prod&lt;/strong&gt;—the largest cluster.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Preprod&lt;/strong&gt;—together with prod this is used for most of the company's stateless applications.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Infra&lt;/strong&gt;—mainly used for short-lived applications, for example, GitLab, Jenkins Runner, browser tests and the like.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8p5p0gpsh6vpzj23eg8f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8p5p0gpsh6vpzj23eg8f.png" alt="Number of nodes, each node and total capacity"&gt;&lt;/a&gt; &lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites. Before 2017
&lt;/h2&gt;

&lt;p&gt;The main prelude for the appearance of Kubernetes was the same as for many other companies: refactoring a monolith into microservices.&lt;/p&gt;

&lt;p&gt;Microservices lived in LXC containers provisioned by Ansible. Servers were issued through the following process: &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The team requests a server through an internal form: it clicks on the configuration and sends a request. &lt;/li&gt;
&lt;li&gt;A letter is generated that goes to the data center. &lt;/li&gt;
&lt;li&gt;The data center sends the server to the intermediate Infra Support team, which performs the initial configuration. The initial setup should be the same on all company servers. &lt;/li&gt;
&lt;li&gt;The server is passed to the team that requested it. &lt;/li&gt;
&lt;li&gt;The server is then fine-tuned to the needs of the team. &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A test platform is created that lets you spin up microservices packaged in Docker on a temporary machine in DigitalOcean. Along with the microservice run all of its dependencies and the services that depend on it. The IP is handed to the developer, who goes to the temporary environment and tests their microservice. &lt;/p&gt;

&lt;p&gt;We really like how Docker performs in terms of identical supply units. And we want Docker to become a single runtime for all our microservices: it seems convenient. &lt;/p&gt;

&lt;p&gt;But to transfer everything to Docker and run bare Docker containers on hardware or in the cloud is the way into the abyss of manual work. In addition to restarts, you also need to solve the issue of fault tolerance of virtual machines running Docker, and you need to manage containers. Therefore, we begin to select an orchestration tool, and the choice falls on Kubernetes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Choosing an Orchestration Tool. Q2 2017
&lt;/h2&gt;

&lt;p&gt;Having chosen Kubernetes as the orchestration tool, we begin deciding where to deploy it: on hardware or in the cloud. The catch is that we have no expertise in either. We start weighing what would be best for us. &lt;/p&gt;

&lt;p&gt;Self-hosted Kubernetes would be understandable for us in that it can be touched. We know the servers on which we'll run it, we know how they're issued, how they're configured, and what they have inside. Plus, these servers will be delivered directly to our data center, which satisfies the wishes of the business and security: everything should be in the internal network perimeter and have low latency, which is important to us because we're a fintech company.&lt;/p&gt;

&lt;p&gt;You can't touch the cloud. On the one hand, it's unclear what kind of hardware is there. On the other hand, it's clear that there'll be no need to rush around with red eyes when someone pulls the power from the rack. At least it won't be you running around. &lt;/p&gt;

&lt;p&gt;There are also marketing rumors that the cloud is easier to manage. The community has little real experience with loaded projects, so it's difficult to assess how true these rumors are. At the same time, the demands of the business and security don't go away. To ensure an internal network perimeter and low latency, you need to go to the NOC engineers, set them a task, and work with them. Not that they're evil, on the contrary, but the task is to build an orchestrator, not a hybrid infrastructure. Plus, at this point we don't yet have any prod infrastructure or deep expertise in the cloud.&lt;/p&gt;

&lt;p&gt;And the cost, even if estimated now:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi0cspj1piadzbvpfvv93.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi0cspj1piadzbvpfvv93.png" alt="Self-hosted Kubernetes: €55,800 per year, cloud: €78,132 per year"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Surprise (not), it's cheaper to rent hardware. Plus, in the case of the cloud, in addition to computing resources, you will need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;VPC;&lt;/li&gt;
&lt;li&gt;TGW;&lt;/li&gt;
&lt;li&gt;Direct connect;&lt;/li&gt;
&lt;li&gt;ALB, ELB, NLB, NLP;&lt;/li&gt;
&lt;li&gt;VPC peering;&lt;/li&gt;
&lt;li&gt;and so on. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And you need people with a background not only in K8s, but in all these abbreviations. As a result, the questions boil down to the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Whose expertise are we willing to invest in? In internal Kubernetes, or in the expertise in conditional AWS? &lt;/li&gt;
&lt;li&gt;When something goes wrong after an application enters K8s (and something always goes wrong), to whom will the question of what exactly broke be addressed? To people inside the company, or to the technical support of the conditional Amazon? &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Regardless of whether we have a bare metal or a cloud cluster, we'll need to develop processes for deployment, application integration, application maintenance, authorizations, access control. Again, it'll be necessary to pay people who'll do this, regardless of what kind of cluster we have.&lt;/p&gt;

&lt;p&gt;Having considered all the reasons, we decide that it's more interesting for us to invest in an internal team and internal expertise. And we raise our bare metal cluster.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1audwqht3hwupue17ukp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1audwqht3hwupue17ukp.png" alt="OOM meme"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, we need to choose which bare metal cluster. We find four solutions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;kOps;&lt;/li&gt;
&lt;li&gt;Kubernetes The Hard Way; &lt;/li&gt;
&lt;li&gt;Rancher;&lt;/li&gt;
&lt;li&gt;Kubespray.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We dismiss kOps immediately, since at that moment it can only raise a cluster in the cloud on EC2. &lt;a href="https://github.com/kelseyhightower/kubernetes-the-hard-way" rel="noopener noreferrer"&gt;Kubernetes The Hard Way&lt;/a&gt; is a wonderful manual that we learn from: command by command, you raise the cluster and figure out what it needs and how it works. But it's inconvenient to maintain a cluster command by command, and in general we're all fashionable and want automation. And if we need to automate, why not take a ready-made tool. &lt;/p&gt;

&lt;p&gt;That leaves Rancher and Kubespray. Kubespray wins: it largely automates The Hard Way. Our main arguments for Kubespray are the following:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It's written in Ansible. We've been working with Ansible for a long time; it's convenient for us to read, more so than Rancher's source code. &lt;/li&gt;
&lt;li&gt;Even then, Kubespray allowed you to flexibly assemble the hodgepodge of components that make up Kubernetes. &lt;/li&gt;
&lt;li&gt;Kubespray works transparently with the environment, just like Ansible. &lt;/li&gt;
&lt;li&gt;Kubespray was developed by Mirantis, one of the Kubernetes pioneers at the time. We trusted their expertise and relied on it.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Implementation. Q3 2017
&lt;/h2&gt;

&lt;p&gt;We roll out our three clusters with Kubespray and deploy the first microservices. A golden age of cluster maintenance begins for us, covering all the standard operations for Kubernetes and for bare metal: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cluster rollout.&lt;/li&gt;
&lt;li&gt;Scale cluster.&lt;/li&gt;
&lt;li&gt;Development of a process for entering applications into K8S, authorization, recommended deployment pipelines, etc.&lt;/li&gt;
&lt;li&gt;Upgrade of node software: Docker, Kernel, etc.&lt;/li&gt;
&lt;li&gt;Upgrade of the cluster version.&lt;/li&gt;
&lt;li&gt;Renewal of K8s, CNI certificates, etc.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Prerequisites for Refactoring. 2017–2019
&lt;/h2&gt;

&lt;p&gt;For the next two years, clusters grow from 5 to ~20 nodes. And the first problems begin.&lt;/p&gt;

&lt;p&gt;One of the problems showed up when moving to a new data center. The story is as follows. Engineers from our data center came to us and said: "The old premises are being closed; there'll be a new one." It would seem this shouldn't cause any difficulties: we add nodes in the new room and remove them from the old one, then add masters in the new room and remove them from the old one. In theory, there shouldn't be any downtime either. In practice, everything worked with the nodes but not with the masters, because at that time Kubespray didn't support scaling masters.  &lt;/p&gt;

&lt;p&gt;We thought we could scale the masters ourselves; after all, we already had two years of experience with Kubernetes. In the test environment everything works. On Preprod we end up causing downtime and realize we're not ready to take that risk with Prod. So we settle on the idea that physically moving all the masters from one room to the other is both the simplest and the most reliable solution. &lt;/p&gt;

&lt;p&gt;We pull the servers out of one room, load them onto a truck and move them to the new room. In the end, all nine masters were moved successfully, but there was a lingering unpleasant aftertaste from having to physically move servers from one location to another in this day and age.&lt;/p&gt;

&lt;p&gt;But this wasn't the only problem that reared its head. Over time, Kubernetes maintenance operations began to take longer and longer. Here are some examples:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhfhtxgi6cxz1c4y0rkv7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhfhtxgi6cxz1c4y0rkv7.png" alt="Execution time in 2017 and 2019"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Scaling a cluster began to take 4 hours. This is because Kubespray lets you assemble almost any combination of Kubernetes components, so in any given configuration a large number of tasks are simply skipped. As the number of nodes grows, the playbook spends more and more time just skipping those tasks. &lt;/p&gt;

&lt;p&gt;Upgrading node software took several days, because this operation sometimes requires interaction with the data center and its intermediary team. For example, to update the operating system, you need to redeploy the server. It's time-consuming, tedious and extremely unpleasant. &lt;/p&gt;

&lt;p&gt;We had to upgrade from version 1.12 to 1.14 by migrating from one cluster to another. It took a whole quarter, because we had to hand every team in the company a redeploy task and wait until they completed it.&lt;/p&gt;

&lt;p&gt;To back this up with numbers, let's look at Kubespray's execution time on a sample configuration. For the measurements we used the latest version of Kubespray and ran it against varying numbers of EC2 instances. Here is a graph of execution time versus the number of nodes:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fknmfiatgxkplwtzoa99d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fknmfiatgxkplwtzoa99d.png" alt="A graph of the execution time depending on the number of nodes"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Unlike in 2019, the execution time has now dropped from 4 hours to 2, but it still grows, by tens of minutes for every 3 nodes. That's a lot. Imagine a cluster of 50 nodes on which you need to run Kubespray. Running an Ansible playbook for 4 hours is annoying. Worse, Ansible may fail partway through, and you'll need to spend another 4 hours. &lt;/p&gt;

&lt;p&gt;At this stage, we came to the following conclusions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Kubespray takes a long time to skip tasks.&lt;/li&gt;
&lt;li&gt;We don't need a large number of additional Kubespray tasks.&lt;/li&gt;
&lt;li&gt;Kubespray doesn't support all operations that we need. An example is the scaling of masters. &lt;/li&gt;
&lt;li&gt;The rich variability of cluster configurations does us more harm than good. We don't need anything but our own configured, understandable, working cluster with our own variables.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By that time, we had already figured out the Kubespray variables and how CNI and Kubernetes itself work. That knowledge should be enough to write the operations ourselves, implement them and maintain them. So we decide to write our own playbook and drop all the Kubespray tasks unnecessary for our infrastructure. And if some binary turns up that takes care of the additional tasks of cluster initialization and node join, so much the better.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tool Selection. Q3 2019
&lt;/h2&gt;

&lt;p&gt;Here we enter the second lap of our round trip and return to choosing a tool. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn0juiqclewxv28ej2lvr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn0juiqclewxv28ej2lvr.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, we look for a binary that will take over the tasks associated with init and join. And almost immediately we find Kubeadm. It suits us because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It was out of beta.&lt;/li&gt;
&lt;li&gt;Our team had gained enough context to see which processes we could delegate to it.&lt;/li&gt;
&lt;li&gt;Kubeadm was becoming the recommended way to operate a K8s cluster, including in the official documentation. &lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Implementation. Q4 2019
&lt;/h2&gt;

&lt;p&gt;We write our own playbook, replacing a large number of tasks with exec calls to Kubeadm, then migrate from the Kubespray cluster to our self-written Kubeadm one at version 1.16. &lt;/p&gt;

&lt;p&gt;The playbook looked something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Create audit policy files on each master (for apiserver and falco)
- hosts: kube-master
  become: yes
  roles:
    - { role: kubernetes-audit-policy, tags: "kubernetes-audit-policy" }
  tags: ["setup-cluster", "k8s-audit-policy"]

# Setup first master (kubeadm init). It is executed only on the first play, on the first master.
- hosts: kube-master[0]
  become: yes
  roles:
    - { role: kubeadm-init, tags: "kubeadm-init" }
  tags: ["setup-cluster", "kubeadm-init"]

# Generate join tokens and join new masters/nodes in cluster.
- hosts: kube-master[0]
  become: yes
  roles:
    - { role: kubeadm-join, tags: "kubeadm-join"}
  tags: ["setup-cluster", "kubeadm-join"]

# Setup calico using helm
- hosts: kube-master[0]
  become: yes
  roles:
    - { role: kubernetes-networking, tags: "kubernetes-networking"}
  tags: ["setup-cluster", "kubernetes-networking"]

# Add labels and annotations on nodes to manage taints using helm:
# app.kubernetes.io/managed-by: Helm
# meta.helm.sh/release-name: node-taints-labels
# meta.helm.sh/release-namespace: kube-system
- hosts: kube-master[0]
  become: yes
  roles:
    - {role: kubernetes-annotations-labels, tags: "kubernetes-annotations-labels"}
  tags: ["setup-cluster", "kubernetes-annotations-labels"]

################################### Install and configure rsyslog ##################################
# Install and configure rsyslog binary
- hosts: all
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Operations remained the same standard set. At the same time, our own playbook drastically reduced the cluster scaling time: from 4 hours to 9–10 minutes for 15 nodes: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foj1nfcwwb1i79d2jkusr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foj1nfcwwb1i79d2jkusr.png" alt="The cluster scaling time graph"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At that moment we thought: &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Wow, victory! It's not clear when we'll see 4 hours again. We've grown by 15 nodes in 2 years, so the current solution will probably last us a long time. &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And we'd never been so wrong before. Kubernetes earns a reputation in the company as a very stable, trusted platform. So the business brings us a task: almost all stateless microservices should now, where possible, be rolled into Kubernetes.&lt;/p&gt;

&lt;p&gt;Another 2 years pass. The cluster grows from 20 to ~100 nodes. Operation times increase again, and cluster scaling begins to take 1.5 hours. Upgrading kernels or Docker takes even longer: where we previously had to reboot 20 nodes or restart Docker on them, now we need to do the same on a hundred. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdt1yqsgf47i0z68dflu3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdt1yqsgf47i0z68dflu3.png" alt="Execution time in Q4 2019 and in 2021"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Kubeadm didn't support upgrading the cluster version from 1.16. Moving to version 1.17 took six months, because the number of microservices in the cluster had become simply titanic. Starting with 1.17 we wrote our own upgrade playbook, and since then, up to version 1.22, upgrades have been a matter of pressing a button on a Jenkins job and periodically checking on the process. &lt;/p&gt;

&lt;h2&gt;
  
  
  It Seems Like We've Been Through All This. Q3 2021
&lt;/h2&gt;

&lt;p&gt;An attentive reader will notice that the growing execution time of standard operations was discussed literally a section ago. We knew perfectly well that we were back where we had started. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq52u7x142wvyn1d9701n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq52u7x142wvyn1d9701n.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here a logical question arises for our team: do we even need to change anything? Kubernetes works stably. An hour and a half per operation is unpleasant, but it's still not four hours. Making changes for the sake of changes is pointless; we needed a good motivation.&lt;/p&gt;

&lt;p&gt;Motivation came from the business:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;You have done a great job over the last four years. Most of the microservices are in Docker, and we've achieved the goal of a single runtime. But there's a problem: we have Kubernetes in prod, and Docker Swarm with plain Docker containers on the test platform. As a result, people deploy in different ways, and microservices behave differently on the test platform and in prod. We'd like everything to be the same.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;There are two options: either remove Kubernetes from prod, or add Kubernetes to the test builds. We don't want to remove K8s, because it's performing very well. So we decide to roll out K8s on the test platform, which by our own efforts had already spread into additional clouds, including Amazon. As a prototype, we'll raise temporary Kubernetes clusters on EC2 instances. But first we need specifications: how long it should take to raise these clusters, how many clusters, and how many nodes.&lt;/p&gt;

&lt;p&gt;The business gives the specifications:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Time to raise &amp;lt; 3 minutes on EC2.&lt;/li&gt;
&lt;li&gt;Number of nodes—discretionary.&lt;/li&gt;
&lt;li&gt;Number of clusters—discretionary.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The test platform produces from 200 to 1000 builds per day. Here we realize that the current flow won't work: 200–1000 times a day, someone would have to create EC2 instances, update the inventory, go to Jenkins, press the playbook button and hand the cluster IP to the developer. We could wire up a webhook to Jenkins jobs, automate the Ansible inventory and achieve some level of automation. But why automate the triggering of jobs and an Ansible playbook when what you actually need to automate is the creation of a cluster? &lt;/p&gt;

&lt;p&gt;We reject EKS/GKE right away. Anyone who has created a cluster there knows it doesn't come close to fitting in three minutes.&lt;/p&gt;

&lt;p&gt;Our team's interest in the coming changes is two-fold: &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;It's possible to provide constant and daily verification of the configuration on running test environments.&lt;/li&gt;
&lt;li&gt;Somewhere in here must be the answer to how to stop spending three hours rolling out hardware only to hit an Ansible error. &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdt90a9txf21j92cih775.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdt90a9txf21j92cih775.png" alt="Ansible error dev/null"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Decomposition of the Process
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpn65ookq9eciolozv7rd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpn65ookq9eciolozv7rd.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For four years we had been pushing a playbook onto our hardware, and now we had to do something different. To figure out what, we set aside our accumulated knowledge and went back to the source, namely Kubernetes The Hard Way.&lt;/p&gt;

&lt;p&gt;The Hard Way breaks raising a K8s cluster into nine stages. Each stage consists of its own set of Bash commands, about a hundred in total per master or node: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F61s2pizhz4kr7xvkzo3d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F61s2pizhz4kr7xvkzo3d.png" alt="Stages to raise the Kubernetes cluster"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We could automate all these hundreds of commands per node with some kind of script, but we want something simpler. And the Kubernetes documentation offers a way: Kubeadm for cluster management. If you look at what Kubeadm does, you can see that for the first master, five steps from The Hard Way are wrapped into a single Kubeadm init command with a pre-generated config. For additional masters, all operations are wrapped into four commands:  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy383vxipxrnf9ied44ga.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy383vxipxrnf9ied44ga.png" alt="Kubeadm commands"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If even four commands seems like a lot, there's a life hack. You can generate the kubelet certificates in advance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem
client-key: /var/lib/kubelet/pki/kubelet-client-current.pem
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add them to the node, and point &lt;code&gt;clientCAFile&lt;/code&gt; in &lt;code&gt;/etc/kubernetes/kubelet/kubelet-config.json&lt;/code&gt; at the cluster CA:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"clientCAFile": "/etc/kubernetes/pki/ca.crt"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And run kubelet with the &lt;code&gt;--register-node&lt;/code&gt; flag. It will register itself in the cluster as a node, and you don't even have to run Kubeadm join (although you'll still have to generate and place the certificates, which for some may actually be more transparent).&lt;/p&gt;
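&lt;p&gt;As a rough sketch of that variant (the exact flag set and the kubeconfig path are assumptions; the other paths follow the snippets above):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# kubelet started directly, no kubeadm join required (illustrative only)
# --kubeconfig points at the pre-generated client cert/key pair
# --config points at kubelet-config.json with clientCAFile set
/usr/bin/kubelet \
  --kubeconfig=/var/lib/kubelet/kubeconfig \
  --config=/etc/kubernetes/kubelet/kubelet-config.json \
  --register-node=true
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;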

&lt;p&gt;So, we have four commands that need to be executed at the right time, with the right config. To do that correctly, we start building a prototype. &lt;/p&gt;

&lt;h2&gt;
  
  
  Prototyping
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Step 1.&lt;/strong&gt; First of all, we take our playbook, create a Dev environment, and replace the static values in group vars with placeholders. We roll the configs out onto an EC2 instance without the Kubeadm execs. We get the following configs:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5j2ms2liotv2cv5iuaqh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5j2ms2liotv2cv5iuaqh.png" alt="The configs on an EC2 instance"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2.&lt;/strong&gt; We take a snapshot of this EC2 instance, and the test platform launches new machines from it, with Amazon metadata attached. The metadata looks something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;~ cat ud-decoded
{
 "domainsEnv": "BASE_DOMAIN=* API_SERVER_DOMAIN=* ETCD_DOMAIN=*",
 "masterAddr": "master01.build126.*",
 "kubenodesAddr": "node01.build126.* node02.build126.* node03.build126.*",
 "myDomain": "node03.build126.*",
 "myRole": "node"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnqasjtvemv6bhb4u1vcj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnqasjtvemv6bhb4u1vcj.png" alt="Amazon AWS metadata"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The AWS metadata contains exactly the variable values for the placeholders, plus the node's role: master or node. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3.&lt;/strong&gt; We write a cloud-init script that goes to the Amazon metadata server, fetches the metadata and substitutes the variable values into the placeholders. Depending on the node's role (first master/additional master/worker node), the script runs the corresponding operations and raises its part of the cluster independently of the other nodes:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4l0vk15f15hyyv1yh4xt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4l0vk15f15hyyv1yh4xt.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We launch the cluster instances from the snapshot containing the configs and the script, with the metadata attached, wait 180 seconds for the init and join of all nodes, and PROFIT.&lt;/p&gt;

&lt;p&gt;We get the following prototyping results:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The time to raise a cluster on EC2 is truly &amp;lt; 3 minutes. We wait 180 seconds; if the master hasn't seen all the nodes listed in the metadata within 3 minutes, the script fails with exit code 1. &lt;/li&gt;
&lt;li&gt;There can be as many clusters and nodes as you like, as long as there are instances available in the cloud.&lt;/li&gt;
&lt;li&gt;It's unclear how to scale and upgrade everything on the fly, but for now that task doesn't arise: these builds don't live longer than a week. &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We think it turned out well. And that's strange, because when does anything turn out well the first time? So we decide to look at how the vendors have done it.&lt;/p&gt;

&lt;h3&gt;
  
  
  And How about the Vendors?
&lt;/h3&gt;

&lt;p&gt;We take SSH in one hand and grep in the other, and go to the EKS nodes first. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;EKS.&lt;/strong&gt; We grep the word &lt;code&gt;eks&lt;/code&gt; in &lt;code&gt;cloud-init-output.log&lt;/code&gt; and find the script &lt;code&gt;/etc/eks/bootstrap.sh&lt;/code&gt;: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fblk8nam61tczwh6xpz4n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fblk8nam61tczwh6xpz4n.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The screenshot shows the main function of the script, with all the if's removed. Looking closely, you can see that: &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Certificates and variables are taken from metadata via the AWS CLI. &lt;/li&gt;
&lt;li&gt;Sed substitutes these variables into the configs in place of the placeholders.&lt;/li&gt;
&lt;li&gt;The kubelet service is started via systemctl. &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Basically, this is an automation of The Hard Way. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;DigitalOcean.&lt;/strong&gt; There's nothing in the cloud-init DO logs, but if you look at the cloud-init scripts, you can find the &lt;code&gt;000-k8saas&lt;/code&gt; script:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9alit934xg1k9e1jj7jc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9alit934xg1k9e1jj7jc.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2189e7wt8t0ilwzqxldp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2189e7wt8t0ilwzqxldp.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Inside it you can see the following: &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Curl takes metadata and adds it to a file on the file system.&lt;/li&gt;
&lt;li&gt;From there, metadata is exported to env variables.&lt;/li&gt;
&lt;li&gt;If a node has the master role, the bootstrap-master script is run; if the kubelet role, then bootstrap-kubelet is run. &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;You can't see the scripts themselves, but the 10-kubeadm.conf file and the Kubeadm binary on the node hint at how the control plane and the nodes are raised. &lt;/p&gt;

&lt;p&gt;The metadata looks like this. We were very surprised to see role logic similar to ours:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3jxcdya64txdkrbw0los.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3jxcdya64txdkrbw0los.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GKE.&lt;/strong&gt; So, we've seen roughly the same architecture from two vendors. Then we went to GKE, and at first glance saw something different. We grepped &lt;code&gt;kube&lt;/code&gt; in &lt;code&gt;cloud-init-output.log&lt;/code&gt; and saw the following services:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fomzwn3pwvr2jt2tkisn2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fomzwn3pwvr2jt2tkisn2.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We wondered: what are these interesting services? We went to systemctl and found kubernetes.target, which contains the kube-node-configuration and kube-node-installation services:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyb3p2od73t57ahsj6nnz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyb3p2od73t57ahsj6nnz.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here we had to rely on intuition, but in the end these are oneshot services that lead to scripts: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl8plohn2zjsxtgd2fn2w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl8plohn2zjsxtgd2fn2w.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The functions in both scripts are about the same. As an example, let's look at the &lt;code&gt;download-kube-masters-certs&lt;/code&gt; function: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzh8wt1zy35k18yl6n71n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzh8wt1zy35k18yl6n71n.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here's what happens in it:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Curl takes metadata from the metadata.google.internal server.&lt;/li&gt;
&lt;li&gt;Using a Python one-liner, the metadata (in this case certificates) is converted from YAML and exported into an env variable. &lt;/li&gt;
&lt;li&gt;Further along the script there are the same sed's and a systemctl start of kubelet.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In general, the building blocks for raising a cluster this way are more or less the same everywhere:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;There's some kind of metadata server; in the clouds, these are their internal servers.&lt;/li&gt;
&lt;li&gt;There's some binary that fetches this metadata: for some it's the AWS CLI, for others curl. &lt;/li&gt;
&lt;li&gt;There's some binary that substitutes the metadata into configs. Generally, this is sed. &lt;/li&gt;
&lt;li&gt;There's something that raises the cluster, either in the form of Kubeadm or in The Hard Way style.&lt;/li&gt;
&lt;li&gt;And there are a bunch of systems scattered across the cloud that generate this metadata, which is why the time to raise a control plane is about half an hour everywhere. In our case everything is faster, because we don't have that many of these systems yet. &lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Implementation (but Different)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fau26hc6sktvu0fq1zo7q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fau26hc6sktvu0fq1zo7q.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Our HW clusters, with their problems, are still in place. In the future we want to operate them in much the same way as the cloud clusters. For this we need:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwpu7wcjy5h2b4fsayc7w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwpu7wcjy5h2b4fsayc7w.png" alt="What do we need for HW and cloud clusters"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can write all of this, but we need to decide how and with what tools. As always, that's the hardest question. We're moving in that direction, and it's already clear that when we get there, we'll run into new problems. &lt;/p&gt;

&lt;p&gt;The first obvious point is that this will be a pull system with all its attendant problems: someone makes a mistake, it rolls out without our knowledge, something breaks, and it's unclear where and how. As for the fail-fast A/B configuration-testing approach, we hope to get it out of the box, because bare-metal clusters then become a special case of our cloud clusters raised on EC2 instances. &lt;/p&gt;

&lt;p&gt;At the same time, the system itself will objectively become more complicated. It used to be: add inventory, then go look in the log at what Jenkins and the Ansible playbook were doing. Now there are provisioner instances, metadata providers, a snapshot provider if we need one, and a script that goes to the metadata server. How they all interact with each other is a separate question. &lt;/p&gt;

&lt;p&gt;On the one hand, these are all new services; on the other hand, nothing prevents us from building them on microservice principles. We know how to work with those: add monitoring and logging, create observability, and all these minuses turn into pluses. At this stage, microservices reach infrastructure maintenance itself. That's probably a good thing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tasks for the Future
&lt;/h2&gt;

&lt;p&gt;If we move away from theory and talk about the tasks that we plan to accomplish, then we really want to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Update kernels on 100 nodes per hour.&lt;/li&gt;
&lt;li&gt;Be confident in disaster recovery.&lt;/li&gt;
&lt;li&gt;Have all environments roll out the same way.&lt;/li&gt;
&lt;li&gt;Stop depending on the size or number of clusters.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If we talk about the order in which we'll go further, then it's approximately the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Add scaling and upgrades to the EC2 cluster.&lt;/li&gt;
&lt;li&gt;Describe all the additional systems needed around bare metal.&lt;/li&gt;
&lt;li&gt;Coordinate the toolkit with other teams and the business.&lt;/li&gt;
&lt;li&gt;Implement.&lt;/li&gt;
&lt;li&gt;Restart the cluster nodes.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Retrospective Conclusions
&lt;/h2&gt;

&lt;p&gt;Looking back over the past 5.5 years, we believe our path was largely the right one. Investing in the team's internal expertise eventually let us build something similar to the vendors' solutions, but on bare metal, combining the ability to independently control the K8s configuration with the ability not to spend much time supporting K8s itself. &lt;/p&gt;

&lt;p&gt;But what would we do if we were implementing clusters from scratch now? The business would most likely say: "Let's go to EKS, there are experts in the community, and everything seems to work." However, the question of who figures things out when something goes wrong, tech support or internal people, remains. And it seems that in the process of working that out, we would have arrived at our own solution anyway. &lt;/p&gt;

&lt;p&gt;Most likely, it would again have been a push configuration. Looking from scratch at all the systems needed to bring up a cluster in three minutes can be scary: you need metadata, scripts, and a whole ecosystem around the cluster, when all you want is a cluster. But iterating between a push configuration and a pull configuration would be faster now: the community already has this experience, and there are people who can share it. &lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>cloud</category>
      <category>devops</category>
    </item>
    <item>
      <title>TOAST tables in PostgreSQL</title>
      <dc:creator>Quadcode team</dc:creator>
      <pubDate>Wed, 21 Dec 2022 14:32:00 +0000</pubDate>
      <link>https://forem.com/quadcode/toast-tables-in-postgresql-13fp</link>
      <guid>https://forem.com/quadcode/toast-tables-in-postgresql-13fp</guid>
      <description>&lt;p&gt;In the previous article we learned about the structure of a Heap table and a page in PostgreSQL. In this part we'll deal with TOAST tables, also known as satellite tables. They help to split the data in terms of the length of rows.&lt;/p&gt;

&lt;h2&gt;
  
  
  TOAST tables
&lt;/h2&gt;

&lt;p&gt;TOAST means The Oversized-Attribute Storage Technique. A TOAST table is a regular Heap table, and it essentially inherits the properties you specify on the original table.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT relname, relpages, oid
  FROM pg_class,
       (SELECT reltoastrelid
          FROM pg_class
         WHERE relname = 't') AS ss
 WHERE oid = ss.reltoastrelid
    OR oid = (SELECT indexrelid
                FROM pg_index
               WHERE indrelid = ss.reltoastrelid)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--U9lo9G9o--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6bi81r2iketklwxgw08q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--U9lo9G9o--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6bi81r2iketklwxgw08q.png" alt="TOAST table example" width="880" height="415"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let's go back to our table &lt;code&gt;t&lt;/code&gt; from the previous article:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE TABLE public.t
(
 A INTEGER,
 B INTEGER,
 C VARCHAR
);

BEGIN;
INSERT INTO t (A, B, C) VALUES (1, 1, 'String #1');
COMMIT;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On the left we see a reference to our table, 16559, which is the identifier of the table stored in the meta layer:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--AM1ztl1y--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lfo3ez9pjni8t5wir1o5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--AM1ztl1y--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lfo3ez9pjni8t5wir1o5.png" alt="Table meta layer" width="880" height="497"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If we rewrite the query &lt;code&gt;WHERE class.oid = 16559&lt;/code&gt;, specifying the identifier, we'll see our TOAST table with the name "pg_toast_16556". That is, the pattern of its creation is &lt;code&gt;pg_toast&lt;/code&gt; plus a suffix defining the OID of the parent table.&lt;/p&gt;

&lt;p&gt;It's worth noting that a TOAST table doesn't contain a link to another TOAST table. That is, TOAST tables don't chain indefinitely; there's only the first level.&lt;/p&gt;
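&lt;p&gt;A quick way to check this (using the OID 16556 from the example above) is to ask for the TOAST table's own &lt;code&gt;reltoastrelid&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT relname, reltoastrelid
  FROM pg_class
 WHERE relname = 'pg_toast_16556';
-- reltoastrelid is 0: the TOAST table has no TOAST of its own
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;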

&lt;p&gt;To understand what a TOAST table does, let's run a test. I generate a string longer than 2 kilobytes and insert it into my table &lt;code&gt;t&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--WrcclOTr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/1ynnmdsbsl2utlxbw3we.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--WrcclOTr--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/1ynnmdsbsl2utlxbw3we.png" alt="Generated string" width="880" height="119"&gt;&lt;/a&gt;&lt;/p&gt;
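&lt;p&gt;If you want to reproduce the test, one way to generate such a long string (a sketch; the screenshot above uses its own generated value) is &lt;code&gt;repeat&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INSERT INTO t (A, B, C)
VALUES (2, 2, repeat('A', 3000));  -- a 3000-byte string, well over 2 kB
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;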

&lt;p&gt;Let's see what happens. If we go back and look at page #0 of my original table, we'll see that the page is practically empty. Where did the 2 kB data go?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--pyk4GDmE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ocpfpgy7xbwm9fbb67p2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--pyk4GDmE--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ocpfpgy7xbwm9fbb67p2.png" alt="Tuple header and tuple data" width="880" height="493"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If I make a direct query to my TOAST table, it turns out that the data is stored here:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT ctid, * 
FROM pg_toast.pg_toast_16556
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Kis28DZo--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/t2b7hmi9zrbx8vu143h9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Kis28DZo--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/t2b7hmi9zrbx8vu143h9.png" alt="Chunk data" width="880" height="191"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;They're divided into chunks, and each chunk contains its own part of the metadata. They're essentially separate. What does this mean? If you have long strings, then by making a query &lt;code&gt;SELECT ctid, * FROM table&lt;/code&gt;, you force Postgres to access the TOAST table, read these chunks, glue them together for you, and return this information outward.&lt;/p&gt;

&lt;p&gt;My personal opinion: be wary of &lt;code&gt;SELECT *&lt;/code&gt; and specify the columns you actually want. In most cases you won't need the long strings. But when they are needed, Postgres works like this: it consolidates your data and stores it in a separate structure that supports the original table.&lt;/p&gt;
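&lt;p&gt;For example, with the columns of table &lt;code&gt;t&lt;/code&gt; from above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- No TOAST access: the long column c isn't touched
SELECT a, b FROM t;

-- Forces Postgres to read and glue the TOAST chunks for c
SELECT a, b, c FROM t;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;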

&lt;p&gt;If you look at the TOAST table's metadata using the &lt;code&gt;pageinspect&lt;/code&gt; extension, it has an LSN, lower and upper pointers, offsets, and so on. That is, everything is the same as in a regular Heap table.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT *
FROM page_header(get_raw_page('pg_toast.pg_toast_16556',0));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--4XQPhhML--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/np4y1ysr5uhwddbezpaw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--4XQPhhML--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/np4y1ysr5uhwddbezpaw.png" alt="TOAST Table metadata" width="880" height="426"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  TOAST strategies
&lt;/h2&gt;

&lt;p&gt;The default Postgres strategy is &lt;strong&gt;extended&lt;/strong&gt;. It allows both compression and storage in a TOAST table: Postgres first tries to compress the data inside the page, and moves it out to the TOAST table if it's still too long.&lt;/p&gt;
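&lt;p&gt;You can see the effect of compression by comparing the on-disk size of a value with its logical length: &lt;code&gt;pg_column_size&lt;/code&gt; returns the stored (possibly compressed) size, while &lt;code&gt;octet_length&lt;/code&gt; returns the uncompressed length:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT octet_length(c)   AS logical_size,
       pg_column_size(c) AS stored_size
  FROM t;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;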

&lt;p&gt;The strategy can be changed. In addition to extended there are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Plain&lt;/strong&gt;—prevents compression and out-of-line storage.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;External&lt;/strong&gt;—allows out-of-line storage but prevents compression.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Main&lt;/strong&gt;—allows compression and uses out-of-line storage only as a last resort.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They're applied depending on the combinations of parameters that you want to achieve. For example, do you need compression or not, do you need to store it in a TOAST table or not?&lt;/p&gt;

&lt;p&gt;For example, there's a table &lt;code&gt;t&lt;/code&gt;. I query the metadata and get the list of attributes that make up my table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT attr.attname,
       t.typname,
       CASE
         WHEN attstorage = 'p' THEN 'plain'
         WHEN attstorage = 'x' THEN 'extended'
         WHEN attstorage = 'e' THEN 'external'
         WHEN attstorage = 'm' THEN 'main'
       END AS attstorage
  FROM pg_attribute attr
       INNER JOIN pg_type t ON t.oid = attr.atttypid
 WHERE attrelid = 16556
 ORDER BY attr.attnum
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In pink, I highlighted the system attributes that are mapped in my table. The green ones are the ones I created myself, custom ones. My columns are a, b, c.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--4ab68SBG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/tx486qvdd5alz5695g7y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--4ab68SBG--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/tx486qvdd5alz5695g7y.png" alt="System attributes mapped in table" width="880" height="351"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As you can see in the last column of the table in Figure 7, the plain strategy applies to types outside the string family (and they must be of fixed length).&lt;/p&gt;

&lt;p&gt;The C attribute is of type VARCHAR; it gets extended by default. This strategy allows compression as well as storage in TOAST. Roughly speaking, a &lt;code&gt;SELECT *&lt;/code&gt; query incurs a small penalty, since Postgres needs to go to the TOAST table and glue together chunks that must be decompressed first.&lt;/p&gt;

&lt;p&gt;Different combinations of strategies can, possibly, improve your model. Why possibly? Because you need to check benchmarks and do load testing on your database and a specific model.&lt;/p&gt;

&lt;p&gt;If you use the external strategy, data can be stored in the TOAST table, but without compression; the downside is that it eats up disk space. But in the modern world disks are cheap, while everything should be fast and responses should arrive within milliseconds. So you can start your research with the external strategy.&lt;/p&gt;
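&lt;p&gt;Switching column &lt;code&gt;c&lt;/code&gt; to the external strategy looks like this (as far as I know, the new strategy applies only to newly stored values; existing rows aren't rewritten):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ALTER TABLE public.t
ALTER COLUMN c
SET STORAGE EXTERNAL;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;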

&lt;p&gt;How can the strategy be changed? With an &lt;code&gt;ALTER TABLE&lt;/code&gt; command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ALTER TABLE public.t 
ALTER COLUMN c 
SET STORAGE PLAIN;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--5UVEpW9k--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/se4fs1di7q3dqqj97lm6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--5UVEpW9k--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/se4fs1di7q3dqqj97lm6.png" alt="Changed strategy" width="880" height="352"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;TOAST applies to the usual string types: TEXT, CHAR, and VARCHAR. You may also create custom types that act as wrappers around these system types.&lt;/p&gt;

&lt;p&gt;When choosing a particular strategy, test it with a benchmark, measuring TPS (transactions per second) and latency, and only then decide whether to switch the strategy or not.&lt;/p&gt;

&lt;h2&gt;
  
  
  Book recommendations
&lt;/h2&gt;

&lt;p&gt;Finally, I want to recommend some interesting books on database internals. They're not only about Postgres; you'll find other useful things there too:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Database Internals: A Deep Dive into How Distributed Data Systems Work" — Alex Petrov.&lt;/li&gt;
&lt;li&gt;"Readings in Database Systems" — Peter Bailis, Joseph M. Hellerstein, Michael Stonebraker.&lt;/li&gt;
&lt;li&gt;"PostgreSQL Notes for Professionals", a PDF course compiled from Stack Overflow Documentation.&lt;/li&gt;
&lt;li&gt;"Understanding EXPLAIN".&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>postgres</category>
      <category>beginners</category>
      <category>database</category>
    </item>
    <item>
      <title>Structure of Heap Table in PostgreSQL</title>
      <dc:creator>Quadcode team</dc:creator>
      <pubDate>Mon, 12 Dec 2022 14:17:13 +0000</pubDate>
      <link>https://forem.com/quadcode/structure-of-heap-table-in-postgresql-52oo</link>
      <guid>https://forem.com/quadcode/structure-of-heap-table-in-postgresql-52oo</guid>
      <description>&lt;p&gt;My name is Azat Yakupov. I’m a Quadcode Data Architect. In parallel with my work, I give lectures at the university and conduct courses in data engineering, data architecture and data analytics. I’ve been in the IT industry for more than 20 years, more than 6 of them in architecture. I want to share some of the extensive experience that I’ve accumulated during this time.&lt;/p&gt;

&lt;p&gt;This article is about Heap tables, aka sandbox tables. In it we’ll look at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Metadata and how it’s arranged in Postgres.&lt;/li&gt;
&lt;li&gt;What a table page is and its physical representation on disk.&lt;/li&gt;
&lt;li&gt;Fillfactor parameters that affect system performance.&lt;/li&gt;
&lt;li&gt;Object identifiers.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Heap Tables
&lt;/h2&gt;

&lt;p&gt;All data engineering begins at the moment when we create a table, describe attributes and add content there. Therefore, let’s start our analysis with a regular table. How is it stored and logically represented on disk?&lt;/p&gt;

&lt;p&gt;Many people think of it as a concrete structure stored as a set of files with ordered rows. But in fact, PostgreSQL introduces a little randomness: data can end up on different pages and in different places within the page itself:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F02hajjnu78gwjqij0utl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F02hajjnu78gwjqij0utl.png" alt="Data in Heap tables can be stored randomly on different pages and different places within the page" width="800" height="488"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It happens because there is a &lt;code&gt;VACUUM&lt;/code&gt; mechanism in Postgres. It redistributes data, cleaning out dead records and bringing a little chaos to the rows.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;ORDER BY&lt;/code&gt; construction helps avoid relying on this random order. If you need the data sorted, use &lt;code&gt;ORDER BY&lt;/code&gt; explicitly with the attributes you care about. Otherwise, a snapshot taken an hour ago may differ from one taken now, simply because an internal Postgres process such as &lt;code&gt;VACUUM&lt;/code&gt; ran and transformed your pages.&lt;/p&gt;
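&lt;p&gt;For example, the same table read twice may return rows in different physical orders, while an explicit sort is stable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Row order here depends on the physical layout and may change over time
SELECT ctid, * FROM public.t;

-- Deterministic order
SELECT ctid, * FROM public.t ORDER BY a, b;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;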

&lt;p&gt;Let’s look at the usual table syntax. Here I create a table with the name &lt;code&gt;t&lt;/code&gt;. The table has three columns: A, B, C.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE TABLE public.t
(
 A INTEGER,
 B INTEGER,
 C VARCHAR
);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Having created the table, we can access its metadata. The request seems scary, but in fact it just accesses the Postgres metadata, getting the necessary information for further investigation:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frq1j8t9gdv1i2wq4who3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frq1j8t9gdv1i2wq4who3.png" alt="The request for metadata" width="800" height="452"&gt;&lt;/a&gt;&lt;/p&gt;
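&lt;p&gt;A sketch of such a metadata query (the exact query in the screenshot may differ):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT c.oid,
       c.relname,
       n.nspname                   AS schema,
       pg_get_userbyid(c.relowner) AS owner,
       c.relpages,
       c.reltuples,
       c.reltoastrelid,
       pg_relation_filepath(c.oid) AS file_path,
       pg_relation_size(c.oid)     AS relation_size
  FROM pg_class c
       JOIN pg_namespace n ON n.oid = c.relnamespace
 WHERE c.relname = 't'
   AND n.nspname = 'public';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;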

&lt;p&gt;The metadata contains the following information:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;OID&lt;/strong&gt; — the object identifier that’s created inside Postgres. It’s based on a system sequence that returns a unique value for each new object we create: a column, a function, a trigger, a virtual table, and so on. By the OID we can refer to the object.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Relation name&lt;/strong&gt; — the name of the object.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schema&lt;/strong&gt; — public.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Object owner&lt;/strong&gt; — the person who created the table and is responsible for distributing accesses to it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tablespace&lt;/strong&gt; — the default for Postgres is pg_default.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Amount pages, amount tuples&lt;/strong&gt; — columns responsible for the number of pages and rows.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TOAST tables&lt;/strong&gt; — satellite tables. They allow the main table to function and allow you to split long rows if they don’t fit on the basis of a strategy or a selected policy on a particular page.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Type&lt;/strong&gt; — table type. We’re now considering a regular logged table, which is created based on the &lt;code&gt;CREATE TABLE&lt;/code&gt; command.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;File path&lt;/strong&gt; — the path that is mapped at the operating system level in pg_home, where your Postgres instance is located.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Relation size&lt;/strong&gt; — the total size of the table in bytes. Since we haven’t inserted any data into the table, Relation size = 0 bytes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In the path line, you can see that we have a 16387 file. This is a binary file that will contain the content of our table once we fill it with INSERT and then run UPDATEs or DELETEs. Note that this file matches the OID of the table.&lt;/p&gt;

&lt;p&gt;Let’s conduct a little experiment:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Insert data into the table.&lt;/li&gt;
&lt;li&gt;Generate a row.&lt;/li&gt;
&lt;li&gt;Recalculate the table statistics.&lt;/li&gt;
&lt;li&gt;Look at the contents of the table, file and meta layer.&lt;/li&gt;
&lt;/ul&gt;
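&lt;p&gt;The steps above can be sketched as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INSERT INTO public.t (A, B, C)
VALUES (1, 1, 'String #1');     -- insert a generated row

ANALYZE public.t;               -- recalculate the table statistics

SELECT relpages, reltuples,
       pg_relation_size(oid) AS relation_size
  FROM pg_class
 WHERE relname = 't';           -- look at the meta layer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;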

&lt;p&gt;As a result, we have changes in the metadata. In Figure 3, these changes are highlighted in turquoise:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fil0a73up3f9ox9c9w1za.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fil0a73up3f9ox9c9w1za.png" alt="Changes in the metadata" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The number of pages is now 1, the number of records is 1, and the volume of relationships has dramatically increased to 8 kilobytes. What’s the reason for this? The fact is that when we insert 1 row, the minimum atomic element is immediately allocated. Within Postgres storage, it’s a page, and it’s always 8 kilobytes. The exception is when you’ve completely rebuilt Postgres and increased the size of the page or decreased it by a factor of two.&lt;/p&gt;
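&lt;p&gt;You can check the page size of your instance directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SHOW block_size;
-- 8192 on a standard build
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;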

&lt;p&gt;How do Heap tables work? Imagine that we have a set of tables: Table 1 and Table 2 are smaller than a gigabyte, and through its metadata Postgres understands which files these tables refer to through the logical layer. This strongly resembles the ANSI-SPARC architecture, with its external, conceptual, and internal data layers. The inner layer is highlighted with a green outline in Figure 4, and the outer layer is the tables.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flop5idyfw1lqsekwwc9y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flop5idyfw1lqsekwwc9y.png" alt="A set of Heap tables" width="800" height="477"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Table 1 corresponds to the red file, with the same name as its internal identifier. Table 2 corresponds to the blue square. But there’s a nuance. If the table is larger than a gigabyte, then Postgres begins to split it into separate files with a size of 1 gigabyte. Thus, suffixes appear, indicating at the operating system level in which order to read parts of our table. One table at the metadata level corresponds to several files at the operating system level, and the suffix helps to organize the assignment of the content.&lt;/p&gt;

&lt;h2&gt;
  
  
  Table Page
&lt;/h2&gt;

&lt;p&gt;Let’s now consider the page itself.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flgv7j8xs756h1m5shw9j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flgv7j8xs756h1m5shw9j.png" alt="The Postgres page structure" width="800" height="443"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The Postgres page structure strongly resembles those used in Oracle and MySQL. The structure grows from both ends at once. When you do INSERTs, tuples (T1, T2, T3) are born and written from the bottom up. They grow until they meet the pointers (I1, I2, I3), which grow from the top of the page.&lt;/p&gt;

&lt;p&gt;Pointers point to specific tuples stored below in the file. This structure has been thoroughly battle-tested for optimization and proper storage of information, and it lets the Postgres optimizer take advantage of offsets and pointers.&lt;/p&gt;

&lt;p&gt;The page has &lt;strong&gt;a header&lt;/strong&gt;: a meta layer with an interesting structure showing the total amount of free space in the page. The page structure also contains &lt;strong&gt;transaction pointers, and a meta layer for each tuple, for each row&lt;/strong&gt;. There’s also a special space, which isn’t used by standard tables; it exists for index structures. Why? Because the page is an atomic element not only for tables, but also for indexes: one common structure serves all kinds of storage.&lt;/p&gt;

&lt;p&gt;The page has &lt;strong&gt;Fillfactor&lt;/strong&gt;, which we’ll talk more about in the next chapter. By default, Fillfactor is set to 100%. This means that Postgres will fill the page as much as possible. What does it mean to fill it as much as possible? Roughly speaking, until the lower zone meets the upper one. As soon as they touch, a new page will be born. The situation will repeat until 8 kilobytes are filled, then the next 8 kilobytes will be born, and so on. This is how the table grows.&lt;/p&gt;

&lt;p&gt;Let’s look at examples.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE EXTENSION pgstattuple;

SELECT *
FROM pgstattuple(16387);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I used the extension &lt;code&gt;pgstattuple&lt;/code&gt;. It allows you to view statistics without resorting to massive metadata selectors. Using the extension API, I pass the internal ID of my table. The result is a cross-section of statistics: table length, the number of live records, dead records — or zombies, as I call them — and so on.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu30o5krjo9yqrsu14h32.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu30o5krjo9yqrsu14h32.png" alt="A cross-section of statistics: by table length, number of live records, dead records" width="800" height="387"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I also want to study the page itself and see what’s inside. I’m interested to see if the page matches what we’ll see through &lt;code&gt;pageinspect&lt;/code&gt;. For example, examine the header. I send the table name &lt;code&gt;t&lt;/code&gt;, and give the number of the page — &lt;code&gt;0&lt;/code&gt;, because we only have one page.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE EXTENSION pageinspect;

SELECT *
FROM page_header(get_raw_page('public.t',0));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhu5fkvi3y5b832rj8qes.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhu5fkvi3y5b832rj8qes.png" alt="The header" width="800" height="406"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The lsn number&lt;/strong&gt; is a unique sequence number that was allocated to my change when it was inserted. This number is used both for data recovery and replication.&lt;/p&gt;

&lt;p&gt;If we consider the page header, which is 24 bytes, then what structures are there?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm483hsk3wbw0lvvlmgqo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm483hsk3wbw0lvvlmgqo.png" alt="Structures of the page header" width="800" height="370"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The structures are presented and marked up byte by byte: how much and what it occupies. The header is clearly described in terms of what elements are included in it, so that the Postgres optimizer can go into the header before reading the entire page and understand what it’s dealing with. And when you first start reading, you can tell right away how much room there is in that page.&lt;/p&gt;

&lt;p&gt;In the next figure, you can look at our header, and pointer (I1), which points to our tuple (T1), which starts with an offset of 8144 bytes. This T1 tuple contains the string that we generated at the insert level.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq9iq9lq80i0w3vl19c6r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq9iq9lq80i0w3vl19c6r.png" alt="The header, the pointer, the tuple" width="800" height="449"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you look not at the metadata but at the string itself, you can see interesting attributes: not only the length of the tuple, but also the number of the transaction that wrote it.&lt;/p&gt;

&lt;p&gt;Our row (T1) occupies 42 bytes. How is it all stored? The fact is that each row also stores its own metadata, and these 42 bytes are divided into two parts:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The tuple header, drawn on the left in the next figure. It defines the size in bytes of each metadata field.&lt;/li&gt;
&lt;li&gt;The tuple data itself, which is 18 bytes.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So only 18 of the 42 bytes are user data. Note that the metadata is important enough to take up more than half of the space.&lt;/p&gt;
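
&lt;p&gt;With the pageinspect extension you can see this per-tuple metadata, including the pointer offset and the transaction IDs, for every row in the page. A sketch, again assuming the example table &lt;code&gt;public.t&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- lp_off and lp_len are the tuple's offset and length within page 0;
-- t_xmin is the transaction that created the tuple
SELECT lp, lp_off, lp_len, t_xmin, t_xmax, t_ctid
FROM heap_page_items(get_raw_page('public.t', 0));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;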

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdihzvi7cdnidrdx8mfua.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdihzvi7cdnidrdx8mfua.png" alt="Metadata and user data" width="800" height="407"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Another metadata slice is &lt;strong&gt;the CTID pointer&lt;/strong&gt;. It contains the address of the page plus the address of the tuple inside this page.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4u6qxknz4px93ljaajr5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4u6qxknz4px93ljaajr5.png" alt="The CTID pointer" width="800" height="418"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This structure is essentially a pair of coordinates: it says that the row &lt;em&gt;“1 1 string #1”&lt;/em&gt; is stored in page zero, in the first tuple. If we add the relation’s OID, we essentially get the full coordinates for locating the information:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;a data file;&lt;/li&gt;
&lt;li&gt;a page in this data file;&lt;/li&gt;
&lt;li&gt;a tuple where we need to get this information from a specific page.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;B-Tree indexes are built on this principle: the leaves of the index point to a particular CTID in a particular file.&lt;/p&gt;
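
&lt;p&gt;You can see these coordinates for any row by selecting the hidden &lt;code&gt;ctid&lt;/code&gt; column; a sketch against the example table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- a ctid of (0,1) means: page 0, tuple 1 inside that page
SELECT ctid, * FROM public.t LIMIT 3;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;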

&lt;h2&gt;
  
  
  Fillfactor
&lt;/h2&gt;

&lt;p&gt;Fillfactor limits how much of a page inserts may fill. As I’ve already said, its default value of 100% means that the page will be filled to the end.&lt;/p&gt;

&lt;p&gt;In the following example, I set Fillfactor = 50% with &lt;code&gt;ALTER TABLE&lt;/code&gt;. This means that inserts may fill only half of each page: a new tuple will never cross the red border in the figure, and a new page is allocated every time the tuples reach the thin red line.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ALTER TABLE public.t SET (FILLFACTOR = 50);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feu014bufobubx8rsb2i8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feu014bufobubx8rsb2i8.png" alt="Fillfactor illustration" width="800" height="423"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Fillfactor is needed for future UPDATEs of the table. When you run an UPDATE, Postgres tries to place the new row version in the same page where the original row was stored rather than moving it somewhere else. This is good practice for Postgres.&lt;/p&gt;

&lt;p&gt;To help Postgres, you can leave free space in each page: “Let’s set Fillfactor to 90%; each page will be filled to 90%, and 10% will remain for future UPDATEs.” But keep the workload in mind. If these are operational tables that are updated very often, a threshold of 90% is worth trying. If these are static tables such as directories or dimensions, set Fillfactor to 100% right away, because their data is first cleared and then reloaded. But if they’re updated, you need to think about the correct Fillfactor policy from the start.&lt;/p&gt;
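
&lt;p&gt;To check which Fillfactor a table currently has, you can look at its storage options in the system catalog; a sketch for the example table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- reloptions shows e.g. {fillfactor=50}; NULL means the default of 100
SELECT relname, reloptions
FROM pg_class
WHERE relname = 't';
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;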

&lt;p&gt;Fillfactor can be set for a table that already exists:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ALTER TABLE public.t SET (FILLFACTOR = 100);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And for a newly created table, by explicitly specifying the Fillfactor value in the table options:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE TABLE public.t
(
 A INTEGER,
 B INTEGER,
 C VARCHAR
) WITH (FILLFACTOR = 100);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But if I set the Fillfactor on an existing table and want the historical pages to be rewritten too, I have to run a &lt;code&gt;VACUUM FULL&lt;/code&gt; operation. In essence, this operation copies the data from one file to another:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;VACUUM FULL public.t 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;VACUUM FULL&lt;/code&gt; is quite a critical operation under high load: it blocks both read and write traffic to the table until the operation finishes. I advise you to be careful with it and to read the documentation on the locks that &lt;code&gt;VACUUM FULL&lt;/code&gt; takes. Blocking the table in turn causes “connection refused” and “connection lost” errors if your timeouts are set small enough.&lt;/p&gt;
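
&lt;p&gt;To see how much reusable free space a table actually contains before and after such maintenance, the pgstattuple extension can help; a sketch for the example table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- free_percent is the share of the table that is free space
SELECT table_len, tuple_percent, dead_tuple_percent, free_percent
FROM pgstattuple('public.t');
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;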

&lt;h2&gt;
  
  
  Object identifier
&lt;/h2&gt;

&lt;p&gt;At the end of the article I want to pay special attention to one property of the OID: using OIDs to identify rows in a table is a bad idea.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgrj8pw3xry1oe7bavsw1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgrj8pw3xry1oe7bavsw1.png" alt="OIDs cycle" width="800" height="459"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;OID is a system counter that generates a number every time you create a new object: a table, a column, a function, a procedure, a trigger, and so on. But if you enable OIDs on a table, every row insertion will also draw its ID from this system counter.&lt;/p&gt;

&lt;p&gt;It would seem there’s nothing terrible in that. You could say, “Why not use an internal system identifier, essentially a primary key generated for every row?” But there’s a problem: the OID is a 32-bit type. When the system counter reaches its maximum, it has no choice but to wrap around and start from scratch.&lt;/p&gt;

&lt;p&gt;The trouble is that your database already contains objects numbered from the start of the counter. Duplicate OIDs can bring the whole Postgres instance down, and then you have to decide:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;How and by what approaches to restore this data.&lt;/li&gt;
&lt;li&gt;How to deal with such a cluster.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The OID should be used only to number schema objects, never table content.&lt;/p&gt;

&lt;h2&gt;
  
  
  A summary and what’s next
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Data in Heap tables can be stored randomly on different pages and different places within the page. If you want to sort the data, use ORDER BY.&lt;/li&gt;
&lt;li&gt;Each table has its own metadata with lots of information.&lt;/li&gt;
&lt;li&gt;If the table is larger than a gigabyte, then Postgres begins to split it into separate files with a size of 1 gigabyte.&lt;/li&gt;
&lt;li&gt;PostgreSQL uses a fixed page size, commonly 8 kB. The page structure simultaneously grows from top to bottom and from bottom to top.&lt;/li&gt;
&lt;li&gt;Each page has a header, item pointers, and a metadata layer for each tuple.&lt;/li&gt;
&lt;li&gt;A page has a Fillfactor: a limit on how full it may be packed with inserts.&lt;/li&gt;
&lt;li&gt;The object identifier is used only to generate the structure object number, but not for the content.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In the next article we’ll look at TOAST tables, the so-called satellite tables that store rows too long to fit in a regular page.&lt;/p&gt;

</description>
      <category>watercooler</category>
    </item>
    <item>
      <title>Evolution of the Quadcode Internet Telephony Architecture</title>
      <dc:creator>Quadcode team</dc:creator>
      <pubDate>Fri, 25 Nov 2022 11:02:09 +0000</pubDate>
      <link>https://forem.com/quadcode/evolution-of-the-quadcode-internet-telephony-architecture-1b57</link>
      <guid>https://forem.com/quadcode/evolution-of-the-quadcode-internet-telephony-architecture-1b57</guid>
      <description>&lt;p&gt;Over the course of several years, various functionalities have been added piece by piece to our Internet telephony. As a result, we found ourselves at a point where the only advantage of the existing VoIP architecture was its reliability. At the same time, making any changes caused headaches, and a successful result of their implementation in the product was never guaranteed. We decided to fix it and wrote a new solution in a year. &lt;/p&gt;

&lt;h2&gt;
  
  
  Quadcode Telephony Service
&lt;/h2&gt;

&lt;p&gt;We make Internet telephony for our B2B clients. VoIP components that we use:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;FreeSWITCH as telephone exchanges. &lt;/li&gt;
&lt;li&gt;VoIP telephony providers from our customers. &lt;/li&gt;
&lt;li&gt;More than 200 phone numbers from all over the world.&lt;/li&gt;
&lt;li&gt;SIP phones and GSM gateways. &lt;/li&gt;
&lt;li&gt;WebRTC clients in CRM and our WebRTC Webphone.&lt;/li&gt;
&lt;li&gt;Integration with CRM. &lt;/li&gt;
&lt;li&gt;Monitoring via Telegraf, Influx and Prometheus.&lt;/li&gt;
&lt;li&gt;Homer to store SIP traces.&lt;/li&gt;
&lt;li&gt;Ceph and AWS S3 for storing audio recordings of conversations.&lt;/li&gt;
&lt;li&gt;DDoS and UDP protocol protection.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At first glance, VoIP telephony seems simple and understandable, but in practice only VoIP engineers really understand how SIP and WebRTC work with all the nuances of these protocols. So let's start with a brief explanation before we move on to the architectural changes. &lt;/p&gt;

&lt;h2&gt;
  
  
  What is SIP protocol?
&lt;/h2&gt;

&lt;p&gt;The Session Initiation Protocol is a client-server application-layer protocol. It establishes, modifies, and terminates communication sessions: multimedia conferences, telephone connections, and distribution of multimedia information. Clients most often interact within SIP in the form of a dialog, which is a sequence of SIP messages. &lt;/p&gt;

&lt;p&gt;SIP can be divided into two interrelated parts: &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Signaling.&lt;/li&gt;
&lt;li&gt;Media transmission. &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Dialogs and other SIP messages are transmitted over the signaling part. Embedded inside SIP is the SDP protocol, which is responsible for negotiating the media connection: SDP carries information about the audio and video streams. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--CVEZOOl1--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/u3c8241h0lcy2iig4r6z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--CVEZOOl1--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/u3c8241h0lcy2iig4r6z.png" alt="Session Initiation Protocol" width="797" height="658"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The SIP message itself looks something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INVITE sip:1234567890@192.168.1.1:5080;transport=tcp;gw=123 SIP/2.0
Record-Route: &amp;lt;sip:192.168.1.2;lr=on;ftag=as1f5e1177;vsf=AAAAAAAAAAAAAAAAAAAAAAAACAwAAAcBAAAEAAYCODo1MDgw;vst=AAAAAA8HAAYAdAEPDBtxAAAbGAIfABkxMzQuMjUy&amp;gt;
Via: SIP/2.0/UDP 192.168.1.2;branch=z9hG4bK413d.eb4e995427afcd56704ae5c59c0e9fd9.0
Max-Forwards: 69
From: &amp;lt;sip:0987654321@192.168.1.2&amp;gt;;tag=as1f5e1177
To: 1234567890 &amp;lt;sip:1234567890@192.168.1.1&amp;gt;
Contact: &amp;lt;sip:0987654321@192.168.2.3:5080&amp;gt;
Call-ID: 6366d9bb789ea6f1235f83297564388d@192.168.2.3:5080
CSeq: 102 INVITE
Content-Type: application/sdp
Content-Length: 348
User-Agent: Provider SBC

v=0
o=provider 1719357914 1719357914 IN IP4 192.168.2.3
s=Provider MGW
c=IN IP4 192.168.2.3
t=0 0
m=audio 12554 RTP/AVP 8 0 9 18 3 101
a=rtpmap:8 PCMA/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:9 G722/8000
a=rtpmap:18 G729/8000
a=fmtp:18 annexb=no
a=rtpmap:3 GSM/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16
a=ptime:20
a=sendrecv
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Like HTTP, SIP is a text protocol, and their response codes are similar: SIP also has 200s, 400s, and 500s. But the protocols behave differently. In HTTP we send a request, get a response, and we're done. In SIP, things are more involved: one message, a response to it, another message, another response. These are the dialogs we talked about above, and even dialogs aren't always present: at some points they're used, at others they aren't. Response codes are also handled differently; for example, in HTTP a 500 is a clear error, while in SIP it may or may not be. &lt;/p&gt;

&lt;p&gt;Another difficulty is processing SDP information. The SDP is passed in the SIP body and lists codecs, ports, IP addresses, and so on. All this information has to be negotiated between the server and the client, which is quite complicated and full of nuances. For example, if the client is behind NAT, it needs additional means to determine which external IP address to put in the SDP, because the client only knows its address inside the local network. The STUN protocol is usually used for this.&lt;/p&gt;

&lt;p&gt;It's no coincidence that the FreeSWITCH developer wiki says: &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;SIP is a crazy protocol and it will make you crazy too if you aren't careful. &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Nevertheless, almost all modern VoIP telephony is based on the SIP protocol. In general, VoIP SIP can be represented as a set of various protocols: STUN, TURN, SIP, TLS, SDP, (S)RTP, etc.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is WebRTC?
&lt;/h2&gt;

&lt;p&gt;Web Real-Time Communications is a standard that describes the transmission of signaling information, streaming audio, video, and other content. It isn't limited to telephony: WebRTC can carry SIP, XMPP, and other protocols. It's needed for real-time data exchange directly between browsers, or between a server and a browser. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--J-CDsDJj--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/7glogxl41gmjat8m0bet.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--J-CDsDJj--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/7glogxl41gmjat8m0bet.png" alt="OSI model applied to WebRTC" width="880" height="388"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The convenience of WebRTC is that it doesn't need installation of additional plugins or extensions for its operation—it's a built-in functionality. And on the server side there are various software solutions for interaction using WebRTC:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;FreeSWITCH.&lt;/li&gt;
&lt;li&gt;Asterisk.&lt;/li&gt;
&lt;li&gt;Flashphoner.&lt;/li&gt;
&lt;li&gt;Kurento. &lt;/li&gt;
&lt;li&gt;Etc.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;WebRTC provides full cross-platform compatibility for applications and complete independence from both hardware and operating system. For developers, the technology is available in the form of various JavaScript API libraries: you can import one into your project and write a full-fledged front end on top of it. It will work without problems in Google Chrome, Safari, Firefox, and other browsers. For example, we use the sip.js library. &lt;/p&gt;

&lt;p&gt;But WebRTC has one big disadvantage: it requires significant resources for transcoding and transmitting media streams. This is compensated for by the fact that there's no need to develop your own software and protocols for the telephony core, which would take a lot of man-hours.&lt;/p&gt;

&lt;p&gt;In our case, WebRTC is used specifically for phone calls. For them, only SIP remains at the upper signaling level, and for media, SRTP plus the media control protocol: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--OejutyCU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/1oxx21hmrly26k2lkykq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--OejutyCU--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/1oxx21hmrly26k2lkykq.png" alt="Image description" width="880" height="395"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Old VoIP Architecture
&lt;/h2&gt;

&lt;p&gt;In 2020, the architecture of our telephony looked like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--a9npNMUt--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/sex9os9xvnkqhtn52iml.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--a9npNMUt--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/sex9os9xvnkqhtn52iml.png" alt="Old Quadcode VoIP architecture" width="880" height="494"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;On the left side of the diagram are VoIP providers connected via the internet. Clients connect to telephony via an internal network. As clients we have: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CRM users—connections from the browsers of support service and client managers. &lt;/li&gt;
&lt;li&gt;Users who use Zoiper as a SIP phone from their smartphones or laptops. &lt;/li&gt;
&lt;li&gt;Internal office telephony with Wi-Fi handsets and GSM gateways.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Telephony works in two environments: Production and Integration. Integration, or Int, is a mini-copy of production; in this environment we conduct integration testing and verification. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Advantages of the old architecture.&lt;/strong&gt; The old scheme had only one advantage: reliability. Firstly, we worked with proven providers who were always ready to help resolve issues. Secondly, everything worked well, with one caveat: whenever we needed to change something, we inevitably ran into difficulties. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Disadvantages of the old architecture.&lt;/strong&gt; There were many more disadvantages. Firstly, the old scheme had no CI/CD, and we really didn't want to implement CI/CD on top of the existing architecture, since it was far from perfect and had many problems. That's why changing anything was painful: every change meant manual manipulation and digging into Git. Secondly, the architecture didn't scale: whenever something new was needed, we were forced to reinvent everything. &lt;/p&gt;

&lt;p&gt;A bonus of this whole affair was the poor documentation that we inherited. Unfortunately, when the company was still a startup, they didn't get around to documentation, and then it was too late. To the extent possible, we brought it in line with how everything works, but no one knew all the pitfalls. &lt;/p&gt;

&lt;p&gt;Overall, it was a classic version of internal telephony, as in any company that uses open-source VoIP telephony solutions.&lt;/p&gt;

&lt;p&gt;Cherry on top: The Int solution didn't coincide with the Prod solution. This was a huge disadvantage because all the telephony tests on Int didn't guarantee that everything would work in production. &lt;/p&gt;

&lt;p&gt;We conduct development and functional testing in a third environment—Sandbox. And in the old architecture, this environment was kludgy: Sandbox was integrated with Int; it allowed you to do something, but again it didn't guarantee that everything would work the same way in production. &lt;/p&gt;

&lt;p&gt;If we consider each FreeSWITCH inside, then the architecture of the old telephony and its integration with CRM looked something like this: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--zr_E-PGV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/e1pomjg4etadkjswggvs.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--zr_E-PGV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/e1pomjg4etadkjswggvs.jpeg" alt="Image description" width="880" height="474"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The red zone is the responsibility of the back office developers, and the gray zone is the responsibility of the VoIP telephony department. At some point, these areas of responsibility intermixed. &lt;/p&gt;

&lt;p&gt;It was a nightmare. Everything was built as needed: functionality was added piece by piece, resulting in a rattling mix of areas of responsibility. It became impossible to understand what changes at what point may affect what.&lt;/p&gt;

&lt;p&gt;Integration took place through FreeSWITCH dialplans and changes to the FreeSWITCH SQL cache; we also used fs_curl with some custom changes, a tool that hasn't been developed for six years. In other words, the CRM developers needed to know how to configure FreeSWITCH, how it works, and how all of its plumbing fits together.&lt;/p&gt;

&lt;p&gt;Moreover, integration also went through Go applications that were written by the back office but supported and run on the telephony servers. And on top of all this, a large number of Lua scripts ran in the telephony area of responsibility, and it became impossible to track when they were invoked within the integration logic. Making any change to this scheme was very complicated, and each new change could bring even more new problems.&lt;/p&gt;

&lt;p&gt;At some point, we were faced with the question of expanding the capabilities of B2B telephony for our customers. We realized that it was impossible to continue living with the old architecture, and began to improve it.&lt;/p&gt;

&lt;h2&gt;
  
  
  New VoIP Architecture
&lt;/h2&gt;

&lt;p&gt;It took a year to create a new solution. As a result, the new architecture looks like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--DiDPqPX---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6s47c3xe6czdrq613wr5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--DiDPqPX---/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6s47c3xe6czdrq613wr5.png" alt="New Quadcode VoIP architecture" width="880" height="347"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We implemented the division of FreeSWITCH by tasks: there are SBCs that are responsible only for telephony and connection of telecom operators, and there are PBXs that are responsible for connecting clients. The diagram has SBC0X and PBX0X, which means that we can increase the number of both SBC and PBX as needed—SBC01, SBC02, PBX01, PBX02, etc.&lt;/p&gt;

&lt;p&gt;If there's a need to use Kamailio or any other system for SIP routing on SBC instead of FreeSWITCH—no problem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Advantages of the new architecture.&lt;/strong&gt; VoIP reliability has remained at the same level: providers and FreeSWITCH haven't changed. But now we have CI/CD, and everything's controlled through GitLab. The new architecture also added high security, because we implemented and created integration with the DDoS protection service, created UDP protocol protection, and implemented WebSocket connection protection using tokens.&lt;/p&gt;

&lt;p&gt;As I said, scalability has appeared. Now we can increase the number of servers depending on the tasks and goals. Every step, every change in telephony is documented; we try to maintain maximum documentation.  &lt;/p&gt;

&lt;p&gt;The architecture for the Int environment has also changed. It's now almost identical to the prod solution: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--AbUgPwxF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/i4dly93t5blp3v71y2cq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--AbUgPwxF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/i4dly93t5blp3v71y2cq.png" alt="Int VoIP architecture" width="880" height="325"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The only difference is in the connected providers: provider emulators are connected instead. Now we don't need to buy individual numbers for Int; we can emulate the production numbers on the emulated provider and work with them in the same way as in the Prod environment. For the Sandbox environment, we wrapped the Int schema in Docker. &lt;/p&gt;

&lt;p&gt;Now the same FreeSWITCH configuration works in Prod, Int, and Sandbox. For each developer, a different environment can be activated in Sandbox, one-to-one with the Int environment. This ensures both testing and development, and the confidence that everything will work the same everywhere.&lt;/p&gt;

&lt;p&gt;The new FreeSWITCH architecture and its integration with CRM looks like this: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--3rfVlEQl--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0lorbr8k1uskko17fay6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--3rfVlEQl--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0lorbr8k1uskko17fay6.png" alt="Image description" width="880" height="490"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The areas of responsibility are now clearly divided. There are far fewer interaction points. Now the main point of interaction between FreeSWITCH and CRM is HTTP requests from FreeSWITCH to the CRM API (via a PHP JSON to XML converter), and the exchange takes place by receiving a JSON. That is, when prompted by FreeSWITCH, the PHP converter requests a pre-standardized JSON from the API and generates, based on the JSON response, an XML for FreeSWITCH with a configuration or dialplan. In the new architecture, there's no need to know the principles of configuring and working with FreeSWITCH in order to interact with it. &lt;/p&gt;

&lt;p&gt;The second point of contact is FreeSWITCH's event socket, through which the API can receive data from FreeSWITCH, create a callback via originate, or ask FreeSWITCH to update its configuration. &lt;/p&gt;
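
&lt;p&gt;For illustration, an originate request sent over the event socket might look like this (a sketch; the gateway name and the numbers are hypothetical):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;originate {origination_caller_id_number=1000}sofia/gateway/provider01/1234567890 &amp;amp;bridge(user/1001@pbx01)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;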

&lt;h2&gt;
  
  
  Software Tools for Working with Telephony
&lt;/h2&gt;

&lt;p&gt;To develop and work with telephony, we've created two software products. &lt;/p&gt;

&lt;p&gt;The first is Web_fs_cli. FreeSWITCH has fs_cli; it's a FreeSWITCH command line client. The client works through the event socket and allows you to make requests to FreeSWITCH and receive responses. Through fs_cli, you can view the number of registrations or the list of FreeSWITCH users, and execute other commands. &lt;/p&gt;
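
&lt;p&gt;For example, a few typical fs_cli requests look like this (a sketch; the output depends on your FreeSWITCH configuration):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;fs_cli -x "show registrations"
fs_cli -x "sofia status"
fs_cli -x "show channels"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;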

&lt;p&gt;Initially, this interface works only via the console; that is, we connect to the FreeSWITCH event socket in fs_cli, log in and work. But this isn't very convenient, because fs_cli is needed by both testers and developers, and sometimes you need to give SSH access to the server, and at times it's simply impossible to work with fs_cli remotely. And testers often don't like to make requests to the console. That's why we developed the fs_cli interface on the web. The web interface allows you to make requests to the FreeSWITCH event socket by entering a command, or using preset commands in Action. The responses are output directly to the browser.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--HvPsBjzF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/8yffcd7iq7p336gsgjw0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--HvPsBjzF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/8yffcd7iq7p336gsgjw0.png" alt="Web_fs_cli" width="880" height="494"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For the Int environment, the Test originate mode is available in the Web fs_cli interface. It generates calls from the provider emulators to an Int SBC. There we specify who calls whom and after what time the call ends, and click "generate call". Through the event socket, an originate request is made to the provider emulators, and an incoming call arrives at the SBC exactly as it would in Prod. An auto-announcer is connected to the call and says where the call came from and with what numbers. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Tf2n2tCo--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/1cmf1saft28j6xzte09j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Tf2n2tCo--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/1cmf1saft28j6xzte09j.png" alt="Image description" width="880" height="619"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This interface was launched at the end of 2021, and performs its tasks perfectly.&lt;/p&gt;

&lt;p&gt;The second software product is Webphone. This is a WebRTC phone that works in a browser. It uses the sip.js library, which we also use for chat rooms and CRM. The phone supports working with websocket tokens and very often helps with debugging.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--l09npk2a--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/qw8drn5yfeacbvafqj05.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--l09npk2a--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/qw8drn5yfeacbvafqj05.png" alt="Image description" width="500" height="462"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Our Solution Was Needed
&lt;/h2&gt;

&lt;p&gt;Why bother with the legacy in our own solution when B2B clients can pick any of the telephony products already on the market? Here's an example. &lt;/p&gt;

&lt;p&gt;One of our B2B clients is a banking service. It had its own infrastructure and its own telephony solution: Twilio, a provider of cloud software for business communications. Twilio isn't a standalone dialer, and it can only be integrated with a narrow range of other platforms. The client had an integration with Zendesk set up.&lt;/p&gt;

&lt;p&gt;The Twilio service is cheap, and the customer had no problems with the system itself. However, Zendesk was very expensive, so the client decided to abandon both products. From that moment, the business began searching for the right telephony solution. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;bOnline.&lt;/strong&gt; The first option was bOnline. It met the requirements at a reasonable cost, but after a couple of months of operation, recurring communication problems began that the telephony system couldn't resolve. Even when the issues could be fixed, it took a long time. The client had to look for another platform.  &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Acefone.&lt;/strong&gt; The next solution was Acefone, but the number of technical problems didn't lessen. One day the bank received a rather unpleasant reply from the platform: the audio from customers was being blocked on the client's side, and the company's own administrators would have to solve the problem. &lt;/p&gt;

&lt;p&gt;Quadcode had already worked with the banking service, so the client turned to us for a VoIP solution. The telephony team gathered the business and technical requirements and was able to offer a solution that satisfied them. Our Webphone also went into production for this client as an in-browser phone. It looks simple but fits the client's current needs. &lt;/p&gt;

&lt;p&gt;The full deployment period, from receiving the task from the customer to a production-ready solution, was one week. With the old architecture, we couldn't have done anything of the sort. So the decision to support and update our own VoIP telephony was the right one for us. &lt;/p&gt;

&lt;p&gt;Here's the feedback we received from the customer:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Possible technical problems are reported immediately, and they're solved very quickly. We can also safely expand the telephony functionality to meet new requirements. For example, relatively recently we added the ability to track who is currently online in the system. Switching to Quadcode telephony was the best solution. &lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Results of Architecture Refinement
&lt;/h2&gt;

&lt;p&gt;The reliability of the new telephony architecture has remained the same. At the same time, we've increased security, added CI/CD and scalability, and we now maintain the documentation regularly. VoIP complies with Quadcode architecture standards and supports all three environments: Sandbox, Int and Prod. &lt;/p&gt;

&lt;p&gt;Integration with external services is also standardized: it's a JSON contract that clearly spells out what to use, how, and when. And finally, there's no need to know FreeSWITCH in order to integrate with telephony. Developing and launching a new VoIP product on the new architecture takes one week, and our customers can extend the functionality to meet new business requirements.   &lt;/p&gt;
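&lt;p&gt;For illustration, a single call event in such a standardized JSON contract could look roughly like this (the field names below are hypothetical, not our actual schema):&lt;/p&gt;

```text
{
  "event": "call.finished",
  "call_id": "6f1d0c2a-5b7e-4a39-9c21-0d8f3a1e4b55",
  "from": "+35712345678",
  "to": "+35787654321",
  "started_at": "2022-06-01T10:15:00Z",
  "duration_sec": 42,
  "result": "answered"
}
```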

</description>
      <category>voip</category>
      <category>telephony</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Tips for New Team Leaders</title>
      <dc:creator>Quadcode team</dc:creator>
      <pubDate>Tue, 08 Nov 2022 14:00:00 +0000</pubDate>
      <link>https://forem.com/quadcode/tips-for-new-team-leaders-47c5</link>
      <guid>https://forem.com/quadcode/tips-for-new-team-leaders-47c5</guid>
      <description>&lt;p&gt;My name is Nikita Pimoshenko, and I'm the IT Head of the General Development in Quadcode. Today I'll share tips and case studies from practice that will be useful for new team leaders: how to work with a team, get out of the trap of a toxic specialist and not go crazy.&lt;/p&gt;

&lt;p&gt;I've been in IT for more than ten years. And I've been working in Quadcode for the last four years: in September 2018 I moved into a developer position, and in April 2019 I became a team leader of the Billing API team, from where I was promoted to IT Head.&lt;/p&gt;

&lt;h2&gt;
  
  
  Bruce Tuckman's Model
&lt;/h2&gt;

&lt;p&gt;In this article I'll mention the Bruce Tuckman model. According to it, all teams go through five stages: forming, storming, norming, performing and adjourning. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--JQ-kHMuz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/hgtjiom5ww08kqdl3da7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--JQ-kHMuz--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/hgtjiom5ww08kqdl3da7.png" alt="The Bruce Tuckman model" width="880" height="540"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I'll tell you how my team went through the stages of forming and storming.&lt;/p&gt;

&lt;h2&gt;
  
  
  Team Leadership: the Beginning
&lt;/h2&gt;

&lt;p&gt;In my opinion, the most important things for the team leader are: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A ton of energy and initiative aimed at the product (if we're talking about product teams). An amorphous team leader is the worst thing that can happen to a team.&lt;/li&gt;
&lt;li&gt;Willingness to be responsible for the results (their own and that of their colleagues).&lt;/li&gt;
&lt;li&gt;Leadership qualities (you can lead people and motivate them).&lt;/li&gt;
&lt;li&gt;The ability to speak, listen, hear and explain your thoughts to others so that they understand you. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When I became a team leader, I wasn't familiar with the Tuckman model and similar tools. I had the experience of forming remote teams for my partners, and that's probably all. But the promotion itself seemed logical to me: for six months I led and launched technical features, approved architectural changes, dragged the team to events and other team buildings. &lt;/p&gt;

&lt;p&gt;And one day my boss comes to a standup and says: "Folks, Nikita will be your team leader. His probation period is three months, and if everything's OK for everyone, then he'll stay." Expectations of the team's reaction: hurrah, applause, a cheering crowd. Reality: wariness, distrust, thoughts about what will happen next. Because yesterday there was "Nikita the programmer", and today there's already "Nikita the manager". Fears and concerns arise: how should I behave with him, can I still kick questions around with him at lunch, and is it worth inviting him to the bar? Our colleagues' picture of the world changed dramatically and turned upside down. &lt;/p&gt;

&lt;p&gt;The story went something like this: "Nikita, we've all seen that you're proactive, moving your ideas and making an effort to make them become code/documentation/tests; in short, adding value to the product that we all developed and love together. But you're a step higher now, so how should we behave with you?" And I have often heard phrases like: "You aren't hearing me," "I want to tell you this, but you don't understand me," "If you don't listen to me and respect me, I'll quit," and so on. &lt;/p&gt;

&lt;p&gt;It didn't fit my picture of the world at all. I had worked with these guys and considered them very reasonable, adult people, but here some kind of childishness began. To be honest, the situation reminded me of my friend's son's behavior, and I thought, "Wow, this is so similar to the way Lev was testing his dad's weak and strong points." At the time, I felt like the father of a large and capricious family in which the children probe the boundaries of what's permissible and often act offended. &lt;/p&gt;

&lt;h2&gt;
  
  
  What a New Team Leader Should Do
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Be in the collective, not above it.&lt;/strong&gt; For myself, I chose the model "Don't distance yourself from the team." I stayed with my team as much as possible; I didn't build walls between us. I didn't say things like, "I'm the boss, and you're my employees, so we can't have friendly interactions."&lt;/p&gt;

&lt;p&gt;On the contrary, I promoted the following idea: "Dudes, we work together; I'm a specialist just like you. Yes, I'm a team leader now, but that's great: you can come to me at any time, and I'll relay your interests. Through me, you can ask our bosses questions that you hesitate to ask in person, and solve problems." &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Hold 1:1 meetings.&lt;/strong&gt; This is necessary to clarify expectations and learn about the agreements that were made with the previous team lead. So you'll understand what commitments your predecessors made and how to work with them in the future. &lt;/p&gt;

&lt;p&gt;At that time, I didn't fully appreciate the importance of 1:1 meetings; I just understood that they were needed and started holding them. I tried to find out what concerns the folks had, how they saw our cooperation, what they expected from me, and what I wanted from them. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Organize informal gatherings and team buildings.&lt;/strong&gt; According to Tuckman, informal activities work best at the forming stage. We have a tradition in our company: pizza and wine are brought in at the end of the week, and we stay on Friday evening and hang out. And, I admit, at that time I was really looking forward to those Fridays: in an informal and relaxed atmosphere, the team opened up more and made contact. These get-togethers were even more useful than 1:1 meetings; they allowed us to identify more problems and "blind spots".&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Don't become a "hostage of fake power".&lt;/strong&gt; One mistake of new team leaders is to become hostage to information management. When you're a supervisor, you have access to a lot more information than the team. And many novice team leaders believe that it's more correct to somehow hold onto this information, and then use it for their own purposes.&lt;/p&gt;

&lt;p&gt;I admit that I also almost fell into this trap, but then I came to my senses and decided to stay in the team—the friend and comrade I was when I worked as a developer. And I began to transparently relay all the information to the team so that people wouldn't feel that I was hiding something from them. They know everything I know. Perhaps they won't find out at the same moment, since I need to work out some information myself first, clarify all the controversial points and understand how I'll answer my team's questions.&lt;/p&gt;

&lt;p&gt;In short, hoarding information and enjoying the feeling of "I know more than anyone, this is the team leader's power" is a dangerous game. It leads to problems: in large companies, information flows are often uncontrolled, and one way or another the folks will find out the news. And if they realize that you already knew about it and didn't tell them anything, it'll reduce their trust. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Agree on new metrics.&lt;/strong&gt; I felt like a blind kitten on this issue. I didn't understand what metrics developers could have, even though I had been one myself, and with QA it was even harder for a former dev to judge. So I worked with people more off the cuff rather than approaching them with a template: I started from each person's individuality and worked with the personality.&lt;/p&gt;

&lt;p&gt;To be concrete: I didn't count the tasks that each employee completed. Once a quarter, we planned what the team wanted to do: for example, develop such and such features, conduct research, or speak at a conference and share their experience. Progress was tracked at 1:1 meetings (this helps to spot blockers and see whether something needs to change), and the final result was reviewed at the end of the quarter. &lt;/p&gt;

&lt;p&gt;In a word, the metric was binary: we fulfilled the agreements—well done everyone; something didn't work out—we figured out what happened. The system works on independence and responsibility, and the team understands what is expected of them and what needs to be done. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Avoid emotional outbursts.&lt;/strong&gt; The whole story with complaints and misunderstandings in the team that I described above lasted about three months, during my entire probation period. Of course, we came to a consensus, learned to hear each other and built a dialogue—I showed that I'm part of the pack. &lt;/p&gt;

&lt;p&gt;But I'm an emotional person, so outbursts have happened. And I've been working on this for a long time, developing emotional intelligence (tracking other people's emotions in myself and determining how they affect me). And even now, four years later, I can still "fly off the handle" from other people's emotions. But I try to keep my temper.&lt;/p&gt;

&lt;p&gt;A simple tip: if you want to say something emotionally, be quiet, take a deep breath, count to 10 and think about whether what you want to say is useful and important. Or is it just an emotional thing that you want to "dump" on the person?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;7. Understand the importance of praise and gratitude.&lt;/strong&gt; Never hesitate to praise and thank a person, especially in public. And it's important never to scold a person in public. Of course, there are very emotional situations, and you can flare up. This happened to me once: I'm still ashamed, I apologized for a long time and brought the situation back to normal. &lt;/p&gt;

&lt;p&gt;Sometimes a person may lose their cool with you, with all their emotions and attacks. It's important to remember: if you give corrective feedback, never do it in public. It should be one-on-one with the person. Criticizing a specialist's work in front of the team is the worst thing a team leader can do.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;8. Prepare for difficult cases.&lt;/strong&gt; Unfortunately, there are a lot of them in the team leader’s work. For example, one day my improper supervisory decision led to the dismissal of a lead QA. And I want to tell you more about one case—the dismissal of a toxic specialist. &lt;/p&gt;

&lt;h2&gt;
  
  
  A Case from the Life of a Team Leader: How to Work with a Toxic Specialist
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Team Values
&lt;/h3&gt;

&lt;p&gt;When you're a hiring manager, remember that the main thing is for the new person to match the spirit of the team. To do this, you need to formulate team values, and the team members themselves will help you. Get together and ask them: "Folks, what do you value in our work, what do you think is the coolest thing about it, what goal are we pursuing?" The goal can be as simple as possible: for example, "we write code for money, and that's what we enjoy". That's a perfectly normal position in many outsourcing companies, because projects change quickly.&lt;/p&gt;

&lt;p&gt;There's another story: product development (our case). We have a service that we maintain and develop. We love it, nurture and cherish it, and do everything to make it great and always stable. We want to write awesome code and develop a high-quality, transparent architecture with everything covered by tests. In short, we love our service. And we're not passive order-takers: if you want to toss a feature our way, think it over; we'll ask you questions and make suggestions for improvement.&lt;/p&gt;

&lt;p&gt;Yes, we also work for money, but the value and quality of our product is at the forefront, not the financial component. We want people to come to us with features that need to be thought through, so that there's a dialogue with us and we come up with an implementation. We want to influence the product and feel the importance of this influence. Of course, we take full responsibility for this position.&lt;/p&gt;

&lt;p&gt;We have to understand: if the team is focused on quality, the value of feedback and open communication, then we need to hire people with similar values. &lt;/p&gt;

&lt;p&gt;We've sorted out values, now let's move on to toxicity.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is Toxicity?
&lt;/h3&gt;

&lt;p&gt;Many people think "toxic" means a person who tells everyone to go you know where, etc. But in fact, this may not be toxic at all: if everyone on the team speaks to each other in a rude manner, then it's completely normal behavior there. Folks will say, "He's OK". And there's the opposite situation: a person who is very responsible, thinks everything through to the smallest detail, takes a long time on tasks, but does them excellently. Such a person can also be toxic, despite all these qualities. They'll be toxic for a team of producers (in Adizes' terms) who want to release features quickly: the person will stifle them with their long deliberation, and there'll be a clash of values.&lt;/p&gt;

&lt;h3&gt;
  
  
  Interviews with Toxic Specialists
&lt;/h3&gt;

&lt;p&gt;An important skill is to learn how to probe motivational factors and values of a person: what's important to them in their work, what drives them, etc. And compare that to the expectations of the team. Interviews are the most effective stage when you can weed out a toxic professional. And this task falls on the shoulders of the hiring manager or recruiter. Not all team leaders are able to probe motivation, so you need to discuss this point with the recruiter; ask them to prepare specific questions and compose a motivational portrait of the candidate.&lt;/p&gt;

&lt;p&gt;There are also tricky candidates: they try to understand what kind of specialist you're looking for, and at the interview will say exactly what you want to hear or use socially acceptable answers. However, everything they say often doesn't match the real picture of their values. Mostly these are people who don't care about anything but money. &lt;/p&gt;

&lt;h3&gt;
  
  
  Working with a Toxic Specialist and their Dismissal: My Experience
&lt;/h3&gt;

&lt;p&gt;I had the case described above: the person was well prepared and got through the interview perfectly. Even then I was thinking about motivational factors and asking him questions about them. It was a transfer within the company; at that time recruiters didn't participate in such interviews. The candidate seemed very sensible. He said that he likes to think everything through and is passionate about the quality of the product. But in the end he turned out to be toxic to my team, which genuinely shared those values.&lt;/p&gt;

&lt;p&gt;For example, we don't always write a User Story from A to Z; our devs have some autonomy in how they implement a solution. This colleague literally demanded that all tasks be written in as much detail as possible, saying that he wasn't interested in bothering with business logic and solutions: he came to write code according to clear technical requirements. The folks on the team argued with him, saying that developers aren't just typewriters for code; they need tasks in which they can explore the product and user behavior in order to offer an interesting and effective solution. These conversations went nowhere, with everyone sticking to their own opinion.  &lt;/p&gt;

&lt;p&gt;The guy showed toxicity not only toward his own team but also toward other colleagues, for example, making improper and impolite remarks about their salaries when colleagues discussed promotions among themselves.&lt;/p&gt;

&lt;p&gt;Sometimes toxic people won't show their toxicity when dealing with their supervisor, but they'll definitely demonstrate it toward the team. This is where 1:1 meetings and informal get-togethers are very helpful: that's where you can ask the team how the person is behaving, how they work together in general, how they communicate, and what their impressions are. Of course, in the most critical situations you'll find out everything without any of that: the team may organize a real front of resistance and come to you with an intervention in the middle of the day.&lt;/p&gt;

&lt;p&gt;For me it was my first experience of working with a toxic person. I fell into the "Mother Teresa trap" (as I called it): I thought I could talk to him, work out the issues (I'm the supervisor, after all) and achieve an acceptable level of interaction between him and the team. He wrote quite decent code; he didn't have many comments on the code review. In short, he was a great fit for the Middle+ Developer position; he was a good developer, but he was like a bone in the throat for the team.&lt;/p&gt;

&lt;p&gt;I had thoughts like: "I don't seem to be a fool, and he's a person who hears and listens. I'll try to explain our point of view to him, perhaps he'll meet me halfway." The story dragged on for six months: the guy came, and two months later the team began to complain about his toxicity. I heard them, held a meeting with the guy: we worked out the issues of values, discussed communication in the team. A few more months passed, and there were no improvements. After that, I wanted to say that we should part ways, but he beat me to it and resigned himself.&lt;/p&gt;

&lt;p&gt;So about the "Mother Teresa trap". Its essence is that a novice team leader is trying hard to find a compromise in a situation where the employee deliberately refuses to make contact with the team. There's a certain sense of sacrifice and belief that things will get better. At the same time, you close your eyes to obvious inconsistencies and try (as it seems to you) to "draw the person out" to the very end.&lt;/p&gt;

&lt;p&gt;Another important point: if you're the team leader, you share the values of the team. That means this person demotivates you just as much as the team, yet you try to stay as neutral as possible and set it all aside. At the same time, all your attempts to identify and articulate team values may not succeed: a toxic person will most often understand your values but refuse to accept them. This position confuses novice supervisors, since many assume that understanding and acceptance go together. So the following is needed:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;At the meeting, identify your own and the team's position as concretely as possible, say what the problem is, and stick to the facts.&lt;/li&gt;
&lt;li&gt;Listen to the person's opinion in order to understand how they see the current situation.&lt;/li&gt;
&lt;li&gt;Ask if they're ready to work with it.&lt;/li&gt;
&lt;li&gt;Work through their words, bring them to the surface and make sure you understand each other: "I heard you, I understand this and that, tell me, is that right?"&lt;/li&gt;
&lt;li&gt;If the person is willing to work on it, then agree on a deadline. Usually one month is enough.&lt;/li&gt;
&lt;li&gt;After that, look at the results: if nothing changes, you need to part ways.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In some situations, everything ends at the third stage—when you ask if the person is ready to work towards the acceptance of team values. Because toxicity in the team is precisely the rejection of the values of the team; toxic colleagues show it directly, which makes the team members push them away, and adds pressure to the team atmosphere. Here there's psychological pressure, and an unwillingness to build a constructive dialogue with others, and fundamental disagreement with your position. Sometimes it's important for them to demonstratively broadcast their point of view. And there's a lot of personality in toxicity; it's not just a work dispute that can be resolved painlessly. &lt;/p&gt;

&lt;p&gt;According to Tuckman, the appearance of a new member on the team (including a toxic one) is a forming stage that quickly turns into storming (in general, any newcomer returns the team to the forming stage); this is where the clash of values occurs. It's better not to linger at this stage: if the team has been storming for more than six months, the situation can become critical. It will take a long time to get out of it, and after a prolonged storming, team members are often apathetic, sluggish, and closed to change.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Can a New Team Leader Not Go Crazy
&lt;/h2&gt;

&lt;p&gt;If you're an emotional person, then you need some kind of emotional release in addition to work. Sports and simple introspection techniques (for example, keeping a diary) can help, as can proper nutrition. During the case described above, I was on a strict diet and lost 15 kg in a month and a half. It was hard; one thing piled on top of the other. &lt;/p&gt;

&lt;p&gt;One of the simplest and surest solutions is to discuss problems (including emotional ones) with your managers. They are able to support you, dispel your worries, and together you can make a plan for further actions. Remember that even the most experienced specialist almost always becomes a novice manager upon a promotion. Don't be afraid to learn from your supervisors. This won't be a sign of weakness; on the contrary, it'll show your desire for development and purposefulness.&lt;/p&gt;

&lt;p&gt;If you have experts in the field of coaching and psychology in your company (we have an HR Business Partners team at Quadcode), then you can work through difficult cases with them.&lt;br&gt;
It may also be worth spending more time with friends and family, unless communication with them is emotionally intense itself. If you feel that you're not coping on your own, don't hesitate to contact a psychoanalyst; this is completely normal.&lt;/p&gt;

&lt;p&gt;And praise and pamper yourself more often, but exclude coffee, energy drinks and everything else that agitates your nervous system (it will get enough of that without stimulants).&lt;/p&gt;

&lt;p&gt;If you're a less emotional person, then most likely everything will go smoothly for you. But there's another point here: you need to pay more attention to the team, monitor its emotional background, try to stand up for people, understand and support them. Without this work, there's a great risk of losing teammates. In management, it's people first. &lt;/p&gt;

&lt;p&gt;That's it, I'll be happy to answer questions in the comments.&lt;/p&gt;

</description>
      <category>leadership</category>
      <category>management</category>
      <category>career</category>
    </item>
    <item>
      <title>Fintech in Practice: How Quadcode Develops Technology for Trading and Banking</title>
      <dc:creator>Quadcode team</dc:creator>
      <pubDate>Tue, 25 Oct 2022 13:30:15 +0000</pubDate>
      <link>https://forem.com/quadcode/fintech-in-practice-how-quadcode-develops-technology-for-trading-and-banking-334e</link>
      <guid>https://forem.com/quadcode/fintech-in-practice-how-quadcode-develops-technology-for-trading-and-banking-334e</guid>
      <description>&lt;p&gt;Hello, DEV community! I am Aleksandr Ivanov, CTO at Quadcode. We came here to show the fintech “kitchen”; we’ve been at it for 8 years, so we can already share our experience. In our blog we will talk about architectures, technologies, tools and management.&lt;/p&gt;

&lt;p&gt;This post is the first on the list and can be considered an introduction. I will tell you about the structure of our team, about Quadcode products — a platform for trading, banking and our internal development tools, as well as about our first steps towards an IPO.&lt;/p&gt;

&lt;h2&gt;
  
  
  Our Team
&lt;/h2&gt;

&lt;p&gt;The Quadcode team has been working in fintech for eight years. The company’s goal is to create user-friendly tools for B2B clients from all over the world.&lt;/p&gt;

&lt;p&gt;In development we are guided by Agile principles, and in general we lean toward flexible methodologies. They let us balance the speed and quality of product development, so our development organization is structured as a set of Agile trains.&lt;/p&gt;

&lt;p&gt;A train is a set of agile teams that work on a specific business area of the company. We launch trains for long-term roadmaps, when we are sure that the implementation of tasks will lead to business development and growth of business metrics. The train includes all the specialists that are needed to implement projects and features in a particular product or infrastructure area, for example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Developers.&lt;/li&gt;
&lt;li&gt;QA engineers.&lt;/li&gt;
&lt;li&gt;Team leaders.&lt;/li&gt;
&lt;li&gt;Analysts.&lt;/li&gt;
&lt;li&gt;Product managers.&lt;/li&gt;
&lt;li&gt;Designers.&lt;/li&gt;
&lt;li&gt;Technical writers. &lt;/li&gt;
&lt;li&gt;HR business partners. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now we have 260+ experienced specialists in the development department. We rarely take on juniors, but every year it becomes more and more difficult to find experienced professionals, so we are looking more and more in that direction.&lt;/p&gt;

&lt;p&gt;A roadmap is built for each product and can be made for different time periods: six months, a year, three years, etc. From ready-made product roadmaps we build a general plan: when, what features and for what products should be made. A roadmap, as we understand it, is a link between business, product and development.&lt;/p&gt;

&lt;p&gt;Every quarter there is an important event in the life of the company: quarterly planning. From the overall timeline, we pick out the features that need to be implemented first. The result is a realistic plan of what we can actually do with current resources. We don’t welcome crunch mode, so we learn to keep our zen and find the optimal balance between work and personal life. Every employee can open and view the company’s roadmap to properly plan and allocate their time to important and interesting tasks.&lt;/p&gt;

&lt;p&gt;We work remotely or in our offices in Cyprus, the United Arab Emirates, Australia and the United Kingdom; everyone chooses for themselves.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--z96mVwE8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/i3flij8lwzdrrrvv496m.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--z96mVwE8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/i3flij8lwzdrrrvv496m.jpeg" alt="Quadcode office life" width="880" height="704"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Technological Stack
&lt;/h2&gt;

&lt;p&gt;Our main languages are Golang and C++. Among additional technologies: on the back end, PHP, Python and NodeJS; on the front end, JavaScript (ReactJS); in analytics, Python and Scala; and Java for autotests.&lt;/p&gt;

&lt;p&gt;The infrastructure in the company is hybrid. We rent our own servers in data centers. We try to run all stateless applications in Kubernetes when nothing prevents it, although sometimes something does. The Kubernetes clusters also predominantly run on our servers. Things that require guaranteed resources, such as heavily loaded databases, we run on “hardware”. Of course, we also use the cloud where it’s useful, for example, for tasks where you need to process a large amount of data to produce a report for a customer: you need the resources temporarily for the analysis, but not once you have the result.&lt;/p&gt;

&lt;p&gt;We also apply technologies chosen to solve specific problems. For example, our desktop application for Windows, Mac and Web is written in C++ with a single shared code base. Here C++ gives us cross-platform portability and excellent graphics rendering performance. However, we hardly ever use C++ for back-end development because it is expensive; our main back-end language is Go. At the same time, we do not use Go as a testing tool. For that we use Java, because it is much more convenient and has practically become an industry standard for test automation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Our Products
&lt;/h2&gt;

&lt;p&gt;Our flagship product is the trading platform. In seven years of development, the number of platform users has grown from 950,000 to 88 million in 150+ countries.&lt;/p&gt;

&lt;p&gt;Since 2020, we have been developing the trading platform as a SaaS solution, on the basis of which anyone can organize their own broker. And we already have our first clients in this area. In addition, we plan to enter the American and Australian markets. To do this, we are updating the platform taking into account all the technical and legal requirements of these regions.&lt;/p&gt;

&lt;p&gt;And now for a brief summary of our products.&lt;/p&gt;

&lt;h3&gt;
  
  
  SaaS Trading Platform
&lt;/h3&gt;

&lt;p&gt;The team built the platform from scratch; it maintains 99.5% uptime and has powered a successfully operating broker for more than seven years.&lt;/p&gt;

&lt;p&gt;The platform provides clients for Windows, macOS, Android, and iOS, as well as a web trade room.&lt;/p&gt;

&lt;p&gt;On the platform you can trade the following instruments:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CFD.&lt;/li&gt;
&lt;li&gt;Forex.&lt;/li&gt;
&lt;li&gt;Digital options.&lt;/li&gt;
&lt;li&gt;FX options.&lt;/li&gt;
&lt;li&gt;CFDs on crypto.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Y9VMI7J4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/2ad0a7gdqw981d5of557.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Y9VMI7J4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/2ad0a7gdqw981d5of557.png" alt="Fully operational trading platform: web trade room, Windows desktop app, MacOS desktop app, Android APK file, Google Play app, progressive web app" width="880" height="347"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The main programming language for platform development is Golang. The platform started as a monolith built on a classic stack for its time: PHP + PostgreSQL + Redis + JS.&lt;/p&gt;

&lt;p&gt;After 3 years of operation, we decided to switch to a microservice architecture, as the monolith was no longer flexible and could not provide the necessary development pace. With the migration to a microservice architecture, we also left PHP for Go, and we don’t regret it.&lt;/p&gt;

&lt;p&gt;Go is an excellent language that delivers good performance with a fairly low barrier to entry. For microservice development it is close to a perfect fit.&lt;/p&gt;

&lt;p&gt;As I said before, since 2020 our platform has been developing as a SaaS solution. Based on the solution, anyone can organize their own broker without much effort; everything is in a box: trading service, KYC procedures, billing, support, CRM. In short, everything to get the business off to a quick start. Any new broker can be up in a month. To provide variability in functionality, we develop a flexible system of modules for the SaaS solution.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--i6NjsICv--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ersibren9g0i2mblxvya.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--i6NjsICv--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/ersibren9g0i2mblxvya.png" alt="SaaS as Pizza" width="880" height="493"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--SiMGcIhd--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/5e3t0bttalrbibuah2ef.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--SiMGcIhd--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/5e3t0bttalrbibuah2ef.png" alt="What is SaaS" width="880" height="499"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The Quadcode team provides internal risk management, back office and antifraud services, so the broker doesn’t have to think about all that. As a result, the client gets a fully working cross-platform solution. We also solved the registration and authorization issue. All IT maintenance rests on our shoulders. We help with building a legal structure and much more.&lt;/p&gt;

&lt;p&gt;Now we want the platform ecosystem to have the widest possible range of instruments: Forex, CFD and investment products in a user-friendly form. The ideal option is to make the platform suitable for both banks and their customers. We assemble the product puzzle from the smallest details. It’s not a quick process, either legally or technologically, but so far it’s working out. &lt;/p&gt;

&lt;p&gt;Examples of the challenges facing the team this year:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Platform configurator. As an example, let’s take the BMW car assembly configurator. You can go to the site and assemble “your” car, that is, choose the color, rims, additional features. We want to make exactly the same functionality, only with the ability to build your own broker right on the site.&lt;/li&gt;
&lt;li&gt;The development of our own Margin Forex &amp;amp; MCFD engine.&lt;/li&gt;
&lt;li&gt;Churn prediction (our Prediction Churn feature). It is based on data analysis and predicts the moment when a user is about to leave. The prediction is currently 82% accurate. When the system flags that a user is ready to leave the platform, managers step in to create a comfortable environment for the trader, which extends the period of work with them. The more data we accumulate, the more accurate the prediction becomes, and the better we can keep in touch with the user.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Banking
&lt;/h3&gt;

&lt;p&gt;This is our second product. At the heart of the product line is our own licensed financial services provider, registered in the UK. The product provides the following features to B2B and B2C customers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Remote onboarding for individuals and legal entities.&lt;/li&gt;
&lt;li&gt;Access to the account via mobile application and online banking.&lt;/li&gt;
&lt;li&gt;Multicurrency accounts in IBAN format.&lt;/li&gt;
&lt;li&gt;SEPA, FPS and SWIFT transfers.&lt;/li&gt;
&lt;li&gt;Issuance of plastic and virtual cards.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The technology stack is classic: the core of the system runs on Java, with PHP and JS used for administrative interfaces and web applications.&lt;/p&gt;

&lt;p&gt;In the near future we plan to add new local and alternative payment methods, build additional banking products, and launch partnership programs, including reward and bonus programs for existing and potential customers. The expanded functionality will let us build an extensive network of channels, simplifying cross-border payment processing around the world and offering customers the most comprehensive range of services.&lt;/p&gt;

&lt;h3&gt;
  
  
  Internal developments
&lt;/h3&gt;

&lt;p&gt;In addition to projects directly tied to profits and clients, we invest in developing our own solutions to provide a comfortable and flexible professional environment for employees to work in.&lt;/p&gt;

&lt;p&gt;Among the most interesting ones are the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;IQ Bus. We practise microservice architecture, and when the question arose of how to provide communication between microservices, we decided to create our own solution, the IQ Bus. It is a data bus that abstracts services from the transport layer and gives them a simple unified protocol for communication.&lt;/li&gt;
&lt;li&gt;Sandbox. In complex multi-component systems, and in our case systems with a large number of services, testing is always a problem: you need a reproducible environment. Early on we created the Sandbox system, which builds copies of the platform with various configurations. It is a kind of constructor: you choose the functionality you need, the build is created, the necessary microservices are launched, and you can test. It all runs on Docker + Kubernetes.&lt;/li&gt;
&lt;li&gt;Central Information System. There is always a need for a tool that integrates all the systems of a company, covering not just development but also personnel records management, HR, and finance. Such a system should answer questions like: what is team “A”, who are its employees and its manager, what is its payroll, and what did it do in the last quarter? On top of that come all sorts of individual needs. Finding an off-the-shelf product that covers everything is quite problematic, and such systems tend to look monstrous; SAP is a good example. So we invest in developing our own system, one that meets the needs of various departments and integrates with our other systems: GitLab, the task tracker, and the financial systems.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  In lieu of a conclusion
&lt;/h2&gt;

&lt;p&gt;We have come a long way developing a SaaS solution and launching a new banking product, and a few more important goals lie ahead. With strategic planning in place, we are moving towards being present in all markets, doubling EBITDA, and going for an IPO.&lt;/p&gt;

&lt;p&gt;In future articles, we’ll talk in more detail about our approach to development, planning, and working with teams. &lt;/p&gt;

</description>
      <category>fintec</category>
      <category>saas</category>
    </item>
  </channel>
</rss>
