Forem

CodingBlocks

Picking the Right Database Type – Tougher than You Think

You asked, we listened! A request from one of our Slack channels was to go over the various types of databases and why you might choose one over another. Join us in another information filled episode where Joe won’t be attending the event he’s been promoting and Allen tries to keep his voice together for the entirety of the episode, and almost succeeded.

News

Reviews

  • iTunes: ivan.kuchin, MikeW717
  • Spotify: Darren Pruitt, chutney3000

Upcoming Events

Miscellaneous

  • Kudos to Dell Support on their monitors
  • The Cat 8 journey will be beginning soon
  • Home offices – random desires

Database Types

Primary resource we used

Some terminology we’ll be using

  • Schema on write – the schema for the data is determined before writing the record
  • Schema on read – the schema of the data is understood by the client using the data

Relational DBMS

  • Popular – 1. Oracle, 2. mySQL, 3. Microsoft SQL Server, 4. PostgreSQL, 8. IBM DB2, 9. Snowflake, 11. Microsoft Access
  • Schema on write
  • Primary language / form of access is SQL
  • Schema is defined by named tables with named columns and specific data types
  • Data exists as rows in the table that conform to the columns/types that are defined in the schema
  • Scalability – typically vertical scaling (increasing available CPU/RAM) is the preferred way
  • Can be very performant but requires knowledge on how to index and store data properly
    • Even with excellent design and indexing, performance can suffer as size of data grows
  • Some fun Instragram posts on scaling their databases

Key-value stores

  • Popular: 6. Redis, 15. Amazon Dynamo DB, 27. Azure Cosmos DB, 35. Memcached, 54. etcd
  • Schema on read
  • No real language – usually an API to put and get documents
  • Depending on the key value store, complex data structures may be stored and ability to query in various ways
  • Scalability – horizontally scalable – massively
  • Very performant
  • Many have built in extended functionality beyond looking up by a single key – for instance, Redis allows search engine type of filtering
  • Why’s Hadoop not on the list? 
    https://db-engines.com/en/blog_post/16

Document Stores

  • Popular: 5. MongoDB, 15. Amazon DynamoDB, 17. Databricks, 27. Azure Cosmos DB, 34. Couchbase
  • Schema on read
  • DBMS specific querying – usually offer a SQL capability but often times is not the most powerful way to query the data
  • Documents do not need to conform to any schema
    • Multiple documents in the same collection can have completely different fields/properties, OR they have have the same properties with different data types
    • Documents can contain collections in fields or even nest other documents
    • Typically stores data in JSON like documents
  • Can be very performant but may require care to create proper indexes, manage connections, etc

Time Series DBMS

Graph DBMS

Search engine

  • Popular: 7. Elasticsearch, 14. Splunk, 24. Solr, 40. OpenSearch, 58. MarkLogic
  • Extensions of NoSQL databases
  • Schema on read
  • Complex search expressions
  • Full text search
  • Stemming – reducing words to their root forms so that searches can be more accurate with similar word searches
  • Ranking and grouping of search results
  • Built for scalability
  • Incredibly performant for the use case
  • Not great with relationship data
  • Why choose over something like a relational or document database?

Resources

Tips of the Episode

Episode source