CodingBlocks

Multi-Value, Spatial, and Event Store Databases

We are mixing it up on you again: no Outlaw this week, but we can offer you some talk of exotic databases. Also, Joe pronounces everything correctly and Allen leaves you with a riddle.

The full show notes are available on the website at https://www.codingblocks.net/episode229

News

  • Thanks for the reviews!
    • ivan.kuchin (has taken the lead!), Yoondoggy, cykoduck, nehoraigold
    • Want to help us out? Leave a review! (reviews)

Multivalue DBMS

  • Popular: 86. Adabas, 87. UniData/UniVerse, 147. jBASE
  • Similar to RDBMS – store data in tables
    • Store multiple values to a particular record’s attribute
      • Some RDBMSs can do this as well, BUT storing an array on an attribute is typically the exception rather than the rule
      • In a MultiValue DBMS, that’s how you SHOULD do it
      • Part of the reason it’s done this way is that these database systems are not optimized for JOINs
    • Looked at the Adabas and UniData sites – the primary selling points seem to be rapid application development, ease of learning and getting up to speed, and data modeling that closely mirrors your application data structures
  • I BELIEVE it’s a schema on write (docs.rocketsoftware.com)
  • Supposed to be very performant as you access the data the way your application expects it
  • Per the docs, it’s easy to maintain (Wikipedia)
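
To make the multi-value idea a bit more concrete, here is a minimal Python sketch. It is not tied to Adabas, UniData, or jBASE; the record layout and field names are hypothetical. It contrasts one record that holds several values on an attribute with the two-table shape a relational design would usually JOIN.

```python
# Hypothetical customer data; the field names are illustrative only.

# Multi-value style: one record, and the "phones" attribute holds many values.
multivalue_record = {
    "id": 1,
    "name": "Ada",
    "phones": ["555-0100", "555-0101"],
}

# Relational style: the phones live in a second table keyed by customer id.
customers = [{"id": 1, "name": "Ada"}]
customer_phones = [
    {"customer_id": 1, "phone": "555-0100"},
    {"customer_id": 1, "phone": "555-0101"},
]


def phones_via_join(customer_id):
    """Emulate the JOIN a relational design needs to collect the phones."""
    return [row["phone"] for row in customer_phones
            if row["customer_id"] == customer_id]


def phones_multivalue(record):
    """Multi-value read: the values arrive with the record, no join needed."""
    return record["phones"]
```

Both functions return the same phone numbers for customer 1; the difference is that the multi-value read never has to touch a second table.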

Spatial DBMS

  • Popular: 29. PostGIS, 59. Aerospike, 136. SpatiaLite
  • Provides the ability to efficiently store, modify, and query spatial data – data that appears in a geometrical space (maps, polygons, etc)
  • Generally have custom data types for storing the spatial data
  • Indices that allow for quick retrieval of spatial data in relation to other spatial data
  • Also allow for performing spatial-specific operations on data, such as computing distances, merging or intersecting objects or even calculating areas
  • Geospatial data is a subset of spatial data – it represents places / spatial data on the Earth’s surface
  • Spatio-temporal data is another variation – spatial data combined with timestamps
  • PostGIS – basically a plugin for PostgreSQL that allows for storing of spatial data
    • Additionally supports raster data – data for things like weather and elevation
    • If you want to learn how to use it and understand the data and what’s stored, see postgis.net
      • Spatial data types are: point, line, polygon, and more…basically shapes
      • Rather than using B-tree indexes that sort data for fast retrieval, spatial indexes are built on bounding boxes – rectangles that identify what is contained within them
        • Typically accomplished with R-Tree and Quadtree implementations
    • Redfin – a real estate competitor to realtor.com and others, uses PostgreSQL / PostGIS
    • Quite a bit of software supports OpenGIS, so it may be a good place to start if you’re interested in storing/querying spatial data
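
A rough Python sketch of the bounding-box idea above, assuming plain planar (x, y) coordinates rather than real PostGIS types: a cheap rectangle check filters candidates, then an exact point-in-polygon test confirms them. The function names and sample shapes are made up for illustration, and a real spatial index would keep the boxes in a tree (R-Tree, Quadtree) instead of scanning every shape.

```python
def bounding_box(polygon):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def box_contains(box, point):
    """Cheap test: is the point inside the rectangle?"""
    min_x, min_y, max_x, max_y = box
    x, y = point
    return min_x <= x <= max_x and min_y <= y <= max_y


def point_in_polygon(point, polygon):
    """Exact test via ray casting (points exactly on an edge are not special-cased)."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray going right from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_area(polygon):
    """Planar area via the shoelace formula."""
    n = len(polygon)
    total = 0.0
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def find_containing(shapes, point):
    """Two-step lookup: bounding-box filter first, exact check second."""
    hits = []
    for name, polygon in shapes.items():
        if not box_contains(bounding_box(polygon), point):
            continue  # rejected cheaply, no exact geometry needed
        if point_in_polygon(point, polygon):
            hits.append(name)
    return hits


# Hypothetical shapes: a right triangle and a small square.
shapes = {
    "triangle": [(0, 0), (4, 0), (0, 4)],
    "square": [(5, 5), (7, 5), (7, 7), (5, 7)],
}
```

The point (3, 3) is a useful case to try: it falls inside the triangle's bounding box but outside the triangle itself, which is why the exact check is still needed after the box filter.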

Event Stores

  • Popular: 178. EventStoreDB, 336. IBM DB2 Event Store, 338. NEventStore
  • Used for implementing the concept of Event Sourcing
    • Event Sourcing – an application/data store where the current state of an object is obtained by “replaying” all the events that got it to its current state
      • This contrasts with RDBMSs, which typically store only the current state of an object – historical state CAN be stored, but that has to be implemented explicitly, for example with temporal tables in SQL Server or “history tables”
    • Event stores only support adding new events and querying the order of events
      • Not allowed to update or delete an event
      • For performance reasons, many Event Store databases support snapshots for holding materialized states at points in time
  • EventStoreDB – https://www.eventstore.com/eventstoredb
    • Defined as an “immutable log”
    • Features: guaranteed writes, concurrency model, granular stream and stream APIs
    • Many client interfaces: .NET, Java, Go, Node, Rust, and Python
    • Runs on just about all OSes – Windows, Mac, Linux
    • Highly available – can run in a cluster
    • Optimistic concurrency checks that will return an error if a check fails
    • “Projections” allow you to generate new events based off “interesting” occurrences in your existing data
      • For example, you are looking for how many Twitter users said “happy” within 5 minutes of saying “foo coffee shop” and within 2 minutes of saying “London”
    • Highly performant – 15k writes and 50k reads per second
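
The ideas above (append-only events, replaying to get current state, snapshots, and optimistic concurrency checks) can be sketched in a few lines of Python. This is a toy in-memory store with made-up names (`InMemoryEventStore`, `apply_event`, the bank account events); it is not the EventStoreDB client API.

```python
class ConcurrencyError(Exception):
    """Raised when the expected stream version does not match the actual one."""


class InMemoryEventStore:
    """Toy append-only store for illustration only."""

    def __init__(self):
        self._streams = {}    # stream name -> list of events
        self._snapshots = {}  # stream name -> (version, state)

    def append(self, stream, event, expected_version):
        events = self._streams.setdefault(stream, [])
        # Optimistic concurrency: reject the write if someone appended first.
        if len(events) != expected_version:
            raise ConcurrencyError(
                f"expected version {expected_version}, found {len(events)}")
        events.append(event)  # events are only ever added, never changed
        return len(events)

    def read(self, stream, from_version=0):
        return list(self._streams.get(stream, [])[from_version:])

    def save_snapshot(self, stream, version, state):
        self._snapshots[stream] = (version, state)

    def load_snapshot(self, stream):
        # With no snapshot, start from version 0 and an empty balance.
        return self._snapshots.get(stream, (0, 0))


def apply_event(balance, event):
    """Fold one event into the running account balance."""
    kind, amount = event
    if kind == "deposited":
        return balance + amount
    if kind == "withdrawn":
        return balance - amount
    raise ValueError(f"unknown event type: {kind}")


def current_balance(store, stream):
    """Start from the latest snapshot, then replay only the newer events."""
    version, balance = store.load_snapshot(stream)
    for event in store.read(stream, from_version=version):
        balance = apply_event(balance, event)
    return balance


store = InMemoryEventStore()
store.append("account-1", ("deposited", 100), expected_version=0)
store.append("account-1", ("withdrawn", 30), expected_version=1)
store.save_snapshot("account-1", version=2, state=70)
store.append("account-1", ("deposited", 5), expected_version=2)
```

Here `current_balance(store, "account-1")` only replays the one event written after the snapshot, and a writer that still believes the stream is at version 0 gets a `ConcurrencyError` instead of silently clobbering newer events.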

Tip of the Week

  • If your internet connection is good but your cell phone service is bad, then you might want to consider Ooma. Ooma sells devices that plug into your network or connect wirelessly and provide a phone number and a phone jack, so you can hook up an old-school home telephone. We’ve been using it for about a week now with no problems and it’s been a breeze to set up. The devices range from $99 to $129 and there’s a monthly “premier” plan you can buy with nifty features like a secondary phone line, advanced call blocking, and call forwarding. (ooma.com)
  • Why use “git reset --hard” when you can “git stash -u” instead? Reset is destructive, but stashing keeps your changes just in case you need them. Because sometimes, your “sometimes” is now!
    • 🚫 “git reset --hard”
    • ✅ “git stash -u”