Building a data tech-stack

A breakdown of the building blocks of a data stack


Data stack

A data stack is a set of tools and technologies that moves, stores, and provides access to data in a business, and transforms that data into actionable insights.

Components of a data stack

  1. Data source: This is where the data originates. Businesses usually have numerous sources of data, such as an internal database, real-time measurements from physical equipment or tools, or data scraped from the web.
  2. Data pipeline: This is a set of tools that ingests data from the source and moves it through a series of steps to its destination.
  3. Data storage: This is where the data from the various sources is stored. It is usually a data warehouse, which can hold data from many sources in one place.
  4. Data modelling and transformation: These are tools that take the raw data stored in the data warehouse and convert it into user-friendly models.
  5. Data analytics: This is where the data that has been collected, structured, and modelled is turned into actionable insight.
  6. Data activation: This is also known as reverse ETL. It is the process of making the data operational by moving it out of the data warehouse, validating it, and loading it into applications or third-party business tools.
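
To make the flow between the six components concrete, here is a minimal sketch in Python. Everything in it is a hypothetical stand-in chosen for illustration: a hard-coded list plays the data source, a dict plays the data warehouse, and another dict plays a third-party business tool. A real stack would use dedicated tools at each stage.

```python
from collections import defaultdict


def data_source():
    """1. Data source: raw records, here a hard-coded list of orders."""
    return [
        {"order_id": 1, "customer": "acme", "amount": "120.50"},
        {"order_id": 2, "customer": "globex", "amount": "80.00"},
        {"order_id": 3, "customer": "acme", "amount": "19.50"},
    ]


def pipeline(records, warehouse):
    """2. Data pipeline: move records from the source into storage unchanged."""
    warehouse["raw_orders"] = list(records)


def transform(warehouse):
    """4. Modelling and transformation: build a per-customer revenue model."""
    totals = defaultdict(float)
    for row in warehouse["raw_orders"]:
        totals[row["customer"]] += float(row["amount"])
    warehouse["customer_revenue"] = dict(totals)


def analytics(warehouse):
    """5. Data analytics: derive an insight, here the top customer by revenue."""
    revenue = warehouse["customer_revenue"]
    return max(revenue, key=revenue.get)


def activate(warehouse, crm):
    """6. Data activation: validate modelled rows, then load them into an app."""
    for customer, total in warehouse["customer_revenue"].items():
        if total >= 0:  # simple validation rule, chosen only for illustration
            crm[customer] = {"lifetime_revenue": total}


warehouse = {}  # 3. Data storage: a dict standing in for a data warehouse
crm = {}        # a dict standing in for a third-party business tool

pipeline(data_source(), warehouse)
transform(warehouse)
top_customer = analytics(warehouse)
activate(warehouse, crm)

print(top_customer)
print(crm)
```

Note that the raw orders stay in storage untouched. The modelled table is built alongside them, so the transformation can be rerun or changed without going back to the source.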

Data platform

A data platform is the integrated set of technologies that implements the data stack as working infrastructure. It is usually depicted as a diagram that shows how each component of the data stack relates to the others.

Data architecture

Data architecture is the framework that describes the underlying computer systems powering the data stack. It is a plan for ingesting, storing, and delivering the data.

A diagrammatic representation of a basic data architecture

[Figure: Data Architecture]

  1. Cloud data warehouses (DWH): Traditional data warehouse solutions are gradually giving way to cloud solutions. This is because cloud solutions offer faster execution of SQL queries, easier connections between data sources and the data warehouse, and accessibility and usability for all users. They are also more affordable, flexible, and scalable than traditional solutions.
  2. Transition from ETL (Extract-Transform-Load) to ELT (Extract-Load-Transform): In a modern data stack, data is loaded into the master database first and transformed afterwards using cloud ELT solutions.
  3. Self-service analytics solutions: These are easy-to-use solutions designed for business users to do business intelligence, generate reports, and build other data visualizations.
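
The ELT ordering in item 2 can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for a cloud data warehouse, and the table names `raw_orders` and `daily_revenue` and the sample rows are made up. The point is the order of operations: the raw rows are loaded as-is, and cleaning and aggregation happen afterwards, in SQL, inside the database.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for a cloud data warehouse.
conn = sqlite3.connect(":memory:")

# Extract: rows as they arrive from a hypothetical source, still untyped text.
raw_rows = [
    ("2024-01-05", "acme", " 120.50 "),
    ("2024-01-05", "globex", "80.00"),
    ("2024-01-06", "acme", "19.50"),
]

# Load: store the raw rows first, without cleaning them.
conn.execute(
    "CREATE TABLE raw_orders (order_date TEXT, customer TEXT, amount TEXT)"
)
conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", raw_rows)

# Transform: clean and aggregate inside the database, using SQL.
conn.execute(
    """
    CREATE TABLE daily_revenue AS
    SELECT order_date,
           SUM(CAST(TRIM(amount) AS REAL)) AS revenue
    FROM raw_orders
    GROUP BY order_date
    """
)

daily_revenue = conn.execute(
    "SELECT order_date, revenue FROM daily_revenue ORDER BY order_date"
).fetchall()
raw_count = conn.execute("SELECT COUNT(*) FROM raw_orders").fetchone()[0]

print(daily_revenue)
print(raw_count)
```

In an ETL flow, the trimming and casting would happen in application code before the insert, and only the cleaned rows would reach the database. Here the untouched `raw_orders` table remains available for other transformations.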

Tools used in each stage of a modern data stack

The following is a list of technologies that could be used to build a data stack at each stage:

  • Data ingestion: Fivetran, Stitch
  • Data storage/data warehouse: Snowflake, BigQuery, Redshift, Azure Synapse Analytics
  • Data modelling and transformation: dbt, LookML, Matillion
  • Data analytics: Power BI, Looker, Tableau, Chartio, Metabase
  • Reverse ETL: Census, Hightouch

[Figure: Modern data stack. Photo by: Dataiku]
