The tables below are another magically created comparison of technologies across full end-to-end pipelines.
I think I actually prefer this view to the overwhelming diagrams shared on social media, plastered with brand logos.
Each flow highlights the potential stages and the optional tools and technologies involved.
For now, they serve as a useful template for viewing the various pipeline options and for future study.
Technology Data Flow
Stage | Path 1 — Microsoft / Fabric | Path 2 — Snowflake + dbt (Cloud-agnostic) | Path 3 — Google Cloud (GCP) |
---|---|---|---|
Sources & Ingestion | Azure Data Factory (ADF); Fabric Dataflows Gen2; Event Hubs / IoT Hub (stream); ADF Copy Activity, REST, ODBC/JDBC | Snowpipe (auto-ingest) + Stages; Fivetran / Stitch / Airbyte; Kafka / Kinesis via connectors; AWS Glue jobs (optional) | Cloud Data Fusion (GUI ETL); Pub/Sub (stream); Dataflow (Beam) ingestion; Storage Transfer / Transfer Service |
Raw Landing / Data Lake | Azure Data Lake Storage Gen2; OneLake (Fabric); Delta/Parquet zones: /raw /bronze | External Stages on S3/Azure/GCS; Internal Stages (Snowflake-managed); Raw files (CSV/JSON/Parquet) | Google Cloud Storage (GCS); Raw buckets (landing); Formats: Avro/Parquet/JSON |
Orchestration | ADF Pipelines & Triggers; Fabric Pipelines; Azure Functions (events); Azure DevOps/GitHub Actions (runs) | Airflow / Dagster / Prefect; Snowflake Tasks & Streams; dbt Cloud scheduler; CI via GitHub Actions | Cloud Composer (Airflow); Workflows / Cloud Scheduler; Dataform (dbt-like) scheduling |
Transform (ELT / ETL) | Fabric Data Engineering (Spark); Azure Databricks (Delta); T-SQL in Fabric Warehouse; Synapse SQL/Spark (legacy) | dbt models (SQL + Jinja); Snowflake SQL (MERGE/Tasks); Snowpark (Python/Scala); Streams for CDC | BigQuery SQL (ELT); Dataflow (Beam) for heavy lift; Dataproc (Spark) when needed; Dataform/dbt for modeling |
Curated / Serving Warehouse | Fabric Warehouse / Lakehouse; Dedicated SQL Pools (Synapse); Delta tables (silver/gold) | Snowflake (Databases/Schemas); Time Travel, Cloning; Materialized Views | BigQuery Datasets; Partitioned & clustered tables; Materialized Views |
Semantic Layer / Modeling | Power BI Datasets (Tabular); Calculation Groups (TE); Row-Level Security (RLS); Power BI Deployment Pipelines | dbt semantic models & metrics; Headless BI (Cube/Virt.); RLS via Snowflake roles/policies; DirectQuery/Live connections | Looker (LookML semantic layer); Looker Explore/Views/Models; BigQuery Authorized Views; Row/column policy tags |
BI / Visualization & Analysis | Power BI (Desktop/Service); Paginated Reports (RDL); Excel over Power BI | Power BI / Tableau / Looker Studio; Sigma / Mode (optional); Embedded analytics | Looker (first-class); Looker Studio (lightweight); Data Catalog-linked exploration |
Data Science / ML | Azure ML (AutoML, MLOps); Databricks ML + MLflow; SynapseML / ONNX | Snowpark ML / UDFs; External: SageMaker / Databricks; Feature Store via Snowflake/Feast | Vertex AI (AutoML, pipelines); BigQuery ML (in-SQL models); Feature Store (Vertex) |
Data Quality / Governance | Microsoft Purview (Catalog/Lineage); Power BI lineage & sensitivity; Great Expectations (optional) | Snowflake RBAC, Tags, Masking; dbt tests, Great Expectations; Monte Carlo/Bigeye (obs.) | Dataplex (governance); Data Catalog (metadata); DQ via Dataform tests / GE |
DevOps / CI-CD & Infra | Azure DevOps / GitHub Actions; Power BI Deployment Pipelines; IaC: Bicep / Terraform | GitHub Actions + dbt CI; schemachange / SnowChange; IaC: Terraform / Pulumi | Cloud Build / Cloud Deploy; Dataform CI, dbt CI; IaC: Terraform |
Monitoring / Cost Control | Azure Monitor / Log Analytics; Fabric Workspace metrics; Cost Mgmt + Budgets | Snowflake Resource Monitors; Query History, Access History; 3rd-party cost dashboards | Cloud Monitoring & Logging; BigQuery INFORMATION_SCHEMA; Budgets + Alerts |
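
For future study, it can help to hold the comparison above in a small lookup structure rather than only as a table. The sketch below is one possible way to do that in Python, transcribing just two of the stages. The names `PIPELINE_PATHS`, `tools_for`, and `shared_tools` are hypothetical, and the path labels are shortened versions of the column headers.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical structure: stage -> path -> tools, transcribed from two rows
# of the technology table. The remaining rows would follow the same pattern.
PIPELINE_PATHS: Dict[str, Dict[str, List[str]]] = {
    "Orchestration": {
        "Microsoft / Fabric": [
            "ADF Pipelines & Triggers",
            "Fabric Pipelines",
            "Azure Functions (events)",
            "Azure DevOps/GitHub Actions (runs)",
        ],
        "Snowflake + dbt": [
            "Airflow / Dagster / Prefect",
            "Snowflake Tasks & Streams",
            "dbt Cloud scheduler",
            "CI via GitHub Actions",
        ],
        "Google Cloud (GCP)": [
            "Cloud Composer (Airflow)",
            "Workflows / Cloud Scheduler",
            "Dataform (dbt-like) scheduling",
        ],
    },
    "Curated / Serving Warehouse": {
        "Microsoft / Fabric": [
            "Fabric Warehouse / Lakehouse",
            "Dedicated SQL Pools (Synapse)",
            "Delta tables (silver/gold)",
        ],
        "Snowflake + dbt": [
            "Snowflake (Databases/Schemas)",
            "Time Travel, Cloning",
            "Materialized Views",
        ],
        "Google Cloud (GCP)": [
            "BigQuery Datasets",
            "Partitioned & clustered tables",
            "Materialized Views",
        ],
    },
}


def tools_for(stage: str, path: str) -> List[str]:
    """Return the tools listed for one stage on one path."""
    return PIPELINE_PATHS[stage][path]


def shared_tools(stage: str) -> List[str]:
    """Return entries that appear under more than one path for a stage."""
    counts = Counter(
        tool for tools in PIPELINE_PATHS[stage].values() for tool in tools
    )
    return sorted(tool for tool, n in counts.items() if n > 1)


if __name__ == "__main__":
    print(tools_for("Orchestration", "Google Cloud (GCP)"))
    print(shared_tools("Curated / Serving Warehouse"))
```

A structure like this makes it easy to ask simple questions of the comparison, such as which entries two paths have in common at a given stage.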
Code Data Flow
Stage | Microsoft / Fabric | Snowflake + dbt | Google Cloud (GCP) |
---|---|---|---|
Ingestion Code | Python ETL (requests, pyodbc); ADF / Fabric pipeline JSON; Dataflow Gen2 JSON | CREATE PIPE / CREATE STAGE; Airbyte / Fivetran configs (YAML); COPY OPTIONS | Apache Beam (Py/Java); Cloud Data Fusion JSON; Pub/Sub schema JSON |
Raw Landing Config | ADLS / OneLake folder layout; Parquet / Delta write options; Access policies (JSON) | Stages & File format DDL; CSV / JSON / Parquet; Grants & policies | GCS bucket layout; Lifecycle rules JSON; BQ external table DDL |
Orchestration Code | ADF pipeline JSON + triggers; Fabric Pipeline YAML; Azure Functions (Python) | Airflow DAGs (Python); Prefect flows (Python); Snowflake TASKS SQL | Cloud Composer DAGs (Python); Cloud Scheduler jobs; Dataform schedules |
Transform / Modeling | Databricks notebooks (Py/Spark); Delta Live Tables pipelines; T-SQL stored procs | dbt models (`*.sql`); dbt Jinja macros (`*.sql`); Snowpark (Python) UDFs | BigQuery SQL models (`*.sql`); Dataform/dbt `*.sqlx` + yaml; Dataproc Spark notebooks |
CDC / Merge to Curated | MERGE INTO (T-SQL); PySpark notebook jobs; Delta OPTIMIZE/VACUUM | MERGE INTO `curated.*` SQL; Streams for CDC; Materialized Views | MERGE INTO USING staging; Partition / Cluster DDL; Stored procedures |
Semantic Layer | Tabular model (TMDL); Calc groups (TE script); RLS DAX expressions | dbt semantic models (YAML); metrics.yaml / exposures; Masking policies (SQL) | LookML view/model files; Explores & joins; Policy tags |
BI / Report Code | Power BI PBIX / PBIT; Paginated RDL XML; PowerQuery M scripts | Tableau / Power BI; BI SQL views; Sigma workbooks | Looker dashboards (lkml); Looker Studio reports; BQ UDFs (JS) |
Data Science Code | Azure ML notebooks (Python); MLflow tracking code; ONNX export | Snowpark-ML notebooks (Py); UDF registration SQL; MLflow registry | Vertex AI notebooks (Python); BQML CREATE MODEL SQL; Vertex pipelines (YAML) |
Tests & Data Quality | Great Expectations suites; Power BI model tests (DAX); Custom pytest checks | dbt tests (schema.yml); Great Expectations suites; SQL anomaly checks | Dataform tests (assertions); Great Expectations in Beam; INFORMATION_SCHEMA queries |
CI/CD Config | GitHub Actions YAML; Power BI Deployment Pipelines; Bicep steps | dbt Cloud job YAML; GitHub Actions for dbt; Terraform scripts | Cloud Build YAML; BQ deploy scripts; Terraform modules |
Infra as Code | Bicep / Terraform templates; Azure DevOps variable groups | Terraform (Snowflake provider); SnowChange / schemachange | Terraform (GCS, BQ, VPC); IAM/Secrets configs |
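
To make the "CDC / Merge to Curated" and "Tests & Data Quality" rows a little more concrete, here is a minimal sketch in Python. It does not use any of the platforms in the table: SQLite's `INSERT ... ON CONFLICT DO UPDATE` (via the standard-library `sqlite3` module) stands in for a warehouse `MERGE INTO`, and a plain SQL count stands in for a not-null style data test. The table names `staging_customers` and `curated_customers`, the columns, and the function names are all hypothetical.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with hypothetical staging and curated tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE staging_customers (customer_id INTEGER, email TEXT);
        CREATE TABLE curated_customers (customer_id INTEGER PRIMARY KEY, email TEXT);
        INSERT INTO curated_customers VALUES (1, 'old@example.com'), (2, 'b@example.com');
        INSERT INTO staging_customers VALUES (1, 'new@example.com'), (3, 'c@example.com');
        """
    )
    return conn


def merge_staging_into_curated(conn: sqlite3.Connection) -> None:
    """Upsert staging rows into curated: update matching keys, insert new ones.

    This is a SQLite upsert used as a stand-in for MERGE INTO, not the
    syntax of any warehouse named in the table.
    """
    conn.execute(
        """
        INSERT INTO curated_customers (customer_id, email)
        SELECT customer_id, email FROM staging_customers
        WHERE true  -- SQLite needs a WHERE clause here to parse ON CONFLICT
        ON CONFLICT(customer_id) DO UPDATE SET email = excluded.email
        """
    )
    conn.commit()


def not_null_failures(conn: sqlite3.Connection) -> int:
    """Count curated rows with a missing email (zero means the check passes)."""
    row = conn.execute(
        "SELECT COUNT(*) FROM curated_customers WHERE email IS NULL"
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = build_demo_db()
    merge_staging_into_curated(conn)
    rows = conn.execute(
        "SELECT customer_id, email FROM curated_customers ORDER BY customer_id"
    ).fetchall()
    print(rows)
    print("rows failing not-null check:", not_null_failures(conn))
```

The same shape (land in staging, merge on a key, then run a check that counts failing rows) is what each of the three paths expresses in its own SQL dialect and test tooling.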