fivetran and dbt gear up for war
the final chapter of the modern data stack
everyone on the internet is saying the dbt + fivetran merger means the modern data stack is dead. but most people don’t actually know what that means.
so let me tell you a story about the greatest distribution hack in enterprise software history, and how it’s all coming apart.
for years, companies like dbt, fivetran, hex, and sigma had a beautiful deal with snowflake: drive compute consumption, get free access to snowflake’s entire sales force. they plugged into snowflake’s go-to-market and grew faster than any enterprise software company in history.
it was symbiotic. snowflake got massive consumption. the modern data stack got to ride their rocket ship. everyone won.
then the warehouses realized they didn’t need to share anymore.
dbt and fivetran are two of the last ones standing. and they just merged.
this is the final chapter.
the good old days of the modern data stack
here’s what made the dbt/snowflake relationship work for years: dbt builds business logic on top of snowflake. stripping out middle names to do entity matching, deduping, all the work to make data ready. dbt does this inside snowflake, which means every transformation is compute.
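to make "stripping out middle names" and "deduping" concrete, here is a minimal sketch of that kind of cleanup. in practice a dbt model would usually express this as sql running inside the warehouse; the python below only illustrates the logic, and the function names, record fields, and sample rows are all made up for the example.

```python
import re


def normalize_name(full_name: str) -> str:
    """Lowercase, drop punctuation, and keep only the first and last name."""
    parts = re.sub(r"[^a-z\s]", "", full_name.lower()).split()
    if len(parts) <= 2:
        return " ".join(parts)
    # Anything between the first and last token is treated as a middle name.
    return f"{parts[0]} {parts[-1]}"


def dedupe(records: list[dict]) -> list[dict]:
    """Keep the first record seen for each (normalized name, email) key."""
    seen = set()
    unique = []
    for rec in records:
        key = (normalize_name(rec["name"]), rec["email"].strip().lower())
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Hypothetical sample rows, not real data.
customers = [
    {"name": "Ada Marie Lovelace", "email": "ada@example.com"},
    {"name": "ada lovelace", "email": "ADA@example.com "},
    {"name": "Grace B. Hopper", "email": "grace@example.com"},
]
cleaned = dedupe(customers)
print(len(cleaned), "unique customers")
```

every one of those passes over the data is a query the warehouse bills for, which is the point of the paragraph above.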
tristan handy said the quiet part out loud on a hex podcast: “my rough estimate is that a dbt customer pays dbt about 10% of the compute that dbt is driving on the downstream platform.”
every $1 to dbt means $10 to snowflake. i’ve heard numerous snowflake account execs confirm that while dbt only recently crossed $100m in revenue, it’s driving $500m in snowflake costs. that’s why snowflake reps love to attach dbt to all of their deals, and why the dbt community grew so massively. snowflake account execs were basically dbt’s commission-only sales team. except instead of splitting the money, dbt got roughly 10% and snowflake kept the rest. very generous partnership structure.
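a quick back-of-the-envelope check of the split, taking the two sets of figures quoted above at face value. the helper name is made up for illustration; the only inputs are the $1-to-$10 ratio and the $100m / $500m numbers.

```python
def share_of_total(vendor_dollars: float, warehouse_dollars: float) -> float:
    """Fraction of the combined spend that ends up with the vendor."""
    return vendor_dollars / (vendor_dollars + warehouse_dollars)


# "every $1 to dbt means $10 to snowflake"
handy_ratio = share_of_total(1, 10)

# "$100m in revenue ... $500m in snowflake costs"
ae_figures = share_of_total(100e6, 500e6)

print(f"dbt share at 1:10     -> {handy_ratio:.1%}")   # about 9.1%
print(f"dbt share at 100m:500m -> {ae_figures:.1%}")   # about 16.7%
```

either way, the large majority of the combined spend lands with the warehouse.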

for years, this was beautiful. symbiotic. dbt got to grow fast on snowflake’s sales engine. snowflake got massive consumption. everyone won.
but this wasn’t just dbt. the entire modern data stack was built on this transaction: drive compute to snowflake, and snowflake’s sales reps will get you users. you’ll grow faster than the non-modern-data-stack companies ever could.
metaplane built a more open/plg version of alteryx.
dbt and fivetran built more open/plg versions of informatica and talend.
atlan and datahub built more open/plg versions of alation and collibra.
hex and lightdash built more open/plg versions of looker and tableau.
all of these companies gladly gave up the need for top-down sales and let snowflake’s sales force drive huge amounts of volume to them. the deal: deliver more compute to snowflake (more inefficient and lucrative compute that helped snowflake gouge customers) and you could tap into the entire snowflake sales force.
the more you gouged customers for, the more snowflake reps would shill your stuff.
the bi layer took this to its logical extreme. sigma, hex, and omni built their entire go-to-market around being ruthlessly inefficient compute drivers. they weren’t just driving compute as a side effect - they were optimizing for it.
f500 data teams all know the UI premium they’re paying on their compute, and they always run:
most expensive workloads in sagemaker (cost efficient)
medium workloads in snowflake (convenience)
cheapest workloads in hex (just for the ui)
when asked why not run everything in hex: “i’d rather, but it would be way too expensive. like 100x more expensive for the same workload.”
the bi tools were competing with each other to see who could make queries more expensive. inefficient compute = more snowflake consumption = more sales force support.
this lasted exactly as long as cloud budgets kept expanding. then the music stopped.
snowflake and databricks commoditize the modern data stack
snowflake and databricks saw the game clearly: own the C-suite relationships, commoditize everything else.
the more fivetrans and dbts and lookers that exist, the better. each one drives more data into the warehouse, transforms it, queries it, sends it back out. the entire stack becomes a consumption machine. but here’s the genius: it also means each dbt and fivetran is entirely replaceable, forced to compete each other’s margins down to nothing. the bi layer was especially good at this: tools like hex and sigma made queries intentionally expensive because inefficient compute meant more snowflake consumption, which meant more sales force support.
dbt cannot charge for model execution because snowflake reps would immediately switch to shilling tobiko or dagster if it did
fivetran adding features like cataloging only hurts atlan and secoda, not snowflake
how vcs enabled this: snowflake convinced investors to fund on community metrics instead of revenue. slack member counts, github stars, anything but dollars. community growth became the ultimate vanity metric. ‘sure we have no revenue, but look at our discord!’ said every series b pitch deck. snowflake reps loved this because they could sell to the actual revenue while everyone else played with toys.
the plg trap: these companies went bottoms-up. they sold to directors of data and managers of data science, not the cios and cfos that informatica and alteryx kept close. they got faster growth. but they lost the defensive moat: executive relationships. without those relationships, they had no one in the room when budgets got tight. snowflake and databricks had a monopoly on the C-suite conversations.
vcs bought into this fully. airbyte raised at a $1.5b valuation on $1m in revenue, pure faith in turning community into dollars. the growth rates looked incredible on paper, but they were denominated in slack members, not revenue.
contrast this with informatica, alteryx, and looker:
sold top-down to cios
price points comparable to the data warehouses themselves
executive relationships = defensive moat that prevented teradata from destroying them
the modern data stack gave up that moat for growth velocity.
and it worked beautifully for snowflake. they went from zero to $3b in revenue in a decade. databricks hit $2.4b. the layers above and below stayed commoditized and cheap. everyone got funded, everyone got to grow, but the real margins all pooled in one place: the compute layer. ultimately, the lack of those cio relationships would destroy the companies in the commoditized layers.
the downfall of the modern data stack
rate hikes killed the party. customers have only so much to spend, and snowflake already took it all. every dollar paid to snowflake is a dollar that doesn’t go to dbt, fivetran, or anyone else.
moreover, snowflake’s marketing has convinced everyone:
paying $10m for compute? totally reasonable
paying $100k for orchestration? outrageous
this was a common objection from dbt customers whenever dbt tried to raise prices, despite those same customers paying 10-100x more to snowflake.
the modern data stack companies had no one in the room with the cfo to argue otherwise. they’d given up those C-suite relationships to tap into snowflake’s sales force. now they were paying for it.
dbt tried to fix this. multiple times they reconfigured pricing to be more usage-based, trying to capture the compute volume. the customer base balked every time. this same story played out across the entire modern data stack. vcs realized the pie wasn’t infinitely large, that these companies would never grow into their valuations if they only captured 10% of the value they created. those same investors pivoted to databricks and snowflake instead.
every vc who bet on the modern data stack watched their investments get acquired for pennies or go to zero. the only survivors: the warehouses themselves, or the companies the warehouses bought to strengthen their moats.
dbt is one of the last standing, and even at a $4b valuation, the math doesn’t work. if you capture only 10% of the value you create, growing into a $4b valuation requires adding $40b to snowflake’s market cap. at some point, snowflake thinks “why are we paying them at all?” and builds it themselves. which they’re already doing. snowflake’s building native transformation tools. databricks has their own orchestration. the squeeze is on.
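the valuation math above, spelled out. this is a rough sketch that treats "value created" and "value captured" as simple dollar amounts in the same units as the valuation, which is the simplification the argument itself makes; the function name is made up for illustration.

```python
def required_downstream_value(target_valuation: float, capture_rate: float) -> float:
    """Value that must be created downstream for the vendor to capture target_valuation."""
    return target_valuation / capture_rate


dbt_valuation = 4e9    # "$4b valuation"
capture_rate = 0.10    # "capturing 10% of the value you create"

required = required_downstream_value(dbt_valuation, capture_rate)
print(f"downstream value required: ${required / 1e9:.0f}b")
```

halve the capture rate and the required downstream value doubles, which is why the 10% figure matters so much to the argument.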
so what’s dbt and fivetran’s counter-move? they’re trying to flip the script entirely.
the counter-move
fivetran and dbt, the two largest players outside the data warehouses, are merging to flip the script: move value capture back into business logic. what is business logic? it’s the finite set of if/else statements that define your business. sql pipelines are endless conditionals that say “if customer did x, then calculate y, and route to z.” those transformations, rules, definitions of what revenue means and who counts as an active user and how to segment customers - that’s your actual business encoded in code.
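here is roughly what "business logic as if/else statements" looks like when written down. every threshold, field name, and segment label below is hypothetical and made up for the sketch; real definitions of "active user" or "enterprise" differ at every company, which is exactly why they are worth something.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    days_since_last_login: int
    monthly_spend: float


def is_active(c: Customer) -> bool:
    # Hypothetical definition: logged in within the last 30 days.
    return c.days_since_last_login <= 30


def segment(c: Customer) -> str:
    # "if customer did x, then calculate y"
    if not is_active(c):
        return "churn_risk"
    if c.monthly_spend >= 1000:  # hypothetical enterprise threshold
        return "enterprise"
    return "self_serve"


def route(c: Customer) -> str:
    # "...and route to z"
    seg = segment(c)
    if seg == "enterprise":
        return "account_team"
    if seg == "churn_risk":
        return "winback_campaign"
    return "product_led_nurture"


print(route(Customer(days_since_last_login=3, monthly_spend=2500.0)))
```

none of this is computationally expensive. the value is in the definitions, not in the cycles spent evaluating them.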

this isn’t radical. historically, this is how software worked. salesforce charges $300/user/month. you know how much compute that uses on aws? maybe $3. where’s the other $297 going? business logic. the if/else statements that encode how your sales team actually works. nobody complains because that’s just how software works. but somehow in data, we let the warehouses convince us storage should cost 100x more than the logic.
of every dollar salesforce makes, only 5% goes to compute. the rest goes to business logic. larry ellison figured this out moving oracle from databases to erps. every successful crm, erp, and hris product made margins on if/else statements, not storage. the data warehousing stack is actually anomalous. snowflake and databricks pulled off an incredible psyop, convincing everyone that data infrastructure was different, that compute should capture all value, that business logic should be free. maybe data isn’t different. maybe we just got psyopped into accepting terrible unit economics because the warehouses grew so fast.
by merging, they gain more firepower to make compute cheap and commoditized, and to move value capture back where it belongs: business logic.
notice where it says “open compute” in that diagram. that’s the key. they want to:
own the control plane, pipelines, catalog, governance (all the business logic layers)
make compute open, cheap, interchangeable
make snowflake and databricks fight on price for compute while dbt and fivetran capture margins on the logic that actually defines your business.
for dbt and fivetran to succeed, they need to win hearts and minds and rebuild those cio/cfo relationships they lost when they went plg. remember, they gave up those relationships to tap into snowflake’s sales force. now they need them back.
they need to convince a cio that:
fivetran charging a 5x markup over airflow isn’t more outrageous than databricks charging a 40x markup on ec2
paying dbt $100k isn’t crazier than paying snowflake $20m
snowflake’s marketing has gaslit everyone into thinking compute markups are normal and business logic markups are outrageous. dbt and fivetran need to flip that narrative, which means selling to the same executives that informatica and alteryx kept relationships with all along.
what comes next?
two camps are forming:
one side: dbt, fivetran, aws, gcp, azure - trying to commoditize the warehouse and move value into business logic
other side: snowflake, databricks, and the bi tools - fighting to keep compute expensive and margins in infrastructure
the cloud providers are picking sides because the economics are obvious. when customers pay 40x markups to snowflake, only 1 of those 40 dollars goes to the underlying cloud provider. if dbt and fivetran can commoditize the warehouse layer, suddenly there’s 20x more revenue flowing to aws. plus, snowflake and databricks getting into inference disrupts aws’s margins on bedrock. meanwhile, all the open orchestration stuff like airflow plays into snowflake’s hands - more open source tools driving compute consumption.
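one way to get from "1 of those 40 dollars" to "20x more revenue" is to assume the customer’s total budget stays fixed while the warehouse markup over raw cloud cost compresses from 40x to 2x. both the fixed budget and the 2x figure are assumptions made for this sketch, not numbers stated above.

```python
def cloud_revenue(customer_spend: float, warehouse_markup: float) -> float:
    """Dollars reaching the cloud provider when the warehouse resells at a markup."""
    return customer_spend / warehouse_markup


customer_spend = 40.0  # customer pays $40 either way (assumption)

today = cloud_revenue(customer_spend, warehouse_markup=40)        # $1 reaches the cloud
commoditized = cloud_revenue(customer_spend, warehouse_markup=2)  # assumed 2x markup

uplift = commoditized / today
print(f"cloud provider revenue uplift: {uplift:.0f}x")
```

if customers instead pocket the savings and spend less overall, the uplift shrinks accordingly.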
the bi tools are staying loyal to the warehouses:
omni’s only aiming to be a bit larger than looker was
hex is still valued under a billion
they can stay comfortable in the $100m-$500m revenue range and never fight this war
meanwhile, snowflake and databricks are fighting back:
shipping native transformation tools
building their own orchestration
giving away free features to pull business logic back
every feature dbt and fivetran charge for, the warehouses are commoditizing
the current state is unsustainable. dbt and fivetran make $1 while their customers pay $10 to someone else. either they flip value capture back to business logic, or they get squeezed out. business logic has won in every other software category. or maybe the infrastructure layer already won and it’s over.
thank you george fraser, benn stancil, and aman kishore for reviewing

I don't agree with the assessment that compute can be commoditized and made cheaper on the SF and DBX side. This is the fundamental difference between OLTP (e.g. Salesforce) and OLAP (the warehouse). One is a simple operation and the other takes tons of processing.
If you want to go back to cheap compute, we can go back to commodity hardware in Hadoop. There is a reason that spark and MPP processing works well in a cloud world. You can get workloads done at massive scale with pay by the second.
I wish the dbt and fivetran side luck
Love this!