2/19/26
by Jeremy Fraenkel, CEO & Co-founder, Fundamental
The Trillion-Dollar AI Blindspot: Why LLMs Haven’t Cracked the Code on Your Most Valuable Data

Since launching Fundamental two weeks ago, the most common reaction has been a combination of relief and excitement that someone has finally decided to bring the AI revolution to tabular data.
AI has transformed how we process and produce text, images, and code. But it has largely overlooked the most valuable data asset in the world.
That data lives in tables. And general-purpose language models were never built to understand it.
LLMs like ChatGPT and Claude can write poetry and code software, yet they consistently fail when confronted with the spreadsheets and databases that power every critical enterprise decision. Understanding why reveals something fundamental about how these systems work. And why a completely different approach is needed.
LLMs are trained on billions of text sequences, learning to predict what comes next based on patterns in language. This works brilliantly for essays and code, and even images. It struggles with tables, especially when scale, structure, and precision matter.
A spreadsheet isn't a linear narrative. It's a structured matrix where schema, constraints, and numerical precision matter immensely. Cell B7 is meaningless without understanding its relationship to the header in B1, the category in A7, and comparative values across the entire column.
The first problem is tokenization. LLMs break text into fragments for processing. This works beautifully for sentences. It’s catastrophic for tables. Quarterly sales data gets shredded: column headers separated from values, multi-digit numbers split across tokens, and spatial relationships reduced to a one-dimensional sequence. The model sees a stream of fragments instead of a structured grid.
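The effect is easy to sketch. The following is a toy illustration only; real LLM tokenizers use byte-pair encoding and differ in detail, but the outcome is similar: the two-dimensional grid becomes a flat stream, and long numbers are broken into fragments with nothing marking which fragments belong to which column.

```python
import re

# A small quarterly sales table (values invented for the demo).
table = [
    ["Quarter", "Region", "Revenue"],
    ["Q1-2025", "EMEA", "1842003"],
    ["Q1-2025", "APAC", "977150"],
]

# Serialize the grid row by row, as an LLM would receive it as text.
flat = "\n".join(",".join(row) for row in table)

def toy_tokenize(text, max_len=3):
    """Split text into word/number/punctuation pieces, then chop long
    pieces into max_len-character fragments (mimicking subword splits)."""
    pieces = re.findall(r"[A-Za-z]+|\d+|[^\sA-Za-z\d]", text)
    tokens = []
    for p in pieces:
        tokens.extend(p[i:i + max_len] for i in range(0, len(p), max_len))
    return tokens

tokens = toy_tokenize(flat)
print(tokens)
# "1842003" no longer exists as a single unit; the model sees fragments
# like "184", "200", "3", detached from the "Revenue" header above them.
```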
Then there's precision. LLMs excel at language tasks but struggle with exact calculation. Ask one to analyze revenue trends across 50 product lines and it might generate plausible-sounding insights while making calculation errors. In enterprise contexts where a 2% demand forecasting error means hundreds of millions in lost revenue, plausibility isn't good enough; accuracy is.
Take this simple example of how tokenization undermines precision: despite several prompts, ChatGPT insists that 8.11 is a higher number than 8.8.
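One plausible reading of this failure mode (a sketch; the model's actual internal behavior is more complex) is that the fragmented tokens invite a "version number" interpretation, where the digits after the dot are compared as a separate integer:

```python
# Numeric comparison: 8.11 is less than 8.8 as a decimal.
numeric = 8.11 > 8.8

# "Version number" reading: if the dot merely separates two integer
# fields, then 11 > 8 and "8.11" looks bigger than "8.8".
as_version = (8, 11) > (8, 8)

print(numeric)     # False
print(as_version)  # True
```

The two interpretations give opposite answers, and a model seeing only token fragments has no reliable way to know which one the user means.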
Tables encode meaning through structure: column names and types, conditional and hierarchical relationships between columns, and potential time dependencies between rows. An LLM might read a "Date" column without understanding that rows may depend on prior entries. It might process "Customer_ID" without recognizing it as a foreign key. The implicit schemas that make databases powerful remain invisible to models trained on text.
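A minimal sketch makes the point (table and column names are invented for illustration): the value in a Customer_ID column is just a number when read as text; its meaning only exists in the relationship between tables.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
con.execute("""CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    amount REAL)""")
con.executemany("INSERT INTO customers VALUES (?, ?)",
                [(1, "Acme Corp"), (2, "Globex")])
con.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(10, 1, 500.0), (11, 1, 250.0), (12, 2, 75.0)])

# Read as flat text, orders.customer_id is an opaque integer. The join
# against customers is what gives it meaning.
rows = con.execute("""
    SELECT c.name, SUM(o.amount)
    FROM orders o JOIN customers c ON o.customer_id = c.customer_id
    GROUP BY c.name ORDER BY c.name
""").fetchall()
print(rows)  # [('Acme Corp', 750.0), ('Globex', 75.0)]
```

A database engine resolves this relationship explicitly; a model trained on linear text has to infer it, if it notices it at all.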
And context windows impose hard limits. LLMs can only process a fixed number of records at once, but business datasets routinely contain millions or billions of rows. When analysis requires understanding patterns across 1,000,000 transactions, critical details get lost; LLMs must chunk, sample, or summarize large datasets, which risks losing important global structure.
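A toy demonstration of the chunking failure (all numbers invented for the demo): a model that can only see a context-window-sized slice of a table can miss the rare rows that matter most.

```python
# 100,000 ordinary transactions, plus 5 fraudulent ones at the end.
transactions = [{"amount": 50.0, "fraud": False} for _ in range(100_000)]
transactions += [{"amount": 9_999.0, "fraud": True} for _ in range(5)]

# Chunking: only the first "context window" worth of rows is visible.
window = transactions[:2_000]

total_fraud = sum(t["fraud"] for t in transactions)
visible_fraud = sum(t["fraud"] for t in window)
print(total_fraud, visible_fraud)  # 5 0
```

Every conclusion drawn from the visible window is internally consistent, and still wrong about the dataset as a whole; sampling or summarizing shifts the odds but not the underlying problem.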
The hidden language of tables has remained hidden because general-purpose language models were never built for this application.
That's why we built NEXUS. Not an LLM adapted for tables, but a Large Tabular Model purpose-built from the ground up for structured data. Trained on billions of tabular datasets to natively understand the non-linear relationships and hidden patterns in any table, in any industry.
Banks can now predict institutional failures under stress. Manufacturers can prevent billion-dollar equipment breakdowns. Retailers can anticipate inventory shortfalls. Not through months of custom model development, but through a foundation model that just works.
The answer was never retrofitting language models to handle structured data. It was building models that speak tables as their native language.


