Palo Alto-based startup Inception has emerged from stealth with a new type of AI model that combines diffusion technology with large language model (LLM) capabilities. The company, founded by Stanford professor Stefano Ermon, calls this approach a Diffusion-Based Large Language Model (DLM), aiming to deliver faster text generation with lower computing costs.
What Makes Inception’s AI Model Different?
Most generative AI models fall into two categories:
LLMs (e.g., GPT-4) generate text sequentially, one token at a time.
Diffusion models (e.g., Midjourney, OpenAI’s Sora) create images, video, and audio by iteratively refining a noisy initial estimate into a finished output.
Inception’s DLM applies the diffusion approach to text, allowing it to generate and refine large blocks of text in parallel rather than one token at a time. The company says this parallelism is what enables its significantly faster processing.
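The difference between the two decoding strategies can be illustrated with a toy sketch. This is not Inception's actual algorithm, and the "model" below is a stand-in that already knows the target text; it only shows why refining several positions per step takes fewer steps than emitting one token per step.

```python
# Toy contrast (illustrative only, not Inception's method): sequential
# autoregressive decoding vs. diffusion-style parallel refinement.
# A real model predicts tokens from learned weights; here the target
# text itself plays the role of the model's predictions.

TARGET = "the quick brown fox jumps over the lazy dog".split()

def autoregressive_decode(target):
    """Generate one token per step, strictly left to right."""
    out, steps = [], 0
    while len(out) < len(target):
        out.append(target[len(out)])  # "model" predicts the next token
        steps += 1
    return out, steps

def diffusion_decode(target, tokens_per_step=3):
    """Start fully masked; fill in several positions per refinement step."""
    seq = ["<mask>"] * len(target)
    steps = 0
    while "<mask>" in seq:
        masked = [i for i, t in enumerate(seq) if t == "<mask>"]
        for i in masked[:tokens_per_step]:  # denoise a block at once
            seq[i] = target[i]
        steps += 1
    return seq, steps

ar_out, ar_steps = autoregressive_decode(TARGET)
df_out, df_steps = diffusion_decode(TARGET)
print(f"autoregressive: {ar_steps} steps")    # one step per token: 9
print(f"diffusion-style: {df_steps} steps")   # 3 parallel refinement passes
```

Both loops recover the same nine-token sentence, but the diffusion-style loop finishes in three passes instead of nine, which is the intuition behind the speed claims: on a GPU, each refinement pass can update many positions at once.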
The Key Advantages of DLMs
Up to 10x faster than traditional LLMs
10x lower computing costs
More efficient GPU utilization
On-premises & edge deployment options
Support for fine-tuning & custom APIs
How Inception’s Models Compare
Small coding model: comparable to GPT-4o Mini, but 10x faster
Mini model: outperforms Meta’s Llama 3.1 8B, generating 1,000+ tokens per second
Backing & Future Plans
While funding details remain undisclosed, TechCrunch reports that Mayfield Fund has invested in Inception. The company has already secured Fortune 100 customers seeking reduced AI latency and increased processing speed.
Why This Matters
If Inception’s claims hold up, its DLMs could disrupt the AI landscape by offering high-speed, cost-efficient alternatives to traditional LLMs. This could lead to faster AI applications across industries, from chatbots and code generation to real-time language translation.