// convergence

Three forces are converging to reshape DSL design: the rise of large language models that can interpret natural language as a DSL; the democratisation push of no-code/low-code platforms; and the increasing complexity of problem domains demanding ever-more-specialised languages.

// the language expressiveness spectrum
← more specialised                          more general →

Shell commands     ls -la | grep .txt
CSS / SQL          pure DSLs
Python DSL libs    SQLAlchemy, Pandas
General langs      Python, Rust
Natural language   via LLMs

The blurring boundary between DSL and natural language is the defining tension of the 2020s.

// the llm frontier

LLMs: Natural Language as a DSL Interface

Large language models are creating a novel category: natural language DSLs. Instead of learning SQL, a data analyst describes what they want in plain English, and the LLM translates it to SQL. The language is still a DSL — but the interface to write it has changed fundamentally.

DSPy — Programming LLMs Declaratively

Stanford's DSPy treats LLM programs as optimisable pipelines. Signatures declare inputs and outputs; modules compose them; an optimiser finds the best prompts automatically. It's a DSL for reasoning about LLM computation.

# DSPy — a DSL for LLM pipelines
import dspy

class RAG(dspy.Module):
    def __init__(self):
        super().__init__()
        self.retrieve = dspy.Retrieve(k=5)
        self.generate = dspy.ChainOfThought(
            "context, question -> answer"
        )

    def forward(self, question):
        ctx = self.retrieve(question).passages
        return self.generate(
            context=ctx,
            question=question
        )

LMQL — SQL for Language Models

LMQL (Language Model Query Language) lets you write constrained generation programs: specify token constraints, logical conditions, and multi-turn structure in a language that compiles to optimised LLM calls. It's SQL's "describe what" philosophy applied to language generation.

# LMQL — constrained LLM generation
argmax
  "Q: What is the capital of France?
   A: [ANSWER]"
from
  "openai/gpt-4"
where
  len(ANSWER) < 20 and
  ANSWER in known_capitals

Guidance — Structured Generation

Microsoft's Guidance library provides a template DSL for interleaving generation and constraints. Rather than post-processing LLM output, Guidance steers generation token-by-token — ensuring outputs conform to schemas, grammars, or logical conditions.

# Guidance — constrained structured output
from guidance import models, gen, select

lm = models.OpenAI("gpt-4")
lm += "Extract the following:\n"
lm += "Name: " + gen("name", stop="\n") + "\n"
lm += "Age: " + gen("age", regex=r"\d+") + "\n"
lm += "City: " + select(["London", "Paris", "Berlin"],
                        name="city")

Text-to-SQL: The Killer App

Tools like GitHub Copilot, Cursor, and various BI platforms now translate natural language to SQL with high accuracy. This is DSL interpretation at scale — the semantic gap between domain expert and language is being bridged by AI. But the underlying DSL (SQL) remains essential: the LLM still needs a precise target language to generate.

User: "Show me the top 10 customers by revenue
       this year who haven't ordered in 30 days"

→ Generated SQL:
SELECT c.name, SUM(o.total) revenue
FROM customers c JOIN orders o ...
WHERE o.date >= '2024-01-01'
  AND c.id NOT IN (
    SELECT customer_id FROM orders
    WHERE date > NOW() - INTERVAL '30 days'
  )
GROUP BY c.name
ORDER BY revenue DESC LIMIT 10;
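Production tools wrap this translation with a guardrail: generated SQL is validated against the live schema before it runs. A minimal sketch of that check using only the standard library — the table definitions and queries here are hypothetical, and the real tools use far richer validation:

```python
import sqlite3

def validate_sql(conn: sqlite3.Connection, sql: str) -> bool:
    """Ask the engine for a query plan: this parses the SQL and checks
    every table and column against the schema, without executing it."""
    try:
        conn.execute("EXPLAIN QUERY PLAN " + sql)
        return True
    except sqlite3.Error:
        return False

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER,
                         total REAL, date TEXT);
""")

good = ("SELECT c.name, SUM(o.total) AS revenue "
        "FROM customers c JOIN orders o ON o.customer_id = c.id "
        "GROUP BY c.name")
bad = "SELECT revenue FROM customerz"  # hallucinated table name

print(validate_sql(conn, good))  # True
print(validate_sql(conn, bad))   # False
```

Rejected queries can be fed back to the model with the error message, closing the loop between free-form intent and the precise target DSL.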
// horizons

Emerging Directions

01
Near-term

Probabilistic Programming Languages

PPLs like Stan, Pyro, and Gen allow statisticians to express probabilistic models declaratively and run exact or approximate inference automatically. They are DSLs where the domain is Bayesian reasoning — and they're becoming central to machine learning, epidemiology, and quantitative finance. Stan's modelling language, for instance, reads like a statistics textbook.

// Stan — a DSL for Bayesian inference
data { int<lower=0> N; vector[N] y; }
parameters { real mu; real<lower=0> sigma; }
model {
  mu ~ normal(0, 10);     // prior
  sigma ~ exponential(1); // prior
  y ~ normal(mu, sigma);  // likelihood
}
02
Near-term

Domain-Specific Type Systems

Languages like Idris 2 and Lean 4 allow types to encode domain constraints with mathematical precision — proving at compile time that a physical simulation respects dimensional analysis, or that a financial model cannot produce negative balances. The domain is encoded in the type system itself, making incorrect programs literally inexpressible.
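The idea can be made concrete in a Lean 4 sketch. The `Vec` type below is the standard textbook construction, not from any particular library: by indexing a vector's length in its type, "take the head of an empty vector" becomes a program that cannot be written at all.

```lean
-- A length-indexed vector: the length lives in the type.
inductive Vec (α : Type) : Nat → Type where
  | nil  : Vec α 0
  | cons : α → Vec α n → Vec α (n + 1)

-- head only accepts vectors whose type proves they are non-empty,
-- so there is no nil case to handle — and none can be passed in.
def Vec.head : Vec α (n + 1) → α
  | .cons x _ => x
```

The same pattern scales from lengths to units of measure or account balances: the constraint moves from a runtime check into the type index.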

03
Mid-term

Language-Oriented Programming

Language Workbenches (MPS, Racket, Spoofax) allow developers to create new DSLs as easily as they create new classes. The dream: for every problem domain, spin up a precise language in hours, not months. JetBrains MPS already does this for industrial systems — Airbus uses it to define avionics software specifications in a DSL that domain engineers (not programmers) write directly.
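The flavour of language-oriented programming can be approximated even without a workbench. Below is a toy interpreter for a hypothetical "discount rules" DSL — a five-line grammar, compiled and run in plain Python (all names and the rule syntax are invented for illustration):

```python
import re

# Grammar of the toy DSL: when <field> <op> <number> then discount <number>%
RULE = re.compile(r"when (\w+) (>|<|==) (\d+) then discount (\d+)%")

OPS = {">": lambda a, b: a > b,
       "<": lambda a, b: a < b,
       "==": lambda a, b: a == b}

def compile_rules(src: str):
    """Translate DSL text into (field, predicate, value, percent) tuples."""
    rules = []
    for line in src.strip().splitlines():
        field, op, value, pct = RULE.match(line.strip()).groups()
        rules.append((field, OPS[op], int(value), int(pct)))
    return rules

def best_discount(rules, order: dict) -> int:
    """Apply every matching rule and keep the largest discount."""
    return max((pct for field, op, val, pct in rules
                if op(order[field], val)), default=0)

rules = compile_rules("""
    when total > 100 then discount 10%
    when items > 5 then discount 15%
""")
print(best_discount(rules, {"total": 250, "items": 3}))  # 10
```

A workbench like MPS does the same thing industrially: it generates the parser, editor, and type checker from the grammar, so domain engineers only ever see the `when … then …` layer.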

04
Mid-term

No-Code DSLs

Platforms like Zapier, Airtable formulas, and Notion formulas are visual DSLs — domain-specific languages expressed through GUIs rather than text. The underlying language semantics are identical to traditional DSLs (declarative, domain-constrained, compositional), but the concrete syntax is visual drag-and-drop or form-based. The line between "language" and "tool" dissolves.
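That equivalence is easy to see once the GUI is stripped away: the form produces a declarative structure, and an interpreter walks it. A sketch of a hypothetical Zapier-style automation (the event names and actions are invented):

```python
# What a no-code builder actually saves: a declarative program.
automation = {
    "trigger": {"event": "new_row", "table": "leads"},
    "actions": [
        {"op": "filter", "field": "score", "gte": 80},
        {"op": "send_email", "to": "sales@example.com"},
    ],
}

def run(automation: dict, row: dict) -> list[str]:
    """Interpret the action pipeline for one triggering row."""
    log = []
    for action in automation["actions"]:
        if action["op"] == "filter":
            if row.get(action["field"], 0) < action["gte"]:
                return log  # condition failed: stop the pipeline
        elif action["op"] == "send_email":
            log.append(f"email -> {action['to']}")
    return log

print(run(automation, {"score": 91}))  # ['email -> sales@example.com']
print(run(automation, {"score": 12}))  # []
```

Swap the dict for drag-and-drop blocks and nothing in the semantics changes — which is why these platforms inherit both the power and the composability limits of textual DSLs.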

05
Long-term

Neurally-Guided Program Synthesis

Program synthesis research aims to automatically generate DSL programs from examples or natural language. DeepMind's AlphaCode and OpenAI's Codex are early steps. The long-term vision: a user provides examples of inputs and desired outputs, and the system induces the DSL program that produces them. This inverts DSL design — the language evolves to fit the discovered patterns.
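The core loop of inductive synthesis fits in a few lines: enumerate programs in a small DSL until one is consistent with every example. This toy searches a hypothetical two-operator arithmetic DSL — real systems like AlphaCode search vastly larger spaces with neural guidance rather than brute force:

```python
from itertools import product

# The candidate DSL: programs of the form (operator, constant).
OPS = {"add": lambda x, c: x + c, "mul": lambda x, c: x * c}

def synthesise(examples, max_const=10):
    """Return the first (op, constant) program that fits all
    input→output examples, or None if the DSL cannot express it."""
    for op_name, c in product(OPS, range(max_const + 1)):
        if all(OPS[op_name](x, c) == y for x, y in examples):
            return op_name, c
    return None

print(synthesise([(1, 3), (4, 12), (5, 15)]))  # ('mul', 3)
print(synthesise([(1, 2), (4, 5)]))            # ('add', 1)
```

The "language evolves to fit the patterns" step corresponds to growing `OPS` itself: when no program fits, the system proposes new primitives that would make the examples expressible.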

06
Long-term

Self-Modifying Domain Languages

An active research frontier explores DSLs that can extend themselves based on use patterns — observing how users push against the language's limits and automatically proposing grammar extensions. Racket's macro system is a step in this direction. Combined with LLMs, a DSL might iteratively suggest its own evolution to better serve the domain it was designed for.

The Convergence

DSLs, LLMs, type theory, and visual tools are converging toward a single vision: letting domain experts speak to computers in the language of their domain.

// informed speculation

Predictions for DSL Evolution

Trend: SQL survives another 50 years
  Detail: its declarative model is too valuable; dialects will gain new features
  Driver: LLMs lower the SQL barrier; cloud databases multiply
  Horizon: already happening

Trend: Natural language becomes a DSL front-end
  Detail: LLMs translate intent → precise DSL programs
  Driver: GPT-4, Copilot already doing this
  Horizon: 2024–2026

Trend: Language Workbenches go mainstream
  Detail: MPS-style tools adopted outside aerospace/automotive
  Driver: GitHub Copilot + LLM-assisted DSL creation
  Horizon: 2026–2030

Trend: Dependent types enter mainstream DSLs
  Detail: domain constraints encoded in types, proved at compile time
  Driver: Lean, Idris influence industry languages
  Horizon: 2028–2035

Trend: DSL proliferation via AI generation
  Detail: AI tools generate custom DSL grammars for niche domains
  Driver: LLM code generation + ANTLR / tree-sitter
  Horizon: 2025–2030