DSL Tidbits — Curiosities & Fun Facts

// fascinating facts

// 001

SQL

SQL Was Almost Called SEQUEL

IBM's original name was SEQUEL, for Structured English Query Language. It was shortened to SQL because a British aircraft manufacturer, Hawker Siddeley, already held a trademark on "SEQUEL." The original name is why many people still pronounce it "SEE-kwel" rather than "ess-cue-ell." Both pronunciations remain in common use.

// 002

COBOL

COBOL Processes More Money Daily Than the Entire Web

An estimated 220 billion lines of COBOL code are still in production today. According to figures reported by Reuters, COBOL systems handle roughly $3 trillion in daily commerce, 95% of ATM swipes, and 80% of in-person transactions. Despite being declared "dead" dozens of times, COBOL remains at the core of banking and government systems.

// 003

Regex

The Catastrophic Backtracking Problem

In 2019, a single regex caused a 27-minute global outage at Cloudflare. The pattern (?:(?:\"|'|\]|\}|\\|\d|(?:nan|infinity|true|false|null|undefined|symbol|math)|\`|-|\+)+[)]*;?((?:\s|-|~|!|{}|\|\||\+)*.*(?:.*=.*))) triggered catastrophic backtracking on certain inputs: the engine had to explore a number of possible match paths that grows super-linearly with input length (exponentially, for the worst patterns). This is a known hazard of regex engines that match by backtracking rather than by automaton (NFA/DFA) simulation, which runs in time linear in the input.
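As a hedged illustration of the backtracking hazard described above, the Python sketch below counts the attempts a naive backtracking matcher makes. It uses the textbook pattern (a+)+$ rather than the Cloudflare pattern, and the helper count_steps is a hypothetical model written for this example, not the algorithm of any real regex engine.

```python
def count_steps(text: str) -> int:
    """Count attempts a naive backtracking matcher makes for (a+)+$ on text."""
    steps = 0

    def match_groups(pos: int) -> bool:
        nonlocal steps
        # Find how far the run of 'a' characters extends from pos.
        end = pos
        while end < len(text) and text[end] == "a":
            end += 1
        # One a+ group may consume text[pos:stop]; try longest first, then back off.
        for stop in range(end, pos, -1):
            steps += 1
            if stop == len(text):   # reached '$' with every character consumed
                return True
            if match_groups(stop):  # try another a+ group after this one
                return True
        return False

    match_groups(0)
    return steps


for n in (5, 10, 15):
    # The trailing 'b' guarantees that every possible split eventually fails.
    print(n, count_steps("a" * n + "b"))
```

In this model each extra "a" roughly doubles the work on a failing input, while a matching input such as "aaaa" succeeds on the first attempt. An automaton-based engine avoids the blow-up because it never revisits the same position in the same state.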

// 004

CSS

CSS Was Originally Proposed by Two People Simultaneously

Håkon Wium Lie proposed CSS in October 1994. Bert Bos was independently developing a similar language, the "Stream-based Style Sheet Proposal." They joined forces to merge their ideas. There was early competition from JavaScript-Based Style Sheets (JSSS), DSSSL, and others. The W3C chose CSS partly for its simplicity. Early browsers such as Internet Explorer 3 implemented only parts of CSS 1, and often incorrectly, which contributed to years of cross-browser inconsistency during the browser wars of the late 1990s.

// 005

Make

Make Was Written in a Weekend. The Tab Bug Lives Forever.

Stuart Feldman wrote the original Make at Bell Labs in 1976, reportedly over a single weekend. The infamous requirement that Makefile recipes be indented with a tab character (not spaces) dates from that first version, and Feldman himself has called it a mistake. But Make already had users by the time he reconsidered it, so fixing it would have broken existing files. The tab requirement has caused developer pain for nearly 50 years.

// 006

TeX

Knuth Paid $2.56 for Each Bug in TeX

Donald Knuth offered reward cheques for bugs found in TeX, starting at $2.56 (2^8 cents) and doubling each year until the amount was capped at $327.68 (2^15 cents), where it stands today. The cheques became so prized as collector's items that few recipients ever cash them. TeX's version numbers converge asymptotically to π (currently 3.14159265…); METAFONT's converge to e.
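The reward arithmetic can be checked in integer cents. This is a small sketch: the two dollar amounts come from the text above, while the function name doublings_to_cap is made up for illustration.

```python
START_CENTS = 256    # $2.56, i.e. 2**8 cents
CAP_CENTS = 32768    # $327.68, i.e. 2**15 cents


def doublings_to_cap(start: int = START_CENTS, cap: int = CAP_CENTS) -> int:
    """Return how many times the reward doubles before reaching the cap."""
    count = 0
    amount = start
    while amount < cap:
        amount *= 2
        count += 1
    return count


print(doublings_to_cap())               # number of doublings from $2.56 to $327.68
print(START_CENTS * 2 ** 7 / 100)       # the resulting reward in dollars
```

Working in cents keeps the arithmetic exact: 2^8 cents doubled seven times is 2^15 cents.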

// 007

SQL

NULL: The Billion-Dollar Mistake in a DSL

Tony Hoare, who introduced the null reference in programming, called it "my billion-dollar mistake." SQL's NULL has particularly tortured semantics: NULL = NULL is not TRUE (comparing two unknowns yields UNKNOWN), any comparison with NULL yields UNKNOWN rather than FALSE, and a NOT IN predicate silently returns no rows if any value in the list is NULL. SQL developers have been bitten by NULL semantics for fifty years.

-- Counterintuitive: returns 0 rows if any banned_users.id is NULL!
SELECT * FROM users
WHERE id NOT IN (SELECT id FROM banned_users);
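
The NOT IN trap can be reproduced with Python's built-in sqlite3 module. This is a minimal sketch: the table names follow the query above, but the sample rows and the NOT EXISTS rewrite are assumptions added for illustration.

```python
import sqlite3


def not_in_demo():
    """Compare NOT IN with NOT EXISTS when the subquery contains a NULL."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER);
        CREATE TABLE banned_users (id INTEGER);
        INSERT INTO users VALUES (1), (2), (3);
        INSERT INTO banned_users VALUES (2), (NULL);
        """
    )
    # id NOT IN (2, NULL) is never TRUE: it is FALSE for 2 and UNKNOWN otherwise.
    not_in = conn.execute(
        "SELECT id FROM users "
        "WHERE id NOT IN (SELECT id FROM banned_users) ORDER BY id"
    ).fetchall()
    # NOT EXISTS only asks whether a matching row exists, so the NULL is harmless.
    not_exists = conn.execute(
        "SELECT id FROM users u "
        "WHERE NOT EXISTS (SELECT 1 FROM banned_users b WHERE b.id = u.id) "
        "ORDER BY id"
    ).fetchall()
    conn.close()
    return not_in, not_exists


print(not_in_demo())
```

With these rows the NOT IN query returns nothing, while the NOT EXISTS form returns users 1 and 3. Filtering NULLs out of the subquery (WHERE id IS NOT NULL) is another common workaround.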

// 008

FORTRAN

The Hyphen That Destroyed a Rocket

In 1962, the Atlas-Agena rocket carrying NASA's Mariner 1 probe was destroyed by range safety 294 seconds after launch. The cause is usually traced to a missing overbar (effectively a hyphen) in handwritten guidance equations that were then transcribed into the guidance program, often described as FORTRAN code. The omission caused the control program to interpret normal velocity fluctuations as errors and issue incorrect steering commands. Arthur C. Clarke called it "the most expensive hyphen in history."

// design paradoxes

The Paradoxes of DSL Design

⚖️

The Expressiveness–Analysability Trade-off

The more expressive a DSL becomes, the harder it is to analyse, optimise, or statically verify. Core SQL is highly optimisable largely because it is declarative and not Turing-complete: the query optimiser can enumerate equivalent execution plans and pick a cheap one. Add Turing-complete features (stored procedures, triggers) and the optimiser can no longer see through them. This is one reason general-purpose languages cannot be optimised as aggressively as DSLs.

🔄

Greenspun's Tenth Rule

Philip Greenspun observed: "Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp." The corollary for DSLs: any sufficiently powerful DSL eventually grows general-purpose features (stored procedures in SQL, JavaScript in CSS, scripting in Make) — undermining the very constraints that made it analysable.

🎯

The Learnability Paradox

DSLs are supposed to be easier than general-purpose languages for domain experts. Yet the most powerful DSLs — SQL, Regex, LaTeX — have extremely steep mastery curves. A domain expert can learn basic SQL in an hour; becoming fluent in window functions, CTEs, and query optimisation takes years. The simplicity of entry masks the depth of mastery required.

🧬

The Vocabulary Proliferation Problem

Every organisation with specific domain needs tends to create its own DSL, or extend an existing one with proprietary constructs. The result: dozens of incompatible SQL dialects, a crowd of competing CSS preprocessors and toolchains (Sass, Less, Stylus, PostCSS), and configuration formats beyond counting (YAML, TOML, HCL, INI, JSON, HOCON…). DSLs solve the expressiveness problem and create an ecosystem fragmentation problem.

// hall of infamy

Famous DSL Bugs & Misfeatures

Even the most carefully designed domain languages carry unexpected costs.

CSS `float` — The Layout Hack That Became Standard

CSS float was designed for wrapping text around images (like magazine layout). For over a decade, web developers used it as the primary mechanism for multi-column layouts — a purpose it was never designed for, causing endless "clearfix" hacks. It took until Flexbox (2012) and Grid (2017) for CSS to have proper layout primitives.

SQL's `GROUP BY` — Not What You Think

Many developers misunderstand SQL's logical evaluation order: FROM, then WHERE, then GROUP BY, then HAVING, then SELECT (where aliases are defined), then ORDER BY. Because WHERE is evaluated before SELECT, you cannot use a SELECT alias in a WHERE clause, but you can in ORDER BY. Some database systems relax these rules, so the behaviour varies across products and has confused developers for fifty years.
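
The logical evaluation order can be modelled as a pipeline over plain Python dictionaries. This is a sketch under stated assumptions: the table contents, the column names dept and salary, and the alias total are invented for the example, and real engines are free to execute the steps differently as long as the result is the same.

```python
from collections import defaultdict

# Hypothetical table for the example.
EMPLOYEES = [
    {"dept": "ops", "salary": 50},
    {"dept": "eng", "salary": 100},
    {"dept": "ops", "salary": 20},
    {"dept": "eng", "salary": 80},
    {"dept": "hr", "salary": 10},
]


def run_query(table):
    """Model: SELECT dept, SUM(salary) AS total FROM table WHERE salary > 15
    GROUP BY dept HAVING SUM(salary) > 60 ORDER BY total DESC."""
    source = list(table)                                    # 1. FROM
    # 2. WHERE sees only raw columns; the alias "total" does not exist yet.
    filtered = [row for row in source if row["salary"] > 15]
    groups = defaultdict(list)                              # 3. GROUP BY
    for row in filtered:
        groups[row["dept"]].append(row)
    kept = {                                                # 4. HAVING
        dept: rows
        for dept, rows in groups.items()
        if sum(r["salary"] for r in rows) > 60
    }
    projected = [                                           # 5. SELECT defines "total"
        {"dept": dept, "total": sum(r["salary"] for r in rows)}
        for dept, rows in kept.items()
    ]
    # 6. ORDER BY runs after SELECT, so it can refer to the alias.
    return sorted(projected, key=lambda row: row["total"], reverse=True)


print(run_query(EMPLOYEES))
```

The alias only comes into existence at step 5, which is why step 2 cannot refer to it and step 6 can.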

YAML's "Norway Problem"

YAML's implicit type coercion is notorious. The ISO country code for Norway is "NO." In YAML 1.1, bare NO is interpreted as boolean false. So a configuration file listing countries including Norway would silently turn Norway into false. YAML 1.2 fixed this, but countless systems still use 1.1 parsers.

countries: [GB, DE, NO, FR]
# YAML 1.1 parses NO as: false
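
The coercion can be modelled in a few lines of Python. This is a simplified sketch, not a YAML parser: it covers only the boolean spellings defined for plain scalars in YAML 1.1, and the function names are hypothetical.

```python
import re

# Boolean spellings recognised for plain (unquoted) scalars by the YAML 1.1 bool type.
YAML11_BOOL = re.compile(
    r"^(?:y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF)$"
)
TRUE_WORDS = {"y", "yes", "true", "on"}


def resolve_plain_scalar(token: str):
    """Simplified model of YAML 1.1 implicit typing (booleans only)."""
    if YAML11_BOOL.match(token):
        return token.lower() in TRUE_WORDS
    return token


def resolve_flow_list(tokens):
    """Apply the implicit typing model to each unquoted item of a list."""
    return [resolve_plain_scalar(token) for token in tokens]


print(resolve_flow_list(["GB", "DE", "NO", "FR"]))
```

In a real YAML file, quoting the value ("NO") marks it as a string and sidesteps the problem.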

// test your knowledge

DSL Trivia

Which DSL's version numbers converge to π (pi)?

A FORTRAN — each version adds one decimal place of π

B SQL — named after E. F. Codd's fondness for mathematics

C TeX — Donald Knuth's typesetting language

D Python — inspired by Guido's love of mathematics

✓ Correct! TeX versions converge to π (currently 3.14159265…). METAFONT converges to e. Knuth has decreed that on his death the version numbers will become exactly π and e, and any remaining bugs will be declared features.

✗ Not quite. TeX — Donald Knuth's typesetting language — has versions that converge to π. Currently at 3.14159265…

What was the original name of SQL?

A QUEL — Query Language

B SEQUEL — Structured English Query Language

C SQUEL — Simple Query Language for English Users

D QBE — Query By Example

✓ Correct! IBM's original name was SEQUEL (Structured English Query Language). It was renamed to SQL due to a trademark conflict with Hawker Siddeley.

✗ The original name was SEQUEL — Structured English Query Language. It was renamed due to a trademark held by Hawker Siddeley, a British aircraft company.

// reference

DSL Glossary

Abstract Syntax Tree The tree data structure produced by parsing a DSL program. Represents the hierarchical structure of the program's grammar, stripping away whitespace and punctuation. The AST is the primary data structure for DSL analysis, transformation, and compilation.
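
As a rough illustration of an AST in use, the sketch below borrows Python's own parser (the standard ast module) as a stand-in for a DSL front end and evaluates a toy arithmetic language by walking the tree. The choice of supported operators and the function name evaluate are assumptions for this example.

```python
import ast
import operator

# Binary operators allowed in the toy expression language (an assumption).
OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expression: str):
    """Parse an arithmetic expression into an AST, then walk the tree."""
    tree = ast.parse(expression, mode="eval")

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
            return OPERATORS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported syntax: {type(node).__name__}")

    return walk(tree)


print(evaluate("1 + 2 * 3"))
print(evaluate("(1 + 2) * 3"))
```

Parentheses and spaces never appear as nodes; they only influence the shape of the tree, which is what the walker sees.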

BNF / EBNF Backus-Naur Form (and Extended BNF): the metalanguage used to formally specify DSL grammars. Introduced by John Backus and refined by Peter Naur for the ALGOL 60 report. Most formal DSL specifications use BNF or a relative (ANTLR grammars, PEG, railroad diagrams).

Chomsky Hierarchy A classification of formal grammars by expressive power: Regular (regex, FSMs) < Context-Free (most programming languages, parsed by PDAs) < Context-Sensitive < Recursively Enumerable (Turing machines). Most DSLs are context-free or simpler.
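
One way to feel the step from regular to context-free: balanced parentheses cannot be recognised by a regular expression in the formal sense, because the recogniser must remember an unbounded nesting depth. The sketch below uses a counter as a degenerate stack; the function name balanced is hypothetical, and a language with several bracket types would need a real stack.

```python
def balanced(text: str) -> bool:
    """Recognise balanced parentheses, a classic non-regular language."""
    depth = 0
    for char in text:
        if char == "(":
            depth += 1           # push
        elif char == ")":
            depth -= 1           # pop
            if depth < 0:        # a ')' with no matching '('
                return False
        else:
            return False         # this toy alphabet contains only parentheses
    return depth == 0            # everything opened must be closed


print(balanced("(()())"), balanced("(()"), balanced(")("))
```

A finite-state machine has a fixed number of states, so it cannot track the unbounded depth that the counter tracks here.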

Declarative DSL A DSL where programs describe what is desired, not how to achieve it. SQL, CSS, HTML, and Terraform are declarative. The underlying system (query optimiser, browser, provisioner) determines the execution strategy.

External DSL A DSL with its own distinct syntax, parser, and toolchain — independent of any host language. SQL, CSS, Regex, and LaTeX are external DSLs. Require more implementation effort but can have cleaner, purpose-designed syntax.

Fluent Interface An internal DSL pattern where method calls are chained to read like natural language: query.select("name").from("users").where("age > 18").orderBy("name"). The term was coined by Eric Evans and Martin Fowler. jQuery is a well-known example.
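
A minimal Python sketch of the pattern follows. The Query class and its method names are hypothetical, loosely mirroring the chained call in the entry above; from_ carries a trailing underscore because from is a reserved word in Python.

```python
class Query:
    """Tiny fluent builder: every method returns self so calls can be chained."""

    def __init__(self):
        self._columns = []
        self._table = None
        self._conditions = []
        self._order = None

    def select(self, *columns):
        self._columns.extend(columns)
        return self

    def from_(self, table):
        self._table = table
        return self

    def where(self, condition):
        self._conditions.append(condition)
        return self

    def order_by(self, column):
        self._order = column
        return self

    def to_sql(self):
        """Render the accumulated state as a SQL string."""
        sql = f"SELECT {', '.join(self._columns) or '*'} FROM {self._table}"
        if self._conditions:
            sql += " WHERE " + " AND ".join(self._conditions)
        if self._order:
            sql += f" ORDER BY {self._order}"
        return sql


print(Query().select("name").from_("users").where("age > 18").order_by("name").to_sql())
```

Returning self from each method is the whole trick; the chain reads as a sentence while the object quietly accumulates state. The sketch builds strings naively and does no escaping, so it is not suitable for untrusted input.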

Internal DSL A DSL built within a host general-purpose language using the host's syntax. Rails routes in Ruby, Gradle build files in Kotlin, and RSpec tests in Ruby are internal DSLs. Leverages host language tooling but is constrained by its syntax.

Language Workbench A tool for building DSLs rapidly, providing grammar editing, IDE generation, and debugging support. JetBrains MPS, Xtext, and Spoofax are language workbenches. They can substantially lower the cost of creating a DSL.

PEG Grammar Parsing Expression Grammar: an alternative to BNF that describes a grammar as a recogniser (how to parse the input) rather than as a generator of strings. PEGs use ordered choice, so a PEG is never ambiguous, which makes them popular for DSL parsers. Used by tools such as LPeg (Lua) and Pest (Rust).
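
The ordered choice that makes PEGs unambiguous can be sketched with three tiny parser combinators. The helpers literal, choice, and sequence are hypothetical names for this example; each parser takes a text and a position and returns the new position, or None on failure.

```python
def literal(expected):
    """Match an exact string at the current position."""
    def parse(text, pos):
        return pos + len(expected) if text.startswith(expected, pos) else None
    return parse


def choice(*parsers):
    """PEG ordered choice: the first alternative that succeeds wins."""
    def parse(text, pos):
        for parser in parsers:
            end = parser(text, pos)
            if end is not None:
                return end   # later alternatives are never tried
        return None
    return parse


def sequence(*parsers):
    """Match each parser in turn, threading the position through."""
    def parse(text, pos):
        for parser in parsers:
            pos = parser(text, pos)
            if pos is None:
                return None
        return pos
    return parse


# ("a" / "ab") "c": the choice commits to "a", so "abc" is rejected.
short_first = sequence(choice(literal("a"), literal("ab")), literal("c"))
# ("ab" / "a") "c": putting the longer alternative first accepts both inputs.
long_first = sequence(choice(literal("ab"), literal("a")), literal("c"))

print(short_first("abc", 0), long_first("abc", 0), long_first("ac", 0))
```

A context-free grammar with the same alternatives would accept "abc" in either order. The PEG result depends on the order of alternatives, which is the price paid for determinism.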

Semantic Gap The distance between the concepts in a problem domain and the concepts available in a programming language. DSLs exist to reduce this gap — making it possible to express domain concepts directly without translation to lower-level abstractions.