What if your computer could translate programs as it loads them, making software run on any machine — instantly?
In the early 1990s — just as today — different computers spoke different "machine languages." A program compiled for an Intel chip couldn't run on a Motorola chip. This meant software had to be recompiled for every platform, costing time and money. Michael Franz asked: What if we could delay the final step of compilation until the exact moment a program is loaded?
Compiled programs are tied to one processor architecture. Sharing software across platforms requires recompilation from source — if you even have the source.
Software vendors must maintain separate binaries for every target machine, multiplying development and testing effort.
Earlier portability schemes relied on interpreters (such as Pascal P-code, a forerunner of Java's JVM bytecodes), but interpreted programs ran 5–20× slower than native code.
Franz invented a new way to store programs called Semantic Dictionary Encoding (SDE). Instead of machine instructions or abstract-machine bytecodes, SDE stores a compressed representation of the program's meaning, its syntax tree, as a series of indices into a growing dictionary. Think of it as a ZIP file designed specifically for code, one that can be expanded into native machine instructions at very high speed.
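To make the dictionary idea concrete, here is a minimal Python sketch of SDE-style encoding, assuming tuple syntax trees like `(":=", "x", ("+", "a", "b"))`. It is an illustration of the principle only; Franz's actual on-disk format and dictionary structure differ.

```python
def make_dictionary(ops, variables):
    """Seed the dictionary with one template per language operation
    and one entry per declared variable, as in SDE's initial state."""
    entries = list(ops) + list(variables)
    return {entry: i for i, entry in enumerate(entries)}

def encode(stmt, dictionary):
    """Emit one dictionary index per node, adding each newly seen
    subtree to the dictionary so later repetitions cost one index."""
    indices = []

    def walk(node):
        if node in dictionary:               # known subtree or leaf:
            indices.append(dictionary[node])  # a single index suffices
            return
        op = node[0]
        indices.append(dictionary[op])       # emit the operation template
        for child in node[1:]:
            walk(child)                      # then fill its holes
        dictionary[node] = len(dictionary)   # speculative new entry

    walk(stmt)
    return indices

d = make_dictionary([":=", "+"], ["x", "a", "b"])
stmt = (":=", "x", ("+", "a", "b"))
first = encode(stmt, d)   # every node spelled out: [0, 2, 1, 3, 4]
second = encode(stmt, d)  # whole statement collapses to one index: [6]
```

The second encoding of the identical statement shrinks to a single index, which is exactly the repetition effect the walkthrough below demonstrates.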
Click each stage to learn what happens at every step of the process.
The programmer writes code in the Oberon programming language — a clean, modular language designed by Niklaus Wirth. The source text is a human-readable description of what the program does.
The dictionary starts with templates for every language operation (addition, assignment, etc.) and an entry for every variable. As the program is encoded, new entries are added for each new combination as it is first seen, like autocomplete for code patterns.
Notice statement 4 is identical to statement 1. SDE will exploit this repetition.
Templates have "holes" (shown as dots). Complete entries represent concrete variables. The dictionary knows every possible operation and every declared variable.
Each index is a tiny number pointing into the dictionary. The decoder reads them and knows exactly what code to generate.
The dictionary speculatively grows with new entries based on what was just seen. If similar patterns appear later, they can be encoded much more compactly.
This is the magic. Repeated patterns collapse to single indices. The decoder can also use "code-copying" — just duplicating the machine instructions it already generated for the same expression.
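The decoder side of code-copying can be sketched in a few lines of Python (with invented pseudo-instructions, not real 680x0 output): the decoder caches the instruction span each dictionary index produced, and a repeated index duplicates that span instead of re-running the code generator.

```python
def decode(indices, code_cache, emit):
    """Turn a stream of dictionary indices into instructions, copying
    previously generated spans instead of regenerating them."""
    out = []
    for ix in indices:
        if ix not in code_cache:
            code_cache[ix] = emit(ix)   # generate fresh code only once
        out.extend(code_cache[ix])      # code-copying: duplicate the span
    return out

calls = []
def fake_emit(ix):
    calls.append(ix)                    # count real code-generation work
    return [f"op_{ix}"]

out = decode([3, 3, 3], {}, fake_emit)  # same index three times
# Three instruction copies appear in `out`, but fake_emit ran only once.
```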
SDE files are dramatically smaller than native object files — about 2.5× more compact on average. Smaller files mean faster disk reads, which partly compensates for the time spent generating code.
Franz tested on four generations of Motorola 680x0 processors. As processors got faster, the gap between native loading and SDE code-generation shrank rapidly — from 40% overhead down to just 7%.
CPU speed improves faster than disk speed does. Since SDE files are smaller (less disk I/O) and code generation is CPU-bound, the technique gets more competitive over time.
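The trend can be seen in a toy cost model with invented numbers (these are not Franz's measurements): native loading is pure disk I/O, while SDE loading reads a 2.5× smaller file but spends CPU time generating code.

```python
def native_load_s(size_kb, disk_kb_s):
    """Native loading: just read the full-size binary from disk."""
    return size_kb / disk_kb_s

def sde_load_s(size_kb, disk_kb_s, codegen_cycles, cpu_hz):
    """SDE loading: read a 2.5x smaller file, then generate code."""
    return (size_kb / 2.5) / disk_kb_s + codegen_cycles / cpu_hz

# Hold disk speed fixed and quadruple the CPU clock (illustrative values):
slow_cpu = sde_load_s(100, 1000, 2e6, 25e6) / native_load_s(100, 1000)
fast_cpu = sde_load_s(100, 1000, 2e6, 100e6) / native_load_s(100, 1000)
# SDE goes from somewhat slower than native loading to clearly faster.
```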
The dynamically generated code matched or beat Apple's optimizing MPW C compiler on 6 of the 10 benchmarks — despite being generated in milliseconds.
Execution times (ms) on a Macintosh Quadra 840AV; lower is better. The Winner column marks the faster system.
| Benchmark | SDE (on-the-fly) | Apple MPW C (optimized) | Winner |
|---|---|---|---|
| Permutation | 83 | 113 | SDE |
| Towers of Hanoi | 83 | 121 | SDE |
| Eight Queens | 50 | 43 | MPW C |
| Integer Matrix Multiply | 150 | 173 | SDE |
| Real Matrix Multiply | 133 | 171 | SDE |
| Puzzle | 800 | 800 | Tie |
| Quicksort | 66 | 61 | MPW C |
| Bubblesort | 117 | 88 | MPW C |
| Treesort | 83 | >1000 | SDE |
| FFT | 133 | 123 | MPW C |
Deferring code generation to load time opens up several surprising capabilities that go well beyond just running on different machines.
Decide at load time whether to include runtime checks (array bounds, nil pointers, type tests). There is no need to maintain separate "debug" and "release" builds: one SDE file serves both purposes.
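A loader that makes this choice can be sketched as follows (the pseudo-instructions are invented for illustration; Franz's code generator emits real 680x0 code). One intermediate form goes in, and the safety checks appear or vanish depending on a load-time flag.

```python
def gen_array_load(array, index, with_checks):
    """Generate pseudo-instructions for an array element load,
    optionally guarded by a bounds check chosen at load time."""
    insns = []
    if with_checks:
        # Debug-style load: trap if the index is out of bounds.
        insns.append(f"check_bounds {index}, len({array})")
    insns.append(f"load {array}[{index}]")
    return insns

debug = gen_array_load("a", "i", with_checks=True)     # checked load
release = gen_array_load("a", "i", with_checks=False)  # bare load
```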
When a library module's internals change (but not its interface), SDE clients don't need source-level recompilation. The code-generating loader silently adjusts field offsets and sizes at load time.
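The offset adjustment can be illustrated with a small sketch (the layout rule and instruction syntax here are hypothetical): the SDE file names record fields symbolically, and the loader fills in concrete byte offsets from whatever layout the currently installed library declares.

```python
def layout(fields, slot_bytes=4):
    """Assign byte offsets in declaration order, one fixed-size slot
    per field (a simplification of real record layout)."""
    return {name: slot_bytes * i for i, name in enumerate(fields)}

def resolve(field, record_layout):
    """Turn a symbolic field reference into a concrete load."""
    return f"load [base + {record_layout[field]}]"

# The library inserts a field before 'y'; clients re-resolve at load
# time instead of being recompiled from source.
old = resolve("y", layout(["x", "y"]))            # offset 4
new = resolve("y", layout(["x", "extra", "y"]))   # offset 8
```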
Even within the same CPU family (e.g., 68020 vs 68040), the code generator can tailor instruction scheduling to the exact processor model running the code. Existing software benefits from new CPUs automatically.
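This per-model tailoring amounts to picking a different emitter at load time, as in the sketch below (the instruction strings are invented; real 68020-vs-68040 tuning involves scheduling and instruction-selection details not shown here).

```python
def emit_generic(op):
    """Fallback code generation for any family member."""
    return [op]

def emit_68040(op):
    # Hypothetical scheduling tweak for the newer pipeline.
    return [f"{op} ; scheduled for 68040"]

# Loader picks the emitter for the CPU it detects at load time.
EMITTERS = {"68040": emit_68040}

def codegen(ops, cpu_model):
    emit = EMITTERS.get(cpu_model, emit_generic)
    return [insn for op in ops for insn in emit(op)]

on_020 = codegen(["add d0, d1"], "68020")  # generic sequence
on_040 = codegen(["add d0, d1"], "68040")  # model-tuned sequence
```

Because the choice happens in the loader, an SDE file shipped before the 68040 existed still benefits from the new emitter.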
Franz argued this enables "plug-in" software components — libraries sold in SDE form that work on any compatible machine without recompilation, just like hardware components. (This vision foreshadowed Java bytecodes, released the following year.)
This dissertation was accepted in 1994 — one year before Sun Microsystems released Java with its "write once, run anywhere" promise using JVM bytecodes. Franz's approach was fundamentally different and in some ways more ambitious: instead of interpreting an abstract machine, SDE generates native code at full speed. The core insight — that compact intermediate representations plus fast code generation can replace binary compatibility — remains influential in modern JIT compilation, .NET, WebAssembly, and beyond.
The idea that processors get faster while disks stay slow, so on-the-fly compilation from compact formats will win, is now mainstream (JIT compilers such as V8 and HotSpot).
WebAssembly's compact binary format, which browsers compile to native code at load time, echoes Franz's SDE approach three decades later.
The OSF's Architecture Neutral Distribution Format (ANDF) pursued similar goals, though SDE was 4× more compact and targeted dynamic loading rather than static linking.