WisniaLang: Compiler Project
Can we produce small and fast binaries at the same time?
Oct 17, 2022
c++
elf
compiler
llvm
rust
CONTENTS
Introduction
For the past 3 years, I have been working on the WisniaLang compiler for an experimental programming language that compiles to native machine code. It focuses on delivering tiny Linux binaries (ELF
x86_64
) with no LLVM dependency. As a result, what it actually competes with is the LLVM toolchain, on which a large number of other programming languages rely extensively.
The reason for focusing on the delivery of small binaries with no LLVM dependency is to offer an alternative to the LLVM toolchain. While LLVM is a powerful and widely-used toolchain, it can be quite resource-intensive and may not be suitable for all scenarios. By offering a compiler that can produce efficient machine code without the need for LLVM, my aim is to provide a more lightweight and flexible solution for those who need it.
Architecture
The architecture of the compiler consists of several main phases, which work together to perform this translation. These phases include lexical analysis, which breaks the source code down into smaller units called tokens; syntactic analysis, which builds a representation of the structure of the source code called an abstract syntax tree (AST); semantic analysis, which checks the AST for semantic errors and performs type checking; intermediate representation (IR), which represents the code in a lower-level form that is easier for the compiler to work with; code generation, which generates machine code from the IR; optimization, which improves the performance of the machine code; and linking, which combines the machine code to create a complete executable program in ELF (Executable and Linking Format) format.
Programming languages and LLVM
Before going further, let me get straight to the point:
- Writing compilers is easy
- Optimizing the machine code is hard
- Supporting arbitrary architectures / operating systems is hard
This is where LLVM comes in handy. LLVM uses an intermediate representation language, which is kind of similar to assembly, but with a few higher level constructs. LLVM is good at optimizing this IR language, as well as compiling into different architecture and binary formats. So as a language author using LLVM, I’m really writing a transpiler from my language to LLVM IR, and letting the LLVM compiler do the hard work.

WisniaLang is an amateurish project that takes a more traditional approach to compiler design compared to other LLVM-based programming languages. Despite its amateur status, WisniaLang still follows the same front-end procedures as other programming languages in the LLVM family. However, it differs in the way it handles certain tasks, such as register allocation, machine code generation, and ELF binary format construction. Instead of relying on LLVM or other external tools for these tasks, WisniaLang handles them on its own. This approach allows WisniaLang to have more control over the compilation process, but it also requires more work and expertise on the part of its developers.
Example programs
Let’s take a look at simple programs and see how a handwritten compiler compares to Rust, an LLVM-based programming language. Both compilers generate the identical number sequence, 3000 2997 ... 6 3 1
, for both programs.
WisniaLang | Rust |
---|---|
|
|
Since I was asked to include C in the benchmark, as it has been a standard for performance for almost half a century and it is something that every benchmark of a language attempting to surpass it should include, here is an example program in C:
1#include <stdint.h>
2#include <stdio.h>
3
4void foo(uint16_t base, uint16_t number) {
5 if (number) {
6 printf("%u ", base * number);
7 foo(base, number - 1);
8 }
9}
10
11int main() {
12 foo(3, 1000);
13 printf("1\n");
14}
A dive deeper
Let us now compare the final size of the produced binaries, as well as the time it took to assemble and run them. TLDR: If you wish to skip the lenghty blabberings and view the results as a graphic representation, scroll to the bottom of the page.
WisniaLang
1┌─[tautvydas][kagamin][~/tests]
2└─▪ time ./wisnia test.wsn
3real 0m0.004s
4user 0m0.002s
5sys 0m0.001s
6
7┌─[tautvydas][kagamin][~/tests]
8└─▪ ls -lh a.out
9-rwxrwxrwx 1 tautvydas tautvydas 528 Dec 27 17:20 a.out
10
11┌─[tautvydas][kagamin][~/tests]
12└─▪ time ./a.out
133000 2997 2994 2991 (omitted by the author)
14
15real 0m0.004s
16user 0m0.000s
17sys 0m0.004s
Running 20 times, the results averaged out to:
- Compilation time:
3.3 ms
- Size of the binary:
0.515 KiB
- Runtime speed:
4.25 ms
C
1┌─[tautvydas][kagamin][~/tests]
2└─▪ time gcc test.c
3real 0m0.040s
4user 0m0.025s
5sys 0m0.015s
6
7┌─[tautvydas][kagamin][~/tests]
8└─▪ ls -lh a.out
9-rwxr-xr-x 1 tautvydas tautvydas 16K Dec 27 12:39 a.out
10
11┌─[tautvydas][kagamin][~/tests]
12└─▪ time ./a.out
133000 2997 2994 2991 (omitted by the author)
14
15real 0m0.002s
16user 0m0.001s
17sys 0m0.001s
Running 20 times, the results averaged out to:
- Compilation time:
38.05 ms
- Size of the binary:
16 KiB
- Runtime speed:
1.6 ms
C (optimized for size)
1┌─[tautvydas][kagamin][~/tests]
2└─▪ time gcc -Os -s test.c
3real 0m0.045s
4user 0m0.028s
5sys 0m0.017s
6
7┌─[tautvydas][kagamin][~/tests]
8└─▪ ls -lh a.out
9-rwxr-xr-x 1 tautvydas tautvydas 15K Dec 27 12:41 a.out
10
11┌─[tautvydas][kagamin][~/tests]
12└─▪ time ./a.out
133000 2997 2994 2991 (omitted by the author)
14
15real 0m0.002s
16user 0m0.002s
17sys 0m0.000s
Running 20 times, the results averaged out to:
- Compilation time:
42.15 ms
- Size of the binary:
15 KiB
- Runtime speed:
1.7 ms
C (optimized for speed)
1┌─[tautvydas][kagamin][~/tests]
2└─▪ time gcc -O3 -s test.c
3real 0m0.046s
4user 0m0.033s
5sys 0m0.012s
6
7┌─[tautvydas][kagamin][~/tests]
8└─▪ ls -lh a.out
9-rwxr-xr-x 1 tautvydas tautvydas 15K Dec 27 12:42 a.out
10
11┌─[tautvydas][kagamin][~/tests]
12└─▪ time ./a.out
133000 2997 2994 2991 (omitted by the author)
14
15real 0m0.002s
16user 0m0.000s
17sys 0m0.002s
Running 20 times, the results averaged out to:
- Compilation time:
47.75 ms
- Size of the binary:
15 KiB
- Runtime speed:
1.6 ms
Rust
1┌─[tautvydas][kagamin][~/tests]
2└─▪ rustc --version
3rustc 1.65.0 (897e37553 2022-11-02)
4
5┌─[tautvydas][kagamin][~/tests]
6└─▪ time rustc test.rs
7real 0m0.167s
8user 0m0.131s
9sys 0m0.042s
10
11┌─[tautvydas][kagamin][~/tests]
12└─▪ ls -lh test
13-rwxr-xr-x 1 tautvydas tautvydas 3.9M Dec 27 15:26 test
14
15┌─[tautvydas][kagamin][~/tests]
16└─▪ strip test
17
18┌─[tautvydas][kagamin][~/tests]
19└─▪ ls -lh test
20-rwxr-xr-x 1 tautvydas tautvydas 319K Dec 27 15:26 test
21
22┌─[tautvydas][kagamin][~/tests]
23└─▪ time ./test
243000 2997 2994 2991 (omitted by the author)
25
26real 0m0.003s
27user 0m0.000s
28sys 0m0.003s
At first, Rust took 167 ms
to compile the program, which weighted 3.9 MiB
. After removing debug symbols from the binary, the binary now weighs 319 KiB
, putting it considerably behind WisniaLang.
Running 20 times, the results averaged out to:
- Compilation time:
156.2 ms
- Size of the binary:
319 KiB
- Runtime speed:
2.15 ms
Rust (optimized for size + libc)
This would necessitate a revision of the previously mentioned sample program to utilize libc
rather than the standard library (std::*
). The Cargo.toml
file is provided below, along with a revised example program.
1[package]
2name = "optimized-size"
3version = "0.1.0"
4
5[profile.release]
6panic = "abort"
7lto = true
8strip = true
9codegen-units = 1
10incremental = false
11opt-level = "z"
12
13[dependencies]
14libc = { version = "0.2", default-features = false }
1#![no_std]
2#![no_main]
3
4extern crate libc;
5use libc::c_uint;
6
7fn foo(base: u16, number: u16) {
8 if number > 0 {
9 unsafe {
10 libc::printf("%u \0".as_ptr() as *const libc::c_char, (base * number) as c_uint);
11 }
12 foo(base, number - 1);
13 }
14}
15
16#[no_mangle]
17pub extern "C" fn main() {
18 unsafe {
19 foo(3, 1000);
20 libc::printf("1\n".as_ptr() as *const libc::c_char);
21 libc::exit(0)
22 }
23}
24
25#[panic_handler]
26fn my_panic(_info: &core::panic::PanicInfo) -> ! {
27 loop {}
28}
Let’s see how well it performs now, excluding the time it took to compile libc
.
1┌─[tautvydas][kagamin][~/tests/rust-optim-size]
2└─▪ time cargo build --release
3 Compiling optimized-size v0.1.0 (~/tests/rust-optim-size)
4 Finished release [optimized] target(s) in 0.17s
5
6real 0m0.222s
7user 0m0.188s
8sys 0m0.034s
9
10┌─[tautvydas][kagamin][~/tests/rust-optim-size]
11└─▪ ls -lh target/release/optimized-size
12-rwxr-xr-x 2 tautvydas tautvydas 14K Dec 27 15:27 target/release/optimized-size
13
14┌─[tautvydas][kagamin][~/tests/rust-optim-size]
15└─▪ time ./target/release/optimized-size
163000 2997 2994 2991 (omitted by the author)
17
18real 0m0.002s
19user 0m0.000s
20sys 0m0.002s
Running 20 times, the results averaged out to:
- Compilation time:
225.6 ms
- Size of the binary:
14 KiB
- Runtime speed:
1.75 ms
Rust (optimized for speed + libc)
Same Cargo.toml
file as before, but with opt-level
set to 3
.
1┌─[tautvydas][kagamin][~/tests/rust-optim-speed]
2└─▪ time cargo build --release
3 Compiling optimized-speed v0.1.0 (~/tests/rust-optim-speed)
4 Finished release [optimized] target(s) in 0.17s
5
6real 0m0.226s
7user 0m0.187s
8sys 0m0.039
9
10┌─[tautvydas][kagamin][~/tests/rust-optim-speed]
11└─▪ ls -lh target/release/optimized-speed
12-rwxr-xr-x 2 tautvydas tautvydas 14K Dec 27 15:28 target/release/optimized-speed
13
14┌─[tautvydas][kagamin][~/tests/rust-optim-speed]
15└─▪ time ./target/release/optimized-speed
163000 2997 2994 2991 (omitted by the author)
17
18real 0m0.002s
19user 0m0.000s
20sys 0m0.002s
Running 20 times, the results averaged out to:
- Compilation time:
235.5 ms
- Size of the binary:
14 KiB
- Runtime speed:
1.55 ms
Results
WisniaLang excels in the first two benchmark categories (compilation time and produced binary size), but falls short in the third category (speed of the binary), which remains an area for improvement.
Summary
- If compilation speed and binary size are important, dropping the LLVM toolchain can have a positive impact
- However, doing so means missing out on LLVM optimizations as well as support for arbitrary OSes and architectures
