Stack Allocated Closures: Turning a Bug Into an Optimization #1690

Brian-M-J · 2024-01-27T06:22:05Z

Brian-M-J
Jan 27, 2024

The "Bug"

As mentioned in #217, stack allocated closures were causing weird memory bugs, and this was fixed by allocating their memory on the heap instead. But what if we could fix the memory issues while still allowing programmers a choice in whether closure state is allocated on the stack or heap?

How Would Stack Allocated Closures Benefit Programmers?

The call site is statically known, enabling further compiler analyses and optimizations like function inlining etc.
If this were compiled to target a CPU, there would be no reliance on the CPU's branch target predictor at runtime.

Thus we can increase the execution speed of programs by capturing closure state on the stack. This strategy is employed by Roc, a purely functional programming language that aims to outperform languages like Java and Go.

How is it Done?

Lambda sets underpin the compiler infrastructure to perform defunctionalization, the transformation used to perform this optimization. Defunctionalization replaces higher-order functions with equivalent first-order ones (Explanations are given here and here).

Any Reference Implementations?

I already mentioned Roc, but its implementation isn't complete (they're asking for help). Maybe the Mojo and Roc teams can work together on this so that both languages can benefit.

There's also Morphic, which is a research language mentioned in this paper which inspired Roc.

What are the Tradeoffs?

It's the classic space v.s. time tradeoff. This is why I suggested it as optional instead of mandatory:

If a function is called multiple times with unique functions:

a unique copy of caller must be stamped out for each unique argument (monomorphization, increasing code size)
or, direct calls can be used, but must be guarded by a jump table

Also, it means that the smallest captures take as much space on the stack as the largest captures, which can be exceptionally wasteful if the size discrepancy is large.

This scheme makes incremental and separate compilation more challenging, as the addition of a new function can change the memory representation of values far away.

gryznar · 2024-01-27T22:59:01Z

gryznar
Jan 27, 2024

+1 Moving to heap allocation seems to be like a regression (at least in term of speed). This could really hurt pure functional approach

0 replies

Mogball · 2024-01-29T17:42:33Z

Mogball
Jan 29, 2024

+1. Thanks for bringing this up! We moved the closure state from stack to the heap in order to avoid an obvious breach in safety in the language, so that code that looks safe is safe by default. There were other safety issues with closures before the update, like around how copy constructors and destructors of captured values were called. These are all fixed in 0.7.

In the long term, there are a couple of approaches to keeping closure state on the stack (for optimizations reasons, as you mentioned, but also because if the closure doesn't not escape, heap allocations are more costly). A heap-to-stack optimization pass would perform this by default, or we alter the codegen to guarantee that closures are stack-allocated unless they escape the owning function.

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Stack Allocated Closures: Turning a Bug Into an Optimization #1690

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Replies: 2 comments

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

Select a reply

Stack Allocated Closures: Turning a Bug Into an Optimization #1690

Brian-M-J Jan 27, 2024

The "Bug"

How Would Stack Allocated Closures Benefit Programmers?

How is it Done?

Any Reference Implementations?

What are the Tradeoffs?

Replies: 2 comments

gryznar Jan 27, 2024

Mogball Jan 29, 2024

Brian-M-J
Jan 27, 2024

gryznar
Jan 27, 2024

Mogball
Jan 29, 2024