wasm64, >4 GiB indexing, and a 64-bit lock-free guarantee #336

sunfishcode · 2015-09-05T18:37:42Z

According to clang, all common 64-bit CPUs (x86-64, arm64, mips64, ppc64, sparcv9, and systemz, plus bpf, if that counts) support a 64-bit integer type as "lock free". In the spirit of #327, this is a significant agreement among 64-bit architectures, but not among 32-bit architectures.

I propose we think about the >4GiB linear memory feature as belonging to a distinct "architecture" called wasm64, when we need to distinguish it from wasm32. This will allow us to say that wasm64 has up to 64-bit lock-free integers, while wasm32 only has up to 32-bit lock-free integers. If we add signal handling it could also let us say that wasm64 has up to 64-bit signal-atomic integers (sig_atomic_t). It would also make it obvious what types to use around page_size and resize_memory.
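As a rough illustration of the lock-free distinction, the C++17 sketch below asks the native toolchain which atomic widths it guarantees to be lock-free. The helper names are hypothetical, and the answer describes the CPU target the code is compiled for, not any wasm engine; `is_always_lock_free` is the standard compile-time query.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helpers: report whether the native target guarantees
// lock-free atomics of a given width (C++17 compile-time query).
constexpr bool has_lock_free_32() {
    return std::atomic<std::uint32_t>::is_always_lock_free;
}

constexpr bool has_lock_free_64() {
    return std::atomic<std::uint64_t>::is_always_lock_free;
}

// Widest guaranteed lock-free width in bits, among the two widths probed.
// Returns 0 if neither width is guaranteed.
constexpr int widest_lock_free_bits() {
    if (has_lock_free_64()) return 64;
    if (has_lock_free_32()) return 32;
    return 0;
}
```

Under the proposal, a wasm64 host would be expected to land in the 64 case, while a wasm32 host would only need to reach 32.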

Except where it makes sense to make them different, wasm32 and wasm64 would otherwise be kept identical. In particular, wasm32 would still have 64-bit integers.

The main negative consequence of this distinction is that wasm64 code would not be supported on many popular 32-bit CPUs. This is unfortunate, but it would already be the case that code using 64-bit pointers wouldn't run as efficiently as code using 32-bit pointers on 32-bit platforms.

There's a desire to leave open the possibility of having both 32-bit and 64-bit linear memory addressing within a single instance. wasm64 could still be made to support mixing 64-bit indices and 32-bit indices if we choose, for example. We could potentially even permit wasm32 libraries to be linked into wasm64 applications (there would of course be ABI complications at the C/C++ level, but non-C/C++ code might be able to take advantage of this).

AndrewScheidecker · 2015-09-07T15:32:45Z

I think the biggest downside to this is requiring shared modules to be either wasm32 or wasm64. The instance already has to declare peak address space requirements up front, so it's obvious whether a statically linked program can use 32-bit addresses, but it's not so obvious for a shared module.

One way to hide the lack of universal 64-bit atomics is by adding a pointer type to WASM that is always 64 bits, but allows 32-bit runtimes to ignore the top 32 bits. Lock-free atomics on pointers could be guaranteed by allowing 32-bit runtimes to use 32-bit atomics on the lower half of the 64 bits allocated for the pointer in memory.

There would be a few downsides:

- It doesn't guarantee lock-free atomics on 64-bit ints, but I don't think that's as important as lock-free atomics on pointers.
- Reinterpreting 64-bit ints as pointers would be lossy (AFAICT this is allowed by the C++ standard, correct me if I'm wrong).
- It uses 64 bits of memory for each pointer on 32-bit runtimes. But if this is a problem for an application, 32-bit pointers only postpone it.

But the upside is that then WASM only needs a single architecture. 64-bit runtimes don't need to support 32-bit pointers, and 32-bit runtimes can use any module as long as an instance doesn't need too much memory. I think that's worth some superfluous pointer bits.
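A minimal sketch of the pointer-slot idea, assuming a little-endian layout so that the low half of the 64-bit slot comes first in memory. The `PointerSlot` type and the helper functions are hypothetical names used only for illustration; nothing here is an actual wasm type or API.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical 8-byte pointer slot as a 32-bit runtime might see it.
// Assumes little-endian layout: low half first, high half second.
struct PointerSlot {
    std::atomic<std::uint32_t> low{0};  // the half a 32-bit runtime accesses atomically
    std::uint32_t high{0};              // reserved; ignored by a 32-bit runtime
};

// Holds on mainstream targets where a 32-bit atomic occupies 4 bytes.
static_assert(sizeof(PointerSlot) == 8, "slot should occupy a full 64-bit pointer");

// Atomic store and load touch only the lower 32 bits.
inline void store_pointer(PointerSlot& slot, std::uint32_t index) {
    slot.low.store(index, std::memory_order_seq_cst);
}

inline std::uint32_t load_pointer(const PointerSlot& slot) {
    return slot.low.load(std::memory_order_seq_cst);
}

// Compare-and-swap on the lower half; returns true if the swap happened.
inline bool cas_pointer(PointerSlot& slot, std::uint32_t expected, std::uint32_t desired) {
    return slot.low.compare_exchange_strong(expected, desired);
}

// The lossy reinterpretation mentioned in the downsides: the top 32 bits
// of a 64-bit integer are dropped when it is treated as a pointer.
inline std::uint32_t pointer_from_i64(std::uint64_t value) {
    return static_cast<std::uint32_t>(value);
}
```

The slot still costs 8 bytes of memory on a 32-bit runtime, which is the memory overhead listed among the downsides.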

kripken · 2015-09-07T19:33:02Z

@AndrewScheidecker: Doubling the pointer size means a significant regression in performance, probably comparable to the x32/x86-64 difference which is 5-8% in throughput, in addition to using significantly more memory.

taisel · 2015-09-07T20:04:41Z

No way to allow feature detection for code to provide fallback or exit early?

AndrewScheidecker · 2015-09-07T20:42:55Z

@kripken
> Doubling the pointer size means a significant regression in performance, probably comparable to the x32/x86-64 difference which is 5-8% in throughput, in addition to using significantly more memory.

The cost should be less than x32 vs x86-64, since pointers could still be 32-bit values when not stored in memory. I don't know if that saves a lot, but the cost should be limited to the additional memory use and less effective use of the cache+memory bandwidth. (I'm assuming that there will be a pointer type for local and intermediate values to distinguish them from specifically sized integers.)

This is just my opinion, but I would be fine with losing 10% throughput on 32-bit runtimes if it enabled a single wasm "architecture".

You could also keep the wasm32/wasm64 distinction and just use this trick to support wasm64 on 32-bit runtimes. But I think if wasm64 is universally supported with ok performance, then nearly everybody will use it over wasm32.

ghost · 2015-09-08T04:04:49Z

@AndrewScheidecker wasm64 with index masking already gives you these 'pointers'. When the linear memory size is within 32 bits, the top 32 bits are masked off, and the compiler can optimize the code when running on 32-bit runtimes to often use 32-bit ops. The storage will still be 64 bits. I expect a lot of apps will not need wasm64 and will continue to use wasm32 if only for lower memory usage.
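Index masking as described here might look roughly like the sketch below. It assumes the linear memory size is a nonzero power of two so that a single AND suffices; that restriction and all function names are assumptions made for illustration, not something the thread specifies.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: mask for a linear memory whose byte size is a
// nonzero power of two (an assumption made to keep the mask a single AND).
inline std::uint64_t index_mask(std::uint64_t memory_size) {
    assert(memory_size != 0 && (memory_size & (memory_size - 1)) == 0);
    return memory_size - 1;
}

// Apply the mask to a 64-bit index so it always lands inside the memory.
inline std::uint64_t masked_index(std::uint64_t index, std::uint64_t memory_size) {
    return index & index_mask(memory_size);
}

// True when the memory is at most 4 GiB, in which case the mask clears the
// top 32 bits of every index and 32-bit operations are enough.
inline bool fits_in_32_bits(std::uint64_t memory_size) {
    return memory_size <= (std::uint64_t{1} << 32);
}
```

When `fits_in_32_bits` holds, an engine on a 32-bit host could carry only the low half of each index in registers, even though the stored value still takes 64 bits.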

If there is good support for global constants that can be initialized based on the linear memory size then perhaps the code could adapt to the storage size of pointers, at least in part.

lukewagner · 2015-09-08T15:31:57Z

@sunfishcode Generally lgtm, though I think it'd be good to phrase wasm64 as a feature that can be optionally present and feature-tested for. I think it would also be good to mention that, semantically, wasm64 would just be a mode bit present in Ast.module that is checked by check.ml as part of the validation of various operations. Thus, there is just one "wasm" (one semantics, one binary encoding) and "wasm32"/"wasm64" are terms referring to modules that set this mode one way or the other.
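To make the "mode bit" framing concrete, here is a small C++ sketch of a validator that consults a per-module flag. It is not the spec repository's `Ast.module` or `check.ml`; the `Module` struct, the `is_wasm64` field, and the strict rule that index operands must match the mode exactly are all assumptions for illustration.

```cpp
#include <cassert>

enum class ValueType { I32, I64 };

// Hypothetical module representation: a single flag selects the mode.
struct Module {
    bool is_wasm64 = false;
};

// The index type that memory operations expect under the module's mode.
inline ValueType index_type(const Module& module) {
    return module.is_wasm64 ? ValueType::I64 : ValueType::I32;
}

// Validation step for a memory access: the index operand must have the
// type implied by the mode bit (a strict rule assumed for this sketch).
inline bool check_memory_index(const Module& module, ValueType operand) {
    return operand == index_type(module);
}
```

Everything else in validation would be shared, which matches the idea that "wasm32" and "wasm64" name two settings of one format rather than two formats.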

jfbastien · 2015-09-08T17:33:48Z

As discussed offline, this lgtm. I expect that we'll refine our thinking as we implement more, but this seems like a good step forward.

sunfishcode · 2015-09-14T19:56:23Z

Updated to include a mention that wasm64 is just a mode bit present in the module.

sunfishcode · 2015-09-14T20:01:22Z

Updated to add a mention of a feature-test API.

lukewagner · 2015-09-14T20:30:20Z

FAQ.md

+
+## Why have wasm32 and wasm64, instead of just using 8 bytes for storing pointers?
+
+A great number of applications that don't ever need as much as 4 GiB of memory.


lukewagner · 2015-09-14T20:31:44Z

Looks great, thanks! The Knuth quote is a nice cherry on top.

jfbastien · 2015-09-14T20:52:20Z

lgtm, I fixed @lukewagner's nit and will now merge. Agreed on the Knuth quote, very nice :-)

sunfishcode added 2 commits September 5, 2015 11:35

Add an FAQ entry explaining the lack of an abstract size_t.

863d084

sunfishcode added 2 commits September 8, 2015 11:32

Add an FAQ about using 8 bytes for storing pointers.

ac42e69

Clarify that wasm32 vs wasm64 are just a flag in the module header.

6a00938

Add a sentance mentioning APIs for feature-testing wasm64 or wasm32.

838126b

lukewagner reviewed Sep 14, 2015

Update FAQ.md

8f649fc

jfbastien added a commit that referenced this pull request Sep 14, 2015

Merge pull request #336 from WebAssembly/wasm64

ae649cd

jfbastien merged commit ae649cd into master Sep 14, 2015

jfbastien deleted the wasm64 branch September 14, 2015 20:52

sunfishcode mentioned this pull request Mar 19, 2016

Provide a guaranteed-lock-free type? tc39/proposal-ecmascript-sharedmem#46

Closed
