cranelift: 32bit div_s, rem_u, rem_s for aarch64 #9850

Draft
wants to merge 7 commits into
base: main
from

Conversation

MarinPostma
Contributor

Follow-up to #9798, where I did the groundwork for 32-bit division. This PR extends the 32-bit optimizations to rem_u, rem_s, and div_s.

This should close #9766.

@MarinPostma MarinPostma requested a review from a team as a code owner December 18, 2024 12:14
@MarinPostma MarinPostma requested review from cfallin and removed request for a team December 18, 2024 12:14
@MarinPostma MarinPostma marked this pull request as draft December 18, 2024 12:42
@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:aarch64 Issues related to AArch64 backend. labels Dec 18, 2024
Member

@cfallin cfallin left a comment

Looks good, thanks a bunch! Very nice to see the test expectations get shorter. A few thoughts below but nothing major.

(load_constant64_full $I64 extend n))

;; Fallback for integral 32-bit constants
(rule -1 (imm (integral_ty ty) extend n)
Member

Can we say (fits_in_32 ty) rather than ty here, and make this the higher-priority case (so 64-bit is the fallback)? That seems a little cleaner to me than the implicit "everything not I64 is smaller than 64 bits" here (and less likely to break if we try to do other things like support I128 more fully in the future).
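
The priority ordering suggested here can be illustrated with a small Rust analogy. This is only a sketch: `Ty`, `ConstWidth`, and `const_width` are hypothetical names invented for illustration, not Cranelift or ISLE APIs. The explicit "fits in 32 bits" test is tried first, 64-bit is the lower-priority case, and anything wider is left unmatched instead of being silently treated as a narrow type.

```rust
/// Hypothetical stand-in for an integer type; not Cranelift's real `Type`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Ty {
    I8,
    I16,
    I32,
    I64,
    I128,
}

impl Ty {
    /// Width of the type in bits.
    fn bits(self) -> u32 {
        match self {
            Ty::I8 => 8,
            Ty::I16 => 16,
            Ty::I32 => 32,
            Ty::I64 => 64,
            Ty::I128 => 128,
        }
    }
}

/// Hypothetical result: which constant-loading width to use.
#[derive(Debug, PartialEq, Eq)]
enum ConstWidth {
    W32,
    W64,
}

/// Pick a constant width, checking the explicit 32-bit case first.
fn const_width(ty: Ty) -> Option<ConstWidth> {
    match ty {
        // Higher-priority, explicit case: anything that fits in 32 bits.
        t if t.bits() <= 32 => Some(ConstWidth::W32),
        // Lower-priority case: 64-bit constants.
        Ty::I64 => Some(ConstWidth::W64),
        // Wider types (e.g. I128) are not handled by this sketch.
        _ => None,
    }
}
```

With this shape, adding a wider type later does not change which arm the narrow types take; for example, `const_width(Ty::I128)` yields `None` rather than falling into the 32-bit path.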

@@ -758,7 +758,7 @@ impl MachInstEmit for Inst {
ALUOp::EorNot => 0b01001010_001,
ALUOp::AddS => 0b00101011_000,
ALUOp::SubS => 0b01101011_000,
-ALUOp::SDiv => 0b10011010_110,
+ALUOp::SDiv => 0b00011010_110,
Member

Can we merge the SDiv and UDiv cases now?
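
A minimal sketch of what a merged arm could look like, assuming the standard A64 encoding of UDIV/SDIV (register variant), where the two instructions share bits 31..21 apart from the `sf` size bit and differ only in bits 15..10. The names `AluOp`, `OperandSize`, and `enc_div` are hypothetical and this is not the actual Cranelift emit code.

```rust
/// Hypothetical subset of ALU opcodes; not Cranelift's real `ALUOp`.
#[derive(Clone, Copy)]
enum AluOp {
    SDiv,
    UDiv,
}

/// Hypothetical operand size selector.
#[derive(Clone, Copy)]
enum OperandSize {
    Size32,
    Size64,
}

/// Encode `udiv`/`sdiv rd, rn, rm` as a 32-bit A64 instruction word.
fn enc_div(op: AluOp, size: OperandSize, rd: u32, rn: u32, rm: u32) -> u32 {
    assert!(rd < 32 && rn < 32 && rm < 32, "register numbers are 5 bits");

    // Bits 31..21 with the `sf` bit cleared: identical for both opcodes,
    // so the two arms can be merged.
    let top11: u32 = match op {
        AluOp::SDiv | AluOp::UDiv => 0b00011010_110,
    };

    // Bits 15..10 are where signed and unsigned division differ.
    let bits_15_10: u32 = match op {
        AluOp::UDiv => 0b000010,
        AluOp::SDiv => 0b000011,
    };

    // `sf` (bit 31) selects 32-bit (0) or 64-bit (1) operation.
    let sf: u32 = match size {
        OperandSize::Size32 => 0,
        OperandSize::Size64 => 1,
    };

    (sf << 31) | (top11 << 21) | (rm << 16) | (bits_15_10 << 10) | (rn << 5) | rd
}
```

For example, `enc_div(AluOp::UDiv, OperandSize::Size32, 0, 1, 2)` gives `0x1AC2_0820` (`udiv w0, w1, w2`), and the 64-bit form of the same instruction differs only in the top bit.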

;; Helper for placing a `Value` into a `Reg` and validating that it's nonzero.
(decl put_nonzero_in_reg (Value) Reg)
;; It takes a value and entension type, and perform emits the appropriate checks.
Member

s/entension/extension/
s/perform/performs/

;; Helper for placing a `Value` into a `Reg` and validating that it's nonzero.
(decl put_nonzero_in_reg (Value) Reg)
;; It takes a value and entension type, and perform emits the appropriate checks.
;; TODO: restore spec
Member

cc @avanhatt @mmcloughlin -- maybe the first instance of active work on the aarch64 backend that needs to update a spec. I definitely don't think we should block this PR on it (so don't worry about this, @MarinPostma!) but it's worth thinking what our short and medium term approaches will be to this since we're upstreamed but don't have a nice CI-integrated workflow yet -- should we keep a queue of such TODOs somewhere?

Contributor Author

About that: I meant to restore them at some point, but I don't know how to run the verification.

Member

Yep, the integration with the normal dev workflow is still very much an open question; we could have you ramp up on that but I don't think it's at the point that we want to require that of everyone yet.

let value = match extend_to {
OperandSize::Size32 => {
if bits < 32 {
if *extend == generated_code::ImmExtend::Sign {
Member

Can we make this a `match` (with `Sign` and `Zero` cases)?

},
OperandSize::Size64 => {
if bits < 64 {
if *extend == generated_code::ImmExtend::Sign {
Member

Likewise here, and we can pull the `if bits < 64` in as a guard on one of the match arms as well.
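
The shape being suggested in these two comments might look roughly like the following. This is a hedged sketch, not the PR's code: `ImmExtend`, `OperandSize`, `mask`, and `extend_imm` are hypothetical stand-ins, and the exact extension semantics of the real helper are an assumption. The width check moves into guards on the `Sign` and `Zero` arms instead of nested `if` statements.

```rust
/// Hypothetical mirror of the extension kind; not the generated ISLE type.
#[derive(Clone, Copy)]
enum ImmExtend {
    Sign,
    Zero,
}

/// Hypothetical target operand size.
#[derive(Clone, Copy)]
enum OperandSize {
    Size32,
    Size64,
}

/// Keep only the low `bits` bits of `value`.
fn mask(value: u64, bits: u32) -> u64 {
    if bits >= 64 {
        value
    } else {
        value & ((1u64 << bits) - 1)
    }
}

/// Extend a `bits`-wide constant to the width selected by `extend_to`.
fn extend_imm(value: u64, bits: u32, extend: ImmExtend, extend_to: OperandSize) -> u64 {
    assert!((1..=64).contains(&bits), "constant width must be 1..=64");
    let target_bits = match extend_to {
        OperandSize::Size32 => 32,
        OperandSize::Size64 => 64,
    };

    match extend {
        // Narrow constant, sign-extended: replicate the top bit, then
        // truncate to the target width.
        ImmExtend::Sign if bits < target_bits => {
            let shift = 64 - bits;
            let signed = ((value << shift) as i64) >> shift;
            mask(signed as u64, target_bits)
        }
        // Narrow constant, zero-extended: clear everything above `bits`.
        ImmExtend::Zero if bits < target_bits => mask(value, bits),
        // Already at least as wide as the target: just truncate.
        ImmExtend::Sign | ImmExtend::Zero => mask(value, target_bits),
    }
}
```

Listing both variants in the final arm, rather than using `_`, keeps the match exhaustive by name, so adding a new extension kind later would produce a compile error instead of silently taking the fallback.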

@@ -438,6 +438,7 @@ fn check_addr<'a>(
if ctx.subsumes_fact_optionals(loaded_fact.as_ref(), result_fact) {
Ok(())
} else {
+dbg!();
Member

Debugging code left in?

@MarinPostma
Contributor Author

MarinPostma commented Dec 18, 2024

Hey @cfallin, I'm still fixing a bunch of stuff, which is why I put this in draft. I'll incorporate your review as soon as I manage to fix the tests :)

@github-actions github-actions bot added the isle Related to the ISLE domain-specific language label Dec 18, 2024

Subscribe to Label Action

cc @cfallin, @fitzgen

This issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "isle"

Thus the following users have been cc'd because of the following labels:

  • cfallin: isle
  • fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Labels
cranelift:area:aarch64 Issues related to AArch64 backend. cranelift Issues related to the Cranelift code generator isle Related to the ISLE domain-specific language
Projects
None yet
Development

Successfully merging this pull request may close these issues.

winch(aarch64): Improve 32-bit {s,u}div
2 participants