Implement number theoretic transform for large integer multiplication #282

Open
wants to merge 73 commits into
base: master
b633aea
Use number theoretic transform for multiplication
byeongkeunahn Aug 27, 2023
98eba1c
Use 2 primes to multiply short arrays
byeongkeunahn Aug 27, 2023
5d8b725
Speed up unbalanced multiplication (1)
byeongkeunahn Aug 28, 2023
5abc879
Support 32bit BigDigit
byeongkeunahn Aug 28, 2023
5d2bdd5
Fix clippy warnings
byeongkeunahn Aug 28, 2023
09edfac
Fix multiplication overflow on 32bit
byeongkeunahn Aug 28, 2023
6700b64
Speed up unbalanced multiplication (2)
byeongkeunahn Aug 28, 2023
dc73b87
Adjust NTT threshold for u32 digits
byeongkeunahn Aug 28, 2023
c803e43
Add more benchmarks for large integers
byeongkeunahn Aug 28, 2023
701bdbc
Update three-prime threshold (44 -> 43)
byeongkeunahn Aug 28, 2023
a888b8e
Update multiplication.rs
byeongkeunahn Aug 28, 2023
d9970e0
Speed up unbalanced multiplication (3)
byeongkeunahn Aug 29, 2023
e441c92
Add DIF-DIT, optimize CRT, etc.
byeongkeunahn Sep 7, 2023
ebc5f0d
Reduce memory access
byeongkeunahn Sep 7, 2023
7e6f558
Optimize add-with-carry
byeongkeunahn Sep 7, 2023
4d4c0dc
Optimize base case multiplication
byeongkeunahn Sep 8, 2023
253031f
Update ntt.rs
byeongkeunahn Sep 8, 2023
a6ca654
Speed up bit repacking
byeongkeunahn Sep 8, 2023
3b32e93
Share the same Vec for all twiddle factors
byeongkeunahn Sep 8, 2023
d3b478f
Pack more bits per one u64 digit if possible
byeongkeunahn Sep 9, 2023
6d4acd0
Don't use intermediate buffer for conv_base
byeongkeunahn Sep 9, 2023
de73df6
Remove unnecessary operation
byeongkeunahn Sep 12, 2023
5350ae6
Fix NTT planning bug
byeongkeunahn Sep 12, 2023
e60b678
Optimize base case multiplication
byeongkeunahn Sep 12, 2023
4475f49
Replace some addmodopt calls with submod
byeongkeunahn Sep 12, 2023
f782e93
Improve NTT planning
byeongkeunahn Sep 15, 2023
6370aa0
Reduce constant multiplication operations
byeongkeunahn Sep 15, 2023
4343d5f
Simplify code
byeongkeunahn Sep 16, 2023
f0b7f96
Simplify code
byeongkeunahn Sep 16, 2023
aab024a
Update multiplication.rs comment
byeongkeunahn Sep 16, 2023
48fcb57
Improve NTT planning
byeongkeunahn Sep 16, 2023
050ee54
Reduce memory access when repacking
byeongkeunahn Sep 17, 2023
22e352b
Fix potential carry bug
byeongkeunahn Sep 17, 2023
1ea06b9
Update ntt.rs
byeongkeunahn Sep 17, 2023
8bd2ab1
Update ntt.rs
byeongkeunahn Sep 17, 2023
5dbbd8c
Improve NTT planning: fix nonmonotonicity up to 1M
byeongkeunahn Sep 18, 2023
9740fd8
Remove unused definitions from ntt.rs
byeongkeunahn Sep 18, 2023
37e626c
Make ntt.rs shorter
byeongkeunahn Sep 18, 2023
9e3e3ed
Make ntt.rs shorter (simplify egcd)
byeongkeunahn Sep 19, 2023
36222a8
Make ntt.rs shorter
byeongkeunahn Sep 19, 2023
771418b
Fix clippy warnings + Make ntt.rs shorter
byeongkeunahn Sep 19, 2023
52eb321
Make ntt.rs shorter
byeongkeunahn Sep 19, 2023
9a33d19
Merge pull request #1 from byeongkeunahn/dif-dit
byeongkeunahn Sep 19, 2023
0e41192
Fix stale comments
byeongkeunahn Sep 19, 2023
bbb563a
Update ntt.rs
byeongkeunahn Sep 19, 2023
ce87545
Make ntt.rs shorter
byeongkeunahn Sep 19, 2023
2f7f1dd
Update ntt.rs
byeongkeunahn Sep 19, 2023
6871c4d
Refactor & fix potential carry bug
byeongkeunahn Sep 19, 2023
cac1830
Update ntt.rs
byeongkeunahn Sep 19, 2023
1d480ff
Make ntt.rs shorter
byeongkeunahn Sep 19, 2023
69c4ec6
Make ntt.rs shorter
byeongkeunahn Sep 19, 2023
bad433f
Make ntt.rs shorter
byeongkeunahn Sep 19, 2023
7dfa583
Merge pull request #2 from byeongkeunahn/develop
byeongkeunahn Sep 19, 2023
f207402
Make ntt.rs shorter
byeongkeunahn Sep 19, 2023
a127070
Merge pull request #3 from byeongkeunahn/develop-2
byeongkeunahn Sep 19, 2023
d7d3de1
Update ntt.rs
byeongkeunahn Sep 19, 2023
eecdf92
Make ntt.rs shorter
byeongkeunahn Sep 20, 2023
a2426fa
Make ntt.rs shorter
byeongkeunahn Sep 21, 2023
86bb2dc
Merge pull request #4 from byeongkeunahn/develop-3
byeongkeunahn Sep 21, 2023
76b5add
Make ntt.rs shorter
byeongkeunahn Sep 21, 2023
1e41e16
Make ntt.rs shorter
byeongkeunahn Sep 21, 2023
a8ee9ff
Improve compile time
byeongkeunahn Sep 21, 2023
0d2a14b
Merge pull request #5 from byeongkeunahn/develop-4
byeongkeunahn Sep 21, 2023
5f564de
Make ntt.rs shorter
byeongkeunahn Sep 21, 2023
c85db2b
A very slight optimization
byeongkeunahn Sep 21, 2023
deacbed
Make ntt.rs shorter
byeongkeunahn Sep 21, 2023
1e71e01
Merge pull request #6 from byeongkeunahn/develop-5
byeongkeunahn Sep 21, 2023
8ecb65d
Make ntt.rs shorter
byeongkeunahn Sep 21, 2023
3f8c2c5
Make ntt.rs shorter
byeongkeunahn Sep 21, 2023
4320297
Merge pull request #7 from byeongkeunahn/develop-6
byeongkeunahn Sep 21, 2023
2f4460b
Improve NTT planning
byeongkeunahn Sep 22, 2023
3bb14f7
Merge pull request #8 from byeongkeunahn/develop-7
byeongkeunahn Sep 22, 2023
2291466
Fix NTT pack/unpack bug with u32 digits
byeongkeunahn Feb 9, 2024
15 changes: 15 additions & 0 deletions benches/bigint.rs
@@ -87,6 +87,21 @@ fn multiply_3(b: &mut Bencher) {
     multiply_bench(b, 1 << 16, 1 << 17);
 }

+#[bench]
+fn multiply_4(b: &mut Bencher) {
+    multiply_bench(b, 100_000, 1_003_741);
+}
+
+#[bench]
+fn multiply_5(b: &mut Bencher) {
+    multiply_bench(b, 2_718_328, 2_738_633);
+}
+
+#[bench]
+fn multiply_6(b: &mut Bencher) {
+    multiply_bench(b, 27_183_279, 27_386_321);
+}
+
 #[bench]
 fn divide_0(b: &mut Bencher) {
     divide_bench(b, 1 << 8, 1 << 6);
30 changes: 30 additions & 0 deletions benches/factorial.rs
@@ -16,6 +16,36 @@ fn factorial_mul_biguint(b: &mut Bencher) {
     });
 }

+fn factorial_product(l: usize, r: usize) -> BigUint {
+    if l >= r {
+        BigUint::from(l)
+    } else {
+        let m = (l+r)/2;
+        factorial_product(l, m) * factorial_product(m+1, r)
+    }
+}
+
+#[bench]
+fn factorial_mul_biguint_dnc_10k(b: &mut Bencher) {
+    b.iter(|| {
+        factorial_product(1, 10_000);
+    });
+}
+
+#[bench]
+fn factorial_mul_biguint_dnc_100k(b: &mut Bencher) {
+    b.iter(|| {
+        factorial_product(1, 100_000);
+    });
+}
+
+#[bench]
+fn factorial_mul_biguint_dnc_300k(b: &mut Bencher) {
+    b.iter(|| {
+        factorial_product(1, 300_000);
+    });
+}
+
 #[bench]
 fn factorial_mul_u32(b: &mut Bencher) {
     b.iter(|| (1u32..1000).fold(BigUint::one(), Mul::mul));
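The `factorial_product` benchmarks above split the range at its midpoint, so the two factors at each level of the recursion have similar sizes. That is presumably the point of the benchmark, since balanced operands are where sub-quadratic multiplication pays off. The sketch below shows the same recursion shape with `u128` standing in for `BigUint` (an assumption, made so it runs without the `num-bigint` crate); `product_tree` is a hypothetical name, not part of the PR.

```rust
/// Product of all integers in `l..=r`, splitting the range at its midpoint.
/// Assumption: `u128` stands in for `BigUint`, so this overflows beyond 34!.
fn product_tree(l: u128, r: u128) -> u128 {
    if l >= r {
        // A single-element range is its own product.
        l
    } else {
        // Midpoint split keeps both halves roughly the same size.
        let m = l + (r - l) / 2;
        product_tree(l, m) * product_tree(m + 1, r)
    }
}
```

Calling `product_tree(1, 20)` gives 20!, the same value as a left-to-right fold, but the last multiplication combines two operands of comparable magnitude instead of one huge and one tiny operand.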
1 change: 1 addition & 0 deletions src/biguint.rs
@@ -22,6 +22,7 @@ mod bits;
 mod convert;
 mod iter;
 mod monty;
+mod ntt;
 mod power;
 mod shift;

14 changes: 12 additions & 2 deletions src/biguint/multiplication.rs
@@ -13,6 +13,8 @@ use core::iter::Product;
 use core::ops::{Mul, MulAssign};
 use num_traits::{CheckedMul, FromPrimitive, One, Zero};

+use super::ntt;
+
 #[inline]
 pub(super) fn mac_with_carry(
     a: BigDigit,
@@ -97,7 +99,7 @@ fn mac3(mut acc: &mut [BigDigit], mut b: &[BigDigit], mut c: &[BigDigit]) {
     // number of operations, but uses more temporary allocations.
     //
     // The thresholds are somewhat arbitrary, chosen by evaluating the results
-    // of `cargo bench --bench bigint multiply`.
+    // of `cargo bench --bench bigint multiply --features rand`.

     if x.len() <= 32 {
         // Long multiplication:
@@ -217,7 +219,7 @@ fn mac3(mut acc: &mut [BigDigit], mut b: &[BigDigit], mut c: &[BigDigit]) {
             }
             NoSign => (),
         }
-    } else {
+    } else if x.len() <= if cfg!(u64_digit) { 512 } else { 2048 } {
         // Toom-3 multiplication:
         //
         // Toom-3 is like Karatsuba above, but dividing the inputs into three parts.
@@ -346,6 +348,14 @@ fn mac3(mut acc: &mut [BigDigit], mut b: &[BigDigit], mut c: &[BigDigit]) {
                 NoSign => {}
             }
         }
+    } else {
+        // Number-theoretic transform (NTT) multiplication:
+        //
+        // NTT multiplies two integers by computing the convolution of the arrays
+        // modulo a prime. Since the result may exceed the prime, we use two or three
+        // distinct primes and combine the results using the Chinese Remainder
+        // Theorem (CRT).
+        ntt::mac3(acc, b, c);
     }
 }

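The NTT comment in `mac3` compresses the whole technique into one sentence. As an illustration only, here is a minimal sketch of two-prime NTT multiplication on little-endian base-2^16 digits. It is not the code in `src/biguint/ntt.rs`: the primes `P1` and `P2`, the `u16` digit type, and the helper names `pow_mod`, `ntt`, `convolve_mod`, and `mul_ntt` are all assumptions, chosen so that every intermediate product fits in a `u64`.

```rust
// Two NTT-friendly primes of the form k * 2^m + 1; 3 is a primitive root of both.
// Assumption: these are illustrative and are not the primes used by the PR.
const P1: u64 = 998_244_353; // 119 * 2^23 + 1
const P2: u64 = 469_762_049; // 7 * 2^26 + 1

/// Computes b^e mod p by square-and-multiply; requires p < 2^32.
fn pow_mod(mut b: u64, mut e: u64, p: u64) -> u64 {
    let mut r: u64 = 1;
    b %= p;
    while e > 0 {
        if e & 1 == 1 {
            r = r * b % p;
        }
        b = b * b % p;
        e >>= 1;
    }
    r
}

/// In-place radix-2 NTT of `a` modulo `p`; `a.len()` must be a power of two.
fn ntt(a: &mut [u64], p: u64, invert: bool) {
    let n = a.len();
    let mut j = 0usize;
    for i in 1..n {
        // Bit-reversal permutation.
        let mut bit = n >> 1;
        while j & bit != 0 {
            j ^= bit;
            bit >>= 1;
        }
        j ^= bit;
        if i < j {
            a.swap(i, j);
        }
    }
    let mut len: usize = 2;
    while len <= n {
        // Primitive len-th root of unity (its inverse for the inverse transform).
        let mut w_len = pow_mod(3, (p - 1) / len as u64, p);
        if invert {
            w_len = pow_mod(w_len, p - 2, p);
        }
        for chunk in a.chunks_mut(len) {
            let (lo, hi) = chunk.split_at_mut(len / 2);
            let mut w: u64 = 1;
            for (x, y) in lo.iter_mut().zip(hi.iter_mut()) {
                let (u, v) = (*x, *y * w % p);
                *x = (u + v) % p;
                *y = (u + p - v) % p;
                w = w * w_len % p;
            }
        }
        len <<= 1;
    }
    if invert {
        let n_inv = pow_mod(n as u64, p - 2, p);
        a.iter_mut().for_each(|x| *x = *x * n_inv % p);
    }
}

/// Linear convolution of `a` and `b` with every coefficient reduced modulo `p`.
fn convolve_mod(a: &[u16], b: &[u16], p: u64) -> Vec<u64> {
    if a.is_empty() || b.is_empty() {
        return Vec::new();
    }
    let out_len = a.len() + b.len() - 1;
    let n = out_len.next_power_of_two();
    let mut fa: Vec<u64> = a.iter().map(|&x| u64::from(x)).collect();
    let mut fb: Vec<u64> = b.iter().map(|&x| u64::from(x)).collect();
    fa.resize(n, 0);
    fb.resize(n, 0);
    ntt(&mut fa, p, false);
    ntt(&mut fb, p, false);
    fa.iter_mut().zip(&fb).for_each(|(x, y)| *x = *x * *y % p);
    ntt(&mut fa, p, true);
    fa.truncate(out_len);
    fa
}

/// Multiplies two little-endian base-2^16 numbers via two NTTs and CRT.
fn mul_ntt(a: &[u16], b: &[u16]) -> Vec<u16> {
    let (c1, c2) = (convolve_mod(a, b, P1), convolve_mod(a, b, P2));
    let p1_inv = pow_mod(P1, P2 - 2, P2); // P1^-1 mod P2 (Fermat)
    let mut out = Vec::with_capacity(c1.len() + 1);
    let mut carry: u64 = 0;
    for (&r1, &r2) in c1.iter().zip(&c2) {
        // CRT: the exact coefficient is r1 + P1 * k with k = (r2 - r1) / P1 mod P2.
        let k = (r2 + P2 - r1 % P2) % P2 * p1_inv % P2;
        let t = r1 + P1 * k + carry;
        out.push(t as u16); // low 16 bits become the digit
        carry = t >> 16;
    }
    while carry > 0 {
        out.push(carry as u16);
        carry >>= 16;
    }
    out
}
```

For example, `mul_ntt(&[0xFFFF; 4], &[0xFFFF; 4])` returns the little-endian base-2^16 digits of (2^64 - 1)^2. Its middle convolution coefficient is 4 * 65535^2, which already exceeds `P1`, so a single prime would give the wrong answer and the CRT step is doing real work. With these particular primes the sketch is exact only while the transform length stays at or below 2^23, the largest power-of-two length that `P1` supports.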