Computing n! modulo pᵏ for small p

Правка en27, от adamant, 2023-05-31 14:22:05

Hi everyone!

There is an article on cp-algorithms about how to compute $$$n!$$$ modulo prime number $$$p$$$. Today I was wondering, how to do it if we need to compute modulo $$$p^k$$$ rather than just $$$p$$$. Turns out one can do it efficiently if $$$p$$$ is small.

Motivational example

Xiaoxu Guo Contest 3 — Binomial Coefficient. Given $$$1 \leq n,k \leq 10^{18}$$$, find $$$\binom{n}{k}$$$ modulo $$$2^{32}$$$.

There are also some projecteuler problems that require it, including with several queries of distinct $$$n$$$ and $$$k$$$.

Task formulation and outline of result

To clarify the task a bit, our ultimate goal here is to be able to compute e.g. binomial coefficients modulo $$$p^k$$$. Thus, what we really need is to represent $$$n! = p^t a$$$, where $$$\gcd(a, p)=1$$$, and then report $$$t$$$ and $$$a$$$ modulo $$$p^k$$$. This is sufficient to also compute $$$\binom{n}{r}$$$ modulo $$$p^k$$$.

We will show that, assuming that polynomial multiplication of size $$$n$$$ requires $$$O(n \log n)$$$ operations, and assuming that arithmetic operations modulo $$$p^k$$$ take $$$O(1)$$$, we can find $$$t$$$ and $$$a$$$ in $$$O(d^2 + dk\log k)$$$, where $$$d=\log_p n$$$. It requires $$$O(pk\log^2 k)$$$ pre-computation that takes $$$O(pk \log k)$$$ memory.

Algorithm when $$$p^k$$$ is also small

First, let's solve a simpler version of this problem.

Library Judge — Binomial Coefficient. Given $$$n$$$, $$$k \leq 10^{18}$$$ and $$$m \leq 10^6$$$, print $$$\binom{n}{k} \bmod m$$$ for $$$T \leq 2 \cdot 10^5$$$ test cases.

Let's denote $$$n!_p$$$ as the product of all numbers from $$$1$$$ to $$$n$$$ that are co-prime with $$$p$$$. That is,

$$$ n!_p = \prod\limits_{\substack{1 \leq j \leq n \\ \gcd(j, p)=1}}j. $$$

In this notion, assuming $$$k=\lfloor \frac{n}{p} \rfloor$$$, we can represent $$$n!$$$ as $$$n! = p^k k! n!_p$$$, which expands recursively into

$$$ n! = \prod\limits_{i=0}^\infty p^{\left\lfloor \frac{n}{p^{i+1}}\right\rfloor} \left\lfloor n/p^i \right\rfloor !_p $$$

On practice, we only care about $$$i \leq \log_p n$$$, and we may pre-compute all possible values of $$$a!_p$$$ modulo $$$p^k$$$ in advance in $$$O(p^k)$$$ for all $$$a < p^k$$$, which then allows us to compute the full formula in $$$O(d)$$$, where $$$d = \log_p n$$$. To do this, we note that

$$$ a!_p \equiv (p^k-1)!_p^{\lfloor a/p^k \rfloor} (a \bmod p^k)!_p \pmod{p^k}, $$$

and $$$(p^k-1)!_p \equiv -1 \pmod{p^k}$$$ for $$$p > 2$$$ and $$$p^k = 4$$$. For other values with $$$p=2$$$ it is $$$1$$$ instead of $$$-1$$$.

Please find the implementation in Python below for details:

Code

It works 5.5s on the aforementioned Library Judge problem.

Algorithm with polynomials

Great thanks to Endagorion for sharing this algorithm with me!

Let's look further into the formulas from the section above. There is a part contributed by numbers divisible by $$$p$$$, which can be reduced to computing $$$k!$$$, and a part contributed by numbers not divisible by $$$p$$$. Let $$$n = a_0 + a_1 p + \dots + a_d p^d$$$, then

$$$ \prod\limits_{\substack{1 \leq j \leq n \\ \gcd(j, p)=1}}j = \prod\limits_{i=0}^d \prod\limits_{\substack{1 \leq j \leq a_i p^i \\\gcd(j, p)=1}} \left(\left\lfloor\frac{n}{p^{i+1}} \right\rfloor p^{i+1}+j\right). $$$

To compute it quickly, we can define a family of polynomial $$$P_{i,a}(x)$$$ for $$$0 \leq a \leq p$$$ such that

$$$ P_{i, a}(x) = \prod\limits_{\substack{1 \leq j\leq a p^{i-1} \\ \gcd(j, p)=1}}(xp^{i}+j), $$$

so the value of the factorial would be represented as

$$$ n! = p^k k! \prod\limits_{i=0}^d P_{i+1,a_i}\left(\left\lfloor\frac{n}{p^{i+1}} \right\rfloor\right), $$$

and expanding $$$k!$$$ it then rewrites into

$$$ n! = \prod\limits_{i=0}^{d} p^{\left\lfloor \frac{n}{p^{i+1}} \right\rfloor} \prod\limits_{j=0}^{d-i} P_{j+1,a_{i+j}}\left(\left\lfloor\frac{n}{p^{i+j+1}} \right\rfloor\right), $$$

which simplifies as

$$$ \boxed{n! = \prod\limits_{i=0}^{d} p^{\left\lfloor \frac{n}{p^{i+1}} \right\rfloor} \prod\limits_{j=0}^{i} P_{j+1,a_{i}}\left(\left\lfloor\frac{n}{p^{i+1}} \right\rfloor\right)} $$$

Now, what would it take us to use this setup? First of all, notice that $$$P_{i,a}(x)$$$ can be computed from one another:

$$$ P_{i,a}(x) = \prod\limits_{\substack{1 \leq j\leq a p^{i-1} \\ \gcd(j, p)=1}}(xp^{i}+j) = \prod\limits_{t=0}^{a-1}\prod\limits_{\substack{1 \leq j\leq p^{i-1} \\ \gcd(j, p)=1}}(xp^{i}+tp^{i-1}+j)=\prod\limits_{t=0}^{a-1} P_{i-1,p}(px+t). $$$

Note that for shorter and more consistent implementation, this recurrent formula also mostly works for $$$i=1$$$ if we put $$$P_{0, p}(x) = x+1$$$, but for $$$P_{1,p}$$$ we should go up to $$$p-2$$$ instead of $$$p-1$$$. We should also note that for $$$P_{i,a}(x)$$$, we only care for coefficients up to $$$x^{\lfloor x/i \rfloor}$$$, as the larger ones are divisible by $$$p^k$$$.

This allows to compute $$$P_{i,a}(x)$$$ in $$$O(\frac{pk \log k}{i})$$$ for all $$$a$$$ for a given $$$i$$$. Over all $$$i$$$ from $$$1$$$ to $$$k$$$ it sums up to $$$O(pk \log^2 k)$$$ time and $$$O(p k \log k)$$$ memory. Then, evaluating all the polynomials for any specific $$$a$$$ requires $$$O(d^2 + dk \log k)$$$ operations, where $$$d = \log_p n$$$.

As it requires some manipulations with polynomials, I implemented it in Python with sympy just as a proof of concept:

Code

Algorithm with $$$p$$$-adic logarithms and exponents

The algorithm above requires some heavy machinery on polynomials. I also found out an alternative approach that allows to compute $$$t$$$ and $$$a$$$ for any given $$$n$$$ divisible by $$$p$$$ in $$$O(dk^2)$$$ with $$$O(pk)$$$ precomputation, where $$$d = \log_p n$$$. It doesn't rely on polynomial operations, except for Lagrange interpolation.

Let $$$n=pt+b$$$, where $$$0 \leq b < p$$$, then we can represent its factorial as

$$$ n! = p^t t! \prod\limits_{i=1}^{b} \prod\limits_{j=0}^{t} \left(pj+i\right)\prod\limits_{i=b+1}^{p-1} \prod\limits_{j=0}^{t-1} \left(pj+i\right). $$$

We can further rewrite it as

$$$ n! = p^t t! (p-1)!^t b! \prod\limits_{i=1}^{b} \prod\limits_{j=0}^{t} \left(1+\frac{j}{i}p\right)\prod\limits_{i=b+1}^{p-1} \prod\limits_{j=0}^{t-1} \left(1+\frac{j}{i}p\right) $$$

Let's learn how to compute the product

$$$ A(b, t) = \prod\limits_{i=1}^b \prod\limits_{j=0}^t \left(1+\frac{j}{i}p\right), $$$

as with it we can represent the factorial as

$$$ n! = p^t t! (p-1)!^t b! A(b, t)\frac{A(p-1,t-1)}{A(b,t-1)}. $$$

We can take a $$$p$$$-adic logarithm (please refer to this article for definitions and properties) from the product to get

$$$ \boxed{\log A(b, t) = \sum\limits_{i=1}^{b} \sum\limits_{j=0}^{t} \log\left(1+\frac{j}{i}p\right)=\sum\limits_{z=1}^\infty \frac{(-1)^{z+1}p^z}{z} \sum\limits_{i=1}^{b} i^{-z} \sum\limits_{j=0}^{t} j^z} $$$

We can precompute sums of $$$i^{-z}$$$ for each $$$z$$$ up to $$$k$$$ and $$$b$$$ up to $$$p-1$$$ in $$$O(pk)$$$, and we can find the sum of $$$j^z$$$ for $$$j$$$ from $$$0$$$ to $$$t$$$ in $$$O(z)$$$ via Lagrange interpolation. Therefore, with $$$O(pk)$$$ precomputation, we can compute $$$\log A(b, t)$$$ in $$$O(k^2)$$$, which then allows to find $$$n!$$$ in $$$O(dk^2)$$$, where $$$d = \log_p n$$$. I implemented it in Python, as a proof of concept (incl. fast sums over $$$i$$$ and $$$j$$$):

Code

Note: Due to inherent properties of $$$p$$$-adic logarithms and exponents, the algorithm only works with $$$p>2$$$, and it might return $$$-n!$$$ instead with $$$p=2$$$. This is because $$$\exp \log z = -z$$$ with $$$p=2$$$ when $$$z$$$ has remainder $$$3$$$ modulo $$$4$$$. It is accounted for in the code by tracking the number of integers below $$$n$$$ that have remainder $$$3$$$ modulo $$$4$$$.

I also tested the solution on the motivational problem and got AC in Python: submission.

Some other approaches

There is also a blog by prabowo, based apparently on another blog by Min_25, and a scholarly article by Andrew Granville. I'm not sure whether they talk about the same algorithms as described here.

Теги tutorial, p-adic numbers, factorial

История

 
 
 
 
Правки
 
 
  Rev. Язык Кто Когда Δ Комментарий
en27 Английский adamant 2023-05-31 14:22:05 2 Tiny change: '\ \gcd(j, n)=1}}j = \' -> '\ \gcd(j, p)=1}}j = \'
en26 Английский adamant 2023-05-31 14:21:17 2 Tiny change: '\ \gcd(j, n)=1}}j.\n$' -> '\ \gcd(j, p)=1}}j.\n$'
en25 Английский adamant 2023-05-31 14:10:41 60
en24 Английский adamant 2023-05-31 14:10:03 531
en23 Английский adamant 2023-05-31 14:04:45 256
en22 Английский adamant 2023-05-31 13:58:15 2269
en21 Английский adamant 2023-05-31 02:45:28 41
en20 Английский adamant 2023-05-31 02:29:49 308
en19 Английский adamant 2023-05-31 02:26:05 308
en18 Английский adamant 2023-05-31 02:20:22 1037 Tiny change: 'lgorithm w' -> 'lgorithm when $p^k$ is also small\n\n\n\n#### Algorithm w'
en17 Английский adamant 2023-05-23 18:48:21 57
en16 Английский adamant 2023-05-23 17:46:45 41
en15 Английский adamant 2023-05-23 17:45:16 42
en14 Английский adamant 2023-05-23 17:44:45 1268
en13 Английский adamant 2023-05-23 03:22:28 22 Tiny change: 'omputation. It doesn' -> 'omputation, where $d = \log_p n$. It doesn'
en12 Английский adamant 2023-05-23 03:21:59 30 Tiny change: '$p$ in $O(pk)$. It doesn' -> '$p$ in $O(dk^2)$ with $O(pk)$ precomputation. It doesn'
en11 Английский adamant 2023-05-23 03:20:54 216
en10 Английский adamant 2023-05-23 03:19:19 175
en9 Английский adamant 2023-05-23 03:14:42 10
en8 Английский adamant 2023-05-23 03:07:15 184
en7 Английский adamant 2023-05-23 02:57:31 5 Tiny change: 'k, a = fct_test(t)\n ' -> 'k, a = fct(t)\n '
en6 Английский adamant 2023-05-23 00:40:56 153
en5 Английский adamant 2023-05-23 00:04:31 106
en4 Английский adamant 2023-05-22 23:23:37 625
en3 Английский adamant 2023-05-22 23:23:03 62
en2 Английский adamant 2023-05-22 23:22:24 339
en1 Английский adamant 2023-05-22 23:16:26 7960 Initial revision (published)