On reducing the number of Moebius transforms for subset convolution

Hi, Codeforces! I want to present, from scratch, a simple way to understand subset convolution and a simple optimization of it.

Actually, I do not know whether this trick is already well known or simply outdated.

If there are any mistakes, please point them out; I will be happy to fix them (if I am able to). Thanks in advance.

We know that subset convolution is equivalent to truncated multivariate polynomial multiplication like

$$$ A(x_1,\dots ,x_n)B(x_1,\dots ,x_n)\bmod (x_1^2,\dots ,x_n^2) $$$

or, written more compactly,

$$$ A(\mathbf{x})B(\mathbf{x})\bmod{\mathbf{x}^2} $$$
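To make the definition concrete, here is a minimal brute-force sketch. It assumes (my convention, not stated in the post) that a polynomial in $$$n$$$ variables is stored as a list of length $$$2^n$$$, where index `s` is the bitmask of the variables appearing in the term. The name `subset_convolution_naive` is hypothetical. Truncating modulo $$$\mathbf{x}^2$$$ means only pairs of disjoint masks survive.

```python
def subset_convolution_naive(a, b):
    # c[s] = sum of a[t] * b[s ^ t] over all submasks t of s.
    # Products of overlapping masks would contain some x_k^2 and are dropped.
    size = len(a)  # assumed to be a power of two, equal to len(b)
    c = [0] * size
    for s in range(size):
        t = s
        while True:
            c[s] += a[t] * b[s ^ t]
            if t == 0:
                break
            t = (t - 1) & s  # next submask of s
    return c


if __name__ == "__main__":
    # (1 + 2*x1)(3 + 4*x1) mod x1^2 = 3 + 10*x1
    print(subset_convolution_naive([1, 2], [3, 4]))
```

This runs in $$$O(3^n)$$$ and is only useful as a reference to check faster versions against.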

We also know how to compute

$$$ A(\mathbf{x})B(\mathbf{x})\bmod{\left(\mathbf{x}^2-\mathbf{x}\right)} $$$

using an FFT-like approach: evaluate $$$A(\mathbf{x})\bmod{(\lbrace x_1,x_1-1\rbrace \times \cdots \times\lbrace x_n,x_n-1\rbrace)}$$$ (and likewise for $$$B$$$), multiply pointwise, and use the Chinese remainder theorem to restore the result. These two steps are called the Zeta transform and the Moebius transform respectively. But the coefficients are then “mixed”: since $$$\mathbf{x}^2$$$ is folded onto $$$\mathbf{x}$$$, we cannot tell which part of a coefficient came from $$$\mathbf{x}^2$$$ and which from $$$\mathbf{x}$$$. One idea is to introduce a new variable $$$t$$$ and compute $$$A(\mathbf{x})B(\mathbf{x})\bmod{\left(\mathbf{x}^2-\mathbf{x}t\right)}$$$ instead; setting $$$t\gets 0$$$ then gives the result of the subset convolution. Here is the main idea of the whole algorithm, which may already be familiar:
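As an illustration of the multiplication modulo $$$\mathbf{x}^2-\mathbf{x}$$$, here is a small sketch of the Zeta and Moebius transforms. The function names and the bitmask-indexed list representation are my own assumptions. Evaluating at every point of $$$\lbrace 0,1\rbrace^n$$$ amounts to summing over submasks, and the Moebius transform undoes that.

```python
def zeta(f):
    # Evaluate at all points of {0,1}^n: out[s] = sum of f[t] over submasks t of s.
    f = list(f)
    n = len(f).bit_length() - 1  # len(f) is assumed to be 2^n
    for k in range(n):
        bit = 1 << k
        for s in range(len(f)):
            if s & bit:
                f[s] += f[s ^ bit]
    return f


def moebius(f):
    # Inverse of zeta: recover the coefficients from the point values.
    f = list(f)
    n = len(f).bit_length() - 1
    for k in range(n):
        bit = 1 << k
        for s in range(len(f)):
            if s & bit:
                f[s] -= f[s ^ bit]
    return f


def multiply_mod_x2_minus_x(a, b):
    # Pointwise product of the evaluations, then interpolate back.
    return moebius([p * q for p, q in zip(zeta(a), zeta(b))])


if __name__ == "__main__":
    # (1 + 2*x1)(3 + 4*x1) = 3 + 10*x1 + 8*x1^2, and x1^2 is folded onto x1.
    print(multiply_mod_x2_minus_x([1, 2], [3, 4]))  # [3, 18]
```

In the one-variable example the coefficient 18 is the "mixed" value $$$10+8$$$: the wanted coefficient of $$$x_1$$$ plus the unwanted coefficient of $$$x_1^2$$$.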

First, we define $$$s$$$ for each term of $$$A(\mathbf{x}),B(\mathbf{x})$$$:

$$$ s\left(c x_1^{d_1}x_2^{d_2}\cdots x_n^{d_n}\right):=\sum_{k=1}^nd_k $$$

and split $$$A(\mathbf{x})$$$ (and likewise $$$B(\mathbf{x})$$$) into parts

$$$ A_k(\mathbf{x})\text{ such that }s(y)=k\text{ for any term }y\text{ of }A_k(\mathbf{x}) $$$

then we compute

$$$ A(\mathbf{x})B(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x}t)}=\left(\sum_i A_i(\mathbf{x})\right)\left(\sum_i B_i(\mathbf{x})\right)\bmod{(\mathbf{x}^2-\mathbf{x}t)} $$$

Among the $$$O(n^2)$$$ products of the form $$$A_i(\mathbf{x})B_j(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x}t)}$$$, we have the invariant $$$s(y)+\deg_t(y)=i+j$$$ for any term $$$y$$$ of $$$A_i(\mathbf{x})B_j(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x}t)}$$$. So even if we only compute $$$A_i(\mathbf{x})B_j(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x})}$$$, we can restore $$$A_i(\mathbf{x})B_j(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x}t)}$$$ from it: the power of $$$t$$$ attached to a term $$$y$$$ must be $$$i+j-s(y)$$$. In the end we simply discard the coefficients that carry a positive power of $$$t$$$.
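The procedure described so far can be sketched as follows. This is a minimal version under my own assumptions (bitmask-indexed lists, hypothetical function names), not the code from the submission. It sums the pointwise products with the same $$$i+j$$$ before inverting, so it uses one Moebius transform per value of $$$i+j\le n$$$, and it keeps only the terms with $$$s(y)=i+j$$$, i.e. those with $$$\deg_t(y)=0$$$.

```python
def zeta(f):
    # out[s] = sum of f[t] over submasks t of s; len(f) is assumed to be 2^n.
    f = list(f)
    for k in range(len(f).bit_length() - 1):
        bit = 1 << k
        for s in range(len(f)):
            if s & bit:
                f[s] += f[s ^ bit]
    return f


def moebius(f):
    # Inverse of zeta.
    f = list(f)
    for k in range(len(f).bit_length() - 1):
        bit = 1 << k
        for s in range(len(f)):
            if s & bit:
                f[s] -= f[s ^ bit]
    return f


def subset_convolution(a, b):
    size = len(a)
    n = size.bit_length() - 1
    pc = [bin(s).count("1") for s in range(size)]  # s(y) of the term with mask s
    # fa[i] is the Zeta transform of A_i, the part of A whose terms have s = i.
    fa = [zeta([a[s] if pc[s] == i else 0 for s in range(size)]) for i in range(n + 1)]
    fb = [zeta([b[s] if pc[s] == i else 0 for s in range(size)]) for i in range(n + 1)]
    c = [0] * size
    for k in range(n + 1):  # k = i + j; larger k cannot give deg_t = 0
        acc = [0] * size
        for i in range(k + 1):
            fi, fj = fa[i], fb[k - i]
            for s in range(size):
                acc[s] += fi[s] * fj[s]
        res = moebius(acc)
        for s in range(size):
            if pc[s] == k:  # s(y) = i + j, so no power of t is attached
                c[s] = res[s]
    return c


if __name__ == "__main__":
    print(subset_convolution([1, 2, 3, 4], [5, 6, 7, 8]))  # [5, 16, 22, 60]
```

The cost is $$$O(n^2 2^n)$$$, dominated by the $$$O(n^2)$$$ pointwise products and the $$$2(n+1)$$$ Zeta plus $$$n+1$$$ Moebius transforms.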

Another observation is that $$$\deg_t(A_i(\mathbf{x})B_j(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x}t)})\leq (i+j)/2$$$, so every term $$$y$$$ of such a product satisfies $$$s(y)\geq (i+j)/2$$$. For example, if we have $$$\left(\sum_{i+j=4}A_i(\mathbf{x})B_j(\mathbf{x})\right)+\left(\sum_{i+j=9}A_i(\mathbf{x})B_j(\mathbf{x})\right)\bmod{(\mathbf{x}^2-\mathbf{x}t)}$$$, it is still possible to tell the two sums apart: terms of the first sum have $$$2\leq s(y)\leq 4$$$, while terms of the second have $$$5\leq s(y)\leq 9$$$. With this simple trick, we can reduce the number of Moebius transforms by about half.
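One possible way to exploit this is sketched below. The pairing rule is my own choice for illustration and may differ from the submission: with $$$h=\lfloor n/2\rfloor+1$$$, the sums for $$$i+j=k$$$ and $$$i+j=k+h$$$ are added together before a single Moebius transform. Since $$$k<h$$$, the second sum only produces terms with $$$s(y)\geq (k+h)/2>k$$$ and the first only terms with $$$s(y)\leq k$$$, so the coefficients at $$$s(y)=k$$$ and $$$s(y)=k+h$$$ can both be read off. The names and the list representation are assumptions.

```python
def zeta(f):
    # out[s] = sum of f[t] over submasks t of s; len(f) is assumed to be 2^n.
    f = list(f)
    for k in range(len(f).bit_length() - 1):
        bit = 1 << k
        for s in range(len(f)):
            if s & bit:
                f[s] += f[s ^ bit]
    return f


def moebius(f):
    # Inverse of zeta.
    f = list(f)
    for k in range(len(f).bit_length() - 1):
        bit = 1 << k
        for s in range(len(f)):
            if s & bit:
                f[s] -= f[s ^ bit]
    return f


def subset_convolution_paired(a, b):
    size = len(a)
    n = size.bit_length() - 1
    pc = [bin(s).count("1") for s in range(size)]
    fa = [zeta([a[s] if pc[s] == i else 0 for s in range(size)]) for i in range(n + 1)]
    fb = [zeta([b[s] if pc[s] == i else 0 for s in range(size)]) for i in range(n + 1)]
    h = n // 2 + 1  # offset between the two values of i + j that share a transform
    c = [0] * size
    for k in range(h):
        wanted = [k, k + h] if k + h <= n else [k]
        acc = [0] * size
        for r in wanted:  # r = i + j
            for i in range(r + 1):
                fi, fj = fa[i], fb[r - i]
                for s in range(size):
                    acc[s] += fi[s] * fj[s]
        res = moebius(acc)  # one Moebius transform serves up to two values of i + j
        for s in range(size):
            if pc[s] in wanted:
                c[s] = res[s]
    return c


if __name__ == "__main__":
    print(subset_convolution_paired([1, 2, 3, 4], [5, 6, 7, 8]))  # [5, 16, 22, 60]
```

With this pairing the loop performs $$$\lfloor n/2\rfloor+1$$$ Moebius transforms instead of $$$n+1$$$; for $$$n=9$$$ it pairs $$$i+j=4$$$ with $$$i+j=9$$$, as in the example above. The number of Zeta transforms and pointwise products is unchanged.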

This is a submission for the optimized code. I did not implement it carefully.

upd: submission reimplemented.

Blog of user hly1204