hly1204's blog

By hly1204, history, 2 years ago, In English

A simple way to understand the transposed algo of multi-eval

A simple way to understand the transposed algorithm of multi-point evaluation of polynomials in Tellegen’s Principle into Practice.

I learned this from Bernstein's paper Scaled Remainder Trees. If there are any mistakes, please tell me, thanks! (Sorry for my poor English)

Let $$$\mathbb{R}((x))$$$ denote the ring of formal Laurent series, $$$\mathbb{R}\lbrack x\rbrack$$$ denote the ring of polynomial. $$$f(x)\bmod 1:=f_{-1}x^{-1}+f_{-2}x^{-2}+\cdots$$$.

Bernstein showed that division in $$$\mathbb{R}\lbrack x\rbrack$$$ is division in $$$\mathbb{R}((x^{-1}))$$$. Here is a concrete example. Let $$$A(x):=1+x+4x^2+5x^3+x^4+4x^5\in\mathbb{R}\lbrack x\rbrack$$$ and $$$B(x):=1+9x+8x^2+x^3\in\mathbb{R}\lbrack x\rbrack$$$, we want to compute $$$Q(x),R(x)\in\mathbb{R}\lbrack x\rbrack$$$ such that $$$A(x)=B(x)Q(x)+R(x)$$$ and $$$\deg R(x)\lt \deg B(x)$$$ (with $$$\deg 0=-\infty$$$). Let's do this in $$$\mathbb{R}((x^{-1}))$$$. First, we should find the multiplicative inverse of $$$B(x)$$$ which is

$$$ B^{-1}(x)=x^{-3}-8x^{-4}+55x^{-5}-369x^{-6}+2465x^{-7}-16454x^{-8}+109816x^{-9}-\cdots $$$

and compute

$$$ A(x)B^{-1}(x)=4x^2-31x+217-1457x^{-1}+9735x^{-2}-64983x^{-3}+433706x^{-4}-\cdots $$$

We could find that $$$Q(x)$$$ is exactly $$$A(x)B^{-1}(x)-(A(x)B^{-1}(x)\bmod 1)$$$, and the leading coefficient of $$$R(x)$$$ equals $$$[x^{-1}]A(x)B^{-1}(x)$$$. That's not a coincidence. Consider the simplest case

$$$ (x-v)^{-1}=x^{-1}+vx^{-2}+v^2x^{-3}+\cdots=\sum_{n\geq 0}v^nx^{-(n+1)} $$$

So we have $$$f(v)=[x^{-1}]f/(x-v)$$$ cause

$$$ \begin{aligned} f(x)/(x-v)&=f_0x^{-1}+f_1+f_2x+\cdots +f_nx^{n-1}\\ &+f_0vx^{-2}+f_1vx^{-1}+f_2v+\cdots +f_nvx^{n-2}\\ &+\cdots \end{aligned} $$$

The problem of multi-point evaluation is as follows: Given polynomial $$$f(x)=\sum_{i=0}^nf_ix^i$$$ and $$$v_0,v_1,\dots ,v_n$$$, print $$$f(v_0),f(v_1),\dots ,f(v_n)$$$.

The traditional algorithm for multi-point evaluation is that we make an almost balanced binary (sub)product tree with leaves are polynomials $$$p_i(x)=x-v_i$$$ and non-leaf nodes are the product of its children. Then we compute $$$f\bmod (p_1\cdots p_n)$$$ and then compute $$$f\bmod (p_1\cdots p_{n/2})$$$ and $$$f\bmod (p_{n/2+1}\cdots p_n)$$$ cause $$$(f\bmod (p_1\cdots p_n))\bmod (p_1\cdots p_{n/2})=f\bmod (p_1\cdots p_{n/2})$$$ (suppose $$$n$$$ is even) and then ... The transposed algorithm is that we compute $$$f/(p_1\cdots p_n)$$$ and $$$f/(p_1\cdots p_n)\cdot (p_{n/2+1}\cdots p_{n})=f/(p_1\cdots p_{n/2})$$$, finally we will have $$$f/p_i$$$ at the leaf of the tree, the coefficient of $$$x^{-1}$$$ is exactly what we want. The only remaining problem is how many terms should we track when doing the computation. It's clear that we don't care about the coefficients of $$$x^0,x^1,\dots$$$ in $$$f/(p_1\cdots p_n)$$$ cause they do nothing with the coefficient of $$$x^{-1},x^{-2},\dots$$$ when $$$f/(p_1\cdots p_n)$$$ was multiplied by a polynomial like $$$g_0+g_1x+g_2x^2+\cdots$$$. Another observation is that for $$$f/P\bmod 1=a_{-1}x^{-1}+a_{-2}x^{-2}+\cdots +a_{-\deg P}x^{\deg P}+\cdots$$$, only $$$a_{-1},\dots ,a_{-\deg P}$$$ are enough. Consider a very simple case that we have $$$f/(p_1p_2)\bmod 1=c_{-1}x^{-1}+c_{-2}x^{-2}+c_{-3}x^{-3}+\cdots$$$ and want to compute $$$f/p_1\bmod 1$$$ via multiplying $$$f/(p_1p_2)\bmod 1$$$ by $$$p_2$$$, the coefficients of $$$x^{-3},x^{-4},\dots$$$ have no effect on the coefficient of $$$x^{-1}$$$ (it's more complicated for the integer case). So we could almost halve the size of problem after one multiplication.

About how to compute $$$(p_1\cdots p_n)^{-1}$$$ in $$$\mathbb{R}((x^{-1}))$$$, we could simply use the method that we compute the multiplicative inverse of formal power series $$$x^{\deg(p_1\cdots p_n)}p_1(x^{-1})\cdots p_n(x^{-1})$$$ in $$$\mathbb{R}\lbrack \lbrack x\rbrack \rbrack$$$.

  • Vote: I like it
  • +60
  • Vote: I do not like it

»
2 years ago, # |
  Vote: I like it +16 Vote: I do not like it

This seems like a nice (and obvious-in-hindsight) optimization over the more widely known mod-only recursion. Using a higher level of precision for the initial multiplication by $$$\left(\prod p_i\right)^{-1}$$$ seems like a very small price to pay for making every later step in the algorithm use a multiplication instead of a mod.

But I don't understand in what sense it is transposed: It calculates the same things in the same order as that more widely known algorithm (albeit using a different representation), whereas transposing an algorithm (as I understand it) involves calculating related quantities but in exactly the opposite order. Am I missing something here?

  • »
    »
    2 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    It's a little hard to say, but yes, it is transposed. Consider the multi-point evaluation problem as the matrix-vector multiplication problem with the matrix transposed, and then transpose each step of the algorithm (see Tellegen’s Principle into Practice). The multiplication in the algorithm is transposed convolution if we treat them like polynomials. But in this way we could not give an explicit definition of the intermediate result.