MacMahon's master theorem

#	User	Rating
1	tourist	3690
2	jiangly	3647
3	Benq	3581
4	orzdevinwang	3570
5	Geothermal	3569
5	cnnfls_csy	3569
7	Radewoosh	3509
8	ecnerwala	3486
9	jqdai0815	3474
10	gyh20	3447

#	User	Contrib.
1	maomao90	174
2	awoo	164
3	adamant	162
4	TheScrasse	159
5	nor	157
6	maroonrk	156
7	-is-this-fft-	152
8	orz	146
9	Petr	145
9	pajenegod	145

Hi everyone!

Mandatory orz to Elegia whose blog introduced me to these concepts.

Today, I would like to write about MacMahon's master theorem (MMT). It is a nice result on the intersection of combinatorics and linear algebra that provides an easy prove to some particularly obscure combinatorial identities, in particular the Dixon's identity:

$$$ \sum\limits_k (-1)^k \binom{a+b}{a+k} \binom{b+c}{b+k} \binom{c+a}{c+k} = \binom{a+b+c}{a,b,c}. $$$

To begin with, let's formulate MacMahon's master theorem.

MMT. Let $$$\mathbf A = [a_{ij}]_{n\times n}$$$, $$$\mathbf X = \operatorname{diag}(x_1,\dots,x_n)$$$ and $$$\mathbf k = (k_1,\dots,k_n) \geq 0$$$. Then,

$$$ \boxed{[\mathbf t^\mathbf k] \prod\limits_{i=1}^n \left(\sum\limits_{j=1}^n a_{ij} t_j\right)^{k_i} = [\mathbf x^\mathbf k] \det(\mathbf I-\mathbf X \mathbf A)^{-1}} $$$

With MMT, the Dixon's identity is proven as elegantly as this:

Proof

Variable-less MMT

Now, let's look into MMT and try to understand what does it actually stand for? First of all, we can multiply both sides by $$$\mathbf x^\mathbf k$$$ and sum them up over all $$$\mathbf k \geq 0$$$. In this case, the RHS, by definition, will turn into $$$\det(\mathbf I - \mathbf X \mathbf A)^{-1}$$$, while the LHS will turn into

$$$ \sum\limits_{\mathbf k \geq 0} [\mathbf t^\mathbf k] \prod\limits_{i=1}^n\left(\sum\limits_{j=1}^n a_{ij} x_i t_j\right)^{k_i}. $$$

We can now do the substitution $$$\mathbf B = \mathbf X \mathbf A$$$ and use $$$b_{ij} = a_{ij} x_i$$$ to reformulate MMT in the equivalent form

$$$ \boxed{\sum\limits_{\mathbf k \geq 0} [\mathbf t^\mathbf k] \prod\limits_{i=1}^n \left(\sum\limits_{j=1}^n b_{ij} t_j\right)^{k_i} = \det(\mathbf I - \mathbf B)^{-1}} $$$

This form is more convenient for us to work with, as it doesn't depend on any variables in RHS.

LHS as the sum of permanents

Now, let's take a closer look on the individual summands of the LHS. What do they enumerate?

$$$ [\mathbf t^\mathbf k]\prod\limits_{i=1}^n \left(\sum\limits_{j=1}^n b_{ij} t_j\right)^{k_i} = \sum\limits_{(j_1,\dots,j_k)} \prod\limits_{s=1}^k b_{i_s j_s}, $$$

where $$$(i_1,\dots,i_k)$$$ is a tuple such that $$$1 \leq i_1 \leq \dots \leq i_k \leq n$$$ and each $$$i$$$ occurs exactly $$$k_i$$$ times in it. Correspondingly, $$$(j_1,\dots,j_k)$$$ are all possible rearrangements of such tuple. Wait a second. We go over all rearrangements of a certain set of indices and sum up the products of $$$b_{i_s j_s}$$$ over them. Doesn't it sound familiar?

If we consider all permutations of $$$(1,\dots,k)$$$ rather than direct rearrangements of $$$(i_1,\dots,i_k)$$$, this will just be a permanent of a $$$(k_1+\dots+k_n) \times (k_1+\dots+k_n)$$$ matrix, in which the element $$$b_{ij}$$$ occurs as a block of size $$$k_i \times k_j$$$.

This concept usually occurs in literature defined even for two distinct vectors $$$\mathbf k \geq 0$$$ and $$$\mathbf l \geq 0$$$, so that $$$b_{ij}$$$ occurs in a block of size $$$k_i \times l_j$$$. The resulting matrix is typically denoted as $$$\mathbf B^{(\mathbf k, \mathbf l)}$$$. So, e.g. for $$$n = 3$$$, $$$\mathbf k = (3, 2, 1)$$$ and $$$\mathbf l = (1, 2, 3)$$$, $$$\mathbf B^{(\mathbf k, \mathbf l)}$$$ would look like

$$$ \mathbf B = \begin{pmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{pmatrix} \mapsto \left( \begin{array}{c|c|c} \begin{matrix} b_{11} \\ b_{11} \\ b_{11} \end{matrix} & \begin{matrix} b_{12} & b_{12} \\ b_{12} & b_{12} \\ b_{12} & b_{12} \end{matrix} & \begin{matrix} b_{13} & b_{13} & b_{13} \\ b_{13} & b_{13} & b_{13} \\ b_{13} & b_{33} & b_{33} \end{matrix} \\ \hline \begin{matrix} b_{21} \\ b_{21} \end{matrix} & \begin{matrix} b_{22} & b_{22} \\ b_{22} & b_{22}\end{matrix} & \begin{matrix} b_{23} & b_{23} & b_{23} \\ b_{23} & b_{23} & b_{23} \end{matrix} \\ \hline \begin{matrix}b_{31}\end{matrix} & \begin{matrix} b_{32} & b_{32} \end{matrix} & \begin{matrix} b_{33} & b_{33} & b_{33} \end{matrix} \end{array} \right) = \mathbf B^{(\mathbf k, \mathbf l)}. $$$

Note that to go back to rearrangements of $$$(i_1,\dots,i_k)$$$ rather than permutations of $$$(1,\dots,k)$$$ we should divide the permanent by $$$k_1 ! \dots k_n!$$$, which is commonly denoted in literature as $$$\mathbf k!$$$. In this terms, MMT rewrites concisely as

$$$ \boxed{\sum\limits_{\mathbf k \geq 0} \frac{\operatorname{per}\mathbf B^{(\mathbf k, \mathbf k)}}{\mathbf k!} = \det(\mathbf I - \mathbf B)^{-1}} $$$

Or, in terms of the original matrix $$$\mathbf A$$$, as

$$$ \boxed{\sum\limits_{\mathbf k \geq 0} \operatorname{per}\mathbf A^{(\mathbf k, \mathbf k)} \frac{\mathbf x^\mathbf k}{\mathbf k!} = \det(\mathbf I - \mathbf X\mathbf A)^{-1}} $$$

Sums of permanents as traces of symmetric powers

Now, let's denote $$$|\mathbf k| = k_1+\dots+k_n$$$. We can show that

$$$ \sum\limits_{|\mathbf k| = k} \frac{\operatorname{per}\mathbf B^{(\mathbf k,\mathbf k)}}{\mathbf k!} = \operatorname{tr} \operatorname{S}^k(\mathbf B), $$$

where $$$\operatorname{S}^k(\mathbf B)$$$ is the so-called symmetric power of $$$\mathbf B$$$.

Let's properly define it to better understand what it means. Before doing so, we should define tensor products. Let $$$V$$$ and $$$W$$$ be vector spaces with bases $$$B_V$$$ and $$$B_W$$$ correspondingly. Then, their tensor product $$$V \otimes W$$$ is the vector space associated with the tensor product operation $$$\otimes : V \times W \to V \otimes W$$$, such that the set $$$\{v \otimes w : v \in B_V, w \in B_W\}$$$ forms a basis of $$$V \otimes W$$$.

Computationally, if we represent vectors in coordinate form, one possible way to represent tensor products is via Kronecker product operation applied to coordinate arrays. When tensor product is defined on vectors, it is also implicitly defined on linear maps. So, if $$$A : V_1 \to V_2$$$ and $$$B: W_1 \to W_2$$$ are linear maps, then their tensor product is the linear map $$$A \otimes B : V_1 \otimes W_1 \to V_2 \otimes W_2$$$, such that $$$(A \otimes B)(v \otimes w) = (Av)\otimes (Bw)$$$. If $$$A$$$ and $$$B$$$ are represented by matrices, $$$A \otimes B$$$ can be represented as their Kronecker product.

Now, to define symmetric powers, we should consider $$$V \otimes V$$$, a tensor product of $$$V$$$ with itself.

Rev.	By	When	Δ	Comment
en8	adamant	2023-12-05 01:16:04	548
en7	adamant	2023-12-05 01:10:40	49
en6	adamant	2023-12-05 01:08:22	0	(published)
en5	adamant	2023-12-05 01:07:45	8731
en4	adamant	2023-12-04 22:14:50	2049
en3	adamant	2023-12-04 17:52:58	0
en2	adamant	2023-12-04 17:52:26	1627
en1	adamant	2023-12-04 17:18:03	5338	Initial revision (saved to drafts)

Variable-less MMT

LHS as the sum of permanents

Sums of permanents as traces of symmetric powers

History