MacMahon's master theorem

#	User	Rating
1	tourist	3690
2	jiangly	3647
3	Benq	3581
4	orzdevinwang	3570
5	Geothermal	3569
5	cnnfls_csy	3569
7	Radewoosh	3509
8	ecnerwala	3486
9	jqdai0815	3474
10	gyh20	3447

#	User	Contrib.
1	maomao90	174
2	awoo	164
3	adamant	162
4	TheScrasse	160
5	nor	158
6	maroonrk	156
7	-is-this-fft-	152
8	orz	146
9	pajenegod	145
9	Petr	145

Hi everyone!

Mandatory orz to Elegia whose blog introduced me to MMT.

Today, I would like to write about MacMahon's master theorem (MMT). It is a nice result on the intersection of combinatorics and linear algebra that provides an easy proof to some particularly obscure combinatorial identities, such as the Dixon's identity:

$$$ \sum\limits_k (-1)^k \binom{a+b}{a+k} \binom{b+c}{b+k} \binom{c+a}{c+k} = \binom{a+b+c}{a,b,c}. $$$

To begin with, let's formulate MacMahon's master theorem.

MMT. Let $$$\mathbf A = [a_{ij}]_{n\times n}$$$, $$$\mathbf X = \operatorname{diag}(x_1,\dots,x_n)$$$, $$$\mathbf t = (t_1,\dots,t_n)$$$, $$$\mathbf x = (x_1,\dots,x_n)$$$ and $$$\mathbf k = (k_1,\dots,k_n) \geq 0$$$. Then,

$$$ \boxed{[\mathbf t^\mathbf k] \prod\limits_{i=1}^n \left(\sum\limits_{j=1}^n a_{ij} t_j\right)^{k_i} = [\mathbf x^\mathbf k] \det(\mathbf I-\mathbf X \mathbf A)^{-1}} $$$

where $$$\mathbf t^\mathbf k$$$ stands for $$$t_1^{k_1} \dots t_n^{k_n}$$$, and $$$\mathbf x^\mathbf k$$$, correspondingly, for $$$x_1^{k_1} \dots x_n^{k_n}$$$.

With MMT, the Dixon's identity is proven as elegantly as this:

Proof

Variable-less MMT

Now, let's look into MMT and try to understand what does it actually stand for? First of all, we can multiply both sides by $$$\mathbf x^\mathbf k$$$ and sum them up over all $$$\mathbf k \geq 0$$$. In this case, the RHS, by definition, will turn into $$$\det(\mathbf I - \mathbf X \mathbf A)^{-1}$$$, while the LHS will turn into

$$$ \sum\limits_{\mathbf k \geq 0} [\mathbf t^\mathbf k] \prod\limits_{i=1}^n\left(\sum\limits_{j=1}^n a_{ij} x_i t_j\right)^{k_i}. $$$

We can now do the substitution $$$\mathbf B = \mathbf X \mathbf A$$$ and use $$$b_{ij} = a_{ij} x_i$$$ to reformulate MMT in the equivalent form

$$$ \boxed{\sum\limits_{\mathbf k \geq 0} [\mathbf t^\mathbf k] \prod\limits_{i=1}^n \left(\sum\limits_{j=1}^n b_{ij} t_j\right)^{k_i} = \det(\mathbf I - \mathbf B)^{-1}} $$$

This form is more convenient for us to work with, as it doesn't depend on any variables in RHS.

LHS as the sum of permanents

Now, let's take a closer look on the individual summands of the LHS. What do they enumerate?

$$$ [\mathbf t^\mathbf k]\prod\limits_{i=1}^n \left(\sum\limits_{j=1}^n b_{ij} t_j\right)^{k_i} = \sum\limits_{(j_1,\dots,j_k)} \prod\limits_{s=1}^k b_{i_s j_s}, $$$

where $$$(i_1,\dots,i_k)$$$ is a tuple such that $$$1 \leq i_1 \leq \dots \leq i_k \leq n$$$ and each $$$i$$$ occurs exactly $$$k_i$$$ times in it. Correspondingly, $$$(j_1,\dots,j_k)$$$ are all possible rearrangements of such tuple. So, we go over all rearrangements of $$$(i_1,\dots,i_k)$$$ and sum up the product of $$$b_{i_s j_s}$$$ over $$$s$$$.

Doesn't it look familiar? If we consider all permutations of $$$(1,\dots,k)$$$ rather than direct rearrangements of $$$(i_1,\dots,i_k)$$$, this will just be a permanent of a $$$(k_1+\dots+k_n) \times (k_1+\dots+k_n)$$$ matrix, in which the element $$$b_{ij}$$$ occurs as a block of size $$$k_i \times k_j$$$.

This concept usually occurs in literature defined even for two distinct vectors $$$\mathbf k \geq 0$$$ and $$$\mathbf l \geq 0$$$, so that $$$b_{ij}$$$ occurs in a block of size $$$k_i \times l_j$$$. The resulting matrix is typically denoted as $$$\mathbf B^{(\mathbf k, \mathbf l)}$$$. So, e.g. for $$$n = 3$$$, $$$\mathbf k = (3, 2, 1)$$$ and $$$\mathbf l = (1, 2, 3)$$$, $$$\mathbf B^{(\mathbf k, \mathbf l)}$$$ would look like

$$$ \mathbf B = \begin{pmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{pmatrix} \mapsto \left( \begin{array}{c|c|c} \begin{matrix} b_{11} \\ b_{11} \\ b_{11} \end{matrix} & \begin{matrix} b_{12} & b_{12} \\ b_{12} & b_{12} \\ b_{12} & b_{12} \end{matrix} & \begin{matrix} b_{13} & b_{13} & b_{13} \\ b_{13} & b_{13} & b_{13} \\ b_{13} & b_{33} & b_{33} \end{matrix} \\ \hline \begin{matrix} b_{21} \\ b_{21} \end{matrix} & \begin{matrix} b_{22} & b_{22} \\ b_{22} & b_{22}\end{matrix} & \begin{matrix} b_{23} & b_{23} & b_{23} \\ b_{23} & b_{23} & b_{23} \end{matrix} \\ \hline \begin{matrix}b_{31}\end{matrix} & \begin{matrix} b_{32} & b_{32} \end{matrix} & \begin{matrix} b_{33} & b_{33} & b_{33} \end{matrix} \end{array} \right) = \mathbf B^{(\mathbf k, \mathbf l)}. $$$

Note that to go back to rearrangements of $$$(i_1,\dots,i_k)$$$ rather than permutations of $$$(1,\dots,k)$$$ we should divide the permanent by $$$k_1 ! \dots k_n!$$$, which is commonly denoted in literature as $$$\mathbf k!$$$. In this terms, MMT rewrites concisely as

$$$ \boxed{\sum\limits_{\mathbf k \geq 0} \frac{\operatorname{per}\mathbf B^{(\mathbf k, \mathbf k)}}{\mathbf k!} = \det(\mathbf I - \mathbf B)^{-1}} $$$

Or, in terms of the original matrix $$$\mathbf A$$$, as

$$$ \boxed{\sum\limits_{\mathbf k \geq 0} \operatorname{per}\mathbf A^{(\mathbf k, \mathbf k)} \frac{\mathbf x^\mathbf k}{\mathbf k!} = \det(\mathbf I - \mathbf X\mathbf A)^{-1}} $$$

Sums of permanents as traces of symmetric powers

Now, let's denote $$$|\mathbf k| = k_1+\dots+k_n$$$. We can show that

$$$ \sum\limits_{|\mathbf k| = k} \frac{\operatorname{per}\mathbf B^{(\mathbf k,\mathbf k)}}{\mathbf k!} = \operatorname{tr} \operatorname{S}^k(\mathbf B), $$$

where $$$\operatorname{S}^k(\mathbf B)$$$ is the so-called symmetric power of $$$\mathbf B$$$.

Let's properly define it to better understand what this means. Before doing so, we should define tensor products. Let $$$V$$$ and $$$W$$$ be vector spaces with bases $$$B_V$$$ and $$$B_W$$$ correspondingly. Then, their tensor product $$$V \otimes W$$$ is the vector space with associated bilinear operation $$$\otimes : V \times W \to V \otimes W$$$, such that the set $$$\{v \otimes w : v \in B_V, w \in B_W\}$$$ forms a basis of $$$V \otimes W$$$.

If we represent vectors in coordinate form, one possible way to represent tensor products is via Kronecker product operation applied to coordinate arrays. When tensor product is defined on vectors, it is also implicitly defined on linear maps. So, if $$$A : V_1 \to V_2$$$ and $$$B: W_1 \to W_2$$$ are linear maps, then their tensor product is the linear map $$$A \otimes B : V_1 \otimes W_1 \to V_2 \otimes W_2$$$, such that $$$(A \otimes B)(v \otimes w) = (Av)\otimes (Bw)$$$. If $$$A$$$ and $$$B$$$ are represented by matrices, $$$A \otimes B$$$ can be represented as their Kronecker product.

To define symmetric powers, we should consider $$$V \otimes V$$$, a tensor product of $$$V$$$ with itself. Assuming that $$$V$$$ has a basis $$$e_1,\dots,e_n$$$, we can say that $$$V \otimes V$$$ has a basis $$$\{e_i \otimes e_j : 1 \leq i, j \leq n\}$$$, so we can represent any $$$v \in V \otimes V$$$ as an array $$$[v_{ij}]_{n \times n}$$$ such that

$$$ v = \sum\limits_{i,j} v_{ij} (e_i \otimes e_j). $$$

Now, of all vectors $$$v \in V \otimes V$$$ we can consider a subset of symmetric vectors, that is those for which $$$v_{ij} = v_{ji}$$$ for any $$$(i,j)$$$. Such vectors form a subspace $$$\operatorname{S}^2(V)$$$ of $$$V \otimes V$$$, also called the symmetric square of $$$V$$$. As a subspace of $$$V$$$, symmetric square has a basis

$$$ \{e_i \otimes e_j + e_j \otimes e_i : 1 \leq i \leq j \leq n\}. $$$

Finally, we can define the symmetric power $$$\operatorname{S}^k(V)$$$ of $$$V$$$ as a subspace of $$$V^{\otimes k} = \overbrace{V \otimes \dots \otimes V}^k$$$ formed by symmetric arrays. By this we mean that any element of $$$V^{\otimes k}$$$ can be represented as an array $$$[v_{i_1 \dots i_k}]_{n \times \dots \times n}$$$ of size $$$n^k$$$, such that

$$$ v = \sum\limits_{i_1,\dots,i_k} v_{i_1,\dots,i_k} (e_{i_1} \otimes \dots \otimes e_{i_k}). $$$

With this definition, the $$$k$$$-th symmetric power of $$$V$$$ is a subspace of $$$V^{\otimes k}$$$ formed of all vectors $$$v$$$ whose coordinates wouldn't change if we rearrange vectors $$$e_1,\dots,e_n$$$ in any way. We can also define its basis as

$$$ \left\{\sum\limits_{(j_1,\dots,j_k)} (e_{j_1} \otimes \dots \otimes e_{j_k}) : 1 \leq i_1 \leq \dots \leq i_k \leq n\right\}, $$$

where $$$(j_1,\dots,j_k)$$$ go over all possible rearrangements of $$$(i_1,\dots,i_k)$$$. Looks familiar, doesn't it?

Rev.	By	When	Δ	Comment
en8	adamant	2023-12-05 01:16:04	548
en7	adamant	2023-12-05 01:10:40	49
en6	adamant	2023-12-05 01:08:22	0	(published)
en5	adamant	2023-12-05 01:07:45	8731
en4	adamant	2023-12-04 22:14:50	2049
en3	adamant	2023-12-04 17:52:58	0
en2	adamant	2023-12-04 17:52:26	1627
en1	adamant	2023-12-04 17:18:03	5338	Initial revision (saved to drafts)

Variable-less MMT

LHS as the sum of permanents

Sums of permanents as traces of symmetric powers

History