MacMahon's master theorem

Revision en4, by adamant, 2023-12-04 22:14:50

Hi everyone!

Mandatory orz to Elegia whose blog introduced me to MMT.

Today, I would like to write about MacMahon's master theorem (MMT). It is a nice result on the intersection of combinatorics and linear algebra that provides an easy proof to some particularly obscure combinatorial identities, such as the Dixon's identity:

$$$ \sum\limits_k (-1)^k \binom{a+b}{a+k} \binom{b+c}{b+k} \binom{c+a}{c+k} = \binom{a+b+c}{a,b,c}. $$$

To begin with, let's formulate MacMahon's master theorem.


MMT. Let $$$\mathbf A = [a_{ij}]_{n\times n}$$$, $$$\mathbf X = \operatorname{diag}(x_1,\dots,x_n)$$$, $$$\mathbf t = (t_1,\dots,t_n)$$$, $$$\mathbf x = (x_1,\dots,x_n)$$$ and $$$\mathbf k = (k_1,\dots,k_n) \geq 0$$$. Then,

$$$ \boxed{[\mathbf t^\mathbf k] \prod\limits_{i=1}^n \left(\sum\limits_{j=1}^n a_{ij} t_j\right)^{k_i} = [\mathbf x^\mathbf k] \det(\mathbf I-\mathbf X \mathbf A)^{-1}} $$$

where $$$\mathbf t^\mathbf k$$$ stands for $$$t_1^{k_1} \dots t_n^{k_n}$$$, and $$$\mathbf x^\mathbf k$$$, correspondingly, for $$$x_1^{k_1} \dots x_n^{k_n}$$$.


With MMT, the Dixon's identity is proven as elegantly as this:

Proof
Variable-less MMT

Now, let's look into MMT and try to understand what does it actually stand for? First of all, we can multiply both sides by $$$\mathbf x^\mathbf k$$$ and sum them up over all $$$\mathbf k \geq 0$$$. In this case, the RHS, by definition, will turn into $$$\det(\mathbf I - \mathbf X \mathbf A)^{-1}$$$, while the LHS will turn into

$$$ \sum\limits_{\mathbf k \geq 0} [\mathbf t^\mathbf k] \prod\limits_{i=1}^n\left(\sum\limits_{j=1}^n a_{ij} x_i t_j\right)^{k_i}. $$$

We can now do the substitution $$$\mathbf B = \mathbf X \mathbf A$$$ and use $$$b_{ij} = a_{ij} x_i$$$ to reformulate MMT in the equivalent form

$$$ \boxed{\sum\limits_{\mathbf k \geq 0} [\mathbf t^\mathbf k] \prod\limits_{i=1}^n \left(\sum\limits_{j=1}^n b_{ij} t_j\right)^{k_i} = \det(\mathbf I - \mathbf B)^{-1}} $$$

This form is more convenient for us to work with, as it doesn't depend on any variables in RHS.

LHS as the sum of permanents

Now, let's take a closer look on the individual summands of the LHS. What do they enumerate?

$$$ [\mathbf t^\mathbf k]\prod\limits_{i=1}^n \left(\sum\limits_{j=1}^n b_{ij} t_j\right)^{k_i} = \sum\limits_{(j_1,\dots,j_k)} \prod\limits_{s=1}^k b_{i_s j_s}, $$$

where $$$(i_1,\dots,i_k)$$$ is a tuple such that $$$1 \leq i_1 \leq \dots \leq i_k \leq n$$$ and each $$$i$$$ occurs exactly $$$k_i$$$ times in it. Correspondingly, $$$(j_1,\dots,j_k)$$$ are all possible rearrangements of such tuple. So, we go over all rearrangements of $$$(i_1,\dots,i_k)$$$ and sum up the product of $$$b_{i_s j_s}$$$ over $$$s$$$.

Doesn't it look familiar? If we consider all permutations of $$$(1,\dots,k)$$$ rather than direct rearrangements of $$$(i_1,\dots,i_k)$$$, this will just be a permanent of a $$$(k_1+\dots+k_n) \times (k_1+\dots+k_n)$$$ matrix, in which the element $$$b_{ij}$$$ occurs as a block of size $$$k_i \times k_j$$$.

This concept usually occurs in literature defined even for two distinct vectors $$$\mathbf k \geq 0$$$ and $$$\mathbf l \geq 0$$$, so that $$$b_{ij}$$$ occurs in a block of size $$$k_i \times l_j$$$. The resulting matrix is typically denoted as $$$\mathbf B^{(\mathbf k, \mathbf l)}$$$. So, e.g. for $$$n = 3$$$, $$$\mathbf k = (3, 2, 1)$$$ and $$$\mathbf l = (1, 2, 3)$$$, $$$\mathbf B^{(\mathbf k, \mathbf l)}$$$ would look like

$$$ \mathbf B = \begin{pmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{pmatrix} \mapsto \left( \begin{array}{c|c|c} \begin{matrix} b_{11} \\ b_{11} \\ b_{11} \end{matrix} & \begin{matrix} b_{12} & b_{12} \\ b_{12} & b_{12} \\ b_{12} & b_{12} \end{matrix} & \begin{matrix} b_{13} & b_{13} & b_{13} \\ b_{13} & b_{13} & b_{13} \\ b_{13} & b_{33} & b_{33} \end{matrix} \\ \hline \begin{matrix} b_{21} \\ b_{21} \end{matrix} & \begin{matrix} b_{22} & b_{22} \\ b_{22} & b_{22}\end{matrix} & \begin{matrix} b_{23} & b_{23} & b_{23} \\ b_{23} & b_{23} & b_{23} \end{matrix} \\ \hline \begin{matrix}b_{31}\end{matrix} & \begin{matrix} b_{32} & b_{32} \end{matrix} & \begin{matrix} b_{33} & b_{33} & b_{33} \end{matrix} \end{array} \right) = \mathbf B^{(\mathbf k, \mathbf l)}. $$$

Note that to go back to rearrangements of $$$(i_1,\dots,i_k)$$$ rather than permutations of $$$(1,\dots,k)$$$ we should divide the permanent by $$$k_1 ! \dots k_n!$$$, which is commonly denoted in literature as $$$\mathbf k!$$$. In this terms, MMT rewrites concisely as

$$$ \boxed{\sum\limits_{\mathbf k \geq 0} \frac{\operatorname{per}\mathbf B^{(\mathbf k, \mathbf k)}}{\mathbf k!} = \det(\mathbf I - \mathbf B)^{-1}} $$$

Or, in terms of the original matrix $$$\mathbf A$$$, as

$$$ \boxed{\sum\limits_{\mathbf k \geq 0} \operatorname{per}\mathbf A^{(\mathbf k, \mathbf k)} \frac{\mathbf x^\mathbf k}{\mathbf k!} = \det(\mathbf I - \mathbf X\mathbf A)^{-1}} $$$
Sums of permanents as traces of symmetric powers

Now, let's denote $$$|\mathbf k| = k_1+\dots+k_n$$$. We can show that

$$$ \sum\limits_{|\mathbf k| = k} \frac{\operatorname{per}\mathbf B^{(\mathbf k,\mathbf k)}}{\mathbf k!} = \operatorname{tr} \operatorname{S}^k(\mathbf B), $$$

where $$$\operatorname{S}^k(\mathbf B)$$$ is the so-called symmetric power of $$$\mathbf B$$$.

Let's properly define it to better understand what this means. Before doing so, we should define tensor products. Let $$$V$$$ and $$$W$$$ be vector spaces with bases $$$B_V$$$ and $$$B_W$$$ correspondingly. Then, their tensor product $$$V \otimes W$$$ is the vector space with associated bilinear operation $$$\otimes : V \times W \to V \otimes W$$$, such that the set $$$\{v \otimes w : v \in B_V, w \in B_W\}$$$ forms a basis of $$$V \otimes W$$$.

If we represent vectors in coordinate form, one possible way to represent tensor products is via Kronecker product operation applied to coordinate arrays. When tensor product is defined on vectors, it is also implicitly defined on linear maps. So, if $$$A : V_1 \to V_2$$$ and $$$B: W_1 \to W_2$$$ are linear maps, then their tensor product is the linear map $$$A \otimes B : V_1 \otimes W_1 \to V_2 \otimes W_2$$$, such that $$$(A \otimes B)(v \otimes w) = (Av)\otimes (Bw)$$$. If $$$A$$$ and $$$B$$$ are represented by matrices, $$$A \otimes B$$$ can be represented as their Kronecker product.

To define symmetric powers, we should consider $$$V \otimes V$$$, a tensor product of $$$V$$$ with itself. Assuming that $$$V$$$ has a basis $$$e_1,\dots,e_n$$$, we can say that $$$V \otimes V$$$ has a basis $$$\{e_i \otimes e_j : 1 \leq i, j \leq n\}$$$, so we can represent any $$$v \in V \otimes V$$$ as an array $$$[v_{ij}]_{n \times n}$$$ such that

$$$ v = \sum\limits_{i,j} v_{ij} (e_i \otimes e_j). $$$

Now, of all vectors $$$v \in V \otimes V$$$ we can consider a subset of symmetric vectors, that is those for which $$$v_{ij} = v_{ji}$$$ for any $$$(i,j)$$$. Such vectors form a subspace $$$\operatorname{S}^2(V)$$$ of $$$V \otimes V$$$, also called the symmetric square of $$$V$$$. As a subspace of $$$V$$$, symmetric square has a basis

$$$ \{e_i \otimes e_j + e_j \otimes e_i : 1 \leq i \leq j \leq n\}. $$$

Finally, we can define the symmetric power $$$\operatorname{S}^k(V)$$$ of $$$V$$$ as a subspace of $$$V^{\otimes k} = \overbrace{V \otimes \dots \otimes V}^k$$$ formed by symmetric arrays. By this we mean that any element of $$$V^{\otimes k}$$$ can be represented as an array $$$[v_{i_1 \dots i_k}]_{n \times \dots \times n}$$$ of size $$$n^k$$$, such that

$$$ v = \sum\limits_{i_1,\dots,i_k} v_{i_1,\dots,i_k} (e_{i_1} \otimes \dots \otimes e_{i_k}). $$$

With this definition, the $$$k$$$-th symmetric power of $$$V$$$ is a subspace of $$$V^{\otimes k}$$$ formed of all vectors $$$v$$$ whose coordinates wouldn't change if we rearrange vectors $$$e_1,\dots,e_n$$$ in any way. We can also define its basis as

$$$ \left\{\sum\limits_{(j_1,\dots,j_k)} (e_{j_1} \otimes \dots \otimes e_{j_k}) : 1 \leq i_1 \leq \dots \leq i_k \leq n\right\}, $$$

where $$$(j_1,\dots,j_k)$$$ go over all possible rearrangements of $$$(i_1,\dots,i_k)$$$. Looks familiar, doesn't it?

Tags combinatorics, generating functions, linear algebra, tensor

History

 
 
 
 
Revisions
 
 
  Rev. Lang. By When Δ Comment
en8 English adamant 2023-12-05 01:16:04 548
en7 English adamant 2023-12-05 01:10:40 49
en6 English adamant 2023-12-05 01:08:22 0 (published)
en5 English adamant 2023-12-05 01:07:45 8731
en4 English adamant 2023-12-04 22:14:50 2049
en3 English adamant 2023-12-04 17:52:58 0
en2 English adamant 2023-12-04 17:52:26 1627
en1 English adamant 2023-12-04 17:18:03 5338 Initial revision (saved to drafts)