XOR basis without linear algebra

Revision en2, by Everule, 2022-02-17 23:26:38

I think that linear algebra, is a unnecessarily intimidating tool, to solve a problem I believe is far easier, and needlessly intimidates those without formal knowledge of linear algebra, which is a significant part of codeforces. I know I did. Therefore I want to show a step by step problem solving process that will lead to the concept of linear basis at the end, with no formal knowledge required.

We start with the elementary problem of linear basis. Given an array $$$A = [a_1, a_2, \ldots a_n]$$$, where $$$a_i \le 2^d$$$, Choose some subset of elements in $$$A$$$, and XOR them. Find the number of distinct elements you could make.

We start by using a simple dp. Let $$$X_i$$$ be the set of elements you could make with the first $$$i$$$ elements of $$$A$$$. Then its easy to see that $$$X_{i+1}$$$ contains $$$x$$$ and $$$x \bigoplus a_{i+1}$$$ for every $$$x \in X_i$$$. This is an $$$O(2^dn)$$$ algorithm, which is too inefficient.

We need to put some constraints on what $$$X_i$$$ looks like to make further progress. Let us assume $$$x,y$$$ are two elements in $$$X_i$$$. Then $$$x \bigoplus y$$$, must also be in $$$X_i$$$. This is because I can just take the elements I used to make $$$x$$$ and the elements I used to make $$$y$$$. The elements used in both simply cancel out. This means that the new set of elements is also a subset of $$$[a_1, a_2, \ldots, a_i]$$$.

This provides us with an equally powerful converse.

Let $$$x$$$ be some element in $$$X_i$$$, and $$$y$$$ some element not in $$$X_i$$$. Then $$$x \bigoplus y$$$, cannot be in $$$X_i$$$, because if I could make $$$x$$$ and $$$x \bigoplus y$$$, I could make $$$x \bigoplus x \bigoplus y = y$$$.

Intuitively, this tells us that anything with a new element already exists, or is always new. Another way of thinking about this is, we want to attack $$$y$$$ with some set of elements in $$$[a_1, a_2, \ldots a_i]$$$, and kill it by getting it to $$$0$$$.

If I could kill $$$x$$$ and $$$y$$$, I can kill $$$x \bigoplus y$$$ by killing $$$x$$$ first, then $$$y$$$.

If I can kill $$$x$$$ but not $$$y$$$, neither can I kill $$$x \bigoplus y$$$, because I could already get $$$y$$$ to $$$x \bigoplus y$$$, but I couldn't kill that either.

I recommend spending a minute internalizing this.

Now let us try adding a new element $$$a_{i+1}$$$ to $$$X_i$$$ again. I check if I can make $$$a_{i+1}$$$. If I can, I can simply ignore it. If I cannot, then $$$x \bigoplus a_{i+1}$$$ for $$$x \in X_i$$$ is a new element too. So the size of $$$X$$$ will double. Obviously the set can double only $$$d$$$ times. This gives an algorithm in $$$O(n + d2^d)$$$ or $$$O(n + 2^d)$$$ depending on implementation.

if $$$2^d$$$ is large, this algorithm doesn't work either. If $$$2^d$$$ is large, we cannot explicitly store $$$X$$$. This means we will need to compress $$$x$$$ in some way. To do that, notice that there are only at most $$$d$$$ elements that matter, because the ones that I ignored, may as well not have been there. So just storing the elements $$$[a_{i_1}, a_{i_2}, \ldots, a_{i_k}]$$$, that doubled the size of $$$X$$$, is enough. $$$X$$$ is just the set of the XORs of every subset of the compressed array.

While we successfully compressed $$$X$$$, we lost our ability to quickly check if $$$a_{i+1}$$$ is in $$$X$$$. We need to get that back.

We cannot guarantee much about the structure of $$$[a_{i_1}, a_{i_2}, \ldots a_{i_k}]$$$. So it is hard to see an algorithm that does this quickly. However it is important to notice that its not the elements itself that are important, but more so what is the set that is generated by the compressed elements. So any edits made to the array, is okay as long it doesn't change the produced set.

We now show that the set $$$[a_{i_1}, a_{i_2}, \ldots a_{i_k}]$$$ and $$$[a_{i_1} \bigoplus a_{i_2}, a_{i_2}, \ldots a_{i_k}]$$$, produce the same $$$X$$$. Or more generally, I can take any element, and xor it with some other element, and it will produce the same set $$$X$$$. This is easy to prove, because if I was using $$$a_{i_2}$$$ in the first set, I use the first and second element in the second set. If I was using both $$$a_{i_1}$$$ and $$$a_{i_2}$$$, then I just need the second element. It is equally good.

History

 
 
 
 
Revisions
 
 
  Rev. Lang. By When Δ Comment
en9 English Everule 2022-02-18 01:05:10 95
en8 English Everule 2022-02-18 00:16:13 53
en7 English Everule 2022-02-18 00:12:06 3 Tiny change: 'know I did. Therefor' -> 'know I didn't. Therefor'
en6 English Everule 2022-02-18 00:02:46 2 Tiny change: 'g this to O(d) as an exe' -> 'g this to $O(d)$ as an exe' (published)
en5 English Everule 2022-02-17 23:59:34 22
en4 English Everule 2022-02-17 23:57:49 1288
en3 English Everule 2022-02-17 23:39:27 801 Tiny change: 'ldots, a_k$ and simi' -> 'ldots, a_k]$ and simi'
en2 English Everule 2022-02-17 23:26:38 3452 Tiny change: 'plus x \bihoplus y = ' -> 'plus x \bigoplus y = '
en1 English Everule 2022-02-17 22:48:03 581 Initial revision (saved to drafts)