XOR basis without linear algebra

#	User	Rating
1	tourist	3690
2	jiangly	3647
3	Benq	3581
4	orzdevinwang	3570
5	Geothermal	3569
5	cnnfls_csy	3569
7	Radewoosh	3509
8	ecnerwala	3486
9	jqdai0815	3474
10	gyh20	3447

#	User	Contrib.
1	maomao90	174
2	awoo	164
3	adamant	163
4	TheScrasse	160
5	nor	158
6	maroonrk	156
7	-is-this-fft-	152
8	orz	146
9	pajenegod	145
9	SecondThread	145

I think that linear algebra, is a unnecessarily intimidating tool, to solve a problem I believe is far easier, and needlessly intimidates those without formal knowledge of linear algebra, which is a significant part of codeforces. I know I did. Therefore I want to show a step by step problem solving process that will lead to the concept of linear basis at the end, with no formal knowledge required.

We start with the elementary problem of linear basis. Given an array $$$A = [a_1, a_2, \ldots a_n]$$$, where $$$a_i \le 2^d$$$, Choose some subset of elements in $$$A$$$, and XOR them. Find the number of distinct elements you could make.

We start by using a simple dp. Let $$$X_i$$$ be the set of elements you could make with the first $$$i$$$ elements of $$$A$$$. Then its easy to see that $$$X_{i+1}$$$ contains $$$x$$$ and $$$x \bigoplus a_{i+1}$$$ for every $$$x \in X_i$$$. This is an $$$O(2^dn)$$$ algorithm, which is too inefficient.

We need to put some constraints on what $$$X_i$$$ looks like to make further progress. Let us assume $$$x,y$$$ are two elements in $$$X_i$$$. Then $$$x \bigoplus y$$$, must also be in $$$X_i$$$. This is because I can just take the elements I used to make $$$x$$$ and the elements I used to make $$$y$$$. The elements used in both simply cancel out. This means that the new set of elements is also a subset of $$$[a_1, a_2, \ldots, a_i]$$$.

This provides us with an equally powerful converse.

Let $$$x$$$ be some element in $$$X_i$$$, and $$$y$$$ some element not in $$$X_i$$$. Then $$$x \bigoplus y$$$, cannot be in $$$X_i$$$, because if I could make $$$x$$$ and $$$x \bigoplus y$$$, I could make $$$x \bigoplus x \bigoplus y = y$$$.

Intuitively, this tells us that anything with a new element already exists, or is always new. Another way of thinking about this is, we want to attack $$$y$$$ with some set of elements in $$$[a_1, a_2, \ldots a_i]$$$, and kill it by getting it to $$$0$$$.

If I could kill $$$x$$$ and $$$y$$$, I can kill $$$x \bigoplus y$$$ by killing $$$x$$$ first, then $$$y$$$.

If I can kill $$$x$$$ but not $$$y$$$, neither can I kill $$$x \bigoplus y$$$, because I could already get $$$y$$$ to $$$x \bigoplus y$$$, but I couldn't kill that either.

I recommend spending a minute internalizing this.

Now let us try adding a new element $$$a_{i+1}$$$ to $$$X_i$$$ again. I check if I can make $$$a_{i+1}$$$. If I can, I can simply ignore it. If I cannot, then $$$x \bigoplus a_{i+1}$$$ for $$$x \in X_i$$$ is a new element too. So the size of $$$X$$$ will double. Obviously the set can double only $$$d$$$ times. This gives an algorithm in $$$O(n + d2^d)$$$ or $$$O(n + 2^d)$$$ depending on implementation.

if $$$2^d$$$ is large, this algorithm doesn't work either. If $$$2^d$$$ is large, we cannot explicitly store $$$X$$$. This means we will need to compress $$$x$$$ in some way. To do that, notice that there are only at most $$$d$$$ elements that matter, because the ones that I ignored, may as well not have been there. So just storing the elements $$$[a_{i_1}, a_{i_2}, \ldots, a_{i_k}]$$$, that doubled the size of $$$X$$$, is enough. $$$X$$$ is just the set of the XORs of every subset of the compressed array.

While we successfully compressed $$$X$$$, we lost our ability to quickly check if $$$a_{i+1}$$$ is in $$$X$$$. We need to get that back.

We cannot guarantee much about the structure of $$$[a_{i_1}, a_{i_2}, \ldots a_{i_k}]$$$. So it is hard to see an algorithm that does this quickly. However it is important to notice that its not the elements itself that are important, but more so what is the set that is generated by the compressed elements. So any edits made to the array, is okay as long it doesn't change the produced set.

We now show that the set $$$[a_{i_1}, a_{i_2}, \ldots a_{i_k}]$$$ and $$$[a_{i_1} \bigoplus a_{i_2}, a_{i_2}, \ldots a_{i_k}]$$$, produce the same $$$X$$$. Or more generally, I can take any element, and xor it with some other element, and it will produce the same set $$$X$$$. This is easy to prove, because if I was using $$$a_{i_2}$$$ in the first set, I use the first and second element in the second set. If I was using both $$$a_{i_1}$$$ and $$$a_{i_2}$$$, then I just need the second element. It is equally good.

Rev.	By	When	Δ	Comment
en9	Everule	2022-02-18 01:05:10	95
en8	Everule	2022-02-18 00:16:13	53
en7	Everule	2022-02-18 00:12:06	3	Tiny change: 'know I did. Therefor' -> 'know I didn't. Therefor'
en6	Everule	2022-02-18 00:02:46	2	Tiny change: 'g this to O(d) as an exe' -> 'g this to $O(d)$ as an exe' (published)
en5	Everule	2022-02-17 23:59:34	22
en4	Everule	2022-02-17 23:57:49	1288
en3	Everule	2022-02-17 23:39:27	801	Tiny change: 'ldots, a_k$ and simi' -> 'ldots, a_k]$ and simi'
en2	Everule	2022-02-17 23:26:38	3452	Tiny change: 'plus x \bihoplus y = ' -> 'plus x \bigoplus y = '
en1	Everule	2022-02-17 22:48:03	581	Initial revision (saved to drafts)

History