Slow man's solutions. Independent sets calculations.

Disclaimer: this is not my solution. Everything I could do is to chew it up to myself, and public here as well.

Recently I was checking out solution for "1221G — Graph And Numbers".

The most intriguing thing was independent sets counting. Unfortunately I found existing explanation merely a hints of what actually is should be done in order to calculate that amount.

But must say even those hints are good! Seriously! It gives you a motivation to work on details yourself. So thanks to awoo and BledDest for that! This is much much better than nothing. And perhaps better then everything. So I recommend you guys to read first the legacy tutorial and if you spent let say 24 hours with little progress, continue reading this super chewing post.

Ah yeah, there is also C++ solution, but with no comments. Again. You may just use it. And again thanks for that! But if want to grab the idea of how it works, you should dig into each line of that code first, and if you gave up, welcome to my post!

Below is some sort of reverse engineering of that solution, namely countIndependentSets function.

I didn't use spoiler tag... So if you don't want to read it, just stop it immediately! Don't look down, man!

Tutorial

Naive approach takes O(N*2^N), which is roughly O(k*40*256*4B) this is kind of big number, eh?

So, we're about meet-in-middle approach. That means that we do same calculations for both halfes of original set, and then we join that data in order to get final result. In our context, the calculation would be to calculate the amount of all independent sets (IS). That supposed to take O(N*2^(N/2)), with N=40 it is quite possible for modern computers.

First. We represent graph as a set of incident masks IM[i]. Each incident mask IM[i] represents the set of vertices which are incident with vertex i. Or in another words IM[i] represent set of edges between vertex i and vertices marked by "1" in IM[i].

Naive solution to count independend sets is just next thing:

Enumerate all possible subsets, dependent and not.
Go over all vertices in current subset and ensure it is not incident to other vertices. Actually we just should check incidence with vertices we met earlier in this loop.
If there are no incident vertices, mark it as independent.

As we already mentioned naive solution takes O(N*2^N) runtime.

Meet in the middle (M-I-M).

Split original set onto S1 and S2.

Note: sometimes when we talking about some subset from S1 we will call it SS1, and if we're talking about subset from S2 we will call it SS2.

In order to make M-I-M possible we should collect additional data during naive calculations for S1.

Then we use that data and apply some sort of greedy algorithm for S2, so while calculating data for S2 we also can calculate the total. How to do it?

Meaty part starts here

Imagine some independent subset from S2, let it be SS2. Assume we have "k" independent subsets from S1 which are NOT connected with SS2 (SS1SS2={SS1[1], SS1[2],..., SS1[k]}). Of course if SS1[i] has subsets those subsets must be present in SS1SS2 as well.

In combination with SS2 we will get 1+|SS1SS2| combinations. It would be SS2 itself, plus it may be joined by one and only one of subsets from SS1SS2.

Oh no! We forgot that empty set is also subset of SS2!

SS1SS2={{}, SS1[1], SS1[2],..., SS1[k]}

Then combination of sets from SS1SS2 with SS2 will be just |SS1SS2|.

It's not that obvious how to do it though. How to get |SS1SS2|? Let's modify naive calculation as follows.

Enumerate all possible subsets, dependent and not (no changes). But note, we also go over empty set.
Go over all vertices in current subset (SS1) and ensure it is not incident to other vertices.

And update is here: each vertex has incident mask, while going over all vertices also make total incident mask by ORing individual mask. Final mask will represent set of vertices incident to vertices from SS1. If we cut part of mask for S1, then we get vertices from S2.

This is some SS2. And SS1 is incident exactly to that SS2. When we say set A incident to set B we mean maximum possible set of vertices, the vertices from A incident to.

Obviously we may bump into that SS2 several times. Just because several SS1 may be incident with exactly same SS2.

If SS1 we are checking at step 2 is independent, we pick SS2 it incident with and increment its counter.

Step 3. After step 2, we will get cnt[SS2[1]..SS2[2^|SS2|]] array, where each item SS2[i] stores amount of independent sets SS1s incident exactly with that SS2 ("exactly" means that is incident not to its subset neither its super set, but strongly SS2).

Step 4. For each SS2 go over all its subsets and accumulate their "cnt" values. If we will go from smallest SS2 till biggest, we can do whole step #4 in O(2^(N/2)) time. Store it back into "cnt[SS2]". Also take into account {} from S1 which will be connected with {} from S2.

Now cnt[SS2] holds amount of independent sets connected with SS2 or some its subsets, but not its super sets.

And here is most crucial point!

Crucial point.

Empty set. There it is. As long as SS2 includes empty set, cnt[SS2] also takes into account connections with empty set! So that just means that cnt[SS2] holds amount of independent sets not connected with S2 at all + amount of ISs connected with SS2 and its subsets, but not its supersets! Think about it.

So how to get the number of independent sets which are NOT connected with SS2?

Here we should make and upside down trick. Consider subset S2\SS2 (S2 without SS2). cnt[S2\SS2] holds amounts of ISs from S1 which are either not connected with S2 or connected with only subsets of S2\SS2, latter means those ISs connected only with set of vertices which don't belong to SS2. This is exactly what we need! |SS1SS2| is cnt[S2\SS2]!

S2 part.

Now while naive detecting ISs from S2 (which is some SS2), just pick out cnt[S2\SS2] and it would be |SS1SS2|. Here we may stuck with some confusion with empty sets again. At least I had for a while haha. How to take into account SS2 itself? This is a "bump" of {} from S1 into SS2. It is always taken into account, see step #1 and #4 from updated algorithm. Don't mix it up with incidence of SS1 with {} from SS2. Latter means that SS1 is not connected with S2 at all. Those are different cases and sort of different empty sets. Both cases are taken into account.

Below is BledDest's countIndependentSets implementation with my comments.

const int N = 40;
const int M = 20;

long long incident_mask[N];

// ...

long long cntmask[1 << M];

// ...

long long countIndependentSets()
{
        // S2 will have 20 or less vertices
	int m1 = min(n, 20);
	
	// S2 will contain the rest 
	int m2 = n - m1;
	//cerr << m1 << " " << m2 << endl;
	
	// cntmask is actually what we called "cnt" in our tutorial
	memset(cntmask, 0, sizeof cntmask);
	
	// Go over all subsets SS1, "i" is SS1
	for(int i = 0; i < (1 << m1); i++)
	{
		long long curmask = 0;
		bool good = true;
		
		// Go over all vertices "j" and skip ones that doesn't belong to SS1
		for(int j = 0; j < m1; j++)
		{
			if((i & (1 << j)) == 0)
				continue;
				
		    // If "j" is in current incident mask already,
		    // then it is incident to some of vertices from current SS1
		    // mark such SS1 as dependent (good=false).
			if(curmask & (1 << j))
				good = false;
				
			// Otherwise register "j" in incidence mask, and also register all its
			// incident vertices.	
			curmask |= ((1 << j) | incident_mask[j]);
		}
		
		if(good)
		{
		    // SS1 is independent,
		    // SS2 = high(curmask) = (curmask >> m1) represent SS2 that SS1 incident to.
		    // See step #3 of updated algorithm.
			cntmask[curmask >> m1]++;
		}
	}	
	
	// This is step #4,
	// This is how we enumerate all SS2, and also its subsets, and
	// accumulate exact bumps of all its subsets.
	// I would probably reorder that loops, though.
	for(int i = 0; i < m2; i++)
		for(int j = 0; j < (1 << m2); j++)
			if(j & (1 << i))
				cntmask[j] += cntmask[j ^ (1 << i)];

        // This is a naive variation for S2 part calculations
	long long ans = 0;
	
	// Go over all subsets of S2, namely SS2=i
	for(int i = 0; i < (1 << m2); i++)
	{
		long long curmask = 0;
		bool good = true;
		
		// Go over all vertices "j" and skip ones that are not in SS2=i
		for(int j = m1; j < n; j++)
		{
			if((i & (1 << (j - m1))) == 0)
				continue;
				
			// Filter out dependent sets	
			if(curmask & (1ll << j))
				good = false;
				
			curmask |= (1ll << j) | incident_mask[j];
		}
				
		if(good)
		{
                        // If SS2 is independent count all its combinations with
                        // independent subsets from S1, including combination with {}.
                        // Here (i ^ ((1 << m2) - 1)) actually selects subset S2\SS2.
                        // and thus cntmask[S2\SS2] means |SS1SS2|
            		
			//cerr << i << endl;
			ans += cntmask[(i ^ ((1 << m2) - 1))];
		}
	}
	return ans;
}