[C++] Use basic_string for graphs to speedup code and reduce memory usage

Revision en8, by drdilyor, 2023-12-24 16:39:22

TL;DR: Simply use vector<basic_string<int>> adj(n); instead of vector<vector<int>> adj(n); for graphs. Don't use it anywhere else.

Hey Codeforces! I learned from our template lord nor about basic_string. Contrary to vector, it has memory optimization. When you use a vector, it will allocate on heap and the structure of vector is similar to this:

struct vector {
    size_t size;
    size_t capacity;
    T*     data;
}
/*
64 bits - size
64 bits - capacity
64 bits - data
*/

(Not to mention, the overhead of malloc)

Which means if you are using vector<int> with size 1, it will be using at least 28 bytes and accessing it requires indirection to a completely different part of memory, which is bad for cache. On 32-bit systems, memory usage is lower since they are 32-bits each, but the cache unfriendliness still holds. Also, as nor mentioned, this isn't exactly how the struct vector is implemented in libstdc++.

basic_string has Short string optimization, where it uses 1 bit flag for short or long mode. In long mode it behaves like vector, but in short mode it has different memory layout. For example, basic_string<int> in short form is similar to this (Note that capacity isn't needed as there is no allocation):

/*
 1 bit  - the flag
 7 bits - size
32 bits - int

32 bits - int
32 bits - int

32 bits - int
32 bits - int

24 bits - unused
*/

(I'm simplifying and not including memory alignment)

So in short form, basic_string stores elements directly inside the struct when the size is small! With int, it stores up to 5 ints without allocating on heap, which makes DFS much more cache friendly and uses less memory.

How much would you ask?

Cycle detection, 1905B

Problem 1 vector<edge>       121 ms  25.2 Mib
Problem 1 basic_string<edge>  83 ms  24.3 Mib
Problem 2 vector<int>         62 ms   4.5  MB
Problem 2 basic_string<int>   62 ms   2.9  MB

edge is a struct of 2 ints, so it's bigger and makes the optimization little less useful.

(Problem 2 doesn't require storing adjacency list but I guessed it's a good enough benchmark)

Bear in mind that basic_string requires the element types to be "simple" types, don't use it for vectors, sets and maps, only for ints, simple structs of ints or pairs of these (pair is a bit nuanced but it works).

More about it here

History

 
 
 
 
Revisions
 
 
  Rev. Lang. By When Δ Comment
en8 English drdilyor 2023-12-24 16:39:22 22
en7 English drdilyor 2023-12-24 13:49:46 83
en6 English drdilyor 2023-12-24 13:46:24 6
en5 English drdilyor 2023-12-24 13:45:53 210
en4 English drdilyor 2023-12-24 13:25:31 12
en3 English drdilyor 2023-12-24 13:23:57 2 Tiny change: ' example, basic_string<int> in short ' -> ' example, `basic_string<int>` in short '
en2 English drdilyor 2023-12-24 13:21:29 12 Tiny change: '*TL;DR**: use `vect' -> '*TL;DR**: Simply use `vect' (published)
en1 English drdilyor 2023-12-24 13:20:37 2536 Initial revision (saved to drafts)