Generic data structure templates via C++ mixins

Правка en1, от clam, 2021-05-29 22:47:27

Hi Codeforces,

I've spent some time recently writing template code for common data structures for my own use, and I think I came across a really nice design for implementing data structures that have to be generic over both types and operations on those types---e.g. segtrees with their element type and the operation that combines them. Rather than modifying the template code itself itself, have the data structure inherit from a template parameter that contains the information of the element and operation.

That was a pretty bad explanation, but take for example this idea applied to the implementation of segtrees as per this classic blog.

template<class B> struct Segtree : public B {
    using T = typename B::T;

    size_t n; vector<T> s;
    Segtree(size_t n) : n(n), s(2*n,B::e) {};
    void update(int i, T val) {
        for (s[i += n] = val; i >>= 1;) s[i] = B::op(s[2*i],s[2*i+1]);
    }

    T query(int l, int r) {
        T la = B::e, ra = B::e;
        for (l += n, r += n; l < r; l >>= 1, r >>= 1) {
            if (l & 1) la = B::op(la, s[l++]);
            if (r & 1) ra = B::op(s[--r], ra);
        }
        return B::op(la, ra);
    }
};

which can be used like

struct MaxInt {
    using T = int;
    const T e = numeric_limits<int>::min();
    T op(T a, T b) { return max(a, b); }
};

using MaxIntST = Segtree<MaxInt>;

...

MaxIntST st(200);
st.update(2, 3);
cout << st.query(0, 200) << endl;
st.update(5, 6);
cout << st.query(0, 3) << endl;
cout << st.query(3, 100) << endl;

...

and so on. Standard segtrees are a rather simple example, but I think the main benefits come in when dealing with more intricate templates like lazy segtrees or HLD.

For reference, my (jank) lazy segtree using this

I'm obviously not the first person with an attempt at doing this. There's a ton of template code out there where you just modify the code directly to add the function you want, some of it isolates the parts that you change into the first few declarations of the code, and notably AtCoder's ACL itself has a function-generic segtree template. But, here's some of the advantages I believe this has over those approaches:

  • The place where you have to modify code is obvious, self-contained, doesn't pollute the global namespace (much), and you can easily make multiple of the same data structure with different functions / types. When using Kactl a few times I've run into the problem modifying some parts of the code to the operation I wanted, but missed a subtle place and had a weird bug because of it. This completely avoids that.
  • Since you aren't modifying any existing code, you can use scripts to insert in your template code automatically before compiling without worrying about having to modify code after insertion. There's a few editor extensions I've seen that advertise this feature, so this would in theory make that more useful.
  • If the same operation has to be used for multiple data structures, there's no need to duplicate code. The main example I can think of is HLD, where (at least with how Kactl does it) both the HLD code and the underlying segtree have to be aware of the operation you are doing---multiple places to potentially forget to change something and make a bug versus having an HLD<base class> that uses the information in the base class for itself and passes it down to the underlying segtree just by using a LazySegtree<base class>.
  • Optimization. Obviously substituting in everything yourself will optimize just as well as this, but as it turns out gcc on -O2 will not inline function pointer template arguments, so ACL has a (slight) extra overhead. For reference, look at the generated code on compiler explorer between mine and ACL. (Also, imo the global functions and template instantiation for ACL just looks uglier than instantiating via mixins).
  • Black-boxing: while it's always useful to know what the code you're calling out to is actually doing, sometimes you just want to treat it like a black box and use what it gives you without thinking about, reviewing, or even looking at the code. This and ACL involve no modification to prewritten code so you can do just that.

Assorted remarks: There's no actual reason why the template has to inherit from the base struct, it just avoids the pain of having to write static a couple times. As far as I can tell, the type of data structure that is generic over an operation is known as "augmented" and inheriting from a base class that doesn't have use on its own is called a "mixin".

I'm curious if other people have ran into any of the above issues / how they dealt with it (or didn't). Any thoughts / suggestions are welcome.

Теги #data-structure, #c++ template

История

 
 
 
 
Правки
 
 
  Rev. Язык Кто Когда Δ Комментарий
en2 Английский clam 2021-05-29 22:50:49 0 (published)
en1 Английский clam 2021-05-29 22:47:27 8179 Initial revision (saved to drafts)