[Tutorial] Dictionary of Basic Factors — O(1) String Matching

Revision en2, by Leonardo16, 2021-06-23 00:30:48

Hello, codeforces.
I think this technique is important, and easy to code, although I haven't found much about it, so I've decided to make a blog explaining it.This technique allows to compare substrings lexicographically in O (1) with a preprocessing in O (NlogN).
As this is my first post on codeforces, please let me know if there are any mistakes.

Prerequisites:
  • Basic strings knowledge.
  • Sparse Table(Optional).

What Dictionary of Basic Factors is?

As in the sparse table we will have a matrix of size $$$N\logN$$$ called $$$DFB$$$, where $$$DFB[i][j]$$$ tells us the position of the string $$$S[i ... i + 2^j-1]$$$ (the first time it appears) among all the strings of size $$$2^j$$$ ordered lexicographically.

So, how do we build the matrix? Let's start with the lowest level, for powers of size $$$2^0$$$, we simply put the position of the character $$$S[i]$$$ in the dictionary (1 for A, 2 for B ... and so on).

For the following $$$ DFB[i][j] $$$ we can save a pair of numbers that are $$$ (DFB[i][j-1], DFB[i+2^{j-1}][j-1]) $$$

Now we replace the pairs by their position (the first time it appears) in the sorted array of pairs in that power of two

Handling queries:

Proof:

Tags #strings, #string matching

History

 
 
 
 
Revisions
 
 
  Rev. Lang. By When Δ Comment
en13 English Leonardo16 2021-06-29 00:24:36 22
en12 English Leonardo16 2021-06-28 21:05:23 111
en11 English Leonardo16 2021-06-24 20:08:52 237 Tiny change: 'strings. \nThis t' -> 'strings. \nThis t'
en10 English Leonardo16 2021-06-23 02:53:32 4 Tiny change: '[2], DBF [2] [2])$, w' -> '[2], DBF [3] [2])$, w' (published)
en9 English Leonardo16 2021-06-23 02:39:03 663 Tiny change: ' in $O(N)$ (I'll explain it more deeply later), otherwis' -> ' in $O(N)$, otherwis'
en8 English Leonardo16 2021-06-23 02:23:48 4 Tiny change: 'hose pairs, this can be' -> 'hose pairs. This can be'
en7 English Leonardo16 2021-06-23 02:22:01 8 Tiny change: 'irst time it appears) ' -> 'irst time the pair appears) '
en6 English Leonardo16 2021-06-23 02:20:42 44
en5 English Leonardo16 2021-06-23 01:41:15 810 Tiny change: 'he string ABACABA and we wa' -> 'he string "ABACABA" and we wa'
en4 English Leonardo16 2021-06-23 01:03:39 602 Tiny change: 'l sort in $N\log{N}$(But this ' -> 'l sort in O(NlogN) (But this '
en3 English Leonardo16 2021-06-23 00:33:42 73 Tiny change: 'ize $N\logN$ called ' -> 'ize $N\log N$ called '
en2 English Leonardo16 2021-06-23 00:30:48 153
en1 English Leonardo16 2021-06-23 00:27:13 1289 Initial revision (saved to drafts)