Блог пользователя zscoder

Автор zscoder, история, 8 лет назад, По-английски

Abridged Problem Statement : You have a list of N distinct words, and you want to print K of them. You have a very basic text editor : It supports 3 types of operations: typing a letter, deleting the previous letter, and printing the current document. You need to leave the document empty after printing K words. What's the minimum number of operations required to get the job done?

Constraints : 1 ≤ K ≤ N ≤ 300 The total length of all N words in each set will be at most 100, 000 characters.

Solution :

Name the words s1, s2, ..., sN. Let s0 = "". We initially have s0 on the document. After some operations, we want to print k of s1, s2, ..., sN and change the document to s0. For some pair of string si, sj, how many operations do we need to print sj if we have a document with si? Let p be the length of the Longest Common Prefix of si and sj. We can compute this naively in O(min(|si|, |sj|)). Then, we need to delete |si| - p characters, type |sj| - p characters, and print it (1 operation), for a total of |si| + |sj| - 2p + 1 operations. We can precalculate this value for each pair of words si, sj in a short time.

Now, suppose we have chosen our K words s1, s2, ..., sK. In which order do we print them in order to use a minimum amount of operations? We claim that in fact, if we print the strings in lexicographical order, we will use the minimum amount of operations. Suppose not, let s1, s2, ..., sK be arranged in lexicographical order. Suppose we print the words in the sequence s1, s2, ..., si - 1, sj, si + 1, ..., sj - 1, si, sj + 1, ..., sK, with i < j. Let a1, a2, ..., K be the length of word s1, s2, ..., sK respectively and denote the length of the Longest Common Prefix of sa and sb by L(a, b). Note that L(i, j) ≥ L(i, k) if i < j < k. Suppose to the contrary the longest common prefix of si and sk have longer length than the longest common prefix of si and sj. Then, let p be the longest common prefix of si and sk. The prefix of sj does not contain p but this contradicts the fact that si, sj, sk are increasing in lexicographical order. Thus, the inequality holds. We'll use this inequality many times in the calculations below. It can be easily seen that the total number of operations used is 2(a1 + a2 + ... + aK) - 2(L(1, 2) + L(2, 3) + ... + L(i - 1, j) + L(j, i + 1) + ... + L(j - 1, i) + L(i, j + 1) + ... + L(K - 1, K)). Compare this to when the words are printed in lexicographical order. Firstly, if j > i + 1, we just need to compare L(i - 1, j) + L(j, i + 1) + L(j - 1, i) + L(i, j + 1) and L(i - 1, i) + L(i, i + 1) + L(j - 1, j) + L(j, j + 1). We want the former to be at most the latter. Indeed, L(i - 1, i) ≥ L(i - 1, j), L(i, i + 1) ≥ L(i, j - 1), L(j - 1, j) ≥ L(i + 1, j), L(j, j + 1) ≥ L(i, j + 1), since the words are arranged in lexicographical order. So, summing up yields the desired inequality. Now suppose j = i + 1. We want to compare L(i - 1, i + 1) + L(i + 1, i) + L(i, i + 2) and L(i - 1, i) + L(i, i + 1) + L(i + 1, i + 2). Again, we have L(i - 1, i) ≥ L(i - 1, i + 1), L(i + 1, i + 2) ≥ L(i, i + 2), as desired. So, printing the words lexicographically is the optimal way.

Now, we already have our cruical step. It remains to select the K words that minimize the amount of operations. We can do this by dynamic programming. First sort the N words in lexicographical order. Let dp[end][parts] be the minimum number of operations needed to print parts number of words all of whose index is at most end and the last word printed has index end. We want min(dp[1][k], dp[2][k], ..., dp[n][k]). If end < parts, we let dp[end][parts] be INF.

Let cost(i, j) be the number of operations needed to print word j if our current document is word i. Recall that we already know how to compute cost(i, j) in O(1) since we already precomputed the values.

Now, how do we find dp[end][parts]? It is the minimum of dp[i][parts - 1] + cost(i, end) among 1 ≤ i ≤ end - 1. Time complexity is O(n2k). Finally, don't forget to add the cost from s0 to si. (in the beginning and end).

  • Проголосовать: нравится
  • -25
  • Проголосовать: не нравится

»
8 лет назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Auto comment: topic has been updated by zscoder (previous revision, new revision, compare).

»
8 лет назад, # |
  Проголосовать: нравится +6 Проголосовать: не нравится

I solved it using dynamic programming on a Trie.

»
8 лет назад, # |
Rev. 2   Проголосовать: нравится 0 Проголосовать: не нравится

Maybe you mustn't post solution before the contest ends?

  • »
    »
    8 лет назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    I suggest, if ever someone commits this mistake(knowingly/unknowingly), contact MikeMirzayanov and ask him to take down the post/comment.

»
8 лет назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Late Clarification : I drafted this 2 days before the contest ended and posted it after contest but apparently it still shows the date of draft.