parag776's blog

By parag776, history, 11 months ago, In English

Suppose I have a list of strings S, each string at most 50 characters long. I'm looking for a data structure and algorithm that can handle the following query efficiently:

Given another list of strings G (a small list of at most 20 strings), I want to find the lexicographically smallest subset of at most 15 strings from S. The result should satisfy the condition that each string in G is a substring of each and every string in the result. i need fast query.

I've searched extensively but couldn't find a satisfactory answer. Most solutions I found address the case with a single string in G, but I need to handle multiple strings as stated above.

Example: S: ["hello my", "yuck ", "mr elan", "now is the", "parag", "hello", "yustuc"] Query1: G: ["el", "m"] Query2: G: ["yu", "uc"] Desired Result1: ["hello my", "mr elan"] Desired Result2: ["yuck", "yustuc"]

If anyone has insights or knows the answer, please share your knowledge.

  • Vote: I like it
  • 0
  • Vote: I do not like it

»
11 months ago, # |
  Vote: I like it +3 Vote: I do not like it

First of all, I think u should define lexographically smallest subset. Because a subset is a set, and so its not ordered. So it doesn't make sense to compare subsets of S lexographically.

  • »
    »
    11 months ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    i meant suppose i get 10 results then those 10 results should lexicographically ordered but for a particular query i get 50 results then among those 50 results i need 15 results such that those results are lexicographically smallest amongst all 50.

    in the example "hello my" is lexicographically smaller than "mr elan"