Peter-007's blog

By Peter-007, history, 13 months ago, In English

Here's a simple problem for you.

You are given $$$n=200$$$ elements and you have to iterate over all possible quadriplets (4) of them, the order of elements of each quadriplet doesn't matter, and you can take same element more than one time. Give quick approximation of how many iterations will algorithm run.

Answer

It is not $$$1,6*10^9$$$

So if you have to iterate over all subsets of length $$$k$$$, it is important to remember to divide running time by $$$k!$$$. In fact, if $$$k=6$$$, you can fit in one second with $$$n<=85$$$, while $$$85^6≈3*10^{11}$$$, but you would iterate only $$$≈5*10^8$$$ subsets, and not knowing that you might not even think of it.

I hope this blog will help, because I was doing same mistake for a long time.

  • Vote: I like it
  • +78
  • Vote: I do not like it

| Write comment?
»
13 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Does this formula only work when the order of the elements does not matter?

Also, is this the difference between iterating an inner for from 0 to iterating from the outer for variable? Such as:

Code Example
  • »
    »
    13 months ago, # ^ |
      Vote: I like it +1 Vote: I do not like it

    Yes and yes. The first way of doing the loop doesn't give you the speedup by a factor of 24. It is the same time complexity but the latter is significantly faster.

»
13 months ago, # |
Rev. 2   Vote: I like it +222 Vote: I do not like it

Alternative Title : person discovers number of ways to choose x items from n things without repeats is nCx and not n^x.

(This is a joke)

  • »
    »
    13 months ago, # ^ |
    Rev. 2   Vote: I like it -26 Vote: I do not like it

    Disagree, I made this blog not because I discovered this formula, but because I never used it when analysing time complexity, and I saw other people do the same mistake.

    upd: ok, r/woosh

  • »
    »
    13 months ago, # ^ |
      Vote: I like it +23 Vote: I do not like it

    Interesting

  • »
    »
    13 months ago, # ^ |
      Vote: I like it +78 Vote: I do not like it

    … and deducts an almost-correct formula for counting them.

    • »
      »
      »
      13 months ago, # ^ |
      Rev. 2   Vote: I like it 0 Vote: I do not like it

      well he is close enough, do you have a way to analyse the exact complexity for his task? (with formula) (genuinely curious, maybe im missing something obvious)

      F(n, x) = sum(i = 1 to n, F(i, x — 1)) was the best i could get.

      (2 other people suggested ways to get it, however i realized a simpler way(kind of similiar to nor's))

      consider the difference between the previous element and the next element in the sorted order of the list, and you get a simple equation in variables that you can solve with stars and bars.

      • »
        »
        »
        »
        13 months ago, # ^ |
          Vote: I like it +34 Vote: I do not like it

        Here's a physicist's (quick and dirty, not necessarily rigorous) way of doing things:

        Assume $$$n$$$ is large enough. Then $$$1$$$ is small enough to be a $$$\mathrm{d} x$$$ (in slightly more rigorous terms, you use the trapezoidal rule).

        Then the relation (approximately) becomes $$$F(n, k) = \int_0^n F(x, k - 1) \mathrm{d} x$$$.

        By induction, $$$F(n, k) \approx \frac{n^k}{k!}$$$.

        • »
          »
          »
          »
          »
          13 months ago, # ^ |
            Vote: I like it +3 Vote: I do not like it

          im aware of the approximate result, i was looking for an exact one, thanks all the same.

          • »
            »
            »
            »
            »
            »
            13 months ago, # ^ |
              Vote: I like it +21 Vote: I do not like it

            For your recurrence, if you note the Hockeystick identity, you can easily show via induction that the answer is $$$\binom{n + k - 1}{k}$$$, assuming $$$F(n, 1) = n$$$.

      • »
        »
        »
        »
        13 months ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        It should be something like $$$n+k-1$$$ choose $$$k$$$, from stars and bars.

      • »
        »
        »
        »
        13 months ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        My comment somehow got deleted, reupload:

        I claim that F(n,k)=C(n+k-1,k), the proof is as below:
        If we have an array of length n and we want to partition it into k pieces (each piece must have at least 1 member), the number of ways we can do it is f(n,k)=sum(f(n-i,k-1),i=1~n)=sum(f(i,k-1),i=0~n-1) (if n<k, f(n,k)=0), which means we split off a piece that's i long and split the rest into k-1 pieces.

        We can construct a function g(n,x)=f(n+x,x+1),thus:

         g(n,x)
        =f(n+x,x+1)
        =sum(f(i,x),i=0~n+x-1)
        =sum(f(i,x),i=x~n+x-1)
        =sum(f(i+x,x),i=0~n-1)
        =sum(f(i+(x-1),(x-1)+1),i=1~n)
        =sum(g(i,x-1),i=1~n)
        

        Obviously, g(n,x)=F(n,x), which means F(n,x) is equal to the ways you can partition an array of size n+x into x+1 parts. Imagine we have x boards and we insert them into the n+x-1 gaps between the n+x array members. For each insertion policy there is one-and only one partition scheme associated with it. And since there is C(n+x-1,k) insertion schemes, there is also C(n+x-1,k) ways to choose k repeatable items from n different ones without considering order.

        Q.E.D.

  • »
    »
    13 months ago, # ^ |
      Vote: I like it +39 Vote: I do not like it

    The title is a bit misleading because complexity is still $$$O(n^4)$$$. But your critique is unfounded I think.

    In competitive programming, we need to estimate the running time of the code, and the most widely used method is calculating time complexity. But the complexity itself doesn't answer the question "Will it run in time under the given constraints?", which is what we really want. Our heuristic for this is to plug max limits into the calculated complexity and compare the resulting number with some kind of constant, like $$$10^9 \cdot t$$$ (replace $$$10^9$$$ with the number you believe in) where $$$t$$$ is the time limit in seconds. So, if complexity is $$$O(f(n))$$$ and $$$n \le N$$$, we look at how $$$f(N)$$$ and $$$10^9 \cdot t$$$ are compared. More precisely, we usually look at the number $$$\frac{f(N)}{10^9 \cdot t}$$$: if it is significantly (by an order of magnitude, let's say) less than 1, we can be safe; if it is greater than 1, that will be TL for sure; between $$$1/10$$$ and 1 there is a gray area.

    That is what I was taught when I was starting, and that is what I would teach people who are starting. I believe this to be the accepted standard in the community (modulo the particular number instead of $$$10^9$$$), correct me if I'm wrong.

    And this heuristic actually fails miserably in the case highlighted by this post, and I believe that this case is significant (there are more than a couple of problems falling under this category), and that there are people with more experience and bigger rating compared to the topic-starter who don't know that. It's not that people don't know the right answer, they just don't bother calculating it when they check if the solution will run in time, because they were taught to ignore constant factors when calculating complexity. Heck, there was a person who set the problem where the intended solution was using this kind of thing to prove that it is fast enough, and they did not provide the best constant estimation!

    "... when analysing running time" instead of "... when analysing time complexity" would be a better name though. $$$\frac{n^4}{24} + O(n^3)$$$ is the right way to write what you wanted to write.

    • »
      »
      »
      13 months ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      I agree with you about title being misleading, will change it.

    • »
      »
      »
      13 months ago, # ^ |
        Vote: I like it +3 Vote: I do not like it

      hi, im sorry for the misleading comment, my comment was made mostly in jest (because i believe this to be rather well known), however i do see that this could be easily mistaken for a serious suggestion at something wrong with the blog. I will update my comment to make it clear its a joke.

»
13 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Added correct formula for the problem.