I_love_Tanya_Romanova's blog

By I_love_Tanya_Romanova, 9 years ago, In English

Hello everyone!

I want to invite you to participate in November Clash at HackerEarth. Contest is scheduled on November, 28. Contest duration is 24 hours, so there should be some comfortable time for every timezone :)

There will be six five tasks in a problemset. Five Four of them are standard algorithmic problems with partial solutions allowed — you get points for every test that your solution passed. And the last task is an approximate problem — your score for this task depends on how good your solution is comparing to current best solution.

witua is author of this problemset. He is experienced setter — I bet you already saw his problems at Codeforces, TopCoder, CodeChef or some other site, and now it is time for him to debute at HackerEarth.

I was working on this contest as a tester. I believe it is going to be an interesting one :) I hope that several problems will be not too hard for beginners (don't give up and show your best with partial scoring — even very naive solution may give you some points; and 24 hours should be enough for you to read all problems and find some tasks which you can solve) while some tasks are challenging enough to make this contest interesting for more experienced contestants. shef_2318 worked on this contest as translator — you will be also provided with problem statements in Russian. Also I want to thank to thank belowthebelt for providing technical help and doing his best on fixing all issues and improving HackerEarth platform.

As usual, here is one more reason for you to participate in this contest:

Top5 of leaderboard will also receive some nice prizes:

  1. $100 Amazon gift card + HackerEarth T-shirt
  2. $80 Amazon gift card + HackerEarth T-shirt
  3. $50 Amazon gift card + HackerEarth T-shirt
  4. HackerEarth T-shirt
  5. HackerEarth T-shirt

Good luck to everybody — I hope to see you at the scoreboard :)

Upd. Contest has ended :) Thanks to everyone for participating :) Congratulations to winners:

1) anta

2) Carsten Eilks

3) Kmcode

4) HellKitsune

5) mugurelionut

»
9 years ago, # |
  Vote: I like it 0 Vote: I do not like it

The contest has started. Good luck, everyone! :)

  • »
    »
    9 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    So what's happened to the judge?

    • »
      »
      »
      9 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Was down for a while; should be working fine now. We'll extend the duration by sometime to compensate for it. Apologies!

»
9 years ago, # |
  Vote: I like it 0 Vote: I do not like it

I have result "attempted" and all testcases are accepted. What does that mean?

»
9 years ago, # |
  Vote: I like it +10 Vote: I do not like it

So is the sixth problem still going to be added or this is all of them now?

  • »
    »
    9 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    There will be only 5 problems, no more problems are going to be added. I'm sorry for announcement being misleading (will fix it soon).

    I really like that missing task; some time before a contest we had discovered a problem with time&memory usage measurement at HE platform, and because of that problem last task loses all its beauty. We decided not to waste a nice task — I hope it will be a part of some other contest in near future.

    • »
      »
      »
      9 years ago, # ^ |
        Vote: I like it +12 Vote: I do not like it

      About reusing it later: don't know about the others, but I already saw the statement while it was up for a short while, including the tricky restrictions, and even wrote a solution for it.

»
9 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Can you check the judge program of the approximate problem? I made some submissions which verified that the interval [X[1],X[N]] is ~10^7 on 19/20 cases — in this case, the worst absolute score on these test cases is < 5 — however, the judge program is even showing me scores > 100000, which are not possible, unless the judge program uses a different scoring function from the one mentioned in the problem statement.

  • »
    »
    9 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Thanks for pointing that out.

    Value A in formula is squared number of inversions, not number of inversions itself. Will be fixed soon.

»
9 years ago, # |
Rev. 3   Vote: I like it 0 Vote: I do not like it

The scores on the challenge problem are way too high; in particular, the scores for the trivial output are way too high; this may lead to loss of differentiation between top scores in the leaderboard (it does at this point). And you don't want the challenge problem to have multiple full scores.

Maybe altering the scoring function to difference/ratio of the final to the initial score would fix it?

UPD: Not to mention I got 2nd place score on that problem with a random if (which isn't the setter/tester's fault, more like contestants not trying hard enough).

UPD2: Ok, this solved itself.

»
9 years ago, # |
  Vote: I like it +8 Vote: I do not like it

How did you approach the challenge problem?

My solution consists of minimizing the number of inversions first (i.e. every move must decrease the number of inversions by 1) and, among multiple such moves, it picks the one which maximizes the squared sum of distances. This was the basic greedy which I later randomized (e.g. at each step, I picked a random move among the top K best ones). Towards the end of the time limit I only did this for the last moves (i.e. I kept a large prefix of the best solution found so far and only tried to improve something at the end).

What's strange is that on my local tests I had an additional strategy which was giving me between 2%-5% improvement on about 40% of the test cases (and no improvement on the others), but it didn't improve anything on any of the official test cases, which I find quite weird. The additional strategy was to focus only on increasing the squared sum of distances as much as possible (and ignoring if it would decrease/increase the number of inversions). I only used this strategy for improving the last moves (about 10%) of the best solution which tried to minimize the number of inversions first. I also tried doing a greedy which always picks the move which improves the score the most (i.e. it might be OK to increase the number of inversions if this also increases the squared sum of distances significantly). This was giving me similar improvements on my local tests (when used under the same conditions), but none on the official ones. Given that the difference between the top 8 solutions was < 0.1%, an improvement of even 1% on just 1-2 of the test cases with larger scores (out of 20) would have been enough for any of them to win.

  • »
    »
    9 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I had exactly that greedy approach. I've tried to improve it with something like annealing but it didn't give me any points on official testcases. Also I've tried some of modifications you described but my first attempt was the best one. I'm wondering what is authors solution.

  • »
    »
    9 years ago, # ^ |
    Rev. 3   Vote: I like it +15 Vote: I do not like it

    Just out of curiosity, I generated 100 random tests. I chose N randomly between 500 and 1000, and K randomly between 10000 and 100000. I chose the X coordinates randomly between 0 and 10000000 (avoiding repetitions), but I set X[1]=0 and X[N]=10000000 (in order to be similar to the official test cases). I chose the permutation P randomly and I chose each value of D randomly between 1 and 100.

    I then copy-pasted the best solutions of the top 5 competitors, except that in my case I used the solution which, for the last 10% of the moves, tries to maximize the squared sum of distances, ignoring the number of inversions (as explained in my previous comment). ceilks's solutions was taking too long (there is no time control in his code, so I'm guessing that his code just worked on the official test cases in time) and I couldn't compile HellKitsune's solution in the setup I had (my compiler couldn't find <bits/stdc++.h> and I didn't spend any time trying to find out how to make it recognize this header). So, in the end, I only evaluated my solution, anta's and Kmcode's.

    The results on the 100 test cases are below and my solution obtains better scores than the other two. This is mostly due to the fact that the additional strategy which focused on improving the squared sum of distances in the last 10% of the moves obtained between 1% and 5% better scores than the solution without it on 38/100 test cases. On the other test cases, as expected, my solution was generally worse than the other two, but by a very small margin (while the wins on the 38 test cases were by a large margin).

    However, as mentioned in my previous comment, this strategy brought zero improvement on the 20 official test cases. How is that possible? If the official test cases had been randomly generated, then there should have been around 7-8 test cases on which I should have got 1%-5% better scores than what I did. My only explanation is that the test cases were generated according to some hidden strategy which is significantly different from a random one. And this brings me to my biggest complaint about challenge problems on Hackerearth and also on Codechef. The strategy for generating the test cases is almost never described, which is an awful decision, in my opinion. If we can't generate local tests in order to tune our solutions, then everything becomes a guessing game. In the case of this problem, when I saw that a solution which was bringing me huge score improvements locally had zero effect on the official test cases, I basically stopped trying, because it meant that I was testing my solution on the wrong kind of test cases, without having any idea how the "right" (i.e. official) test cases looked like.

    Results of 100 random local tests (the scores on each line are: first-mine, second-anta's solution, third-Kmcode's solution; smaller scores are better). I will upload the files somewhere and share them later for anyone interested (though you can obtain similar results by writing your own generator and using the strategy I described at the beginning of this post)

    • 0: 118775.593255 120449.850401 121034.333374

    • 1: 175777.354474 179931.467292 182660.491005

    • 2: 531.401095 525.478701 531.565640

    • 3: 14083.361750 13871.374769 14049.528068

    • 4: 43271.604239 42615.338797 43037.820895

    • 5: 229440.948818 234355.370203 235493.339293

    • 6: 8468.543977 8348.384712 8449.553956

    • 7: 20765.191134 20355.213065 20722.876004

    • 8: 66153.103441 65946.799585 66942.328976

    • 9: 21465.960530 21062.681433 21394.392677

    • 10: 180313.676816 183058.700239 184447.359038

    • 11: 2189.342829 2156.765415 2187.870864

    • 12: 3671.068116 3627.771627 3665.717468

    • 13: 14520.477126 14283.212222 14477.683136

    • 14: 134693.351150 136524.129409 138722.517411

    • 15: 0.005492 0.004444 0.000000

    • 16: 33487.700300 33273.992054 33623.149107

    • 17: 21947.569780 21614.321303 21852.374478

    • 18: 45262.909914 45364.718182 45940.395344

    • 19: 81590.884365 81480.554588 82685.592188

    • 20: 0.168267 0.142691 0.000000

    • 21: 1656.637277 1634.607731 1656.759530

    • 22: 9598.209981 9445.979189 9586.128109

    • 23: 117017.351580 119593.477492 120509.618310

    • 24: 334.270813 330.613132 334.561708

    • 25: 3153.370299 3113.650690 3148.972083

    • 26: 51979.703701 52819.868027 53385.876226

    • 27: 29669.893828 29853.968418 30167.364562

    • 28: 31021.211354 30547.961608 30879.654225

    • 29: 127581.370180 130076.532780 132281.548622

    • 30: 2282.259273 2255.880674 2281.972294

    • 31: 10064.822119 9868.336529 10041.007310

    • 32: 27883.976786 27300.487018 27853.576393

    • 33: 86370.507894 84766.232678 86486.095749

    • 34: 11209.943362 11099.222906 11160.280797

    • 35: 18934.018932 18608.380386 18870.639383

    • 36: 2688.476182 2654.933772 2681.699868

    • 37: 9994.539039 9786.186934 9973.090106

    • 38: 125864.488913 128869.973054 129305.531915

    • 39: 0.011393 0.008895 0.000000

    • 40: 1083.051813 1070.024909 1083.352531

    • 41: 10553.200729 10432.236471 10535.842485

    • 42: 141849.048828 144639.894866 145798.831536

    • 43: 0.551054 0.520647 0.000000

    • 44: 10096.366076 9968.217139 10051.761981

    • 45: 580.658971 572.882198 580.896288

    • 46: 8776.439279 8612.498453 8752.525437

    • 47: 31523.776271 30860.513929 31484.003196

    • 48: 158054.016150 161068.239602 162778.073123

    • 49: 4308.224529 4245.484249 4299.767027

    • 50: 11084.775464 10857.521785 11055.225077

    • 51: 42758.692483 41995.694069 42495.939144

    • 52: 113756.680615 115940.220423 117499.705965

    • 53: 0.402649 0.395810 0.402809

    • 54: 8990.456485 8930.970726 8946.920708

    • 55: 22157.918747 22021.379702 22207.570395

    • 56: 50359.670605 50280.307220 50738.449711

    • 57: 73554.882744 74714.743325 76054.446864

    • 58: 170.974950 168.922806 170.998083

    • 59: 1541.745733 1520.683850 1538.981892

    • 60: 17427.172150 17122.913003 17347.208597

    • 61: 65590.437202 64008.103485 65281.658552

    • 62: 300.470178 297.507711 300.579359

    • 63: 2250.266834 2203.535288 2245.928809

    • 64: 3217.849383 3177.377405 3212.745288

    • 65: 23862.276415 23514.502469 23768.750398

    • 66: 69749.553179 71042.035768 71881.341771

    • 67: 153931.598838 159496.781745 162442.580444

    • 68: 4320.433198 4261.377842 4297.999279

    • 69: 13703.966877 13422.147557 13658.587314

    • 70: 100460.916986 102698.379084 103745.156410

    • 71: 253275.121133 258091.441184 258407.175896

    • 72: 12116.784968 11949.970400 12081.693306

    • 73: 4954.639811 4903.666153 4946.959437

    • 74: 24193.570723 24047.823174 24256.605802

    • 75: 65267.924194 66834.056255 67134.887620

    • 76: 135382.015927 140027.936887 142201.851356

    • 77: 300.368418 296.995973 300.436205

    • 78: 3868.452733 3802.703764 3861.658060

    • 79: 67184.118561 68056.156239 68503.908845

    • 80: 156314.895628 159524.611323 160497.491723

    • 81: 7744.743766 7659.351417 7720.956067

    • 82: 195067.253409 201764.859326 204134.506339

    • 83: 0.723379 0.662980 0.000000

    • 84: 8333.797364 8249.620756 8313.640940

    • 85: 30412.643462 29990.227520 30352.935345

    • 86: 111243.097039 111974.700032 114075.644949

    • 87: 195.116852 192.776825 195.299753

    • 88: 4947.566577 4890.392076 4933.071624

    • 89: 94584.077612 95578.847863 96398.144913

    • 90: 136226.694122 140816.825189 143466.898866

    • 91: 82127.785561 81450.464010 82681.843439

    • 92: 11528.936750 11445.924613 11487.978420

    • 93: 19486.206725 19306.875304 19429.696129

    • 94: 67369.912108 68563.368559 69130.315950

    • 95: 172928.020459 175725.841232 176545.667213

    • 96: 11004.470112 10905.679033 10947.414167

    • 97: 810.542780 798.601247 809.533873

    • 98: 80575.467958 81027.551761 81819.932406

    • 99: 71648.229224 70247.486462 71584.005115

    • Total: 4858824.932541 4918744.004140 4972999.650346

    (Small explanation: there are some test cases where Kmcode obtains a perfect score of 0, while my solution and anta's solution don't — in my case, this is because I never allowed to swap P[N], because on the official test cases K was smaller than the initial number of inversion by a sufficiently high difference)

    • »
      »
      »
      9 years ago, # ^ |
      Rev. 2   Vote: I like it 0 Vote: I do not like it

      If you want to get around stdc++.h, just replace it by every reasonable include you'd expect a solution to need. It's just a collection of libraries. But every modern g++ should have it.

      Also, pastebin or <br></br> tags to save post space.

    • »
      »
      »
      9 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      By the way, if anyone is interested, ceilks analyzed some parameters of the official test cases and it seems that the values of D seem to be more like <=10, than <=100 (as mentioned in the problem statement), which would explain why the strategy I mentioned (trading off some inversions for a larger squared sum of distances) has no effect on the official test cases. You can read more about it in the Comments section of the problem

    • »
      »
      »
      9 years ago, # ^ |
      Rev. 4   Vote: I like it 0 Vote: I do not like it

      Very interesting.(I think I submitted many solutions for this problem, but how did you find my best solution?)

      As for me, I used only hill climbing and modulated some random numbers to get the best score for ONLY the testcases of this contest ,so it doesn't work so good for another testcases I think.

      And as for anta , according to his twitter https://twitter.com/anta_prg/status/670854615156543489 , he said that he used the algorithm like "Breadth-first beam search".

      • »
        »
        »
        »
        9 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        Hm.. I think I missed taking your best solution. I saw that your final score was 99.977, so I went through your last few submissions until I found one which rounded the same as this (since they only show the score for each submission with 2 decimal places). But I think I picked one whose rounded score was 99.97, instead of 99.98 :( Sorry for that. The individual submissions also show the total absolute score (if you hover the mouse over the right place), but using that would have been too time consuming.

        So, obviously, the experiments I did were somewhat incomplete. It's possible that some other submission of each contestant (not the best scoring one on the official test cases) might score better on my local test cases. My point was just to point out that the official test cases missed some important cases, which show up easily when generating random test cases. As explained in another comment, this seems to be because the D values used in the official test cases were much smaller than the ones mentioned in the problem statement (more like D<=10 instead of D<=100). And this issue of having test cases generated by some secret strategy shows up again and again for challenge/approximation problems on platforms like Hackerearth or Codechef (only the Topcoder Marathon matches are professional enough from this perspective).

»
9 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Guys, four standard problems and strange challenge with optimising <1% of score. That's not how I like to spend my Saturday evening. I think there was lack of hard algorithmic problem with interesting subtasks. Also I think it's a good idea to write about subtasks scores in statements.

What what intended solution for the challenge problem?

»
9 years ago, # |
  Vote: I like it 0 Vote: I do not like it

how to solve Cats Substrings using suffix automaton? (if possible)

»
9 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Cats substrings is 452E - Three strings with slight variations.