For Cody challenges, should the best solutions be based on the least number of characters used in one's code?

Mingji Chen on 30 Aug 2020
Commented: Rik on 11 Sep 2020
It's pretty odd that a solution using more characters than usual can be the "leading solution" of a Cody problem and have the smallest size. Compare these two functions, which both find the sum of the integers from 1 to 2^x. Which one uses fewer characters and should therefore be the better solution?
function y = sum_int(x)
regexp '' '(?@y=sum(1:2^x);)'
end

function ans = sum_int(x)
sum(1:2^x)
end


goc3 on 10 Sep 2020

John D'Errico on 30 Aug 2020
Edited: John D'Errico on 30 Aug 2020
Cody has some unusual aspects. I hate to call the idea flawed, but there are some aspects I would like to change, even though I have absolutely no idea how to make them better. It is sadly easy to game the Cody scoring algorithm, at least once you learn some tricks.
The problem is you want Cody to reward the "best" code possible as the best solution. But the truly best code for any problem is often dependent on some feature of the problem. It may even depend on the user, or the computer system it will be solved on.
One user might not care about the time, but is severely memory constrained. This person cannot afford a solution that trades large memory use for speed.
Another person may have plenty of memory, but must solve the problem millions of times. For them, time is the factor that matters.
Finally, solution complexity is always an important feature. By teaching students to write unreadable code that just happens to optimize the Cody scoring algorithm, we create programmers who write terrible code while thinking it is good, without a clue as to the bad habits they have learned.
Having said all of that, Cody is itself a good idea, in that it does teach people to learn programming skills. It teaches them creativity in solving problems, a HUGELY important thing to learn for programming. It can teach you to look for different ways to solve a problem, recognizing that not every algorithm is always optimal for every problem.
My personal goal when solving a Cody problem is to not give a tinker's damn about the score. I know when I've done it well. For example, suppose I want to solve the problem posed, that is, compute the sum of the integers from 1 to 2^x, where x is an input parameter.
I'm sorry, but codes like this:
function ans = sum_int(x)
sum(1:2^x)
end
are just bad programming style, for several reasons. Using ans as the output variable to get a slight bump in Cody score is a pure hack. It makes for poor practice when someone starts to write real code, perhaps for a job, where readability of the code is an important factor. You need to learn to write code that can be easily debugged and used by others, including your successors.
Next, if x is even remotely large, say 20 or 25, you are now summing millions of numbers. At the same time, it is trivial to write the sum of the integers from 1 to N, as
N*(N+1)/2
This costs almost no computation time, very few flops, regardless of the size of N.
Therefore, a function to compute the sum of the integers from 1 to 2^x might be written as
function S = sum_int(x)
% Sum of the integers from 1 to N = 2^x, using N*(N+1)/2.
x2 = 2^(x-1);      % x2 = N/2
S = x2*(2*x2+1);   % (N/2)*(N+1)
end
This is always as efficient as possible. It could even be easily vectorized. Better code would be friendly. It would check to see if x is a non-negative integer. It would check to see if x was too large, causing overflows. And those error checks would be huge downgrades to the Cody score. But they would be upgrades to my personal score.
Would the above code score well? Surely not very highly. It is readable. It raises 2 to a power only once, not twice. It is efficient, not requiring millions of flops to compute a result for some values of x.
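As a rough sketch of what the friendlier, vectorized version described above might look like: the name sum_int_checked, the error identifiers, and the cutoff of 52 are assumptions made for illustration only. The cutoff is chosen because 2^x + 1 is no longer exactly representable in double precision beyond x = 52, which is a stricter test than actual overflow to Inf.

```matlab
function S = sum_int_checked(x)
% SUM_INT_CHECKED  Sum of the integers from 1 to 2.^x, elementwise.
% Hypothetical name; the checks below are one possible choice.
if ~isnumeric(x) || ~isreal(x) || any(x(:) < 0) || any(x(:) ~= fix(x(:)))
    error('sum_int_checked:badInput', 'x must contain non-negative integers.');
end
if any(x(:) > 52)
    % Assumed limit: above x = 52, 2^x + 1 is not exact in double.
    error('sum_int_checked:tooLarge', 'x must not exceed 52.');
end
x2 = 2.^(double(x) - 1);   % x2 = N/2, where N = 2.^x
S = x2 .* (2*x2 + 1);      % N*(N+1)/2 without ever forming 1:N
end
```

For example, sum_int_checked([0 1 2 10]) returns [1 3 10 524800].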
I want to make it perfectly clear - The Cody scoring algorithm has flaws. Can they be corrected? You could do many things, but any scheme you choose will probably still be possible to game. Does that mean Cody should not be used? Of course not! Cody is a great tool, if used properly.
I might as well say that money is a bad thing, terribly flawed, because it encourages people to rob banks to get more of it, to steal, to embezzle, etc. Money is fine, as long as we don't abuse it, as long as we understand that it should not be the purpose of our being.
And yes, you could say that I am being hypocritical. For someone with a lot of site rep to say that site rep is unimportant might seem that way. But when I die, I doubt that anyone will mention my site rep anywhere in my obituary, at least I hope not. But some people might remember the many tools I have provided, the many people I have helped at all levels. They might remember the lessons in numerical analysis or mathematics I was able to teach here.
SO PLEASE USE CODY! Just use it to learn good skills, and not bad habits. Don't worry about site rep. Site rep happens as a gradual consequence of your doing good work, not from gaming the system. And you can learn some nice programming skills, some nice mathematics from solving Cody problems. Just don't worry about the scoring algorithm. Once you start to do that, you focus on the wrong things.

Adam Danz on 10 Sep 2020
Yes, this was one of the things that deterred me from proceeding with Cody. I would get frustrated at knowing that there were better solutions (i.e., higher scores) but not having access to them. After learning that people can and do cheat the system, it became even more frustrating, since now ya gotta wonder if the higher scores are really better solutions or just cheats. But if you put the competition aside, it's really fun to be exposed to so many different problems that you wouldn't normally get involved with. The Answers forum also provides that outlet.
Mario Malic on 10 Sep 2020
Consider some very simple task, like calculating the average.
Out of a perfect score of 100%:
• Correct result 50%
• Correctly used functions 50% (Using them, in most cases, code will be understandable, short and fast)
In this case the correctly used function would be sum, since it is the only function the user needs, so it would be worth the full 50%. In more complex tasks, where the user should use two, three, or more functions, each would be worth 25%, roughly 16%, or x%.
• -25% when the user writes a for loop over the elements of the array, adding them and dividing by the number of elements, instead of using sum
• -5% when size of a variable changes in a loop (when it's not initialised beforehand)
• -1000% when eval is used
• Other examples relevant to the specific problem
This might be hard to implement for all problems, especially for user-written ones, but it would be good for introductory ones.
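A minimal sketch of how the rubric above could be turned into a number, assuming the individual checks (correctness, which functions were used, loop and eval detection) are produced by some separate analysis. The function name rubric_score and all of its inputs are hypothetical.

```matlab
function score = rubric_score(isCorrect, expectedFns, usedFns, usesLoop, growsInLoop, usesEval)
% RUBRIC_SCORE  Percentage score under the rubric listed above.
% expectedFns and usedFns are cell arrays of function names;
% the remaining inputs are logical flags (all hypothetical).
score = 50 * double(isCorrect);              % correct result: 50%
% The other 50% is split evenly across the expected functions.
nUsed = sum(ismember(expectedFns, usedFns));
score = score + 50 * nUsed / numel(expectedFns);
% Penalties: for loop, growing a variable in a loop, eval.
score = score - 25 * double(usesLoop) ...
              -  5 * double(growsInLoop) ...
              - 1000 * double(usesEval);
end
```

With this sketch, a correct solution that calls sum and nothing questionable scores 100, while the same solution using eval scores -900.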
Rik on 11 Sep 2020
It also requires the writer of the challenge to know the best solution in advance, which is probably bad design. By now I'm confident in claiming I'm fairly proficient, but 'even' I still sometimes see functions in solutions by others that I never knew anything about. It then turns out that those functions have been part of Matlab for a decade (or two).

Rik on 30 Aug 2020
My personal opinion: no. I would favor a different metric: time.
If you go for character count, that will only drive the use of short variable names, which doesn't teach good coding practices. A mad hunt for the best performance is probably also not optimal, but at least you can see a benefit outside Cody. It should teach you ways to speed up your code, instead of teaching you all the stupid ways you can circumvent the blocking of eval with regexprep.
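To make the time metric concrete, here is a small sketch that times the two approaches from this thread with timeit. The choice x = 20 and the variable names are arbitrary assumptions, and the measured times will depend on the machine, so none are shown.

```matlab
x = 20;                                  % example input, chosen arbitrarily
f_sum    = @() sum(1:2^x);               % builds and sums 2^x numbers
f_closed = @() 2^(x-1) * (2^x + 1);      % closed form N*(N+1)/2 with N = 2^x
assert(f_sum() == f_closed());           % both approaches agree
t_sum    = timeit(f_sum);                % typical run time in seconds
t_closed = timeit(f_closed);
fprintf('sum(1:2^x): %.3g s, closed form: %.3g s\n', t_sum, t_closed);
```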