Many years ago, when I was teaching at Cornell, my then-colleague Nick Kiefer kept me endlessly amused with the creative assignments he dreamed up for the students in his Decision Theory class. I’m reporting this from memory, so I don’t guarantee that I have it exactly right, but I think this was one of those problems:
- On Day Zero (before classes start), Nick uses his pocket calculator to generate, at random, a secret number between 0 and 100. This number is drawn from a uniform distribution, which means (roughly) that the drawing is unbiased, so any number is as likely to come up as any other. The students’ job is to guess this number.
- On Day One — the first day of class — Nick uses his pocket calculator to generate a Number of the Day. He doesn’t tell the students what the Number of the Day is, but he does tell them whether it’s larger or smaller than the secret number.
- On Days Two, Three, and so forth, he does the same thing. He generates a Number of the Day and announces whether it’s bigger or smaller than the secret number.
- You, the student, can submit your guess on the day of your choice.
- You are then assessed two penalties. The first penalty is equal to the square of the error in your guess. So if the correct number is 3 and your guess is 5, your penalty is 2², or 4. The second penalty is equal to the natural logarithm of the day when you submit your guess. So if you submit your guess on Day 7, your penalty is the log of 7, which is about 1.9459.
- These penalties are subtracted from some fixed number of points that you’re given to start out with (say 1000). The result, after the subtraction, is your score for the assignment. This score counts for a significant fraction of your course grade.
Clearly, there’s some tension between the two penalties. To minimize the first penalty, you want to submit your guess as late as possible, when you’ve got the most information. To minimize the second penalty, you want to submit your guess as early as possible. To minimize the sum of the penalties — well, what should you do, exactly?
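For concreteness, here is one way to simulate the game as stated. This is only a sketch: the day-40 deadline and the posterior-mean guessing rule are illustrative assumptions, not a claim about the optimal strategy.

```python
import math
import random

def play(guess_day, strategy, points=1000.0):
    """Simulate one run of the game and return the final score."""
    secret = random.uniform(0, 100)
    announcements = []
    for _ in range(guess_day):
        notd = random.uniform(0, 100)  # Number of the Day (never revealed)
        announcements.append("larger" if notd > secret else "smaller")
    guess = strategy(announcements)
    penalty = (guess - secret) ** 2 + math.log(guess_day)
    return points - penalty

# Illustrative strategy: guess 100*(smaller+1)/(n+2), the posterior mean
# for a uniform secret number given the announcement counts.
def posterior_mean_strategy(announcements):
    n = len(announcements)
    smaller = announcements.count("smaller")
    return 100.0 * (smaller + 1) / (n + 2)

random.seed(0)
scores = [play(40, posterior_mean_strategy) for _ in range(10000)]
print(sum(scores) / len(scores))  # average score when always guessing on day 40
```

Swapping in other guess days or strategies makes the trade-off between the two penalties easy to explore.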
I remember loving the whole idea of this problem when Nick first described it to me, but, oddly, I don’t remember actually trying to solve it (in the sense of devising a strategy). I should probably be embarrassed by this failure of curiosity, but I made up for it a couple of days ago when I recalled the problem and worked on it for a while.
To avoid spoilers, I won’t say any more about my calculations. But: Suppose you’re in Nick’s class and your career depends on this assignment. What’s your strategy?
PS: When this first posted at 2:01AM Eastern Time, there was a final paragraph saying that I thought the problem would be more interesting with a different set of penalties. It is now 2:11AM Eastern time and I no longer believe that, so I’ve struck that final paragraph. (For the record, my mistake was caused by forgetting that the random number was between 0 and 100 and solving the problem as if it were between 0 and 1.)
Addendum: I believe the solution to the problem as stated involves waiting an unrealistic number of days (that is, many more days than are in the typical academic semester). This doesn’t particularly bother me, but if it bothers you, you might want to tackle a revised problem in which the first penalty is divided by 10.
Intuitively, it seems you should leave it quite late. Equally intuitively, I think Thomas Bayes will weigh in on this very soon! I’d better get cracking.
The marginal cost of delaying guessing by a day gets smaller each day (in terms of the log penalty). For example, guessing on day 40 would incur a penalty of 3.69, while guessing on day 39 would incur a penalty of 3.66. I’d guess on the last possible day and use all the information given so far to estimate the number.
1) Go into Nick’s office on Day 0 before classes start, and guess any number. Then your second “penalty” will be log of 0, or negative infinity, thus ensuring a healthy boost to your grade.
2) On any day, submit as your guess the imaginary number 1000000i. Assuming Nick’s number is an integer N between 0 and 100, your first penalty will be equal to the square of 1000000i – N, or some complex number equal to negative one trillion plus some comparatively small imaginary part. Add that to your grade and hope that the imaginary part is dropped.
Ok, after some false starts, I get this. If after M days, he’s said *every time* that the number of the day is less than the secret number, the expected decrease in penalty if I wait a day is -10000*(2*M*(M+1)!+4*(M+1)!-3*M^2*M!-11*M*M!-8*M!)/((M+2)^2*(M+3)^2*(M+4)*M!) – log(1+1/M)
If I approximate log(1+1/M) by 1/M, this is positive as long as -M^4-10*M^3+9963*M^2+9940*M-36>0, that is, M<95.4.
If sometimes the number of the day is more, this only pushes out the 'best guessing' day. Since a typical semester is less than 95 days, it seems to confirm Joshua's and my intuition. However Bennett's strategies are certainly worth a shot.
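The sign change in that quartic can be checked by bisection, taking the expression above at face value:

```python
# The commenter's condition: wait as long as
#   -M^4 - 10*M^3 + 9963*M^2 + 9940*M - 36 > 0
def f(M):
    return -M**4 - 10*M**3 + 9963*M**2 + 9940*M - 36

# Bisect for the sign change between M = 95 and M = 96.
lo, hi = 95.0, 96.0
for _ in range(60):
    mid = (lo + hi) / 2
    if f(mid) > 0:
        lo = mid
    else:
        hi = mid
print(lo)  # the breakeven falls between 95 and 96, consistent with waiting ~95 days
```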
I’m going to guess on day 1. If he says larger, I will guess 25, and if he says smaller, I will guess 75.
Subsequent days’ “information” only seems to confuse the question.
Let’s say I’ve waited 100 days and received 60 “largers” and 40 “smallers”. I would think I would want to guess 40, but even with uniform distributions, there is enough room at the margins for that guess to be way off.
A couple meta strategies.
1) Assuming this is common practice for his course, guessing on day zero is strongly suggested.
2) When is the grade returned to the student? If it’s before the relevant ‘drop with no meaningful transcript penalty’ deadline, and you have a friend not otherwise taking the course, you can assure yourself of the exact number. Alternatively, a randomly chosen student sacrifices their grade for the rest of the class. (or two, if guessing 0/100 by the first is too much of a cost)
————
Thoughts on the problem
Determining the expected value of the secret number on any given day is a fairly simple problem. Picture a circle with d + 2 random marks on it: the expected distance between two marks with n marks in between them is (n+1)/(d+2) times the circumference [cut the circle at one point to make a linear range 0–100, with the secret number as one of the marks; the circumference here is 100]. But calculating the square of the expected error of that guess is a bit harder.
Without running any simulations, I’d expect the first error term to be roughly proportional to 1/(d+2)^2, the square of the expected gap between adjacent numbers generated [including 0/100 as a number]. The derivative of that term gets smaller faster than the derivative of log d, but it’s also multiplied by 100^2, so there’s a trade-off.
My best approach right now would be to apply Bayes daily to determine a probability distribution over the possibilities, estimate the change in the first error term over the next day or two, compare that to the change in the second, and guess on the first day on which I don’t expect waiting to improve my overall grade.
I’m tempted to assume that the expected error of the first term is different for results near 50 than for results near 0/100 as the standard deviation of a random variable with probability 0.50 is larger than one with probability 0.99, so a fixed date of choosing doesn’t seem applicable.
What if it’s the same? Does he tell them it’s the same, or does he generate a new number, or does he perhaps randomly decide whether to say “larger” or “smaller”? It throws a spanner in the calculations, because a day with “same” gives you no useful information.
This assumes you’re trying to maximise your expected grade. What about risk preferences? Most students would be risk averse, but you could imagine some students who think they’re likely to fail otherwise trying to gamble it up to increase their chances of a lucky pass.
Although risk is an important part of decision making, it’s not so desirable to have grades significantly affected by random noise, given their purpose as a signaling device.
But note that there is also a decreasing marginal benefit by delaying guessing. With no information, your best guess is 50 with an average error penalty of 833. With a single “higher” or “lower”, your best guess is 33 or 67 respectively (I just trial-and-errored this on Excel), with an average error penalty of 561. If you had a large amount of information, your average error would be small and one more piece of information would reduce your expected error penalty by a small amount in both relative and absolute terms.
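Both numbers quoted here can be checked in closed form: the no-information penalty is the variance of a uniform variable on [0, 100], and after one "larger" announcement the posterior is a scaled Beta(1, 2). A quick sketch (the exact value is 5000/9 ≈ 555.6, close to the 561 found by trial and error in Excel):

```python
from fractions import Fraction

# With no information, guess 50: expected penalty is the variance of
# a uniform variable on [0, 100], i.e. 100^2 / 12.
no_info = 100**2 / 12
print(no_info)  # ≈ 833.33

# After one "larger" announcement, the posterior density is proportional
# to (1 - x/100), a Beta(1, 2) scaled to [0, 100]. The best guess is the
# posterior mean (100/3 ≈ 33.3) and the expected penalty is the variance:
# 100^2 * a*b / ((a+b)^2 * (a+b+1)).
a, b = 1, 2
one_obs = Fraction(100**2) * a * b / ((a + b) ** 2 * (a + b + 1))
print(float(one_obs))  # ≈ 555.56
```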
I am reminded of an econ lecturer (who blogs here) who assigned a word penalty for an essay with a 1200 word limit as ((1-(# words over limit/1200))^0.5)* essay grade. Thus, you would get a complex grade if you wrote more than 2400 words.
He also used a Cobb-Douglas production function for weighting the essay’s content vs. its spelling, grammar and punctuation.
Everyone (including me, when I first started thinking about it) seems to be assuming the chosen number is an integer… are we correct?
John Faben: I don’t know for sure what the original problem was, but when I worked on it a couple of days ago, I did *not* assume the chosen number is an integer.
This might help some folks: http://en.wikipedia.org/wiki/Checking_whether_a_coin_is_fair#Posterior_probability_density_function
Assuming no risk aversion it seems you have to find the probability distribution using the formula above, square it, find the mean and resulting variance (http://en.wikipedia.org/wiki/Variance). Then work out how much the resulting variance decreases when you add one more H or T to the formula. Compare this to the marginal increase in ln(n), which is 1/n.
Given the small penalty for delaying it seems the optimal could well be the last day of semester.
How long is the class? i.e. what is the maximum number of days? Since the log of number of days increases only very slowly, whereas the square of the error can be huge, I would assume as a first guess you are better off waiting as long as possible. There would come a point where the gain in information is not worth the increase in penalty, but it appears to me that the class will be over long before this happens. This makes me think I have missed something.
If we think about the distribution of numbers drawn: as the number of draws grows, the more closely we approach a “flat” distribution, with all numbers drawn equally often. Given this information after, say, 1 million draws, we could identify the mystery number with an error of less than 1, i.e. precisely. All numbers will have been drawn very nearly 10,000 times. Fewer draws will introduce “noise” into this data. There is no point in adding further data once the “resolution” gets to less than 1. We would need to choose a confidence level for this.
If you are adding signals to reduce signal to noise ratio, you get an improvement according to the square root of the number of signals added. I am not sure this is really applicable, but it does not change my original view that waiting for as long as the class is likely to last would be the best strategy.
This seems to be analogous to the Schwarz Bayesian Information Criterion (BIC) used to decide between empirical models. Let L be the likelihood function (for Gaussian errors, -2 ln L reduces to the sum of squared errors up to constants) and N be the number of observations (days); then:
BIC = -2 ln L + k ln N.
Is the “number of the day” also uniformly distributed between 0 and 100? Or do we get no information at all about the “number of the day”?
Phil: The number of the day is also uniformly distributed between 0 and 100. Or at least that’s what I’ve been assuming.
Well, one part of my strategy certainly involves rethinking every time he pulls a random number, because the actual sequence makes a big difference. Consider case 1, where his random draw is 0 every day, which he announces is lower, versus the split points you would find doing a binary search, narrowing the field quickly. So assuming it is correct to wait at least a few days, it might pay off to map the grid, as it were, and test the remaining candidates daily. When the mesh gets fine enough, the (predictable) log penalty will be greater.
Solution with no thought: assuming a 14-week semester with 3 classes (new numbers) a week, you’ll get 42 numbers by the end, which amounts to a penalty of ln(42) = 3.7, or a little less than the equivalent of missing by 2. Certainly less than the difference between missing by 2 and missing by 3. So waiting until the end is probably pretty close to optimal.
Solution with a little thought: our expected loss given s and n (the number of days we wait) is s(100-s)/n + ln(n). Differentiate and set equal to 0, and we’d like to stop when n = s*(100-s) (assume we’re only interested in minimizing expectation here).
Now this n is always bigger than 99 so we wait until the last day (42).
To “justify” maximizing expectation here: penalty 2 is deterministic, and penalty 1 has a scaled binomial distribution (assuming we knew s), so no need to worry about huge variances from either of them.
Of course, if we had an infinitely long semester (the joys), then I’d need to think about whether to stop on day 99 or day 2500.
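The first-order condition above is easy to verify numerically. This sketch uses drhay's loss model (a squared-error term of s(100-s)/n plus the ln n delay penalty, with s on the 0-100 scale) and checks the s = 50 case, where n = s(100-s) predicts day 2500:

```python
import math

def expected_loss(n, s):
    # drhay's approximate expected loss: squared-error term + delay penalty
    return s * (100 - s) / n + math.log(n)

# The first-order condition -s(100-s)/n^2 + 1/n = 0 gives n = s(100-s).
# For s = 50 that predicts n = 50 * 50 = 2500; check the discrete minimum.
s = 50
best_n = min(range(1, 10000), key=lambda n: expected_loss(n, s))
print(best_n)  # → 2500
```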
How many days are there?
Also, what if the number of the day is the same as the secret number?
After knowing those two things I think I can get it.
I had assumed an integer. If the number is not an integer, then with the error less than 1, the square becomes smaller. So once you get to within 1 of the number, there is little to be gained by getting even more precise.
If the number of points at the start is 1000, this is very much larger than the penalty from even a very large number of days. Waiting for 100 days will give you a final score of 995 if you guess correctly. Getting 99.5% is a pretty good mark, so waiting as long as possible is not going to seriously dent your grade for the term. I am not going to waste much time trying to optimise my result from 99.5% to 99.6%. Giving you 10 points at the beginning would be different, or assigning a grade based on your position in the class.
Intuitively, the logarithm penalty is small. You can go as many as 403 days and still only lose 6 points. But if you’re off on your answer by as little as 2.5 points, you still lose the 6.
So my gut says you wait it out a long time.
The expected penalty points for being off is just equal to the variance of your estimate, no? The variance of your estimate is .5 *.5 * 100 / N, where N is the number of trials. The penalty points are equal to the log of N.
So for the breakeven point, you want to find N where (variance after N+1 trials – variance after N trials) = log(N+1) – log(N).
When I program this, I get exactly 100 trials, assuming the secret number is 50. If it’s not, the binomial variance is lower, and so you’ll quit sooner than 100 trials.
I bet there’s an easier way. And I bet I did something wrong.
Yup, waiting is obviously the right way to go. Most grades don’t have enough sig figs to count a difference between log(n) and log(n+1) (for n > 10 or so). I think the problem would be a lot trickier and more interesting if the penalty for waiting were somehow scaled to make it more significant.
I simulated this once. Would make an interesting Monte Carlo simulation.
On Day 0, using Excel’s RANDBETWEEN(0,100), it spat out 71.
I then used the same function for 30 days, comparing the LN error from number of days to the maximum error possible for a random guess:
The correct number was generated on the 9th and 17th days. Even if you eliminate those, the potential error gets smaller and smaller by the square compared to increasingly small increases in the natural log.
This is definitely a case of more information over time being more useful than the “decay of time”. The best strategy would be to plot out the days and guess the last day of class if you want the lowest expected penalty and highest expected score.
This is a very interesting problem, but what does it have to do with decision theory? In reality we don’t know the properties of distributions (values for mu and sigma) that would enable us to do the math. Just ask the guys that modeled MBS risk.
If this course is about making the best possible decisions under any given set of circumstances, then the strategy for this secret number game is clear. Drop the class as soon as it’s announced. This guy may not be as bad as the professor who throws the test booklets down the stairs and gives them a grade that depends on distance. He’s a member of the same club, though.
I suppose that sticking with this Decision Theory course would be excellent strategy for someone who would otherwise not stand the ghost of a chance of passing. The gamble becomes worthwhile. However, the better the student, the worse a deal this course becomes.
Let’s take student A. He’s really smart, has analyzed the problem, and waits until he reaches a 90% confidence level that waiting for more information would not narrow the guess sufficiently to offset the time penalty. He makes his guess, correctly basing it on the mathematically best number per the information. Too bad for him that, this time, it’s the one class in 10 where the random “number of the day” data end up misleading and causing rational guesses to be way off. Student A has done everything right, and gets a lower grade than student B, who manages to make a lucky guess.
After 10 simulations (couldn’t resist):
| Trial | Secret # | Pen Max | Days | Note |
|---|---|---|---|---|
| 1 | 71 | 2.20 | 7 | Guessed |
| 2 | 84 | 19.91 | 50 | |
| 3 | 75 | 2.64 | 14 | |
| 4 | 13 | 3.69 | 40 | Guessed |
| 5 | 44 | 39.91 | 50 | |
| 6 | 38 | 3.18 | 24 | |
| 7 | 3 | 7.47 | 32 | |
| 8 | 90 | 11.80 | 18 | |
| 9 | 52 | 2.83 | 17 | Guessed |
| 10 | 54 | 4.87 | 48 | |
| Average | | 9.85 | 30 | |
I think this is an excellent problem for Decision Theory. The less uncertainty, given more information, the better the decision will be, and the risk of additional time is relatively low. What if the penalty were the square of the day number? That would be interesting, and probably a more appropriate simulation of life…
(I’m assuming that the random numbers are real numbers uniformly distributed between 0 and 100.)
On the Nth day we will have the original number localized to an interval that is the difference between two numbers that have been rank ordered from a sequence of uniform random numbers of length N. (I think the width of every interval has the same distribution, but I’ll have to verify this for the first and last intervals.)
Let’s call the width of the interval in which we’ve localized the unknown number on the Nth day W_N. If we guess the number to be the mean of the interval, then the expected cost on the Nth day (conditioned on the width of the interval) is
E[C | W_N, N] = W_N^2/12 + log(N).
Again, I’m assuming that the interval widths all have the same distribution and that the original number is equally likely to be in any of the intervals.
On the Nth day, I believe the width of the first interval is a beta-distributed random variable with mean equal to
mean = 100/(N+1)
and variance equal to
var = 100^2 * N/((N+1)^2 * (N+2))
So, the mean-square-value for the width of the first interval is
E[W_N^2 | N] = 100^2 * 2 / ((N+1) * (N+2)),
and the total expected cost on the Nth day is:
E[C | N] = 100^2 * 2 / (12 * (N+1) * (N+2)) + log(N)
If I didn’t make a mistake (big IF), the cost is minimized by selecting on the 55th day. The expected cost is a monotonic function up to this point, so, if the class ends before that, we should estimate on the last day of class.
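The expected-cost expression above is easy to minimize numerically; a quick check, using the formula exactly as written:

```python
import math

# Expected cost from the order-statistics argument above:
# E[C | N] = 100^2 * 2 / (12 * (N+1) * (N+2)) + log(N)
def expected_cost(n):
    return 100**2 * 2 / (12 * (n + 1) * (n + 2)) + math.log(n)

best = min(range(1, 500), key=expected_cost)
print(best)  # → 55
```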
Okay, by pulling out my graphing calculator, and graphing the function x^(# of ‘less than’s announced) * (100-x)^(#of ‘greater than’s announced), and then normalizing the function so that the integral from 0 to 100 is 1, so I think I have the “right” way to guess the mystery number down…
But is there a good, quantitative way to describe how much more concentrated this distribution is becoming each day? Anybody here know if there are any good techniques in statistics or measure theory for this?
Most commenters seem to be missing the fact that the Number of the Day is NOT REVEALED, only whether it’s larger or smaller than the Secret Number. Therefore as information is gained, the interval of possible numbers only shrinks probabilistically (except for the endpoints which can be ruled out), i.e. if you get “larger” 10 days in a row that doesn’t rule out the Secret Number being 99.
Noah: you are correct . . . I completely missed that fact.
I haven’t worked out the numbers, but the basic approach to the problem would be this (and I think you need to assume he tells you that the number of the day is greater than *or equal* to the secret number to avoid giving it away by accident):
Start with a uniform probability vector (which can be converted to a probability distribution by dividing by the sum of the entries). Each time a result is announced, you multiply the current vector by the likelihood vector on the outcome history, which is the same as a likelihood ratio, but generalized to more than two entries. So if the first day (well, any day) you hear “greater than or equal”, your likelihood vector is [101/101 100/101 99/101 … 1/101], and you multiply your current vector by that. (It’s the opposite order if you hear “less than or equal”.)
At any given day, you can turn your current vector into a probability distribution as above. You can also calculate what your guess should be (the one with the highest probability, or the average of these if there’s more than one). The expected error penalty given that guess would be the sum over i = 0 to 100 of p(i) * (g – i)^2, where g is your guess.
To determine whether you should wait another day, see if a) the expected decrease in error penalty is less than the b) increase in time penalty.
To get a), find average over the two possible changes in error penalty if the next outcome is greater vs. less (they should have different probabilities now, so be sure to weight by that).
To get b), just compute log(d+1) – log (d).
I haven’t looked into possible shortcuts involving calculating the entropy of your current probability distribution and when it stops decreasing quickly enough so that you can do a simple function minimization problem, but this is one way.
(And yeah, to make this interesting, you’d really need to have a higher penalty on waiting … shouldn’t it increase exponentially?)
Oops, it stripped out everything in my greater-than/ less-than symbols. To clarify, you start with an initial probability vector of [1 1 1 … 1] (101 entries in total). (This turns into a probability distribution by dividing each entry by the sum of them.)
The likelihood vector you use for updating on “greater than or equal” is [101/101 100/101 99/101 … 1/101].
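The update described in these two comments fits in a few lines. This sketch assumes an integer secret number 0-100 and the "greater than or equal" announcement convention discussed above:

```python
# Discrete Bayesian update for the integer version of the problem.
# State: unnormalized weights over the 101 possible secret numbers.
weights = [1.0] * 101  # uniform prior over 0..100

def update(weights, announcement):
    """Multiply each candidate's weight by the likelihood of the announcement:
    'larger' means NOTD >= secret, 'smaller' means NOTD <= secret."""
    out = []
    for i, w in enumerate(weights):
        if announcement == "larger":
            out.append(w * (101 - i) / 101)  # P(NOTD >= i)
        else:
            out.append(w * (i + 1) / 101)    # P(NOTD <= i)
    return out

def best_guess(weights):
    """The posterior mean minimizes expected squared error."""
    total = sum(weights)
    return sum(i * w for i, w in enumerate(weights)) / total

for a in ["larger", "larger", "smaller"]:
    weights = update(weights, a)
print(best_guess(weights))  # somewhere near 40 after 2 "larger", 1 "smaller"
```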
Oh, good point, Noah. Yes, I had assumed that we would hear the NOTD, but rereading I see that is not so. So my first post is irrelevant. That actually should make it easier, because you no longer have to worry about what the actual sequence of numbers tells you. Now the percentage of the NOTDs less than the secret number should converge on the secret number, i.e., this is a Monte Carlo simulation. Intuitively it seems I should wait a few thousand days, because the marginal cost ln(d+1) – ln(d) is so low. The real question is about the variance.
Because of this statement:
—
“For the record, my mistake was caused by forgetting that the random number was between 0 and 100 and solving the problem as if it were between 0 and 1.”
—
I am assuming that the random numbers are real-valued, not integers.
On the first day, Professor Kiefer will have generated 2 random numbers: the secret number and the number-of-the-day (NOTD). On the Nth day, Professor Kiefer will have generated N+1 random numbers: the secret number and N numbers-of-the-day. On the Nth day, Professor Kiefer will have told me that the NOTD was smaller than the secret number N_s times over the N days. Because of this, I will know that the rank of the secret number is N_s+1 out of N+1.
And because all of the numbers have been independent and uniformly distributed between 0 and 100, the conditional distribution for the secret number is a beta distribution. Based on this, here are my thoughts on this problem:
1. The conditional expected value of the secret number given the number of times the NOTD has been smaller is
E[secret | N_s, N] = 100 * (N_s + 1)/(N + 2).
This would be my guess for the secret number on the Nth day.
2. The conditional variance of the secret number given the number of times the NOTD has been smaller is
var[secret | N_s, N] = 100^2 * (N_s+1)*(N-N_s+1)/((N+2)^2 * (N+3))
This variance is my expected squared-error if I guess on the Nth day.
3. The overall expected cost for a guess on the Nth day based on the N_s smaller NOTDs is:
E[cost | N_s, N] = var[secret | N_s, N] + log(N).
This is a number I can compute on any given day. I need to decide, though, if I should wait another day to submit my guess.
4. One way to do decide if I should guess or not is to compute the expected cost for making a guess tomorrow. If this expected cost is larger than the expected cost for a guess today, then I will submit my guess today. Otherwise, I’ll check again tomorrow. (This may not be an optimal way to decide, but it seems reasonable for now.)
5. The expected cost for making a guess tomorrow is:
E[cost | N_s+1, N+1]*Pr[NOTD<secret | N_s, N]
+
E[cost | N_s, N+1]*(1 – Pr[NOTD<secret | N_s, N])
which is equal to
var[secret | N_s+1, N+1]*Pr[NOTD<secret | N_s, N]
+
var[secret | N_s, N+1]*(1 – Pr[NOTD<secret | N_s, N])
+
log(N+1)
where the conditional probability that the next NOTD will be smaller than the secret number is
Pr[NOTD<secret | N_s, N] = E[secret | N_s, N]/100 = (N_s+1)/(N+2)
6. The cost for waiting another day will increase by
log(1 + 1/N),
whereas the expected cost for waiting another day will decrease by
100^2 * (N_s+1)*(N-N_s+1)/((N+2)^2 * (N+3)^2).
(I may have goofed up the algebra in this last derivation.)
7. Based on my algebra, here is the rule I will use to decide whether or not to make a guess on the Nth day:
100^2 * (N_s+1)*(N-N_s+1)/((N+2)^2 * (N+3)^2) < log(1 + 1/N)
If this is true, I will guess. If it is not, I will test again tomorrow after I learn about the new NOTD.
8. As stated, the best guess is nearly always the last day of class. To make this interesting for a typical class, I would change the penalty to 100*log(N), or I would decrease the range for the random numbers to [0, 10].
9. I think my logic is sound. I'd be surprised, though, if I didn't make an algebra mistake somewhere.
10. Thanks again, Professor, for giving us an interesting problem. Even if I'm on the right track, I suspect I'll learn something from your solution.
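Item 7's stopping rule is easy to experiment with. This sketch follows the "balanced" path (N_s ≈ N/2, the slowest-learning case) and finds the first day on which the rule says to guess; consistent with the comments above, it lands in the mid-2000s, far beyond any semester:

```python
import math

def wait_gain(n, n_s):
    """Expected drop in the squared-error penalty from waiting one more day
    (item 6 above)."""
    return 100**2 * (n_s + 1) * (n - n_s + 1) / ((n + 2)**2 * (n + 3)**2)

def should_guess(n, n_s):
    # Item 7: guess when the expected gain from waiting falls below
    # the extra delay cost log(1 + 1/N).
    return wait_gain(n, n_s) < math.log(1 + 1 / n)

# Follow the balanced path N_s = N // 2 and find the first guessing day.
day = next(n for n in range(1, 10000) if should_guess(n, n // 2))
print(day)  # first guessing day on the balanced path, in the mid-2000s
```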
Thomas Bayes: I will wait to delete until you resubmit. But it might not be necessary: At least in my browser, your inequality signs are showing up just fine.
I tried to resubmit twice, but I think I’m being filtered as spam.
Maybe the spam filter knows best?
(I changed the post to remove a ‘greater than’ sign. The problem seemed to happen when I followed a ‘less than’ sign with a ‘greater than’ sign. Everything in between was deleted from the post.)
##### SPOILER WARNING #####
I also assumed the numbers were real numbers.
Here’s the code I used to work it out. You can run these commands in Maxima:
px(N,k) := (1-x/100)^k*(x/100)^(N-k)/100*(N+1)*N!/k!/(N-k)!;
exp1(N,k) := integrate(px(N,k)*(x-G)^2,x,0,100);
bestG(N,k) := rhs(solve(diff(exp1(N,k),G),G)[1]);
bestp1(N,k) := substitute(bestG(N,k),G,exp1(N,k));
pnextIsLess(N,k) := integrate(px(N,k)*x/100,x,0,100);
expNextP1(N,k) := pnextIsLess(N,k)*(bestp1(N+1,k)-bestp1(N+1,k+1)) + bestp1(N+1,k+1);
waitBenefit(N,k) := bestp1(N,k) - expNextP1(N,k);
w(N,k) := float(waitBenefit(N,k) - (log(N+1) - log(N)));
k is the number of ‘less thans’ announced so far, on day N.
px is the pdf of the secret number
exp1 is the expected 1st penalty if I guess G
bestG is the G I should guess.
bestp1 is my expected penalty if I guess the optimal G
pnextIsLess is the probability that tomorrow the lecturer will say ‘less than’
expNextP1 is the expected value of penalty1 if I wait until tomorrow
waitBenefit is the expected decrease in penalty1 if I wait
w is the expected total decrease in penalties 1 and 2.
NB – we were told to minimise our expected penalty. However, in reality, I think students try to optimise their chances of getting a particular grade (e.g., a pass). If my career depends on this course, I’m also quite likely to be optimising my probability of obtaining a particular grade.
Silas Barta – the teacher doesn’t need to include the “greater than or equal to” caveat, as the probability of picking the same real number from a uniform distribution twice is zero. Of course, he’s actually picking from the uniform distribution on a finite set of rationals, but we can presumably ignore that for the purposes of the problem.
@John_Faben: Oops, I guess I missed that it never said they were integers. So you’re right, the “or equal to” assumption can be removed. And my summations replaced by integrals, and my probability/likelihood vectors replaced by probability distribution functions over the [0, 100] interval.
Thomas Bayes:
I believe I disagree with your steps 5 and 6. The reason is that the cost of “waiting another day” is sometimes less than the cost of deciding to guess tomorrow instead of guessing today. This is because, if you wait another day, you are not locked into guessing on that next day – you can guess any time in the future. It is possible that the expected penalty if I guess today is 5, tomorrow is 6, and the next day is 4. In that case, I believe your math would suggest that I guess today, whereas really I should wait. (I actually didn’t do the algebra you did in step 6, but I assume that it is based on what you did in step 5. If this is wrong, then you can disregard this comment.)
In the case where I divide the first penalty by 10, and where there are 100 days in class (excluding day 0), I get that, for example, day 12 is the first day on which you might guess, if the # of lowers is either 0 or 12. On day 99, you should guess rather than wait if the # of lowers is less than 33 or greater than 66.
I should have mentioned that my math is mostly the same as Thomas Bayes’s, except for what I mentioned. I decide whether to guess or wait recursively, where the expected penalty each day is the min of 1) the penalty for guessing on that day and 2) the expected penalty of the next day.
Jonathan Campbell: This is the issue I was referring to when I said that this may not be the optimal way to decide. A better thing to do would be to calculate and compare the expected cost for not making a guess today with the expected cost for making a guess today. As you point out, the expected cost for making a guess tomorrow is not the same as the expected cost for not making a guess today.
I haven’t thought about this problem enough to see how to compute the expected cost for not making a guess today. It appears to involve an exponentially growing chain of costs and probabilities. (Each additional day has two possible events: NOTD can be less than or greater than the secret number.) Perhaps there is a nice trick for addressing this, or, perhaps, there is a better approach altogether. I suspect Professor L will show us the way soon.
Thomas Bayes: I don’t know that there is a closed form way to solve the problem. What you can do, though, is the following:
Define F(n,k) as the expected penalty you will incur, given that it is the nth day, and there have been k “lowers” (k = N_s in your notation), and that you will play the game optimally.
If n = # of days in the semester, then F(n,k) = var[secret | k, n] + log(n) (your formula).
If n < # of days in the semester, then F(n,k) = the min of the following 2 things (1 represents expected penalty if you wait, 2 represents expected penalty if you guess today):
1) (k+1)/(n+2)*F(n+1,k) + (n-k+1)/(n+2)*F(n+1,k+1)
2) var[secret | k, n] + log(n)
Thus you can recursively find F(n,k) for any n and k, starting with n = # of days in the semester and moving backwards.
You should guess for all n,k where F(n,k) = var[secret | k,n] + log(n), i.e. where the expected penalty is lower if you guess today. If this is not true, you wait instead.
Note that your strategy depends not only on n and k on any given day, but also how many days are in the semester.
Using this method led to the results that I noted in the 2nd paragraph of my 1st comment.
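Jonathan Campbell’s recursion is easy to run. Here is a sketch in Python, assuming (as in his example) a 100-day semester; unlike his numerical results, it uses the original, undivided first penalty. F(n, k) is computed by backward induction from the last day:

```python
import math
from functools import lru_cache

DAYS = 100  # assumed semester length, as in the example above

def var_secret(n, k):
    # posterior variance of the secret after n announcements, k of them "lowers"
    # (a scaled Beta posterior on [0, 100]; it is symmetric in k and n-k)
    a, b = k + 1, n - k + 1
    return 10000.0 * a * b / ((a + b) ** 2 * (a + b + 1))

@lru_cache(maxsize=None)
def F(n, k):
    # minimal expected total penalty from day n onward, given k "lowers" so far
    guess_now = var_secret(n, k) + math.log(n)
    if n == DAYS:
        return guess_now
    wait = (k + 1) / (n + 2) * F(n + 1, k) + (n - k + 1) / (n + 2) * F(n + 1, k + 1)
    return min(guess_now, wait)

def should_guess(n, k):
    # guess today iff waiting cannot lower the expected penalty
    return F(n, k) >= var_secret(n, k) + math.log(n)
```

The `lru_cache` memoization turns the exponentially branching chain of futures that Thomas Bayes worried about into a small table of (n, k) values.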
The simple answer is to guess when the cost of getting the answer wrong is less than the cost of waiting another day. The cost of waiting another day is easy to calculate: it grows with the natural log of the day. The cost of guessing is a little harder, but it shrinks steadily as information accumulates. My back-of-the-envelope calculation tells me to guess on day 27.
The most likely number is not the same as the number that will minimise the penalty. If day 1 is (less than), then the most likely number is 100. However, this will not minimise your expected penalty.
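To make that concrete: after a single ‘less than’ announcement, the posterior density on the secret is f(x) = x/5000 on [0, 100] (the likelihood x/100 times a uniform prior, normalized). Its mode is 100, but its mean is 200/3, about 66.7, and it is the mean that minimizes expected squared error. A quick numerical sketch using a midpoint Riemann sum:

```python
# Posterior after one 'less than' announcement: f(x) = x/5000 on [0, 100].
n = 100_000
dx = 100.0 / n
xs = [(i + 0.5) * dx for i in range(n)]  # midpoints of the integration grid

def f(x):
    return x / 5000.0

mean = sum(x * f(x) * dx for x in xs)

def expected_sq_error(g):
    # expected first penalty if you guess g
    return sum((x - g) ** 2 * f(x) * dx for x in xs)

# Guessing the mode (100) costs about 1666.7 in expected squared error;
# guessing the mean (about 66.7) costs only about 555.6.
```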
The answer of course is 42.
I like Dave’s answer though; it reminds me of President Schwarzenegger in The Simpsons Movie:
[quote]im going to guess on day 1. if he says larger, i will guess 25 and if he says smaller, i will guess 75.[/quote]
I was elected to lead, not to read!