RUU #19: Inference on the free binomial model
Uploader Comments (trondreitan)
All Comments (18)
-
OK, I'll read all your replies here and tell you if I could solve my Excel problem :)
-
(cont 4) When taking the continuous limit, this sum of binomial distributions is known as the beta-binomial distribution, by the way. It is the predictive distribution for the number of successes in new trials in a binomial model with uncertain success rate, p. Ideally, this example should show the difference between a specific model prediction (known p) and an overall prediction based on previous data.
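To make that difference concrete, here is a minimal Python sketch (standard library only, not the R code from the video). It assumes a uniform prior on p, so the posterior after the first batch is Beta(k1+1, n1-k1+1), and it uses a hypothetical result of 60 successes in the first 100 trials, since the actual count is not given in this thread. The helper names (log_beta, beta_binom_pmf, binom_pmf, mean_var) are made up for illustration.

```python
from math import comb, exp, lgamma

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binom_pmf(k, n, a, b):
    """Beta-binomial probability of k successes in n trials, p ~ Beta(a, b)."""
    return comb(n, k) * exp(log_beta(k + a, n - k + b) - log_beta(a, b))

def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials with known p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def mean_var(pmf):
    """Mean and variance of a distribution given as a list over 0..n."""
    mean = sum(k * q for k, q in enumerate(pmf))
    var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
    return mean, var

# ASSUMPTION: 60 successes in the first 100 trials (hypothetical numbers).
n1, k1, n2 = 100, 60, 50
a, b = k1 + 1, n1 - k1 + 1  # posterior Beta parameters under a uniform prior

# Specific model prediction: p treated as known and fixed at k1/n1.
known_p = [binom_pmf(k, n2, k1 / n1) for k in range(n2 + 1)]
# Overall prediction: p remains uncertain after the first 100 trials.
overall = [beta_binom_pmf(k, n2, a, b) for k in range(n2 + 1)]

mean_known, var_known = mean_var(known_p)
mean_overall, var_overall = mean_var(overall)
print("known p: mean", mean_known, "variance", var_known)
print("overall: mean", mean_overall, "variance", var_overall)
```

The two means land close together, but the overall prediction has the larger variance, which is the extra spread that comes from not knowing p exactly.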
-
(cont 3) The Pr(x|p) is of course the binomial distribution for a given success probability. Since we are not certain, after 100 trials, what the success probability is, we get a weighted sum of binomial distributions, using what we have learned about the success probability, p, from the previous trials. This sum of binomial distributions is not itself a binomial distribution. It is more spread out than what you get from a maximum likelihood estimate, because we are not certain about the value of p.
-
(cont 2) The code "pred=pred+p.post[i]*dbinom(x,n2,p[i])" gives a probability-weighted sum of the binomial distribution. The principle behind this is rule 6 in clip 2b: Pr(A)=sum Pr(A|B_i) Pr(B_i) where the sum runs over all possible B_i's in a partition of models. In this case the p's take the role of B_i and the outcome x takes the role of A. The probabilities in question are the posterior probability after handling the first 100 trials, which then form the prior for handling the next 50.
-
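For anyone following along without R, the weighted sum can be sketched in plain Python as below. This is an illustration, not the code from the video: the dbinom function here is a scalar stand-in for R's dbinom, the prior is assumed to be uniform over the grid, and the 60 successes in the first 100 trials is a hypothetical number, since the real count is not stated in these comments.

```python
from math import comb

def dbinom(k, n, p):
    """Binomial probability of k successes in n trials (scalar stand-in for R's dbinom)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# ASSUMPTION: hypothetical first batch, 60 successes in 100 trials.
n1, k1 = 100, 60
n2 = 50  # number of new trials to predict

# Discrete grid of candidate success probabilities: 0.00, 0.01, ..., 1.00
p = [i / 100 for i in range(101)]
prior = [1 / len(p)] * len(p)  # ASSUMPTION: uniform prior over the grid

# Bayes' rule on the grid: posterior is proportional to prior times likelihood.
unnorm = [prior[i] * dbinom(k1, n1, p[i]) for i in range(len(p))]
total = sum(unnorm)
p_post = [u / total for u in unnorm]

# Rule 6: Pr(x) = sum over i of Pr(x | p_i) * Pr(p_i | first 100 trials)
x = list(range(n2 + 1))  # the 51 possible outcomes 0..50
pred = [0.0] * len(x)
for i in range(len(p)):
    probs = [dbinom(xj, n2, p[i]) for xj in x]  # plays the role of dbinom(x, n2, p[i])
    pred = [pred[j] + p_post[i] * probs[j] for j in range(len(x))]

print("total predictive probability:", sum(pred))
print("most likely number of successes:", pred.index(max(pred)))
```

Note that pred has 51 entries (one per outcome) even though the loop runs over 101 values of p: each pass adds one weighted binomial distribution to the running total.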
(cont) The possible success probabilities can range continuously from 0 to 1, but I wanted to avoid taking the continuous limit, as that would involve integral calculus and make the mathematical aspect much heavier. So instead I chose a discrete range of possible values for 'p': 0.0, 0.01, 0.02, ... , 1.0. Since 'p' is the 'model' part of Bayes' equation here, while 'x' is the data part, there is no need for these to match in numbers.
thx Dr trond
I have a problem drawing the binomial distribution: after 51 values of x it gives an error and returns no values for the rest of the 101. In Excel the function is called BINOMDIST(number of successes, number of independent trials, probability of success in each trial, FALSE) = dbinom(x,n2,p[i]), so shouldn't the number of x values equal the number of success probabilities?
Can this method be used for categorical prediction, or only for numerical prediction?
Thanks for your playlist, I learned a lot from it.
besbesmany 10 months ago
@besbesmany In the R program, a for-loop is used for going through all possible values for p. For each single possible value for p, dbinom(x,n2,p[i]) is called, where p[i] is the i'th value for p. Only x takes several values here, ranging from 0 to 50. It may be that Excel needs a single value rather than an array, in which case you need a double for-loop: one ranging over the possible outcomes and one ranging over the possible values for p.
trondreitan 10 months ago
(cont) It is the for-loop that then goes through all 101 values for p (a stand-in for the uncountably infinite possible values of p). Inside the for-loop, the routine calculates the outcome probability for all 51 possible outcomes for each possible value for p in one fell swoop using dbinom(x,n2,p[i]), since x is an array here. It returns an array of probabilities, one for each element in x. If Excel doesn't allow vector inputs, you would instead need another for-loop going through x.
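The scalar-only version can be laid out as a table, which is roughly how it would look in a spreadsheet: one row per candidate p (101 rows), one column per outcome (51 columns), each cell filled by a single-value binomial call. The Python sketch below is only an illustration of that layout. The names binomdist and predictive are made up here, and the weights are placeholders rather than the posterior from the video.

```python
from math import comb

def binomdist(k, n, p):
    """Probability of exactly k successes in n trials: single values in, single value out."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def predictive(weights, p_values, n2):
    """Return the (rows = p values, columns = outcomes) table and the weighted column sums."""
    # One table row per candidate p, one column per outcome 0..n2.
    table = [[binomdist(k, n2, p) for k in range(n2 + 1)] for p in p_values]
    # Each prediction is a weighted sum down one column of the table.
    pred = [sum(w * row[k] for w, row in zip(weights, table)) for k in range(n2 + 1)]
    return table, pred

p_values = [i / 100 for i in range(101)]  # 0.00, 0.01, ..., 1.00
n2 = 50

# Sanity check: with all weight on p = 0.5, the result is the plain binomial.
weights = [1.0 if p == 0.5 else 0.0 for p in p_values]
table, pred = predictive(weights, p_values, n2)
print(len(table), "rows,", len(table[0]), "columns,", len(pred), "predictions")

# Placeholder weights spread evenly over all candidate p values.
uniform = [1 / len(p_values)] * len(p_values)
_, pred_uniform = predictive(uniform, p_values, n2)
print("total probability with uniform weights:", sum(pred_uniform))
```

The number of rows and the number of columns never have to match: the 101 weights collapse the rows, leaving one predicted probability for each of the 51 outcomes.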
trondreitan 10 months ago
(cont 2) It may be valuable to try to run the code using R so you see what is intended. If you just type the name of a variable or function, you can see what it contains. R is free, so anyone can use it. It may then be easier to replicate the results in Excel.
trondreitan 10 months ago
Dear Trond, thanks a lot for the video. I have a problem with the prediction for the next 50 battles: pred=pred+p.post[i]*dbinom(x,n2,p[i])
I plotted the case in Excel but I couldn't understand the prediction dbinom(x,n2,p[i]).
x is only 50 numbers but p[i] is 101 numbers, so how can I get this binomial distribution?
I can send you the Excel sheet so you can help me, but tell me how.
Please make more prediction examples, especially for a database with categorical columns. How can I use Bayes' rule and maximum likelihood there?
besbesmany 10 months ago
@besbesmany See also my series on "YT Identity Survey" for some more on testing with categorical data. In that case there are both several "columns" and "rows". Having both columns and rows means that the data have more than one categorical bin they fall into. For instance, one can be both female and Christian. The object is to find out if different rows have different column probabilities or not, i.e. if there is dependency between the columns and rows (between gender and religion).
trondreitan 10 months ago