In an earlier post, I explained how a Bayesian method can replace the standard chi-square technique. Since a Bayesian method is based solely on statistics, having very few data points has a significant effect on the fit quality. To evaluate how reproducible the fit parameters are (for a parabolic function), I performed the following test.

I created 10 data points and assumed given values for {*a*, *b*, *c*}, so that the resulting y-values formed a parabola. I then added normal noise to both the x and y values and let PyMC run the Bayesian fit 100 times, which gives 100 solutions for each parameter. Then I repeated the identical experiment, this time with 100 data points rather than 10. The following plot shows the scatter of the solutions for parameter **b**.

Blue and red circles correspond to solutions for samples with 10 and 100 data points, respectively. The thick horizontal line marks the true value, **b** = 4.5.
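The experiment described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it swaps the PyMC Bayesian fit for an ordinary least-squares fit so that it runs with the Python standard library alone, and the values of *a* and *c*, the noise level, and the x range are made up (only *b* = 4.5 comes from this post). All function names are hypothetical.

```python
import random
import statistics

# Only b = 4.5 is taken from the post; a, c, the noise level and the
# x range are assumptions chosen for illustration.
TRUE_A, TRUE_B, TRUE_C = 1.0, 4.5, 2.0
NOISE_SIGMA = 0.5
X_MIN, X_MAX = -3.0, 3.0


def make_data(n_points, rng):
    """Sample the parabola on an even grid, then add normal noise to x and y."""
    xs, ys = [], []
    for i in range(n_points):
        x_true = X_MIN + (X_MAX - X_MIN) * i / (n_points - 1)
        y_true = TRUE_A * x_true**2 + TRUE_B * x_true + TRUE_C
        xs.append(x_true + rng.gauss(0.0, NOISE_SIGMA))
        ys.append(y_true + rng.gauss(0.0, NOISE_SIGMA))
    return xs, ys


def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, 3):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, 4):
                aug[row][k] -= factor * aug[col][k]
    sol = [0.0, 0.0, 0.0]
    for row in (2, 1, 0):
        rest = sum(aug[row][k] * sol[k] for k in range(row + 1, 3))
        sol[row] = (aug[row][3] - rest) / aug[row][row]
    return sol


def fit_parabola(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c (a stand-in for the Bayesian fit)."""
    s = [sum(x**p for x in xs) for p in range(5)]                # sums of x^p
    t = [sum(y * x**p for x, y in zip(xs, ys)) for p in range(3)]  # sums of y*x^p
    normal_matrix = [[s[4], s[3], s[2]],
                     [s[3], s[2], s[1]],
                     [s[2], s[1], s[0]]]
    return solve3(normal_matrix, [t[2], t[1], t[0]])  # [a, b, c]


def run_experiment(n_points, n_runs=100, seed=0):
    """Repeat the noisy fit n_runs times and collect the fitted b values."""
    rng = random.Random(seed)
    return [fit_parabola(*make_data(n_points, rng))[1] for _ in range(n_runs)]


def summarize(b_values):
    """Return the mean and the sample standard deviation of the solutions."""
    return statistics.mean(b_values), statistics.stdev(b_values)


if __name__ == "__main__":
    for n_points in (10, 100):
        mean_b, std_b = summarize(run_experiment(n_points))
        print(f"{n_points:3d} data points: b = {mean_b:.2f} +/- {std_b:.2f}")
```

The numbers this prints will not match the ones in this post, since both the fitting method and the assumed inputs differ. The point is only the qualitative behaviour: the scatter of the 100 solutions shrinks when each fit sees more data points.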

It is clear that with more input information, we can better constrain the parameters. In this particular example, the solutions with ten data points (blue circles) returned on average *b*=4.39±0.78, while the ones with a hundred data points (red circles) returned *b*=4.41±0.25. The mean barely changes; the gain is in the scatter, which is about three times smaller.
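As a rough consistency check, one can compare the two scatters with the usual $1/\sqrt{N}$ scaling expected for independent noise. That scaling is an assumption here, not something derived in this post, and noise on the x axis may break it in detail:

$$
\frac{\sigma_{10}}{\sigma_{100}} = \frac{0.78}{0.25} \approx 3.1,
\qquad
\sqrt{\frac{100}{10}} \approx 3.16
$$

The two numbers are close, so the reduction in scatter is about what ten times more data points would be expected to give.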
