Monday, June 27, 2011

Deterministic-Continuous models vs. Stochastic-Discrete models

Recently, I've been in several workshops and discussions where people were using continuous numerical models rather than ABM. A very common question from them is along the lines of "why should we use ABM when we have mathematical models?". The argument is a known controversy (see McElreath and Boyd 2007 for a typical critique of ABM from a mathematical standpoint). Generally, ABM is considered weak as it is computationally far more demanding, cannot be solved analytically, and requires the exploration of a phase space which is often too large (and which is sometimes not explored, or not explored properly). ABM users defend themselves by pointing out the importance of stochastic components, and the difficulty of representing complex behaviour (easily, at least) with a set of differential equations. My position sits somewhere in the middle, as I know how tempting it can be to go for ABM because of my own mathematical deficiencies as a modeller. Ideally I try to invest in both types of modelling, and not to be epistemologically confined to one or the other. Having said that, one aspect which gives an advantage to ABM is the capacity to model dynamics with small population sizes.
While trying to verify the ABM of my doctoral project, I found a striking example of this. In order to test whether my ABM was working properly, I created a mathematical equation of the same model, to see if their behaviours were consistent. Of course there were small differences: the mathematical formula was continuous and deterministic, rather than discrete and stochastic. Thus, for instance, if the population reproductive rate is 0.3, the new population size will be:

old population size + old population size * 0.3

This means that if the original population size was 7, then the new population size would be 9.1. In the stochastic-discrete model the difference is that the population size is always an integer, and that reproduction is modelled through a random draw from a binomial distribution, as follows:

old population + Bin(old population, 0.3)

where Bin() stands for a random draw from a binomial distribution with number of trials = old population and success rate 0.3. In this case the expected outcome will be the same (9.1), but the actual result is always an integer, with a chance of smaller or larger numbers (you can visualise this in R with hist(7+rbinom(100,size=7,prob=0.3)); notice that the mean of this frequency distribution will be roughly 9.1).
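For anyone who wants to compare the two updates directly, here is a slightly extended version of that snippet (the set.seed() call and the number of draws are my own additions, chosen just for illustration):

# Deterministic-continuous update: always the same value
7 + 7 * 0.3                               # 9.1

# Stochastic-discrete update: integer draws whose mean is ~9.1
set.seed(123)                             # for reproducibility
draws <- 7 + rbinom(10000, size = 7, prob = 0.3)
mean(draws)                               # close to 9.1
table(draws)                              # integers between 7 and 14
hist(draws, breaks = seq(6.5, 14.5, 1))   # visualise the spread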


The first figure shows an example which compares a single run of the ABM (solid line) with the expectation derived from the numerical equation. Although the exact values are slightly different, the qualitative long-term outcomes of the two cases are identical.

The second figure shows the same model with different parameter settings. The deterministic/numerical simulation expects an equilibrium around 4 individuals (actually 3.824 persons), while the discrete/stochastic model shows a second increase in population size at around timestep 170. The model actually continues this dynamic (not shown here) with further instances of growth and decline.


The third figure shows the very same model, except that this time one of the parameters was set so that the population size was 25 times larger. Although the timing between the deterministic and the stochastic model is slightly different, the overall shape, along with the long-term equilibrium, appears to be essentially the same.
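The actual thesis model is too long to reproduce here, but the same qualitative behaviour can be replayed with a minimal stand-in: a hypothetical density-dependent growth model in R, where the values of r and K and the binomial birth/death scheme are my own assumptions, not the real model:

# Hypothetical stand-in model: per-capita growth declines with
# population size; the deterministic equilibrium is K.
r <- 0.3
K <- 3.824          # small-population setting (cf. the second figure)
# K <- 3.824 * 25   # large-population setting (cf. the third figure)
tmax <- 300

# deterministic-continuous trajectory
det <- numeric(tmax); det[1] <- 2
for (t in 2:tmax) det[t] <- det[t-1] + r * det[t-1] * (1 - det[t-1] / K)

# stochastic-discrete trajectory: births and deaths as binomial draws
set.seed(42)
sto <- numeric(tmax); sto[1] <- 2
for (t in 2:tmax) {
  n <- sto[t-1]
  births <- rbinom(1, n, r)                  # reproduction, rate 0.3
  deaths <- rbinom(1, n, min(1, r * n / K))  # density-dependent mortality
  sto[t] <- max(0, n + births - deaths)
}

plot(det, type = "l", xlab = "timestep", ylab = "population size",
     ylim = range(c(det, sto)))
lines(sto, lty = 2)

With the small K the dashed stochastic line wanders away from (and back towards) the smooth deterministic curve; with the 25x larger K the two trajectories stay much closer, as in the figures above.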

So what is the point of all these figures? If you are dealing with low population sizes, the fact that your model is deterministic and continuous matters, and, depending on the model, matters a lot. This is not just a problem of dealing with continuous numbers (e.g. 1.5 persons), but also because the consequences of small fluctuations are far bigger in smaller populations (the greatest real-world example is probably the effect of genetic drift). If you are dealing with large populations, chance events will be averaged out before their effects determine any qualitative change in the system. If you are dealing with smaller populations, chance events might radically condition the output.

This happens because some basins of attraction are so small that they can be captured only when we deal with non-discrete population sizes. For instance, the second figure had an equilibrium of 3.824. In a discrete model we have either 3 or 4 individuals, and both are likely to fall outside such a basin. If the model is scaled by 25 (as in the third figure), the equilibrium becomes ca. 95.59 individuals, and the discrete model can get much closer to this (95 or 96), allowing it to reach (and stay within) the basin of attraction. If the process we are trying to model occurs at smaller population sizes, a continuous model becomes unrealistic, as decimal values are no longer an acceptable starting assumption. Of course you could tweak the mathematics and reason in discrete terms, but this becomes complicated and requires additional assumptions on how to convert continuous outputs into discrete values. In that case ABM is a robust alternative which can definitely make a contribution despite its known drawbacks.
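As a rough illustration of this scaling argument, using the same stand-in model sketched above, one can estimate how often the stochastic-discrete population is still near the deterministic equilibrium after 200 timesteps (the 25% threshold and the number of runs are arbitrary choices of mine):

stay_near <- function(K, nrun = 500, tmax = 200, r = 0.3) {
  near <- logical(nrun)
  for (i in 1:nrun) {
    n <- round(K)      # start at the (rounded) equilibrium
    for (t in 1:tmax) {
      births <- rbinom(1, n, r)
      deaths <- rbinom(1, n, min(1, r * n / K))
      n <- max(0, n + births - deaths)
    }
    near[i] <- abs(n - K) < 0.25 * K   # within 25% of equilibrium?
  }
  mean(near)           # proportion of runs still near equilibrium
}

set.seed(1)
stay_near(3.824)       # small population: many runs drift away or go extinct
stay_near(3.824 * 25)  # 25x larger: runs tend to stay near equilibrium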