Technical overview of the CSIR’s election results prediction methodology
The SABC and the CSIR have partnered since 1999 to bring South Africans early predictions of election results. Producing those predictions, however, is not a matter of simple mathematics or science.
The CSIR’s election prediction model relies on two core principles of voting behaviour:
- Voters do not randomly allocate their electoral preferences but are influenced by political, socio-economic and demographic factors, as well as past voting history; and
- Changes in voting behaviour between one election and the next are also not random, but are correlated with past voting behaviour, demographic and socio-economic factors.
These two principles combined allow the CSIR team to group voters (or rather voting districts) together based on their past voting behaviour (using a statistical clustering method) and to then expect that any changes to voting behaviour in the new election will be fairly similar within each group.
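The grouping step can be illustrated with a small sketch. The overview does not say which statistical clustering method the CSIR uses, so the example below swaps in a basic k-means purely for illustration; the function name `cluster_districts`, the district ids and the vote shares are all hypothetical.

```python
from math import dist


def cluster_districts(past_shares, k, iterations=20):
    """Group voting districts by past vote shares using a basic k-means.

    past_shares: dict mapping a district id to a tuple of party vote shares
                 from the previous election (illustrative input format).
    Returns a dict mapping each district id to a cluster index in range(k).
    """
    ids = sorted(past_shares)
    # Assumption: a deterministic start, using the first k districts as centres.
    centres = [past_shares[d] for d in ids[:k]]
    assignment = {}
    for _ in range(iterations):
        # Assign every district to its nearest centre.
        assignment = {
            d: min(range(k), key=lambda c: dist(past_shares[d], centres[c]))
            for d in ids
        }
        # Move each centre to the mean of the districts assigned to it.
        for c in range(k):
            members = [past_shares[d] for d in ids if assignment[d] == c]
            if members:
                centres[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment


# Made-up shares for three parties in four districts.
past_shares = {
    "VD1": (0.70, 0.20, 0.10),
    "VD2": (0.68, 0.22, 0.10),
    "VD3": (0.25, 0.65, 0.10),
    "VD4": (0.30, 0.60, 0.10),
}
groups = cluster_districts(past_shares, k=2)
```

With these made-up figures, VD1 and VD2 land in one group and VD3 and VD4 in the other, which is the property the model relies on: districts that voted alike in the past are expected to shift alike in the new election.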
Therefore, when the early results come in, the model can use this sample of results to calculate new voting behaviour for the groups of voting districts and then use these calculations to impute values for the remaining voting districts that have not yet been counted. The known results for the counted voting districts can then be aggregated together with the predicted results for the uncounted voting districts to get a final prediction.
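The imputation and aggregation steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the CSIR's published model: it assumes that the change within a group is captured by a simple ratio of new votes to past votes per party, and that a group with no counted districts simply carries its past result forward. The function name `predict_totals` and all figures are hypothetical.

```python
from collections import defaultdict


def predict_totals(past_votes, clusters, counted):
    """Combine counted results with imputed results for uncounted districts.

    past_votes: district -> {party: votes} at the previous election
    clusters:   district -> group label from the clustering step
    counted:    district -> {party: votes} for districts already reported
    """
    # Per group and party, sum past and new votes over counted districts only.
    old_sum = defaultdict(float)
    new_sum = defaultdict(float)
    for district, new in counted.items():
        group = clusters[district]
        for party, votes in past_votes[district].items():
            old_sum[group, party] += votes
            new_sum[group, party] += new.get(party, 0)

    totals = defaultdict(float)
    for district, past in past_votes.items():
        if district in counted:
            # Known result: use it as reported.
            for party, votes in counted[district].items():
                totals[party] += votes
        else:
            # Uncounted: scale the past result by the change seen in its group.
            group = clusters[district]
            for party, votes in past.items():
                old = old_sum[group, party]
                # Assumption: with no evidence for the group, keep the past result.
                ratio = new_sum[group, party] / old if old else 1.0
                totals[party] += votes * ratio
    return dict(totals)


# Made-up example: two groups, one counted district in each.
past_votes = {
    "VD1": {"A": 700, "B": 300},
    "VD2": {"A": 350, "B": 150},
    "VD3": {"A": 200, "B": 800},
    "VD4": {"A": 100, "B": 400},
}
clusters = {"VD1": 0, "VD2": 0, "VD3": 1, "VD4": 1}
counted = {"VD1": {"A": 630, "B": 330}, "VD3": {"A": 220, "B": 760}}

prediction = predict_totals(past_votes, clusters, counted)
# Approximately {'A': 1275.0, 'B': 1635.0}
```

In this example party A fell to 90% of its past vote in the counted district of group 0, so the uncounted VD2 is imputed at about 315 votes for A rather than its past 350. A party that did not contest the previous election would be ignored in the imputed districts, which a fuller model would need to handle.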
Thus, the model is able to reduce the bias resulting from the “non-randomness” of the incoming results, which arises from the order in which results are received, so as to narrow the gap between what the current scoreboard is showing and what one can expect as a final result.
The CSIR’s election prediction methodology was first published in scientific journals in 2005 and 2006; details of the model can be found in Greben et al. (2005) and Greben et al. (2006).