The Ancestor Paradox Revisited

The previous article dealt primarily with what might be called “the paradox of the missing ancestors”. We have two parents, four grandparents, eight great-grandparents and so on; the number simply doubles with every generation. The trouble is that this doubling procedure soon gives improbable results. Just 22 generations ago we would all have had just over 4 million ancestors. Now that would be around 1300 (allowing 30 years per generation), when the population was also about 4 million. Can we possibly believe that virtually everyone alive at that time was an ancestor of each and every one of us and, furthermore, that most members of earlier generations must head not one, but several, lines of descent to each of us?
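The doubling arithmetic is easy to check. The short sketch below tabulates it; the function names are illustrative only, and the reference year of 1991 is an assumption taken from the earliest copyright date rather than anything stated in the text.

```python
def ancestors(generations_back: int) -> int:
    """Ancestor 'slots' in one generation, assuming no overlap between lines."""
    return 2 ** generations_back


def approx_year(generations_back: int,
                reference_year: int = 1991,      # assumed starting year
                years_per_generation: int = 30   # the article's allowance
                ) -> int:
    """Rough calendar year for a generation that many steps back."""
    return reference_year - generations_back * years_per_generation


for g in (1, 2, 3, 10, 20, 22):
    print(f"{g:2d} generations back (c. {approx_year(g)}): "
          f"{ancestors(g):>9,} ancestors")
```

At 22 generations the count is 2 to the power 22, a little over 4 million, which is the figure used above.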

Eventually and reluctantly I concluded that we must indeed accept this somewhat astounding assertion but only after I had thoroughly tested and rejected the usual explanation of the phenomenon – that occasional marriages between relatives reduce the required number of ancestors to more acceptable levels. They do not! As I stated in the original article, even if every marriage in every generation was between second cousins, a quite unbelievable situation, we would still run out of people to be our ancestors within 29 generations, say 1100.

To be honest I did not really believe it myself, although I could find no flaws in the arguments, and I expected to be inundated with correspondence from demographers and mathematicians among the readership pointing out my errors. My expectation was not fulfilled; I received only one letter and that related to the earlier part of the article concerning the uncertainties inherent in tracing the male line. Did everyone accept that we do indeed descend from the whole population around 1300? Did nobody find this as incredible as I did? Or had I expressed myself so badly that nobody had understood what I was trying to say?

Since the article appeared I have discussed it in detail with many friends and colleagues in an attempt to understand the problem more fully. As a result a few new ideas emerged. The first – mobility, or rather the lack of it before, say, the industrial revolution of the late eighteenth century – seemed to be a major threat to the original conclusion. Basically the idea is that until fairly recent times people did not move around the country to any great extent. We might well find that our immediate ancestors came from quite widely scattered locations, but we will then find that earlier generations almost all originate from the vicinities of those same few locations. If this is true then perhaps all we can say is that we descend from the entire populations of a few distinct areas.

So, to use my own family as an illustration, I have great-great-grandparents who were born in St Petersburg, Norwich, Plymouth and Callington (Cornwall), and twelve who were born “locally” – within 50 Km of Newcastle. My information on previous generations is far from complete but I know of only one born outside these areas and he came from Scotland. So I cannot refute the objection; perhaps it is true that most of my ancestors came from these places.

Looking at early parish registers seems to confirm the view that people did not move very far. Marriage partners were usually “both of this parish” and most of the remainder would involve one partner from a neighbouring parish. Very few involved partners from more remote areas and they were nearly always from the wealthier sections of the community.

Assuming that this was generally true, how would it affect the geographical distribution of our ancestors? Would it really mean that all our forebears were born in a few parts of the country? There is no way to be sure but there is a way to form an impression – mathematical modelling. This is a very useful and widely used technique which can test out the likely consequences of hypotheses affecting the real world when the general rules can, at least, be estimated. Normally the technique is used to predict the future – perhaps global weather pattern changes – but in this case it is used to guess at a possible distribution of ancestors around the year 1300 which could have resulted in the known distribution of certain of their descendants. All such models need two basic ingredients: the initial conditions – or, in this case, the final conditions – and a rule which will enable the next – or rather, the previous – set of conditions to be determined. Here the “initial” conditions are the birthplaces of my great-great-grandparents (excluding the Russian), and the “rule” is a guess at the spread of distances between the birthplaces of one generation and the previous one.

Birth of g-g-Grandparents

Blanchland     Rookhope      Allenheads
Sparty Lea     Wolsingham    Satley
Kenton         Ninebanks     Birtley
Lambton        Lanchester    Windy Nook
Plymouth       Callington    Norwich

Distance between birthplaces of successive generations

Distance              Proportion
 0 Km                      5%
 3 Km (2 miles)           30%
 9 Km (6 miles)           30%
15 Km (9 miles)           20%
30 Km (19 miles)          10%
60 Km (37 miles)           5%

The distance/proportion table is necessarily somewhat arbitrary but it is based loosely on an analysis of a small sample (530) of eighteenth-century baptismal records which fortuitously gave the parents’ birthplaces. The very occasional larger distances – as much as 235 Km in one case – were ignored, so the table is certainly on the conservative side.

Having fixed these parameters we can calculate a possible birthplace for a parent of any particular ancestor by randomly choosing one of the distances from the table (in such a way that there is a 5% chance of it being 0 Km, a 30% chance of it being 3 Km and so on) and a random bearing (direction) between 0° and 359°, then plotting the position at the chosen distance and bearing from the ancestor’s birthplace. If this is done twice for each of the 15 great-great-grandparents we will have 30 possible birthplaces for the members of the previous generation. We can repeat the whole process for any required number of generations; at each stage we start with the birthplaces of one generation and end up with the birthplaces of the previous one.
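A single step of this rule might be sketched as follows. This is not the original programme, only an illustration of the procedure just described; the flat kilometre grid, the function name and the starting coordinates are all assumptions made for the example.

```python
import math
import random

# Distance table from the article: (distance in Km, probability).
DISTANCE_TABLE = [(0, 0.05), (3, 0.30), (9, 0.30),
                  (15, 0.20), (30, 0.10), (60, 0.05)]


def parent_birthplace(x_km, y_km, rng):
    """Return one possible parent birthplace for an ancestor at (x_km, y_km)."""
    distances = [d for d, _ in DISTANCE_TABLE]
    weights = [p for _, p in DISTANCE_TABLE]
    # Weighted random choice of one distance from the table.
    distance = rng.choices(distances, weights=weights, k=1)[0]
    # Whole-degree bearing from 0 to 359, measured clockwise from grid north.
    bearing = math.radians(rng.randrange(360))
    # With that convention, easting uses sin and northing uses cos.
    return (x_km + distance * math.sin(bearing),
            y_km + distance * math.cos(bearing))


rng = random.Random(1)           # seeded so that a run can be repeated
child = (400.0, 550.0)           # hypothetical position, not a real grid reference
parents = [parent_birthplace(*child, rng) for _ in range(2)]
print(parents)
```

Calling the function twice per ancestor, as in the last lines, gives the two parental birthplaces for that ancestor.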

In theory this could be done by hand but it would be a tedious and time-consuming exercise; with a small computer it is easy and quick. Locations can be stored as grid references, random choices can be made without resort to picking from a hat and trigonometric calculations take only a few microseconds. To avoid complications, minor geographic details such as mountains and other barriers to habitation and movement are ignored; only the coastline is considered and, if a chosen distance and bearing happen to give a birthplace in the sea, a further random choice is made. The computer programme was designed to automatically work through 17 generations, that is 21 generations back from me, and in doing so it multiplies the original 15 locations to nearly 2 million. After the calculations a map is printed out showing the coastline and the distribution of ancestors in that earliest generation. A dot on this map represents a 3 Km square containing at least one ancestor.
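The whole procedure, including the "choose again if it lands in the sea" rule and the 3 Km squares, could be sketched along the following lines. Everything geographic here is a stand-in: `on_land` is a hypothetical rectangle rather than a real coastline, the starting points are invented, and only 10 generations are run to keep the example quick (the article's programme ran 17).

```python
import math
import random

DISTANCES_KM = [0, 3, 9, 15, 30, 60]
WEIGHTS = [5, 30, 30, 20, 10, 5]      # percentages from the article's table


def on_land(x, y):
    """Hypothetical coastline test: a plain 500 x 900 Km rectangle of 'land'."""
    return 0 <= x <= 500 and 0 <= y <= 900


def one_parent(x, y, rng):
    """Choose distance and bearing, choosing again while the result is at sea."""
    while True:
        d = rng.choices(DISTANCES_KM, weights=WEIGHTS, k=1)[0]
        b = math.radians(rng.randrange(360))
        px, py = x + d * math.sin(b), y + d * math.cos(b)
        if on_land(px, py):
            return px, py


def step_back(places, rng):
    """Birthplaces of one generation -> birthplaces of the previous one."""
    return [one_parent(x, y, rng) for x, y in places for _ in range(2)]


def occupied_squares(places, size_km=3):
    """Set of 3 Km squares holding at least one ancestor (one map dot each)."""
    return {(int(x // size_km), int(y // size_km)) for x, y in places}


rng = random.Random(0)
places = [(250.0, 450.0)] * 15        # invented starting points, all on land
for _ in range(10):                   # the article's run used 17 generations
    places = step_back(places, rng)

print(len(places), "ancestors in", len(occupied_squares(places)), "squares")
```

Each pass doubles the list, so 17 passes from 15 starting points would give 15 x 2^17 = 1,966,080 locations, the "nearly 2 million" mentioned above.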

[Map: distribution of ancestors in the earliest generation]

What was the outcome? It was quite surprising. The programme was run several times and, although each run differed in detail as we might expect, every single one showed fairly complete coverage of England and Wales and much of Scotland. Judge for yourself; the example reproduced here is typical.

Of course this is only an idealised model and it cannot be taken too seriously. All it shows, and shows quite conclusively, is that the cumulative effect of several quite small movements – perhaps a girl marrying into the next parish or a family moving into a town from the country – is quite sufficient to ensure that our ancestors 21 generations ago could well have been spread over much of the country.

The next two ideas actually reduce the number of generations needed to reach the point where almost everyone would have to be our ancestor. In previous calculations I have compared the number of ancestors at a given time with the total population, but the population comprises members of perhaps three or four generations. What proportion of the population makes up one generation? A generation is really a rather hazy concept but a little thought will show that what we really need is the number of children born in a thirty year period who will survive to marry and have children themselves.

Although this might seem difficult to quantify, all we have to do is to move forward sixteen years and consider that those aged 16 to 45 constitute a generation. This is fairly easy to estimate. Today rather less than 40% of the population are in this age group. In earlier times there was a greater proportion below 16 than there is today – because a quarter or more never reached 16 – but this was more than offset by the much smaller numbers of older people; the average life expectancy for men was only 41 years as recently as 1871. The result was a proportion of about 45% in the range 16 to 45 years, and this figure was probably valid for many centuries. So perhaps we should compare the number of ancestors with 45% of the population rather than all of it. This would bring forward the date when our ancestors comprised most of the population to about 21 generations ago. And I do mean most of the population: if we were descended from this 45% we would also be descended from most of the rest, because they would form the previous and next generations.

No, it’s not as simple as that; there is another complication – some lines will have disappeared. Some people had no children, others had children but no grandchildren and so on. The proportion with no children was reasonably constant until recent years – about 10% never married and a further 8% married but had no children – giving 18% of a generation whose lines died out immediately. How many more will have died out after one, two or more generations? It can be worked out using elementary probability theory but we need to know the proportions of various family sizes because these clearly affect the chances of anyone leaving descendants. I used figures from the period 1870 to 1879 – before family planning became a factor – and assumed that they applied to previous centuries too. It might be of interest to note that more than half of all families at that time were of 5 or more, and 11% were of 11 or more!

The results were surprising. Only 1% would have children but no grandchildren and only 0.1% (1 in 1000) would have grandchildren but no great-grandchildren. From the way these proportions are decreasing it will come as no surprise that the chances that anyone had great-grandchildren but no great-great-grandchildren are infinitesimal. Indeed we can conclude that if anyone had children and grandchildren (81% of the population) it is virtually certain (99.9% chance) that they would have great-grandchildren too and descendants in all later generations. This is no longer true because of much smaller family sizes; in fact more than half of today’s population will have no descendants within three generations.
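One way to carry out a calculation of this kind is to treat descent as a simple branching process, in which each person independently draws a family size from the same distribution. The sketch below does that. The family-size percentages are purely hypothetical, since the 1870 to 1879 figures are not reproduced here; only the 18% with no children is taken from the text. It also assumes every child counted goes on to face the same odds, which may not be how the original calculation was done.

```python
# HYPOTHETICAL family-size distribution, in percent of people.
# Only the 18% for zero children comes from the article; the rest is invented.
FAMILY_SIZE_PERCENT = {0: 18, 1: 5, 2: 7, 3: 8, 4: 9, 5: 9, 6: 9,
                       7: 8, 8: 7, 9: 6, 10: 5, 11: 4, 12: 5}


def all_lines_dead(s):
    """Chance that every child's line has died out, when each child's line
    has independently died out with probability s."""
    return sum(pct / 100 * s ** k for k, pct in FAMILY_SIZE_PERCENT.items())


def extinct_within(generations):
    """q[n] = chance that a person has no descendants in generation n."""
    q = [0.0]
    for _ in range(generations):
        q.append(all_lines_dead(q[-1]))
    return q


q = extinct_within(4)
# dies_at[0]: no children; dies_at[1]: children but no grandchildren; etc.
dies_at = [q[n] - q[n - 1] for n in range(1, len(q))]
for n, p in enumerate(dies_at):
    print(f"line ends after {n} generation(s) of descendants: {p:.4%}")
```

With any distribution dominated by large families the successive figures fall away very quickly, which is the pattern described above.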

So we no longer have the whole population to compare with the number of ancestors, or even 45% of it; all we have is 81% of 45%, or 36% of the population. On this basis the numbers match about 20 generations ago and a large part of this and earlier generations would have to be our ancestors. Quite amazing, isn’t it?
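The three comparisons made in this article can be set side by side with a little arithmetic. The sketch below finds the number of generations at which 2^n equals the available pool of people. It assumes, as a simplification, that the population stays at the rough 4 million quoted for about 1300; the function name is illustrative only.

```python
import math

POPULATION = 4_000_000   # rough figure for about 1300, held constant here


def matching_generation(pool_fraction, population=POPULATION):
    """Generations back at which 2**n equals the available pool of people."""
    return math.log2(pool_fraction * population)


pools = [
    ("whole population", 1.00),
    ("those aged 16 to 45 (45%)", 0.45),
    ("81% of that 45%", 0.81 * 0.45),
]
for label, fraction in pools:
    n = matching_generation(fraction)
    print(f"{label:28s} pool {fraction * POPULATION:>11,.0f}  "
          f"match at about {n:.1f} generations")
```

Rounded to whole generations these come out at 22, 21 and 20, in line with the figures given in the text.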

© Brian Pears 1991, 1998, 2006