
4500 years of human settlement in Southern England across 606 ancient DNA samples

Capitalis

Location
Wales
Ethnic group
Northwest European
Y-DNA haplogroup
U106>DF96>S15663
mtDNA haplogroup
H13a1a
Hello all. I've put together a little amateur study and welcome any observations or criticism. I tried talking my work through with Gemini AI, but after a while the AI starts to feel like a cheerleader and becomes a shade too agreeable for my liking.

I made a few PCA plots using Eurogenes Global 25 data. Hopefully the plots are self-explanatory, but I added some comments to each plot as guidance. All PCA plots display PC 1 vs. PC 2. None of the other PCs reached significance on the scree plot.

First, a map of the study area, namely Southern England (SW England, SE England, London, East Anglia).

Next, a time transect in PCA plot form of samples from Southern England. I perhaps should have made it clearer that the "Viking" label applies to the CE batch, not the BCE batch that occupies the same position.

Next, UMAP plots from Genetic history of Cambridgeshire before and after the Black Death (2024), annotated by myself.

Next, the same time transect as before, but drawing a parallel to the "professional" UMAP plot.

Next, my LivingDNA results, to support my ancestry being correct.

Next, a time transect of Belgium and a comment on Normans.

Next, using Medieval Belgium as a proxy for the new ancestry source in Medieval England, in PCA plot form...

...and in Vahaduo modelling form.

Lastly, a model of my own ancestry in PCA plot form...

...and in Vahaduo modelling form.

If anyone wants the scaled G25 coordinates of the averages, here they are:

Code:
Southern_England_1000–1249_CE_[26]_average,0.1224036154,0.136979346,0.0576125,0.036474154,0.040906962,0.013418846,0.001039423,0.005502731,0.009274346,0.009791654,-0.004284538,0.005896615,-0.012247308,-0.010263538,0.016046269,0.004451846,-0.006975615,0.000843,0.001919269,0.002308769,0.005912577,0.002064038,-0.002185269,0.008508885,-0.000515808
Belgium_1000–1249_CE_[177]_average,0.1242665085,0.1378193729,0.0523238362,0.0314879379,0.0392336384,0.0105048814,0.0028293107,0.0057911695,0.0074830452,0.0090222034,-0.0041587966,0.0054146610,-0.0116005537,-0.0086850056,0.0137154181,0.0033761469,-0.0056337401,0.0015224011,0.0033710960,0.0017211469,0.0034085254,0.0028342034,-0.0002631977,0.0082931751,-0.0005554124
Ireland_1000–1249_CE_[27]_average,0.1298848148,0.1331847778,0.0611774444,0.0503999630,0.0371122593,0.0189129259,0.0020453704,0.0051793333,0.0034616667,0.0022947407,-0.0066698889,0.0059002963,-0.0148219259,-0.0166115185,0.0251534815,0.0087264815,-0.0081997407,0.0020457778,0.0032215556,0.0052941852,0.0044551481,0.0037508519,0.0010224074,0.0098183333,-0.0001109259
Southern_England_750–999_CE_[82]_average,0.1312991098,0.1338641220,0.0666491829,0.0564265244,0.0418651585,0.0207501585,0.0050840854,0.0060898171,0.0020152927,-0.0011156463,-0.0046656463,0.0053220732,-0.0111350732,-0.0103888415,0.0236781829,0.0079521829,-0.0087102683,0.0030636707,0.0044853049,0.0047034756,0.0072600488,0.0042207927,-0.0001412561,0.0149020244,-0.0001708415
Belgium_750–999_CE_[56]_average,0.1247582500,0.1382751964,0.0534434107,0.0303100893,0.0403426429,0.0099803214,0.0020143036,0.0041124464,0.0077609464,0.0094046607,-0.0030738393,0.0059919643,-0.0109398036,-0.0090953929,0.0130533929,0.0051922679,-0.0023236250,0.0015633036,0.0044263750,0.0024565000,0.0023685893,0.0029433393,0.0006381964,0.0092847143,-0.0012958214
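
For anyone who wants to compare these averages outside Vahaduo, here is a minimal Python sketch. It parses two of the rows above (copied verbatim) and computes the plain Euclidean distance between them across all 25 scaled G25 dimensions. The function names `parse_g25` and `euclidean` are my own, not part of any G25 tool, and I am assuming that an unweighted Euclidean distance is the comparison you want.

```python
import math

# Two of the rows above, copied verbatim: a label followed by 25 scaled G25 values.
ROWS = """\
Southern_England_1000–1249_CE_[26]_average,0.1224036154,0.136979346,0.0576125,0.036474154,0.040906962,0.013418846,0.001039423,0.005502731,0.009274346,0.009791654,-0.004284538,0.005896615,-0.012247308,-0.010263538,0.016046269,0.004451846,-0.006975615,0.000843,0.001919269,0.002308769,0.005912577,0.002064038,-0.002185269,0.008508885,-0.000515808
Belgium_1000–1249_CE_[177]_average,0.1242665085,0.1378193729,0.0523238362,0.0314879379,0.0392336384,0.0105048814,0.0028293107,0.0057911695,0.0074830452,0.0090222034,-0.0041587966,0.0054146610,-0.0116005537,-0.0086850056,0.0137154181,0.0033761469,-0.0056337401,0.0015224011,0.0033710960,0.0017211469,0.0034085254,0.0028342034,-0.0002631977,0.0082931751,-0.0005554124
"""


def parse_g25(text):
    """Return {label: [float, ...]} from comma-separated G25 rows."""
    coords = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        label, *values = line.split(",")
        coords[label] = [float(v) for v in values]
    return coords


def euclidean(a, b):
    """Unweighted Euclidean distance between two coordinate lists."""
    if len(a) != len(b):
        raise ValueError("coordinate lists must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


coords = parse_g25(ROWS)
england = coords["Southern_England_1000–1249_CE_[26]_average"]
belgium = coords["Belgium_1000–1249_CE_[177]_average"]
print(f"{euclidean(england, belgium):.8f}")
```

Pasting the remaining rows into `ROWS` gives every pairwise distance the same way.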

Thanks for reading. :)
 
Thank you to the moderators for making my post available. I can also add a supplementary section, containing the scree plot that shows PCs 3 and above are below the broken stick, hence being theoretically "uninterpretable".


However, I think it is of interest that my model is still visibly acceptable on PCs 3 and 4.
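
For readers unfamiliar with the broken-stick criterion, here is a minimal Python sketch of how it is usually applied to a scree plot. Under the broken-stick model, the expected share of variance for component k out of p is (1/p) times the sum of 1/i for i from k to p, and a component counts as interpretable only while its observed share exceeds that expectation. The variance shares in the example are hypothetical values made up for illustration, not the ones from my scree plot, and the function names are my own.

```python
def broken_stick(p):
    """Expected variance share of each of p components under the broken-stick model."""
    return [sum(1.0 / i for i in range(k, p + 1)) / p for k in range(1, p + 1)]


def retained_components(explained):
    """Count the leading components whose observed share beats the expectation."""
    expected = broken_stick(len(explained))
    kept = 0
    for observed, threshold in zip(explained, expected):
        if observed <= threshold:
            break  # stop at the first component that falls below the stick
        kept += 1
    return kept


# Hypothetical variance shares (sum to 1); NOT taken from the actual scree plot.
example = [0.50, 0.30, 0.08, 0.06, 0.04, 0.02]
print(broken_stick(len(example)))
print(retained_components(example))
```

With these made-up shares only the first two components clear the threshold, which is the same pattern as PCs 3 and above falling below the broken stick.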



 
 
Re: using Medieval Belgium as a proxy for the new ancestry source in Medieval England, the relevant samples as individuals in PCA plot form.

And certainly of less general interest, thanks to the recent Akbari et al. 2026 data dump, we now have three more ancient DNA samples descended from R-U106>S15663, in addition to the already published Napoleonic soldier who died in Portugal. My surname is nominally Norman, originating in Wiltshire.
 
It may be of interest to read my back and forth with Google Gemini AI (link to PDF file below), discussing the PCA plots and Vahaduo models. Sometimes the AI uses terms such as Continental Northern European (CNE) when I believe it means CWE, but overall it's a decent sparring partner and can follow an argument and contribute supporting data very well.

Google Gemini AI analysis of Southern England genetics
 


Target: Flann_Fina
Distance: 1.6981% / 0.01698102

52.6 Southern_England_1000–1249_CE_[26]_average
28.0 Southern_England_750–999_CE_[82]_average
19.4 Belgium_750–999_CE_[56]_average


Distance to: Flann_Fina

0.01874570 Southern_England_1000–1249_CE_[26]_average
0.02253248 Belgium_1000–1249_CE_[177]_average
0.02304809 Belgium_750–999_CE_[56]_average
0.02560514 Ireland_1000–1249_CE_[27]_average
0.02887837 Southern_England_750–999_CE_[82]_average
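
To make the two outputs above easier to read, here is a minimal Python sketch of how a fit distance relates to the mixture percentages. As far as I understand it, the model distance is the Euclidean distance between the target and the weighted blend of the source averages, and the percentage figure is that distance multiplied by 100. This is not Vahaduo's own code: the function names are my own, the step that searches for the best weights is not shown, and the two-dimensional coordinates are toy values invented for illustration.

```python
import math


def blend(sources, weights):
    """Weighted average of source coordinates; weights are normalised to sum to 1."""
    total = sum(weights.values())
    dims = len(next(iter(sources.values())))
    return [
        sum(weights[name] * sources[name][d] for name in weights) / total
        for d in range(dims)
    ]


def fit_distance(target, sources, weights):
    """Euclidean distance between the target and the blended sources."""
    mix = blend(sources, weights)
    return math.sqrt(sum((t - m) ** 2 for t, m in zip(target, mix)))


# Toy two-dimensional coordinates (hypothetical, NOT real G25 data).
toy_sources = {"Source_A": [0.0, 0.0], "Source_B": [1.0, 0.0]}
toy_weights = {"Source_A": 50.0, "Source_B": 50.0}  # percentages, as in the output above
toy_target = [0.5, 0.1]

d = fit_distance(toy_target, toy_sources, toy_weights)
print(f"Distance: {d * 100:.4f}% / {d:.8f}")
```

The single-source list under "Distance to:" is the same calculation with one source weighted at 100%.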
 
Also, a hierarchical clustering of the entire PCA by group. Note that modern Normandy clusters with Belgium (Flanders) in the 750–1499 CE period.

I will be producing a hierarchical clustering of all north / west European samples from the medieval period once I've removed the low coverage samples, and will provide G25 scaled coordinates for the population averages.
 
Best I could do:

This is modelling using the group averages, rather than individuals. Ireland_1000–1249 CE produced a zero value in Vahaduo even before I added Orkney, and looking at PC3 we can see why. Orkney_750–1249 CE is seemingly a better proxy for medieval Scotland. From memory, these are individuals found in a Viking context but without Scandinavian ancestry.
 
Thank you, that’s very interesting! It has been quite a while, but if I remember correctly, I used to use some of the Orkney samples as proxies because I was fairly close to the ones that were primarily Scottish back then. Judging by the addition of the Sicilian average, I’m glad to see you still have my data and have figured out who I am. ;-)
 
This is the PCA plot that the AI requested, comparing Cambridgeshire in 1000–1249 CE and 1250–1499 CE, possibly showing a shift towards the Netherlands and/or Northwest Germany.
https://postimg.cc/JskrHbww
 
This comes from A genetic perspective on the recent demographic history of Ireland and Britain (2025).

We chose to split ROH and IBD segments into the corresponding bins (a) 1 to 3cM [1,3cM), (b) 3 to 5cM [3,5cM), and (c) greater than or equal to 5cM (≥5cM). Using the methodology above, these length bins correspond to 100 generations ago, 40 generations ago and 15 generations ago respectively. An important caveat is that the estimates have very wide distributions, as well as the aforementioned assumption of population size.

If we take a generation as 28 years, then we have:

[1,3cM) ≈ 100 generations ago ≈ 776 BCE
[3,5cM) ≈ 40 generations ago ≈ 904 CE
(≥5cM) ≈ 15 generations ago ≈ 1604 CE
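
The arithmetic behind those three dates can be sketched in a few lines of Python. The reference year of 2024 is an assumption on my part (it is the value that reproduces the dates listed), the 28-year generation is the figure stated above, and the function names are my own. The conversion is simple subtraction and ignores the absence of a year zero, which is well within the uncertainty of the estimates.

```python
REFERENCE_YEAR = 2024        # assumption: the "present" implied by the dates above
YEARS_PER_GENERATION = 28    # generation length used in the post


def approx_year(generations_ago,
                reference_year=REFERENCE_YEAR,
                years_per_generation=YEARS_PER_GENERATION):
    """Signed calendar year: positive means CE, negative means BCE."""
    return reference_year - generations_ago * years_per_generation


def label(year):
    """Format a signed year as 'N CE' or 'N BCE' (simple convention, no year-zero fix)."""
    return f"{year} CE" if year > 0 else f"{-year} BCE"


for bin_name, generations in [("[1,3cM)", 100), ("[3,5cM)", 40), ("(≥5cM)", 15)]:
    print(f"{bin_name} ≈ {generations} generations ago ≈ {label(approx_year(generations))}")
```

Changing `YEARS_PER_GENERATION` to 25 or 30 shows how sensitive the oldest bin in particular is to the generation length.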

As an alternative to the PCA plots of IBD sharing which include all of the Isles, I've created PCA plots isolating the IBD sharing at 3-5 cM (i.e. ~904 CE) for:

The Isles and France
https://postimg.cc/DmfvzzkR

The Isles and Belgium
https://postimg.cc/KkL2YQn5

The Isles and Germany / Poland
https://postimg.cc/jCrzNg7f

All data is from the Supplementary Tables.
 
I wrote another amateur research paper in collaboration with Google Gemini AI. The paper can be downloaded in PDF format from the link below.

The hidden genetic connections between the British and Irish Isles and Norman Europe in the High Medieval period
I see that my Early Modern Europe research paper was removed. I fully accept the site owners' right to do so, but a private message to explain what the issue was would be appreciated, so I know what not to share in the future. :)
 
Sorry, I thought you had posted the same message twice.
 