I wanted to know how much of my ancestry can be modelled with Bronze / Iron Age populations. Below is the best model I made, with a p-value of 0.173, low SEs and all z > 2 (every component is statistically supported).
Before this, I tried to reconstruct the two Indo-European waves into Iberia separately: the Bell Beaker in the Bronze Age and the Celts in the Iron Age. I tried models with Portugal_BA and several Central European populations such as Halberstadt (Urnfield), Knoviz, Hallstatt or La Tene, but these sources sit on a shared cline, which produces instability, high SEs and poor p-values.
When the sources sit on a shared cline, they are collinear: qpAdm struggles because the left (source) populations are too similar to each other for the right (reference) populations to tell them apart. That doesn’t mean the model is invalid; it means that the proportions become unstable, the SEs inflate and the interpretation becomes muddy.
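To illustrate the collinearity point, here is a toy sketch. It is plain least squares on simulated data, not qpAdm itself, and the function name, the similarity parameter and all numbers are made up for the example. It only shows the general effect: when two source columns are nearly the same vector, the standard error of each weight grows sharply.

```python
import numpy as np

def weight_se(source_similarity, n_stats=200, noise=0.01, seed=0):
    """Standard error of the first weight in a two-source least-squares fit.

    source_similarity = 0 gives two unrelated source columns;
    values near 1 make the second column almost a copy of the first,
    which mimics two sources sitting on the same cline.
    Hypothetical helper, for illustration only.
    """
    rng = np.random.default_rng(seed)
    a = rng.normal(size=n_stats)
    independent = rng.normal(size=n_stats)
    # Second source: a blend of the first source and an independent vector
    b = source_similarity * a + (1 - source_similarity) * independent
    X = np.column_stack([a, b])
    # Least-squares coefficient covariance: noise^2 * (X^T X)^-1
    cov = noise**2 * np.linalg.inv(X.T @ X)
    return float(np.sqrt(cov[0, 0]))

se_distinct = weight_se(0.0)   # well-separated sources
se_cline = weight_se(0.99)     # nearly collinear sources
print(f"SE with distinct sources:  {se_distinct:.5f}")
print(f"SE with collinear sources: {se_cline:.5f}")
```

The fit does not become "wrong" in the second case; the weights are simply much less well determined, which is the same pattern as the inflated SEs described above.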
By the Iron Age, Iberia already had a stabilized Bell Beaker-derived gene pool with continental interaction embedded in it. Here I use an already admixed historical Iberian population, which is statistically cleaner and historically more realistic. This model avoids the earlier collinearity problem because Spain_IA_Celt already includes the Iberian Bronze Age substrate with some steppe ancestry (via Bell Beaker) plus Iron Age continental input. I’m no longer trying to separate Bronze Age gene flow from Iron Age gene flow; I collapsed them into a single historical population, which removes artificial dimensionality.
This model says that my ancestry is statistically consistent with being primarily derived from the genetic profile of Celtiberian-like Iron Age Iberia, with a modest eastern Mediterranean–related component and a small North African component.
It does not mean that I’m 80% Celtiberian; it means that my ancestry fits within that Iron Age Iberian genetic structure, i.e. that it is statistically compatible with that genetic profile. The Italy_Imperial_oAnatoliaCaucasus component is not equivalent to “Roman ancestry”; it is a proxy for Eastern Mediterranean / Anatolian-shifted drift. This could reflect Punic-era or Roman/post-Roman-era gene flow. The model is statistically stable, historically coherent and not overparameterized, and it avoids collinearity.
Spain_IA_Celt represents three individuals from La Hoya in northern Spain.
The model also works well with some Spain_IA samples that occupy the same position on the PCA as Spain_IA_Celt.