Btw have you tried using the HO set? I've noticed some studies use the HO set for PCA, as the ancient samples have been filtered on the Human Origins dataset. Also, the lower SNP count is less of an issue with PCA, as the point of it is to reduce complex data so it can be visualised in 2 dimensions.
Okay. Probably don't worry about the filtering and LD-pruning. I think it's more pertinent to admixture breakdowns than PCA. Just make sure you only use modern populations to set the vectors. Then you can project the ancient (and modern) samples on top of those.
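To illustrate the idea of setting the vectors from modern populations only and then projecting everything onto them, here is a minimal NumPy sketch. It is not how smartpca works internally: the function names (`fit_pca_axes`, `project`) and the random genotype data are made up for illustration, and missing calls are handled by simple zero-filling after centring, which is an assumption of this sketch.

```python
import numpy as np

def fit_pca_axes(modern, n_components=2):
    """Compute principal axes from modern samples only (rows = samples, columns = SNPs)."""
    snp_means = modern.mean(axis=0)
    centred = modern - snp_means
    # Rows of vt are the principal axes in SNP space, ordered by variance explained.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return snp_means, vt[:n_components]

def project(samples, snp_means, axes):
    """Project samples onto existing axes; missing genotypes (NaN) contribute nothing."""
    centred = samples - snp_means
    centred = np.where(np.isnan(centred), 0.0, centred)
    return centred @ axes.T

# Hypothetical toy data: genotypes coded 0/1/2, NaN for a missing call.
rng = np.random.default_rng(0)
modern = rng.integers(0, 3, size=(20, 50)).astype(float)
ancient = rng.integers(0, 3, size=(3, 50)).astype(float)
ancient[0, :10] = np.nan  # simulate missingness in one ancient sample

snp_means, axes = fit_pca_axes(modern)           # vectors set by moderns only
modern_coords = project(modern, snp_means, axes)
ancient_coords = project(ancient, snp_means, axes)  # ancients projected on top
```

The ancient samples never influence `axes`, so their missingness cannot distort the vectors; it only pulls their own projected positions towards the origin.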
This article here explains some of...
Probably because most of the ancient samples have large amounts of missingness, which has a wildcard effect: variants get matched when they shouldn't be, which reduces accuracy. You will probably need to filter out some of the missingness and maybe prune for Linkage Disequilibrium. What format are your...
Well you're ahead of me in the practical use of qpAdm. As for myself I've actually only got as far as installing AdmixTools2 and loading the graphical interface, lol!
You can change the .fam or .ind file anytime before or after merging. Merging doesn't set the case/control selection into concrete so you don't need to remerge again if you change your mind.
When you use convertf to convert your Plink files to PACKEDANCESTRYMAP (PAM), it adds your family ID to the front of your individual ID...
I don't think any of the tasks that you would do in Admixtools would be affected. If so it would be mentioned in the manual.
However, because I often use ADMIXTURE, I do all the QC work in Plink. So if I filter for HWE (--hwe), the test will by default ignore case individuals.
0 0 2 2 as I suggest putting yourself and family as ‘case’.
But it doesn’t affect most analyses. -9 is undefined, 1 is control, 2 is case. When using convertf you must change all the -9 to either 1 or 2.
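If there are too many -9 entries to change by hand, a short script can do it. This is only a sketch: it assumes the standard six-column .fam layout (family ID, individual ID, father, mother, sex, phenotype), and the function name `recode_phenotypes` and the sample IDs are made up for illustration.

```python
def recode_phenotypes(fam_lines, case_ids=()):
    """Return .fam lines with every -9 phenotype replaced by 2 (case) or 1 (control)."""
    case_ids = set(case_ids)
    out = []
    for line in fam_lines:
        fields = line.split()
        if len(fields) != 6:
            raise ValueError(f"expected 6 columns, got {len(fields)}: {line!r}")
        fid, iid, father, mother, sex, pheno = fields
        # Only undefined phenotypes are touched; existing 1/2 values are kept.
        if pheno == "-9":
            pheno = "2" if iid in case_ids else "1"
        out.append(" ".join([fid, iid, father, mother, sex, pheno]))
    return out

# Hypothetical example lines, not taken from any real dataset.
example = [
    "FAM1 me 0 0 2 -9",
    "AADR I0001 0 0 1 -9",
    "AADR I0002 0 0 2 1",
]
recoded = recode_phenotypes(example, case_ids={"me"})
```

To use it on a real file, read the lines with `open(path).read().splitlines()` and write the result back out, keeping a backup of the original .fam.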
You can merge them all into the same v62_HO set. For the females you can change the 3rd number (sex) manually to 2 (female).
If you want you can set yourself and your other family members to 'case' instead of 'control'. The equivalent in Plink is the 4th number: 2 for case, 1 for control...
No, only in the convertf par file. Then you have to merge your individual file with the AADR subset file, not with the full AADR.
Although the ancient samples have the same labels in both datasets, the ones in the HO set have been merged with the Human Origins array which has resulted in only...
Great to hear. Yeah, the HO set has more samples but fewer SNPs. If you ever want to do an analysis with the higher quality samples from the 1240k set then you can use the subset option in convertf.
The .anno file is just information on the samples. It's now in .xlsx format.
Here's an example of a convertf par file for subsetting:
genotypename: v62.0_1240k_public.geno
snpname: v62.0_1240k_public.snp
indivname: v62.0_1240k_public.ind
outputformat: PACKEDANCESTRYMAP...
I googled the “Killed” message in Linux and yeah, it seems it’s likely a case of running out of memory. You might be able to tweak some settings on your computer to utilise the memory more efficiently.
A couple of options. Either try merging with the v62.0_HO set which is a smaller file; or use convertf...
I only added the outputformat parameter to the main 9; however, I have always had it located between the 2nd fileset and the merged fileset, in other words on the 7th line. Also, I didn't have any empty lines, so it was just 10 lines with no blank lines in between.
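For reference, that 10-line layout can be sketched as below, assuming the file in question is a mergeit par file. The helper name `mergeit_par` and the fileset names (`aadr_subset`, `my_sample`, `merged`) are placeholders for illustration, so swap in your own.

```python
def mergeit_par(first, second, merged, outputformat="PACKEDANCESTRYMAP"):
    """Build the text of a 10-line merge par file with no blank lines."""
    lines = [
        f"geno1: {first}.geno",
        f"snp1: {first}.snp",
        f"ind1: {first}.ind",
        f"geno2: {second}.geno",
        f"snp2: {second}.snp",
        f"ind2: {second}.ind",
        # 7th line: between the 2nd fileset and the merged fileset.
        f"outputformat: {outputformat}",
        f"genooutfilename: {merged}.geno",
        f"snpoutfilename: {merged}.snp",
        f"indoutfilename: {merged}.ind",
    ]
    return "\n".join(lines) + "\n"

# Placeholder fileset names; replace with your own file prefixes.
par_text = mergeit_par("aadr_subset", "my_sample", "merged")
```

Writing `par_text` to a file gives a par file with exactly ten lines and nothing in between.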
You could try that and if it...