Repetitive Sequences: High-identity repetitive elements were assembled de novo from k-mers (k = 31) that were abundant in sperm and blood reads, with k-mer counting performed using Jellyfish version 2.2.4 and assembly using Velvet version 1.2.10. Copy number thresholds for abundant k-mers were set at three times the modal copy number of 31-mers in each dataset: 165 for sperm and 180 for blood. Abundant k-mers from sperm and blood were combined and supplied to Velvet as single-end reads, with Velvet run using 29-mers. These analyses resulted in a de novo repeat library of 130,632 sequences (total length ~11 Mb, with individual contig lengths ranging from 57 bases to 15.5 kb). These repeats were annotated using RepeatMasker version open-4.0.5 with repeat libraries generated for the germline assembly and obtained from Repbase (repeatmaskerlibraries-20140131: “vertebrate repeats”). For downstream analyses we used a set of model repeats representing the union of the de novo repeats, repeats identified within assembled genomic sequences via RepeatModeler, and an updated assembly of the previously identified Germ1 element. Enrichment analyses were performed by separately aligning paired-end reads from blood and sperm DNA to the repeat dataset. As with single-copy sequence, alignments were pre-filtered to exclude unmapped reads and supplementary alignments. The remaining data were processed to generate average coverage ratios for intervals of approximately 100 bp.
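The thresholding step described above can be illustrated with a minimal sketch. It assumes the k-mer count histogram is available as two-column text (copy number, number of distinct k-mers), the layout written by tools such as `jellyfish histo`. The function names, the `min_copy` cutoff used to skip the low-copy error peak, the use of `>=` at the threshold, and the merging of the two tissues as a set union are illustrative assumptions and are not specified in the methods.

```python
from typing import Dict, Iterable, Iterator, Set, Tuple


def parse_histogram(lines: Iterable[str]) -> Dict[int, int]:
    """Parse 'copy_number n_distinct_kmers' lines into a dict."""
    hist: Dict[int, int] = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        hist[int(fields[0])] = int(fields[1])
    return hist


def modal_copy_number(hist: Dict[int, int], min_copy: int = 5) -> int:
    """Copy number shared by the most distinct k-mers.

    Bins below min_copy are ignored (assumption) so that the peak of
    low-copy, error-derived k-mers is not mistaken for the mode.
    """
    candidates = {c: n for c, n in hist.items() if c >= min_copy}
    if not candidates:
        raise ValueError("no histogram bins at or above min_copy")
    return max(candidates, key=lambda c: candidates[c])


def abundance_threshold(hist: Dict[int, int], multiplier: int = 3,
                        min_copy: int = 5) -> int:
    """Threshold for 'abundant' k-mers: multiplier x modal copy number."""
    return multiplier * modal_copy_number(hist, min_copy)


def abundant_kmers(counts: Iterable[Tuple[str, int]],
                   threshold: int) -> Iterator[str]:
    """Yield k-mers whose count meets the threshold (>= is an assumption)."""
    for kmer, count in counts:
        if count >= threshold:
            yield kmer


def combine(sperm: Iterable[str], blood: Iterable[str]) -> Set[str]:
    """Merge abundant k-mers from both tissues (set union is an assumption)."""
    return set(sperm) | set(blood)


if __name__ == "__main__":
    # Toy histogram with its main peak at copy number 55.
    toy = parse_histogram(["1 1000000", "2 50000", "54 900",
                           "55 1200", "56 1100", "300 10"])
    print(abundance_threshold(toy))  # 3 x 55 = 165
```

Under these assumptions, the reported thresholds of 165 and 180 correspond to modal copy numbers of 55 for sperm and 60 for blood. The combined k-mers would then be written out as sequence records for assembly.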


