Petromyzon marinus repeats

Overview
Analysis Name: Petromyzon marinus repeats
Method: RepeatMasker (open-4.0.5)
Source: RepARK_B_S_auto_collapsed.fasta.gz
Date performed: 2017-12-13

Repetitive Sequences: High-identity repetitive elements were assembled de novo from k-mers (k = 31) that were abundant in sperm and blood reads. K-mers were counted with Jellyfish version 2.2.4 and assembled with Velvet version 1.2.10. Copy-number thresholds for abundant 31-mers were set at 3X the modal copy number: 165 for sperm and 180 for blood. Abundant k-mers from sperm and blood were combined and supplied to Velvet as single-end reads, with Velvet run using 29-mers. These analyses yielded a de novo repeat library of 130,632 sequences (total length ~11 Mb, with individual contig lengths ranging from 57 bases to 15.5 kb). These repeats were annotated using RepeatMasker version open-4.0.5 and repeat libraries generated for the germline assembly and from Repbase (repeatmaskerlibraries-20140131: “vertebrate repeats”). For downstream analyses we used a set of model repeats representing the union of the de novo repeats, repeats identified within assembled genomic sequences via RepeatModeler, and an updated assembly of the previously identified Germ1 element. Enrichment analyses were performed by separately aligning paired-end reads from blood and sperm DNA to the repeat dataset. As with single-copy sequence, alignments were pre-filtered to exclude unmapped reads and supplementary alignments. The remaining data were processed to generate average coverage ratios for intervals of approximately 100 bp.
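The two numeric steps in the description above (the 3X modal copy-number threshold and the per-interval coverage ratios) can be illustrated with a minimal Python sketch. This is not the pipeline that was run for this analysis. The function names, the `min_count` error filter, the inclusive (`>=`) threshold comparison, and the `pseudocount` used to avoid division by zero are all assumptions made for illustration, and no sequencing-depth normalization is applied. The stated thresholds of 165 and 180 correspond to modal copy numbers of 55 and 60.

```python
from collections import Counter


def modal_copy_number(kmer_counts, min_count=2):
    """Return the most frequent copy number among k-mers.

    kmer_counts maps k-mer -> observed count (e.g. parsed from a k-mer
    counter's dump). Counts below min_count are ignored as likely
    sequencing errors (an assumption of this sketch).
    """
    hist = Counter(c for c in kmer_counts.values() if c >= min_count)
    # Ties are broken in favour of the smaller copy number.
    return max(hist.items(), key=lambda kv: (kv[1], -kv[0]))[0]


def abundant_kmers(kmer_counts, threshold):
    """Return the k-mers whose count meets the threshold (inclusive, by assumption)."""
    return {kmer for kmer, count in kmer_counts.items() if count >= threshold}


def binned_coverage_ratio(cov_a, cov_b, bin_size=100, pseudocount=1.0):
    """Average per-base coverage in fixed-size bins and return the a/b ratio per bin.

    cov_a and cov_b are per-base depth lists for the same repeat sequence
    (e.g. sperm and blood). The pseudocount is a hypothetical safeguard
    against empty bins, not something described in the source.
    """
    n = min(len(cov_a), len(cov_b))
    ratios = []
    for start in range(0, n, bin_size):
        a = cov_a[start:min(start + bin_size, n)]
        b = cov_b[start:min(start + bin_size, n)]
        mean_a = sum(a) / len(a)
        mean_b = sum(b) / len(b)
        ratios.append((mean_a + pseudocount) / (mean_b + pseudocount))
    return ratios


if __name__ == "__main__":
    # Toy counts: the modal copy number is 55, so the 3X threshold is 165.
    counts = {"AAA": 55, "AAC": 55, "AAG": 60, "AAT": 200}
    threshold = 3 * modal_copy_number(counts)
    print(threshold, abundant_kmers(counts, threshold))
```

In practice the per-base depths would come from the filtered alignments, and the ratio would usually be interpreted relative to the genome-wide ratio for single-copy sequence.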

Downloads

Repeat FASTA for downstream analyses

Repeat FASTA used with MAKER to generate Alignment GFFs