Petromyzon marinus

Full Name: Petromyzon marinus
Common Name: Sea Lamprey

The sea lamprey is a member of an ancient lineage that diverged from the vertebrate stem approximately 550 million years ago (MYA). By virtue of this deep evolutionary perspective, lamprey has served as a critical model for understanding the evolution of several conserved and derived features that are relevant to broad fields of biology and biomedicine. Studies have used lampreys to provide perspective on the evolution of developmental pathways that define vertebrate embryogenesis, vertebrate nervous and neuroendocrine systems, genome structure, immunity, clotting and others. These studies reveal aspects of vertebrate biology that have been conserved over deep evolutionary time and reveal evolutionary modifications that gave rise to novel features that emerged within the jawed vertebrate lineage (gnathostomes). Lampreys also possess several features that are not observed in gnathostomes, which could represent either aspects of ancestral vertebrate biology that have not been conserved in the gnathostomes or features that arose since the divergence of the ancestral lineages that gave rise to lampreys and gnathostomes. These include the ability to achieve full functional recovery after complete spinal cord transection, deployment of evolutionarily independent yet functionally equivalent adaptive immune receptors, and the physical restructuring of the genome during development known as programmed genome rearrangement (PGR).

Programmed genome rearrangement results in the physical elimination of ~0.5 Gb of DNA from the sea lamprey's ~2.3 Gb genome. The elimination events that mediate PGR are initiated at the 7th embryonic cell division and are essentially complete by 3 days post fertilization. As a result, lampreys are effectively chimeric, with germ cells possessing a full complement of genes and all other cell types possessing a smaller, reproducible fraction of the germline genome. Previous analyses support the idea that the somatic genome lacks several genes that contribute to the development and maintenance of germ cells but are potentially deleterious if misexpressed in somatic lineages. However, our understanding of the mechanisms and consequences of PGR remains incomplete, as only a fraction of the germline genome has been sequenced to date.
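As a rough illustration of the scale involved, the figures quoted above imply that about a fifth of the germline genome is removed from somatic lineages. The short Python sketch below only does that arithmetic. The sizes are the rounded estimates given in the text, and the function name is illustrative rather than part of any SIMRbase tool.

```python
def eliminated_fraction(germline_gb: float, eliminated_gb: float) -> float:
    """Return the fraction of the germline genome removed by PGR."""
    if germline_gb <= 0 or not 0 <= eliminated_gb <= germline_gb:
        raise ValueError("need germline > 0 and 0 <= eliminated <= germline")
    return eliminated_gb / germline_gb


# Approximate sizes quoted in the text (both are rounded estimates).
GERMLINE_GB = 2.3
ELIMINATED_GB = 0.5

fraction = eliminated_fraction(GERMLINE_GB, ELIMINATED_GB)
somatic_gb = GERMLINE_GB - ELIMINATED_GB

print(f"eliminated: about {fraction:.0%} of the germline genome")
print(f"retained in somatic cells: about {somatic_gb:.1f} Gb")
```

Because both inputs are approximate, the result should be read as an order-of-magnitude guide, not a measured value.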

In contrast to the germline genome, the somatically retained portions of the genome are relatively well characterized. Because PGR was not known to occur in lampreys prior to 2009, sequencing efforts focused on somatic tissues from which DNA or intact nuclei could be readily obtained (e.g. blood and liver). Sequencing of the sea lamprey somatic genome followed an approach that had proven successful for other vertebrate genomes prior to the advent of next-generation sequencing technologies (Sanger sequencing of clone ends, fosmid ends and BAC ends). Due to the abundance of highly identical interspersed repetitive elements and moderately high levels of polymorphism (approaching 1%), assembly of the somatic genome resulted in a consensus sequence that was substantially more fragmentary than other Sanger-based vertebrate assemblies. Nonetheless, this initial assembly yielded significant improvements in our understanding of the evolution of vertebrate genomes and fundamental aspects of vertebrate neurobiology, immunity and development.

This is the first assembly of the sea lamprey germline genome. Through extensive optimization of assembly pipelines, we identified a computational solution that allowed us to generate an assembly from next-generation sequence data (Illumina and Pacific Biosciences reads) that surpasses the existing Sanger-based somatic assembly. Analysis of the resulting assembly has revealed several hundred genes that are eliminated from somatic tissues by PGR and has shed new light on the evolution of genes and functional elements in the wake of ancient large-scale duplication events.


Dr. Robb Krumlauf's group, here at the Stowers Institute for Medical Research, is studying the Hox gene clusters in the sea lamprey and comparing their organisation with that of jawed vertebrates. The Hox family of transcription factors is encoded by genes that reside in genomic clusters and play key anterior-posterior patterning roles in multiple tissues during embryonic development. Hox genes exhibit segmental domains of activity in the developing head and neck that are remarkably similar between different vertebrate species. A striking feature of Hox gene expression during embryogenesis is that the order and timing of expression along the embryonic anterior-posterior axis correlate with the relative positions of the genes along the Hox cluster. It is widely held that the emergence of gene regulatory networks governing Hox-dependent patterning of the embryonic brain and pharynx was a fundamentally important event in the evolution of the complex vertebrate head. Thus, it is important to understand when and how these molecular cascades evolved in early vertebrates. As one of the few living jawless vertebrates, the sea lamprey is a crucial model organism for such endeavours, and research into the sea lamprey Hox genes is illuminating these important questions.

Feature Summary
The following features are currently present for this organism
Feature Type | Count
Community Annotation

Crowd Sourced Curation

Join the community effort to improve the gene models. We would like to generate a high quality set of gene models. This is only possible with manual gene annotation. The more curators, the more likely this task can be completed. To request an invitation to help curate, please email us at

Our Curation Editor

We use Apollo, an open source software project. Apollo is a web application that allows for many curators to edit gene models at the same time. You can watch other edits going on live even when they are happening across the country or even across the world. To request an invitation to help curate, please email us at

Apollo Documentation

Find information on how to use Apollo in the official Apollo User Guide. There is plenty of information about how to get started and detailed information about manual gene editing. Send any questions to


To be sure that all annotations are of the highest standard and trustworthy, there is some information that is required so that the curation process by each individual is transparent.

All new annotations must have the following:

  1. Gene Information
    • Name: Gene Name
    • Symbol: Short name or symbol
    • Description: Informative description
    • Attributes:
      • Attribute: "Supported By"
      • Value: for "Supported By":
        • DNA Sequencing Reads,
        • RNA Sequencing Reads,
        • BLAST Alignment,
        • or custom value
    • Comments: Any comments that document the process of curating this sequence feature

    Annotations can also have the following, if they exist:

    • Any Custom Attributes and values
    • DBXRefs: Accessions of this sequence feature in another database of the same species.
      • DB: Database name like GenBank
      • GenBank ID or other database ID for this feature in this species
    • Pubmed IDs: Pubmed IDs of any article that mentions this sequence feature in this organism
    • Gene Ontology IDs: Any GO ID that is associated with this sequence feature in this organism
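
The required fields listed above can be checked mechanically before an annotation is submitted. The following is a minimal Python sketch of such a check, not part of Apollo or SIMRbase: the dictionary layout, the key names and the function `missing_requirements` are all assumptions made for illustration, and the example record is invented.

```python
# Required free-text fields from the "Gene Information" list above.
# The lowercase key names are an assumption of this sketch.
REQUIRED_FIELDS = ("name", "symbol", "description", "comments")


def missing_requirements(annotation: dict) -> list:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = annotation.get(field, "")
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing {field}")
    # "Supported By" may hold a suggested value or any custom value,
    # so only its presence is checked here.
    supported_by = annotation.get("attributes", {}).get("Supported By", [])
    if not supported_by:
        problems.append('missing "Supported By" attribute')
    return problems


# Hypothetical record; the gene name and symbol are placeholders.
example_annotation = {
    "name": "Example gene",
    "symbol": "EXG1",
    "description": "Placeholder description for illustration",
    "attributes": {"Supported By": ["RNA Sequencing Reads"]},
    "comments": "Exon boundaries adjusted to match read alignments.",
}

print(missing_requirements(example_annotation))
print(missing_requirements({"name": "Example gene"}))
```

Optional items such as DBXRefs, PubMed IDs and Gene Ontology IDs are deliberately not checked, since the guidelines only ask for them when they exist.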

Genome Consortium

    The work of many diverse groups made this germline genome assembly and annotation possible.

    Group 1 speciality
    Group 2 speciality
    Group 3 speciality
    Group 4 speciality
    Group 5 speciality
    Group 6 speciality

Genome Properties

    Property Name | Value
    Assembly Scaffold Count | 12,077
    Assembly Version | gPmar1.0
    Chromosome Number | 99 pairs
    Genome Size | ~2.3 Gb
    N50 | 34 contigs contain 12 Mb
    gPmar1.0 Genome Assembly | gPmar100
    Transcripts | PMZ_v3.1 Transcripts
    Proteins | PMZ_v3.1 Proteins
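
The N50 entry above is terse. For readers unfamiliar with the statistic, the Python sketch below computes N50 and the companion L50 count from a list of sequence lengths using the conventional definition. It is an illustration only: the function name is made up, the toy lengths are not lamprey data, and it is an assumption that the property listed above was calculated in exactly this way.

```python
def n50_and_l50(lengths):
    """Return (N50, L50) for a collection of sequence lengths.

    Sequences are sorted from longest to shortest and accumulated until
    at least half of the total assembly length is covered. N50 is the
    length of the last sequence added; L50 is how many were needed.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered or ordered[-1] <= 0:
        raise ValueError("lengths must be non-empty and all positive")
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Toy scaffold lengths (arbitrary units), not taken from gPmar100.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_and_l50(toy_lengths)
print(f"N50 = {n50}, L50 = {l50}")
```

In the toy example the two longest sequences (80 and 70) already cover half of the total of 300, so the sketch reports an N50 of 70 and an L50 of 2.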

    See the Data Analysis page for a listing of the analyses performed on the gPmar100 genome and on the PMZ_v3.1 gene model transcripts and proteins. Downloads for each analysis can be found on the individual analysis pages, along with a description of the methodologies used.

Data Analyses

    All data loaded into SIMRbase has an "Analysis Page". These pages provide information about the methods and provide a link to download the files.

    Name | Program | Source
    Petromyzon marinus Genome Assembly (kPetmar1) | kPetmar1 | GCF_010993605.1_kPetMar1.pri_genomic.fna
    Petromyzon marinus Germline Genome Assembly FASTA (gPmar100) | Dovetail | gPmar100.fa
    Petromyzon marinus Germline Specific Regions | BWA-MEM |
    Petromyzon marinus HOX Genes Manual Curation cDNA FASTA | Apollo | Annotations.cdna.12142017.fasta
    Petromyzon marinus HOX Genes Manual Curation GFF | Apollo | Annotations.12142017.gff3
    Petromyzon marinus kPetMar1 BLASTX Cavefish e!100 | BLASTX | cavefish.tar.gz
    Petromyzon marinus kPetMar1 BLASTX Celegans e!100 | BLASTX | celegan.tar.gz
    Petromyzon marinus kPetMar1 BLASTX Fly e!100 | BLASTX | fly.tar.gz
    Petromyzon marinus kPetMar1 BLASTX Human e!100 | BLASTX | human.tar.gz
    Petromyzon marinus kPetMar1 BLASTX Lamprey e!100 | BLASTX | lamprey.tar.gz
    Petromyzon marinus kPetMar1 BLASTX Medaka e!100 | BLASTX | medaka.tar.gz
    Petromyzon marinus kPetMar1 BLASTX Mouse e!100 | BLASTX | mouse.tar.gz
    Petromyzon marinus kPetMar1 BLASTX Nematostella e!Metazoa46 | BLASTX | nematostella.tar.gz
    Petromyzon marinus kPetMar1 BLASTX Smed Smesg | BLASTX | smed.tar.gz
    Petromyzon marinus kPetMar1 BLASTX Uniprot | BLASTX | uniprot.tar.gz
    Petromyzon marinus kPetMar1 BLASTX Xenopus e!100 | BLASTX | xenopus.tar.gz
    Petromyzon marinus kPetMar1 BLASTX Yeast e!100 | BLASTX | yeast.tar.gz
    Petromyzon marinus kPetMar1 BLASTX Zebrafish e!100 | BLASTX | zebrafish.tar.gz
    Petromyzon marinus kPetMar1 IPRSCAN | IPRSCAN | pmz_iprscan.gz
    Petromyzon marinus kPetMar1 Mapped GFF | MAKER | pmz.genes.gff
    Petromyzon marinus kPetMar1 Protein FASTA | MAKER | pmz.proteins.fasta
    Petromyzon marinus kPetMar1 Transcript FASTA | MAKER | pmz.transcripts.fasta
    Petromyzon marinus lncRNA | GSNAP | P_marinus_lncrna.fasta
    Petromyzon marinus PMZ_v3.1 BLASTX Cavefish e!98 | BLASTX | PMZ_v3.1_vs_cavefish.tar.gz
    Petromyzon marinus PMZ_v3.1 BLASTX Celegans e!98 | BLASTX | PMZ_v3.1_vs_celegan.tar.gz
    Petromyzon marinus PMZ_v3.1 BLASTX Fly e!98 | BLASTX | PMZ_v3.1_vs_fly.tar.gz
    Petromyzon marinus PMZ_v3.1 BLASTX Human e!98 | BLASTX | PMZ_v3.1_vs_human.tar.gz
    Petromyzon marinus PMZ_v3.1 BLASTX Lamprey e!98 | BLASTX | PMZ_v3.1_vs_lamprey.tar.gz
    Petromyzon marinus PMZ_v3.1 BLASTX Mouse e!98 | BLASTX | PMZ_v3.1_vs_mouse.tar.gz
    Petromyzon marinus PMZ_v3.1 BLASTX Schmidtea mediterranea smes_v2_hconf_SMESG | BLASTX | PMZ_v3.1_vs_smed.tar.gz
    Petromyzon marinus PMZ_v3.1 BLASTX Uniprot | BLASTX | PMZ_v3.1_vs_uniprot.tar.gz
    Petromyzon marinus PMZ_v3.1 BLASTX Xenopus e!98 | BLASTX | PMZ_v3.1_vs_xenopus.tar.gz
    Petromyzon marinus PMZ_v3.1 BLASTX Yeast e!98 | BLASTX | PMZ_v3.1_vs_yeast.tar.gz
    Petromyzon marinus PMZ_v3.1 BLASTX Zebrafish e!98 | BLASTX | PMZ_v3.1_vs_zebrafish.tar.gz
    Petromyzon marinus PMZ_v3.1 Gene Models FASTA | MAKER | PMZ_v3.1_transcripts.fa
    Petromyzon marinus PMZ_v3.1 Gene Models GFF | MAKER | PMZ_v3.1_genes.gff3
    Petromyzon marinus PMZ_v3.1 Interproscan | Interproscan | PMZ_v3.1_IPRSCAN.tar.gz
    Petromyzon marinus repeats | RepeatMasker | germ1_update4_RModeler_union_velvet29_from_t165_bl180.fa.gz
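
Once a gene model file from the list above has been downloaded, a quick sanity check is to count how many features of each type it contains. The Python sketch below does this for any tab-delimited GFF3 stream using only the standard library. It assumes the downloaded file (for example `PMZ_v3.1_genes.gff3`) follows the standard nine-column GFF3 layout, which has not been verified here; the function name and the in-memory example records are invented for illustration.

```python
import io
from collections import Counter


def count_feature_types(handle):
    """Count the values in column 3 (feature type) of a GFF3 stream."""
    counts = Counter()
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more feature lines
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        columns = line.split("\t")
        if len(columns) != 9:
            continue  # skip malformed lines rather than guess
        counts[columns[2]] += 1
    return counts


# Tiny in-memory example with made-up scaffold names, coordinates and IDs.
example = io.StringIO(
    "##gff-version 3\n"
    "scaf_1\tmaker\tgene\t100\t900\t.\t+\t.\tID=g1\n"
    "scaf_1\tmaker\tmRNA\t100\t900\t.\t+\t.\tID=t1;Parent=g1\n"
    "scaf_1\tmaker\texon\t100\t400\t.\t+\t.\tParent=t1\n"
    "scaf_1\tmaker\texon\t600\t900\t.\t+\t.\tParent=t1\n"
)
counts = count_feature_types(example)
for feature_type, n in sorted(counts.items()):
    print(feature_type, n)

# For a downloaded file, something like this should work:
# with open("PMZ_v3.1_genes.gff3") as handle:
#     print(count_feature_types(handle))
```

Reading line by line keeps memory use low, which matters for whole-genome annotation files.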

Germline Genome Manuscript

    The sea lamprey germline genome provides insights into programmed genome rearrangement and vertebrate evolution

    Jeramiah J. Smith, Nataliya Timoshevskaya, Chengxi Ye, Carson Holt, Melissa C. Keinath, Hugo J. Parker, Malcolm E. Cook, Jon E. Hess, Shawn R. Narum, Francesco Lamanna, Henrik Kaessmann, Vladimir A. Timoshevskiy, Courtney K. M. Waterbury, Cody Saraceno, Leanne M. Wiedemann, Sofia M. C. Robb, Carl Baker, Evan E. Eichler, Dorit Hockman, Tatjana Sauka-Spengler, Mark Yandell, Robb Krumlauf, Greg Elgar & Chris T. Amemiya

    The sea lamprey (Petromyzon marinus) serves as a comparative model for reconstructing vertebrate evolution. To enable more informed analyses, we developed a new assembly of the lamprey germline genome that integrates several complementary data sets. Analysis of this highly contiguous (chromosome-scale) assembly shows that both chromosomal and whole-genome duplications have played significant roles in the evolution of ancestral vertebrate and lamprey genomes, including chromosomes that carry the six lamprey HOX clusters. The assembly also contains several hundred genes that are reproducibly eliminated from somatic cells during early development in lamprey. Comparative analyses show that gnathostome (mouse) homologs of these genes are frequently marked by polycomb repressive complexes (PRCs) in embryonic stem cells, suggesting overlaps in the regulatory logic of somatic DNA elimination and bivalent states that are regulated by early embryonic PRCs. This new assembly will enhance diverse studies that are informed by lampreys’ unique biology and evolutionary/comparative perspective.

    Nature Genetics (2018) doi:10.1038/s41588-017-0036-1

    Go to Nature Genetics to Download