The development of genome-scale sequence data has provided a foundation for studying biological processes in targeted species, while the ability to compare the genomes of different species has afforded a greater understanding of genome evolution and the functional elements dispersed among coding and non-coding regions.

  Using Evolutionary Cytogenomics to understand a key structural component of chromosomes: the centromere
Centromeres are the site of kinetochore assembly and spindle attachment to chromosomes during mitosis and meiosis in eukaryotic organisms. The proper functioning of centromeres is essential to the faithful segregation of genetic material in each cell generation, and errors in the cascade of events surrounding spindle attachment lead to genome instability and aneuploidy. Despite its critical function, the centromere is also a labile genomic feature in karyotypic evolution. In fact, species-specific karyotypes frequently involve centromere-associated rearrangements, including centric shifts, translocations, fusions, and inversions. Unfortunately, while integrated efforts such as the Human Genome Project and ENCODE have annotated and characterized functional elements across ~90% of the human genome, the ~10% that remains uncharacterized (often referred to as genomic “dark matter”) is concentrated in the centromeric, pericentromeric, and telomeric regions of every chromosome. Given the essential role of centromeres in segregating chromosomes during cell division, and hence their indispensability to genome stability, their absence from genome-scale studies is a significant obstacle to a full understanding of chromosome biology and the processes that underlie chromosome change. Despite this deficit, deep insights into centromere structure and function have emerged from studies of chromosome evolution. Through comparative genomics and cytogenetics, we have brought newly developed genetic tools to bear on a non-traditional model system that is uniquely well described both cytogenetically and phylogenetically: the genus Macropus (kangaroos and wallabies). This work has overcome some of the critical barriers to understanding centromere evolution and function in eukaryotes.
Rachel O'Neill Lab
Noncoding RNAs and Centromere Function
Although only recently discovered, small RNAs have proven to be essential regulatory molecules encoded within eukaryotic genomes. These molecules, whose major size classes range from 20 nt to 42 nt, participate in a diverse array of cellular processes including gene regulation, chromatin dynamics, and genome defense. We are focusing our efforts on three model systems (humans, gibbons, and marsupials) to understand how noncoding RNAs and transcription are critical for maintaining genome integrity and proper cell division. Most significantly, we have discovered that altering the pathway that produces these small RNAs results in a defect in loading of the histone variant CENP-A, a step required for proper centromere integrity and function. Future work will employ chromosome engineering, fluorescently tagged constructs, small RNA analyses, and deep sequencing to define the action and partners of these small RNAs in the process of CENP-A recruitment and cell stability.
 Marsupial Chromosome Biology - Centromere Drive in Marsupials
Because of their well-studied and relatively simple karyotypes, macropodine marsupials (the kangaroos and wallabies) offer a unique system in which to study karyotypic diversification and speciation. Characterization of the composition and distribution of centromeric sequences within this group of mammals indicates that these sequences have been involved in amplifications, segmental duplications, fissions, and fusions. It has been proposed that, while molecular drive is responsible for the convergent and concerted evolution of satellites within one species, genetic conflict and meiotic drive may be responsible for the different suites of centromere satellite sequences found between species. Under this proposal, as satellite arrays expand, they attract more microtubules in female meiosis, subverting Mendelian chromosomal segregation by distorting transmission in favor of one parental chromosome over the other. In the model proposed by Malik and Henikoff, this Centromere Drive results in rapid evolution of centromere binding proteins selected to restore parity in meiosis (especially if the “driven” centromere is sex-linked). Current models of centromere evolution and molecular drive predict that other species in a phylogeny would experience shifts and expansions of satellite sequences that would result in species-specific satellites. Using a detailed phylogeny with accompanying chromosomal data from the genus Macropus (kangaroos and wallabies), we have shown this not to be the case. Such a system provides an ideal opportunity to test the Centromere Drive hypothesis and explore the consequences of centromere selection on chromosome, and species, evolution.
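The transmission distortion described above can be illustrated with a minimal sketch. This is a textbook-style simplification, not a model taken from our work: it assumes an infinite, randomly mating population, an autosomal centromere variant, no fitness cost, and drive acting only in heterozygous females. The function name `drive_trajectory` and the parameters `p0` and `k` are hypothetical and exist only for illustration.

```python
def drive_trajectory(p0: float, k: float, generations: int) -> list[float]:
    """Track the frequency of a 'driving' centromere variant over time.

    p0: starting frequency of the variant in both eggs and sperm (assumed equal).
    k:  probability that a heterozygous female passes the variant to the egg;
        k = 0.5 is Mendelian segregation, k > 0.5 is drive (assumed constant).
    Returns the population-wide gamete frequency for each generation.
    """
    if not (0.0 <= p0 <= 1.0 and 0.0 <= k <= 1.0):
        raise ValueError("p0 and k must both lie between 0 and 1")

    p_egg = p_sperm = p0
    trajectory = [p0]
    for _ in range(generations):
        # Zygote genotype frequencies from random union of eggs and sperm.
        hom = p_egg * p_sperm
        het = p_egg * (1 - p_sperm) + p_sperm * (1 - p_egg)
        # Heterozygous females transmit the variant with probability k,
        # heterozygous males with the Mendelian probability 0.5.
        p_egg = hom + k * het
        p_sperm = hom + 0.5 * het
        trajectory.append((p_egg + p_sperm) / 2)
    return trajectory


if __name__ == "__main__":
    for label, k in (("Mendelian", 0.5), ("drive", 0.7)):
        final = drive_trajectory(p0=0.01, k=k, generations=200)[-1]
        print(f"{label}: k={k}, frequency after 200 generations = {final:.4f}")
```

With `k = 0.5` the frequency stays where it started, whereas any `k` above 0.5 lets a rare variant spread even though it confers no benefit on the organism. That spread is the pressure under which centromere binding proteins are proposed to evolve to restore meiotic parity.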
Using comparative genomics to understand adaptive responses within marine ecosystems
Despite the advantages of comparative genomics approaches for understanding genome and species biology, few species that occupy pivotal roles in marine ecosystems and pelagic food webs have been used as model organisms for genome-scale studies. Employing diverse technologies for the development of reference genome assemblies, we have targeted several species for comparative genomics that will afford a greater understanding of adaptive responses to environmental variation and the evolution of novel phenotypes within marine ecosystems. One of our target species is the Southern Ocean salp, Salpa thompsoni, which has shown altered distribution and abundance in Antarctic Ocean ecosystems in recent years. With a high-quality genome assembly, we can begin to unravel the genomic features that allow the salp to adapt to our warming oceans, a success that carries the devastating consequence of depleting the ocean of the extraordinary diversity of organisms that rely on krill.
  Developing genome resources and novel sequencing applications
Rapidly evolving sequencing technology has enabled more comprehensive, genome-scale analyses of genome structure and function, gene expression networks, and population diversity. Coincident with this advance is the capacity to establish a broad range of eukaryotic species as models for studying genome biology. My lab has a long-standing interest in developing new model systems with broad applications for scientists, as well as novel methodologies for assessing genome sequences. To this end, we have worked on genome projects for over 40 different species, have developed computational tools that merge various forms of data for scaffolding, including non-sequence-based data, and have established workflows that use these genome assemblies to understand the dynamics of gene expression and epigenetics in various contexts. Part of our effort has focused on understanding the transcriptional landscape of transposable elements in the centromeres of various model species, including human.

Deep Ocean Genomes Project

The ocean is Earth’s largest and most biodiverse ecosystem, hosting 33 known phyla from the tree of life, with ~410,000 named species and estimates of >100 million species. The deep sea hosts a broad spectrum of habitats, including hydrothermal vents, methane seeps, oxygen minimum zones, seamounts, canyons, and trenches, with evidence suggesting that deep-ocean life is richly diverse and highly adapted. In partnership, Woods Hole Oceanographic Institution (Tim Shank) and the University of Connecticut have established the Deep-Ocean Genomes Project (DOG). We are focused on implementing genomics technologies to address diverse ecological and evolutionary hypotheses within and across a myriad of deep-sea habitats and lineages.