The development of genome-scale sequence data has provided a foundation for studying biological processes in targeted species, while the ability to compare the genomes of different species has afforded a greater understanding of genome evolution and the functional elements dispersed among coding and non-coding regions.


Using Evolutionary Cytogenomics to understand a key structural component of chromosomes: the centromere

Centromeres are the site of kinetochore assembly and spindle attachment to chromosomes during mitosis and meiosis in eukaryotic organisms. The proper functioning of centromeres is essential to the faithful segregation of genetic material in each cell generation, and errors in the cascade of events circumscribing spindle attachment lead to genome instability and aneuploidy. Despite its critical function, the centromere is also known to be a labile genomic feature when considering karyotypic evolution. In fact, species-specificity in karyotypes frequently involves centromere-associated rearrangements, including centric shifts, translocations, fusions, and inversions. Unfortunately, while integrated efforts such as the Human Genome Project and ENCODE have annotated and characterized functional elements across ~90% of the human genome, the ~10% that remains uncharacterized (often referred to as genomic “dark matter”) is concentrated in the centromeric, pericentromeric, and telomeric regions of every chromosome. Given the essential role of centromeres in the segregation of chromosomes during cell division, and hence their indispensability to genome stability, their absence from genome-scale studies is a significant obstacle to achieving a full understanding of chromosome biology and the processes that underlie chromosome change. Despite this deficit, deep insights into centromere structure and function have emerged from studies of chromosome evolution. Through comparative genomics and cytogenetics, we have brought newly developed genetic tools to bear on a non-traditional model system that is uniquely well described both cytogenetically and phylogenetically, the genus Macropus (kangaroos and wallabies), overcoming some of the critical barriers to understanding centromere evolution and function in eukaryotes.


Noncoding RNAs and Centromere Function

Although only recently discovered, small RNAs have proven to be essential regulatory molecules encoded within eukaryotic genomes. These molecules, represented by major size classes ranging from 20 nt to 42 nt, participate in a diverse array of cellular processes, including gene regulation, chromatin dynamics, and genome defense. We are focusing our efforts on three model systems (humans, gibbons, and marsupials) to understand how noncoding RNAs and transcription are critical for maintaining genome integrity and proper cell division. Most significantly, we have discovered that alteration of the pathway that produces these small RNAs results in a defect in loading of the histone variant CENP-A, a step required for proper centromere integrity and function. Future work will employ chromosome engineering, fluorescently tagged constructs, small RNA analysis, and deep sequencing to define the action and partners of these small RNAs in the process of CENP-A recruitment and cell stability.


Marsupial Chromosome Biology – Centromere Drive in Marsupials

Because of their well-studied and relatively simple karyotypes, macropodine marsupials (the kangaroos and wallabies) offer a unique system in which to study karyotypic diversification and speciation. Characterization of the composition and distribution of centromeric sequences within this group of mammals indicates that these sequences have been involved in amplifications, segmental duplications, fissions, and fusions. It has been proposed that, while molecular drive is responsible for the convergent and concerted evolution of satellites within one species, genetic conflict and meiotic drive may be responsible for the different suites of centromere satellite sequences found between species. Under this proposal, as satellite arrays expand, they attract more microtubules in female meiosis, subverting Mendelian chromosomal segregation by distorting transmission in favor of one parental chromosome over the other. In the model proposed by Malik and Henikoff, this Centromere Drive results in rapid evolution of centromere binding proteins selected to restore parity in meiosis (especially if the “driven” centromere is sex-linked). Current models of centromere evolution and molecular drive predict that other species in a phylogeny would experience shifts and expansions of satellite sequences that would result in species-specific satellites. Using a detailed phylogeny with accompanying chromosomal data from the genus Macropus (kangaroos and wallabies), we have shown this not to be the case. Such a system provides an ideal opportunity to test the Centromere Drive hypothesis and explore the consequences of centromere selection on chromosome, and species, evolution.


Using comparative genomics to understand adaptive responses within marine ecosystems

Despite the advantages of employing comparative genomics approaches to understand genome and species biology, few species that occupy pivotal roles in marine ecosystems and pelagic food webs have been utilized as model organisms for genome-scale studies. Employing diverse technologies for the development of reference genome assemblies, we have targeted several species for comparative genomics that will afford a greater understanding of adaptive responses to environmental variation and the evolution of novel phenotypes within marine ecosystems. One of our target species is the Southern Ocean salp, Salpa thompsoni, which has shown altered distribution and abundance in Antarctic Ocean ecosystems in recent years. With a high-quality genome assembly, we can begin to unravel the genomic features that allow the salp to adapt to our warming oceans, an adaptation with the devastating consequence of depleting the ocean of the extraordinary diversity of organisms that rely on krill.


Cancer Susceptibility linked to Viral noncoding RNAs

Despite an extremely high incidence of infection within the human population (>90%), Epstein-Barr virus (EBV) is maintained as an asymptomatic infection of B lymphocytes in the majority of cases. However, in a small number of cases, EBV is linked to the development of malignancies, including Burkitt’s lymphoma, Hodgkin’s lymphoma, nasopharyngeal carcinoma, gastric carcinoma, post-transplant lymphoproliferative disorders (PTLD), T cell lymphoma, T/NK nasal type lymphoma, and B lymphoproliferative diseases. Interestingly, each of these EBV+ cancers is regionally delimited; for example, Burkitt’s lymphoma is mainly found in equatorial Africa, while nasopharyngeal carcinoma is predominantly found in Southeast Asia, Southern China, North Africa, and in the Eskimo population of Alaska. Searches for a link between EBV sequence variants and the observed geographical distribution of cancer types have failed to provide conclusive evidence that EBV diversity contributes to regional EBV+ cancer susceptibility. We are therefore studying the interaction between EBV small RNAs and the host genome in an effort to understand EBV+ cancer susceptibility.


Developing genome resources and novel sequencing applications

Rapidly evolving sequencing technology has enabled more comprehensive, genome-scale analyses of genome structure and function, gene expression networks, and population diversity. Coincident with this advance is the capacity to establish a broad range of eukaryotic species as models for studying genome biology. My lab has a long-standing interest in developing new model systems with broad applications for scientists, as well as novel methodologies for assessing genome sequences. To this end, we have worked on genome projects for seven different species, have developed computational tools to merge various forms of data for scaffolding, including non-sequence-based data, and have established workflows for using these genome assemblies to understand the dynamics of gene expression and epigenetics in various contexts, such as viral infection, warfighter stress and adaptability, environmental impact, adaptation and alternative splicing, and cancer.