The Cancer Genome Atlas and sister projects have now completed analysis of over 10,000 tumor genomes, providing a catalog of the gene mutations, copy number variants and other genetic alterations that cause cancer. In many cases it remains unclear, however, which are the key driver mutations or dependencies in a given cancer and how these influence pathogenesis and response to therapy. Although tumors of similar types and clinical outcomes can have patterns of mutations that are strikingly different, it is becoming apparent that these mutations recurrently hijack the same hallmark molecular pathways and networks. For this reason, cancer research and treatment are increasingly dependent on knowledge of biological networks of multiple types, including physical interactions among proteins and synthetic-lethal and epistatic interactions among genes. Our Cancer Cell Map Initiative (CCMI) is aimed at comprehensively detailing these complex interactions among cancer genes and proteins using a combination of physical interaction, genetic interaction, and computational approaches. This work will enable us to analyze cancer molecular networks with a view towards pathway and network-based personalized therapy.
We are applying our expertise in mass spectrometry to identify key regulators of cardiac differentiation. This work will help us understand how healthy heart tissue develops, how this process can go awry in disease, and how to make a diseased heart healthy again. While previous studies have gained important insight into this cell state transition by focusing on transcriptional regulators and epigenetics, our work focuses on the relatively uncharacterized protein and post-translational levels. By combining data on transcription, translation, and post-translational modifications, we will produce a map of cardiac differentiation in unprecedented detail. This work is done in collaboration with investigators at the Gladstone Institute of Cardiovascular Disease.
Viruses are molecular machines that have co-evolved with their hosts over millions of years to exploit their specialized cellular niches for replication. This co-dependency manifests itself in thousands of molecular changes that influence cellular function and may result in disease. A systematic, quantitative understanding of these changes is essential for the understanding of these disease states and for the development of next-generation therapeutics. Only recently, however, have technological advances in proteomics, functional genomics, and cellular engineering allowed for the generation of systems-level interaction maps in mammalian cells. By bridging these technological innovations to infectious disease, we strive to gain insight into the molecular mechanisms of health and pathogenesis. In this way, we can use ‘big data’ to go from systems to structure to patients and back again to inform the development and design of therapeutic treatments.
In the Krogan lab, we specialize in a variety of high-throughput, systems-based techniques to explore the changes made to host cell systems during infection. These include proteomics techniques like affinity purification with mass spectrometry (AP-MS), post-translational modification (PTM) profiling, cross-linking mass spectrometry (XL-MS), Ascorbate Peroxidase proximity labeling mass spectrometry (APEX-MS), and quantitative mass spectrometry, as well as functional genomic techniques like CRISPR/Cas9 editing, genetic interaction (GI) mapping, pooled and/or arrayed small RNA interference (RNAi) screening, next-generation sequencing, Chromatin Immunoprecipitation (ChIP), and microarray profiling. We have used these approaches to study a wide array of bacterial and viral pathogens including: Human Immunodeficiency Virus (HIV), Influenza A Virus (IAV), Hepatitis B Virus (HBV), Hepatitis C Virus (HCV), Dengue virus, Moloney Leukemia Virus (MLV), Zika virus, Ebola virus, Herpesvirus, Chlamydia, Pseudomonas, and Tuberculosis. By employing unbiased systems approaches to infectious disease, we can identify critical nodes for pathogenic persistence and infection in the host, which in turn can inform the design and development of new therapeutic strategies.
Mass spectrometry based proteomics
Mass spectrometry based proteomics is a powerful approach to characterize proteins and post-translational modifications (PTMs). Using this technology, we can accurately quantify the dynamics of proteins and PTMs across a wide variety of conditions and cellular states. In the Krogan lab, we are developing novel proteomics methods to facilitate biological discovery, as well as employing established approaches. We perform a wide variety of proteomics techniques including: targeted absolute and relative quantitation via selected reaction monitoring (SRM), unbiased global proteome and PTM characterization with label-free quantitation, large-scale quantitative affinity purification mass spectrometry (AP-MS), as well as protein complex interface characterization using cross-linking mass spectrometry. A primary interest in the lab is utilizing several of these complementary mass spectrometry based approaches in a systems biology fashion to develop a comprehensive picture of the biological system of interest.
Post-translational modifications (PTMs) are critical for regulating nearly all biological processes. PTMs can regulate a system in a very rapid manner, compared to gene expression changes that take time to effect change in a biological system. PTMs are also frequently dysregulated in human disease. Many cancers are driven by aberrant PTM signaling that promotes cellular growth in an unhealthy manner. Pathogens frequently usurp and disrupt host cellular PTM machinery in order to silence innate immune responses and create an environment favorable for replication. Our lab uses mass spectrometry-based proteomics approaches to study PTMs in a comprehensive, unbiased manner. We have developed platforms for proteome-wide quantification of changes in many PTMs, particularly for phosphorylation, ubiquitylation, and acetylation. Additionally, we have developed bioinformatics tools for interpreting the highly complex data resulting from these platforms in order to maximize the extraction of functionally relevant biological knowledge. These approaches have allowed us to identify signaling pathways regulated as a stem cell differentiates into a heart cell, to discover novel ubiquitylation substrates of viral and bacterial pathogens, and to rapidly and specifically map substrates of acetylation enzyme machinery in yeast.
Protein interaction networks
Proteins typically do not function alone, but in physical or functional interaction with other proteins or biomolecules, forming macromolecular complexes. The complex cellular network of protein interactions is highly organized in time and space and adapts dynamically to external or internal perturbations to define the cell’s functional state. Consequently, characterizing protein interaction networks and their dynamic changes in response to perturbations can better our understanding of protein function. In the Krogan lab, we use affinity purification combined with quantitative mass spectrometry to characterize protein interaction networks. We also developed a novel approach based on APEX-proximity biotinylation combined with quantitative mass spectrometry, which for the first time allows protein interaction networks to be studied with temporal and spatial resolution simultaneously. We apply these approaches towards understanding how viruses hijack the cellular machinery for replication and infection, and how genetic mutations cause rewiring of protein interaction networks leading to the development of cancer or neuronal disorders.
Cross-linking mass spectrometry (XL-MS)
XL-MS represents a suite of powerful tools for defining protein-protein interactions (PPIs) and probing PPI interfaces. The combination of cross-linking with AP-MS strategies allows for the capture and identification of not only stable, but also transient, dynamic, and weakly associating proteins. In this way, XL-MS experiments provide complementary PPI information, often defining PPIs that are lost during native AP-MS experiments. What’s more, XL-MS strategies store additional PPI data in the form of cross-linked peptides. Identification of cross-linked residues can provide a map of interacting protein surfaces that can be used for: 1) direct PPI network generation (versus inferred); and 2) interface mapping for integrative structural determination of protein complexes. Our lab utilizes a specialized in vitro XL-MS approach to determine structures for protein complexes that have been challenging to define. These include complexes that contain transient interactions (e.g. substrate-enzyme complexes), flexible subunits, heterogeneous composition or conformation, and subunits with poor solubility or stability. Currently our main focus is on defining virus-host PPI networks and protein complexes that are involved in viral pathogenesis or host innate immunity.
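To make the idea of direct network generation from cross-linked residues concrete, the following is a minimal Python sketch, not the lab's pipeline. It assumes cross-links have already been identified and are given as (protein, residue, protein, residue) tuples; the function name `build_ppi_network` and the protein names are hypothetical placeholders.

```python
from collections import defaultdict


def build_ppi_network(crosslinks):
    """Group inter-protein cross-links by protein pair.

    Returns a dict mapping (proteinX, proteinY) to the set of
    cross-linked residue pairs observed between the two proteins.
    """
    network = defaultdict(set)
    for prot1, res1, prot2, res2 in crosslinks:
        if prot1 == prot2:
            # Simplification: intra-protein links are skipped, so
            # homo-oligomer contacts are ignored in this sketch.
            continue
        # Order the endpoints so (A, B) and (B, A) share one edge.
        (p1, r1), (p2, r2) = sorted([(prot1, res1), (prot2, res2)])
        network[(p1, p2)].add((r1, r2))
    return dict(network)


# Hypothetical cross-link identifications (illustrative values only).
crosslinks = [
    ("ProtA", 45, "ProtB", 120),
    ("ProtB", 131, "ProtA", 52),
    ("ProtA", 10, "ProtA", 88),   # intra-protein, ignored
    ("ProtB", 120, "ProtC", 7),
]
network = build_ppi_network(crosslinks)
for edge, residue_pairs in sorted(network.items()):
    print(edge, sorted(residue_pairs))
```

Each key is a directly observed PPI edge, and the residue pairs attached to it are the kind of distance restraints that interface mapping and integrative structural modeling would start from.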
Genetic interactions in yeast
Genetic interactions report on how the presence of one mutation alters the phenotypic outcome of a second mutation, and can be used to identify genes that function in related or parallel pathways. We have developed a large-scale, quantitative approach for measuring genetic interactions, called E-MAP (Epistatic Miniarray Profile). E-MAPs consist of quantitative measurements of genetic interactions between pairs of mutations within large sets of genes. We have generated E-MAPs for most processes in budding yeast and fission yeast. These studies have led to a wealth of functional insights into these organisms and into the evolution of genetic interactomes.
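As a simple illustration of what a quantitative genetic interaction measures, the sketch below scores a gene pair as the deviation of the observed double-mutant fitness from the expectation under a multiplicative model (the product of the two single-mutant fitnesses). This is a common textbook definition used here as an assumption; it is not the E-MAP scoring procedure itself, and the gene names and fitness values are hypothetical.

```python
from itertools import combinations


def interaction_score(w_a, w_b, w_ab):
    """Observed double-mutant fitness minus the multiplicative expectation."""
    return w_ab - w_a * w_b


def score_pairs(single_fitness, double_fitness):
    """Score every gene pair for which a double-mutant measurement exists."""
    scores = {}
    for a, b in combinations(sorted(single_fitness), 2):
        if (a, b) in double_fitness:
            scores[(a, b)] = interaction_score(
                single_fitness[a], single_fitness[b], double_fitness[(a, b)]
            )
    return scores


# Hypothetical fitness values relative to wild type (1.0).
single = {"geneA": 0.8, "geneB": 0.5, "geneC": 0.9}
double = {
    ("geneA", "geneB"): 0.10,  # sicker than expected: negative interaction
    ("geneA", "geneC"): 0.72,  # matches expectation: no interaction
    ("geneB", "geneC"): 0.60,  # healthier than expected: positive interaction
}
scores = score_pairs(single, double)
for pair, score in scores.items():
    print(pair, round(score, 3))
```

Negative scores indicate double mutants that grow worse than expected (for example, synthetic sickness), while positive scores indicate double mutants that grow better than expected, as is often seen for genes acting in the same pathway.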
Our initial E-MAP studies used deletions and knockdowns of non-essential and essential genes, respectively. However, many important proteins are multi-functional, and a limitation of this work was that secondary functions of proteins tend to be obscured by the central function. To address the next level of complexity, we developed an advance of the E-MAP analysis, termed point-mutant E-MAP (pE-MAP), which allows us to examine multifunctional genes at residue-level resolution. We applied this technique to functionally dissect RNA polymerase II and histones H3 and H4 in budding yeast. In addition to discovering connections between individual residues and cellular processes, we discovered a relationship between spatial proximity and genetic similarity of the mutated residues. Based on this observation, we are currently exploring the applications of our technique as a tool for structure-function analysis.
In a further advance of E-MAP technology, we have developed Triple Mutant Analysis (TMA), which allows for investigation of genetic interactions in triple mutants. This approach is particularly useful for revealing functional redundancies that could not be uncovered by standard double mutant genetic interactions.
Genetic interactions in mammalian cells
In recent years, we have worked towards extending the E-MAP technology to functionally interrogate mammalian biology. To this end, we developed a combinatorial RNA interference (RNAi) based platform for the systematic and quantitative generation of genetic and chemical-genetic interaction maps in mammalian cells. The approach relies on high-throughput microscopy and automated liquid handling to minimize time, labor and human error. The microscopy-based nature of our platform allows us to quantify not only proliferation, but also other phenotypes that can be identified via cell staining or reporter cassettes. By combining chemicals with RNAi knockdowns, we can further interrogate the interplay between cellular machinery and drug treatments. This feature also makes our platform an effective tool for identifying genetic backgrounds that are sensitive to particular drugs.
Most recently we have worked with the Qi lab at Stanford and Mali lab at UCSD to develop an experimental methodology for generating quantitative genetic interaction maps in mammalian cells utilizing CRISPR and CRISPRi/a. In a close collaboration with the Qi lab we developed a platform for performing pooled CRISPRi genetic interaction screens and applied it to a set of over 100 chromatin-related factors in a human cell culture model (HEK293). Our approach utilizes a stable expression of an inducible dCas9-KRAB construct combined with a library of single or double sgRNA (single guide RNA) constructs delivered via lentiviral transduction. Relative abundance of individual sgRNA combinations is determined through next generation sequencing. In collaboration with the Mali lab at UCSD we developed a similar approach utilizing the CRISPR gene editing toolkit and applied it to a set of tumor suppressor genes and druggable targets in two different human cell lines (HeLa and A549). We have also developed a suite of new software tools aiding the processing and scoring of the raw data that result from these pooled screens.
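The readout of such a pooled screen, the relative abundance of each sgRNA combination, can be sketched as a log2 fold change between two sequencing samples. The following Python example is a minimal illustration under stated assumptions, not the lab's scoring software: the function names, the pseudocount of 1, the sgRNA labels, and the read counts are all hypothetical.

```python
import math


def normalize(counts, pseudocount=1.0):
    """Convert raw read counts to frequencies, adding a pseudocount."""
    total = sum(counts.values()) + pseudocount * len(counts)
    return {key: (value + pseudocount) / total for key, value in counts.items()}


def log2_fold_change(start_counts, end_counts, pseudocount=1.0):
    """Log2 ratio of end-point to start-point frequency per sgRNA pair."""
    start = normalize(start_counts, pseudocount)
    end = normalize(end_counts, pseudocount)
    return {key: math.log2(end[key] / start[key]) for key in start if key in end}


# Hypothetical read counts for double-sgRNA constructs.
start_counts = {
    ("sgA", "sgB"): 99,
    ("sgA", "sgCtrl"): 99,
    ("sgCtrl", "sgB"): 99,
    ("sgCtrl", "sgCtrl"): 99,
}
end_counts = {
    ("sgA", "sgB"): 24,         # strongly depleted combination
    ("sgA", "sgCtrl"): 99,
    ("sgCtrl", "sgB"): 99,
    ("sgCtrl", "sgCtrl"): 174,
}
lfc = log2_fold_change(start_counts, end_counts)
for pair, value in lfc.items():
    print(pair, round(value, 2))
```

In this toy data the double knockdown drops out far more than either single knockdown paired with a control guide, which is the kind of pattern a genetic interaction score would then be derived from.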
CRISPR/Cas9 Gene Editing
CRISPR/Cas9 gene editing strategies have revolutionized our ability to engineer the human genome for robust functional interrogation of complex biological processes. This system has given researchers the unprecedented power to direct permanent genetic changes in experimental cell lines and models. Unfortunately, many disease states that we study occur in specialized cellular contexts that are poorly reflected in these models. For this reason, more and more researchers are turning to primary cells and tissues as laboratory models, but these are less amenable to genetic manipulation. Recently, we have adapted CRISPR/Cas9 technology for the efficient, high-throughput editing of primary CD4+ T cells. Our multiplex platform supports the arrayed generation of hundreds of specific gene manipulations in only a few hours and is widely adaptable to an array of culturing conditions, protocols, and downstream applications. We’ve recently demonstrated the power of this platform to discover and mechanistically probe host factors critical to HIV replication. Currently, we’re expanding this approach to explore thousands of manipulations in a single donor, to engineer single point mutations into a directed site, and to edit the genome of other primary tissues. We hope this approach will help pave the way for a better understanding of many different disease states, from HIV infection to cancer, directly in primary patient tissues.
We develop and employ computational techniques to distill mechanistic insights from massive biological datasets. Our efforts span low-level scoring of raw data from our high-throughput pipelines to modeling of biological networks using different types of previously processed data.
We have developed a suite of statistical scoring tools for analyzing genetic interactions. Our toolbox handles raw data processing, as well as quality control and visualization modules for post-processing of the data. The toolbox was originally developed for yeast and bacterial screens, but we have since extended it to cover mammalian genetic interaction screens as well, including both microscopy-based and pooled screens. In a parallel effort, we developed MiST, a tool for scoring of affinity purification mass spectrometry data.
Our modeling work is currently focused on influenza virus infection, in our role as the Modeling Core of the FluOMICS Projects, funded by the National Institute of Allergy and Infectious Diseases. We are integrating time-resolved genomic, proteomic, RNAi knockdown, and metabolomic data in response to virus infections (H1N1, H3N2 and H5N1) in human cells and live mice. By integrating these data and taking into account the varying pathogenesis between the three virus types, our modeling aims to identify critical nodes of the viral-host network that are predictive of viral pathogenesis. As part of this effort, we are developing a web-based interactive network visualization tool that will allow other groups to integrate their own datasets into the flu interaction network in an effort to improve the models. While these modeling tools are currently focused on flu, they are built in a flexible framework and will be applicable to other disease conditions as well.