Computational Biology

News

Comorbidity between Mendelian disease and cancer
Researchers in the Rabadan Lab have found that comorbidity between Mendelian diseases and cancer may result from shared genetic factors.

Genetic diseases can arise in a variety of ways. Mendelian disorders, for example, occur when specific mutations in single genes — called germline mutations — are inherited from either of one’s two parents. Well-known examples of Mendelian diseases include cystic fibrosis, sickle cell disease, and Duchenne muscular dystrophy. Other genetic diseases, including cancer, result from somatic mutations, which occur in individual cells during a person’s lifetime. Because the genetic origins of Mendelian diseases and cancer are so different, they are typically understood to be distinct phenomena. However, scientists in the Columbia University Department of Systems Biology have found evidence that there might be interesting genetic connections between them. 

In a paper just published in Nature Communications, postdoctoral research scientist Rachel Melamed and colleagues in the laboratory of Associate Professor Raul Rabadan report on a new method that uses knowledge about Mendelian diseases to suggest mutations involved in cancer. The study takes advantage of an enormous collection of electronic health records representing over 110 million patients, a substantial percentage of US residents. The authors show that clinical co-occurrence of Mendelian diseases and cancer, known as comorbidity, can be tied to genetic changes that play roles in both diseases. The paper also identifies several specific relationships between Mendelian diseases and the cancers melanoma and glioblastoma.
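As a rough illustration of what clinical co-occurrence means in quantitative terms, the sketch below computes a relative risk from a two-by-two table of patient counts. The function name `relative_risk` and all counts are hypothetical, and this is not the statistical method used in the paper, which this summary does not describe.

```python
def relative_risk(both, mendelian_only, cancer_only, neither):
    """Risk of cancer among patients with the Mendelian disease,
    divided by the risk among patients without it."""
    exposed = both + mendelian_only        # patients with the Mendelian disease
    unexposed = cancer_only + neither      # patients without it
    if exposed == 0 or unexposed == 0 or cancer_only == 0:
        raise ValueError("relative risk is undefined for these counts")
    risk_exposed = both / exposed
    risk_unexposed = cancer_only / unexposed
    return risk_exposed / risk_unexposed


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
rr = relative_risk(both=30, mendelian_only=970, cancer_only=1000, neither=99000)
print(round(rr, 2))
```

A value above 1 would indicate that the cancer appears more often among patients with the Mendelian disease than among those without it. A real analysis of health records would also need to account for confounders such as age and sex.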

Some factors in the exposome

The exposome incorporates factors such as the environment we inhabit, the food we eat, and the drugs we take.

Although genomics has dramatically improved our understanding of the molecular origins of certain human genetic diseases, our health is also influenced by exposures to our surrounding environment. Molecules found in food, air and water pollution, and prescription drugs, for example, interact with genetic, molecular, and physiologic features within our bodies in highly personalized ways. The nature of these relationships is important in determining who is immune to such exposures and who becomes sick because of them.

In the past, methods for studying this interface have been limited because of the complexity of the problem. After all, how could we possibly cross-reference a lifetime’s worth of exposures with individual genetic profiles in any kind of meaningful way? Recently, however, an explosion in the generation of quantitative data related to the environment, health, and genetics, along with new computational methods based in machine learning and bioinformatics, has made this landscape ripe for exploration.

At this year’s South by Southwest Interactive Festival in Austin, Texas, Department of Systems Biology Assistant Professor Nicholas Tatonetti and his collaborator Chirag Patel (Harvard Medical School) discussed the remarkable new opportunities that “big data” approaches offer for investigating this landscape. Driving Tatonetti and Patel’s approach is a concept called the exposome. First proposed by Christopher Wild (University of Leeds) in 2005, an exposome represents all of the environmental exposures a person has experienced during his or her life that could play a role in the onset of chronic diseases. Tatonetti and Patel’s presentation highlighted how investigation of the exposome has become tractable, as well as the important roles that individuals can play in supporting this effort.

In the following interview, Dr. Tatonetti discusses some of the approaches his team is using to explore the exposome, and how the project has evolved out of his previous research.

ALK-negative ALCL mutation map
A map of mutations observed in ALK-negative anaplastic large cell lymphoma. (Credit: Dr. Rabadan)

The following article is reposted with permission from the Columbia University Medical Center Newsroom. Find the original here.

The first-ever systematic study of the genomes of patients with ALK-negative anaplastic large cell lymphoma (ALCL), a particularly aggressive form of non-Hodgkin’s lymphoma (NHL), shows that many cases of the disease are driven by alterations in the JAK/STAT3 cell signaling pathway. The study also demonstrates, in mice implanted with human-derived ALCL tumors, that the disease can be inhibited by compounds that target this pathway, raising hopes that more effective treatments might soon be developed. The study, led by researchers at Columbia University Medical Center (CUMC) and Weill Cornell Medical College, was published today in the online edition of Cancer Cell.

Bacterial evolutionary relationships

Image courtesy of Germán Plata and Dennis Vitkup.

Columbia News has just published an article covering recent research by associate professor Dennis Vitkup and postdoctoral research scientist Germán Plata that uses simulations of bacterial metabolism as a lens for studying how phenotypes adapt and diversify across evolutionary time scales. The article reports:

Despite their omnipresence, microbial evolutionary adaptations are often challenging to study, partly due to the difficulty of growing diverse bacteria in the lab. “Probably less than a dozen bacteria are really well studied in the laboratory,” Vitkup says.

Writing in the journal Nature this past January, Vitkup and Plata applied computational tools to investigate bacterial evolutionary adaptations by simulating metabolism for more than 300 bacterial species, covering the entire microbial tree of life.

Andrea Califano and Aris Floratos
Andrea Califano and Aris Floratos will lead an effort to reclassify tumors catalogued in TCGA according to their master regulators.

Andrea Califano and Aris Floratos, faculty members in the Columbia University Department of Systems Biology, have received a two-year, $624,236 subcontract to develop a new classification system of cancer subtypes. The agreement was awarded through a subcontract from Leidos Biomedical Research, Inc., which operates the Frederick National Laboratory for Cancer Research for the federal government.  

By performing an integrative analysis of genomic data from the Cancer Genome Atlas (TCGA) and proteomic data from the National Cancer Institute’s Clinical Proteomic Tumor Analysis Consortium (CPTAC), the researchers plan to recategorize tumors collected in TCGA based on the master regulator genes that determine their state. This is in contrast to other approaches based on expression of genes that reflect tissue lineage and proliferative processes. In addition, the team will link the genetics of each tumor sample to the specific master regulators that determine its state using a recently published algorithm (DIGGIT). Ultimately, the project aims to provide a more useful catalog of pan-cancer subtypes that could help to identify biomarkers and therapeutic targets for specific kinds of tumors, and to provide a resource to guide the next generation of precision medicine.

“We have to reevaluate the way in which we organize tumors within subtypes, using both gene expression data and mutational data,” says Dr. Califano. “Right now the common approach is to classify tumor types based on rather generic genes that are differentially expressed between subtypes. But most of these genes play no role in actually driving the disease. We want to shift the emphasis and classify tumors based on the genes that truly regulate tumor state and survival.”

Autism Spectrum Disorders Genetic Network

Network of autism-associated genes. (Credit: Dennis Vitkup)

The following article is reposted with permission from the Columbia University Medical Center Newsroom. Find the original here.

People with autism have a wide range of symptoms, with no two people sharing the exact type and severity of behaviors. Now a large-scale analysis of hundreds of patients and nearly 1000 genes has started to uncover how diversity among traits can be traced to differences in patients’ genetic mutations. The study, conducted by researchers at Columbia University Medical Center, was published Dec. 22 in the journal Nature Neuroscience.

Autism researchers have identified hundreds of genes that, when mutated, likely increase the risk of developing autism spectrum disorder (ASD). Much of the variability among people with ASD is thought to stem from the diversity of underlying genetic changes, including the specific genes mutated and the severity of the mutation.

“If we can understand how different mutations lead to different features of ASD, we may be able to use patients’ genetic profiles to develop accurate diagnostic and prognostic tools and perhaps personalize treatment,” said senior author Dennis Vitkup, PhD, associate professor of systems biology and biomedical informatics at Columbia University’s College of Physicians & Surgeons.

Sequence of genomic alterations in CLL
A graph representing the sequence of genomic alterations in chronic lymphocytic leukemia (CLL). Each node represents a mutation, with arrows indicating temporal relationships between them. The size of the nodes indicates the number of patients in the study who exhibited the alteration, while the thickness of the lines shows how often the temporal relationships between nodes were seen. The method the researchers used enabled them to identify multiple, distinct evolutionary patterns in CLL.

As biologists have gained a better understanding of cancer, it has become clear that tumors are often driven not by a single mutation, but by a series of genetic changes that correspond to particular stages of cancer progression. In this sense, a tumor is constantly evolving, with different groups of cells that harbor distinctive mutations multiplying at different rates, depending on their fitness for particular disease states. As the search for more effective cancer diagnostics and therapies continues, one key question is how to disentangle the order in which mutations occur in order to understand how tumors change over time. Being able to predict how a tumor will behave based on signs seen early in the course of disease could enable the development of new diagnostics that could better inform treatment planning.

In a paper just published in the journal eLife, a team of investigators led by Department of Systems Biology Associate Professor Raul Rabadan reports on a new computational strategy for addressing this challenge. Their framework, called tumor evolutionary directed graphs (TEDG), considers next-generation sequencing data from tumor samples from a large number of patients. Using TEDG to analyze cancer cells in patients with chronic lymphocytic leukemia (CLL), they were able to develop a model of how the disease’s mutational landscape changes from its initial onset to its late stages. Their findings suggest that CLL may not be just the result of a single evolutionary path, but can evolve in alternative ways.
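To make the idea of a directed graph of temporal relationships more concrete, the sketch below counts, across patients, how often one alteration was observed before another. It is a simplified illustration only, not the TEDG algorithm itself. The input format (one ordered list of alterations per patient), the function name `count_orderings`, and the placeholder labels are all assumptions.

```python
from collections import Counter
from itertools import combinations


def count_orderings(patients):
    """Count how often alteration A was observed before alteration B."""
    edges = Counter()
    for ordered_alterations in patients:
        # combinations() preserves input order, so (a, b) means a came first.
        for earlier, later in combinations(ordered_alterations, 2):
            edges[(earlier, later)] += 1
    return edges


# Hypothetical per-patient orderings; the labels are placeholders only.
patients = [
    ["A", "B", "C"],
    ["A", "C"],
    ["B", "C"],
]
edges = count_orderings(patients)
for (earlier, later), n in sorted(edges.items()):
    print(f"{earlier} -> {later}: seen in {n} patient(s)")
```

In a drawing of such a graph, the edge counts could set line thickness, in the spirit of the figure described above. Inferring the orderings from sequencing data is the hard part and is not attempted here.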

Expanding the landscape of breast cancer drivers

In comparison with a previous study (Stephens et al., 2012, shown in gray), a new computational approach that focuses on somatic copy number mutations increased the number of known driver mutations in breast tumors to a median of five for each tumor. The findings could raise the likelihood of finding actionable targets in individual patients with breast cancer.

For many years, researchers have known that somatic copy number alterations (SCNA’s) — insertions, deletions, duplications, and transpositions of sections of DNA that are not inherited but occur after birth — play important roles in causing many types of cancer. Indeed, most recurrent drivers of epithelial tumors are copy number alterations, with some found in up to 40% of patients with specific tumor types. However, because SCNA’s occur when entire sections of chromosomes become damaged, biologists have had difficulty developing effective methods for distinguishing genes within SCNA’s that actually drive cancer from those genes that might lie near a driver but do not themselves cause disease.

Helios nearly doubled the number of high-confidence predictions of breast cancer drivers.

In a new paper published in Cell, researchers in the laboratories of Dana Pe’er (Columbia University Departments of Systems Biology and Biological Sciences) and Jose Silva (Icahn School of Medicine at Mount Sinai) report on a new computational algorithm that promises to dramatically improve researchers’ ability to identify cancer-driving genes within potentially large SCNA’s. The algorithm, called Helios, was used to analyze a combination of genomic data and information generated by functional RNAi screens, enabling them to predict several dozen new SCNA drivers of breast cancer. In follow-up in vitro experimental studies, they tested 12 of these predictions, 10 of which were validated in the laboratory. Their findings nearly double the number of breast cancer drivers, providing many new opportunities towards personalized treatments for breast cancer. Their methodology is general and could also be used to locate disease-causing SCNA’s in other cancer types.

Leading this effort was Felix Sanchez-Garcia, a recent PhD graduate from the Pe’er Lab and a first author on the paper. The story of how this breakthrough came about illuminates how the interdisciplinary research and education that take place at the Department of Systems Biology can address important challenges facing biological and biomedical research.

DIGGIT identifies mutations upstream of master regulators.

A new algorithm called DIGGIT identifies mutations that lie upstream of crucial bottlenecks within regulatory networks. These bottlenecks, called master regulators, integrate these mutations and become essential functional drivers of diseases such as cancer.

Although genome-wide association studies have made it possible to identify mutations that are linked to diseases such as cancer, determining which mutations actually drive disease and the mechanics of how they do so has been an ongoing challenge. In a paper just published in Cell, researchers in the lab of Andrea Califano describe a new computational approach that may help address this problem.

Ashkenazi Population Bottleneck Model
The consortium’s model of Ashkenazi Jewish ancestry suggests that the population’s history was shaped by three critical bottleneck events. The ancestors of both populations underwent a bottleneck sometime between 85,000 and 91,000 years ago, which was likely coincident with an Out-of-Africa event. The founding European population underwent a bottleneck at approximately 21,000 years ago, beginning a period of interbreeding between individuals of European and Middle Eastern ancestry. A severe bottleneck occurred in the Middle Ages, reducing the population to under 350 individuals. The modern-day Ashkenazi community emerged from this group.

An international research consortium led by Associate Professor Itsik Pe’er has produced a new panel of reference genomes that will significantly improve the study of genetic variation in Ashkenazi Jews. Using deep sequencing to analyze the genomes of 128 healthy individuals of Ashkenazi Jewish origin, The Ashkenazi Genome Consortium (TAGC) has just published a resource that will be much more effective than previously available European reference genomes for identifying disease-causing mutations within this historically isolated population. Their study also provides novel insights into the historical origins and ancestry of the Ashkenazi community. A paper describing their study has just been published online in Nature Communications.

The dataset produced by the consortium provides a high-resolution baseline genomic profile of the Ashkenazi Jewish population, which they revealed to be significantly different from that found in non-Jewish Europeans. In the past, clinicians’ only option for identifying disease-causing mutations in Ashkenazi individuals was to compare their genomes to more heterogeneous European reference sets. This new resource accounts for the historical isolation of this population, and so will make genetic screening much more accurate in identifying disease-causing mutations.

In an article that appears on the website of Columbia University’s Fu Foundation School of Engineering and Computer Science, Dr. Pe’er explains:

“Our study is the first full DNA sequence dataset available for Ashkenazi Jewish genomes... With this comprehensive catalog of mutations present in the Ashkenazi Jewish population, we will be able to more effectively map disease genes onto the genome and thus gain a better understanding of common disorders. We see this study serving as a vehicle for personalized medicine and a model for researchers working with other populations.”

In addition to offering an important resource for such future translational and clinical research, the paper’s findings also provide new insights that have implications for the much debated question of how European and Ashkenazi Jewish populations emerged historically.

Differential decay rates in MDA-LM2 vs. MDA cells

The presence of the structural RNA stability element (sRSE) family of mRNA elements distinguishes transcript stability in metastatic MDA-LM2 breast cancer cell lines from that seen in its parental MDA cell line. Each bin contains differential decay rate measurements for roughly 350 transcripts. From left (more stable in MDA) to right (more stable in MDA-LM2), sRSE-carrying transcripts were enriched among those destabilized in MDA-LM2 cells. The TEISER algorithm collectively depicts sRSEs as a generic stem-loop with blue and red circles marking nucleotides with low and high GC content, respectively. Also included are mutual information (MI) values and their associated z-scores.

Gene expression analysis has become a widely used method for identifying interactions between genes within regulatory networks. If fluctuations in the expression levels of two genes consistently shift in parallel over time, the logic goes, it is reasonable to hypothesize that they are regulated by the same factors. However, such analyses have typically focused on steady-state gene expression, and have not accounted for modifications that messenger RNAs (mRNAs) can undergo during the time between their transcription from DNA and their translation into proteins. Researchers now understand that certain stem loop structures in mRNAs make it possible for proteins to bind to them, often causing RNA degradation and subsequently modulating protein synthesis. From the perspective of systems biology, this can have implications for the activity of entire regulatory networks, and recent studies have even suggested that aberrations in mRNA stability can play a role in disease initiation and progression.
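The co-expression logic described above can be sketched with a simple Pearson correlation between two expression profiles. This is a minimal illustration under stated assumptions: the expression values are made up, the function name `pearson` is hypothetical, and network inference tools typically use more robust measures than a single correlation coefficient.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (norm_x * norm_y)


# Made-up expression levels measured across five samples or time points.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [2.1, 3.9, 6.2, 8.0, 9.8]   # rises and falls with gene_a
gene_c = [5.0, 1.0, 4.0, 2.0, 3.0]   # no clear relationship to gene_a

print(round(pearson(gene_a, gene_b), 3))
print(round(pearson(gene_a, gene_c), 3))
```

A coefficient close to 1 for the first pair would support the hypothesis of shared regulation, while a value near 0 for the second pair would not. As the paragraph notes, steady-state comparisons like this one do not capture what happens to an mRNA after transcription.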

In a new paper published in the journal Nature, Department of Systems Biology Professor Saeed Tavazoie and collaborators at the Rockefeller University describe a new computational and experimental approach for identifying post-transcriptional modifications and investigating their effects in biological systems. In a study of metastatic breast cancer, they determined that when the protein TARBP2 binds to a specific structural element in mRNA transcripts, it increases the likelihood that cancer cells will become invasive and spread. Interestingly, they also found that TARBP2 causes metastasis by binding transcripts of two genes — amyloid precursor protein (APP) and zinc finger protein 395 (ZNF395) — that have previously been implicated in Alzheimer’s disease and Huntington’s disease, respectively. Although the nature of this intersection between the regulatory networks underlying cancer and neurodegenerative diseases is unclear, the finding raises a tantalizing question about whether these very different disorders might be linked at some basic biological level.

geWorkbench screenshot

A new version of geWorkbench lets researchers access a range of powerful, integrated bioinformatics tools using a standard web browser. Here, an ARACNe-generated gene regulatory network is displayed using the Cytoscape Web plugin.

Since its creation in 2005, investigators in Columbia University’s Center for the Multiscale Analysis of Genomic and Cellular Networks (MAGNet) have developed a large number of computational tools for studying biological systems from the perspectives of structural biology and systems biology. To consolidate and disseminate these tools to the wider research community, MAGNet developed geWorkbench (genomics Workbench), a free, open-source bioinformatics application that gathers all of the Center’s software and databases into one integrated software platform. These include applications for the analysis of cellular regulatory networks, protein structure, DNA and protein sequences, gene expression, and other kinds of biological data.

Initially, geWorkbench was made available as a software package that users could install and run on their local computers. Now, in a major upgrade, MAGNet has released a web-based version that makes these tools accessible through a browser interface.

Comparing human and mouse prostate cancer networks

Computational synergy analysis depicting FOXM1 and CENPF regulons from the human (left) and mouse (right) interactomes showing shared and nonshared targets. Red corresponds to overexpressed targets and blue to underexpressed targets.

Two genes work together to drive the most lethal forms of prostate cancer, according to new research by investigators in the Columbia University Department of Systems Biology.  These findings could lead to a diagnostic test for identifying those tumors likely to become aggressive and to the development of novel combination therapy for the disease.

The two genes—FOXM1 and CENPF—had been previously implicated in cancer, but none of the prior studies suggested that they might work synergistically to cause the most aggressive form of prostate cancer. The study was published today in the online issue of Cancer Cell.

“Individually, neither gene is significant in terms of its contribution to prostate cancer,” said co-senior author Andrea Califano, the Clyde and Helen Wu Professor of Chemical Biology in Biomedical Informatics and Chair of the Department of Systems Biology. “But when both genes are turned on, they work together synergistically to activate pathways associated with the most aggressive form of the disease.”

Co-principal investigator Andrea Califano discusses the new study.

“Ultimately, we expect this finding to allow doctors to identify patients with the most aggressive prostate cancer so that they can get the most effective treatments,” said co-senior author Cory Abate-Shen, the Michael and Stella Chernow Professor of Urologic Sciences and also a member of the Department of Systems Biology. “Having biomarkers that predict which patients will respond to specific drugs will hopefully provide a more personalized way to treat cancer.”

Distribution of marker expression across development

A new algorithm called Wanderlust uses single-cell measurements to detect how marker expression changes across development.

In a new paper published in the journal Cell, a team of researchers led by Dana Pe’er at Columbia University and Garry Nolan at Stanford University describes a powerful new method for mapping cellular development at the single cell level. By combining emerging technologies for studying single cells with a new, advanced computational algorithm, they have designed a novel approach for mapping development and created the most comprehensive map ever made of human B cell development. Their approach will greatly improve researchers’ ability to investigate development in cells of all types, make it possible to identify rare aberrations in development that lead to disease, and ultimately help to guide the next generation of research in regenerative medicine.

Pointing out why being able to generate these maps is an important advance, Dr. Pe’er, an associate professor in the Columbia University Department of Systems Biology and Department of Biological Sciences, explains, “There are so many diseases that result from malfunctions in the molecular programs that control the development of our cell repertoire and so many rare, yet important, regulatory cell types that we have yet to discover. We can only truly understand what goes wrong in these diseases if we have a complete map of the progression in normal development. Such maps will also act as a compass for regenerative medicine, because it’s very difficult to grow something if you don’t know how it develops in nature. For the first time, our method makes it possible to build a high-resolution map, at the single cell level, that can guide these kinds of research.”

Chris Wiggins

In a “Most Creative People” feature, Fast Company magazine recently interviewed associate professor Chris Wiggins, a faculty member of the Department of Systems Biology and Center for Computational Biology and Bioinformatics, about his new appointment at one of the world’s most respected outlets for digital journalism. In this role, he will lead the development of a machine learning team that will help the New York Times to better understand how its audience is using and navigating its content.

In the interview, Dr. Wiggins explains why machine learning is becoming increasingly important in the age of big data and discusses the shared challenges that the natural sciences and the media are now both facing.

Tuuli Lappalainen
Tuuli Lappalainen has joined Columbia University as an assistant professor in the Department of Systems Biology. Dr. Lappalainen is a specialist in the analysis of RNA sequencing data, with research interests including functional variation in the human genome, population genetic background of variation in the human genome, and interpretation of genome function.

Dr. Lappalainen joins the Department of Systems Biology in co-appointment with the New York Genome Center (NYGC), where she will also serve as a Junior Investigator and Core Member. Based in lower Manhattan, NYGC is a consortium made up primarily of New York-area institutions that is designed to translate promising genomics-based research into new strategies for treating, preventing, and managing disease. This co-appointment with Columbia University — an institutional founding member of the NYGC — will enhance collaboration between the two institutions. (Read an interview with Dr. Lappalainen at the New York Genome Center website.)

Dr. Lappalainen earned her PhD in genetics at the University of Helsinki, Finland, and held appointments as a postdoctoral researcher at the University of Geneva Medical School, Switzerland, and at the Stanford University School of Medicine. She is the chair of the analysis group for the Genetic European Variation in Health and Disease (Geuvadis) Consortium’s RNA sequencing project, a member of the analysis group for the National Institutes of Health’s Genotype-Tissue Expression (GTEx) project, and a member of the analysis and functional interpretation groups for the 1000 Genomes Project.

Models of Evolution
In Charles Darwin's seminal treatise On the Origin of Species there is only one image, which visualizes evolution as following a branching pattern in which species diverge into lineages over time like the limbs on a tree. With the increasing availability of genomic data, scientists have attempted to understand evolution at the molecular level by using a similar phylogenetic paradigm, but as Department of Systems Biology Assistant Professor Raul Rabadan, MD/PhD student Joseph Chan, and Stanford University mathematician Gunnar Carlsson point out in a new paper published in the Proceedings of the National Academy of Sciences, it has a number of shortcomings when applied in this way. By developing a new mathematical approach based on a method called persistent homology, the researchers produced several insights into viral evolution that could not be found using other means.

Researchers in the Columbia University Department of Systems Biology and Herbert Irving Comprehensive Cancer Center have determined that measuring the expression levels of three genes associated with aging can be used to predict the aggressiveness of seemingly low-risk prostate cancer. Use of this three-gene biomarker, in conjunction with existing cancer-staging tests, could help physicians better determine which men with early prostate cancer can be safely followed with “active surveillance” and spared the risks of prostate removal or other invasive treatment. The findings were published today in the online edition of Science Translational Medicine.

More than 200,000 new cases of prostate cancer are diagnosed each year in the U.S. “Most of these cancers are slow growing and will remain so, and thus they do not require treatment,” said study leader Cory Abate-Shen, Michael and Stella Chernow Professor of Urological Oncology at Columbia University Medical Center (CUMC). “The problem is that, with existing tests, we cannot identify the small percentage of slow-growing tumors that will eventually become aggressive and spread beyond the prostate. The three-gene biomarker could take much of the guesswork out of the diagnostic process and ensure that patients are neither overtreated nor undertreated.”

Rabadan, Nature Genetics

An analysis of all gene mutations in nearly 140 brain tumors has uncovered most of the genes responsible for driving glioblastoma. The analysis found 18 new driver genes (labeled red) never before implicated in glioblastoma, and correctly identified the 15 previously known driver genes (labeled blue). The graphs show mutated genes that are commonly found in varying numbers in glioblastoma (left), that frequently contain insertions (middle), and that frequently contain deletions (right). Genes represented by blue dots in the graphs were statistically most likely to be driver genes.

A team of Columbia University Medical Center researchers has identified 18 new genes responsible for driving glioblastoma multiforme, the most common—and most aggressive—form of brain cancer in adults. The study was published August 5, 2013, in the journal Nature Genetics.

The Columbia team used a combination of high-throughput DNA sequencing and a new method of statistical analysis developed by co-author Raul Rabadan, an assistant professor in the Department of Systems Biology, to generate a short list of candidate gene mutations that were highly likely to drive cancer, as opposed to mutations that have no effect.

Considering these results along with a previous study this group conducted, Rabadan and collaborators Antonio Iavarone and Anna Lasorella point out that approximately 15% of glioblastomas could now be targeted with drugs that have already been approved by the FDA. As Lasorella remarks in an article for the CUMC Newsroom, “There is no reason why these patients couldn’t receive these drugs now in clinical trials.”

Searches for hyperglycemia-related terms

Percentage of users in each of the three user groups searching for hyperglycemia-related terms, computed per week over 12 months of search log data. Background refers to the fraction of all searchers who search for hyperglycemia-related symptoms or terminology independent of the presence of the drugs in the users’ search histories.

Although the US Food and Drug Administration and other agencies collect and analyze reports on adverse drug effects, alerts for single drugs and drug-drug interactions are often delayed due to the time it takes to accumulate evidence. Columbia University Department of Systems Biology faculty member Nicholas Tatonetti, in collaboration with investigators at Stanford University and Microsoft Research, hypothesized that Internet users can provide early clues of adverse drug events as they seek information on the web concerning symptoms they are experiencing. A new paper explains their results.

As a test, Tatonetti and colleagues asked whether it would be possible to detect evidence of an interaction between the antidepressant paroxetine and the anti-cholesterol drug pravastatin by analyzing web search logs from 2010. As a postdoc at Stanford, Tatonetti and colleagues used a data mining algorithm to analyze FDA adverse event reporting records, and retroactively found this combination to be associated with hyperglycemia (high blood sugar) in some patients. In this new project, the researchers analyzed the search logs of millions of Internet users from a period before the above association was identified to see how often they entered search terms related to hyperglycemia and to one or both medications under investigation. (Participants in this study opted in by voluntarily installing a web browser extension that tracked their activity anonymously.)
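The comparison described above can be sketched as follows: group users by which drug names appear in their search histories, then compute the fraction of each group that also searched for hyperglycemia-related terms. This is a toy illustration, not the study's pipeline. The log format, the term list, and the function names `classify_user` and `symptom_rate_by_group` are assumptions.

```python
def classify_user(queries, drug_a="paroxetine", drug_b="pravastatin"):
    """Assign a user to a group based on which drug names they searched for."""
    text = " ".join(queries).lower()
    has_a, has_b = drug_a in text, drug_b in text
    if has_a and has_b:
        return "both"
    if has_a:
        return drug_a
    if has_b:
        return drug_b
    return "neither"


def symptom_rate_by_group(user_queries, symptom_terms):
    """Fraction of users in each group with at least one symptom-related query."""
    totals, hits = {}, {}
    for queries in user_queries.values():
        group = classify_user(queries)
        text = " ".join(queries).lower()
        totals[group] = totals.get(group, 0) + 1
        if any(term in text for term in symptom_terms):
            hits[group] = hits.get(group, 0) + 1
    return {group: hits.get(group, 0) / totals[group] for group in totals}


# Hypothetical, anonymized search histories and an illustrative term list.
logs = {
    "u1": ["paroxetine side effects", "pravastatin dosage", "high blood sugar symptoms"],
    "u2": ["paroxetine withdrawal"],
    "u3": ["pravastatin and grapefruit", "hyperglycemia"],
    "u4": ["pravastatin cost"],
    "u5": ["weather tomorrow"],
}
terms = ["hyperglycemia", "high blood sugar"]
rates = symptom_rate_by_group(logs, terms)
print(rates)
```

A higher rate in the "both" group than in either single-drug group would be the kind of signal the researchers were looking for. Real search logs would require far more careful term matching and statistical testing than this substring check.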
