Mutations are the replacement of DNA bases known as Adenine (A), Cytosine (C), Guanine (G) and Thymine (T) with other bases. When a particular change, such as C to T or G to A, recurs within a specific DNA sequence, the pattern is known as a mutation signature. These mutation signatures are like spelling mistakes that carry signs of the agents that caused the mutations. Ultraviolet light, tobacco smoke and other cancer-causing agents leave behind such signatures in the DNA of tumors.

Recently, a new mutation signature found in cancer cells was suspected to have been created by a family of enzymes found in human cells called the APOBEC3 family. The study, "Strand-biased Cytosine deamination at the Replication Fork causes Cytosine to Thymine Mutations in Escherichia coli," led by Ashok Bhagwat, Ph.D., professor of chemistry in the College of Liberal Arts and Sciences at Wayne State University, was recently published in the Proceedings of the National Academy of Sciences.

Bhagwat and collaborators from Wayne State University and Indiana University have determined the target within DNA that is attacked by APOBEC3 enzymes. Results from this basic science research project provide an understanding of a major source of mutations that may drive tumor growth and also explain a key finding in microbial evolution.

DNA consists of two thin strands that are made up of the four bases, which are arranged in specific sequences, creating words and chapters that contain the secrets of the cell. The two DNA strands are intertwined with each other to protect the bases from damage by chemicals and enzymes. Unfortunately, the cell must copy its DNA before it can divide. This copying process requires that the two DNA strands are briefly separated, making DNA "single-stranded" and thus susceptible to damage.

According to Bhagwat, an odd quirk of DNA biochemistry is that one of the DNA strands, known as the lagging-strand template (LGST), stays single-stranded much longer than its counterpart, the leading-strand template (LDST). The WSU/IU team showed that APOBEC3 enzymes preferentially attack the LGST, causing mutations during DNA copying.

"We did this work using the simple bacterium Escherichia coli as a model, introducing the active part of the human enzyme APOBEC3G in it," said Bhagwat. "The advantage of using E. coli is that its complete DNA sequence can be easily determined and the way it copies its DNA is well understood."

Bhagwat's research team has been studying the larger AID/APOBEC family of enzymes for the past 14 years and has helped show that this family of enzymes converts C to an abnormal base called Uracil (U). The U gets repaired back to C most of the time, but sometimes this process fails and U is fixed as a T. This is called a C-to-T mutation.
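
The C-to-U-to-T pathway described above can be sketched as a toy simulation. This is only an illustration, not the team's method: the function names, the example sequence and both probabilities are hypothetical values chosen for readability. The sketch also shows why a C-to-T change on one strand reads as a G-to-A change on the complementary strand.

```python
import random

# Watson-Crick pairing used to derive the opposite strand.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read in its own 5'-to-3' direction."""
    return seq.translate(COMPLEMENT)[::-1]

def deaminate_and_repair(strand: str, p_deaminate: float,
                         p_repair_fails: float, rng: random.Random) -> str:
    """Toy model: each C may be converted to U; if repair fails, the U is fixed as T."""
    out = []
    for base in strand:
        if (base == "C"
                and rng.random() < p_deaminate       # C converted to U
                and rng.random() < p_repair_fails):  # U not repaired back to C
            out.append("T")
        else:
            out.append(base)
    return "".join(out)

rng = random.Random(0)          # fixed seed so the run is repeatable
original = "ACCGTCCATGCC"       # hypothetical single-stranded stretch
mutated = deaminate_and_repair(original, p_deaminate=0.5,
                               p_repair_fails=0.2, rng=rng)

print("attacked strand:", original, "->", mutated)
print("opposite strand:", reverse_complement(original), "->",
      reverse_complement(mutated))
```

Any C that becomes T on the attacked strand appears as a G that becomes A when the opposite strand is read, which is why both changes can belong to the same signature.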

Bhagwat initiated a collaboration with Patricia Foster, Ph.D., at Indiana University and provided her group with the APOBEC3G (A3G) gene to express in E. coli. They determined the DNA sequence of hundreds of such bacteria and cataloged more than 1,000 mutations caused by A3G. Weilong Hao, Ph.D., assistant professor of biological sciences at Wayne State, later analyzed the mutations and noticed that when A3G was in the cells, C's in the LGST were replaced with T's at three to four times the frequency at which they were replaced in the LDST. Statistical analysis of the data showed that this difference was extremely unlikely to have occurred by chance, indicating that APOBEC3 enzymes preferentially target the LGST.

Cancer is often called a genetic disease because nearly all cancer-causing agents cause mutations. When the DNA sequence of breast tumors and other cancers was recently determined, C-to-T changes were the most frequently found mutations. These mutations were often found in clusters, suggesting that large stretches of DNA must become damaged in a single mutational event.
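
The kind of strand comparison reported for the A3G-expressing bacteria can be illustrated with a simple significance test. The sketch below is not the analysis the team performed. It uses an exact two-sided binomial test, the counts (780 and 220) are made up to match only the "more than 1,000 mutations" and "three to four times" figures, and it assumes that both strand templates offer the same number of C's to attack, so that each mutation would land on either strand with probability 0.5 if there were no bias.

```python
from math import exp, lgamma, log

def binom_pmf(k: int, n: int, p: float) -> float:
    """Binomial probability of k successes in n trials, computed in log space."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))
    return exp(log_pmf)

def two_sided_binomial_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided p-value: total probability of outcomes no more likely than k."""
    observed = binom_pmf(k, n, p)
    total = sum(q for q in (binom_pmf(i, n, p) for i in range(n + 1))
                if q <= observed * (1 + 1e-7))  # small tolerance for float ties
    return min(1.0, total)

# Hypothetical counts of C-to-T mutations assigned to each strand template.
lgst_mutations = 780
ldst_mutations = 220
n_total = lgst_mutations + ldst_mutations

p_value = two_sided_binomial_p(lgst_mutations, n_total, p=0.5)
print(f"LGST/LDST ratio: {lgst_mutations / ldst_mutations:.2f}")
print(f"two-sided p-value under the no-bias assumption: {p_value:.3g}")
```

A real analysis would also have to account for how many C's each strand template actually contains and for the sequence context that the enzyme prefers.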

"These mutations had the signature of mutations caused by APOBEC3 enzymes, but it was unclear where these enzymes found the necessary long stretches of single-stranded DNA to mutate," said Bhagwat. "The work by our collaborative team has shown that during the copying of DNA, the LGST strand of DNA is accessible to APOBECs and this causes the mutations."

According to Bhagwat, bacteria like E. coli display a phenomenon called "GC skew" that is related to this discovery. Bacterial DNA typically has fewer C's than G's in the LGST. In light of the WSU/IU results, this observation can be explained at the molecular level. Bacteria do not naturally contain APOBEC3 enzymes, but water and other cellular chemicals can also cause C-to-T mutations. However, they do so at a very slow rate compared to APOBECs. Despite this slowness, the bacteria have replaced many of the C's in their LGST with T's over millions of years of evolution, creating the GC skew. Thus, the act of copying DNA, which is essential to life, drives both microbial evolution and cancer development.

"Our results could have a great impact on identifying the source of mutations in many cancers and perhaps tailoring treatments based on this information," said Bhagwat. "Only some tumors have APOBEC mutational signatures and these can be identified using current DNA sequencing technology. Eventually, we may be able to treat these cancers in their early stages to prevent mutations caused by APOBEC3s."

"This study is a beautiful example of how the power of bioinformatics and genomics is valuable in addressing important biological questions," said Hao. "It has potential to make a positive impact on the health outcomes of people with cancer and possibly other diseases in the near future."