Mutations may be caused by mistakes during cell division, or they may be caused by exposure to DNA-damaging agents in the environment. Mutations can be harmful, beneficial, or have no effect. Certain mutations may lead to cancer or other diseases. A mutation is sometimes called a variant.

Gene mutations in cancer cells interfere with the normal instructions in a cell and can cause it to grow out of control or not die when it should. A cancer can continue to grow because cancer cells act differently than normal cells.

But what can we do about it?

The answer lies in understanding each patient deeply.

The advent of precision medicine is moving us closer to more precise, predictable and powerful health care that is customized for the individual patient. Our growing understanding of genetics and genomics — and how they drive health, disease and drug responses in each person — is enabling doctors to provide better disease prevention, more accurate diagnoses, safer drug prescriptions and more effective treatments for the many diseases and conditions that diminish our health.

Tailoring health care to each person’s unique genetic makeup – that’s the promising idea behind precision medicine, also variously known as individualized medicine, personalized medicine or genomic medicine.

This project aims to identify these mutations. Using knowledge of the genes and their variations, the algorithm will be able to identify whether a mutation is harmful or not.

This project uses NLP techniques because we needed to analyse text about the genes. Stratification and improvement techniques were also applied.
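As a rough illustration of the two ideas mentioned above, the sketch below turns a piece of text into word counts (one of the simplest NLP representations) and splits samples so that each class keeps roughly the same share in the training and test sets. The function names `bag_of_words` and `stratified_split`, the 20% test fraction, and the toy text and labels are assumptions made for this example; they are not taken from the project's code, which may use different techniques or libraries.

```python
import random
import re
from collections import Counter, defaultdict


def bag_of_words(text):
    """Lowercase the text and count each alphanumeric token."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(tokens)


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices), preserving class proportions.

    Indices are grouped by label, shuffled within each group, and the
    same fraction of every group is sent to the test set.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # At least one sample per class goes to the test set; a class with
        # a single sample would therefore be absent from the training set.
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy data (made up for illustration): 10 samples of class 0, 5 of class 1.
labels = [0] * 10 + [1] * 5
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)

counts = bag_of_words("Mutation in the gene; the mutation is a variant")
```

With imbalanced classes, a purely random split can leave a rare class almost absent from the test set; splitting per class avoids that.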

To measure the performance of the algorithm, the log loss metric was applied. Log loss indicates how close the predicted probability is to the corresponding actual (true) value: the more the predicted probability diverges from the actual value, the higher the log loss.
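To make the metric concrete, here is a minimal sketch of multiclass log loss in plain Python: for each sample it takes the negative logarithm of the probability assigned to the true class, then averages over all samples. The function name `log_loss`, the clipping constant `eps`, and the toy predictions are assumptions for this example, not the project's own implementation.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    y_true: list of integer class indices.
    y_prob: list of per-class probability lists, one per sample.
    eps:    clipping bound that keeps log() away from 0 and 1.
    """
    total = 0.0
    for label, probs in zip(y_true, y_prob):
        p = min(max(probs[label], eps), 1 - eps)
        total += -math.log(p)
    return total / len(y_true)


# Toy predictions (made up): same true labels, different confidence.
y_true = [0, 1]
confident_right = log_loss(y_true, [[0.9, 0.1], [0.1, 0.9]])
undecided = log_loss(y_true, [[0.5, 0.5], [0.5, 0.5]])
confident_wrong = log_loss(y_true, [[0.1, 0.9], [0.9, 0.1]])
```

The three values increase in that order, so a model that is confidently wrong is penalized much more heavily than one that is merely uncertain.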

Julia-py
Julia Hornick, data scientist.