Text Mining Approach to analyse the relation between obesity and breast cancer data
- Ashok Kumar, Priyanka Thakur, Kanika Gupta, and Amit Pal
Biomedical research needs to leverage and exploit the large amount of information reported in
scientific publications. Literature data collected from publications has to be managed to extract
information and transform it into an understandable structure using text mining approaches. Text mining
refers to the process of deriving high-quality information from text by finding relationships between
entities which do not show direct associations. Therefore, as an example of this approach, we
present the link between two diseases, i.e., breast cancer and obesity. Obesity is known to be
associated with cancer mortality, but little is known about the link between lifetime changes in BMI
of obese individuals and cancer mortality in both males and females. In this article, literature data on
obesity and breast cancer was obtained from the PubMed database. Methods based on
groups of common genes and keywords, together with their frequency of occurrence in the data,
were then used to establish the relation between obesity and breast cancer, which was visualized using pie charts
and bar graphs. From the data analysis, we obtained one gene that links the two
diseases; this link was validated using statistical analysis and the disease-connect web server. We also propose 8
common high-frequency keywords that could be used for indexing when searching the
literature for obesity and breast cancer in combination.
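
The frequency-based comparison described above can be illustrated with a minimal sketch. It assumes the abstracts for each disease have already been retrieved as plain strings. The function names, the stop-word list, the `min_count` threshold, and the placeholder terms GENEX and GENEY are all hypothetical and chosen for illustration; this is not the authors' actual pipeline.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "of", "and", "in", "with", "is", "to", "a", "for", "are", "by"}


def term_frequencies(abstracts):
    """Count how often each non-stop-word token occurs across a list of abstracts."""
    counts = Counter()
    for text in abstracts:
        # Lowercase the text and keep tokens of two or more characters.
        tokens = re.findall(r"[a-z][a-z0-9-]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts


def common_terms(freq_a, freq_b, min_count=2):
    """Return terms that reach min_count in BOTH corpora, most frequent first."""
    shared = {
        t for t in freq_a
        if freq_a[t] >= min_count and freq_b.get(t, 0) >= min_count
    }
    # Sort by combined frequency (descending), then alphabetically for stability.
    return sorted(shared, key=lambda t: (-(freq_a[t] + freq_b[t]), t))


# Toy placeholder abstracts, NOT real PubMed records; GENEX and GENEY are made-up names.
obesity_abstracts = [
    "GENEX expression rises with adiposity.",
    "Obesity alters GENEX signalling and GENEY levels.",
]
breast_cancer_abstracts = [
    "GENEX promotes proliferation of tumour cells.",
    "Receptor status and GENEX expression in tumours.",
]

obesity_freq = term_frequencies(obesity_abstracts)
cancer_freq = term_frequencies(breast_cancer_abstracts)
shared = common_terms(obesity_freq, cancer_freq)
print(shared)  # ['genex']
```

In this toy input only the placeholder term that recurs in both corpora survives the threshold. On real data the same intersection would be computed over a curated gene list and a keyword list separately, and the resulting counts could then be passed to a plotting library to draw pie charts and bar graphs.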