Chinese medicine (CM) is a discipline with its own distinct methodologies and philosophical principles. The main method of treatment in CM is the herbal prescription. Typically, a number of herbs are combined to form a formula, and different formulae are prescribed for different patients. Regularities in the mixture of herbs in the prescriptions are important for both clinical treatment and the development of novel patent medicines. In this study, we analyze CM formula data using latent tree (LT) models and discover interesting regularities. These regularities are of interest to students of CM as well as to pharmaceutical companies that manufacture medicines from Chinese herbs.
Induction of common knowledge or regularities from large-scale clinical data is a vital task for Chinese medicine (CM). In this paper, we propose a data mining method, called the Symptom-Herb-Diagnosis topic (SHDT) model, to automatically extract the common relationships among symptoms, herb combinations and diagnoses from large-scale CM clinical data. The SHDT model is one of the multi-relational extensions of the latent topic model, which can acquire topic structure from discrete corpora (such as document collections) by capturing the semantic relations among words. We applied the SHDT model to discover common CM diagnosis and treatment knowledge for type 2 diabetes mellitus (T2DM) using 3,238 inpatient cases. We obtained meaningful diagnosis and treatment topics (clusters) from the data, which clinically indicated some important medical groups corresponding to comorbid diseases (e.g., heart disease and diabetic kidney disease in T2DM inpatients). The results show that manifestation sub-categories exist among T2DM patients and that these sub-categories need specific, individualised CM therapies. Furthermore, the results demonstrate that this method is helpful for generating CM clinical guidelines for T2DM from collected, structured clinical data.
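To make the latent topic model idea concrete, the following is a minimal sketch of plain latent Dirichlet allocation fitted by collapsed Gibbs sampling, with each clinical case treated as a "document" of symptom and herb tokens. It is not the SHDT model, whose multi-relational structure is not specified in this abstract. The function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the placeholder tokens in `toy_cases` are all illustrative assumptions rather than anything taken from the study.

```python
import random


def lda_gibbs(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit plain LDA by collapsed Gibbs sampling (an illustrative stand-in,
    not the SHDT model). Returns one token->probability dict per topic."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]              # tokens per (case, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]   # tokens per (topic, word)
    topic_total = [0] * n_topics                            # tokens per topic
    assign = []                                             # topic of every token

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assign.append(z_d)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed topic-word distributions; each sums to 1 over the vocabulary.
    return [
        {
            vocab[v]: (topic_word[k][v] + beta) / (topic_total[k] + n_words * beta)
            for v in range(n_words)
        }
        for k in range(n_topics)
    ]


# Hypothetical placeholder cases; the tokens carry no clinical meaning.
toy_cases = [
    ["sym_a", "sym_b", "herb_x", "herb_y"],
    ["sym_a", "herb_x", "herb_y"],
    ["sym_c", "sym_d", "herb_z", "herb_w"],
    ["sym_c", "herb_z", "herb_w"],
]

topics = lda_gibbs(toy_cases, n_topics=2)
for k, dist in enumerate(topics):
    top = sorted(dist, key=dist.get, reverse=True)[:3]
    print(f"topic {k}: {top}")
```

Each returned topic is a probability distribution over tokens, so symptoms and herbs that tend to co-occur across cases receive high probability in the same topic. A multi-relational extension of the kind the abstract describes would presumably model symptoms, herbs and diagnoses as separate token types rather than as one shared vocabulary, which this sketch does not attempt.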