Towards a smart cancer registry (#785)
Aims: Pathology notifications are regarded as the most valid source of information for confirming a cancer diagnosis for a Cancer Registry. In view of the importance of pathology data, an automatic medical text analysis system (Medtex) is being developed to extract and code, for electronic Cancer Registry use, the important clinical information embedded within pathology reports.
Methods: The system automatically scans HL7 messages received from a Queensland pathology information system and analyses the reports for terms and concepts relevant to a cancer notification. A multitude of data items for cancer notification such as primary site, histological type, stage, and other synoptic data are classified by the system. The underlying extraction and classification technology is based on SNOMED CT [1, 2]. The Queensland Cancer Registry business rules [3] and International Classification of Diseases – Oncology – Version 3 [4] have been incorporated.
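As a rough illustration of the scanning step described in the Methods, the Python sketch below pulls report text from the observation-value fields of an HL7 message and flags reports that mention a cancer term outside a negated context. It assumes pipe-delimited HL7 v2 messages with the report text in OBX-5. The term list, the negation cues, and the function names are hypothetical. Medtex itself maps text to SNOMED CT concepts and applies the Queensland Cancer Registry business rules, neither of which is reproduced here.

```python
import re

# Hypothetical term list for illustration only; the real system works with
# SNOMED CT concepts rather than a fixed keyword list.
CANCER_TERMS = ("carcinoma", "melanoma", "lymphoma", "sarcoma")

# Hypothetical negation cues, matched as whole words or phrases.
NEGATION = re.compile(r"\b(no|not|without|negative for)\b")


def report_text(hl7_message: str) -> str:
    """Join the OBX-5 (observation value) fields of an HL7 v2 message."""
    parts = []
    for segment in re.split(r"[\r\n]+", hl7_message.strip()):
        fields = segment.split("|")
        # fields[0] is the segment name, so OBX-5 sits at index 5.
        if fields[0] == "OBX" and len(fields) > 5:
            parts.append(fields[5])
    return " ".join(parts)


def is_notifiable(text: str) -> bool:
    """True if a cancer term occurs in a sentence with no negation cue before it."""
    for sentence in re.split(r"[.;]\s*", text.lower()):
        for term in CANCER_TERMS:
            idx = sentence.find(term)
            if idx == -1:
                continue
            # Only the text before the first occurrence of the term is checked.
            if not NEGATION.search(sentence[:idx]):
                return True
    return False
```

A keyword matcher of this kind only shows the shape of the task. Concept-level matching and negation handling over SNOMED CT, as described in [1], is what lets the actual system cope with synonyms and report phrasing.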
Results: The cancer notification services show that notifiable reports can be classified with a sensitivity of 98% and a specificity of 96% [5], while cancer notification items such as basis of diagnosis, histological type and grade, primary site and laterality can be extracted and coded with an overall accuracy of 80% [6]. In the case of lung cancer staging, the automated stages produced were accurate enough for the purposes of population level research and indicative staging prior to multi-disciplinary team meetings [2, 7]. Medtex also allows for detailed tumour stream synoptic reporting [8].
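For readers less familiar with the two classification measures reported above, the short Python sketch below shows how sensitivity and specificity are computed from confusion-matrix counts. The function names are illustrative, and the counts in the usage lines are made up for the example; they are not taken from the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of truly notifiable reports that were flagged as notifiable."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of truly non-notifiable reports that were left unflagged."""
    return true_neg / (true_neg + false_pos)


# Made-up counts for illustration only (not results from Medtex).
print(sensitivity(true_pos=90, false_neg=10))
print(specificity(true_neg=80, false_pos=20))
```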
Conclusions: Medtex demonstrates how medical free-text processing could enable the automation of some Cancer Registry processes. Over 70% of Cancer Registry coding resources are devoted to information acquisition. The development of a clinical decision support system to unlock information from medical free-text could significantly reduce costs arising from duplicated processes and enable improved decision support, enhancing efficiency and timeliness of cancer information for Cancer Registries.
1. A. Nguyen, M. Lawley, D. Hansen, S. Colquist, “A Simple Pipeline Application for Identifying and Negating SNOMED Clinical Terminology in Free-text,” Health Informatics Conference, pp. 188-193, Canberra, Australia, 2009.
2. A. Nguyen, M. Lawley, D. Hansen, R. Bowman, B. Clarke, E. Duhig, S. Colquist, “Symbolic rule-based classification of lung cancer stages from free-text pathology reports,” J Am Med Inform Assoc, 17(4): 440-445, 2010.
3. Queensland Cancer Registry, “Clinical Coding Manual v3.”
4. International Classification of Diseases for Oncology, 3rd Edition, WHO, 2000.
5. A. Nguyen, J. Moore, G. Zuccon, M. Lawley, S. Colquist, “Classification of pathology reports for Cancer Registry notifications,” Health Informatics Conference, pp. 150-156, Sydney, Australia, 2012.
6. A. Nguyen, J. Moore, M. Lawley, D. Hansen, S. Colquist, “Automatic Extraction of Cancer Characteristics from Free-Text Pathology Reports for Cancer Notifications,” Health Informatics Conference, pp. 117-124, Brisbane, Australia, 2011.
7. I. McCowan, D. Moore, A. Nguyen, R. Bowman, B. Clarke, E. Duhig, M. Fry, “Collection of Cancer Stage Data by Classifying Free-text Medical Reports,” J Am Med Inform Assoc, 14(6): 736-745, 2007.
8. A. Nguyen, M. Lawley, D. Hansen, S. Colquist, “Structured Pathology Reporting for Cancer from Free-text: Lung Cancer Case Study,” eJHI, 7(1): e8, 2012.