Published in: BMC Medical Informatics and Decision Making 1/2013

Open Access 01-12-2013 | Research article

A data mining approach for grouping and analyzing trajectories of care using claim data: the example of breast cancer

Authors: Nicolas Jay, Gilles Nuemi, Maryse Gadreau, Catherine Quantin


Abstract

Background

With the increasing burden of chronic diseases, analyzing and understanding trajectories of care is essential for efficient planning and fair allocation of resources. We propose an approach based on mining claim data to support the exploration of trajectories of care.

Methods

Trajectories of care for breast cancer were clustered using Formal Concept Analysis. Data were exported from the French national casemix system, which covers all inpatient admissions in the country. Patients admitted for breast cancer surgery in 2009 were selected, and each patient's trajectory of care was reconstructed from all hospitalizations occurring within one year after surgery. The main diagnoses of these hospitalizations were used to produce morbidity profiles. Cumulative hospital costs were computed for each profile.
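As a rough illustration of the idea, the sketch below enumerates the formal concepts of a tiny patient-by-diagnosis context and computes a mean cost for each group of patients. It is not the authors' pipeline: the patient identifiers, diagnosis labels, costs, and the function names `extent`, `intent`, `formal_concepts`, and `mean_cost` are all hypothetical, and the brute-force enumeration is only practical for toy data.

```python
from itertools import combinations

# Hypothetical toy context: patient id -> set of main-diagnosis groups
# observed during the follow-up year. All labels and costs are made up.
context = {
    "p1": {"chemotherapy"},
    "p2": {"chemotherapy", "radiotherapy"},
    "p3": {"chemotherapy", "radiotherapy"},
    "p4": {"palliative_care", "chemotherapy"},
}
costs = {"p1": 7000.0, "p2": 9000.0, "p3": 11000.0, "p4": 26000.0}


def extent(attrs):
    """Patients whose trajectory contains every attribute in attrs."""
    return frozenset(p for p, a in context.items() if attrs <= a)


def intent(patients):
    """Attributes shared by all the given patients."""
    sets = [context[p] for p in patients]
    if not sets:
        # By convention, the empty set of patients shares every attribute.
        return frozenset(set().union(*context.values()))
    return frozenset(set.intersection(*sets))


def formal_concepts():
    """Enumerate (extent, intent) pairs by closing every attribute subset.

    Brute force, exponential in the number of attributes: toy use only.
    """
    all_attrs = sorted(set().union(*context.values()))
    concepts = set()
    for r in range(len(all_attrs) + 1):
        for combo in combinations(all_attrs, r):
            ext = extent(frozenset(combo))
            concepts.add((ext, intent(ext)))
    return concepts


def mean_cost(patients):
    """Mean cumulative cost over a group of patients (None if empty)."""
    if not patients:
        return None
    return sum(costs[p] for p in patients) / len(patients)


if __name__ == "__main__":
    for ext, itt in sorted(formal_concepts(), key=lambda c: -len(c[0])):
        print(sorted(itt), sorted(ext), mean_cost(ext))
```

Each concept pairs a maximal set of patients with the diagnoses they all share, so a concept's intent can be read as a morbidity profile and its extent as the patients who fit it. On real claim data a dedicated closed-itemset or lattice algorithm would be needed in place of this enumeration.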

Results

57,552 patients were automatically grouped into 19 classes. The resulting profiles were clinically meaningful and economically relevant. The mean cost per trajectory was 9,600€. Severe conditions were generally associated with higher costs. The lowest costs (6,957€) were observed for patients with in situ carcinoma of the breast, the highest for patients hospitalized for palliative care (26,139€).

Conclusions

Formal Concept Analysis can be applied to claim data to produce an automatic classification of care trajectories. This flexible approach takes advantage of routinely collected data and can be used to set up cost-of-illness studies.
Metadata
Title
A data mining approach for grouping and analyzing trajectories of care using claim data: the example of breast cancer
Authors
Nicolas Jay
Gilles Nuemi
Maryse Gadreau
Catherine Quantin
Publication date
01-12-2013
Publisher
BioMed Central
Published in
BMC Medical Informatics and Decision Making / Issue 1/2013
Electronic ISSN: 1472-6947
DOI
https://doi.org/10.1186/1472-6947-13-130
