Background: The volume of health-related user-created content, especially hospital-related questions and answers in online health communities, has rapidly increased. Patients and caregivers participate in online community activities to share their experiences, exchange information, and ask about recommended or discredited hospitals. However, there is little research on how to identify hospital service quality automatically from the online communities. In the past, in-depth analysis of hospitals has used random sampling surveys. However, such surveys are becoming impractical owing to the rapidly increasing volume of online data and the diverse analysis requirements of related stakeholders.
Objective: As a solution for utilizing large-scale health-related information, we propose a novel approach to identify hospital service quality factors and overtime trends automatically from online health communities, especially hospital-related questions and answers.
Methods: We defined social media-based key quality factors for hospitals. In addition, we developed text mining techniques to detect such factors that frequently occur in online health communities. After detecting these factors that represent qualitative aspects of hospitals, we applied a sentiment analysis to recognize the types of recommendations in messages posted within online health communities. Korea's two biggest online portals were used to test the effectiveness of detection of social media-based key quality factors for hospitals.
Results: To evaluate the proposed text mining techniques, we performed manual evaluations on the extraction and classification results, such as hospital name, service quality factors, and recommendation types using a random sample of messages (ie, 5.44% (9450/173,748) of the total messages). Service quality factor detection and hospital name extraction achieved average F1 scores of 91% and 78%, respectively. In terms of recommendation classification, performance (ie, precision) is 78% on average. Extraction and classification performance still has room for improvement, but the extraction results are applicable to more detailed analysis. Further analysis of the extracted information reveals that there are differences in the details of social media-based key quality factors for hospitals according to the regions in Korea, and the patterns of change seem to accurately reflect social events (eg, influenza epidemics).
Conclusions: These findings could be used to provide timely information to caregivers, hospital officials, and medical officials for health care policies.