OBJECTIVE: Symptoms are an important class of phenotypes for managing and controlling outbreaks of acute infectious diseases such as COVID-19. Although symptom clusters and their time series are considered promising predictors of patient prognosis, detailed symptom-based subtypes and their progression patterns in relation to the prognosis of COVID-19 patients have yet to be characterized. This study aims to identify patient subtypes and their progression patterns with distinct outcomes and prognoses.

METHODS: This study included 14,139 longitudinal electronic medical records (EMRs) from four hospitals in Hubei Province, China, covering 2,683 individuals in the early stage of the COVID-19 pandemic. A deep representation learning model was developed to derive the symptom profiles of patients, and the K-means clustering algorithm was used to group these profiles into distinct subtypes. Symptom progression patterns were then identified by comparing the subtypes assigned to each patient at admission and at discharge. Finally, Fisher's exact test was used to identify the significant clinical entities for each subtype.

RESULTS: Three distinct patient subtypes with specific symptoms and prognoses were identified. Subtype 0 comprises 44.2% of cases and is characterized by poor appetite, fatigue, and sleep disorders; Subtype 1 comprises 25.6% of cases and is characterized by confusion, cough with bloody sputum, encopresis, and urinary incontinence; Subtype 2 comprises 30.2% of cases and is characterized by dry cough and rhinorrhea. The three subtypes differ significantly in prognosis, with mortality rates of 4.72%, 8.59%, and 0.25%, respectively. Furthermore, the symptom cluster progression patterns showed that Subtype 0 patients who present with dark yellow urine, chest pain, and related symptoms at admission have an elevated risk of progressing to a more severe subtype with poor outcome, whereas those presenting with nausea and vomiting tend to move to the milder subtype.

CONCLUSION: This study proposes a clinically meaningful approach that uses deep representation learning and real-world EMR data containing symptom phenotypes to identify COVID-19 subtypes and their progression patterns. The results could help improve the precise stratification and management of acute infectious diseases.
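
The subtype characterization step described in METHODS (testing which clinical entities are significant for each subtype) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a one-sided Fisher's exact test on a 2x2 table (symptom present or absent, inside or outside the subtype), a 0.05 threshold, and no multiple-testing correction, none of which the abstract specifies. The function names `fisher_exact_greater` and `enriched_symptoms` and the toy records are hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test on the 2x2 table [[a, b], [c, d]].

    a: patients in the subtype with the symptom
    b: patients in the subtype without the symptom
    c: patients outside the subtype with the symptom
    d: patients outside the subtype without the symptom
    Returns P(X >= a) under the hypergeometric null, i.e. the probability
    of seeing at least this much enrichment in the subtype by chance.
    """
    subtype_size = a + b
    with_symptom = a + c
    total = a + b + c + d
    upper = min(subtype_size, with_symptom)
    favourable = sum(
        comb(with_symptom, k) * comb(total - with_symptom, subtype_size - k)
        for k in range(a, upper + 1)
    )
    return favourable / comb(total, subtype_size)


def enriched_symptoms(records, subtype, alpha=0.05):
    """Return {symptom: p_value} for symptoms enriched in `subtype`.

    records: list of (subtype_label, set_of_symptoms) pairs (assumed format).
    alpha: significance threshold (assumed; uncorrected for multiple tests).
    """
    inside = [s for label, s in records if label == subtype]
    outside = [s for label, s in records if label != subtype]
    candidates = set().union(*inside) if inside else set()
    hits = {}
    for symptom in sorted(candidates):
        a = sum(symptom in s for s in inside)
        c = sum(symptom in s for s in outside)
        p = fisher_exact_greater(a, len(inside) - a, c, len(outside) - c)
        if p < alpha:
            hits[symptom] = p
    return hits


# Hypothetical toy data, not taken from the study.
toy_records = (
    [(2, {"dry cough", "fatigue"}) for _ in range(10)]
    + [(0, {"fatigue"}) for _ in range(10)]
)
print(enriched_symptoms(toy_records, subtype=2))
```

In this toy data only "dry cough" is reported for subtype 2, because "fatigue" occurs equally often inside and outside the subtype. A real analysis of thousands of records would normally use a statistics library and correct for the number of symptoms tested.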