Abstract

Background: Understanding cellular behavior within the context of the tumor-immune microenvironment is essential to developing next-generation cancer therapies and advancing precision medicine. The inherent challenges of this problem reflect the complexity of human biology: patient and tissue heterogeneity, a multitude of interacting signaling pathways, dynamic short- and long-range interactions between the tumor and the immune system, and the intrinsic limitations of measurement techniques. Machine learning foundation models trained on multimodal patient data present an opportunity to grapple with this complexity and push the field forward by leveraging recent advances in spatial biology.

Method: A custom multimodal transformer was trained on 1399 primary resections from lung cancer patients profiled via H&E, CosMx spatial transcriptomics, whole-exome sequencing, and a custom multiplex immunofluorescence panel. To our knowledge, this is the largest extant spatial transcriptomics dataset, comprising more than 40 million cells. The transformer was trained via self-supervised learning to predict expression of each CosMx panel gene for a single cell, conditioned on both spatially proximal and patient-level data from all modalities. This training task induces the model to learn fundamental rules that govern cell state and cell-cell interactions within the context of disease. The resulting model, Celleporter, can generate gene expression in a “virtual” cell at a particular location within real or simulated patient tissue. Counterfactual simulations with modified patient data can be used to predict the effects of genetic alterations, gene expression changes, or external interventions on the tumor-immune microenvironment.

Result: Celleporter accurately predicted spatial gene expression patterns from sparsely sampled data, resolving the limitations of traditional experimental approaches.
Virtual cell simulations reproduced distinct biological states, such as cytotoxic and naïve transitions of CD8+ T cells within and outside tumor regions, and reproduced foundational immunology, including the relationship between MHC-I and T cell activation. Comparative analyses across patient cohorts identified immune-suppressive mechanisms in STK11-mutant tumors resistant to immunotherapy. Perturbation simulations highlighted therapeutic targets predicted to restore cytotoxic activity in STK11-mutant tumor microenvironments.

Conclusion: This study demonstrates that a self-supervised foundation model trained on large-scale multimodal patient data can learn fundamental aspects of cancer immunology and accurately reproduce the impact of the tumor-immune microenvironment on cell state in a patient-specific manner. This flexible system for interrogating cell and tissue biology has direct application to patient stratification and target discovery.

Citation Format: Yubin Xie, Eshed Margalit, Tyler Van Hensbergen, Dexter Antonio, Jake Schmidt, Yu Phoebe Guo, Jérémie Decalf, Maxime Dhainaut, Lucas Cavalcante, Hargita Kaplan, Rodney Collins, Francis Fernandez, Rob Schiemann, Eric Siefkas, Michela Meister, Joy Tea, Carl Ebeling, Anastasia Mavropoulos, Nicole Snell, Shafique Virani, Ron Alfa, Lacey Padron, Jacob Rinaldi, Daniel Bear. Celleporter: a foundation model of cell and tissue biology with application to patient stratification and target discovery [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2025; Part 1 (Regular Abstracts); 2025 Apr 25-30; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2025;85(8_Suppl_1):Abstract nr 3652.
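
Illustrative sketch: The self-supervised task described under Method (hide one cell's expression, then predict it from spatially proximal cells) can be illustrated with a toy example. Everything here is an assumption made for illustration and not the Celleporter implementation: the `Cell` tuple layout, the function names, the choice of k nearest neighbors as the spatial context, and the mean-of-neighbors predictor, which stands in for the transformer. Patient-level conditioning and the other modalities are omitted.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical layout: (x, y, expression vector over panel genes).
Cell = Tuple[float, float, List[float]]


def nearest_neighbors(cells: List[Cell], target_idx: int, k: int) -> List[int]:
    """Indices of the k cells spatially closest to the target, excluding it."""
    tx, ty, _ = cells[target_idx]
    others = [i for i in range(len(cells)) if i != target_idx]
    others.sort(key=lambda i: math.hypot(cells[i][0] - tx, cells[i][1] - ty))
    return others[:k]


def make_masked_example(cells: List[Cell], target_idx: int, k: int):
    """Hide the target cell's expression; keep its location and neighbors as context."""
    tx, ty, target_expr = cells[target_idx]
    context = [cells[i] for i in nearest_neighbors(cells, target_idx, k)]
    example: Dict[str, object] = {"query_xy": (tx, ty), "context": context}
    return example, target_expr


def mean_neighbor_baseline(example) -> List[float]:
    """Stand-in predictor (NOT the transformer): average expression of context cells."""
    context = example["context"]
    n_genes = len(context[0][2])
    return [sum(c[2][g] for c in context) / len(context) for g in range(n_genes)]


def mse(pred: List[float], target: List[float]) -> float:
    """Mean squared error between predicted and held-out expression."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


# Four toy cells with two genes each; the last cell is spatially distant.
cells: List[Cell] = [
    (0.0, 0.0, [1.0, 0.0]),
    (1.0, 0.0, [3.0, 0.0]),
    (0.0, 1.0, [2.0, 2.0]),
    (10.0, 10.0, [9.0, 9.0]),
]

example, target = make_masked_example(cells, target_idx=0, k=2)
prediction = mean_neighbor_baseline(example)
loss = mse(prediction, target)
print(prediction, loss)
```

In a trained model, the stand-in predictor would be replaced by a learned function of the context, and the loss would be minimized over many masked cells. A counterfactual query in this toy setting amounts to editing the context (for example, changing a neighbor's expression vector) and re-running the prediction at the same location.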