About Me
Welcome to my homepage! I am Bojian Hou (also written Bo-Jian Hou), an AI Research Scientist at Meta focusing on large-scale machine learning and artificial intelligence systems. My current work centers on ranking, recommendation, and retrieval, as well as large language models (LLMs) and multimodal generative AI.
Before joining Meta, I was a postdoctoral researcher in the Department of Biostatistics, Epidemiology and Informatics at the University of Pennsylvania, where I had the privilege of being advised by Prof. Li Shen.
I received my B.Sc. and Ph.D. degrees from the Department of Computer Science at Nanjing University in 2014 and 2020, respectively. I was fortunate to be supervised by Prof. Zhi-Hua Zhou in the LAMDA Group.
Recent Highlights
- 10-23-2025: Our paper "IRIS: Interpretable Risk Clustering Intelligence for Survival Analysis" was accepted by IEEE BigData 2025.
- 10-22-2025: Our paper "Advanced Topic Modeling with Large Language Models: Analyzing Social Media Content from Dementia Caregivers" was accepted by Innovation in Aging.
- 09-18-2025: Our paper "Stochastic Regret Guarantees for Online Zeroth- and First-Order Bilevel Optimization" was accepted by NeurIPS 2025.
- 08-27-2025: Our paper "Fair CCA for Fair Representation Learning: An ADNI Study" was accepted by ACM BCB 2025.
- 08-27-2025: Our paper "Enabling Few-Shot Alzheimer's Disease Diagnosis on Biomarker Data with Tabular LLMs" was accepted by ACM BCB 2025. (This paper won the Best Paper Award!)
- 07-28-2025: I joined Meta as an AI Research Scientist, focusing on large-scale machine learning and artificial intelligence systems.
- 06-27-2025: I am organizing the Vision-Based AI for Digital Health: From Pixels to Practice (VADH'25) workshop at ICCV 2025.
- 05-15-2025: Our paper "MentalChat16K: A Benchmark Dataset for Conversational Mental Health Assistance" was accepted by KDD 2025 as an oral presentation.
- 05-01-2025: Our paper "Restoring Calibration for Aligned Large Language Models: A Calibration-Aware Fine-Tuning Approach" was accepted by ICML 2025.
- 01-22-2025: Our paper "Fine-Tuning Attention Modules Only: Enhancing Weight Disentanglement in Task Arithmetic" was accepted by ICLR 2025.
- 11-27-2024: Our paper "Knowledge-Driven Feature Selection and Engineering for Genotype Data with Large Language Models" was accepted by AMIA 2025 Informatics Summit.
- 11-27-2024: Our paper "Leveraging Social Determinants of Health in Alzheimer's Research Using LLM-Augmented Literature Mining and Knowledge Graphs" was accepted by AMIA 2025 Informatics Summit.
- 11-27-2024: Our paper "Understanding the Clinical Modalities Important in NeuroDegenerative Disorders, Alzheimer's Disease, and Risk of Patient Injury Using Machine Learning and Survival Analysis" was accepted by AMIA 2025 Informatics Summit.
- 10-26-2024: Our paper "SEFD: Semantic-Enhanced Framework for Detecting LLM-Generated Text" was accepted by 2024 IEEE International Conference on Big Data (IEEE BigData 2024).
- 10-13-2024: Our paper "Manifoldron: Direct Space Partition via Manifold Discovery" was accepted by IEEE Transactions on Neural Networks and Learning Systems (TNNLS).
- 09-25-2024: Our paper "Fairness-Aware Estimation of Graphical Models" was accepted by NeurIPS 2024.
- 09-21-2024: Our paper "MG-TCCA: Tensor Canonical Correlation Analysis across Multiple Groups" was accepted by IEEE/ACM Transactions on Computational Biology and Bioinformatics.
- 09-20-2024: Our paper "DALK: Dynamic Co-Augmentation of LLMs and KG to answer Alzheimer's Disease Questions with Scientific Literature" was accepted by EMNLP 2024.
- 09-10-2024: Our paper "Uncovering Important Diagnostic Features for Alzheimer's, Parkinson's and Other Dementias Using Interpretable Association Mining Methods" was accepted by Pacific Symposium on Biocomputing (PSB).
- 06-29-2024: Our paper "Analyzing Dementia Caregivers' Experiences on Twitter: A Term-Weighted Topic Modeling Approach" was accepted by AMIA 2024 Annual Symposium.
- 06-29-2024: Our paper "Ensuring Fairness in Detecting Mild Cognitive Impairment with MRI" was accepted by AMIA 2024 Annual Symposium. (This paper won the Distinguished Paper Award!)
- 06-29-2024: Our paper "MentalGPT: Harnessing AI for compassionate mental health support" was accepted by AMIA 2024 Annual Symposium.
- 06-03-2024: Our paper "Interpretable Deep Clustering Survival Machines for Alzheimer's Disease Subtype Discovery" was accepted by Medical Image Analysis.
- 04-24-2024: Our paper "Quadratic Neuron-empowered Heterogeneous Autoencoder for Unsupervised Anomaly Detection" was accepted by IEEE Transactions on Artificial Intelligence (TAI).
- 01-19-2024: Our paper "Online Bilevel Optimization: Regret Analysis of Online Alternating Gradient Methods" was accepted by AISTATS'24.
- 12-21-2023: Our paper "PFERM: A Fair Empirical Risk Minimization Approach with Prior Knowledge" was accepted by AMIA 2024 Informatics Summit.
- 12-21-2023: Our paper "Interpretability Study for Long Interview Transcripts from Behavior Intervention Sessions for Family Caregivers of Dementia Patients" was accepted by AMIA 2024 Informatics Summit.
- 12-21-2023: Our paper "Cluster Analysis of Cortical Amyloid Burden for Identifying Imaging-driven Subtypes in Mild Cognitive Impairment" was accepted by AMIA 2024 Informatics Summit.
- 10-10-2023: Our paper "Online Learning from Evolving Feature Spaces with Deep Variational Models" was accepted by IEEE Transactions on Knowledge and Data Engineering (TKDE).
- 09-21-2023: Our paper "Fair Canonical Correlation Analysis" was accepted by NeurIPS'23.
- 07-24-2023: Our paper "Multi-Group Tensor Canonical Correlation Analysis" was accepted by ACM BCB'23. (This paper won the Best Paper Award!)
- 05-08-2023: Our paper "Fairness-Aware Class Imbalanced Learning on Multiple Subgroups" was accepted by UAI'23.
- 04-18-2023: Our paper "Enhancing thoracic disease detection using chest X-rays from PubMed Central Open Access" was accepted by Computers in Biology and Medicine.
- 01-22-2023: Our paper "Deep Clustering Survival Machines with Interpretable Expert Distributions" was accepted by ISBI'23.
- 12-16-2022: Our paper "Evaluate underdiagnosis and overdiagnosis bias of deep learning model on primary open-angle glaucoma diagnosis in under-served populations" was accepted by AMIA 2023 Informatics Summit.
- 06-29-2022: Our paper "Online Deep Learning from Doubly-Streaming Data" was accepted by ACMMM'22.
- 06-16-2022: Our paper "Automated diagnosing primary open-angle glaucoma from fundus image by simulating human's grading with deep learning" was accepted by Scientific Reports.
- 11-22-2021: Won the Excellent Doctoral Dissertation Award of Jiangsu Province.
- 09-13-2021: Won the Excellent Doctoral Dissertation Award of Nanjing University.
- 08-31-2021: Our paper "Online Learning in Variable Feature Spaces with Mixed Data" was accepted by ICDM'21.
- 04-01-2021: Our paper "Prediction with Unpredictable Feature Evolution" was accepted by IEEE Transactions on Neural Networks and Learning Systems.
- 12-24-2020: Won the JSAI Excellent Doctoral Dissertation Award.
- 12-02-2020: Our paper "Storage Fit Learning with Feature Evolvable Streams" was accepted by AAAI'21.
- 06-18-2020: Won the CS Excellent Doctoral Dissertation Award of Nanjing University.
- 05-27-2020: I successfully defended my Ph.D. dissertation.
- 04-21-2020: Won the Outstanding Graduate Student Award of Nanjing University.
Research Interests
As a machine learning researcher, I am interested in both the theoretical foundations and real-world applications of AI. My research spans:
- Interpretability: studying and improving transparency of black-box machine learning models
- Fairness Learning: developing fair and unbiased machine learning algorithms
- Feature Evolvable Learning: learning with dynamic and shifting feature spaces
- Large Language Models: calibration, uncertainty estimation, and response control
- Multimodal Learning: vision-language modeling and cross-modal retrieval
- Recommendation Systems: multi-stage ranking, retrieval, and user modeling
- Semi-Supervised Learning: learning models from both labeled and unlabeled data
- Online Learning: learning models continuously from streaming data
- Biomedical Data Mining: developing machine learning methods for Alzheimer’s disease, dementia, and mental health
I am passionate about bridging fundamental ML research with real-world deployment, and designing AI systems that are not only technically rigorous, but also trustworthy and socially responsible.
