Mahdi Erfanian
Ph.D. Candidate at UIC
5453, Computer Design Research and Learning Center (CDRLC)
850 W Taylor Street,
Chicago, IL 60607
I’m Mahdi Erfanian, a Ph.D. candidate in Computer Science at the University of Illinois Chicago, where I am a member of the IndexLab under the supervision of Dr. Abolfazl Asudeh. I received my B.Sc. in Computer Engineering from Sharif University of Technology. My research spans Multimodal Data Management, Generative AI, Algorithmic Fairness, and Foundation Models.
I am particularly interested in employing foundation models, including large language models, to address challenges in data management, such as mitigating bias in training data and enhancing multimodal data retrieval through synthetic data generation. My work has led to systems that outperform state-of-the-art baselines like OpenAI’s CLIP by 200% in mean average precision on complex natural language queries. Additionally, I am passionate about Algorithm Design, with a focus on optimizing both fairness and efficiency in data-driven systems.
I am seeking a 2026 Research Scientist Internship to build scalable AI products in areas including multimodal retrieval, foundation models, and vector databases.
news
| Date | News |
|---|---|
| Oct 15, 2025 | Excited to serve as a reviewer and PC member for top-tier venues in 2024-2025! 🎯 Including ICLR 2025, NeurIPS 2025 DynaFront Workshop, CIKM 2025 (PC Member), TKDE 2025, and more. |
| Oct 15, 2025 | 🎓 Successfully passed my Ph.D. preliminary exam! My thesis proposal “Generative AI for Multimodal Data Management” has been approved. Excited to continue advancing research in this cutting-edge area! 🚀 |
| Jun 01, 2025 | Our paper “An Efficient Matrix Multiplication Algorithm for Accelerating Inference in Binary and Ternary Neural Networks” has been accepted to ICML 2025! 🎉 This work achieves 24x speedup over NumPy baseline and 2.5x improvement on Quantized LLMs. |
| Nov 01, 2024 | Our work on optimized inference for binary and ternary neural networks is now available on arXiv! This research achieves significant inference speedups for quantized LLMs. |
| Aug 26, 2024 | Chameleon (full research paper) and FairEM360 (demo paper) have been published and presented at VLDB 2024. |
selected publications
- Task-aware Data Augmentation using Generative AI for Group-distributional Robustness. arXiv preprint arXiv:TBD, 2025. Manuscript submitted for publication in SIGMOD 2026.
- Needle: A Generative AI-Powered Multi-modal Database for Answering Complex Natural Language Queries. arXiv preprint arXiv:2412.00639, 2025. Manuscript submitted for publication in ICLR 2025.
- An Efficient Matrix Multiplication Algorithm for Accelerating Inference in Binary and Ternary Neural Networks. In The 2025 International Conference on Machine Learning, 2024. arXiv preprint arXiv:2411.06360.
- Chameleon: Foundation Models for Fairness-Aware Multi-Modal Data Augmentation to Enhance Coverage of Minorities. Proceedings of the VLDB Endowment, 2024.
- Coverage-based Data-centric Approaches for Responsible and Trustworthy AI. Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, 2024.