Mahdi Erfanian

5453, Computer Design Research and Learning Center (CDRLC)

850 W Taylor Street,

Chicago, IL 60607

I’m Mahdi Erfanian, a Ph.D. candidate in Computer Science at the University of Illinois Chicago, where I am a member of the IndexLab under the supervision of Dr. Abolfazl Asudeh. I received my B.Sc. in Computer Engineering from Sharif University of Technology. My research spans Multimodal Data Management, Generative AI, Algorithmic Fairness, and Foundation Models.

I am particularly interested in employing foundation models, including large language models, to address challenges in data management, such as mitigating bias in training data and enhancing multimodal data retrieval through synthetic data generation. My work has led to systems that outperform state-of-the-art baselines like OpenAI’s CLIP by 200% in mean average precision on complex natural language queries. Additionally, I am passionate about Algorithm Design, with a focus on optimizing both fairness and efficiency in data-driven systems.

I am seeking a 2026 Research Scientist Internship to build scalable AI products in areas including multimodal retrieval, foundation models, and vector databases.

news

Oct 15, 2025	Excited to serve as a reviewer and PC member for top-tier venues in 2024-2025, including ICLR 2025, the NeurIPS 2025 DynaFront Workshop, CIKM 2025 (PC Member), TKDE 2025, and more! 🎯
Oct 15, 2025	🎓 Successfully passed my Ph.D. preliminary exam! My thesis proposal “Generative AI for Multimodal Data Management” has been approved. Excited to continue advancing research in this cutting-edge area! 🚀
Jun 01, 2025	Our paper “An Efficient Matrix Multiplication Algorithm for Accelerating Inference in Binary and Ternary Neural Networks” has been accepted to ICML 2025! 🎉 This work achieves a 24x speedup over a NumPy baseline and a 2.5x improvement on quantized LLMs.
Nov 01, 2024	Our work on optimized inference for binary and ternary neural networks is now available on arXiv! It achieves significant inference speedups for quantized LLMs.
Aug 26, 2024	Chameleon (full research paper) and FairEM360 (demo paper) have been published and presented at VLDB 2024.

selected publications

Needle: A Generative AI-Powered Multi-modal Database for Answering Complex Natural Language Queries

Mahdi Erfanian, Mohsen Dehghankar, and Abolfazl Asudeh

arXiv preprint arXiv:2412.00639, 2025

Manuscript submitted to ICLR 2025

An Efficient Matrix Multiplication Algorithm for Accelerating Inference in Binary and Ternary Neural Networks

Mohsen Dehghankar, Mahdi Erfanian, and Abolfazl Asudeh

In The 2025 International Conference on Machine Learning, 2025

arXiv preprint arXiv:2411.06360

Chameleon: Foundation Models for Fairness-Aware Multi-Modal Data Augmentation to Enhance Coverage of Minorities

Mahdi Erfanian, H. V. Jagadish, and Abolfazl Asudeh

Proceedings of the VLDB Endowment, 2024

FairEM360: A Suite for Responsible Entity Matching

Nima Shahbazi, Mahdi Erfanian, Abolfazl Asudeh, and 2 more authors

Proceedings of the VLDB Endowment, 2024

Coverage-based Data-centric Approaches for Responsible and Trustworthy AI

Nima Shahbazi, Mahdi Erfanian, and Abolfazl Asudeh

Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, 2024