cv

My academic CV and research experience.

General Information

Full Name Mahdi Erfanian
Location Chicago, Illinois, USA
Languages English, Persian (Farsi)

Education

  • December 2022 - Present
    Direct Ph.D., Computer Science
    University of Illinois Chicago, Chicago, USA
    • Thesis: Generative AI for Multimodal Data Management
    • M.S. in Computer Science (awarded en route to Ph.D.), Fall 2025
    • GPA: 4.00/4.00
  • September 2017 - June 2022
    Bachelor of Science, Computer Engineering
    Sharif University of Technology, Tehran, Iran
    • GPA: 3.70/4.00
    • Ranked 60th nationwide among ~150,000 participants in the Iranian University Entrance Exam (Konkour)

Research Experience

  • Jan 2026 - Present
    Ph.D. Research Intern
    Microsoft (CodeAI team), Redmond, WA (Remote)
    • Researching hallucination mitigation in LLMs and generative AI used as code agents (Copilot, Codex)
    • Fine-tuning LLMs using curated datasets to reduce hallucinations
  • June 2023 - Present
    Research Assistant
    IndexLab, University of Illinois Chicago
    • Conducting research under the supervision of Dr. Abolfazl Asudeh on databases, LLMs, and responsible data management
    • Developed Needle, an efficient text-to-image retrieval framework that outperformed OpenAI's CLIP by 200% in mean average precision on complex natural language queries
    • Implemented RSR, an efficient binary/ternary matrix multiplication method that speeds up inference by 24x over the standard NumPy baseline and by 2.5x on quantized LLMs
    • Developed Chameleon, a fairness-aware data augmentation method that improved model accuracy on under-represented groups by 22% on average on the FERETDB benchmark
    • Developed FairEM360, a framework for auditing and mitigating bias in entity matching
  • December 2022 - August 2023
    Research Assistant
    Dreese Lab, The Ohio State University
    • Researched graph and pattern mining under the supervision of Dr. Srinivasan Parthasarathy
    • Developed SYSML+, an enhanced stylometry framework that improved author identification accuracy by 3% over the baseline SYSML system

Industry Experience

  • March 2022 - August 2022
    Software Engineer
    Software Engineering Lab, Sharif University of Technology
    • B.Sc. Project: Architected and built a CI/CD pipeline for a containerized, microservice-based web application
    • Migrated a stock market application's deployment from a monolithic to a distributed architecture, improving scalability and deployment speed
    • Decreased production Docker image size by over 95% compared to traditional build methods, enabling faster deployments
  • September 2019 - March 2022
    Data Engineer
    Divar Corp.
    • Divar is Iran's largest classifieds platform with over 40 million active users and 200 TB of data
    • Engineered and maintained the core data pipeline (Airflow, Spark, S3) processing over 200 TB of data, improving data availability for a team of 40+ data analysts
    • Deployed a distributed JupyterHub on Kubernetes, giving the data science team on-demand, scalable analysis environments and reducing experiment setup time
    • Automated ETL deployment workflows using GitLab CI/CD, reducing manual deployment time from hours to minutes and eliminating release errors
  • August 2019 - November 2019
    Software Engineering Intern
    Rahnema College
    • Designed and developed an e-commerce auction application using a microservice architecture (Spring Boot, ReactJS, Docker)
    • Implemented a push notification service from scratch using Spring Boot and WebSocket

Professional Activities

  • 2025 - 2026
    Invited Lectures/Talks
    University of Illinois Chicago
    • CS516. Responsible Data Science and Algorithmic Fairness: Guest Lecture on Generative AI and Its Applications in Fairness, UIC (2026)
    • CS418. Introduction to Data Science: Guest Lecture on Generative AI and Multimodal Data Management, UIC (2025)
  • 2018 - Present
    Teaching Assistant
    University of Illinois Chicago, The Ohio State University, Sharif University of Technology
    • UIC: Databases (Spring 24, Fall 23)
    • OSU: Introduction to Java (Summer 23, Spring 23)
    • SUT: Computer Networks (Spring 21), Computer Architecture (Spring 21, Fall 20), Computer Structure and Languages (Fall 20), Systems Analysis and Design (Fall 20), Design of Algorithms (Spring 20), Fundamentals of Programming (Fall 18)
  • 2024 - Present
    Program Committee & Reviewing
    Various
    • PC Member: WWW 2026, CIKM 2026/2025/2024, KDD 2026, NeurIPS 2026
    • Reviewer: ICLR 2026, NeurIPS 2025 (DynaFront), TKDE 2025/2024, PETRA 2024 (ETHER-AI)

Research Interests

  • Multimodal Data Management
    • Text-to-image retrieval systems
    • Foundation models for data augmentation
    • Vector databases and information retrieval
  • Generative AI and LLMs
    • Large language model applications
    • Synthetic data generation
    • Algorithmic fairness in AI systems
  • Algorithmics and Distributed Systems
    • Efficient matrix multiplication algorithms
    • Binary and ternary neural networks
    • Scalable data processing systems

Notable Courses

  • Cloud Computing (A/Grad)
  • Advanced Algorithms (A/Grad)
  • Introduction to Network Science (A/Grad)
  • Algorithms I (A/Grad)
  • Software Engineering (4/4)
  • Modern Information Retrieval (4/4)
  • Functional Programming (4/4)
  • Design of Algorithms (4/4)
  • Data Structures and Algorithms (4/4)
  • Databases Design (4/4)