Rami Huu Nguyen | Kenichi Maeda | Mahsa Geshvadi | Daniel Haehn
Presented at IEEE Pacific Visualization Symposium (PacificVis) 2025 | 👉 View our Paper
Multimodal Large Language Models (MLLMs) have remarkably progressed in analyzing and understanding images. Despite these advancements, accurately regressing values in charts remains an underexplored area for MLLMs. For visualization, how do MLLMs perform when applied to graphical perception tasks?
Our paper investigates this question by reproducing Cleveland and McGill's seminal 1984 experiment and comparing it against human task performance. Our study primarily evaluates fine-tuned and pretrained models and zero-shot prompting to determine if they closely match human graphical perception.
Our findings highlight that MLLMs outperform human task performance in some cases but not in others. We highlight the results of all experiments to foster an understanding of where MLLMs succeed and fail when applied to data visualization.
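As a rough illustration of the kind of setup described above, here is a minimal Python sketch of one zero-shot graphical-perception trial and its scoring. It is not the authors' pipeline (see the linked repository for that): the `query_mllm` hook, the prompt wording, and the dummy model are assumptions for illustration only, while the scoring follows Cleveland and McGill's log absolute error, which reproductions of the 1984 experiment commonly aggregate as the midmean (MLAE) so results can be compared with the original human data.

```python
# A minimal sketch, not the authors' pipeline: score one zero-shot
# graphical-perception trial. `query_mllm` is a hypothetical placeholder
# for whatever MLLM backend is used; the error metric is Cleveland and
# McGill's log absolute error, summarized with the midmean (MLAE).
import numpy as np


def log_absolute_error(estimate: float, truth: float) -> float:
    """Cleveland-McGill error for one judgment: log2(|estimate - truth| + 1/8)."""
    return float(np.log2(abs(estimate - truth) + 0.125))


def midmean(errors) -> float:
    """Midmean (mean of the middle 50% of sorted values) over per-trial errors."""
    v = np.sort(np.asarray(errors, dtype=float))
    lo, hi = int(round(0.25 * len(v))), int(round(0.75 * len(v)))
    return float(v[lo:hi].mean())


def run_trial(chart_image, true_percent: float, query_mllm) -> float:
    """Ask the model what percentage the smaller marked value is of the larger."""
    prompt = ("Two values are marked in this chart. What percentage is the "
              "smaller value of the larger one? Reply with a number from 0 to 100.")
    answer = query_mllm(image=chart_image, prompt=prompt)  # hypothetical backend call
    return log_absolute_error(float(answer), true_percent)


if __name__ == "__main__":
    # Stand-in for a real MLLM: a noisy oracle that "reads" the true percent
    # passed in place of a chart image, used only to exercise the scoring.
    rng = np.random.default_rng(0)
    truths = rng.uniform(10.0, 90.0, size=20)
    noisy_model = lambda image, prompt: float(image) + rng.normal(0.0, 5.0)
    errors = [run_trial(t, t, noisy_model) for t in truths]
    print(f"MLAE over {len(errors)} trials: {midmean(errors):.2f}")
```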
Connect with us! We are open to collaboration at rami@mpsych.org.
We would like to thank:
We have also learned a great deal from this project, spanning #artificialintelligence, #programming, #datavisualization, #imageprocessing, #machinelearning, #computergraphics, and #machinepsychology, and applied these insights to our study.
Most importantly, we all contributed to this project through the CS460 - Computer Graphics course and our lab, Machine Psychology, at the University of Massachusetts Boston. Course website: CS460.org | Lab: Machine Psychology.
@article{nguyen2025evaluating,
  title={Evaluating 'Graphical Perception' with Multimodal LLMs},
  author={Nguyen, Rami Huu and Maeda, Kenichi and Geshvadi, Mahsa and Haehn, Daniel},
  abstract={Multimodal Large Language Models (MLLMs) have remarkably progressed in analyzing and understanding images. Despite these advancements, accurately regressing values in charts remains an underexplored area for MLLMs. For visualization, how do MLLMs perform when applied to graphical perception tasks? Our paper investigates this question by reproducing Cleveland and McGill's seminal 1984 experiment and comparing it against human task performance. Our study primarily evaluates fine-tuned and pretrained models and zero-shot prompting to determine if they closely match human graphical perception. Our findings highlight that MLLMs outperform human task performance in some cases but not in others. We highlight the results of all experiments to foster an understanding of where MLLMs succeed and fail when applied to data visualization.},
  journal={IEEE Pacific Visualization (PacificVis)},
  year={2025},
  code={https://github.com/raminguyen/LLMP2},
  data={https://github.com/raminguyen/LLMP2},
  supplemental={https://mpsych.org/papers/nguyen2025_supplemental.pdf},
  shortvenue={PacificVis 2025}
}