Rami Huu Nguyen | Kenichi Maeda | Mahsa Geshvadi | Daniel Haehn
Presented at IEEE Pacific Visualization Symposium (PacificVis) 2025 | 👉 View our Paper
Multimodal Large Language Models (MLLMs) have remarkably progressed in analyzing and understanding images. Despite these advancements, accurately regressing values in charts remains an underexplored area for MLLMs. For visualization, how do MLLMs perform when applied to graphical perception tasks?
Our paper investigates this question by reproducing Cleveland and McGill's seminal 1984 experiment and comparing it against human task performance. Our study primarily evaluates fine-tuned and pretrained models and zero-shot prompting to determine if they closely match human graphical perception.
Our findings highlight that MLLMs outperform human task performance in some cases but not in others. We highlight the results of all experiments to foster an understanding of where MLLMs succeed and fail when applied to data visualization.
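As a rough illustration of the kind of setup described above, here is a minimal Python sketch of one zero-shot graphical-perception trial and its scoring. It is not the authors' pipeline (see the linked repository for that): the `query_mllm` hook, the prompt wording, and the dummy model are assumptions for illustration only, while the scoring follows Cleveland and McGill's log absolute error, which reproductions of the 1984 experiment commonly aggregate as the midmean (MLAE) so results can be compared with the original human data.

```python
# A minimal sketch, not the authors' pipeline: score one zero-shot
# graphical-perception trial. `query_mllm` is a hypothetical placeholder
# for whatever MLLM backend is used; the error metric is Cleveland and
# McGill's log absolute error, summarized with the midmean (MLAE).
import numpy as np


def log_absolute_error(estimate: float, truth: float) -> float:
    """Cleveland-McGill error for one judgment: log2(|estimate - truth| + 1/8)."""
    return float(np.log2(abs(estimate - truth) + 0.125))


def midmean(errors) -> float:
    """Midmean (mean of the middle 50% of sorted values) over per-trial errors."""
    v = np.sort(np.asarray(errors, dtype=float))
    lo, hi = int(round(0.25 * len(v))), int(round(0.75 * len(v)))
    return float(v[lo:hi].mean())


def run_trial(chart_image, true_percent: float, query_mllm) -> float:
    """Ask the model what percentage the smaller marked value is of the larger."""
    prompt = ("Two values are marked in this chart. What percentage is the "
              "smaller value of the larger one? Reply with a number from 0 to 100.")
    answer = query_mllm(image=chart_image, prompt=prompt)  # hypothetical backend call
    return log_absolute_error(float(answer), true_percent)


if __name__ == "__main__":
    # Stand-in for a real MLLM: a noisy oracle that "reads" the true percent
    # passed in place of a chart image, used only to exercise the scoring.
    rng = np.random.default_rng(0)
    truths = rng.uniform(10.0, 90.0, size=20)
    noisy_model = lambda image, prompt: float(image) + rng.normal(0.0, 5.0)
    errors = [run_trial(t, t, noisy_model) for t in truths]
    print(f"MLAE over {len(errors)} trials: {midmean(errors):.2f}")
```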
Connect with us! We are open to collaboration at rami@mpsych.org.
We would like to thank:
We have also learned a great deal from this project, spanning #artificialintelligence, #programming, #datavisualization, #imageprocessing, #machinelearning, #computergraphics, and #machinepsychology, and applied these insights to our study.
Most importantly, we all contributed to this project through the CS460 - Computer Graphics course and our lab, Machine Psychology, at the University of Massachusetts Boston. Course website: CS460.org | Lab: Machine Psychology.
@article{nguyen2025evaluating,
  title={Evaluating 'Graphical Perception' with Multimodal LLMs},
  author={Nguyen, Rami Huu and Maeda, Kenichi and Geshvadi, Mahsa and Haehn, Daniel},
  abstract={Multimodal Large Language Models (MLLMs) have remarkably progressed in analyzing and understanding images. Despite these advancements, accurately regressing values in charts remains an underexplored area for MLLMs. For visualization, how do MLLMs perform when applied to graphical perception tasks? Our paper investigates this question by reproducing Cleveland and McGill's seminal 1984 experiment and comparing it against human task performance. Our study primarily evaluates fine-tuned and pretrained models and zero-shot prompting to determine if they closely match human graphical perception. Our findings highlight that MLLMs outperform human task performance in some cases but not in others. We highlight the results of all experiments to foster an understanding of where MLLMs succeed and fail when applied to data visualization.},
  journal={IEEE Pacific Visualization (PacificVis)},
  year={2025},
  code={https://github.com/raminguyen/LLMP2},
  data={https://github.com/raminguyen/LLMP2},
  supplemental={https://mpsych.org/papers/nguyen2025_supplemental.pdf},
  shortvenue={PacificVis 2025}
}