An Eye for an AI: Evaluating GPT-4o's Visual Perception Skills and Geometric Reasoning Skills Using Computer Graphics Questions

Feng, Tony Haoran; Denny, Paul; Wünsche, Burkhard C; Luxton-Reilly, Andrew; Whalley, Jacqueline

doi:10.1145/3680533.3697064

dc.contributor.author	Feng, Tony Haoran
dc.contributor.author	Denny, Paul
dc.contributor.author	Wünsche, Burkhard C
dc.contributor.author	Luxton-Reilly, Andrew
dc.contributor.author	Whalley, Jacqueline
dc.date.accessioned	2025-02-14T01:56:33Z
dc.date.available	2025-02-14T01:56:33Z
dc.date.issued	2024-10-22
dc.description.abstract	CG (Computer Graphics) is a popular field of CS (Computer Science), but many students find this topic difficult because it requires a large number of skills, such as mathematics, programming, geometric reasoning, and creativity. Over the past few years, researchers have investigated ways to harness the power of GenAI (Generative Artificial Intelligence) to improve teaching. In CS, much of the research has focused on introductory computing. A recent study evaluating the performance of an LLM (Large Language Model), GPT-4 (text only), on CG questions indicated poor performance and a reliance on detailed descriptions of image content, which often required considerable insight from the user to return reasonable results. So far, no studies have investigated the abilities of LMMs (Large Multimodal Models), or multimodal LLMs, to solve CG questions and how these abilities can be used to improve teaching. In this study, we construct two datasets of CG questions requiring varying degrees of visual perception skills and geometric reasoning skills, and evaluate the current state-of-the-art LMM, GPT-4o, on the two datasets. We find that although GPT-4o exhibits great potential in independently solving questions with visual information, major limitations remain in the accuracy and quality of the generated results. We propose several novel approaches for CG educators to incorporate GenAI into CG teaching despite these limitations. We hope that our guidelines further encourage learning and engagement in CG classrooms.
dc.identifier.citation	SIGGRAPH Asia 2024 Educator’s Forum (SA Educator’s Forum ’24), December 03-06, 2024, Tokyo, Japan. Conference proceedings Article No.: 5, Pages 1 - 8 ACM ISBN 979-8-4007-1136-7/24/12 https://doi.org/10.1145/3680533.3697064
dc.identifier.doi	10.1145/3680533.3697064
dc.identifier.uri	http://hdl.handle.net/10292/18658
dc.publisher	ACM
dc.relation.uri	https://dl.acm.org/doi/10.1145/3680533.3697064
dc.rights	© 2024 Copyright held by the owner/author(s). This is the author’s version of the work. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
dc.rights.accessrights	OpenAccess
dc.subject	46 Information and Computing Sciences
dc.subject	4608 Human-Centred Computing
dc.subject	4602 Artificial Intelligence
dc.subject	Evaluation, Assessment
dc.subject	Computing Education
dc.subject	Computer Graphics
dc.subject	Geometric Reasoning
dc.subject	Visual Perception
dc.subject	GPT-4o
dc.subject	GPT-4
dc.subject	GenAI
dc.subject	Generative Artificial Intelligence
dc.subject	Visual Language Models (VLMs)
dc.subject	Large Multimodal Models (LMMs)
dc.subject	Large Language Models (LLMs)
dc.title	An Eye for an AI: Evaluating GPT-4o's Visual Perception Skills and Geometric Reasoning Skills Using Computer Graphics Questions
dc.type	Journal Article
pubs.elements-id	574940

Files

Name: An eye for AI_preprint.pdf
Size: 1.36 MB
Format: Adobe Portable Document Format
Description: Author Accepted Manuscript

Collections

School of Engineering, Computer and Mathematical Sciences - Te Kura Mātai Pūhanga, Rorohiko, Pāngarau