Investigating Novel Representations and Deep Reinforcement Learning for General Game Playing
General Game Playing (GGP) is a platform for developing general Artificial Intelligence algorithms for agents that can play any game given a description encoded in Game Description Language (GDL). The recent accomplishments of AlphaGo and AlphaZero have motivated new work in applying deep learning approaches to GGP. As GGP's general nature requires the inference of game rules prior to game playing, there are two main learning tasks: (i) inference of game rules in GDL and (ii) strategy optimisation to maximise utility. This requires an investigation into novel representations and deep reinforcement learning architectures that can be applied to the general nature of the games in GGP and take advantage of the logical nature of game rules and descriptions. We tackle the two tasks separately, and our contributions are as follows. Firstly, we present GGPDeepRL, an AlphaZero-style GGP agent that utilises deep reinforcement learning to learn to play general games. This agent adapts the deep reinforcement learning algorithm originally presented in AlphaZero to accommodate the general games of GGP, which can be multi-player, general-sum and simultaneous-action. It overcomes the game-specific limitations of AlphaZero by constructing its neural network from the game description provided. Our results confirm the feasibility of applying deep reinforcement learning to GGP but reveal certain limitations with regard to the neural network architecture used. Secondly, we present a neural reasoner based on graph neural networks, which utilises novel graph-based representations of the game rules to learn to approximate the inference task. The neural reasoner takes instantiated rule graphs as input; these are a general, game-agnostic, graph-based representation of game states described in GDL.
Instantiated rule graphs allow a single neural reasoner to be trained to infer over states across multiple games, as the representation captures game state features while remaining consistent across different games. We investigate the effect of different labelling functions for the instantiated rule graph, as well as of varying graph neural network architectures and parameters. The neural reasoner is capable of accurately inferring legal actions and subsequent states, and of transferring these learned inferences across different games. Our findings suggest that deep learning techniques can be applied to GGP effectively if the representations and architectures used are capable of taking advantage of general structural features present in games. These new approaches open the door to an entirely new class of GGP agents that would be capable of learning to play games in a truly general manner.