Addressing the Problem of ‘Big Data’ in Sports: A Framework for Performance Analysts
An explosion of data-producing technologies has changed the requirements of the modern performance analyst. It has been claimed that approximately 50 billion ‘connected devices’ existed in 2020, signalling that the ‘era of big data’ had begun. Coaches now have access to unlimited, unstructured and potentially uncontextualised information, which could reinforce their biases. New technologies exist for collecting, organising, storing, and presenting ‘large’ or ‘big data’; however, there are limited frameworks for performance analysts to follow when using these tools in sport. Without a framework, key aspects of tool development may be missed, and errors or poor ecological validity may eventuate. The objective of this PhD was to create a framework for performance analysts to use when building information-presenting tools. The overarching question of this thesis is: what process, or framework, will allow an analyst to present large data sets to coaches in a relevant, meaningful and ecologically valid manner? A literature review was conducted to identify how a framework for ‘large’ or ‘big data’ may best be approached. The following areas were identified as requiring investigation: a) coach tactical behaviours during matches; b) valid data sources in sport; c) data organisation systems; and d) visualisation techniques for coaches. A systems design approach and action design research methodologies were used to guide the development of the framework. The first stage of the framework encourages performance analysts to observe the behaviours of coaches during matches and to summarise these by identifying the most common themes. This was conducted in the sport of netball, which was used as the context for the other stages of development in this PhD. Once these themes were identified, data sources were investigated and evaluated in Stage Two.
In Stage Three, a large sample of suitable data was collected, and a cloud-based data pipeline was developed to store it in a database as information. Stage Four used this information to create population normative values for each performance indicator, with coaches guiding how these were represented. In Stage Five, coaches were presented with the aligned information live during netball matches, and their behaviours were observed. At the completion of the five stages, a working example of the framework, embodied in a tool, was presented to coaches and sport science practitioners, and survey feedback was collected. Users found this tool to be ecologically valid, and future research was identified to improve these tools. The framework is proposed as a valid method for performance analysts to create tools that assist coaches in navigating large datasets in netball and other sports.