Metrics for Data-driven Energy Efficiency
The world is currently witnessing a mounting avalanche of data driven by the tremendous growth of Information and Communications Technologies (ICT). This trend continues to develop quickly and in diverse ways in the form of big data, which is emerging as one of the most powerful technological drivers for improving productivity and supporting innovation. However, it also makes a non-negligible contribution to world electricity consumption and the carbon dioxide ($CO_2$) footprint, and hence to climate change, which urgently calls for energy-efficient solutions. Much research and development effort has focused on energy efficiency metrics, because these metrics are the measures and indicators of energy efficiency. Understanding them gives a clearer view of how energy efficiency can be achieved at every level of an ICT system or network, e.g., the process, component, equipment, service, application, and network/system levels. From our observation, energy efficiency metrics in the ICT area are conventionally introduced according to the physical-thermodynamic definition, in which the ICT-based output is measured in physical units as the number of bits in the data sequence. The problem with this physical measurement is that it captures only the quantity of bits and does not factor in data quality. In other words, the metric makes no distinction between low-quality and high-quality data sequences. On this basis, one could argue that data measured in physical amounts cannot be added up or compared, because the data differ in quality. Ignoring data quality is therefore a fundamental problem in constructing conceptually sound ICT-related energy efficiency metrics.
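As a rough illustration, in notation that is assumed here rather than taken from the thesis, a conventional bit-based metric can be written as the ratio of delivered bits $B$ to consumed energy $E$ (in joules):

$$\eta = \frac{B}{E} \quad \text{[bit/J]}$$

Two data sequences with the same $B$ and $E$ receive the same $\eta$ under this form, regardless of how complete, consistent, or timely the data are.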
This insight led to the development of data quality-aware energy efficiency metrics that support more efficient network/system approaches and mechanisms, which can be reconfigured according to different levels of data quality. This thesis takes data processing and storage as examples from the big data life cycle. It proposes that, before data are processed and stored, their quality value is calculated using a data quality formula and the data are prioritized accordingly. First, the concept of data quality classification is proposed, and specific calculation formulas are given for data integrity, consistency, and timeliness. Second, once data priority has been determined, an energy-saving scheduling algorithm based on data quality (DQ-TSA) and an energy-saving storage algorithm based on data quality (DQ-HSA) are proposed. Finally, the two algorithms are extended and implemented on the CloudSim simulation platform to verify whether pre-grading data quality can improve data center energy efficiency.
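The idea of scoring data quality and prioritizing data before processing can be sketched as follows. This is only a minimal illustration: the abstract does not give the thesis's actual formulas, so the equal-weight average, the `DataItem` class, and the function names `quality_score` and `prioritize` are all hypothetical assumptions, and the sketch does not reproduce DQ-TSA or DQ-HSA.

```python
from dataclasses import dataclass


@dataclass
class DataItem:
    """Hypothetical data item with three quality dimensions, each in [0, 1]."""
    name: str
    integrity: float    # assumed: share of complete (non-missing) fields
    consistency: float  # assumed: share of records passing consistency checks
    timeliness: float   # assumed: freshness score, 1.0 = fully up to date


def quality_score(item, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted average of the three dimensions (an assumed aggregation)."""
    w_int, w_con, w_tim = weights
    return (w_int * item.integrity
            + w_con * item.consistency
            + w_tim * item.timeliness)


def prioritize(items):
    """Return items ordered from highest to lowest quality score."""
    return sorted(items, key=quality_score, reverse=True)


# Illustrative values only; they are not measurements from the thesis.
items = [
    DataItem("a", integrity=0.9, consistency=0.8, timeliness=0.7),
    DataItem("b", integrity=0.5, consistency=0.5, timeliness=0.5),
    DataItem("c", integrity=1.0, consistency=1.0, timeliness=0.9),
]
ranked = prioritize(items)
ranked_names = [item.name for item in ranked]
```

A scheduler or storage policy could then consume `ranked` in order, for example by serving higher-quality items first; how the priority is actually used by the proposed algorithms is defined in the thesis itself, not here.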