SERG - Software Engineering Research Group
Permanent link for this collection: https://hdl.handle.net/10292/1680
The Software Engineering Research Group (SERG) at AUT University [formerly Software Engineering Research Laboratory (SERL)] undertakes world-class research directed at understanding and improving the practice of software professionals in their creation and preservation of software systems. We are interested in all models of software provision – bespoke development, package and component customisation, free/libre open source software (FLOSS) development, and delivery of software as a service (SaaS). The research we carry out may relate to just one or all of these models.
Recent Submissions
Item: Just-in-Time Crash Prediction for Mobile Apps (Springer Science and Business Media LLC, 2024-05-08). Wimalasooriya, C; Licorish, SA; da Costa, DA; MacDonell, SG
Just-In-Time (JIT) defect prediction aims to identify defects early, at commit time. Hence, developers can take precautions to avoid defects when the code changes are still fresh in their minds. However, the utility of JIT defect prediction has not been investigated in relation to crashes of mobile apps. We therefore conducted a multi-case study employing both quantitative and qualitative analysis. In the quantitative analysis, we used machine learning techniques for prediction. We collected 113 reliability-related metrics for about 30,000 commits from 14 Android apps and selected 14 important metrics for prediction. We found that both standard JIT metrics and static analysis warnings are important for JIT prediction of mobile app crashes. We further optimized prediction performance, comparing seven state-of-the-art defect prediction techniques with hyperparameter optimization. Our results showed that Random Forest is the best performing model, with an AUC-ROC of 0.83. In our qualitative analysis, we manually analysed a sample of 642 commits and identified different types of changes that are common in crash-inducing commits. We explored whether different aspects of changes can be used as metrics in JIT models to improve prediction performance. We found these metrics improve the prediction performance significantly. Hence, we suggest considering static analysis warnings and Android-specific metrics to adapt standard JIT defect prediction models for a mobile context to predict crashes. Finally, we provide recommendations to bridge the gap between research and practice and point to opportunities for future research.
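As a rough illustration of the kind of commit-level modelling described in this abstract, the sketch below trains a Random Forest classifier on commit metrics and reports AUC-ROC. The file name, feature columns and label are illustrative assumptions, not artefacts from the study.

```python
# Minimal sketch of a JIT crash-prediction pipeline of the kind described above.
# The CSV file, feature names and "crash_inducing" label are illustrative only.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

commits = pd.read_csv("commit_metrics.csv")          # hypothetical table of commit-level metrics
features = ["lines_added", "lines_deleted", "files_changed",
            "developer_experience", "static_warnings"]  # standard JIT metrics + static analysis warnings
X, y = commits[features], commits["crash_inducing"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

model = RandomForestClassifier(n_estimators=300, class_weight="balanced", random_state=42)
model.fit(X_train, y_train)

auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"AUC-ROC: {auc:.2f}")
```

In practice the feature table would come from a repository-mining step such as the 113 reliability-related metrics mentioned above, with hyperparameter optimization applied before any final comparison of techniques.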
Item: Improving Transfer Learning for Software Cross-Project Defect Prediction (Springer Science and Business Media LLC, 2024-04-24). Omondiagbe, OP; Licorish, SA; MacDonell, SG
Software cross-project defect prediction (CPDP) makes use of cross-project (CP) data to overcome the lack of data necessary to train well-performing software defect prediction (SDP) classifiers in the early stage of new software projects. Because the CP data (known as the source) may differ from the new project's data (known as the target), it is difficult for CPDP classifiers to perform well; in particular, it is the mismatch of data distributions between source and target that creates this difficulty. Transfer learning-based CPDP classifiers are designed to minimize these distribution differences. The first transfer learning-based CPDP classifiers treated these differences equally, thereby degrading prediction performance. To this end, recent research has proposed the Weighted Balanced Distribution Adaptation (W-BDA) method, which leverages the importance of both distribution differences to improve classification performance. Although W-BDA has been shown to improve model performance in CPDP and to tackle class imbalance by balancing the class proportion of each domain, research to date has failed to consider model performance in light of increasing target data. We provide the first investigation of the effects of increasing the target data when leveraging the importance of both distribution differences. We extend the initial W-BDA method and call this extension the W-BDA+ method. To evaluate the effectiveness of W-BDA+ for improving CPDP performance, we conduct eight experiments on 18 projects from four datasets, where data sampling was performed with different sampling methods. Data sampling was applied only to the baseline methods and not to our proposed W-BDA+ or the original W-BDA, because data sampling issues do not arise for these two methods. We evaluate our method using four complementary indicators (i.e., Balanced Accuracy, AUC, F-measure and G-measure). Our findings reveal an average improvement of 6%, 7.5%, 10% and 12% for these four indicators when W-BDA+ is compared to the original W-BDA and five other baseline methods (for all four of the sampling methods used). Also, as the target-to-source ratio is increased with different sampling methods, we observe a decrease in performance for the original W-BDA, with our W-BDA+ approach outperforming it in most cases. Our results highlight the importance of being aware of the effect of increasing target data availability in CPDP scenarios when using a method that can handle the class imbalance problem.
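The four indicators mentioned above can be computed from a classifier's predictions on a target project roughly as follows. The G-measure here uses the harmonic mean of recall and (1 - false alarm rate), a common convention in defect prediction, and may differ from the authors' exact implementation.

```python
# Minimal sketch of the four evaluation indicators used above, computed from a
# classifier's predictions on a target project; not the authors' implementation.
import numpy as np
from sklearn.metrics import balanced_accuracy_score, roc_auc_score, f1_score, confusion_matrix

def evaluate(y_true, y_pred, y_score):
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    recall = tp / (tp + fn)          # probability of detection (pd)
    pf = fp / (fp + tn)              # probability of false alarm
    g_measure = 2 * recall * (1 - pf) / (recall + (1 - pf))  # common defect-prediction definition
    return {
        "balanced_accuracy": balanced_accuracy_score(y_true, y_pred),
        "auc": roc_auc_score(y_true, y_score),
        "f_measure": f1_score(y_true, y_pred),
        "g_measure": g_measure,
    }

# Example with dummy predictions
y_true = np.array([0, 1, 0, 1, 1, 0])
y_pred = np.array([0, 1, 0, 0, 1, 0])
y_score = np.array([0.2, 0.9, 0.1, 0.4, 0.8, 0.3])
print(evaluate(y_true, y_pred, y_score))
```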
Item: Progress Report on a Proposed Theory for Software Development (SciTePress, 2015-08). Kirk, D; MacDonell, S
There is growing acknowledgement within the software engineering community that a theory of software development is needed to integrate the myriad methodologies that are currently popular, some of which are based on opposing perspectives. We have been developing such a theory for a number of years. In this position paper, we overview our theory along with progress made thus far. We suggest that, once fully developed, this theory, or one similar to it, may be applied to support situated software development, by providing an overarching model within which software initiatives might be categorised and understood. Such understanding would inevitably lead to greater predictability with respect to outcomes.

Item: Packaged Software Implementation Requirements Engineering by Small Software Enterprises (IEEE Computer Society, 2013). Jebreen, I; Wellington, R; MacDonell, SG
Small to medium-sized business enterprises (SMEs) generally thrive because they have successfully done something unique within a niche market. For this reason, SMEs may seek to protect their competitive advantage by avoiding any standardization encouraged by the use of packaged software (PS). Packaged software implementation at SMEs therefore presents challenges relating to how best to respond to mismatches between the functionality offered by the packaged software and each SME's business needs. An important question relates to which processes small software enterprises, or Small to Medium-Sized Software Development Companies (SMSSDCs), apply in order to identify and then deal with these mismatches. To explore the processes of packaged software (PS) implementation, an ethnographic study was conducted to gain in-depth insights into the roles played by analysts in two SMSSDCs. The purpose of the study was to understand PS implementation in terms of requirements engineering (or 'PSIRE'). Data collected during the ethnographic study were analyzed using an inductive approach. Based on our analysis of the cases we constructed a theoretical model explaining the requirements engineering process for PS implementation, and named it the PSIRE Parallel Star Model. The Parallel Star Model shows that during PSIRE, more than one RE process can be carried out at the same time. The Parallel Star Model has few constraints, because not only can processes be carried out in parallel, but they also do not always have to be followed in a particular order. This paper therefore offers a novel investigation and explanation of RE practices for packaged software implementation, approaching the phenomenon from the viewpoint of the analysts, and offers the first extensive study of packaged software implementation RE (PSIRE) in SMSSDCs.

Item: A Taxonomy of Data Quality Challenges in Empirical Software Engineering (IEEE, 2013). Bosu, MF; MacDonell, SG
Reliable empirical models such as those used in software effort estimation or defect prediction are inherently dependent on the data from which they are built. As demands for process and product improvement continue to grow, the quality of the data used in measurement and prediction systems warrants increasingly close scrutiny. In this paper we propose a taxonomy of data quality challenges in empirical software engineering, based on an extensive review of prior research. We consider current assessment techniques for each quality issue and proposed mechanisms to address these issues, where available. Our taxonomy classifies data quality issues into three broad areas: first, characteristics of data that mean they are not fit for modeling; second, data set characteristics that lead to concerns about the suitability of applying a given model to another data set; and third, factors that prevent or limit data accessibility and trust. We identify this last area as being in particular need of further research.

Item: Onshore to Near-Shore Outsourcing Transitions: Unpacking Tensions (IEEE, 2015-07-13). Raza, B; Clear, Tony; MacDonell, SG
This study is directed towards highlighting tensions between incoming and outgoing vendors during outsourcing in a near-shore context. The incoming and outgoing of vendors generates a complex form of relationship in which the participating organizations cooperate and compete simultaneously. It is of great importance to develop knowledge about this kind of relationship, particularly in the current GSE-related multi-sourcing environment. We carried out a longitudinal case study and utilized data from the 'Novopay' project, which is available in the public domain. This project involved an outgoing New Zealand-based vendor and an incoming Australian-based vendor. The results show that demand for the same human resources, dependency upon cooperation and collaboration between vendors, reliance on each other's system configurations, and the client's use of strategies that had worked for the previous vendor generated a set of tensions that needed to be continuously managed throughout the project.

Item: An empirical cognitive model of the development of shared understanding of requirements (Springer, 2014-06-01). Buchan, J
It is well documented that customers and software development teams need to share and refine understanding of the requirements throughout the software development lifecycle. The development of this shared understanding is complex and error-prone, however. Techniques and tools to support the development of a shared understanding of requirements (SUR) should be based on a clear conceptualization of the phenomenon, with a basis in relevant theory and analysis of observed practice. This study contributes to this with a detailed conceptualization of SUR development as a sequence of group-level state transitions based on specializing the Team Mental Model construct. Furthermore, it proposes a novel group-level cognitive model as the main result of an analysis of data collected from the observation of an Agile software development team over a period of several months. The initial high-level application of the model shows it has promise for providing new insights into supporting SUR development.

Item: Bridging the research-practice gap in requirements engineering (National Advisory Committee on Computing Qualifications (NACCQ), 2009). Pais, S; Talbot, A; Connor, AM
This paper examines the perceived research-practice gap in software requirements engineering, with a particular focus on requirements specification and modelling. Various contributions by researchers to the writing of requirements specifications are reviewed, and practitioners' viewpoints are also taken into consideration. On comparing research and practice in this field, possible causes for the gap are identified. The barriers to adopting research contributions in practice are also reviewed. Finally, recommendations to overcome this gap are made.

Item: Building services integration: a technology transfer case study (National Advisory Committee on Computing Qualifications (NACCQ), 2007). Connor, AM; Siringoringo, WS; Clements, N; Alexander, N
This paper details the development of a relationship between Auckland University of Technology (AUT) and the Building Integration Software Company (bisco), and how projects have been initiated that add value to the bisco product range by conducting applied research utilising students from AUT. One specific project, related to producing optimal layout designs, is discussed.

Item: Signposting, a dynamic approach to design process management (Cambridge University Engineering Department, 1999). Clarkson, PJ; Connor, AM; Melo, AF
This paper presents an overview of a dynamic guidance tool that has been developed to address a need for design support in the aerospace sector. The tool, called signposting, provides the means of directing activity by suggesting the next appropriate task in the design process. This suggestion, based on the presence of key parameters and their associated confidences, allows design to be a reactive process. The underlying logic of the design process is captured using confidence mappings which determine when a task is possible, sensible or not achievable. The lack of prescriptive process structure also allows new design tasks to be added at any time. The signposting technique is described with reference to a simple mechanical design process example.

Item: Using genetic algorithms to solve layout optimisation problems in residential building construction (dblp, 2007). Connor, AM; Siringoringo, WS
This paper outlines an approach for the automatic design of material layouts for the residential building construction industry. The goal is to cover a flat surface using the minimum number of rectangular stock panels by nesting the off-cut shapes in an efficient manner. This problem has been classified as the Minimum Cost Polygon Overlay problem. Results are presented for a typical problem and two algorithms are compared.
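As a heavily simplified sketch of how a genetic algorithm might drive such a layout search, the following toy example covers a rectangular floor with stock panels laid in horizontal strips. The strip-based placement model, dimensions and GA parameters are all illustrative assumptions rather than the formulation used in the paper.

```python
# Heavily simplified sketch of a GA for covering a rectangular floor with
# rectangular stock panels laid in horizontal strips. The placement model,
# panel sizes and GA parameters are illustrative assumptions only.
import math
import random

W, H = 11.3, 7.9            # floor dimensions (hypothetical)
PW, PH = 2.4, 1.2           # stock panel dimensions (hypothetical)

MAX_STRIPS = math.ceil(H / min(PW, PH))

def panels_used(genes):
    """Count panels needed when each strip's orientation is given by a gene."""
    height_covered, total = 0.0, 0
    for g in genes:
        if height_covered >= H:
            break
        strip_h, strip_w = (PH, PW) if g == 0 else (PW, PH)
        total += math.ceil(W / strip_w)      # panels across the strip
        height_covered += strip_h
    return total if height_covered >= H else float("inf")

def evolve(pop_size=40, generations=200, mutation_rate=0.05):
    pop = [[random.randint(0, 1) for _ in range(MAX_STRIPS)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=panels_used)
        survivors = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(survivors):
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, MAX_STRIPS)
            child = a[:cut] + b[cut:]                      # one-point crossover
            child = [1 - g if random.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = survivors + children
    best = min(pop, key=panels_used)
    return best, panels_used(best)

layout, count = evolve()
print("panels used:", count)
```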
Item: Tendering for engineering contracts (Professional Engineering Publishing, 2000). Barr, G; Burgess, SG; Connor, AM; Clarkson, PJ
This paper describes a generalised model that embodies the criteria assessed during the tendering process, where if all the pertinent criteria are met, the risk when submitting the tender is minimised. This provides a framework for isolating the important tender criteria and relating them to the control of activities in the tendering process. A process model is presented that describes the tendering process at a sufficiently high level that it is independent of corporate rationale. A number of subprocess models are included to show how the high-level model can be adapted in order to introduce a dynamic element into the tendering process. The results of an initial case study are presented. The subject of the case study is a major engineering systems supplier, and it is shown how their tendering process for design-and-build contracts sits within the tender classification and process models.

Item: Pole shape optimization using a tabu search scheme (IEEE, 2000). Connor, AM; Leonard, PJ
The pole shape optimization of an electromagnet typical of an MRI-type application is investigated. We compare different parameterizations of the pole shape and the convergence of the optimizations using a discrete variable step length Tabu Search scheme.

Item: Parameter sizing for fluid power circuits using Taguchi methods (Taylor & Francis, 1999). Connor, AM
This paper describes the application of Taguchi methods [1,2,3] to the parameter sizing stage of fluid power system design. Taguchi methods have become almost synonymous with robust design and are used to design systems that are tolerant to the effects of noise factors. Noise factors are defined as anything that causes changes in the functional characteristics or performance of the system that are not controllable. In the hydraulic circuit example used in this paper, these noise factors are assumed to be the effects of component failure. The method is therefore used to select design parameter values such that the resulting circuits exhibit some tolerance to the initial development of faults in the system, allowing the system to continue to operate for a short period of time without catastrophic failure occurring.
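A minimal sketch of the Taguchi-style robustness comparison outlined above: candidate parameter settings are evaluated under several noise (fault) conditions and ranked by a larger-the-better signal-to-noise ratio. The candidate names and response values are made up purely for illustration.

```python
# Minimal sketch of a Taguchi-style robustness ranking: each candidate parameter
# setting is evaluated under several noise conditions (e.g. simulated component
# degradation) and ranked by its signal-to-noise ratio. All values are invented.
import math

def sn_larger_the_better(responses):
    """Larger-the-better S/N ratio: -10*log10(mean(1/y^2))."""
    return -10 * math.log10(sum(1 / (y * y) for y in responses) / len(responses))

# Hypothetical circuit performance under three fault conditions
candidate_settings = {
    "pump_small_valve_fast": [0.92, 0.88, 0.61],
    "pump_large_valve_fast": [0.95, 0.93, 0.90],
    "pump_large_valve_slow": [0.97, 0.74, 0.70],
}

for name, responses in sorted(candidate_settings.items(),
                              key=lambda kv: sn_larger_the_better(kv[1]),
                              reverse=True):
    print(f"{name}: S/N = {sn_larger_the_better(responses):.2f} dB")
```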
Item: A tabu search method for the optimisation of fluid power circuits (SAGE Publications Ltd., 1998). Connor, AM; Tilley, DG
This paper describes the development of an efficient algorithm for the optimization of fluid power circuits. The algorithm is based around the concepts of Tabu search, where different time-scale memory cycles are used as a metaheuristic to guide a hill climbing search method out of local optima and locate the globally optimum solution. Results are presented which illustrate the effectiveness of the method on mathematical test functions. In addition to these test functions, some results are presented for real problems in hydraulic circuit design by linking the method to the Bathfp dynamic simulation software. In one such example the solutions obtained are compared to those found using simple steady state calculations.
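A minimal sketch of the kind of short-term-memory tabu search described above, applied to a standard multimodal test function rather than a Bathfp circuit model; the neighbourhood, step length and memory size are illustrative assumptions.

```python
# Minimal sketch of a tabu search: a hill climber with a short-term memory of
# recently visited solutions, applied here to a multimodal test function.
import math
import random

def rastrigin(x):
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)

def neighbours(x, step):
    for i in range(len(x)):
        for d in (-step, step):
            y = list(x)
            y[i] += d
            yield tuple(round(v, 6) for v in y)

def tabu_search(dim=2, step=0.1, iterations=500, tabu_size=50):
    current = tuple(random.uniform(-5, 5) for _ in range(dim))
    best = current
    tabu = [current]
    for _ in range(iterations):
        candidates = [n for n in neighbours(current, step) if n not in tabu]
        if not candidates:
            break
        current = min(candidates, key=rastrigin)   # best admissible move, even if uphill
        tabu.append(current)
        if len(tabu) > tabu_size:
            tabu.pop(0)                            # short-term memory: forget old moves
        if rastrigin(current) < rastrigin(best):
            best = current
    return best, rastrigin(best)

print(tabu_search())
```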
Item: The optimal synthesis of mechanisms using harmonic information (Taylor & Francis, 1998). Connor, AM; Douglas, SS; Gilmartin, MJ
This paper reviews several uses of harmonic information in the synthesis of mechanisms and shows that such information can be put to even greater use in this field. Results are presented for both single and multi-degree of freedom systems which support this claim. In both cases, the inclusion of harmonic information in the objective function aids the search to locate high-quality solutions.

Item: Minimum cost polygon overlay with rectangular shape stock panels (Taylor & Francis, 2008). Siringoringo, WS; Connor, AM; Clements, N; Alexander, N
Minimum Cost Polygon Overlay (MCPO) is a unique two-dimensional optimization problem that involves the task of covering a polygon-shaped area with a series of rectangular-shaped panels. This has a number of applications in the construction industry. This work examines the MCPO problem in order to construct a model that captures the essential parameters of the problem, to be solved automatically using numerical optimization algorithms. Three algorithms have been implemented for the actual optimization task: greedy search, the Monte Carlo (MC) method, and the Genetic Algorithm (GA). Results are presented to show the relative effectiveness of the algorithms, followed by critical analysis of the various findings of this research.

Item: A comparison of semi-deterministic and stochastic search techniques (Springer, 2000). Connor, AM; Shea, K
This paper presents an investigation of two search techniques, tabu search (TS) and simulated annealing (SA), to assess their relative merits when applied to engineering design optimisation. Design optimisation problems are generally characterised as having multi-modal search spaces and discontinuities, making global optimisation techniques beneficial. Both techniques claim to be capable of locating globally optimum solutions on a range of problems, but this capability is derived from different underlying philosophies. While tabu search uses a semi-deterministic approach to escape local optima, simulated annealing uses a completely stochastic approach. The performance of each technique is investigated using a structural optimisation problem, and the two are then compared to each other as well as to a steepest descent (SD) method.
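For the comparison described in the last abstract, a correspondingly minimal simulated annealing sketch on the same kind of multimodal test function is shown below; the cooling schedule and move distribution are illustrative assumptions, not the settings used in the paper.

```python
# Minimal sketch of simulated annealing: the fully stochastic counterpart to
# tabu search, applied to a multimodal test function with geometric cooling.
import math
import random

def objective(x):
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)

def simulated_annealing(dim=2, t_start=10.0, t_end=1e-3, alpha=0.98, moves_per_temp=50):
    current = [random.uniform(-5, 5) for _ in range(dim)]
    best = list(current)
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            candidate = [xi + random.gauss(0, 0.3) for xi in current]
            delta = objective(candidate) - objective(current)
            # Accept improvements always, and worse moves with Boltzmann probability
            if delta < 0 or random.random() < math.exp(-delta / t):
                current = candidate
                if objective(current) < objective(best):
                    best = list(current)
        t *= alpha      # geometric cooling
    return best, objective(best)

print(simulated_annealing())
```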
Item: The kinematic synthesis of path generating mechanisms using genetic algorithms (WIT Press, 1995). Connor, AM; Douglas, SS; Gilmartin, MJ
This paper presents a methodology for the synthesis of path generating mechanisms using Genetic Algorithms (GAs). GAs are a novel search and optimisation technique inspired by the principles of natural evolution and survival of the fittest. The problem used to illustrate the use of GAs in this way is the synthesis of a four-bar mechanism to provide a desired output path.

Item: The synthesis of five bar path generating mechanisms using genetic algorithms (IEE/IEEE, 1995). Connor, AM; Douglas, SS; Gilmartin, MJ
This paper presents a methodology for the synthesis of multi-degree of freedom mechanisms using genetic algorithms. A five-bar mechanism is a 2-DOF system which requires two inputs to fully describe the output motion. In a hybrid mechanism, one of these inputs is supplied by a constant velocity (CV) motor and one is supplied by a programmable servo motor. Such configurations can offer considerable savings in power consumption when the armature inertia of the servo motor is low compared to the load inertia. In the presented synthesis of such mechanisms, the two inputs required are provided by the CV input and the desired position of the end effector. The genetic algorithm is used to search for the optimum link lengths and ground point positions to minimise a multi-criteria objective function. The criteria which contribute to the objective function value are the error between the actual path of the end effector and the desired path, the mobility of the mechanism, and the RMS value of the servo motor displacements.
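As a rough sketch of GA-based path synthesis in the spirit of these two mechanism papers, the example below uses a real-coded GA to search four-bar link lengths so that a coupler joint traces a desired set of points. The kinematic model, target path, penalty weighting and GA settings are illustrative assumptions; the five-bar case would extend the objective with servo displacement and mobility terms as described above.

```python
# Rough sketch of GA-based path synthesis: a real-coded GA searches four-bar
# link lengths so that the coupler/rocker joint traces a desired set of points.
# The target path, penalty and GA settings are illustrative assumptions only.
import math
import random

TARGET = [(3.0 + 0.8 * math.cos(a), 2.0 + 0.5 * math.sin(a))
          for a in [i * 2 * math.pi / 12 for i in range(12)]]   # made-up desired path

def rocker_joint(r1, r2, r3, r4, theta):
    """Position of the coupler/rocker joint for crank angle theta (circle intersection)."""
    bx, by = r2 * math.cos(theta), r2 * math.sin(theta)   # crank tip B
    dx, dy = r1, 0.0                                      # fixed pivot D
    d = math.hypot(dx - bx, dy - by)
    if d > r3 + r4 or d < abs(r3 - r4) or d == 0:
        return None                                       # linkage cannot assemble here
    a = (r3 * r3 - r4 * r4 + d * d) / (2 * d)
    h = math.sqrt(max(r3 * r3 - a * a, 0.0))
    mx, my = bx + a * (dx - bx) / d, by + a * (dy - by) / d
    return mx - h * (dy - by) / d, my + h * (dx - bx) / d

def fitness(genes):
    r1, r2, r3, r4 = genes
    error, penalty = 0.0, 0.0
    for i, (tx, ty) in enumerate(TARGET):
        c = rocker_joint(r1, r2, r3, r4, i * 2 * math.pi / len(TARGET))
        if c is None:
            penalty += 10.0                               # assembly-failure penalty term
        else:
            error += math.hypot(c[0] - tx, c[1] - ty)     # path-error term
    return error + penalty                                # simple multi-criteria objective

def ga(pop_size=60, generations=300):
    pop = [[random.uniform(0.5, 5.0) for _ in range(4)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        parents = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            child = [(x + y) / 2 + random.gauss(0, 0.05) for x, y in zip(a, b)]
            children.append([min(max(g, 0.5), 5.0) for g in child])
        pop = parents + children
    best = min(pop, key=fitness)
    return best, fitness(best)

print(ga())
```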
