Abstract:
Bug triage is essential in efficiently assigning bugs to developers by leveraging past 
experiences. Without this crucial process, experienced developers may be inundated with 
assignments, while newer developers may be underutilized. Furthermore, improper bug 
distribution among different developer types can lead to various issues, including delays, 
errors, decreased capacity, and diminished job satisfaction. Previous bug triaging methods 
often do not account for newly joined developers, making them ineffective in recommending 
these developers for bug assignments. Consequently, these methods lead to improper task 
allocation, denying new team members valuable learning opportunities during bug resolution. 
Furthermore, prior research tends to overlook workload distribution among different 
developer categories, neglecting the need to balance bug assignments among experienced 
developers, newcomers, and those with varying skill levels. To address these issues, there is a 
need for an automated bug triaging technique that not only includes new developers but also 
prioritizes workload distribution among different developer categories. Therefore, this study 
introduces a novel bug triaging strategy that combines two models: the Bug Solving 
Developer Recommendation Model (BSDRM) and the Developer Scheduler (DevSched). 
The first model, BSDRM, forms the core of automated bug triaging. 
BSDRM uses Machine Learning (ML) algorithms and historical bug 
reports to suggest developers for specific bug resolution tasks. To achieve this, 
Eclipse, Mozilla, and NetBeans datasets are aggregated and split into training and testing sets. 
A sentence-embedding model is then built from the training set to generate a 
developer-specific word repository, while the test set is transformed into a vocabulary 
list using an embedding model. BSDRM identifies eligible developers by matching their 
developer-specific word repository with the bug report vocabulary list via K-Nearest 
Neighbour (KNN) analysis. These developers are then categorized into three groups: 
experienced, newly experienced, and fresh graduate developers, using a classification 
model comprising various ML algorithms: Decision Tree (DT), Extra Tree (ET), AdaBoost 
(AdC), Bagging Classifier (BC), Gradient Boosting (GB), KNN, Nearest Centroid (NC), 
Bernoulli Naïve Bayes (BNB), Multinomial Naïve Bayes (MNB), Complement Naïve 
Bayes (CoNB), Gaussian Naïve Bayes (GNB), Logistic Regression (LR), Perceptron (Pr), 
and Multi-Layer Perceptron (MLP). Notably, the Bagging Classifier achieves 96.59% 
accuracy in classifying developers with varying experience levels. 
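
The matching step described above can be illustrated with a minimal sketch. It is not the BSDRM implementation: a bag-of-words count vector stands in for the sentence-embedding model, cosine similarity is an assumed distance measure, and the names `embed`, `cosine`, `recommend`, and the sample developer repositories are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for a sentence-embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(bug_text, repositories, k=2):
    """Return the k developers whose word repository is nearest to the bug report."""
    bug_vec = embed(bug_text)
    scored = [(cosine(bug_vec, embed(words)), dev)
              for dev, words in repositories.items()]
    # Highest similarity first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [dev for _, dev in scored[:k]]

# Hypothetical developer-specific word repositories built from past bug reports.
repositories = {
    "alice": "editor crash null pointer render",
    "bob": "network timeout socket retry",
    "carol": "editor font render layout",
}
top = recommend("editor crash on render", repositories, k=2)
print(top)
```

In this toy data the two developers with editor-related vocabularies are returned ahead of the networking developer. Replacing `embed` with a real sentence-embedding model would not change the structure of the nearest-neighbour search.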
Alongside BSDRM, this study introduces the second model, DevSched, which 
balances developer workloads. DevSched factors in workload 
distribution, developer proficiency, and bug characteristics. It generates multiple developer 
profiles based on historical bug reports and assigns bugs to developers by assessing the 
highest similarity between bug vectors and developer corpora. DevSched also dynamically 
adjusts developer workloads and refines their ratings based on performance. The study 
utilizes bug reports from Eclipse, Mozilla, and NetBeans to evaluate developer performance 
in the bug-triaging process. DevSched efficiently assigns and balances bugs among various 
developer categories, resulting in significantly reduced standard deviations for Eclipse, 
NetBeans, and Mozilla datasets compared to conventional bug distribution processes. The 
assignment process is repeated for each bug, supporting balanced resource allocation and timely 
resolution of critical issues.  
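
The effect of workload-aware assignment on the standard deviation of developer loads can be sketched as follows. This is an illustration under stated assumptions, not DevSched itself: the scoring rule (similarity minus a linear penalty on current load), the `penalty` value, the function name `assign`, and the similarity scores are all hypothetical, and the rating refinement step is omitted.

```python
from statistics import pstdev

def assign(bugs, similarity, penalty=0.0):
    """Assign each bug to the developer with the best load-adjusted similarity.

    similarity[bug][dev] is an assumed precomputed score in [0, 1].
    With penalty=0.0 this reduces to plain highest-similarity assignment.
    """
    devs = sorted(next(iter(similarity.values())))
    load = {dev: 0 for dev in devs}
    for bug in bugs:
        best = max(devs, key=lambda d: similarity[bug][d] - penalty * load[d])
        load[best] += 1  # update the workload before the next bug is assigned
    return load

# Hypothetical similarity scores between four bugs and three developer categories.
similarity = {
    "b1": {"senior": 0.9, "mid": 0.7, "fresh": 0.5},
    "b2": {"senior": 0.8, "mid": 0.6, "fresh": 0.5},
    "b3": {"senior": 0.9, "mid": 0.5, "fresh": 0.7},
    "b4": {"senior": 0.7, "mid": 0.6, "fresh": 0.4},
}
bugs = ["b1", "b2", "b3", "b4"]

greedy = assign(bugs, similarity)                  # similarity only
balanced = assign(bugs, similarity, penalty=0.3)   # load-aware

print(greedy, pstdev(list(greedy.values())))
print(balanced, pstdev(list(balanced.values())))
```

With these made-up scores, similarity-only assignment sends every bug to the senior developer, whereas the load-aware variant spreads the bugs across all three categories and yields a lower standard deviation of per-developer load.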
Together, the two models aim to enhance bug resolution efficiency, optimize 
developer workloads, and ensure that both experienced and newer developers are appropriately 
utilized in the bug triaging process.