TY - JOUR
T1 - Improved Regression Analysis with Ensemble Pipeline Approach for Applications Across Multiple Domains
AU - Banik, Debajyoty
AU - Paul, Rahul
AU - Rathore, Rajkumar Singh
AU - Jhaveri, Rutvij H.
N1 - Publisher Copyright:
© 2024 Copyright held by the owner/author(s).
PY - 2024/3/9
Y1 - 2024/3/9
N2 - In this research, we introduce two new machine learning regression methods: the Ensemble Average and the Pipelined Model. These methods aim to enhance traditional regression analysis for predictive tasks and have undergone thorough evaluation across three datasets, Kaggle House Price, Boston House Price, and California Housing, using various performance metrics. The results consistently show that our models outperform existing methods in terms of accuracy and reliability across all three datasets. The Pipelined Model, in particular, is notable for its ability to combine predictions from multiple models, leading to higher accuracy and impressive scalability. This scalability allows for their application in diverse fields like technology, finance, and healthcare. Furthermore, these models can be adapted for real-time and streaming data analysis, making them valuable for applications such as fraud detection, stock market prediction, and IoT sensor data analysis. Enhancements to the models also make them suitable for big data applications, ensuring their relevance for large datasets and distributed computing environments. It is important to acknowledge some limitations of our models, including potential data biases, specific assumptions, increased complexity, and challenges related to interpretability when using them in practical scenarios. Nevertheless, these innovations advance predictive modeling, and our comprehensive evaluation underscores their potential to provide increased accuracy and reliability across a wide range of applications. The results indicate that the proposed models outperform existing models in terms of accuracy and robustness for all three datasets. The source code can be found at https://huggingface.co/DebajyotyBanik/Ensemble-Pipelined-Regression/tree/main
AB - In this research, we introduce two new machine learning regression methods: the Ensemble Average and the Pipelined Model. These methods aim to enhance traditional regression analysis for predictive tasks and have undergone thorough evaluation across three datasets, Kaggle House Price, Boston House Price, and California Housing, using various performance metrics. The results consistently show that our models outperform existing methods in terms of accuracy and reliability across all three datasets. The Pipelined Model, in particular, is notable for its ability to combine predictions from multiple models, leading to higher accuracy and impressive scalability. This scalability allows for their application in diverse fields like technology, finance, and healthcare. Furthermore, these models can be adapted for real-time and streaming data analysis, making them valuable for applications such as fraud detection, stock market prediction, and IoT sensor data analysis. Enhancements to the models also make them suitable for big data applications, ensuring their relevance for large datasets and distributed computing environments. It is important to acknowledge some limitations of our models, including potential data biases, specific assumptions, increased complexity, and challenges related to interpretability when using them in practical scenarios. Nevertheless, these innovations advance predictive modeling, and our comprehensive evaluation underscores their potential to provide increased accuracy and reliability across a wide range of applications. The results indicate that the proposed models outperform existing models in terms of accuracy and robustness for all three datasets. The source code can be found at https://huggingface.co/DebajyotyBanik/Ensemble-Pipelined-Regression/tree/main
KW - Ensemble average
KW - XgBoost
KW - light gradient boost
KW - pipeline scheme
KW - random forest
UR - http://www.scopus.com/inward/record.url?scp=85188351513&partnerID=8YFLogxK
U2 - 10.1145/3645110
DO - 10.1145/3645110
M3 - Article
SN - 2375-4699
VL - 23
SP - 1
EP - 13
JO - ACM Transactions on Asian and Low-Resource Language Information Processing
JF - ACM Transactions on Asian and Low-Resource Language Information Processing
IS - 3
ER -