
Although other transportation options for oil and gas exist, steel pipelines remain the most cost-effective means of transport. However, pipelines are prone to degradation and damage when exposed to corrosive substances. The economic impact of corrosion failure and environmental damage is estimated at around 3.4% of the world's GDP, equivalent to $2.5 trillion annually. Common flow simulators currently rely on semi-empirical models to predict corrosion rates. The widely used semi-empirical model of De Waard, Lotz, and Dugstad, known as De Waard 95, struggles to represent the complex non-linear interactions involved in internal CO2 corrosion of pipelines. Machine learning has therefore emerged as a promising alternative: prediction models are trained on data and can better represent the non-linear relationships among multiple corrosion factors. This study employs a single machine learning model, k-nearest neighbors (KNN), and four ensemble learning models (random forest, gradient boosting tree, AdaBoost, and XGBoost) to address the problem. Ensemble learning combines multiple single machine learning models to make more accurate predictions, and its implementation is simpler than that of hybrid models. Most of the proposed ensemble learning models demonstrate acceptable predictive performance for internal corrosion rates. Among them, the XGBoost model consistently exhibits the best overall performance and outperforms the single machine learning model (KNN). However, some ensemble learning models, such as AdaBoost, perform worse than KNN. Note that hyperparameter tuning has not yet been conducted, which may affect the models' performance.
Nonetheless, the ensemble learning models still generally exhibit better predictive ability for internal CO2 corrosion rates than the single machine learning model. Additionally, the machine learning predictions of the internal CO2 corrosion rate are compared with those of the commonly used semi-empirical correlation (De Waard 95), revealing the superiority of machine learning over the traditional method. AI/ML therefore proves to be a better solution for predicting pipeline corrosion rates from multiple variables than the semi-empirical methods used in several common flow simulators. Furthermore, feature importance analyses are conducted to assess the significance of the input variables and understand their influence on the output. The study identifies the partial pressure of CO2 as the most important feature, playing a vital role in initiating or accelerating internal CO2 corrosion in oil and gas pipelines. One practical application of this study is estimating the time until pipeline leakage. Applying the best model, XGBoost, to data recorded under the conditions from which the dataset was taken yields the internal CO2 corrosion rate. Based on this rate, the time remaining before pipeline leakage occurs can be estimated. If leakage is predicted to occur within a specified time limit, this serves as a notice to engineers to prepare a replacement program for the pipe.
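
The leakage-time estimate described above can be illustrated with a minimal sketch. It assumes a constant, uniform corrosion rate (linear wall loss), which the abstract does not state; the function names, the optional minimum allowable thickness, and the numeric values are hypothetical and serve only as an illustration, not as the study's actual procedure.

```python
import math


def years_until_leak(wall_thickness_mm, corrosion_rate_mm_per_year,
                     min_allowable_thickness_mm=0.0):
    """Estimate the time (years) until the wall thins to the allowable minimum.

    Assumption: the predicted corrosion rate stays constant over time, so
    wall loss is linear. A non-positive rate means no thinning is predicted.
    """
    if corrosion_rate_mm_per_year <= 0:
        return math.inf
    # Metal that can still be lost before the pipe reaches the minimum thickness.
    margin_mm = max(wall_thickness_mm - min_allowable_thickness_mm, 0.0)
    return margin_mm / corrosion_rate_mm_per_year


def needs_replacement_plan(years_left, time_limit_years):
    """Flag the pipe when the estimated leakage time falls within the time limit."""
    return years_left <= time_limit_years


# Illustrative values only: a 10 mm wall, a model-predicted rate of
# 0.5 mm/year, and a hypothetical 4 mm minimum allowable thickness.
years_left = years_until_leak(10.0, 0.5, min_allowable_thickness_mm=4.0)
flag = needs_replacement_plan(years_left, time_limit_years=15.0)
print(years_left, flag)
```

In practice the corrosion rate passed to `years_until_leak` would come from the trained model's prediction for the recorded operating conditions, and the time limit would be set by the operator's planning horizon.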