
Abstract - Muhammad Ryanrahmadifa

PT Bioma Bersama Indonesia is a company that develops quantitative finance solutions based on artificial intelligence for the crude oil sector. Its existing crude oil futures trading model performed poorly in deployment, with a potential yearly return of negative 8%, which signals a major problem. Investment strategies are usually weighed by their expected return and risk, especially in a volatile market such as commodity futures. This study therefore proposes an ensemble and hierarchical Multi-agent Deep Reinforcement Learning (MDRL) approach, in which multiple agents designed with different reward functions are integrated with state-of-the-art temporal feature extraction, to solve crude oil futures trading formulated as a Markov decision process. The study implements MDRL using daily Brent Crude Oil Futures price data together with technical and fundamental analysis. TimesNet, with a MobileNetV3 vision backbone, is integrated for temporal feature extraction, and MDRL uses two actor-critic algorithms, A2C and PPO, and one value-based algorithm, DDQN. Each algorithm trains three agents, each with a distinct reward function: risk aversion based on the Sharpe ratio, mid-term profit, and immediate profit. Agents are trained on in-sample data from November 2014 to 2022 using a growing rolling window of 2 (two) years, followed by a validation step on out-of-sample data extending to November 2024. Experiments show that the ensemble MDRL framework outperforms both the baseline Buy-and-Hold strategy and hierarchical MDRL in terms of accumulated rewards. The A2C agent with the mid-term profit reward achieves the highest yearly performance, with a 13.031% return, a -14.366% maximum drawdown, a Sharpe ratio of 0.554, and a Sortino ratio of 0.795, indicating both profitability and risk aversion.
Because the agent generates profits for the company despite the bearish nature of the market, this study also makes recommendations for deploying the framework, focusing on the balance between model effectiveness and computational resource efficiency.
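The four figures reported above (yearly return, maximum drawdown, Sharpe ratio, Sortino ratio) can be illustrated with a minimal sketch. The abstract does not state how these metrics were computed, so everything here is an assumption: simple daily returns, 252 trading days per year, a zero risk-free rate by default, and hypothetical function names.

```python
import math

TRADING_DAYS = 252  # assumed annualization factor; not stated in the abstract


def annualized_return(returns, periods=TRADING_DAYS):
    """Compound simple per-period returns and scale the result to one year."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth ** (periods / len(returns)) - 1.0


def sharpe_ratio(returns, risk_free=0.0, periods=TRADING_DAYS):
    """Annualized mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free / periods for r in returns]
    mean = sum(excess) / len(excess)
    var = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    std = math.sqrt(var)
    return mean / std * math.sqrt(periods) if std > 0 else float("nan")


def sortino_ratio(returns, risk_free=0.0, periods=TRADING_DAYS):
    """Like the Sharpe ratio, but only negative excess returns count as risk."""
    excess = [r - risk_free / periods for r in returns]
    mean = sum(excess) / len(excess)
    downside = math.sqrt(sum(min(x, 0.0) ** 2 for x in excess) / len(excess))
    return mean / downside * math.sqrt(periods) if downside > 0 else float("nan")


def max_drawdown(returns):
    """Largest peak-to-trough loss of the equity curve, as a negative fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst
```

Under these assumptions, a maximum drawdown of -14.366% would mean that the equity curve at its worst point stood 14.366% below its previous peak. A Sortino ratio higher than the Sharpe ratio, as reported here, is what one would expect when part of the return volatility comes from upside moves.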