TY - GEN
T1 - A Hybrid Reinforcement Learning Framework for Dynamic Resource Allocation in Malware Analysis Systems
AU - Mohanty, Anoushka
AU - Nayak, Sambhav
AU - Yang, Tiansheng
AU - Singh Rathore, Rajkumar
AU - Mo, Danyu
AU - Wang, Lu
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024/10/24
Y1 - 2024/10/24
N2 - Emerging strains of malware demand adaptive cybersecurity systems that can balance detection quality against performance while making optimal use of resources, especially as threats evolve quickly. This paper presents a reinforcement learning (RL) policy for dynamic resource allocation in malware analysis: an RL agent observes incoming files and assigns computing resources based on their estimated threat level. The agent learns through trial and error to prioritize sample examination while maximizing resource utilization; the study also describes the design of the RL environment, the structure of the agent, and the reward system. The model achieves strong accuracy across stages, reaching 96.1% on training data, 92.2% on testing data, and 90.5% on validation data, outperforming previous methods under rapidly changing threat conditions. The model's continuous learning capability keeps it effective against new threats and supports the development of resource-aware, scalable, and adaptable malware analysis systems that can withstand evolving cybersecurity risks. The paper also discusses the benefits of this approach and the main challenges it faces for dynamic malware analysis and resource allocation.
AB - Emerging strains of malware demand adaptive cybersecurity systems that can balance detection quality against performance while making optimal use of resources, especially as threats evolve quickly. This paper presents a reinforcement learning (RL) policy for dynamic resource allocation in malware analysis: an RL agent observes incoming files and assigns computing resources based on their estimated threat level. The agent learns through trial and error to prioritize sample examination while maximizing resource utilization; the study also describes the design of the RL environment, the structure of the agent, and the reward system. The model achieves strong accuracy across stages, reaching 96.1% on training data, 92.2% on testing data, and 90.5% on validation data, outperforming previous methods under rapidly changing threat conditions. The model's continuous learning capability keeps it effective against new threats and supports the development of resource-aware, scalable, and adaptable malware analysis systems that can withstand evolving cybersecurity risks. The paper also discusses the benefits of this approach and the main challenges it faces for dynamic malware analysis and resource allocation.
KW - adaptive malware analysis
KW - continuous learning
KW - dynamic resource allocation
KW - reinforcement learning
KW - threat level estimation
UR - http://www.scopus.com/inward/record.url?scp=85208802827&partnerID=8YFLogxK
U2 - 10.1109/iacis61494.2024.10721700
DO - 10.1109/iacis61494.2024.10721700
M3 - Conference contribution
SN - 979-8-3503-6067-7
T3 - International Conference on Intelligent Algorithms for Computational Intelligence Systems, IACIS 2024
SP - 1
EP - 6
BT - International Conference on Intelligent Algorithms for Computational Intelligence Systems, IACIS 2024
PB - Institute of Electrical and Electronics Engineers (IEEE)
T2 - 2024 International Conference on Intelligent Algorithms for Computational Intelligence Systems, IACIS 2024
Y2 - 23 August 2024 through 24 August 2024
ER -