In human-computer gaming scenarios, the autonomous decision-making problem of an unmanned combat air vehicle (UCAV) is a complex sequential decision-making problem involving multiple decision-makers. In this paper, an autonomous maneuver decision-making method for UCAVs is proposed that builds on a game-theoretic approach and considers the partially observable states of the human adversary.

The demand for autonomous motion control of unmanned aerial vehicles in air combat is growing as taking the initiative in combat becomes more and more crucial. However, the inability of unmanned aerial vehicles to manoeuvre autonomously during air combat, which features highly dynamic and uncertain enemy manoeuvres, limits their combat capabilities, and overcoming it proves very challenging. To meet this challenge, this article proposes an autonomous manoeuvre decision model using an expert actor-based soft actor critic algorithm that reconstructs the experience replay buffer with expert experience. Specifically, the algorithm uses a small amount of expert experience to increase the diversity of the samples, which can largely improve the exploration and utilisation efficiency of deep reinforcement learning. To simulate the complex battlefield environment, a one-to-one air combat model is established and the concept of the missile's attack region is introduced. The model enables one-to-one air combat to be simulated under different initial battlefield situations. Simulation results show that, compared with the soft actor critic algorithm, the expert actor-based soft actor critic algorithm finds the most favourable policy for unmanned aerial vehicles to defeat the opponent faster and converges more quickly.

In this paper, a hybrid deep learning network-based model is proposed and implemented for maneuver decision-making in an air combat environment. The model consists of a stacked sparse auto-encoder network for dimensionality reduction of high-dimensional, dynamic time series combat-related data and a long short-term memory network for capturing the quantitative relationship between maneuver control variables and the time series combat-related data after dimensionality reduction. The model has three features: 1) time series data are used as the basis of decision-making, which is more in line with the actual decision-making process; 2) a stacked sparse auto-encoder network reduces the dimension of the time series data so that the result is predicted more accurately; 3) the model takes the maneuver control variables as its output to control the maneuver, making the maneuver process more flexible. The relevant experiments demonstrate that the proposed model can effectively improve the prediction accuracy and convergence rate in the prediction of maneuver control variables.
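
The idea of rebuilding the replay buffer with a small amount of expert experience, described above for the expert actor-based soft actor critic algorithm, can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ExpertSeededReplayBuffer`, the `expert_fraction` parameter, and the fixed per-batch mixing rule are all assumptions made for illustration, since the abstract does not say how expert and agent samples are combined.

```python
import random
from collections import deque


class ExpertSeededReplayBuffer:
    """Hypothetical replay buffer that mixes expert and agent transitions.

    Assumption: each sampled batch reserves a fixed fraction of slots for
    expert transitions; the source does not specify the actual scheme.
    """

    def __init__(self, capacity, expert_transitions, expert_fraction=0.25, seed=None):
        # Agent experience is overwritten first-in-first-out once full.
        self.agent = deque(maxlen=capacity)
        # Expert experience is kept permanently and never overwritten.
        self.expert = list(expert_transitions)
        self.expert_fraction = expert_fraction
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        """Store one transition collected by the learning agent."""
        self.agent.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Draw a batch containing both expert and agent transitions."""
        n_expert = min(len(self.expert), round(batch_size * self.expert_fraction))
        n_agent = min(len(self.agent), batch_size - n_expert)
        batch = self.rng.sample(self.expert, n_expert)
        batch += self.rng.sample(list(self.agent), n_agent)
        self.rng.shuffle(batch)  # avoid a fixed expert-then-agent ordering
        return batch


# Usage with placeholder transitions (states are just labelled tuples here).
expert_data = [(("expert", i), 0, 1.0, ("expert", i + 1), False) for i in range(4)]
buffer = ExpertSeededReplayBuffer(capacity=100, expert_transitions=expert_data, seed=0)
for i in range(20):
    buffer.add(("agent", i), 1, 0.0, ("agent", i + 1), False)
batch = buffer.sample(8)
```

Keeping the expert transitions in a separate, permanent store means they continue to diversify the sampled batches even after the agent's own experience has cycled through the buffer many times. A soft actor critic learner would consume `batch` in place of a uniformly sampled one.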