For my 2024/2025 research project, I investigated whether Deep Q-Learning (DQL) could overcome the combinatorial explosion associated with the Travelling Salesman Problem (TSP). I ran DQL with different parameter tunings and found that adjusting the exploration decay, minimum exploration rate and learning rate could improve performance. I then compared DQL with a Genetic Algorithm (GA) on both solution cost and runtime. Some of the results are shown below:



The results showed that DQL could indeed be applied to the TSP, but that it was inferior to the GA in both runtime and final cost.
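
For readers unfamiliar with the tuned parameters, here is a minimal sketch of how exploration decay, minimum exploration and learning rate typically enter a Q-learning setup. It is an illustration only, not the code from my report: the function names, the multiplicative decay rule and all default values are assumptions, and the update is shown in its simple tabular form, which a deep Q-network approximates with gradient steps.

```python
def epsilon_schedule(episodes, start=1.0, decay=0.995, minimum=0.05):
    """Exploration rate per episode under multiplicative decay.

    All defaults are illustrative assumptions, not values from the report.
    """
    eps = start
    schedule = []
    for _ in range(episodes):
        schedule.append(eps)
        # Shrink exploration each episode, but never below the floor.
        eps = max(minimum, eps * decay)
    return schedule


def q_update(q_old, reward, best_next_q, learning_rate=0.1, discount=0.99):
    """One temporal-difference step toward reward + discount * best_next_q."""
    target = reward + discount * best_next_q
    # The learning rate sets how far the estimate moves toward the target.
    return q_old + learning_rate * (target - q_old)


if __name__ == "__main__":
    schedule = epsilon_schedule(1000)
    print(schedule[0], schedule[-1])
    print(q_update(0.0, 1.0, 0.0, learning_rate=0.5))
```

A slower decay or a higher minimum keeps the agent trying new city orderings for longer, while the learning rate trades off stability against speed of learning.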
My full report can be viewed Here.