Travelling Salesman Problem toy

Points	One line per point. Each line holds x and y coordinates separated by a comma.
Lines	Start with any point number (first point is point 1). Then list other points separated by a comma. Program assumes that the tour will return to the first point after the last point listed.

Points	Lines
Replace this text with the points in the problem set.	Replace this text with the lines in the problem set.	Plot random dots

Then improve by …
Then improve by …

Currently not init. Improvements: not init. Current length: not init.

Warning

This program does not produce what is definitely the shortest route. This program is more for letting you understand the issues involved and see how various operations change the route. The operations this program implements are not guaranteed to find the shortest route. They're pretty good but they suffer from the problem of local minima.

When creating …

in order – string points together in the order they were listed
at random – string points together in a random order
add in best – add points in turn in the place where they causes least expansion
add furthest – repeatedly add whichever point which will cause the most expansion

When improving …

move dot – take each dot of the route and try it in every other position
flip subroute – take each section of the route and try it reversed
swap – Alternate between the above two until no improvements are made
move route – take each section of the route and try it in every other position

Discussion

The Travelling Salesman Problem (TSP) is a much-explored task which has led to discoveries in both psychology and computer science. The problem involves a salesman who leaves his company's headquarters, visits a number of dealers, then returns to his headquarters. The task is to find the route which lets the salesman visit all his dealers with the minimum time spent and distance travelled. There are many versions of this problem with different constraints. This program looks at the one in which the distance travelled between any two points is just proportional to the 2D distance between the two points. In other words, the salesman can always travel along any straight line he chooses, and the cost of that journey is proportional to the length of that line.

The definitive answer to the TSP can be arrived at by considering every possible combination of routes. This would be n!/2n where n is the number of points involved. (There are n! possible orders to put the points in, the route can be travelled either way around (so divide by 2) and start and stop at any of the points (so divide by n)). For n=50, p≈3×10⁶². You do not want to wait while a program tries that many routes to see which is shortest.

Instead this program relies on repeated small improvements to figure out a good (not necessarily the best) route.

It starts by making a good choice of where to insert each new point on the route. Starting with a route of three points (there's only one route that can connect three points) it inserts each new point cleverly, inserting the point at the place in the route where it will add the least extra distance. So for the fourth point there are three possibilities, for the fifth point there are four possibilities, etc.. For n=50, p=122. So for very little added work we can get what already looks like (but isn't) a very good route. Check this out using the 'add in best' button.

Once we have a route (clever or not) we can improve on it. Ideally we'd check each subroute of every possible length and see whether this subroute would be better somewhere else in the sequence, either way around. This would give us the maximum scope in improvements. But it does lead to a huge number of possible things to check: Σ(Σ(i×(n-i))). For n=50 it's okay, but for n=300 pack a lunch, and for n=500 pack a calendar.

So this program also implements two simpler faster algorithms which check fewer things but can do that far more quickly. One just tries to move each point to each possible position on the route. There are only n possible places to do that so it's quick, though each iteration can lead only to a small improvement. The other looks for places on the route where two paths cross. Almost by definition, the route would be shorter if they didn't. There are only n×(n-1) possible cases where that might happen. Both of these algorithms need to be run repeatedly until they can no longer find any way to shorten the tour. Check them out using the 'keep moving dot' and 'keep flipping subroutes' buttons.

Both of the above algorithms spot bad routes and make them better but they both miss possible improvements which will leave you pointing at the screen shouting "Here ! Do this bit !". Fortunately, the two algorithms complement one-another very well: most things one doesn't notice the other does and vice versa. Using them in combination, switching between one and the other, leads quickly to a very good route. So I've included the 'keep swapping' button which makes the program keep switching between them until neither improve the route.

The last button in the row, 'keep moving routes', implements the slow but very thorough algorithm I mentioned before the two fast ones. You can use the fast ones first to find the simple improvements, to avoid having to run the slower algorithm more times.

The disproof

So these algorithms should give a pretty good result, right ? Do something like

put in your points or tell the system to make up 100 random points
hit 'Route add furthest' to make up a good starting route
hit 'Keep swapping' to improve it a lot quickly
hit 'Keep moving routes' to improve it slowly but diligently

Look at what it says for 'Current length' and you get a pretty good result. Yes you do. But if you have more than 50 point you can get a much better result with a little more work:

hit 'Route at random'
hit 'Keep swapping' to improve it a lot quickly
hit 'Keep moving routes' to improve it slowly but diligently

Look at what it says for 'Current length'. Then do the above three steps again, and again and again. You get different lengths. For some reason the random starting route (which is, of course, a terrible route) can change the final result quite a lot. Doing the above three steps ten times and picking the best result often gives a shorter route than any simple improvement algorithm. Why ? Well it seems that repeated small improvements doesn't cut it for the perfect route. It's a very fast way to a very good route but does not necessarily ever find the best route.

This program was created and is maintained by Simon Slavin for and of the Department of Psychology, Lancaster University.