The objective of our final project was to develop a program to calculate the shortest distance between two airports. There were 3 different aspects to consider when creating a project such as this: Dataset acquisition, algorithm implementation, and correctness of output. We converted data from openFlights’ “airport.dat“ into .csv files, in an attempt to improve the manipulability of the data. The data set has a multitude of information on each airport, but we only required the Airport Code, Latitude, longitude, and city name. We parse and clean the data through scanner.cpp. We use readAirport to create a map of airport objects which we will later use to create the graph. We decided to use a unidirectional graph with airport codes as the vertex and data from routes.dat to create the edges of the graph. We have used DFS to check if there is a path between the two given airports and this function essentially recursively calls itself till a path between the two airports can be found. The runtime of this function is O(n^2) as we are running it on an adjacency matrix. Floyd Warshall takes in an adjacency matrix with each entry (i,j) corresponding to start vertex i and destination vertex j with the distance between them as the number stored at that index. If there is no path between i and j, then there is a very high value inputted there. The algorithm uses three loops to create a next matrix with numbers showing which vertex to go to next. First, the algorithm goes through an immediate vertex k, and and goes through all pairs of vertices within the graph and checks if it is shorter to go through k or just go directly. This algorithm runs at O(V^3) since there are three loops going through the graph. The space complexity is O(V^2) because it is storing the start and end destinations twice in a matrix. We are also using A* algorithm to find the shortest distance between two airports. The reason we decided to do use this algorithm even though it gives the same thing as Floy Warshall is because A* is more efficient because of the heuristics used. We calculated the heuristic for each node as the distance between that node and the destination as this will always be an underestimate of the distance from the path. In our implementation we have two 2D vectors of doubles where the first index stores the current index, and the 2nd index stores the previous distance. The runtime would be O(b^d) where b is the number of children or possible destination airports in our case and d is the depth of the final solutions. The space complexity would also be the same.
The tests we used were detailed in checking most of the important methods of each class we implemented with different graphs not just using the airport data so we can test if the classes work with the general data. We also had test cases to make sure our data is being cleaned properly. The main aspects of the graph that we checked were the size and checking if there are any null elements. For the Floyd Warshall algorithm we have 3 test cases that test the algorithm on three different graphs. These three test cases test most possible situations that the graph could have. We were able to accomplish our goals and complete the program we chose, and fully implement the algorithms to provide a user the shortest distance between two airports they input, along with data on each airport. They would also receive a brief report on all other possible routes they could take. Something we would do differently given the opportunity to restart the project and complete it more efficiently would be to better optimize our testing. We believe we spent valuable time manually testing moderately large data sets, and would have been better off had we done manual tests on very small subsets of our data, and utilized superior test cases to determine the correctness of the output.