Today we live in a world where data science and machine learning keep getting bigger. The world currently produces roughly 2.5 quintillion bytes of data a day, and a quintillion is a 1 followed by 18 zeros! But what about situations where there is very little data and action still needs to be taken? For example, when the coast guard is searching for a person lost at sea.
The United States Coast Guard (USCG) encounters about 20,000 search and rescue cases a year. These can be anything: a person who went paddle boarding and lost their way, a boat that has run out of fuel or a cruise ship that is on fire. An emergency is relatively easy to approach if the USCG knows the coordinates. Unfortunately, this is often not the case. To tackle this problem and get to the person or boat as fast as possible, the USCG designed a sophisticated system called SAROPS, which stands for “Search and Rescue Optimal Planning System”. This system uses several mathematical and statistical techniques.
When you encounter a problem in data science, most of the time you have a bunch of data and your main task is to make sense of it. In contrast, every search and rescue operation is unique, so there is not a lot of data, which brings many more obstacles. To tackle this, the designers of SAROPS used a statistical method which remains useful when there is little data: the Bayesian approach.
The Bayesian approach
The basis of this approach is Bayes’ Theorem, named after Thomas Bayes, whose work on it was published posthumously in 1763.
This is a formula which you probably remember from your first probability theory course. For two events $A$ and $B$ with $P(B) > 0$, it is given by:
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
I will illustrate Bayes’ Theorem with an example. Suppose that you have taken a look at four F1 races between Lewis Hamilton and Max Verstappen. Hamilton won three of them and Verstappen only one. So, if you had to bet on who would win the next race, what would you guess? Lewis Hamilton, right? But what if I now tell you that it rained during one of the races which Hamilton won and during the race which Verstappen won, and that it will certainly rain during the next race? This definitely increases the probability of Verstappen winning it, but by how much? Let $V$ be the event of Verstappen winning the race and let $R$ be the event of rain. We want to know the probability $P(V \mid R)$. We already know that $P(V) = 1/4$, since Verstappen won one out of four races. Next to that, $P(R) = 2/4 = 1/2$, since it rained during two out of four races. Lastly, $P(R \mid V) = 1$, since it rained during the only race which Verstappen won. This yields, using Bayes’ Theorem, that $P(V \mid R) = \frac{P(R \mid V)\,P(V)}{P(R)} = \frac{1 \cdot 1/4}{1/2} = 1/2$. Hence, the information that it will rain during the next race doubled the probability of Verstappen winning, from 1/4 to 1/2.
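The arithmetic of the racing example can be checked with a few lines of Python. This is only a small sketch: the list `races` and the helper `bayes_posterior` are names I made up for illustration, with V standing for "Verstappen wins" and R for "it rains". The four records encode exactly the races described above.

```python
from fractions import Fraction

# The four races from the example: (winner, did it rain?)
races = [
    ("Hamilton", True),
    ("Hamilton", False),
    ("Hamilton", False),
    ("Verstappen", True),
]

def bayes_posterior(prior, likelihood, evidence):
    """Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood * prior / evidence

n = len(races)
# P(V): fraction of races won by Verstappen.
p_v = Fraction(sum(winner == "Verstappen" for winner, _ in races), n)
# P(R): fraction of races in which it rained.
p_r = Fraction(sum(rained for _, rained in races), n)
# P(R|V): fraction of Verstappen's wins in which it rained.
rain_in_v_wins = [rained for winner, rained in races if winner == "Verstappen"]
p_r_given_v = Fraction(sum(rain_in_v_wins), len(rain_in_v_wins))

p_v_given_r = bayes_posterior(p_v, p_r_given_v, p_r)
print(p_v_given_r)  # 1/2
```

Using `Fraction` instead of floats keeps the result exact, so the output can be compared directly with the hand calculation.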
This was of course a pretty simple example. How can we use this to find a person lost at sea? In this case we have very little data, but we definitely have some. Firstly, we have some information about where the person was last seen. Secondly, we know things like the speed and direction of the wind, the weight of the victim and the state of the sea (e.g. whether there are big waves or not). SAROPS then conditions the probabilities on this data and uses Bayes’ Theorem to draw a probability map of where the person might be right now. During the search we gather more data: we learn, for example, where the person is not. It could also be that we get a call from someone who saw the person an hour ago. The useful thing about the Bayesian approach is that we can simply add all this information to the data we already have and obtain a new probability. This way, the probability map of where the person might be is constantly updated during the search.
A probability map in SAROPS (source: Wikipedia)
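To make the idea of updating the map a bit more concrete, here is a minimal sketch of a single Bayesian update after a cell has been searched without success. It is not how SAROPS itself is implemented. The three-cell map, the function `update_after_failed_search` and the detection probability `p_detect` are all assumptions chosen for illustration.

```python
def update_after_failed_search(prior, searched_cell, p_detect):
    """Update a probability map after one cell was searched without success.

    prior         -- dict: cell name -> probability that the person is there
    searched_cell -- the cell that was just searched
    p_detect      -- assumed chance of spotting the person if they are in that cell
    """
    # Likelihood of "not found" for each cell: if the person is in the searched
    # cell they were missed with probability 1 - p_detect; elsewhere a failed
    # search was certain.
    likelihood = {
        cell: (1.0 - p_detect) if cell == searched_cell else 1.0
        for cell in prior
    }
    unnormalised = {cell: likelihood[cell] * prior[cell] for cell in prior}
    evidence = sum(unnormalised.values())  # P(not found)
    return {cell: value / evidence for cell, value in unnormalised.items()}

# Hypothetical starting map with three cells.
prior = {"A": 0.5, "B": 0.3, "C": 0.2}
posterior = update_after_failed_search(prior, "A", p_detect=0.8)
for cell, p in posterior.items():
    print(cell, round(p, 3))
```

In this toy case the searched cell A drops from 0.5 to about 0.17, while the unsearched cells B and C rise to 0.5 and about 0.33. The posterior can then serve as the prior for the next update.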
In SAROPS, there are four basic components: an environmental data server (EDS), a simulator (SIM), a search planner and a graphical user interface (GUI). For us econometricians, the simulator and the search planner are the most interesting. The simulator is the part of SAROPS which creates the probability map, and it does so using the Monte Carlo method. It constructs about 10,000 possible scenarios of how the person could have reached a certain position. One of the uncertainties is the exact starting position of the person: most of the time the coast guard only knows that the person went swimming in some bay. Another uncertainty is the time at which the person stopped swimming and started drifting. There are many more uncertainties, and the simulator constructs 10,000 different paths the person could have taken. Of course, some scenarios are more likely than others, and by taking this into account the system constructs a probability map.
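The Monte Carlo idea can be sketched in a few lines. This is a toy version under strong assumptions and not the SAROPS simulator: the size of the bay, the mean current, the noise level and the function name `simulate_probability_map` are all invented for illustration. Every scenario also gets equal weight here, so the probability of a grid cell is simply the share of simulated paths that end in it.

```python
import math
import random
from collections import Counter

def simulate_probability_map(n_scenarios=10_000, max_hours=6.0, seed=42):
    """Simulate drift paths and count in which grid cell each one ends.

    All numbers (bay size, current, noise) are made-up illustration values.
    """
    rng = random.Random(seed)
    dt = 0.25             # time step in hours
    current = (0.8, 0.2)  # assumed mean drift in km/h (east, north)
    end_cells = Counter()

    for _ in range(n_scenarios):
        # Uncertainty 1: the exact starting position inside a 2 km x 2 km bay.
        x, y = rng.uniform(0.0, 2.0), rng.uniform(0.0, 2.0)
        # Uncertainty 2: for how long the person has been drifting.
        drift_hours = rng.uniform(0.0, max_hours)
        for _ in range(int(drift_hours / dt)):
            # Mean drift plus random noise standing in for wind and waves.
            x += current[0] * dt + rng.gauss(0.0, 0.1)
            y += current[1] * dt + rng.gauss(0.0, 0.1)
        # Record the 1 km x 1 km grid cell in which this scenario ends.
        end_cells[(math.floor(x), math.floor(y))] += 1

    # Share of scenarios per cell = estimated probability for that cell.
    return {cell: count / n_scenarios for cell, count in end_cells.items()}

prob_map = simulate_probability_map()
# Show the five most likely cells.
for cell, p in sorted(prob_map.items(), key=lambda item: -item[1])[:5]:
    print(cell, round(p, 3))
```

Fixing the seed makes the run reproducible. With a different seed the individual paths change, but with 10,000 scenarios the resulting map should stay roughly the same.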
Such a probability map is really cool and useful, but a question that might come to you right now is: where do you start? If you are not fast enough, the person might die, so the choice of route can’t be a random guess. Next to that, a USCG rotary-wing aircraft costs $9-14K per hour and a USCG cutter (a big boat) costs $3-15K per hour. This clearly shows that the boats and aircraft should work effectively instead of just wandering around in the search area. To address this issue, SAROPS also has its own search planner. This planner recommends search plans based on the available search assets (number of boats, helicopters, etc.) and it also computes the probability of success. Based on this, the USCG can also make a better decision on whether to call in extra aircraft and boats. In the picture below, you can see some blue and green lines which show the routes which the vehicles of the U.S. Coast Guard should take.
Output of the planner
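How a planner could weigh search assets against the probability of success can be illustrated with a very simplified sketch. It is not the SAROPS planner, which plans actual routes. Here I assume that each asset can search exactly one cell of the map and spots the person with the same probability `p_detect` if they are there. The function `plan_search` and the three-cell map are hypothetical.

```python
def plan_search(prob_map, n_assets, p_detect):
    """Send each asset to one of the most probable cells.

    Returns the chosen cells and the probability of success, i.e. the chance
    that the person is in a searched cell and is also detected there.
    """
    ranked = sorted(prob_map, key=prob_map.get, reverse=True)
    chosen = ranked[:n_assets]
    p_success = sum(prob_map[cell] * p_detect for cell in chosen)
    return chosen, p_success

# Hypothetical probability map with three cells.
prob_map = {"A": 0.5, "B": 0.3, "C": 0.2}
for n_assets in (1, 2, 3):
    cells, p_success = plan_search(prob_map, n_assets, p_detect=0.8)
    print(n_assets, cells, round(p_success, 2))
```

In this toy setting a second asset raises the probability of success from 0.4 to 0.64, and a third raises it to 0.8. That is the kind of comparison that helps decide whether extra boats or aircraft are worth calling in.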
So, SAROPS turns out to be a super useful tool in search and rescue operations. It bases the search procedure on mathematics and statistics. This increases the probability of finding that lost person and saving their life. So, yes, econometrics can save lives at sea. Hopefully this motivates you a bit to start studying again after the past summer break.
Source: Kratzke, T., Stone, L. & Frost, J.R. (2010). Search and Rescue Optimal Planning System.
This article was written by Stan Koobs