The lack of a standardized procedure for collecting data about elusive, hard-to-find species such as the great white shark has to date seriously hampered efforts to manage and protect these animals.
But now a marine biologist, an applied mathematician and a software developer from Stellenbosch University (SU) have pooled their expertise to develop a custom-made software package, called Identifin, that may offer a solution to this problem.
Dr Sara Andreotti, a marine biologist in the Department of Botany and Zoology at SU, has collected over 5000 photographic images of the dorsal fins of white sharks along the South African coastline as part of her research on the population structure of South Africa’s great white sharks. She focuses on the dorsal fin because its trailing edge is a unique trait, analogous to a human fingerprint.
Over six years she managed to manually build a database with information on when and where an individual white shark was sighted. In those cases where she was able to collect a biopsy from the shark, the genetic information was linked to its profile.
But she was doing all this manually on her personal computer.
“I nearly lost my head. I quickly realised that in the long term updating the database was going to consume more and more of my time. That is when I headed over campus to the applied mathematics division and asked for help. I was stunned when they became all excited about my data,” she laughs.
Prof. Ben Herbst, a specialist in machine learning, and Dr Pieter Holtzhausen, a software engineer then busy with his PhD in Applied Mathematics, were overjoyed to be able to work with Dr Andreotti’s database.
Dr Holtzhausen explains: “We used an algorithmic technique called dynamic time-warping to match the fingerprints. With this technique, any data that can be turned into a linear sequence can be analysed. The technique is often used in speech recognition software.”
The image recognition software they developed, called Identifin, compares a semi-automatically drawn trace of the back edge of the dorsal fin to existing images in the database. The images in the database are then re-arranged and ranked by probability of match. If there is a match, the database photograph in the first position will be the correct one (see multimedia images).
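To make the idea concrete, here is a minimal sketch of how dynamic time warping could be used to rank catalogue entries against a new fin trace. It is an illustration only, not Identifin's actual code: it assumes each fin edge has already been reduced to a short list of numbers, the names `dtw_distance`, `rank_matches` and the toy catalogue are hypothetical, and it ranks by raw warping distance (smaller is better) rather than by the match probability the software reports.

```python
from math import inf


def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = cheapest cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = step + min(cost[i - 1][j],      # repeat b[j-1]
                                    cost[i][j - 1],      # repeat a[i-1]
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def rank_matches(query, catalogue):
    """Return (shark_id, distance) pairs, best match first."""
    scored = [(shark_id, dtw_distance(query, contour))
              for shark_id, contour in catalogue.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical toy data: each list stands in for a traced fin edge.
catalogue = {
    "shark_A": [0, 1, 3, 1, 0, 2, 0],
    "shark_B": [0, 2, 0, 2, 0, 2, 0],
    "shark_C": [0, 0, 1, 4, 1, 0, 0],
}
# The same notch pattern as shark_A, with one section stretched.
query = [0, 1, 1, 3, 1, 0, 2, 0]

ranking = rank_matches(query, catalogue)
for shark_id, distance in ranking:
    print(shark_id, distance)
```

Because the warping step lets one sequence "stretch" against the other, the query still lines up perfectly with shark_A even though the two traces differ in length, which is the property that makes the technique attractive for photographs taken at different angles and distances.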
However, while working with Michael Meyer, a marine scientist from the Department of Environmental Affairs, and shark conservationist Michael Rutzen from Shark Diving Unlimited, Dr Andreotti realised that the software needed further refinement if it was to sustain a large database for the long-term monitoring of the white shark population.
“The software had to be capable of quickly matching the fin identification of a newly photographed shark with a possible existing match in the database, and to automatically update the sharks’ id catalogue. The database also had to be user-friendly and structured in such a way so that different researchers can use it over the long term,” she explains.
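The "match or add" behaviour described here can be sketched in a few lines. This is a hedged illustration under stated assumptions, not the way Identifin is built: the function `identify_or_enroll`, the ID format, the threshold value and the placeholder distance function are all hypothetical, and a real system would plug in its own fin-matching score.

```python
def identify_or_enroll(query, catalogue, distance, threshold):
    """Return (shark_id, is_new): the best existing match, or a newly added ID."""
    best_id, best_dist = None, float("inf")
    for shark_id, contour in catalogue.items():
        d = distance(query, contour)
        if d < best_dist:
            best_id, best_dist = shark_id, d

    # Close enough to a known individual: report the existing ID.
    if best_id is not None and best_dist <= threshold:
        return best_id, False

    # Otherwise add the trace to the catalogue under the next unused ID.
    number = len(catalogue) + 1
    while f"shark_{number:04d}" in catalogue:
        number += 1
    new_id = f"shark_{number:04d}"
    catalogue[new_id] = list(query)
    return new_id, True


def placeholder_distance(a, b):
    """Stand-in score (sum of pointwise differences); an assumption for the demo."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical catalogue with one known individual.
catalogue = {"shark_0001": [0, 1, 3, 1, 0]}

seen = identify_or_enroll([0, 1, 3, 1, 0], catalogue, placeholder_distance, threshold=1.0)
new = identify_or_enroll([5, 5, 5, 5, 5], catalogue, placeholder_distance, threshold=1.0)
print(seen, new)
```

Passing the distance function in as a parameter keeps the catalogue logic separate from the matching technique, so the same update step would work whichever scoring method is used.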
While there is still room for improvement, the success of the first trials boosted their hope that in the near future they will be able to use Identifin to monitor white shark populations on a large scale.
“Previously, while at sea, I had to try and memorize which shark is which, to prevent sampling the same individual more than once. Now Identifin can take over. I will only need to download the new photographic identifications from my camera onto a small field laptop and run the software to see if the sharks currently around the boat have been sampled or not.
“By knowing which sharks had not been sampled before we can focus the biopsy collections on them. This saves us both time and money when it comes to genetic analysis in the laboratory,” she adds.
Dr Andreotti says that, to date, the lack of standardized data collection has been a major obstacle to combining datasets on species with a worldwide distribution: “We hope Identifin will offer a solution for the development of a South African and then global adaptive management plan for great white sharks.”
The next step is to adapt Identifin for the identification of other large marine species and to help other researchers facing similar struggles.