A unique project is using some very old, and some very cutting-edge, IT to track the biggest fish in the sea.
Someone grabbed my wrist and pointed sharply below us: ‘Quick, look down!’ I swam to face the seabed and gasped through my snorkel. A shark stretching some 8m long, as long as a bus, had moved silently underneath us — and I hadn’t noticed it at all.
The whale shark, a beautiful creature, a filter-feeder whose giant mouth gives it a benign look, is good at escaping detection. Although it is the largest fish species alive today, it’s still not known how large the global population of whale sharks is, how they migrate across the oceans, or where they give birth.
For a few minutes off the coast of Ningaloo in Western Australia, I was able to swim with one such shark, before it disappeared off into the Indian Ocean, too deep and too fast for humans to follow. All that remained of our encounter was a short video, taken by another snorkeller who’d been with me that day.
The tourist who made the video might have thought he was just creating a holiday souvenir, but unknowingly he was making a scientific record for marine researchers, and tagging the shark we’d both seen in a way that would allow us to reconnect with the animal many years later.
Stills from the video of our encounter were sent to a project that’s helping to find out more about these enigmatic animals, using a technological toolbox that draws on computer vision, social media, and neural networks to track whale shark movements across the globe.
Whaleshark.org has been curating whale shark photos from all over the world for around 15 years, and uses the distinctive spot patterns surrounding the animals’ left pectoral fins to identify particular individuals from their pictures. Because the spot patterns are as unique as a fingerprint, animals’ movements can be tracked across the world using photos and locations uploaded to the site by everyone from professional whale shark researchers to holidaymakers who have snapped the creatures during a scuba dive.
Whaleshark.org is run by Wild Me, a not-for-profit that aims to help wildlife research and conservation using technology. The idea for Whaleshark.org came to Jason Holmberg, Wild Me’s information architect, in the early 2000s after he took part in an expedition in La Paz, Mexico, that tracked animals by attaching plastic tags with spearguns.
Holmberg asked a member of the expedition how often the tags were subsequently resighted. “He said ‘less than one percent of the time’. I said, ‘Oh, there’s some room for improvement then!’” Holmberg told ZDNet. “So I sat down and started programming and said, ‘OK, what if we were to use these natural spots [on a whale shark] like a human fingerprint and just allow people to photograph them?’”
While the spot patterns are distinctive enough for human researchers to identify one shark from any number of animals in the species, Holmberg’s efforts to code an algorithm that could do the same were proving fruitless.
Holmberg’s friend and subsequent co-founder of Wild Me, NASA pulsar astronomer Zavan Arzoumanian, persuaded Holmberg to put aside coding for one night and join him and a Dutch astronomer for a drink.
The chance meeting proved to be the answer Holmberg had needed: after Holmberg dejectedly explained the pattern-matching problem the whale shark project was facing, the Dutchman told him that NASA was already doing exactly the same thing using an algorithm that had come out of the software development for the Hubble Space Telescope.
“When the Hubble telescope takes pictures of the night sky, it tries to turn those pictures into a larger mosaic. What happens is it needs to match star patterns so it can position the photos correctly [within the mosaic]. That process of matching the stars correctly is exactly the process we need to match whale sharks’ spots,” Holmberg said.
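To make the star-matching idea concrete, here is a minimal Python sketch of one well-known way to compare point patterns: describe every triple of points by the shape of its triangle, which stays the same when a pattern is shifted, rotated or resized. This is an illustration only. The article does not describe the internals of the algorithm Whaleshark.org uses, and the function names, the tolerance value and the sample coordinates below are all assumptions made for the example.

```python
from itertools import combinations
from math import dist


def triangle_signatures(points):
    """Return one scale-free signature per triple of points.

    Each signature is the two shorter side lengths divided by the longest,
    which is unchanged by shifting, rotating or uniformly scaling the points.
    """
    signatures = []
    for a, b, c in combinations(points, 3):
        sides = sorted((dist(a, b), dist(b, c), dist(a, c)))
        if sides[2] == 0:
            continue  # skip degenerate triples of identical points
        signatures.append((sides[0] / sides[2], sides[1] / sides[2]))
    return signatures


def match_score(spots_a, spots_b, tolerance=0.01):
    """Fraction of triangles in spots_a that have a similar triangle in spots_b.

    The tolerance is a hypothetical value chosen for this example.
    """
    sigs_a = triangle_signatures(spots_a)
    sigs_b = triangle_signatures(spots_b)
    if not sigs_a or not sigs_b:
        return 0.0
    matched = 0
    for ra, rb in sigs_a:
        if any(abs(ra - qa) <= tolerance and abs(rb - qb) <= tolerance
               for qa, qb in sigs_b):
            matched += 1
    return matched / len(sigs_a)


# Made-up spot coordinates, purely for illustration.
reference = [(0, 0), (4, 1), (1, 5), (6, 6), (3, 9)]
# The same pattern rotated by 90 degrees, doubled in size and shifted.
same_pattern = [(-2 * y + 10, 2 * x + 3) for x, y in reference]
# An unrelated pattern.
other_pattern = [(0, 0), (1, 0), (2, 1), (7, 3), (8, 8)]

print(match_score(reference, same_pattern))
print(match_score(reference, other_pattern))
```

In this toy example the rotated and resized copy scores higher than the unrelated pattern, which is the property a photo-matching system needs: the same animal photographed from a different distance or angle should still line up. A production system would also have to cope with missing spots, lens distortion and many thousands of candidate animals, none of which this sketch attempts.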
Holmberg tracked down the original paper behind the algorithm, which had been created by a Princeton physics professor for NASA’s Hubble program. After some refinement, the algorithm was rolled out on Whaleshark.org and has been used to identify whale sharks ever since.
“The algorithm was developed around 1984. It was really well ahead of its time in terms of its elegance. There are many computer vision algorithms that have been developed since then, even for whale sharks, that don’t work anywhere near as well. It’s the only one that scales to a global dataset… and [can] reliably identify the right whale shark across 50,000 photos,” Holmberg said.
The system is underpinned by Amazon’s EC2 in the Portland, Oregon region, where Wild Me is based. Using a public cloud service allows the organisation to scale the number of servers at its disposal up and down, according to how much computing grunt it needs at any one time.
Surprisingly, the hardest part of building a system that can identify a single whale shark from a cast of thousands is the data management layer: “getting data into a manageable format so [researchers] can identify individual animals and use computer vision systems, which require good, well-managed datasets,” Holmberg said.
To deal with the problem, Wild Me built Wildbook, an open source data management framework for use in wildlife and ecology studies.
“In 2003, I didn’t understand how bad the data management challenge for wildlife was, and it still is, 13 years later. Most people who began using Wildbook were migrating off of 1990s desktop applications — off of Access, off of Excel — which don’t allow them to share data, pool data, collect data from citizen scientists.”