Thousands of herbarium samples have been successfully classified using deep learning algorithms.
Computer algorithms can automatically identify plant species from images of specimens that have been pressed, dried and mounted on herbarium sheets, researchers report.
This is the first attempt to use deep learning, an artificial intelligence technique that trains neural networks on large, complex data sets, to tackle the difficult taxonomic task of identifying species in natural history collections.
Palaeobotanist Peter Wilf of University Park, Pennsylvania, says it probably won’t be the last attempt. “Natural history is headed in this direction. This is the future.”
Researchers and educators all over the world are digitizing their collections and launching open databases to share images of their specimens. One program, iDigBio, boasts more than 150 million images from natural history museums around the United States.
Current Stats
According to a recent study, there are roughly 3,000 herbaria around the world, holding 350 million specimens, only a fraction of which have been digitized.
However, the growing data sets, together with advances in computing techniques, enticed computer scientist Erick Mata-Montero of the Costa Rican Institute of Technology in Cartago to work with botanist Pierre Bonnet of the French Agricultural Research Centre for International Development in Montpellier.
During the course of the Pl@ntNet project, Bonnet’s team gathered millions of images of fresh plants, typically taken in the field by users of their smartphone apps.
Researchers trained similar algorithms on more than 260,000 scans of herbarium sheets representing more than 1,000 species. The correct species was among the algorithms’ top five picks 90% of the time. That is better than a human taxonomist would be expected to do.
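To make the top-five figure concrete, here is a minimal Python sketch of how such a score can be computed. It is not the researchers' code: the function name `top_k_accuracy`, the species names and the rankings are all hypothetical, chosen only to illustrate the metric.

```python
def top_k_accuracy(ranked_predictions, true_labels, k=5):
    """Return the fraction of specimens whose true species is among the first k guesses."""
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("need one ranked list per true label")
    if not true_labels:
        raise ValueError("need at least one specimen")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Hypothetical ranked guesses for three herbarium sheets (best guess first).
ranked = [
    ["Quercus robur", "Quercus petraea", "Fagus sylvatica"],
    ["Acer campestre", "Acer platanoides", "Acer pseudoplatanus"],
    ["Betula pendula", "Betula pubescens", "Alnus glutinosa"],
]
# Hypothetical expert identifications for the same three sheets.
truth = ["Quercus robur", "Acer pseudoplatanus", "Salix alba"]

top1 = top_k_accuracy(ranked, truth, k=1)  # only the first guess counts
top3 = top_k_accuracy(ranked, truth, k=3)  # any of the first three guesses counts
```

With these made-up rankings, the strict first-guess score is one in three, while the top-three score is two in three. A top-five score is therefore always at least as high as the share of exact first-guess matches.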
It is often thought that such results will lessen the value of botanists’ expertise, says Bonnet. “But in fact, it will never reduce the value of our botanical expertise.” Researchers would also still need to verify the results, he says.
Assisting Hands
The approach could also help other projects that rely on manual annotation. For example, a current crowdsourcing project has people tick off which samples contain flowers or fruits. Researchers would welcome an automated way to handle this, says Gil Nelson, a botanist at Florida State University.
According to Bonnet, his algorithm could help collections identify species from relatively sparse data sets, such as those in less-developed regions with fewer resources for botanical research. This could be particularly important in areas that are rich in biodiversity but have small plant collections.
The AI plant identifier also opens the door to further analyses. Herbarium specimens hold a wealth of information: when and where a sample was collected, how densely clustered its flowers are, and what the plant was doing at the time, such as flowering or fruiting.
Because some samples are hundreds of years old, researchers concerned about climate change are increasingly interested in what the specimens reveal about how plants have adapted to shifting climates.
Moving Forward
Researchers have been looking to move towards methods that can extract useful information from images, the identification study among them, says Nelson. “We’re focused on that right now,” he adds.
Besides finding the larvae of insects, Nelson and Wilf are trying to understand the evolution of plant fossils. Plant fossils can pose a variety of problems, whereas herbarium sheets are comparatively straightforward: flat, dry and mounted on paper of a standard size.
However, an expert in the field believes that these details will ultimately be resolved. “It will be better in the future,” he predicts. “Students will not be able to recall what it was like before such tools existed.”