Artificial intelligence tidies up.
Text: Yvonne Vahlensieck
The number of photos and videos that people accumulate over their lifetimes is becoming immeasurably large. In order to maintain an overview, we have no choice but to rely on technical solutions. But this has its disadvantages.
Humanity has now saved almost eight trillion (8,000,000,000,000) photos on smartphones, on computers and in the cloud, and the American market-research company Rise Above Research estimates that this figure is growing by at least another 1.5 trillion each year. Even so, most of these memories will probably never be looked at again. After all, let’s be honest: who still has the time to sort neatly through all of this material nowadays?
Thankfully, this onerous task can now be delegated, for example to apps that recognize subjects and faces, compile photo albums or file photos neatly into categories. “As we look to the future, we’ll increasingly be reliant on services such as these,” says Heiko Schuldt, Professor of Computer Science at the Department of Mathematics and Computer Science at the University of Basel. Schuldt deals with the technical aspects of these tools: How can such huge volumes of data be stored in a way that allows rapid access? How can large collections be searched effectively and in a targeted manner?
In recent years, Schuldt’s research group has developed an innovative system that can do much more than just manage collections of photos. The multimedia search engine “vitrivr” also sifts through other types of media such as videos and audio recordings, and allows users to search using more than just keywords. “You can also search based on sketches, sounds, motion sequences and much more — and in all kinds of media,” says Schuldt. This year, vitrivr led the Basel-based team of researchers to victory in a competition where the aim was to find certain video sequences as quickly as possible within thousands of hours of video material.
In order for the system to answer search queries this quickly, ideally in fractions of a second, features such as colors, shapes and objects are extracted from the photos and videos ahead of time (“offline”) and stored in a database as long lists of numbers. When a user then runs a search, the computer converts the query into a numerical pattern in the same way and looks for the most similar patterns in the database.
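The two stages described above, extracting features ahead of time and comparing numerical patterns at query time, can be illustrated with a deliberately tiny sketch. It is not vitrivr's actual method: the choice of a coarse color histogram as the feature, cosine similarity as the comparison, and all function and file names are assumptions made purely for illustration.

```python
import math

def color_histogram(pixels, bins=4):
    """Turn a list of (r, g, b) pixels into a normalized list of numbers.

    Assumption: a coarse color histogram stands in for the much richer
    features (shapes, objects, ...) that a real system would extract.
    """
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # Map each 0-255 channel value to a bin index in 0..bins-1.
        ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
        hist[ri * bins * bins + gi * bins + bi] += 1.0
    total = sum(hist) or 1.0
    return [count / total for count in hist]

def cosine_similarity(a, b):
    """Similarity of two feature vectors; 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def build_index(collection):
    """Offline step: extract and store the features of every image once."""
    return {name: color_histogram(pixels) for name, pixels in collection.items()}

def search(index, query_pixels, top_k=3):
    """Online step: convert the query the same way, then rank by similarity."""
    query = color_histogram(query_pixels)
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]

# Hypothetical toy "images": each is just a flat list of RGB pixels.
collection = {
    "sunset.jpg": [(250, 120, 30)] * 80 + [(40, 20, 60)] * 20,
    "forest.jpg": [(30, 140, 40)] * 90 + [(90, 60, 30)] * 10,
    "beach.jpg": [(240, 220, 170)] * 60 + [(40, 130, 220)] * 40,
}
index = build_index(collection)

# Query with an orange "sketch"; the sunset image should rank first.
results = search(index, [(245, 110, 40)] * 100, top_k=2)
print(results)
```

Because the expensive feature extraction happens once in `build_index`, a query only has to convert its own input and compare lists of numbers. This sketch compares the query against every stored entry, which would be too slow for a large collection; a real system would need a more efficient way to look up similar entries.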
Many human abilities are lost
“The computer doesn’t see a sunset — it just sees a bunch of numbers,” says Ivan Dokmanić, professor of data analytics at the Department of Mathematics and Computer Science. He is an expert in machine learning — a method that trains a computer to solve a problem with the help of large datasets. Since the computer thereby assembles the suitable algorithms itself, so to speak, machine learning is an important step toward artificial intelligence (AI). Dokmanić is researching the application of machine learning in imaging — for example, to reconstruct higher-quality CT images with reduced radiation exposure. Applications that help find and sort photos operate according to similar principles, having undergone training using millions of photos that humans have previously tagged with keywords.
Dokmanić actually takes something of a critical view of this application of machine learning: “Computers learn differently from people. It’s popular to call it artificial intelligence, but there’s nothing intelligent about it.” The automated systems do deliver results that seem to make sense at first glance. Still, many subtleties are lost — perhaps without our noticing: For example, the app might identify a blurred photo as bad and choose not to display it — even though it shows our daughter’s first steps. Or, conversely, the program might not know that the beach photo includes our ex-girlfriend and therefore doesn’t belong in an album of our best holiday memories.
There is also another problem: Both Dokmanić and Schuldt point out that there are risks associated with unthinkingly entrusting our personal data to the various photo apps and cloud providers. “Although these programs provide some nice added value, that value can come at a very high price. A healthy dose of skepticism is called for,” says Schuldt.
More transparency is needed
The psychologist Florian Brühlmann also believes it is important to gain a better understanding of how these programs work. “The modern algorithms used in machine learning are actually an opaque box, where users can’t understand how decisions are made,” says Brühlmann, who is director of the Human-Computer Interaction Research Group of the University of Basel. Accordingly, there are already calls for these algorithms to satisfy certain ethical criteria, such as reliability, fairness and transparency.
Brühlmann and his colleague Nicolas Scharowski are particularly interested in the last point: “We’re searching for methods to make the behavior and decisions of artificial intelligence more comprehensible to humans. The more complex the systems, the more difficult this becomes. At some point, even the programmers don’t know what exactly is going on inside the opaque box.”
More recent research shows, however, that it may not be necessary to understand everything down to the last detail. For example, it may be helpful simply to know the most relevant decision-making criteria, or to give users an indication of what they could change about the query in order to obtain a different result. In a series of studies over the coming years, Brühlmann wants to evaluate whether and how such explanations actually lead to greater transparency and trust in algorithms in everyday use.
How do the experts themselves handle their own private floods of data? “Perhaps one should dare to enjoy reality tête-à-tête instead of endlessly snapping pictures,” says Ivan Dokmanić. But he adds that even for him it’s an uphill battle, given how addictive smartphones are.
Florian Brühlmann attempts to sort through photos in a timely manner, immediately marking the favorites that he may want to look at again in the future. In contrast, Heiko Schuldt generally archives his images without looking through them (albeit not in the cloud!). Of course, if he wants to find something again, he can always use the search program that he played a part in developing.
More articles in the current issue of UNI NOVA.