An article in the peer-reviewed journal Science Advances reports that researchers have developed an innovative technique for generating captions from brain activity. Before you mentally review all the thoughts you had today, take a deep breath. The technology can’t read the thoughts of random people. The process requires brain scans and AI. Your secrets and inappropriate thoughts are safe—for now.
Although this isn’t the first research team to attempt to generate captions from human thoughts, its approach is different. Producing single words from brain activity is hard enough; here, the researchers are attempting to build complete captions describing what a person sees while watching a short video. Capturing the “complex mental content” (as the researchers put it) involved in perceiving and thinking about a video is an understandably complicated endeavor.
As Medical Xpress notes, the researchers employed an iterative optimization method to generate high-quality captions. The system uses functional magnetic resonance imaging (fMRI) to scan each participant’s brain, along with a large language model that assists in producing captions. The use of AI appears to be the key to their success, although it’s worth noting that this isn’t the first time AI has been used to decode brain activity.
“Crucially, we leveraged an LM pretrained for masked language modeling (MLM) to constrain the search space during optimization,” the researchers wrote in their article. “By directly optimizing word sequences to match brain-decoded features, our method minimizes dependence on external resources such as caption databases or nonlinear captioning models, thereby ensuring the generation of descriptions more closely aligned with brain representations while maintaining the interpretability of structured visual semantics in the brain.”
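To make the idea of “directly optimizing word sequences to match brain-decoded features” concrete, here is a minimal, hypothetical Python sketch of that style of iterative search. Everything in it is an illustrative stand-in: the toy bag-of-words features, the simulated “brain-decoded” target vector, and the random word proposals (which, in the published method, would instead be constrained and ranked by a masked language model). It is not the researchers’ actual code or models.

```python
# Hypothetical sketch of iterative caption optimization, loosely inspired by the
# approach described in the article. All names, features, and scoring here are
# illustrative stand-ins, not the researchers' actual models or data.
import random
import numpy as np

VOCAB = ["a", "person", "jumps", "over", "deep", "water", "fall", "mountain",
         "ridge", "sphere", "metal", "crowd", "rejoices", "on", "the"]

def text_features(words):
    """Toy text encoder: a normalized bag-of-words vector over the small vocabulary.
    A real system would use a learned text/vision feature space."""
    vec = np.zeros(len(VOCAB))
    for w in words:
        if w in VOCAB:
            vec[VOCAB.index(w)] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def score(words, brain_features):
    """Similarity between the candidate caption's features and the
    (here, simulated) features decoded from brain activity."""
    return float(np.dot(text_features(words), brain_features))

def optimize_caption(brain_features, length=8, iterations=200, seed=0):
    """Greedy word-by-word search: repeatedly re-propose the word at one position
    and keep the change if it moves the caption closer to the brain-decoded
    features. The published method constrains proposals with a masked language
    model; here, random vocabulary draws stand in for that step."""
    rng = random.Random(seed)
    caption = [rng.choice(VOCAB) for _ in range(length)]
    best = score(caption, brain_features)
    for _ in range(iterations):
        pos = rng.randrange(length)
        candidate = caption.copy()
        candidate[pos] = rng.choice(VOCAB)  # an MLM would rank these proposals
        s = score(candidate, brain_features)
        if s > best:
            caption, best = candidate, s
    return " ".join(caption), best

if __name__ == "__main__":
    # Simulated "brain-decoded" target for the cliff-jump clip (illustrative only).
    target = text_features("a person jumps over deep water fall on mountain ridge".split())
    caption, similarity = optimize_caption(target)
    print(caption, round(similarity, 3))
```

Run repeatedly, a loop like this starts from gibberish and drifts toward a caption whose features match the target, which mirrors the progression the researchers report from nonsense strings to coherent descriptions.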
The researchers used several short videos to test their method, including one that shows a metal sphere with rotating pieces, one of a person jumping off a cliff into water, and one of a crowd of people rejoicing. The team showed the system’s captions for a participant watching each video and demonstrated how those captions evolved from nonsensical strings into accurate descriptions over a series of iterations.
The caption for the video of a person jumping off a cliff into water, for example, started as “15 ha,” evolved into “Above rapid falling water fall,” and eventually became “A person jumps over a deep water fall on a mountain ridge.”
The team’s technique also showed some promise when it came to captioning recalled content, as opposed to content the participant was actively viewing. The researchers pointed out that there could be “thought-based brain-to-text communication” applications for people who have trouble speaking.