February 25, 2024

Groundbreaking AI Technology Decodes Brain Waves into Text, Enabling Communication for Individuals Unable to Speak

Researchers from the GrapheneX-UTS Human-centric Artificial Intelligence Centre at the University of Technology Sydney (UTS) have achieved a significant breakthrough by developing a portable and non-invasive system that can interpret silent thoughts and convert them into text. This groundbreaking technology has the potential to revolutionize communication for individuals who are unable to speak due to conditions like stroke or paralysis. Moreover, it could pave the way for seamless interaction between humans and machines, such as the operation of prosthetic limbs or robots.

The research, led by Professor CT Lin, Director of the GrapheneX-UTS HAI Centre, together with first author Yiqun Duan and Ph.D. candidate Jinzhao Zhou from UTS’s Faculty of Engineering and IT, has been selected as a spotlight paper at the NeurIPS conference. NeurIPS is a leading annual conference that showcases cutting-edge research on artificial intelligence and machine learning, and the selection underlines the significance of the findings.

During the study, participants silently read passages of text while wearing a cap that recorded electrical brain activity through the scalp using an electroencephalogram (EEG). The EEG waves were then processed by DeWave, an AI model developed by the researchers. DeWave translates EEG signals into words and sentences by learning from large quantities of EEG data.

This research represents a pioneering breakthrough as it directly translates raw EEG waves into language, incorporating discrete encoding techniques in the brain-to-text translation process. Professor Lin described this as a significant stride forward in the field of neural decoding. The integration of large language models also presents new possibilities in the realms of neuroscience and AI.

Unlike previous methods that required invasive surgery or bulky and expensive equipment, this system offers a non-invasive and user-friendly solution. Previous approaches involved implanting electrodes in the brain or scanning in an MRI machine, both of which are impractical for daily use. These methods also struggled to segment brain signals into word-level units without aids such as eye-tracking. The technology developed at UTS, by contrast, can be used with or without eye-tracking.

The research was conducted with 29 participants. Because EEG waves vary from person to person, testing on this larger group is likely to make the system more robust and adaptable than previous decoding technology, which was tested on only one or two individuals.

Although EEG signals recorded through a cap are noisier than those from implanted electrodes, the study reported state-of-the-art performance in EEG translation, surpassing previous benchmarks. The model is better at matching verbs than nouns. According to Duan, this is partly because semantically similar words may produce similar brain wave patterns as the brain processes them. Despite these challenges, the model consistently produces meaningful results, aligning keywords and forming similar sentence structures.

Currently, the translation accuracy score is around 40% on BLEU-1, a metric that measures how closely machine-translated text matches high-quality reference translations at the level of individual words. The researchers hope that further improvements will bring the score closer to that of traditional language translation or speech recognition programs, which typically achieve scores around 90%.
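
To give a sense of what a BLEU-1 score means, here is a minimal Python sketch of the metric for a single sentence and a single reference: clipped unigram precision multiplied by a brevity penalty. It is an illustration only, not the evaluation code used in the study; the function name `bleu1`, the whitespace tokenization, and the example sentences are assumptions made for this sketch.

```python
from collections import Counter
import math


def bleu1(candidate: str, reference: str) -> float:
    """Sentence-level BLEU-1 against one reference (illustrative sketch)."""
    # Assumption: simple lowercase whitespace tokenization.
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0

    cand_counts = Counter(cand)
    ref_counts = Counter(ref)

    # Clipped matches: a candidate word is credited at most as many
    # times as it appears in the reference.
    matches = sum(min(count, ref_counts[word])
                  for word, count in cand_counts.items())
    precision = matches / len(cand)

    # Brevity penalty: candidates shorter than the reference are penalized.
    if len(cand) >= len(ref):
        penalty = 1.0
    else:
        penalty = math.exp(1 - len(ref) / len(cand))

    return penalty * precision


# Hypothetical sentences: four of the five decoded words match the reference.
score = bleu1("the man wrote a book", "the author wrote a book")
print(round(score, 2))  # 0.8
```

On this measure, a score of around 40% corresponds roughly to two in five decoded words matching the reference text, before any penalty for short outputs.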

This groundbreaking research builds upon previous advancements in brain-computer interface technology developed by UTS in collaboration with the Australian Defence Force. The earlier innovation used brainwaves to command a quadruped robot, demonstrating the potential for brain-computer interfaces to revolutionize various fields.

The future implications of this AI technology are vast and diverse. It has the potential to greatly enhance communication and quality of life for individuals with conditions that affect speech, opening up new possibilities for human-machine interaction. As further advancements are made, this technology could redefine the boundaries of communication and bridge the gaps between minds.
