At the recent Radiodays Asia conference, Peter Lucas-Jones, the CEO of Te Reo Irirangi o Te Hiku o Te Ika (Te Hiku Media), presented a project that uses Māori language radio broadcasts to improve speech and language technologies. The initiative aims to help revitalize and preserve the Māori language, and it relies on an extensive archive of radio content that spans decades.
Peter Lucas-Jones, a seasoned Māori language broadcaster and digital content creator, is steering the initiative. He underscored the significance of Indigenous data sovereignty and the role of natural language processing (NLP) in the revival of Indigenous languages. His work shows how technology can support cultural preservation and continued language use.
The project has deep roots: it builds on more than 30 years of work to safeguard an extensive archive of Māori language broadcasts. Lucas-Jones emphasized, “We are protectors of data. We have the largest archive in the tribal radio network.” The archive is used to train AI models that aid language preservation and help ensure that the cultural heritage carried in the language is not lost to future generations.
As part of this effort, Te Hiku Media has created a Māori language speech-to-text system with a reported error rate of just 8%. This opens up a range of applications, from transcribing historical audio recordings to enabling speech interfaces for other technologies, and brings language technology closer to everyday use.
Community involvement is at the core of Te Hiku Media’s strategy. In 2014, the organization began streaming videos and formed online consumer groups to broaden its reach. Lucas-Jones pointed out, “More than 80% of tribal members live outside our tribal area, but we find that people want to be connected with the content from their tribal area. People are using devices to consume our content every day everywhere.” This connectivity not only strengthens community ties but also encourages engagement with Māori culture and language among those residing beyond traditional boundaries.
Te Hiku Media’s efforts have not gone unnoticed. The Hawaiian community is now refining its own Indigenous language speech-to-text models, drawing on the methods used by Te Hiku Media. This illustrates how one community’s work on Indigenous language preservation can empower others around the globe.
However, the path to revitalizing the Māori language has not been without challenges. Lucas-Jones recounted the historical oppression faced by the Māori, stating, “When we first started to work with elders, we discovered that the original language was beaten out of them. It had a devastating effect on the language, and tribal radio is helping to revive those languages.” The focus on transcribing early archives is a conscious effort to reclaim and restore language that has been adversely impacted by colonization.
The project has drawn support from several quarters, including Nvidia, which provided the GPUs used to set up the AI environment. This support allows Te Hiku Media to maintain control over its data, keeping it rooted within the community. As Lucas-Jones put it, “We are not teaching other computers to speak Māori… we are keeping it at home.” Such stewardship of data reflects a commitment to protecting Māori language and culture from being commodified by larger tech entities.
Moreover, this initiative has led to the development of synthetic bilingual voices, which further enhance the accessibility and usability of the Māori language. These advancements are essential in making the language available in contemporary contexts, ensuring it remains relevant in a rapidly changing digital landscape.
Lucas-Jones summed up the ethos driving these efforts: “We don’t want our sovereign languages sold back to our grandchildren as data as a service. Our land was already stolen; we don’t want our language stolen by big data companies. Data is land.” Training AI models on Māori language radio broadcasts, under community control, marks a significant step in the ongoing work of cultural preservation and revitalization.
Through all these strides, Peter Lucas-Jones and his dedicated team at Te Hiku Media illuminate the significant impact of community-driven initiatives, showcasing that technology can indeed be a formidable ally in safeguarding cultural heritage for generations to come. Lucas-Jones captures this sentiment beautifully: “Language is the key to culture. It springs from the life and the landscape. It is often imbued with philosophical memory; it is the ideal vehicle to transmit culture.”
In recognition of his contributions to the field, Lucas-Jones has also been named to Time magazine’s list of the 100 most influential people in AI, a testament to his impact and commitment to Indigenous language revitalization through innovative technology.