AI for Language Revitalization

Why It's Important

AI presents a transformative opportunity to support and accelerate First Nations language revitalization efforts, directly contributing to cultural resilience and creating new economic possibilities. AI-powered tools can help create language learning apps, automate the creation of teaching materials, and develop text-to-speech systems, making languages more accessible to a new generation of learners. This work is critical for strengthening cultural identity and well-being. According to the Assembly of First Nations, language is a foundational pillar of culture and nationhood. Investing in these technologies can also support local economic development (LED) by creating local jobs in linguistics, software development, and culturally focused education, while preserving an invaluable cultural asset.

History

Language revitalization has a long history, moving from oral-only teaching to print dictionaries and audio tapes in the 20th century. The digital age brought new tools, with a key Canadian milestone being the launch of FirstVoices in 2003, a platform for communities to digitally record their languages. The recent rise of powerful AI, specifically Large Language Models (LLMs) and speech recognition technology like OpenAI's Whisper model, represents another massive leap. These technologies, developed in the last five years, make it possible to process language data at a scale and speed that was previously unimaginable, opening up entirely new avenues for creating immersive and interactive learning experiences.

Examples

The National Research Council of Canada (NRC) has been working with Indigenous communities for years on language technologies, including developing text-to-speech for Inuktitut and creating predictive text keyboards for several First Nations languages.

Sanyakola Project: This North Island College project supports the resurgence and restoration of Kwakwaka̱’wakw pedagogy and methodology encoded in Kwak’wala, utilizing AI and augmented VR technologies.

UBC Language Revitalization Research: Researchers on Vancouver Island are exploring innovative approaches, including artificial intelligence and immersive technology, to revitalize Indigenous languages.

The First Peoples' Cultural Council in B.C. has supported projects that explore using digital tools for language learning, creating a foundation of data that could be used to train future AI models.

Software and Tools

FirstVoices: While not an AI tool itself, this platform is critical for building the foundational, community-verified datasets of words and phrases that are necessary to train AI models for a specific language.

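Whisper by OpenAI: An open-source AI model for speech-to-text. It can be used to create first-draft transcriptions of recordings of Elders speaking the language, which is often the first step in creating a language dataset. Its accuracy is strongest for the widely spoken languages it was trained on, so output for other languages should be treated as a draft for fluent speakers to correct.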

Google's "Your Language" project: While still in development, this initiative aims to make it possible for communities to add their languages to Google Translate. This requires a significant amount of existing translated text to work effectively.

Mother Tongues: An organization that provides a platform and resources for communities to create their own custom language-learning dictionaries and apps without needing extensive technical skills.

SayIt: An open-source tool developed by the NRC that helps users create pronunciation guides and audio datasets for language learning.
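As a rough illustration of the Whisper workflow described above, the sketch below turns a first-draft transcription into a review sheet that fluent speakers can correct. It assumes the open-source openai-whisper Python package (and ffmpeg) are installed on a community-controlled machine, so no audio leaves it. The helper names, the "base" model size, and the column names are illustrative choices for this sketch, not part of any project listed here.

```python
import csv
from typing import Iterable


def draft_segments(audio_path: str, model_size: str = "base") -> list[dict]:
    """Run Whisper locally and return its timestamped draft segments."""
    import whisper  # imported lazily; only needed when actually transcribing

    model = whisper.load_model(model_size)
    result = model.transcribe(audio_path)
    return result["segments"]


def to_review_rows(segments: Iterable[dict]) -> list[dict]:
    """Convert Whisper-style segments into rows awaiting human review."""
    rows = []
    for seg in segments:
        rows.append({
            "start_s": round(float(seg["start"]), 2),
            "end_s": round(float(seg["end"]), 2),
            "draft_text": seg["text"].strip(),  # machine draft, unverified
            "corrected_text": "",               # filled in by a fluent speaker
            "verified": False,                  # set to True only after review
        })
    return rows


def write_review_sheet(rows: list[dict], csv_path: str) -> None:
    """Write the review rows to a CSV file that opens in any spreadsheet."""
    fieldnames = ["start_s", "end_s", "draft_text", "corrected_text", "verified"]
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

A typical call would chain the three helpers, for example passing the result of draft_segments for a recording through to_review_rows and then to write_review_sheet. Every row starts as unverified on purpose, so nothing enters the language record until a fluent speaker has checked it.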

AI Considerations

The use of AI in language revitalization is an area of immense opportunity and significant risk, requiring careful, community-led governance. AI can be a powerful assistant, but it cannot replace the essential role of fluent speakers and Elders. The primary risk is the loss of data sovereignty. When a community uses a commercial AI tool, it is often sending its language data, a sacred cultural asset, to a corporate server, where it could be stored, reused, or used to train models in ways the community did not consent to. It is crucial to develop clear data governance agreements and, where possible, use open-source, self-hosted AI models to maintain control. Furthermore, AI models can make errors, and if these are not corrected by fluent speakers, they risk introducing inaccuracies into the language record.
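One way to make these governance points concrete is to store consent and verification alongside every recording and to check both before any data is used. The sketch below is a minimal example under stated assumptions: the Recording fields and the use labels such as "model_training" are hypothetical, and a real community would define its own categories in its data governance agreement.

```python
from dataclasses import dataclass, field


@dataclass
class Recording:
    recording_id: str
    transcript: str
    # Uses the speaker and community have agreed to (hypothetical labels).
    permitted_uses: set[str] = field(default_factory=set)
    # True only once a fluent speaker has checked the transcript.
    verified_by_speaker: bool = False


def select_for_use(recordings: list[Recording], use: str) -> list[Recording]:
    """Keep only recordings that are both consented for `use` and verified."""
    return [
        r for r in recordings
        if use in r.permitted_uses and r.verified_by_speaker
    ]


# Example with placeholder data:
archive = [
    Recording("rec-001", "placeholder text", {"model_training", "teaching"}, True),
    Recording("rec-002", "placeholder text", {"teaching"}, True),        # no training consent
    Recording("rec-003", "placeholder text", {"model_training"}, False),  # not yet verified
]
training_set = select_for_use(archive, "model_training")
```

With the placeholder archive above, only rec-001 passes both checks. The filter defaults to exclusion: a recording with no recorded consent or no review is never selected.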

FAQ

Pro Tips

Contribute to language revitalization by learning how AI-powered speech recognition and translation tools work and by collaborating with fluent speakers and language keepers. Participate in ethical audio data collection, respecting consent and cultural protocols, and assist with training and validating models. Use the resulting tools to develop learning apps or subtitles, and pair them with human-led instruction. Your involvement helps preserve language for future generations.
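For the model validation step, a common measure is word error rate (WER): the number of word-level edits needed to turn the model's draft into the fluent speaker's corrected transcript, divided by the length of the corrected transcript. The sketch below is a minimal version that assumes whitespace-separated words. For languages with very long words, a character-level variant may be more informative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER between a speaker-corrected reference and a model hypothesis."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)
```

For example, comparing the reference "a b c d" with the draft "a x c" gives one substitution and one missing word, so the WER is 2 / 4 = 0.5. Tracking this number across review rounds gives a rough sense of whether a model is improving, but it does not replace a fluent speaker's judgment about which errors matter.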

Checklist

External Resources

The National Research Council of Canada (NRC) Indigenous Language Technologies project: The official page for the NRC's work in this area, showcasing projects and research.

First Peoples' Cultural Council: A key B.C.-based organization that provides funding and resources for a wide range of language revitalization initiatives.

Canadian Language Museum: Provides a PDF that introduces the linguistic study of the Indigenous languages spoken in Canada.

The Canadian Encyclopedia's section on Indigenous Languages in Canada: Provides important historical and cultural context on the status of languages across the country.

First Nations Information Governance Centre (FNIGC): The home of OCAP® and a leading voice on Indigenous data sovereignty in Canada.