Abu Dhabi TII Unveils Largest Arabic Language Processing Model

The Technology Innovation Institute (TII), a global research center and the applied research pillar of the Advanced Technology Research Council in Abu Dhabi, has announced the launch of Noor, the world’s largest Arabic natural language processing (NLP) model to date.

TII’s team of advanced researchers and artificial intelligence (AI) specialists within its AI Cross-Center Unit partnered on this initiative with LightOn, a technology company that unlocks large-scale artificial intelligence for businesses, in order to revolutionize Arabic NLP models.

The Noor model performs varied and cross-domain tasks simply from natural language instructions.

To build Noor, TII researchers designed an end-to-end pipeline for high-quality data collection, including large-scale exploration, filtering, and curation. TII specialists have also designed services optimized for large-scale distributed training and serving, to deliver applications with efficient inference and model specialization.

Dr. Ray O. Johnson, CEO of TII and ASPIRE, said: “With this development, we are well on our way to strengthening our research capabilities and AI credentials, as well as elevating the status of Abu Dhabi and the United Arab Emirates as a serious research ecosystem. Our teams of experts have demonstrated once again that this region can achieve breakthrough R&D results that impact the world.”

Dr. Ebtesam Almazrouei, Director, AI Cross-Center Unit, TII, said: “Large language models have taken the world of natural language processing by storm, and we are proud to present this cutting-edge model with 10 billion parameters, the world’s largest Arabic NLP model. The unique and large Arabic dataset collected to train the model is the result of months of work, including curation, disposal and filtering of data from varied sources. Special thanks to the entire team who worked on this project to make NOOR the go-to Arabic exploration model for academics and businesses around the world.”

Speaking about the upcoming launch, Professor Mérouane Debbah, Chief Researcher, Digital Science Research Center and AI Cross-Center Unit, TII, said: “With NOOR, TII has expanded the scope of the Modern Standard Arabic model by leveraging know-how in large language models to build cutting-edge interdisciplinary expertise in this new generation of AI research.”

NOOR’s training dataset is the world’s largest high-quality cross-domain Arabic dataset, combining web data with books, poetry, news articles and technical information to greatly expand the applicability of the model.

Abdulaziz Alshamsi, AI researcher and PhD student, TII, said: “As an Emirati researcher, I am proud to be part of TII. I enjoyed meeting and working with the passionate researchers and trainers of the AI Cross-Center Unit, who bring deep value to the Arabic language and the field of NLP.”

“I acquired advanced technical skills that will pave the way for me as a researcher to discover the world beyond the horizons of NLP. The training workshops allowed us to hone our skills, introduced us to new concepts, and gave us the right tools to bring the Noor project to life,” he added.

Copyright 2022 Al Hilal Publishing and Marketing Group. Provided by SyndiGate Media Inc.