UAE’s Falcon 3 challenges open-source leaders amid surging demand for small AI models
The UAE government-backed Technology Innovation Institute (TII) has announced the launch of Falcon 3, a family of open-source small language models (SLMs) designed to run efficiently on lightweight, single-GPU infrastructures.
Falcon 3 features four model sizes — 1B, 3B, 7B, and 10B — with base and instruct variants, promising to democratize access to advanced AI capabilities for developers, researchers, and businesses. According to the Hugging Face leaderboard, the models are already outperforming or closely matching popular open-source counterparts in their size class, including Meta’s Llama and category leader Qwen-2.5.
The development comes at a time when the demand for SLMs, with fewer parameters and simpler designs than LLMs, is rapidly growing due to their efficiency, affordability, and ability to be deployed on devices with limited resources. They are suitable for a range of applications across industries, like customer service, healthcare, mobile apps and IoT, where typical LLMs might be too computationally expensive to run effectively. According to Valuates Reports, the market for these models is expected to grow, with a CAGR of nearly 18% over the next five years.
What does Falcon 3 bring to the table?
Trained on 14 trillion tokens — more than double its predecessor Falcon 2 — the Falcon 3 family employs a decoder-only architecture with grouped query attention to share parameters and minimize memory usage for key-value (KV) cache during inference. This enables faster and more efficient operations when handling diverse text-based tasks.
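To illustrate why grouped query attention (GQA) matters for single-GPU deployment, the sketch below compares KV-cache memory under standard multi-head attention (one KV head per query head) against GQA (several query heads sharing each KV head). The layer, head, and dimension counts are hypothetical placeholders for illustration, not Falcon 3's actual configuration; only the 32K context length comes from the article.

```python
# Hypothetical transformer dimensions for illustration only
# (these are NOT Falcon 3's published hyperparameters).
n_layers = 28
n_q_heads = 32
head_dim = 128
seq_len = 32_000       # 32K context window, per the article
bytes_per_val = 2      # fp16 storage

def kv_cache_bytes(n_kv_heads: int) -> int:
    """KV cache holds one key and one value vector per layer,
    per KV head, per token position."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_val

# Multi-head attention: every query head has its own KV head.
mha = kv_cache_bytes(n_q_heads)
# Grouped query attention: 32 query heads share 8 KV heads (4:1 grouping).
gqa = kv_cache_bytes(8)

print(f"MHA KV cache: {mha / 1e9:.1f} GB")
print(f"GQA KV cache: {gqa / 1e9:.1f} GB ({mha // gqa}x smaller)")
```

Under these assumed dimensions, sharing KV heads 4-to-1 shrinks the cache fourfold, which is the kind of saving that lets long-context inference fit on a single GPU.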
At the core, the models support four primary languages — English, French, Spanish, and Portuguese — and come equipped with a 32K context window, allowing them to process long inputs such as lengthy documents.
“Falcon 3 is versatile, designed for both general-purpose and specialized tasks, providing immense flexibility to users. Its base model is perfect for generative applications, while the instruct variant excels in conversational tasks like customer service or virtual assistants,” TII notes on its website.
According to the leaderboard on Hugging Face, while all four Falcon 3 models perform fairly well, the 10B and 7B versions are the stars of the show, achieving state-of-the-art results on reasoning, language understanding, instruction following, code and mathematics tasks.
Among models under the 13B-parameter size class, Falcon 3’s 10B and 7B versions outperform competitors, including Google’s Gemma 2-9B, Meta’s Llama 3.1-8B, Mistral-7B, and Yi 1.5-9B. They even surpass Alibaba’s category leader Qwen 2.5-7B in most benchmarks — such as MUSR, MATH, GPQA, and IFEval — except for MMLU, which tests how well language models understand and process human language.
Deployment across industries
With the Falcon 3 models now available on Hugging Face, TII aims to serve a broad range of users, enabling cost-effective AI deployments without computational bottlenecks. With their ability to handle specific, domain-focused tasks with fast processing times, the models can power various applications at the edge and in privacy-sensitive environments, including customer service chatbots, personalized recommender systems, data analysis, fraud detection, healthcare diagnostics, supply chain optimization and education.
The institute also plans to expand the Falcon family further by introducing models with multimodal capabilities. These models are expected to launch sometime in January 2025.
Notably, all models have been released under the TII Falcon License 2.0, a permissive Apache 2.0-based license with an acceptable use policy that encourages responsible AI development and deployment. To help users get started, TII has also launched a Falcon Playground, a testing environment where researchers and developers can try out Falcon 3 models before integrating them into their applications.