Thursday, October 23, 2025

Discover TII Falcon-H1 Models Now on Amazon Bedrock Marketplace and SageMaker JumpStart!

Exploring the Falcon-H1 Models: TII and AWS Join Forces for AI Innovation

This post was co-authored with Jingwei Zuo from TII.

We’re thrilled to announce that the Technology Innovation Institute (TII) has made its Falcon-H1 models available on both the Amazon Bedrock Marketplace and Amazon SageMaker JumpStart. Developers and data scientists can now use six instruction-tuned Falcon-H1 models—ranging from 0.5B to 34B parameters—on AWS. These models combine traditional attention mechanisms with State Space Models (SSMs) to deliver strong performance and efficiency.

In this post, we take a closer look at the capabilities of the Falcon-H1 models and provide a step-by-step guide to getting started with TII’s Falcon-H1 models on both Amazon platforms.

Overview of TII and AWS Collaboration

The Technology Innovation Institute (TII) is a prominent research institute situated in Abu Dhabi. As part of the UAE’s Advanced Technology Research Council (ATRC), TII pursues advanced technology research and development across fields like AI, quantum computing, autonomous robotics, and cryptography. With a diverse international team of scientists, researchers, and engineers, TII is committed to driving innovation and establishing Abu Dhabi as a global tech research hub in line with the UAE National Strategy for Artificial Intelligence 2031.

In collaboration with Amazon Web Services (AWS), TII aims to extend its innovative UAE-made AI models to a global audience. This powerful combination of TII’s expertise in large language models (LLMs) and AWS’s robust cloud-based AI and machine learning services enables professionals worldwide to build and scale generative AI applications with greater ease.

About Falcon-H1 Models

The Falcon-H1 architecture features a distinctive parallel hybrid design that integrates elements from Mamba and Transformer architectures. This design harnesses the speed and low-memory footprint of SSMs, while also leveraging the contextual understanding and enhanced generalization of Transformers’ attention mechanisms. The Falcon-H1 models scale fluidly from 0.5 to 34 billion parameters and support 18 languages natively, with the potential to grow to over 100 languages thanks to a multilingual tokenizer trained on diverse datasets.
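To build intuition for the parallel hybrid design described above, here is a minimal, purely illustrative sketch in NumPy: an attention branch and an SSM-like linear-recurrence branch process the same input side by side, and their outputs are combined. All dimensions, weights, and the combination rule are toy assumptions for illustration—this is not Falcon-H1’s actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_branch(x):
    # Simplified self-attention: queries, keys, and values are the input itself.
    scores = softmax(x @ x.T / np.sqrt(x.shape[-1]))
    return scores @ x

def ssm_branch(x, a=0.9):
    # Simplified SSM: a linear recurrence h_t = a * h_{t-1} + x_t,
    # scanned over the sequence dimension with O(1) memory per step.
    h = np.zeros_like(x)
    state = np.zeros(x.shape[-1])
    for t in range(x.shape[0]):
        state = a * state + x[t]
        h[t] = state
    return h

def parallel_hybrid_block(x):
    # Both branches see the same input and their outputs are summed,
    # mirroring the parallel (rather than sequential) hybrid design.
    return attention_branch(x) + ssm_branch(x)

x = np.random.default_rng(0).standard_normal((4, 8))  # (seq_len, d_model)
y = parallel_hybrid_block(x)
print(y.shape)  # (4, 8)
```

The attention branch gives every position a view of the whole sequence, while the recurrent branch carries a compact running state—combining them is what lets the hybrid trade off context quality against speed and memory.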

Key Benefits of Falcon-H1 Models

  1. Performance: The hybrid attention-SSM design allows the ratio of attention to SSM heads to be tuned, resulting in faster inference and reduced memory usage while maintaining strong generalization capabilities. In benchmarks, Falcon-H1 has shown strong performance against other leading Transformer models.

  2. Model Variety: The Falcon-H1 series comprises six distinct sizes—0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B—with both base and instruction-tuned variants available on Amazon platforms.

  3. Multilingual Capabilities: The models support 18 languages natively and can scale to over 100 languages, catering to a diverse user base globally.

  4. Long Context Length: The Falcon-H1 series offers the capability for processing documents with lengths of up to 256,000 tokens, making it ideal for long-form content, multi-turn dialogue, and complex reasoning tasks.

  5. Robust Training Strategy: Falcon-H1 models use an innovative training approach that introduces complex data early in training, coupled with strategic data reuse to enhance memorization, allowing for smoother scaling across different model sizes.

  6. Balanced Performance: The models are designed to achieve a balance between general and domain-specific capabilities, minimizing unintended biases.

The models are released under the Falcon LLM license, promoting accessibility through open-source principles and offering substantial cost-effectiveness compared to proprietary options.

Introducing Amazon Bedrock Marketplace and SageMaker JumpStart

Amazon Bedrock Marketplace allows users to access more than 100 specialized and domain-specific models tailored to various use cases, streamlining the deployment process. Users can choose instance types that align with their workload demands while optimizing costs through unified and secure Amazon Bedrock APIs.

On the other hand, SageMaker JumpStart provides an easy gateway for machine learning novices and experts alike to get started with state-of-the-art model architectures. It enables quick model deployment in secure environments while also allowing further customization and fine-tuning.

Get Started: Deploying Falcon-H1 Models

This section will guide you through deploying a Falcon-H1 model—specifically the Falcon-H1-0.5B-Instruct model—using both Amazon Bedrock Marketplace and SageMaker JumpStart.

Deploying Falcon-H1-0.5B-Instruct via Amazon Bedrock Marketplace

Prerequisites

Before you can deploy the Falcon-H1-0.5B-Instruct model, ensure you have:

  • An AWS account.
  • Verified quota allocation for ml.g6.xlarge instances. If you have a default quota of 0, you will need to request an increase.

To request a quota increase, open the AWS Service Quotas console, locate ml.g6.xlarge for endpoint usage, and specify your required limit.

Steps to Deploy the Model

  1. Visit the Amazon Bedrock Console: Navigate to the Model catalog under the “Discover” section.
  2. Filter and Select the Model: Choose Falcon-H1-0.5B-Instruct from the catalog.
  3. Review Model License: Once you agree to the licensing terms, choose Deploy.
  4. Configure Deployment:
    • Enter an Endpoint name.
    • Set the Number of instances to 1 for cost efficiency.
    • Choose an Instance type such as ml.g6.xlarge (the instance type for which you verified quota earlier).
  5. Finalize Deployment: Choose Deploy.

As the deployment progresses, you can monitor its status under Managed deployments. The endpoint is ready once its status changes to In Service.

Interacting with the Model

Once the model is deployed, you can interact with it directly through the Amazon Bedrock playground or by invoking it programmatically using the Amazon Bedrock APIs, allowing for greater flexibility in how you harness the model’s capabilities.
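As a rough sketch of the programmatic path, the snippet below builds a chat-style request and passes a Marketplace deployment’s endpoint ARN as the `modelId` to the Amazon Bedrock Runtime `invoke_model` API. The endpoint ARN is a placeholder, and the exact request schema depends on the model’s serving container, so treat both as assumptions to verify against your deployment.

```python
import json

def build_payload(prompt, max_tokens=256, temperature=0.1):
    # Chat-style request body; the exact schema depends on the model's
    # serving container, so treat this shape as an assumption.
    return {
        "messages": [{"role": "user", "content": prompt}],
        "parameters": {"max_tokens": max_tokens, "temperature": temperature},
    }

def invoke_marketplace_endpoint(endpoint_arn, prompt):
    # boto3 is imported here so the payload helper stays usable offline.
    import boto3

    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(
        modelId=endpoint_arn,  # the ARN of your Marketplace deployment
        contentType="application/json",
        body=json.dumps(build_payload(prompt)),
    )
    return json.loads(response["body"].read())

payload = build_payload("What is generative AI?")
print(payload["messages"][0]["content"])  # What is generative AI?
```

You can find the endpoint ARN on the deployment’s details page under Managed deployments in the Amazon Bedrock console.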

Deploying Falcon-H1-0.5B-Instruct with SageMaker JumpStart

To deploy the model through SageMaker, you will need the following prerequisites:

  • An AWS account.
  • An IAM role that grants access to SageMaker AI.
  • Access to SageMaker Studio or an IDE like VS Code.

Steps for Programmatic Deployment

  1. Install the SageMaker Python SDK and configure your AWS credentials.
  2. Use the following code snippet to launch the model:

```python
import sagemaker
from sagemaker.jumpstart.model import JumpStartModel

# Initialize a SageMaker session
session = sagemaker.Session()
role = sagemaker.get_execution_role()

# Define the JumpStart model ID
model_id = "huggingface-llm-falcon-h1-0-5b-instruct"

# Create and deploy the model (accepting the EULA is required)
model = JumpStartModel(model_id=model_id, role=role)
predictor = model.deploy(initial_instance_count=1, accept_eula=True)

print("Endpoint name:", predictor.endpoint_name)
```

After successfully deploying the model, note the endpoint name for future inference calls.

  3. Perform inference using the SageMaker Runtime API:

```python
import json

import boto3

sagemaker_runtime = boto3.client("sagemaker-runtime")
endpoint_name = "{ENDPOINT_NAME}"  # Replace with your actual endpoint name

payload = {
    "messages": [{"role": "user", "content": "What is generative AI?"}],
    "parameters": {"max_tokens": 256, "temperature": 0.1, "top_p": 0.1},
}

# Call the endpoint for inference
response = sagemaker_runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType="application/json",
    Body=json.dumps(payload),
)

# Process the output
result = json.loads(response["Body"].read().decode("utf-8"))
print("Generated Response:", result["choices"][0]["message"]["content"].strip())
```

Clean-Up After Experimentation

To avoid ongoing AWS charges:

  1. Delete your Amazon Bedrock Marketplace resources: in the Amazon Bedrock console, under Managed deployments, select and delete the deployed endpoint.
  2. Remove SageMaker endpoints, endpoint configurations, and models in the corresponding sections of the SageMaker AI console.

Always verify that all endpoints are deleted after experimentation to avoid unnecessary costs.


Feel free to explore the Falcon-H1 models on Amazon Bedrock Marketplace or SageMaker JumpStart. Leverage these powerful models in your AI projects, and stay tuned to the AWS Machine Learning Blog and other resources for the latest insights and updates.
