How to Implement Function Calling for the Tiny Llama 3.2 1B Model


A Quick Tutorial for Using Instructor to Generate Function Calls with Llama 3.2

Yeyu Huang
Oct 06, 2024

While the medium-sized Llama 3.2 models (11B and 90B) excel at complex vision understanding, their smaller siblings (1B and 3B) are expected to shine in a different arena: on-device applications. These lightweight models are well suited to portable devices, offering impressive performance on tasks like summarization and instruction following. However, to maximize their potential as edge AI, we must go beyond simple chat use cases. This article explores how to implement function calling with the tiny Llama 3.2 models, enabling them to control actions and interact with external systems or appliances directly.

Function Calling with Llama 3.2

If you have built LLM apps or agents, you have most likely implemented function calling, as it is one of the two most practical approaches (the other being RAG) to bring models to life. By defining specific functions and their corresponding schemas, we can instruct the model to generate structured JSON outputs that directly trigger actions. Potential applications of function calling include database queries, API interactions, device control (as we'll see in our smart home example), and mathematical calculations.
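
As a quick illustration, for a question like "What was Acme's revenue in 2023?", the structured output we want from the model might look like the JSON below (the field names here are hypothetical, just to show the shape):

{
  "function_name": "get_revenue",
  "arguments": {"financial_year": 2023, "company": "Acme"}
}

The application then parses this object and invokes the matching function with the supplied arguments.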

OpenAI popularized the concept of function calling with its GPT models. Developers define functions, and the model, given appropriate prompting, generates JSON objects containing the arguments to call these functions. Initially, effective function calling often relied on models specifically fine-tuned for this purpose. These specialized models could reason about function formats and generate the correct JSON structure based on provided function definitions. However, with the advances in general reasoning capabilities of models like Llama 3.2, and with tools like the Instructor library, we can now implement basic function calling effectively even with non-fine-tuned models.

The Instructor Library

With the Instructor library, we can extend this functionality to more base models, including the tiniest Llama 3.2 variant, the 1B model. Instructor leverages Pydantic, a powerful Python library for data validation and parsing, to define, validate, and document the expected schema for function arguments. This ensures the model's output is structured correctly and can be used directly to call the corresponding functions.

Furthermore, Instructor simplifies the integration process by providing a familiar wrapper around existing model APIs. For instance, using instructor.from_openai() allows you to use the standard OpenAI API calls while seamlessly incorporating Instructor’s structured output features. This reduces the learning curve and allows developers to quickly add structured output capabilities to their existing projects.
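
Here is a minimal sketch of that wrapper pattern (the model name and response schema below are placeholder examples, not part of this tutorial):

import instructor
from openai import OpenAI
from pydantic import BaseModel

# Wrap the standard OpenAI client with Instructor.
client = instructor.from_openai(OpenAI())

class UserInfo(BaseModel):
    name: str
    age: int

# The extra response_model argument tells Instructor to validate the
# completion against the UserInfo schema and return a parsed object.
user = client.chat.completions.create(
    model="gpt-4o-mini",
    response_model=UserInfo,
    messages=[{"role": "user", "content": "John is 25 years old."}],
)
print(user.name, user.age)  # John 25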

Code Walkthrough

Let's dive into the code examples, which demonstrate how to implement function calling with the Llama 3.2 1B model and Instructor. We'll start with a simplified financial example and then move on to a more practical smart home control scenario.

Due to its size, Llama 3.2 1B inference consumes only 2–3 GB of GPU/CPU memory, so you can easily run it on a local edge device or rent an entry-level cloud instance. I will write another tutorial about local inference and fine-tuning for Llama 3.2. To keep this demo tidy, I will use a free inference API from OpenRouter, which acts as an online inference proxy for a variety of models.

Code Example 1: Simulating a Finance Function

This example simulates retrieving financial data using function calling.

First, install the Python packages, including OpenAI, which will be used to call OpenRouter's OpenAI-compatible API.

pip install openai pydantic instructor python-dotenv

In the code, import Instructor, Pydantic components, OpenAI, and load your OpenRouter API Key.

import instructor
from typing import Literal
from pydantic import BaseModel, Field
from openai import OpenAI
import os
from dotenv import load_dotenv

load_dotenv()

api_key = os.getenv("OPENROUTER_API_KEY")
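
With the key loaded, we can create the client. Below is a minimal sketch, assuming OpenRouter's OpenAI-compatible endpoint and Instructor's JSON mode (a reasonable choice for models without native tool-calling support):

# Point the OpenAI client at OpenRouter's OpenAI-compatible endpoint
# and wrap it with Instructor. Mode.JSON extracts structured output via
# JSON prompting rather than native tool calls.
client = instructor.from_openai(
    OpenAI(base_url="https://openrouter.ai/api/v1", api_key=api_key),
    mode=instructor.Mode.JSON,
)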

Here, we define a dummy get_revenue() function. This function simulates fetching revenue data: it takes a financial year and a company name as input and returns a placeholder string about the revenue. In a real-world application, this function would query a database or an API. Below is the function definition:

def get_revenue(financial_year: int, company: str) -> str:
    """
    Get revenue data for a company given the year.
    """
    return f"Revenue for {company} in {financial_year}: $1,000,000,000"

Then, we will define the Pydantic models. FunctionCall specifies the function's name (get_revenue) and its arguments, while FunctionArguments defines the structure of those arguments: financial_year and company. FunctionCall.model_rebuild() is called to ensure the nested model is correctly built.
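
Below is a minimal sketch of these two models, consistent with that description (the field description strings are assumptions):

# Sketch of the Pydantic models described above; the description strings
# are illustrative assumptions.
class FunctionArguments(BaseModel):
    financial_year: int = Field(..., description="The financial year to query")
    company: str = Field(..., description="The company name")

class FunctionCall(BaseModel):
    function_name: Literal["get_revenue"] = Field(
        ..., description="The name of the function to call"
    )
    arguments: FunctionArguments

# Rebuild to make sure the nested model references are resolved.
FunctionCall.model_rebuild()

From here, the typical pattern is to pass response_model=FunctionCall to the Instructor-wrapped client and dispatch the validated result to get_revenue().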
