Week 10 Lab - Structured Outputs (JSON Mode) & Tool Calling with LiteLLM (2h)

Course slot: Lab (2 hours) after the Week 10 lecture

Important note: Formatting the model's output as JSON is not function calling. Use JSON Mode / Structured Outputs via response_format when you only need JSON output; reserve tool/function calling for when the model should choose and call your code or APIs.

What you'll learn today

- Use JSON Mode / Structured Outputs to get valid JSON or schema-validated JSON from LLMs (no tools involved)

- Implement tool calling and log intermediate proposals (the model's tool name + arguments) before execution

- Validate, handle errors, and manage conversation state

Prereqs

- Python ≥ 3.9

- Free key (pick ONE): Groq or Gemini (or your provider)

Install & env

pip install litellm python-dotenv

Create .env:

GROQ_API_KEY=your_key_here
MODEL=groq/llama-3.3-70b-versatile   # pick any supported chat model

config.py:

import os
from dotenv import load_dotenv
load_dotenv()
MODEL = os.getenv("MODEL")
assert MODEL, "Set MODEL in .env (e.g., groq/llama-3.3-70b-versatile)"

Smoke test (new) - smoke_test.py

Why we run this:

- Check your API key and model are loaded.

- Check features (tool calling, JSON schema) quickly.

- Make one tiny request to be sure everything works.

You should see: model name, two True/False lines, and a short reply.

from litellm import completion, supports_function_calling, supports_response_schema
from config import MODEL
print("MODEL:", MODEL)
print("supports function calling:", supports_function_calling(model=MODEL))
print("supports response schema:", supports_response_schema(model=MODEL))

# minimal request
r = completion(model=MODEL, messages=[{"role":"user","content":"Say hi in 3 words"}], max_tokens=20)
print("OK:", r.choices[0].message["content"])

Run:

python smoke_test.py

Part A - Structured Outputs (JSON Mode) (45m)

Goal: Get JSON output from the model. No tools. Two ways: (1) JSON object, (2) JSON schema.

A1) Plain JSON object (fastest) - json_mode_object.py

Ask for a JSON object only, then parse it with json.loads.

from litellm import completion
from config import MODEL
import json

messages = [
    {"role": "system", "content": "You ONLY reply with a single JSON object."},
    {"role": "user", "content": "Extract: Sarah Johnson, 28, sj@example.com; likes smartphones and tablets."}
]

resp = completion(
    model=MODEL,
    messages=messages,
    response_format={"type": "json_object"},  # JSON Mode
    max_tokens=200,
)
content = resp.choices[0].message["content"]
print("RAW JSON string:\n", content)
print("\nParsed dict:\n", json.dumps(json.loads(content), indent=2))

Checkpoint A: Output is valid JSON you can json.loads.
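Even in JSON Mode, some providers occasionally wrap the object in markdown fences. A small defensive parser (our own helper, not part of LiteLLM) keeps Checkpoint A from failing on cosmetic wrappers:

```python
import json

def parse_json_reply(text: str) -> dict:
    """Best-effort parse: strip an optional ```json fence before json.loads."""
    cleaned = text.strip()
    if cleaned.startswith("```"):
        # drop the opening fence line and the trailing fence
        cleaned = cleaned.split("\n", 1)[1].rsplit("```", 1)[0]
    return json.loads(cleaned)

print(parse_json_reply('```json\n{"name": "Sarah Johnson"}\n```'))  # -> {'name': 'Sarah Johnson'}
```

If parsing still fails after stripping fences, treat it as a real model error and re-prompt rather than patching the string further.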

A2) Strict JSON Schema (validated fields) - json_mode_schema.py

Use a schema to control the shape. Set strict: true for safer output.

from litellm import completion
from config import MODEL
import json

schema = {
  "name": "UserInfo",
  "schema": {
    "type": "object",
    "properties": {
      "name": {"type": "string"},
      "email": {"type": "string"},
      "age": {"type": "integer"},
      "preferences": {"type": "array", "items": {"type": "string"}}
    },
    "required": ["name", "email"],
    "additionalProperties": False
  },
  "strict": True
}

messages = [
  {"role": "system", "content": "Return ONLY a JSON object matching the schema."},
  {"role": "user", "content": "Extract: Sarah Johnson, 28, sj@example.com; likes smartphones and tablets."}
]

resp = completion(
  model=MODEL,
  messages=messages,
  response_format={"type": "json_schema", "json_schema": schema},
)
content = resp.choices[0].message["content"]
print("RAW JSON:\n", content)
print("\nParsed:\n", json.dumps(json.loads(content), indent=2))

Checkpoint B: The model adheres to your schema when strict: true is used.
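Not every provider enforces strict schemas server-side, so it is worth re-checking the parsed dict locally. A minimal hand-rolled check for the UserInfo shape above (a sketch; a dedicated validator such as the jsonschema package would be more thorough):

```python
def check_user_info(d: dict) -> list:
    """Return a list of problems; an empty list means the dict matches expectations."""
    problems = []
    for key in ("name", "email"):  # required fields in the schema above
        if key not in d:
            problems.append(f"missing required field: {key}")
    if "age" in d and not isinstance(d["age"], int):
        problems.append("age must be an integer")
    if "preferences" in d and not all(isinstance(p, str) for p in d["preferences"]):
        problems.append("preferences must be an array of strings")
    return problems

print(check_user_info({"name": "Sarah Johnson", "email": "sj@example.com", "age": 28}))  # -> []
```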

When to use what

- JSON Mode: You want the LLM's answer formatted as JSON (extraction, classification, summaries). No external functions run.

- Tool/Function Calling: You want the LLM to decide and call code/APIs (e.g., math, DB, weather). The model proposes {name, arguments}; you inspect first, then execute.

Part B - Tool Calling with Intermediate Display (55m)

Goal: Let the model pick a tool. Print the tool name + JSON arguments first, then run the tool and answer.

B1) Calculator tools - tc_calc.py

Two tools: add and area_circle. Print the proposal before running.

import json
import math
from litellm import completion
from config import MODEL


calculator_tools = [
    {
        "name": "add",
        "description": "Add two numbers",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"type": "number"},
                "b": {"type": "number"}
            },
            "required": ["a", "b"]
        }
    },
    {
        "name": "area_circle",
        "description": "Area of a circle",
        "parameters": {
            "type": "object",
            "properties": {
                "radius": {"type": "number"}
            },
            "required": ["radius"]
        }
    }
]


TOOL_IMPL = {
    "add": lambda a, b: float(a)+float(b),
    "area_circle": lambda radius: math.pi*float(radius)**2,
}


messages = [{"role": "user",
             "content": "What is (25 + 17), then use the result as radius to compute circle area."}]

print("first LLM request:", messages)
# Let the model propose a tool
first = completion(model=MODEL, messages=messages,
                   functions=calculator_tools, function_call="auto")
msg = first.choices[0].message
fc = getattr(msg, "function_call", None)
print("=== INTERMEDIATE (1) ===")
print("name:", getattr(fc, "name", None) if fc else None)
raw_args = getattr(fc, "arguments", None) if fc else None
print("arguments (raw):", raw_args)


if fc:
    name = fc.name
    args = json.loads(raw_args or "{}")
    result = TOOL_IMPL[name](**args)
    # return tool result
    messages.append({"role": "assistant", "content": None,
                    "function_call": {"name": name, "arguments": raw_args}})
    messages.append({"role": "function", "name": name,
                    "content": json.dumps({"result": result})})

    print("second LLM request:", messages)
    second = completion(model=MODEL, messages=messages,
                        functions=calculator_tools, function_call="auto")
    msg2 = second.choices[0].message
    fc2 = getattr(msg2, "function_call", None)
    print("=== INTERMEDIATE (2) ===")
    print("name:", getattr(fc2, "name", None) if fc2 else None)
    print("arguments:", getattr(fc2, "arguments", None) if fc2 else None)
    if fc2:
        name2 = fc2.name
        args2 = json.loads(fc2.arguments)
        result2 = TOOL_IMPL[name2](**args2)
        messages.append({"role": "assistant", "content": None, "function_call": {
                        "name": name2, "arguments": fc2.arguments}})
        messages.append({"role": "function", "name": name2,
                        "content": json.dumps({"result": result2})})
        print("Final LLM request: ", messages)
        final = completion(model=MODEL, messages=messages)
        print("FINAL:", final.choices[0].message["content"])
    else:
        print("FINAL:", getattr(msg2, "content", None) or msg2["content"])
else:
    print("No tool proposal; assistant said:", getattr(msg, "content", None) or msg["content"])

B2) Weather tool (simulated) - tc_weather.py

The model should pass city and unit. Check the arguments in the INTERMEDIATE print.

# Part B — Example 2: Weather tool (simulated) with intermediate view
import json
from litellm import completion
from config import MODEL

def get_weather(city: str, unit: str = "celsius"):
    temp_c = 28  # pretend API
    return {"city": city, "unit": unit, "temperature": temp_c if unit == "celsius" else round(temp_c * 9/5 + 32)}

weather_tool = [{
    "name": "get_weather",
    "description": "Get current weather (simulated)",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"], "default": "celsius"}
        },
        "required": ["city"]
    }
}]

messages = [{"role": "user", "content": "What’s the weather in Chiang Mai in fahrenheit?"}]
first = completion(model=MODEL, messages=messages, functions=weather_tool, function_call="auto")
msg = first.choices[0].message
fc = getattr(msg, "function_call", None)
print("=== INTERMEDIATE (weather) ===")
print("function name:", getattr(fc, "name", None) if fc else None)
print("arguments:", getattr(fc, "arguments", None) if fc else None)
 
if fc:
    args = json.loads(fc.arguments or "{}")
    tool_result = get_weather(**args)
    messages.append({"role": "assistant", "content": None, "function_call": {"name": fc.name, "arguments": fc.arguments}})
    messages.append({"role": "function", "name": "get_weather", "content": json.dumps(tool_result)})
    final = completion(model=MODEL, messages=messages)
    print("\nFINAL:", final.choices[0].message["content"])
else:
    print("No tool call proposed.")
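These labs use the legacy functions / function_call parameters. LiteLLM also accepts the newer OpenAI-style tools / tool_choice form, where each schema is wrapped under {"type": "function", ...} and proposals arrive as message.tool_calls (a list, so multiple calls are possible). The sketch below shows the reshaped schema and an extraction helper; the SimpleNamespace object stands in for a real response message so it runs offline:

```python
import json
from types import SimpleNamespace

# Newer-style definition: pass as tools=[...] with tool_choice="auto"
# instead of functions=[...] with function_call="auto"
weather_tool_v2 = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather (simulated)",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def extract_tool_calls(message):
    """Pull (name, parsed_args) pairs from message.tool_calls, if any."""
    calls = getattr(message, "tool_calls", None) or []
    return [(c.function.name, json.loads(c.function.arguments)) for c in calls]

# Stand-in for resp.choices[0].message from completion(..., tools=weather_tool_v2)
fake_msg = SimpleNamespace(tool_calls=[SimpleNamespace(
    function=SimpleNamespace(name="get_weather", arguments='{"city": "Chiang Mai"}'))])
print(extract_tool_calls(fake_msg))  # -> [('get_weather', {'city': 'Chiang Mai'})]
```

Either form works for this lab; just be consistent within one script.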

Part C - Complete Tool Calling (Provided Implementation) (20m)

What this file does:

- Full multi-turn loop with tools.

- Add a print() to show the INTERMEDIATE proposal (name + args) before running.

- Edit tools and reuse it for Homework10.

# tc_complete.py - provided implementation
# Imports and dependencies
from litellm import completion
import json
import math
from typing import List, Dict, Any
from config import MODEL

# Tool implementations - Calculator functions with schema definitions
class CalculatorTools:
    def add(self, a: float, b: float) -> float:
        return float(a) + float(b)
    
    def area_circle(self, radius: float) -> float:
        return math.pi * float(radius) ** 2
    
    @classmethod
    def get_schemas(cls):
        """Return the function schemas for all tools in this class"""
        return [
            {
                "name": "add",
                "description": "Add two numbers",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "a": {"type": "number"},
                        "b": {"type": "number"}
                    },
                    "required": ["a", "b"]
                }
            },
            {
                "name": "area_circle",
                "description": "Area of a circle",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "radius": {"type": "number"}
                    },
                    "required": ["radius"]
                }
            }
        ]
    
    def register_all_tools(self, executor):
        """Register all tools in this class with the executor"""
        schemas = self.get_schemas()
        executor.register_tool("add", self.add, schemas[0])
        executor.register_tool("area_circle", self.area_circle, schemas[1])

class ToolExecutor:
    """Main executor class that manages tools and handles LLM function calling"""
    def __init__(self):
        self.tools = {}
        self.tool_schemas = []
    
    def register_tool(self, name: str, func: callable, schema: dict):
        """Register a single tool with its execution function and schema"""
        self.tools[name] = func
        self.tool_schemas.append(schema)
    
    def register_tools(self, tool_class_instance):
        """Register all tools from a tool class instance"""
        schemas = tool_class_instance.get_schemas()
        for i, schema in enumerate(schemas):
            tool_name = schema["name"]
            tool_func = getattr(tool_class_instance, tool_name)
            self.register_tool(tool_name, tool_func, schema)
    
    def execute_with_tools(self, user_message: str, model: str = MODEL) -> str:
        """Execute user request with available tools - main conversation loop"""
        messages = [{"role": "user", "content": user_message}]
        
        while True:
            # Step 1: Get LLM response with tool options
            response = completion(
                model=model,
                messages=messages,
                functions=self.tool_schemas,
                function_call="auto"
            )
            
            message = response.choices[0].message
            
            # Step 2: Check if LLM wants to call a function
            if not getattr(message, "function_call", None):
                return getattr(message, "content", None) or message.get("content")
            
            # Step 3: Execute the requested tool
            tool_name = message.function_call.name
            tool_args = json.loads(message.function_call.arguments)
            
            if tool_name in self.tools:
                try:
                    tool_result = self.tools[tool_name](**tool_args)
                    
                    # Step 4: Add tool call and result to conversation history
                    messages.append({
                        "role": "assistant",
                        "content": None,
                        "function_call": {
                            "name": tool_name,
                            "arguments": message.function_call.arguments
                        }
                    })
                    messages.append({
                        "role": "function",
                        "name": tool_name,
                        "content": str(tool_result)
                    })
                    
                except Exception as e:
                    # Handle tool execution errors
                    messages.append({
                        "role": "function", 
                        "name": tool_name,
                        "content": f"Error: {e}"
                    })
            else:
                return f"Tool {tool_name} not available"

# Demo usage and testing
if __name__ == "__main__":
    executor = ToolExecutor()

    # Register calculator tools using the new register_tools method
    calc = CalculatorTools()
    executor.register_tools(calc)

    # Execute user requests and display results
    result1 = executor.execute_with_tools("What's 15 plus 27?")
    result2 = executor.execute_with_tools("Calculate the area of a circle with radius 5")

    print(result1)  # The result is 42
    print(result2)  # The area is approximately 78.54 square units

> Note: This provided implementation does not print the intermediate proposal by default. For teaching purposes, you may add a print() right after reading message.function_call to display name and raw arguments before execution.
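One way to do that is a small logging helper (our own addition, not part of the provided file) dropped into execute_with_tools right after Step 2 via print(format_proposal(message.function_call)):

```python
from types import SimpleNamespace

def format_proposal(function_call) -> str:
    """Render a tool proposal (name + raw JSON arguments) for logging before execution."""
    return ("=== INTERMEDIATE ===\n"
            f"name: {function_call.name}\n"
            f"arguments (raw): {function_call.arguments}")

# Demo with a stand-in object shaped like message.function_call
demo = SimpleNamespace(name="add", arguments='{"a": 15, "b": 27}')
print(format_proposal(demo))
```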

Deliverables (Lab)

- Create a GitHub repo `lab10` including:
  - smoke_test.py
  - json_mode_object.py
  - json_mode_schema.py
  - tc_calc.py
  - tc_weather.py
  - tc_complete.py
  - README.md and .env.example (no real keys)

- Add collaborators: kitt-cmu and meetip-checker to the repo.

Homework

1) Schema-validated extractor - nested structure

Goal: Design your own JSON Schema and use JSON Mode to extract a nested structure.

Scenario: Given a short paragraph containing an order with a customer and multiple items, output a strict JSON object.

Example expected JSON

{
  "order_id": "A-1029",
  "customer": { "name": "Sarah Johnson", "email": "sj@example.com" },
  "items": [
    { "sku": "WB-500", "name": "Water Bottle", "qty": 2, "price": 12.50 },
    { "sku": "CP-010", "name": "Carrying Pouch", "qty": 1, "price": 5.00 }
  ],
  "total": 30.00,
  "currency": "USD"
}

Starter code (adapt from json_mode_schema.py):

schema = {
  "name": "OrderExtraction",
  "schema": {
    "type": "object",
    "properties": {
      # TODO: define order_id, customer (object), items (array of objects), total (number), currency (string)
    },
    "required": [],  # TODO: list the required field names
    "additionalProperties": False
  },
  "strict": True
}

messages = [
  {"role":"system","content":"Return ONLY a JSON object matching the schema."},
  {"role":"user","content":"Order A-1029 by Sarah Johnson : 2x Water Bottle ($12.50 each), 1x Carrying Pouch ($5). Total $30."}
]

Task: Create the schema, run JSON Mode with response_format={"type":"json_schema", "json_schema": schema}, and show parsed output. If the model violates the schema, refine the prompt or fields and retry.
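When the model violates the schema, an automatic retry that feeds the parse error back into the prompt often fixes it. A sketch of that loop (call_model is a hypothetical wrapper you would write around completion; the demo below fakes it so the snippet runs offline):

```python
import json

def parse_with_retry(call_model, max_attempts: int = 3) -> dict:
    """Call `call_model(feedback)` until the reply parses as JSON,
    feeding the parse error back into the next attempt."""
    feedback = None
    for _ in range(max_attempts):
        raw = call_model(feedback)
        try:
            return json.loads(raw)
        except json.JSONDecodeError as e:
            feedback = f"Previous reply was not valid JSON ({e}). Return ONLY the JSON object."
    raise ValueError("model never returned valid JSON")

# Demo: a fake model that fails once, then complies
replies = iter(["oops, here is your order:", '{"order_id": "A-1029"}'])
print(parse_with_retry(lambda feedback: next(replies)))  # -> {'order_id': 'A-1029'}
```

In your real wrapper, append the feedback string as an extra user message before re-calling completion.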

3) Currency mini-agent (simulated) - complete tool calling

Goal: Build a multi-tool agent to convert money amounts, resolve currency names, and show supported codes.

Template: Use this class-based stub (based on tc_complete.py/tc_complete_class.py). Save as tc_complete_currency.py.

> We give you a mock rate table and complete tool definitions for the first two tools. You must implement the `convert` tool: add its tool definition and write the function body.

"""
STUDENT_TODO: Currency mini-agent using LiteLLM tool calling (class-based)
- Tools to support:
  1) list_supported() -> list[str]              # PROVIDED (def + schema)
  2) resolve_currency(name_or_code: str) -> str  # PROVIDED (def + schema)
  3) convert(amount: float, base: str, quote: str) -> dict  # YOU implement (def + schema)
- Keep INTERMEDIATE prints before each execution for teaching/debugging.
"""
from typing import Dict, Any, List
from dataclasses import dataclass
import json
from litellm import completion
from config import MODEL

# ===== Mock data =====
RATE_TABLE: Dict[str, float] = {
    "USD->THB": 35.0,
    "THB->USD": 0.0286,
    "THB->EUR": 0.025,
    "EUR->THB": 40.0,
    "USD->EUR": 0.92,
    "EUR->USD": 1.087,
}
SUPPORTED = ["USD", "THB", "EUR", "JPY"]
NAME_TO_ISO = {"baht": "THB", "dollar": "USD", "euro": "EUR", "yen": "JPY"}

@dataclass
class ToolCall:
    name: str
    arguments: str

class CurrencyTools:
    """Currency utilities exposed as tools."""

    # --- Tool 1: list_supported (PROVIDED) ---
    def list_supported(self) -> List[str]:
        return SUPPORTED

    # --- Tool 2: resolve_currency (PROVIDED) ---
    def resolve_currency(self, name_or_code: str) -> str:
        code = (name_or_code or "").strip().upper()
        if code in SUPPORTED:
            return code
        return NAME_TO_ISO.get((name_or_code or "").strip().lower(), "UNKNOWN")

    # --- Tool 3: convert (YOU implement) ---
    def convert(self, amount: float, base: str, quote: str) -> Dict[str, Any]:
        """STUDENT_TODO: use RATE_TABLE to compute the result.
        Return a dict like: {"rate": ..., "converted": ...}.
        If the rate is missing, return {"error": f"No rate for {base}->{quote}"}.
        """
        raise NotImplementedError("Implement convert() using RATE_TABLE")

    @classmethod
    def get_schemas(cls) -> List[dict]:
        """Return tool schemas (OpenAI-compatible). Fill the TODO for convert."""
        return [
            # 1) list_supported - schema COMPLETE
            {
                "name": "list_supported",
                "description": "Return supported currency ISO codes",
                "parameters": {"type": "object", "properties": {}},
            },
            # 2) resolve_currency - schema COMPLETE
            {
                "name": "resolve_currency",
                "description": "Map currency name or code to ISO code (e.g., 'baht'->'THB')",
                "parameters": {
                    "type": "object",
                    "properties": {"name_or_code": {"type": "string"}},
                    "required": ["name_or_code"],
                },
            },
            # 3) convert - STUDENT_TODO: COMPLETE THIS SCHEMA
            # {
            #     "name": "convert",
            #     "description": "Convert amount from base to quote using fixed RATE_TABLE",
            #     "parameters": {
            #         "type": "object",
            #         "properties": {
            #             "amount": {"type": "number"},
            #             "base":   {"type": "string"},
            #             "quote":  {"type": "string"}
            #         },
            #         "required": ["amount", "base", "quote"]
            #     }
            # }
        ]

class ToolExecutor:
    def __init__(self):
        self.tools = {}
        self.tool_schemas: List[dict] = []

    def register_tool(self, name: str, func, schema: dict):
        self.tools[name] = func
        self.tool_schemas.append(schema)

    def register_tools(self, tool_obj):
        for schema in tool_obj.get_schemas():
            name = schema["name"]
            if not hasattr(tool_obj, name):
                continue
            self.register_tool(name, getattr(tool_obj, name), schema)

    def run(self, user_text: str, model: str = MODEL, max_turns: int = 6):
        messages = [{"role": "user", "content": user_text}]
        for turn in range(1, max_turns + 1):
            resp = completion(model=model, messages=messages, functions=self.tool_schemas, function_call="auto")
            msg = resp.choices[0].message
            fc: ToolCall | None = getattr(msg, "function_call", None)
            if not fc:
                print("FINAL:", getattr(msg, "content", None) or msg.get("content"))
                break
            # INTERMEDIATE print
            print(f"=== INTERMEDIATE (turn {turn}) ===")
            print("name:", getattr(fc, "name", None))
            print("arguments:", getattr(fc, "arguments", None))
            # Execute tool
            try:
                args = json.loads(getattr(fc, "arguments", "{}") or "{}")
                name = getattr(fc, "name", None)
                result = self.tools[name](**args) if args else self.tools[name]()
            except Exception as e:
                result = {"error": str(e)}
            # Return result
            messages.append({"role": "assistant", "content": None, "function_call": {"name": getattr(fc, "name", None), "arguments": getattr(fc, "arguments", "{}")}})
            messages.append({"role": "function", "name": getattr(fc, "name", None), "content": json.dumps(result)})

if __name__ == "__main__":
    tools = CurrencyTools()
    ex = ToolExecutor()
    ex.register_tools(tools)
    # After you implement convert() + its schema, these should work nicely:
    ex.run("Convert 100 USD to THB")
    ex.run("Convert 250 baht to euros")

Expected results (after you implement `convert`)

1. "Convert 100 USD to THB" → final JSON summary like { "amount": 100, "base": "USD", "quote": "THB", "rate": 35.0, "converted": 3500 }

2. "Convert 250 baht to euros" → should call resolve_currency then convert; final JSON summary like { "amount": 250, "base": "THB", "quote": "EUR", "rate": 0.025, "converted": 6.25 }

> Keep temperature=0.2 (pass temperature=0.2 to completion) for stable tool selection. If a code is unknown, call list_supported and ask the user to choose.

Deliverables (Homework)

- Create a GitHub repo `hw10` with your solutions for Tasks 1 and 3 (code + brief README)

- Include your tc_complete_currency.py stub solution and test cases (screenshots ok)

- Export your lab sheet results as HTML using GitHub-style formatting and include it in the repo

- Add collaborators: kitt-cmu and meetip-checker to the repo