Anthropic's Claude has become the model of choice for reasoning-heavy, long-context, and safety-critical applications. Claude 3.5 Sonnet leads the coding benchmarks; Claude 3 Opus remains the gold standard for nuanced analysis. But most developers use Claude like a fancy chatbot — missing its most powerful capabilities entirely.
This guide covers the Claude API features that actually move the needle: the Tools API (function calling), Vision, and the game-changing Extended Thinking mode introduced with Claude 3.7 Sonnet in early 2025. All examples use the official `@anthropic-ai/sdk` package.
## Setup & Client Initialisation
```bash
npm install @anthropic-ai/sdk
```
```js
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY, // never hardcode
});

// Basic message
const msg = await client.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Explain React Server Components in one paragraph.' },
  ],
});

console.log(msg.content[0].text);
```
Always set `max_tokens` explicitly. Unlike the OpenAI API, which falls back to a default limit, the Claude API returns a validation error if the field is missing.
## Tool Use (Function Calling)
Claude's Tool Use lets you define JSON schemas for functions. Claude decides when to call them, supplying the correct arguments. You execute the function and return the result — Claude then uses it to form the final answer.
```js
const tools = [
  {
    name: 'get_weather',
    description: 'Get current weather for a city.',
    input_schema: {
      type: 'object',
      properties: {
        city: { type: 'string', description: 'City name, e.g. "London"' },
        unit: { type: 'string', enum: ['celsius', 'fahrenheit'], default: 'celsius' },
      },
      required: ['city'],
    },
  },
  {
    name: 'search_web',
    description: 'Search the web for up-to-date information.',
    input_schema: {
      type: 'object',
      properties: {
        query: { type: 'string' },
        num_results: { type: 'number', default: 5 },
      },
      required: ['query'],
    },
  },
];
```
### The Tool Use Loop
```js
async function runWithTools(userMessage) {
  const messages = [{ role: 'user', content: userMessage }];

  while (true) {
    const response = await client.messages.create({
      model: 'claude-3-5-sonnet-20241022',
      max_tokens: 4096,
      tools,
      messages,
    });

    // No tool call requested (stop_reason is end_turn, max_tokens, etc.):
    // return the final answer instead of looping forever
    if (response.stop_reason !== 'tool_use') {
      return response.content.find(b => b.type === 'text')?.text;
    }

    // Process tool calls
    const toolUseBlocks = response.content.filter(b => b.type === 'tool_use');
    const toolResults = [];

    for (const toolUse of toolUseBlocks) {
      let result;
      if (toolUse.name === 'get_weather') {
        result = await fetchWeather(toolUse.input.city, toolUse.input.unit);
      } else if (toolUse.name === 'search_web') {
        result = await performWebSearch(toolUse.input.query);
      }
      toolResults.push({
        type: 'tool_result',
        tool_use_id: toolUse.id,
        content: JSON.stringify(result),
      });
    }

    // Append assistant response + tool results, continue loop
    messages.push({ role: 'assistant', content: response.content });
    messages.push({ role: 'user', content: toolResults });
  }
}

const answer = await runWithTools('What is the weather in Tokyo and Berlin right now?');
console.log(answer);
```
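Tool inputs are model-generated JSON, so validate them before executing anything. Here is a minimal, dependency-free sketch for the `get_weather` tool (in production you would typically reach for a schema library such as Zod; `validateWeatherInput` is a hypothetical helper, not part of the SDK):

```js
// Hypothetical validator for the get_weather tool's input.
// Returns a normalised input object, or throws with a descriptive error.
function validateWeatherInput(input) {
  if (typeof input !== 'object' || input === null) {
    throw new Error('get_weather: input must be an object');
  }
  if (typeof input.city !== 'string' || input.city.trim() === '') {
    throw new Error('get_weather: "city" must be a non-empty string');
  }
  const unit = input.unit ?? 'celsius'; // apply the schema default
  if (unit !== 'celsius' && unit !== 'fahrenheit') {
    throw new Error('get_weather: "unit" must be "celsius" or "fahrenheit"');
  }
  return { city: input.city.trim(), unit };
}
```

Call it in the loop before `fetchWeather`; if it throws, return the error message as a `tool_result` with `is_error: true` so Claude can self-correct on the next turn.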
## Vision: Sending Images to Claude
Claude accepts images as base64 data or public URLs. It excels at diagram analysis, UI screenshot critique, OCR, and chart interpretation.
```js
import fs from 'fs';

// From file (base64)
const imageData = fs.readFileSync('screenshot.png');
const base64Image = imageData.toString('base64');

const response = await client.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [{
    role: 'user',
    content: [
      {
        type: 'image',
        source: {
          type: 'base64',
          media_type: 'image/png',
          data: base64Image,
        },
      },
      {
        type: 'text',
        text: 'Analyse this UI screenshot. List all accessibility issues and suggest fixes.',
      },
    ],
  }],
});

console.log(response.content[0].text);
```
### Vision + URL (public images)
```js
const response = await client.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [{
    role: 'user',
    content: [
      {
        type: 'image',
        source: { type: 'url', url: 'https://example.com/architecture-diagram.png' },
      },
      { type: 'text', text: 'Explain this architecture. What are the potential bottlenecks?' },
    ],
  }],
});
```
Key limits at a glance:

- 200K token context window
- 5 images per request
- 20MB max image size
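It can pay to pre-check files against these limits before uploading. A small sketch that builds a ready-to-send image block (the 20MB ceiling comes from the limits above; `prepareImageBlock` and the extension-to-media-type map are illustrative helpers, not SDK APIs):

```js
import fs from 'fs';
import path from 'path';

const MEDIA_TYPES = {
  '.png': 'image/png',
  '.jpg': 'image/jpeg',
  '.jpeg': 'image/jpeg',
  '.gif': 'image/gif',
  '.webp': 'image/webp',
};
const MAX_IMAGE_BYTES = 20 * 1024 * 1024; // 20MB limit from above

// Reads an image file and returns a content block for messages.create,
// failing fast on unsupported types or oversized files.
function prepareImageBlock(filePath) {
  const mediaType = MEDIA_TYPES[path.extname(filePath).toLowerCase()];
  if (!mediaType) throw new Error(`Unsupported image type: ${filePath}`);

  const data = fs.readFileSync(filePath);
  if (data.length > MAX_IMAGE_BYTES) {
    throw new Error(`Image exceeds 20MB: ${filePath}`);
  }

  return {
    type: 'image',
    source: { type: 'base64', media_type: mediaType, data: data.toString('base64') },
  };
}
```

The returned object drops straight into a message's `content` array alongside a text block.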
## Extended Thinking Mode
Extended Thinking allows Claude to reason through a problem step-by-step before committing to a final answer — similar to OpenAI o1's chain-of-thought, but visible and controllable. You set a budget_tokens for the thinking process.
```js
const response = await client.messages.create({
  model: 'claude-3-7-sonnet-20250219', // supports extended thinking
  max_tokens: 16000,
  thinking: {
    type: 'enabled',
    budget_tokens: 10000, // how many tokens Claude can use to think
  },
  messages: [{
    role: 'user',
    content: 'Design a microservices architecture for a ride-sharing app with 10M DAU. Consider scalability, fault tolerance, and data consistency.',
  }],
});

// Separate thinking from final answer
const thinking = response.content.find(b => b.type === 'thinking');
const answer = response.content.find(b => b.type === 'text');

console.log('THINKING:\n', thinking?.thinking);
console.log('\nFINAL ANSWER:\n', answer?.text);
```
Extended Thinking is most valuable for maths, code generation, architecture design, and complex reasoning tasks. Disable it for simple queries — you pay for thinking tokens too.
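One way to act on that advice is to gate thinking behind a rough complexity check. A sketch, where both the keyword heuristic and `thinkingConfigFor` are illustrative assumptions rather than API features:

```js
// Crude heuristic: only prompts that look hard get a thinking budget.
const HARD_KEYWORDS = ['design', 'architecture', 'prove', 'optimise', 'debug', 'trade-off'];

// Returns extra request params to spread into messages.create.
// Thinking tokens are billed, so the default is off.
function thinkingConfigFor(prompt) {
  const p = prompt.toLowerCase();
  const looksHard = p.length > 400 || HARD_KEYWORDS.some(k => p.includes(k));
  return looksHard
    ? { thinking: { type: 'enabled', budget_tokens: 8000 } }
    : {};
}
```

Usage: `await client.messages.create({ model, max_tokens: 16000, ...thinkingConfigFor(prompt), messages })`. A cheap classifier model can replace the keyword list once you outgrow it.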
## Streaming Responses
```js
const stream = await client.messages.stream({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Write a haiku about TypeScript generics.' }],
});

for await (const chunk of stream) {
  if (chunk.type === 'content_block_delta' && chunk.delta.type === 'text_delta') {
    process.stdout.write(chunk.delta.text);
  }
}

const finalMessage = await stream.finalMessage();
console.log('\nTotal tokens used:', finalMessage.usage);
```
## Production Best Practices
- Use system prompts wisely. Claude's system prompt is prime real estate for persona, format instructions, and constraints. Keep it under 2K tokens — longer system prompts increase latency.
- Cache with Prompt Caching. For documents or code you send repeatedly, enable prompt caching — reduces cost by up to 90% and latency by 85% on cached portions.
- Model routing. Use `claude-3-haiku` for simple classification/extraction (10× cheaper), `claude-3-5-sonnet` for most tasks, and `claude-3-opus` only for the highest-complexity reasoning.
- Retry with exponential backoff. The API can return 529 (overloaded); implement retries with jitter for production reliability.
- Validate tool inputs. Claude's tool-call arguments are usually correct, but never trust them without validation; use Zod schemas before executing any function.
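The caching bullet above maps to `cache_control` blocks in the Messages API: mark the end of the stable prefix (system prompt, large document) as `ephemeral` and subsequent calls reuse it. A sketch of the request shape, where the document text and assistant persona are placeholders:

```js
// Build a request whose large, stable prefix is marked cacheable.
function buildCachedRequest(bigDocument, question) {
  return {
    model: 'claude-3-5-sonnet-20241022',
    max_tokens: 1024,
    system: [
      { type: 'text', text: 'You are a precise code-review assistant.' },
      {
        type: 'text',
        text: bigDocument, // e.g. a repo snapshot resent on every call
        cache_control: { type: 'ephemeral' }, // cache everything up to here
      },
    ],
    messages: [{ role: 'user', content: question }],
  };
}
```

Pass the result to `client.messages.create(...)` and check `response.usage.cache_read_input_tokens` to confirm cache hits.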
## Summary
- Tools API: define JSON schemas, implement the loop, execute and return results
- Vision: send base64 or URL images alongside text for multimodal analysis
- Extended Thinking: set `budget_tokens` to give Claude room to reason; invaluable for hard problems
- Stream all responses in production for perceived-latency improvement
- Route to the cheapest model that can do the job: Haiku → Sonnet → Opus