AI/LLMs · 1 Mar 2026 · 14 min read

Claude API Masterclass: Tool Use, Vision & Extended Thinking

A complete guide to Anthropic's Claude API — tool use (function calling), vision inputs, extended thinking mode, and production best practices.


Suboor Khan

Full-Stack Developer & Technical Writer


Anthropic's Claude has become the model of choice for reasoning-heavy, long-context, and safety-critical applications. Claude 3.5 Sonnet leads the coding benchmarks; Claude 3 Opus remains the gold standard for nuanced analysis. But most developers use Claude like a fancy chatbot — missing its most powerful capabilities entirely.

This guide covers the Claude API features that actually move the needle: the Tools API (function calling), Vision, and Extended Thinking mode, introduced with Claude 3.7 Sonnet in early 2025. All examples use the official @anthropic-ai/sdk.

Setup & Client Initialisation

npm install @anthropic-ai/sdk
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,  // never hardcode
});

// Basic message
const msg = await client.messages.create({
  model:      'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Explain React Server Components in one paragraph.' }
  ],
});
console.log(msg.content[0].text);

Always set max_tokens explicitly. The Messages API requires it and rejects requests without it, unlike OpenAI's API, which falls back to a default limit.

Tool Use (Function Calling)

Claude's Tool Use lets you define JSON schemas for functions. Claude decides when to call them, supplying the correct arguments. You execute the function and return the result — Claude then uses it to form the final answer.

const tools = [
  {
    name: 'get_weather',
    description: 'Get current weather for a city.',
    input_schema: {
      type: 'object',
      properties: {
        city:    { type: 'string', description: 'City name, e.g. "London"' },
        unit:    { type: 'string', enum: ['celsius', 'fahrenheit'], default: 'celsius' },
      },
      required: ['city'],
    },
  },
  {
    name: 'search_web',
    description: 'Search the web for up-to-date information.',
    input_schema: {
      type: 'object',
      properties: {
        query: { type: 'string' },
        num_results: { type: 'number', default: 5 },
      },
      required: ['query'],
    },
  },
];

The Tool Use Loop

async function runWithTools(userMessage) {
  const messages = [{ role: 'user', content: userMessage }];

  while (true) {
    const response = await client.messages.create({
      model:     'claude-3-5-sonnet-20241022',
      max_tokens: 4096,
      tools,
      messages,
    });

    // No tool call requested — return the final answer.
    // stop_reason is 'tool_use' only when Claude wants a tool executed;
    // checking for that (rather than 'end_turn') also exits cleanly if the
    // response was cut off by max_tokens.
    if (response.stop_reason !== 'tool_use') {
      return response.content.find(b => b.type === 'text')?.text;
    }

    // Process tool calls
    const toolUseBlocks = response.content.filter(b => b.type === 'tool_use');
    const toolResults   = [];

    for (const toolUse of toolUseBlocks) {
      let result;
      if (toolUse.name === 'get_weather') {
        result = await fetchWeather(toolUse.input.city, toolUse.input.unit);
      } else if (toolUse.name === 'search_web') {
        result = await performWebSearch(toolUse.input.query);
      }
      toolResults.push({
        type:        'tool_result',
        tool_use_id: toolUse.id,
        content:     JSON.stringify(result),
      });
    }

    // Append assistant response + tool results, continue loop
    messages.push({ role: 'assistant', content: response.content });
    messages.push({ role: 'user',      content: toolResults });
  }
}

const answer = await runWithTools('What is the weather in Tokyo and Berlin right now?');
console.log(answer);

Vision: Sending Images to Claude

Claude accepts images as base64 data or public URLs. It excels at diagram analysis, UI screenshot critique, OCR, and chart interpretation.

import fs from 'fs';

// From file (base64)
const imageData   = fs.readFileSync('screenshot.png');
const base64Image = imageData.toString('base64');

const response = await client.messages.create({
  model:      'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [{
    role: 'user',
    content: [
      {
        type:   'image',
        source: {
          type:       'base64',
          media_type: 'image/png',
          data:        base64Image,
        },
      },
      {
        type: 'text',
        text: 'Analyse this UI screenshot. List all accessibility issues and suggest fixes.',
      },
    ],
  }],
});

console.log(response.content[0].text);

Vision + URL (public images)

const response = await client.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [{
    role: 'user',
    content: [
      {
        type:   'image',
        source: { type: 'url', url: 'https://example.com/architecture-diagram.png' },
      },
      { type: 'text', text: 'Explain this architecture. What are the potential bottlenecks?' },
    ],
  }],
});

Key limits:

  • Context window: 200K tokens
  • Images per request: 5
  • Max image size: 20MB
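Those limits are worth enforcing client-side before you pay for a round trip. A minimal sketch against the 20MB figure quoted above (the helper name is our own, not an SDK function):

```typescript
// Guard against the 20MB image limit before building the request
const MAX_IMAGE_BYTES = 20 * 1024 * 1024;

function isWithinImageLimit(sizeInBytes: number): boolean {
  return sizeInBytes <= MAX_IMAGE_BYTES;
}

// Usage with a local file:
// const { size } = fs.statSync('screenshot.png');
// if (!isWithinImageLimit(size)) throw new Error('Image exceeds the 20MB API limit');
```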

Extended Thinking Mode

Extended Thinking allows Claude to reason through a problem step-by-step before committing to a final answer — similar to OpenAI o1's chain-of-thought, but visible and controllable. You set a budget_tokens for the thinking process.

const response = await client.messages.create({
  model:      'claude-3-7-sonnet-20250219',   // supports extended thinking
  max_tokens: 16000,
  thinking: {
    type:          'enabled',
    budget_tokens: 10000,   // how many tokens Claude can use to think
  },
  messages: [{
    role:    'user',
    content: 'Design a microservices architecture for a ride-sharing app with 10M DAU. Consider scalability, fault tolerance, and data consistency.',
  }],
});

// Separate thinking from final answer
const thinking = response.content.find(b => b.type === 'thinking');
const answer   = response.content.find(b => b.type === 'text');

console.log('THINKING:\n', thinking?.thinking);
console.log('\nFINAL ANSWER:\n', answer?.text);

Extended Thinking is most valuable for maths, code generation, architecture design, and complex reasoning tasks. Disable it for simple queries — you pay for thinking tokens too.
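One way to follow that advice is to make the thinking block conditional per request. A sketch (thinkingParams and the 10K budget are our own choices, not SDK names):

```typescript
// Hypothetical helper: only pay for thinking tokens on hard problems
function thinkingParams(isComplex: boolean): {
  thinking?: { type: 'enabled'; budget_tokens: number };
} {
  return isComplex
    ? { thinking: { type: 'enabled', budget_tokens: 10_000 } }
    : {};
}

// Spread into the request:
// await client.messages.create({ model, max_tokens, messages, ...thinkingParams(true) });
```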

Streaming Responses

const stream = await client.messages.stream({
  model:      'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Write a haiku about TypeScript generics.' }],
});

for await (const chunk of stream) {
  if (chunk.type === 'content_block_delta' && chunk.delta.type === 'text_delta') {
    process.stdout.write(chunk.delta.text);
  }
}

const finalMessage = await stream.finalMessage();
console.log('\nTotal tokens used:', finalMessage.usage);

Production Best Practices

  • Use system prompts wisely. Claude's system prompt is prime real estate for persona, format instructions, and constraints. Keep it under 2K tokens — longer system prompts increase latency.
  • Cache with Prompt Caching. For documents or code you send repeatedly, enable prompt caching — reduces cost by up to 90% and latency by 85% on cached portions.
  • Model routing. Use claude-3-haiku for simple classification/extraction (10× cheaper), claude-3-5-sonnet for most tasks, claude-3-opus only for highest-complexity reasoning.
  • Retry with exponential backoff. The API can return 529 (overloaded) — implement retries with jitter for production reliability.
  • Validate tool inputs. Claude's tool call arguments are usually correct but never trust them without validation — use Zod schemas before executing any function.
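The last point deserves a sketch. The bullet recommends Zod; here is a dependency-free version of the same idea, mirroring the get_weather schema defined earlier (the helper name is our own):

```typescript
// Validate tool arguments before executing; never trust model output blindly
type WeatherInput = { city: string; unit: 'celsius' | 'fahrenheit' };

function parseWeatherInput(raw: unknown): WeatherInput {
  const obj = (raw ?? {}) as Record<string, unknown>;
  const city = obj.city;
  if (typeof city !== 'string' || city.length === 0) {
    throw new Error('get_weather: "city" must be a non-empty string');
  }
  const unit = obj.unit ?? 'celsius';  // apply the schema's default
  if (unit === 'celsius' || unit === 'fahrenheit') {
    return { city, unit };
  }
  throw new Error('get_weather: "unit" must be "celsius" or "fahrenheit"');
}
```

Call it in the tool loop before fetchWeather, so a malformed argument fails loudly instead of hitting your backend.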

Summary

  • Tools API: define JSON schemas, implement the loop, execute and return results
  • Vision: send base64 or URL images alongside text for multimodal analysis
  • Extended Thinking: set budget_tokens to give Claude room to reason — invaluable for hard problems
  • Stream all responses in production for perceived-latency improvement
  • Route to the cheapest model that can do the job: Haiku → Sonnet → Opus
