Anthropic's Claude has become the model of choice for reasoning-heavy, long-context, and safety-critical applications. Claude 3.5 Sonnet leads the coding benchmarks; Claude 3 Opus remains the gold standard for nuanced analysis. But most developers use Claude like a fancy chatbot — missing its most powerful capabilities entirely.
This guide covers the Claude API features that actually move the needle: the Tools API (function calling), Vision, and the game-changing Extended Thinking mode introduced with Claude 3.7 Sonnet in early 2025. All examples use the official `@anthropic-ai/sdk` package.
## Setup & Client Initialisation
```bash
npm install @anthropic-ai/sdk
```
```js
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY, // never hardcode
});

// Basic message
const msg = await client.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Explain React Server Components in one paragraph.' },
  ],
});

console.log(msg.content[0].text);
```
Always set `max_tokens` explicitly. Unlike the OpenAI API, which falls back to a default limit, the Claude API returns a validation error if the field is missing.
## Tool Use (Function Calling)
Claude's Tool Use lets you define JSON schemas for functions. Claude decides when to call them, supplying the correct arguments. You execute the function and return the result — Claude then uses it to form the final answer.
```js
const tools = [
  {
    name: 'get_weather',
    description: 'Get current weather for a city.',
    input_schema: {
      type: 'object',
      properties: {
        city: { type: 'string', description: 'City name, e.g. "London"' },
        unit: { type: 'string', enum: ['celsius', 'fahrenheit'], default: 'celsius' },
      },
      required: ['city'],
    },
  },
  {
    name: 'search_web',
    description: 'Search the web for up-to-date information.',
    input_schema: {
      type: 'object',
      properties: {
        query: { type: 'string' },
        num_results: { type: 'number', default: 5 },
      },
      required: ['query'],
    },
  },
];
```
### The Tool Use Loop
```js
async function runWithTools(userMessage) {
  const messages = [{ role: 'user', content: userMessage }];

  while (true) {
    const response = await client.messages.create({
      model: 'claude-3-5-sonnet-20241022',
      max_tokens: 4096,
      tools,
      messages,
    });

    // No tool call requested (stop_reason is end_turn, max_tokens, etc.):
    // return the final answer instead of looping forever
    if (response.stop_reason !== 'tool_use') {
      return response.content.find(b => b.type === 'text')?.text;
    }

    // Process tool calls
    const toolUseBlocks = response.content.filter(b => b.type === 'tool_use');
    const toolResults = [];

    for (const toolUse of toolUseBlocks) {
      let result;
      if (toolUse.name === 'get_weather') {
        result = await fetchWeather(toolUse.input.city, toolUse.input.unit);
      } else if (toolUse.name === 'search_web') {
        result = await performWebSearch(toolUse.input.query);
      }
      toolResults.push({
        type: 'tool_result',
        tool_use_id: toolUse.id,
        content: JSON.stringify(result),
      });
    }

    // Append assistant response + tool results, continue loop
    messages.push({ role: 'assistant', content: response.content });
    messages.push({ role: 'user', content: toolResults });
  }
}

const answer = await runWithTools('What is the weather in Tokyo and Berlin right now?');
console.log(answer);
```
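Tool inputs are model-generated JSON, so validate them before executing anything. Here is a minimal, dependency-free sketch for the `get_weather` tool (in production you would typically reach for a schema library such as Zod; `validateWeatherInput` is a hypothetical helper, not part of the SDK):

```js
// Hypothetical validator for the get_weather tool's input.
// Returns a normalised input object, or throws with a descriptive error.
function validateWeatherInput(input) {
  if (typeof input !== 'object' || input === null) {
    throw new Error('get_weather: input must be an object');
  }
  if (typeof input.city !== 'string' || input.city.trim() === '') {
    throw new Error('get_weather: "city" must be a non-empty string');
  }
  const unit = input.unit ?? 'celsius'; // apply the schema default
  if (unit !== 'celsius' && unit !== 'fahrenheit') {
    throw new Error('get_weather: "unit" must be "celsius" or "fahrenheit"');
  }
  return { city: input.city.trim(), unit };
}
```

Call it in the loop before `fetchWeather`; if it throws, return the error message as a `tool_result` with `is_error: true` so Claude can self-correct on the next turn.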
## Vision: Sending Images to Claude
Claude accepts images as base64 data or public URLs. It excels at diagram analysis, UI screenshot critique, OCR, and chart interpretation.
```js
import fs from 'fs';

// From file (base64)
const imageData = fs.readFileSync('screenshot.png');
const base64Image = imageData.toString('base64');

const response = await client.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [{
    role: 'user',
    content: [
      {
        type: 'image',
        source: {
          type: 'base64',
          media_type: 'image/png',
          data: base64Image,
        },
      },
      {
        type: 'text',
        text: 'Analyse this UI screenshot. List all accessibility issues and suggest fixes.',
      },
    ],
  }],
});

console.log(response.content[0].text);
```
### Vision + URL (public images)
```js
const response = await client.messages.create({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [{
    role: 'user',
    content: [
      {
        type: 'image',
        source: { type: 'url', url: 'https://example.com/architecture-diagram.png' },
      },
      { type: 'text', text: 'Explain this architecture. What are the potential bottlenecks?' },
    ],
  }],
});
```
Key limits at a glance:

- 200K token context window
- 5 images per request
- 20MB max image size
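It can pay to pre-check files against these limits before uploading. A small sketch that builds a ready-to-send image block (the 20MB ceiling comes from the limits above; `prepareImageBlock` and the extension-to-media-type map are illustrative helpers, not SDK APIs):

```js
import fs from 'fs';
import path from 'path';

const MEDIA_TYPES = {
  '.png': 'image/png',
  '.jpg': 'image/jpeg',
  '.jpeg': 'image/jpeg',
  '.gif': 'image/gif',
  '.webp': 'image/webp',
};
const MAX_IMAGE_BYTES = 20 * 1024 * 1024; // 20MB limit from above

// Reads an image file and returns a content block for messages.create,
// failing fast on unsupported types or oversized files.
function prepareImageBlock(filePath) {
  const mediaType = MEDIA_TYPES[path.extname(filePath).toLowerCase()];
  if (!mediaType) throw new Error(`Unsupported image type: ${filePath}`);

  const data = fs.readFileSync(filePath);
  if (data.length > MAX_IMAGE_BYTES) {
    throw new Error(`Image exceeds 20MB: ${filePath}`);
  }

  return {
    type: 'image',
    source: { type: 'base64', media_type: mediaType, data: data.toString('base64') },
  };
}
```

The returned object drops straight into a message's `content` array alongside a text block.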
## Extended Thinking Mode
Extended Thinking allows Claude to reason through a problem step-by-step before committing to a final answer — similar to OpenAI o1's chain-of-thought, but visible and controllable. You set a budget_tokens for the thinking process.
```js
const response = await client.messages.create({
  model: 'claude-3-7-sonnet-20250219', // supports extended thinking
  max_tokens: 16000,
  thinking: {
    type: 'enabled',
    budget_tokens: 10000, // how many tokens Claude can use to think
  },
  messages: [{
    role: 'user',
    content: 'Design a microservices architecture for a ride-sharing app with 10M DAU. Consider scalability, fault tolerance, and data consistency.',
  }],
});

// Separate thinking from final answer
const thinking = response.content.find(b => b.type === 'thinking');
const answer = response.content.find(b => b.type === 'text');

console.log('THINKING:\n', thinking?.thinking);
console.log('\nFINAL ANSWER:\n', answer?.text);
```
Extended Thinking is most valuable for maths, code generation, architecture design, and complex reasoning tasks. Disable it for simple queries — you pay for thinking tokens too.
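One way to act on that advice is to gate thinking behind a rough complexity check. A sketch, where both the keyword heuristic and `thinkingConfigFor` are illustrative assumptions rather than API features:

```js
// Crude heuristic: only prompts that look hard get a thinking budget.
const HARD_KEYWORDS = ['design', 'architecture', 'prove', 'optimise', 'debug', 'trade-off'];

// Returns extra request params to spread into messages.create.
// Thinking tokens are billed, so the default is off.
function thinkingConfigFor(prompt) {
  const p = prompt.toLowerCase();
  const looksHard = p.length > 400 || HARD_KEYWORDS.some(k => p.includes(k));
  return looksHard
    ? { thinking: { type: 'enabled', budget_tokens: 8000 } }
    : {};
}
```

Usage: `await client.messages.create({ model, max_tokens: 16000, ...thinkingConfigFor(prompt), messages })`. A cheap classifier model can replace the keyword list once you outgrow it.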
## Streaming Responses
```js
const stream = await client.messages.stream({
  model: 'claude-3-5-sonnet-20241022',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Write a haiku about TypeScript generics.' }],
});

for await (const chunk of stream) {
  if (chunk.type === 'content_block_delta' && chunk.delta.type === 'text_delta') {
    process.stdout.write(chunk.delta.text);
  }
}

const finalMessage = await stream.finalMessage();
console.log('\nTotal tokens used:', finalMessage.usage);
```
## Production Best Practices
- Use system prompts wisely. Claude's system prompt is prime real estate for persona, format instructions, and constraints. Keep it under 2K tokens — longer system prompts increase latency.
- Cache with Prompt Caching. For documents or code you send repeatedly, enable prompt caching — reduces cost by up to 90% and latency by 85% on cached portions.
- Model routing. Use `claude-3-haiku` for simple classification/extraction (10× cheaper), `claude-3-5-sonnet` for most tasks, and `claude-3-opus` only for the highest-complexity reasoning.
- Retry with exponential backoff. The API can return 529 (overloaded); implement retries with jitter for production reliability.
- Validate tool inputs. Claude's tool-call arguments are usually correct, but never trust them without validation; use Zod schemas before executing any function.
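The caching bullet above maps to `cache_control` blocks in the Messages API: mark the end of the stable prefix (system prompt, large document) as `ephemeral` and subsequent calls reuse it. A sketch of the request shape, where the document text and assistant persona are placeholders:

```js
// Build a request whose large, stable prefix is marked cacheable.
function buildCachedRequest(bigDocument, question) {
  return {
    model: 'claude-3-5-sonnet-20241022',
    max_tokens: 1024,
    system: [
      { type: 'text', text: 'You are a precise code-review assistant.' },
      {
        type: 'text',
        text: bigDocument, // e.g. a repo snapshot resent on every call
        cache_control: { type: 'ephemeral' }, // cache everything up to here
      },
    ],
    messages: [{ role: 'user', content: question }],
  };
}
```

Pass the result to `client.messages.create(...)` and check `response.usage.cache_read_input_tokens` to confirm cache hits.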
## Summary
- Tools API: define JSON schemas, implement the loop, execute and return results
- Vision: send base64 or URL images alongside text for multimodal analysis
- Extended Thinking: set `budget_tokens` to give Claude room to reason; invaluable for hard problems
- Stream all responses in production for perceived-latency improvement
- Route to the cheapest model that can do the job: Haiku → Sonnet → Opus