Building an AI chatbot that actually helps users — rather than frustrating them — requires more than just wiring up an API. Here's what I've learned from building production chatbots with Claude.
The System Prompt is Everything
Your system prompt defines the chatbot's personality, capabilities, and boundaries. A vague system prompt produces vague responses.
```typescript
const systemPrompt = `You are a project intake assistant for Umbra Studio.
Your job is to understand what the visitor needs and qualify them as a lead.

Rules:
- Ask one question at a time
- Be conversational, not robotic
- Guide toward booking a discovery call
- Never quote exact prices
- If asked about timeline, give realistic ranges`;
```
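Writing the prompt is only half the job; it still has to reach the model. With Anthropic's Messages API, the system prompt goes in a top-level `system` field, not in the `messages` array. Here's a minimal sketch of the request payload (the model name and token limit are placeholders, not recommendations):

```typescript
type ChatMessage = { role: "user" | "assistant"; content: string };

// Build the request body for Anthropic's Messages API.
// Note: the system prompt is a top-level field, not a message.
function buildChatRequest(systemPrompt: string, messages: ChatMessage[]) {
  return {
    model: "claude-sonnet-4-5", // placeholder; pin the model you've tested against
    max_tokens: 1024,
    system: systemPrompt,
    messages,
  };
}

const request = buildChatRequest(
  "You are a project intake assistant for Umbra Studio.",
  [{ role: "user", content: "I need a new marketing site." }]
);
```

Keeping the payload construction in one function also gives you a single place to log prompts and swap models later.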
Streaming for Real-Time UX
Nobody wants to stare at a loading spinner for 5 seconds. Stream responses token by token:
```typescript
const response = await fetch("/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ messages }),
});

const reader = response.body?.getReader();
if (!reader) throw new Error("Response has no readable body");

const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // { stream: true } buffers multi-byte characters split across chunks
  const text = decoder.decode(value, { stream: true });
  appendToMessage(text);
}
```
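One subtlety worth knowing about chunk decoding: a chunk boundary can fall in the middle of a multi-byte UTF-8 character, and decoding each chunk in isolation emits replacement characters. `TextDecoder`'s `{ stream: true }` option buffers the dangling bytes until the next chunk completes them:

```typescript
const decoder = new TextDecoder();

// "é" is two bytes in UTF-8 (0xC3 0xA9); split them across two chunks.
const chunk1 = new Uint8Array([0x68, 0x69, 0x20, 0xc3]); // "hi " plus first byte
const chunk2 = new Uint8Array([0xa9]);                   // second byte

// With { stream: true } the decoder holds the incomplete byte until it resolves.
const part1 = decoder.decode(chunk1, { stream: true }); // "hi "
const part2 = decoder.decode(chunk2, { stream: true }); // "é"
console.log(part1 + part2); // "hi é"
```

This matters for chatbots in particular, since emoji and non-ASCII text show up constantly in model output.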
Rate Limiting is Non-Negotiable
Without rate limiting, one bad actor can drain your API budget in minutes. Use Upstash Redis for serverless-friendly rate limiting:
```typescript
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";

// Allow 10 requests per identifier per sliding one-minute window
const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(10, "1 m"),
});
```
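If you're curious what the sliding window is doing under the hood, here's a simplified in-memory sketch (strictly an illustration: it isn't shared across instances and won't survive a serverless cold start, which is exactly why Redis backs the real thing):

```typescript
// Simplified sliding-window limiter: allow `limit` requests per
// `windowMs` milliseconds, per identifier. In-memory, for illustration only.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();

  constructor(private limit: number, private windowMs: number) {}

  allow(id: string, now = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Keep only timestamps still inside the window.
    const recent = (this.hits.get(id) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(id, recent);
      return false; // over the limit
    }
    recent.push(now);
    this.hits.set(id, recent);
    return true;
  }
}

const limiter = new SlidingWindowLimiter(2, 60_000); // 2 requests per minute
limiter.allow("1.2.3.4"); // true
limiter.allow("1.2.3.4"); // true
limiter.allow("1.2.3.4"); // false: third request inside the window
```

In a real route handler you'd call `await ratelimit.limit(identifier)` on the Upstash limiter and return a 429 when the returned `success` flag is false.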
Key Takeaways
- Invest in your system prompt — it's the highest-leverage code in your chatbot
- Always stream — perceived latency matters more than actual latency
- Rate limit from day one — API costs can spiral fast
- Log conversations — you can't improve what you don't measure
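The logging takeaway deserves a concrete shape. A minimal sketch of a conversation logger (the schema and in-memory store are assumptions; in production you'd write to your database of choice):

```typescript
type LoggedTurn = {
  conversationId: string;
  role: "user" | "assistant";
  content: string;
  timestamp: number;
};

// In-memory store for illustration. Persisting transcripts lets you
// review real conversations and spot where the bot goes off the rails.
class ConversationLog {
  private turns: LoggedTurn[] = [];

  log(conversationId: string, role: LoggedTurn["role"], content: string): void {
    this.turns.push({ conversationId, role, content, timestamp: Date.now() });
  }

  transcript(conversationId: string): LoggedTurn[] {
    return this.turns.filter((t) => t.conversationId === conversationId);
  }
}

const logger = new ConversationLog();
logger.log("conv-1", "user", "How much does a website cost?");
logger.log("conv-1", "assistant", "It depends on scope. What are you building?");
```

Even a log this simple will surface the questions your system prompt doesn't handle yet.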
The chatbot on this site uses all of these patterns. Try it out on the pricing page to see it in action.