Getting Started
Get up and running with Gerbil in under 5 minutes.
Installation
One-off usage (try without installing):

```bash
npx @tryhamster/gerbil "Write a haiku about coding"
```

Global install (use the gerbil command directly):

```bash
npm install -g @tryhamster/gerbil
```

Local install (for programmatic use in your project):

```bash
npm install @tryhamster/gerbil
```

The CLI examples in these docs use gerbil and assume a global install. Substitute npx @tryhamster/gerbil if you are using it without installing.

Quick Start
One-liner API
The simplest way to use Gerbil. No setup required:
```js
import gerbil from "@tryhamster/gerbil";

const text = await gerbil("Explain recursion in one sentence");
console.log(text);
// → "Recursion is when a function calls itself to solve smaller instances of the same problem."
```

Class-based API
For more control, use the Gerbil class:
```js
import { Gerbil } from "@tryhamster/gerbil";

const g = new Gerbil();

// Load a model (downloads automatically on first use)
await g.loadModel("qwen3-0.6b");

// Generate text
const result = await g.generate("Write a haiku about coding", {
  maxTokens: 100,
  temperature: 0.8,
});

console.log(result.text);
console.log(`Speed: ${result.tokensPerSecond.toFixed(1)} tok/s`);

// Clean up when done
await g.dispose();
```

Streaming
Stream responses token by token:
```js
import { Gerbil } from "@tryhamster/gerbil";

const g = new Gerbil();
await g.loadModel("qwen3-0.6b");

for await (const chunk of g.stream("Tell me a story")) {
  process.stdout.write(chunk);
}

await g.dispose();
```
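If you also need the complete response afterward, you can accumulate the chunks as they arrive. A minimal sketch reusing the g instance from above:

```ts
// Stream to the terminal while also collecting the full response.
let full = "";
for await (const chunk of g.stream("Tell me a story")) {
  process.stdout.write(chunk);
  full += chunk;
}

console.log(`\n\nStreamed ${full.length} characters`);
```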
Thinking Mode
Qwen3 models support chain-of-thought reasoning. Enable it to see how the model thinks:
```js
const result = await g.generate("What is 127 × 43?", { thinking: true });

console.log("Thinking:", result.thinking);
// → "127 × 43 = 127 × 40 + 127 × 3 = 5080 + 381 = 5461"

console.log("Answer:", result.text);
// → "5461"
```

Structured JSON Output
Get structured data with Zod schema validation:
```js
import { json } from "@tryhamster/gerbil";
import { z } from "zod";

const person = await json("Extract: John is 32 and lives in NYC", {
  schema: z.object({
    name: z.string(),
    age: z.number(),
    city: z.string(),
  }),
});

console.log(person);
// → { name: "John", age: 32, city: "NYC" }
```
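Because the schema is a plain Zod object, you can also derive a TypeScript type from it with z.infer. A sketch (if json() already infers its return type from the schema, the explicit annotation is redundant but still documents the shape):

```ts
import { json } from "@tryhamster/gerbil";
import { z } from "zod";

// Define the schema once and reuse its inferred TypeScript type.
const PersonSchema = z.object({
  name: z.string(),
  age: z.number(),
  city: z.string(),
});

type Person = z.infer<typeof PersonSchema>;
// → { name: string; age: number; city: string }

const person: Person = await json("Extract: John is 32 and lives in NYC", {
  schema: PersonSchema,
});
```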
Embeddings
Generate embeddings for semantic search and RAG:
```js
import { embed, embedBatch } from "@tryhamster/gerbil";

// Single embedding
const result = await embed("Hello world");
console.log(result.vector); // [0.123, -0.456, ...]

// Batch embeddings
const results = await embedBatch([
  "Hello world",
  "Goodbye world",
]);
```
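For semantic search, compare embedding vectors with cosine similarity. A minimal sketch, assuming result.vector is an array of numbers as shown above:

```ts
import { embed } from "@tryhamster/gerbil";

// Cosine similarity: dot(a, b) / (|a| * |b|). Values near 1 mean "more similar".
function cosine(a: ArrayLike<number>, b: ArrayLike<number>): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

const query = await embed("How do I cook pasta?");
const doc = await embed("Boil water, add salt, then add the noodles.");

console.log(cosine(query.vector, doc.vector));
```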
Vision AI
Load a vision-capable model and pass images to your prompts:
```js
import { Gerbil } from "@tryhamster/gerbil";

const g = new Gerbil();
await g.loadModel("ministral-3b"); // Vision + reasoning model

// Describe an image
const result = await g.generate("What's in this image?", {
  images: [{ source: "https://example.com/photo.jpg" }]
});

console.log(result.text);
// → "A golden retriever playing fetch in a sunny park..."

await g.dispose();
```

See the Vision documentation for more details.
Text-to-Speech
Generate natural speech with Kokoro-82M (28 voices):
```js
import { Gerbil } from "@tryhamster/gerbil";

const g = new Gerbil();
await g.loadTTS();

const { audio, sampleRate } = await g.speak("Hello from Gerbil!", {
  voice: "af_heart",
  speed: 1.0,
});

// audio = Float32Array, sampleRate = 24000
```

See the Text-to-Speech documentation for all 28 voices.
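To save the result as a playable file in Node, you can wrap the raw samples in a WAV container. A minimal sketch, assuming mono 16-bit PCM and reusing audio and sampleRate from the example above:

```ts
import { writeFileSync } from "fs";

// Wrap raw Float32 PCM samples in a minimal 16-bit mono WAV file.
function toWav(samples: Float32Array, sampleRate: number): Buffer {
  const data = Buffer.alloc(samples.length * 2);
  for (let i = 0; i < samples.length; i++) {
    // Clamp to [-1, 1] and scale to 16-bit signed integers.
    const s = Math.max(-1, Math.min(1, samples[i]));
    data.writeInt16LE(Math.round(s * 32767), i * 2);
  }
  const header = Buffer.alloc(44);
  header.write("RIFF", 0);
  header.writeUInt32LE(36 + data.length, 4);
  header.write("WAVE", 8);
  header.write("fmt ", 12);
  header.writeUInt32LE(16, 16);             // fmt chunk size
  header.writeUInt16LE(1, 20);              // audio format: PCM
  header.writeUInt16LE(1, 22);              // channels: mono
  header.writeUInt32LE(sampleRate, 24);     // sample rate
  header.writeUInt32LE(sampleRate * 2, 28); // byte rate
  header.writeUInt16LE(2, 32);              // block align
  header.writeUInt16LE(16, 34);             // bits per sample
  header.write("data", 36);
  header.writeUInt32LE(data.length, 40);
  return Buffer.concat([header, data]);
}

writeFileSync("hello.wav", toWav(audio, sampleRate));
```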
Speech-to-Text
Transcribe audio with Whisper ONNX models:
```js
import { Gerbil } from "@tryhamster/gerbil";
import { readFileSync } from "fs";

const g = new Gerbil();
await g.loadSTT("whisper-tiny.en");

const audio = new Uint8Array(readFileSync("audio.wav"));
const { text, duration } = await g.transcribe(audio);

console.log(text); // "Hello world"
```

See the Speech-to-Text documentation for all Whisper models.
Using Any HuggingFace Model
Load any compatible model from HuggingFace:
```js
// Short syntax
await g.loadModel("hf:microsoft/Phi-3-mini-4k-instruct-onnx");

// Full URL
await g.loadModel("https://huggingface.co/Qwen/Qwen2.5-Coder-0.5B");

// Local model
await g.loadModel("file:./models/my-fine-tune");
```

💡 Pro Tip: Preload Models
Download models during app initialization so users don't wait on first use:
```js
// Node.js - download only (frees RAM after)
await g.preloadModel("qwen3-0.6b");

// Node.js - keep in memory for instant use
await g.preloadModel("qwen3-0.6b", { keepLoaded: true });

// Browser
import { preloadChatModel } from "@tryhamster/gerbil/browser";
await preloadChatModel("qwen3-0.6b", { keepLoaded: true });
```

See Browser Preloading and AI SDK Preloading for details.
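In a server, for example, you might warm the model before accepting traffic. A sketch using a hypothetical init() startup hook (any framework's startup phase works the same way):

```ts
import { Gerbil } from "@tryhamster/gerbil";

const g = new Gerbil();

// Hypothetical startup hook: warm the model before serving requests.
async function init() {
  await g.preloadModel("qwen3-0.6b", { keepLoaded: true });
  console.log("Model ready");
}

await init();
// ...start your HTTP server here...
```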
Next Steps
- Explore available models
- Use Vision AI: describe images, extract text, analyze screenshots
- Text-to-Speech: generate natural speech with 28 voices
- Speech-to-Text: transcribe audio with Whisper models
- Use built-in skills like commit, summarize, review, or create your own
- Integrate with your framework: AI SDK, Next.js, Express
- Enable response caching for instant repeated prompts
- Set up an MCP server for Claude Desktop
- Try the playground to experiment in your browser