LangChain

Gerbil integrates with LangChain through four classes that cover text generation, embeddings, text-to-speech (TTS), and speech-to-text (STT). Use them to build chains, agents, and voice-enabled pipelines that run on local models.

Class             Capability
GerbilLLM         Text generation + Vision
GerbilEmbeddings  Vector embeddings
GerbilTTS         Text-to-Speech (28 voices)
GerbilSTT         Speech-to-Text (Whisper)

Installation

Terminal
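npm install @tryhamster/gerbil langchain

Some examples below also import from @langchain/core, @langchain/community, and zod; add those packages if they are not already in your project.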

Quick Start

quick-start.ts
import { readFileSync } from "fs";
import {
  GerbilLLM,
  GerbilEmbeddings,
  GerbilTTS,
  GerbilSTT,
} from "@tryhamster/gerbil/langchain";

// Text generation
const llm = new GerbilLLM({ model: "qwen3-0.6b" });
const result = await llm.invoke("Write a haiku about coding");

// Embeddings
const embeddings = new GerbilEmbeddings();
const vector = await embeddings.embedQuery("Hello world");

// Text-to-Speech
const tts = new GerbilTTS({ voice: "af_heart" });
const { audio } = await tts.speak("Hello from LangChain!");

// Speech-to-Text (audioData holds the bytes of a WAV file)
const stt = new GerbilSTT();
const audioData = new Uint8Array(readFileSync("audio.wav"));
const { text } = await stt.transcribe(audioData);

GerbilLLM

Text generation with optional vision support:

llm-config.ts
import { GerbilLLM } from "@tryhamster/gerbil/langchain";

const llm = new GerbilLLM({
  // Model configuration
  model: "qwen3-0.6b",
  device: "auto", // "auto" | "gpu" | "cpu"
  dtype: "q4", // "q4" | "q8" | "fp16" | "fp32"

  // Generation options
  maxTokens: 500,
  temperature: 0.7,
  topP: 0.9,
  topK: 50,

  // Thinking mode (Qwen3)
  thinking: false,

  // Callbacks
  callbacks: [
    {
      handleLLMStart: async (llm, prompts) => {
        console.log("Starting generation...");
      },
      handleLLMEnd: async (output) => {
        console.log("Generation complete");
      },
    },
  ],
});

invoke()

invoke.ts
// Simple invocation
const result = await llm.invoke("Explain recursion");

// With options
const poem = await llm.invoke("Write a poem", {
  maxTokens: 200,
  temperature: 0.9,
});

// With stop sequences
const list = await llm.invoke("List 3 items:\n1.", {
  stop: ["\n4."],
});
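
The stop option ends generation when one of the listed sequences appears. The sketch below illustrates the usual convention, in which the returned text ends just before the first stop sequence. `truncateAtStop` is a hypothetical helper written for illustration; it is not part of Gerbil or LangChain, and Gerbil's exact handling may differ.

```typescript
// Hypothetical helper: cut `text` at the earliest occurrence of any stop sequence.
function truncateAtStop(text: string, stop: string[]): string {
  let cut = text.length;
  for (const s of stop) {
    if (s.length === 0) continue; // ignore empty stop strings
    const idx = text.indexOf(s);
    if (idx !== -1 && idx < cut) cut = idx;
  }
  return text.slice(0, cut);
}

// Simulated raw completion for the prompt "List 3 items:\n1."
const raw = " Apples\n2. Pears\n3. Plums\n4. Grapes";
const truncated = truncateAtStop(raw, ["\n4."]);
// truncated === " Apples\n2. Pears\n3. Plums"
```

Choosing "\n4." as the stop sequence is what limits the list to three items: the model is cut off as soon as it starts a fourth entry.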

Streaming

streaming.ts
// Stream tokens
const stream = await llm.stream("Tell me a story");

for await (const chunk of stream) {
  process.stdout.write(chunk);
}

// With callbacks
const hooksStream = await llm.stream("Explain hooks", {
  callbacks: [{
    handleLLMNewToken: async (token) => {
      console.log("Token:", token);
    },
  }],
});

Vision

Use vision-capable models to analyze images:

vision.ts
import { readFileSync } from "fs";
import { GerbilLLM } from "@tryhamster/gerbil/langchain";

// Use a vision-capable model
const llm = new GerbilLLM({ model: "ministral-3b" });

// Check if model supports vision
const hasVision = await llm.supportsVision(); // true

// Analyze an image
const description = await llm.invokeWithImages(
  "Describe this image in detail",
  [{ source: "https://example.com/photo.jpg" }]
);

// Compare multiple images
// (beforeScreenshot and afterScreenshot are image sources defined elsewhere)
const diff = await llm.invokeWithImages(
  "What changed between these two screenshots?",
  [
    { source: beforeScreenshot },
    { source: afterScreenshot },
  ]
);

// Use with local files (base64)
const imageData = readFileSync("photo.jpg").toString("base64");
const result = await llm.invokeWithImages(
  "What's in this photo?",
  [{ source: `data:image/jpeg;base64,${imageData}` }]
);
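
The local-file example hardcodes image/jpeg in the data URI, which would mislabel a PNG or WebP file. A minimal sketch of deriving the MIME type from the file extension follows. `imageMimeType` and `toDataUri` are hypothetical helpers, and the list of extensions is an assumption rather than a statement of which formats Gerbil accepts.

```typescript
// Hypothetical lookup table: file extension -> image MIME type.
const MIME_BY_EXT: Record<string, string> = {
  jpg: "image/jpeg",
  jpeg: "image/jpeg",
  png: "image/png",
  gif: "image/gif",
  webp: "image/webp",
};

// Resolve the MIME type from the extension, failing loudly on unknown types.
function imageMimeType(path: string): string {
  const ext = path.split(".").pop()?.toLowerCase() ?? "";
  const mime = MIME_BY_EXT[ext];
  if (!mime) throw new Error(`Unsupported image extension: "${ext}"`);
  return mime;
}

// Build a data URI from base64-encoded bytes and the original file path.
function toDataUri(base64: string, path: string): string {
  return `data:${imageMimeType(path)};base64,${base64}`;
}
```

With this in place, the source for a local file could be built as toDataUri(readFileSync("shot.png").toString("base64"), "shot.png").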

GerbilEmbeddings

embeddings.ts
import { GerbilEmbeddings } from "@tryhamster/gerbil/langchain";

const embeddings = new GerbilEmbeddings({
  // Optional: specify embedding model
  model: "all-MiniLM-L6-v2",
});

// Single query
const vector = await embeddings.embedQuery("What is the meaning of life?");
// Returns: number[] (384 dimensions)

// Multiple documents
const vectors = await embeddings.embedDocuments([
  "First document",
  "Second document",
  "Third document",
]);
// Returns: number[][] (array of vectors)
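
Embedding vectors are usually compared with cosine similarity. The sketch below shows one way to score two vectors returned by embedQuery or embedDocuments. `cosineSimilarity` is a hypothetical helper, not part of Gerbil, and the choice of metric is an assumption about how you want to compare vectors.

```typescript
// Hypothetical helper: cosine similarity between two equal-length vectors.
// Returns a value in [-1, 1]; 0 is returned if either vector has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

For example, cosineSimilarity(vector, vectors[0]) would score the query against the first document; higher values indicate closer meaning.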

GerbilTTS

Text-to-Speech using Kokoro-82M with 28 natural voices:

tts.ts
import { GerbilTTS } from "@tryhamster/gerbil/langchain";

const tts = new GerbilTTS({
  voice: "af_heart", // Default voice (American female)
  speed: 1.0, // Speed multiplier 0.5-2.0
});

// Generate speech
const result = await tts.speak("Hello from LangChain!");
// result.audio = Float32Array (PCM samples)
// result.sampleRate = 24000
// result.duration = seconds

// Override voice/speed per call
const british = await tts.speak("Cheerio!", {
  voice: "bf_emma",
  speed: 1.2,
});

// Stream long text (yields chunks as generated)
for await (const chunk of tts.speakStream("Long paragraph...")) {
  console.log(`Chunk: ${chunk.samples.length} samples`);
  // chunk.samples = Float32Array
  // chunk.isFinal = boolean
}

// List available voices
const voices = await tts.listVoices();
// [{ id: "af_heart", name: "Heart", gender: "female", language: "en-US" }, ...]
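
Because speak() returns raw Float32Array PCM samples, saving the result as a playable file needs a container format. The sketch below wraps samples in a 16-bit PCM WAV header. `encodeWav` is a hypothetical helper, not part of Gerbil, and it assumes mono audio with sample values in the range -1 to 1, which the text above does not confirm.

```typescript
// Hypothetical helper: encode mono float PCM samples as a 16-bit WAV file.
// Assumes one channel and samples in [-1, 1].
function encodeWav(samples: Float32Array, sampleRate: number): Uint8Array {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);

  const writeAscii = (offset: number, s: string): void => {
    for (let i = 0; i < s.length; i++) view.setUint8(offset + i, s.charCodeAt(i));
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // fmt chunk size
  view.setUint16(20, 1, true); // audio format: PCM
  view.setUint16(22, 1, true); // channels: mono (assumption)
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  // Clamp each float sample and scale it to a signed 16-bit integer.
  for (let i = 0; i < samples.length; i++) {
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * bytesPerSample, Math.round(clamped * 32767), true);
  }
  return new Uint8Array(buffer);
}
```

In Node.js the bytes could then be written with writeFileSync("hello.wav", encodeWav(result.audio, result.sampleRate)).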

Available Voices

American Female

af_heart ⭐, af_bella, af_nicole, af_sarah, af_sky, af_alloy, af_aoede, af_kore, af_nova, af_river, af_jessica

American Male

am_fenrir, am_michael, am_puck, am_adam, am_echo, am_eric, am_liam, am_onyx, am_santa

British Female

bf_emma, bf_isabella, bf_alice, bf_lily

British Male

bm_george, bm_fable, bm_lewis, bm_daniel
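
The voice IDs above appear to follow a two-letter prefix pattern: the first letter gives the accent (a for American, b for British) and the second gives the gender (f or m). The sketch below decodes an ID on that assumption. `describeVoice` is a hypothetical helper and the convention is inferred from the groupings in this list, so treat tts.listVoices() as the authoritative source.

```typescript
// Shape returned by the hypothetical helper below.
type VoiceInfo = {
  accent: "American" | "British";
  gender: "female" | "male";
  name: string;
};

// Hypothetical helper: decode IDs such as "af_heart" or "bm_george".
function describeVoice(id: string): VoiceInfo {
  const match = /^([ab])([fm])_([a-z]+)$/.exec(id);
  if (!match) throw new Error(`Unrecognized voice ID: ${id}`);
  const [, accent, gender, name] = match;
  return {
    accent: accent === "a" ? "American" : "British",
    gender: gender === "f" ? "female" : "male",
    name,
  };
}
```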

GerbilSTT

Speech-to-Text using Whisper ONNX models:

stt.ts
import { GerbilSTT } from "@tryhamster/gerbil/langchain";
import { readFileSync } from "fs";

const stt = new GerbilSTT({
  model: "whisper-tiny.en", // Fast, English-only
});

// Transcribe audio file (WAV)
const audio = new Uint8Array(readFileSync("audio.wav"));
const result = await stt.transcribe(audio);
console.log(result.text);

// With timestamps
const result2 = await stt.transcribe(audio, { timestamps: true });
for (const seg of result2.segments) {
  console.log(`[${seg.start}s - ${seg.end}s] ${seg.text}`);
}

// With language hint (for multilingual models)
const sttMulti = new GerbilSTT({ model: "whisper-small" });
const spanish = await sttMulti.transcribe(audio, { language: "es" });

// List available models
const models = await stt.listModels();
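
Timestamped segments map naturally onto subtitle formats. The sketch below converts segments with start, end, and text fields (the shape used in the timestamps example above, with times in seconds) into SRT text. `toSrtTime` and `toSrt` are hypothetical helpers, not part of Gerbil.

```typescript
// Segment shape as used in the timestamps example (times in seconds).
interface Segment {
  start: number;
  end: number;
  text: string;
}

// Format seconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTime(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const s = Math.floor(totalMs / 1000) % 60;
  const m = Math.floor(totalMs / 60000) % 60;
  const h = Math.floor(totalMs / 3600000);
  const pad = (n: number, width: number): string => String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Render numbered SRT cues separated by blank lines.
function toSrt(segments: Segment[]): string {
  return segments
    .map(
      (seg, i) =>
        `${i + 1}\n${toSrtTime(seg.start)} --> ${toSrtTime(seg.end)}\n${seg.text.trim()}\n`
    )
    .join("\n");
}
```

Calling toSrt(result2.segments) would then produce text that can be saved as a .srt file.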

Available Models

Model                   Size    Languages
whisper-tiny.en         39 MB   English only
whisper-base.en         74 MB   English only
whisper-small           244 MB  80+ languages
whisper-large-v3-turbo  809 MB  80+ languages (best)

Chains

Use Gerbil with LangChain chains:

chains.ts
import { GerbilLLM } from "@tryhamster/gerbil/langchain";
import { PromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";

const llm = new GerbilLLM({ model: "qwen3-0.6b" });

// Create a simple chain
const prompt = PromptTemplate.fromTemplate(
  "You are a helpful assistant. Answer this question: {question}"
);

const chain = prompt.pipe(llm).pipe(new StringOutputParser());

const result = await chain.invoke({
  question: "What is the capital of France?",
});

console.log(result); // "The capital of France is Paris."
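
The first step of the chain fills the {question} placeholder before the text reaches the model. The sketch below shows that substitution in isolation. `formatTemplate` is a hypothetical, simplified stand-in and not LangChain's PromptTemplate implementation; it omits features such as brace escaping.

```typescript
// Hypothetical helper: replace {name} placeholders with supplied values.
function formatTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (_match, key: string) => {
    if (!(key in values)) throw new Error(`Missing value for "{${key}}"`);
    return values[key];
  });
}

const filled = formatTemplate(
  "You are a helpful assistant. Answer this question: {question}",
  { question: "What is the capital of France?" }
);
// filled === "You are a helpful assistant. Answer this question: What is the capital of France?"
```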

Structured Output

structured.ts
import { GerbilLLM } from "@tryhamster/gerbil/langchain";
import { z } from "zod";

const llm = new GerbilLLM({ model: "qwen3-0.6b" });

// Define schema
const personSchema = z.object({
  name: z.string(),
  age: z.number(),
  city: z.string(),
});

// Create structured LLM
const structuredLlm = llm.withStructuredOutput(personSchema);

const result = await structuredLlm.invoke(
  "Extract: John is 32 years old and lives in New York"
);

console.log(result);
// { name: "John", age: 32, city: "New York" }
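
Small local models sometimes wrap JSON in extra prose. If you ever parse raw model text yourself rather than relying on withStructuredOutput, a defensive extraction step can help. The sketch below is one such approach under that assumption. `extractPerson` and `isPerson` are hypothetical helpers, and nothing here describes how Gerbil handles malformed output internally.

```typescript
// Same fields as personSchema, expressed as a plain interface.
interface Person {
  name: string;
  age: number;
  city: string;
}

// Runtime check that a parsed value has the expected field types.
function isPerson(value: unknown): value is Person {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.name === "string" &&
    typeof v.age === "number" &&
    typeof v.city === "string"
  );
}

// Hypothetical helper: pull the outermost {...} span out of raw text and validate it.
function extractPerson(raw: string): Person {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) throw new Error("No JSON object found in output");
  const parsed: unknown = JSON.parse(raw.slice(start, end + 1));
  if (!isPerson(parsed)) throw new Error("Output does not match the expected schema");
  return parsed;
}
```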

Vector Stores

vector-stores.ts
import { GerbilEmbeddings } from "@tryhamster/gerbil/langchain";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { Document } from "@langchain/core/documents";

const embeddings = new GerbilEmbeddings();

// Create documents
const docs = [
  new Document({ pageContent: "Gerbil is a local LLM library" }),
  new Document({ pageContent: "It supports WebGPU acceleration" }),
  new Document({ pageContent: "Works with the Vercel AI SDK" }),
];

// Create vector store
const vectorStore = await MemoryVectorStore.fromDocuments(docs, embeddings);

// Similarity search
const results = await vectorStore.similaritySearch("What is Gerbil?", 2);
console.log(results);

RAG Pipeline

Build a complete Retrieval-Augmented Generation pipeline:

rag.ts
import { GerbilLLM, GerbilEmbeddings } from "@tryhamster/gerbil/langchain";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { createRetrievalChain } from "langchain/chains/retrieval";
import { createStuffDocumentsChain } from "langchain/chains/combine_documents";
import { ChatPromptTemplate } from "@langchain/core/prompts";

// Initialize
const llm = new GerbilLLM({ model: "qwen3-0.6b" });
const embeddings = new GerbilEmbeddings();

// Create vector store from documents
const vectorStore = await MemoryVectorStore.fromTexts(
  [
    "Gerbil runs LLMs locally in Node.js",
    "It supports GPU acceleration via WebGPU",
    "Models are cached in IndexedDB",
    "Works offline after first download",
  ],
  [{}, {}, {}, {}],
  embeddings
);

// Create retriever
const retriever = vectorStore.asRetriever({ k: 2 });

// Create prompt
const prompt = ChatPromptTemplate.fromTemplate(`
Answer the question based on the context below.

Context: {context}

Question: {input}

Answer:
`);

// Create chains
const documentChain = await createStuffDocumentsChain({
  llm,
  prompt,
});

const retrievalChain = await createRetrievalChain({
  combineDocsChain: documentChain,
  retriever,
});

// Query
const result = await retrievalChain.invoke({
  input: "Does Gerbil work offline?",
});

console.log(result.answer);
// "Yes, Gerbil works offline after the first download..."
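
Conceptually, the document chain "stuffs" the retrieved documents into the {context} slot of the prompt before calling the model. The sketch below reproduces that step by hand. `buildRagPrompt` is a hypothetical helper, and the blank-line separator between documents is an assumption, so this is an illustration rather than LangChain's implementation.

```typescript
// Minimal document shape: only the field the prompt needs.
interface Doc {
  pageContent: string;
}

// Hypothetical helper: join retrieved documents and fill the prompt by hand.
function buildRagPrompt(docs: Doc[], question: string): string {
  const context = docs.map((d) => d.pageContent).join("\n\n");
  return [
    "Answer the question based on the context below.",
    "",
    `Context: ${context}`,
    "",
    `Question: ${question}`,
    "",
    "Answer:",
  ].join("\n");
}
```

Seeing the final prompt text this way is useful when debugging retrieval quality: if the answer is wrong, check whether the relevant sentence actually made it into the context.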

Agents

agents.ts
import { GerbilLLM } from "@tryhamster/gerbil/langchain";
import { initializeAgentExecutorWithOptions } from "langchain/agents";
import { Calculator } from "@langchain/community/tools/calculator";

const llm = new GerbilLLM({
  model: "qwen3-0.6b",
  thinking: true, // Enable for better reasoning
});

// Create tools
const tools = [
  new Calculator(),
  // Add more tools as needed
];

// Create agent
const executor = await initializeAgentExecutorWithOptions(tools, llm, {
  agentType: "zero-shot-react-description",
  verbose: true,
});

// Run agent
const result = await executor.invoke({
  input: "What is 25 * 4 + 10?",
});

console.log(result.output);

Conversation Memory

conversation.ts
import { GerbilLLM } from "@tryhamster/gerbil/langchain";
import { ConversationChain } from "langchain/chains";
import { BufferMemory } from "langchain/memory";

const llm = new GerbilLLM({ model: "qwen3-0.6b" });

const memory = new BufferMemory();

const chain = new ConversationChain({
  llm,
  memory,
});

// First message
await chain.invoke({ input: "My name is Alice" });

// Second message - remembers context
const result = await chain.invoke({ input: "What's my name?" });
console.log(result.response); // "Your name is Alice!"
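
A buffer memory works by replaying earlier turns in front of each new input, which is how the model can recall the name from the first message. The sketch below shows the idea with a plain class. `SimpleBufferMemory` and `buildPrompt` are hypothetical names, and the "Human:" / "AI:" labels are an assumption about formatting; this is not LangChain's BufferMemory.

```typescript
// Hypothetical minimal memory: stores completed turns in order.
class SimpleBufferMemory {
  private turns: { human: string; ai: string }[] = [];

  save(human: string, ai: string): void {
    this.turns.push({ human, ai });
  }

  // Render the full history as alternating labelled lines.
  render(): string {
    return this.turns.map((t) => `Human: ${t.human}\nAI: ${t.ai}`).join("\n");
  }
}

// Prepend the history to the new input so the model sees prior context.
function buildPrompt(memory: SimpleBufferMemory, input: string): string {
  const history = memory.render();
  return `${history ? history + "\n" : ""}Human: ${input}\nAI:`;
}
```

Because the whole history is replayed each time, prompts grow with the conversation; long sessions may need trimming or summarizing to stay within the model's context window.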

Document Loaders

document-loaders.ts
import { GerbilEmbeddings } from "@tryhamster/gerbil/langchain";
import { TextLoader } from "langchain/document_loaders/fs/text";
import { PDFLoader } from "langchain/document_loaders/fs/pdf";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { MemoryVectorStore } from "langchain/vectorstores/memory";

// Load documents
const textLoader = new TextLoader("./docs/readme.txt");
const pdfLoader = new PDFLoader("./docs/manual.pdf");

const textDocs = await textLoader.load();
const pdfDocs = await pdfLoader.load();

// Split into chunks
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 500,
  chunkOverlap: 50,
});

const splitDocs = await splitter.splitDocuments([...textDocs, ...pdfDocs]);

// Create vector store
const embeddings = new GerbilEmbeddings();
const vectorStore = await MemoryVectorStore.fromDocuments(splitDocs, embeddings);

// Query
const results = await vectorStore.similaritySearch("How do I install?", 3);
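
The chunkSize and chunkOverlap settings control how much text each chunk holds and how much neighbouring chunks share. The sketch below shows the effect with a plain fixed-width splitter. `splitWithOverlap` is a hypothetical helper; it is a simpler technique than RecursiveCharacterTextSplitter, which also tries to break on separators such as paragraphs and sentences.

```typescript
// Hypothetical helper: fixed-size character windows that overlap by `chunkOverlap`.
function splitWithOverlap(text: string, chunkSize: number, chunkOverlap: number): string[] {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be >= 0 and smaller than chunkSize");
  }
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    const end = Math.min(start + chunkSize, text.length);
    chunks.push(text.slice(start, end));
    if (end === text.length) break; // last chunk reached the end of the text
    start = end - chunkOverlap; // step back so adjacent chunks share context
  }
  return chunks;
}
```

With chunkSize 500 and chunkOverlap 50, each chunk repeats the last 50 characters of the previous one, which reduces the chance that a sentence relevant to a query is cut in half at a chunk boundary.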

Voice-Enabled Pipeline

Build a complete voice-to-voice agent with STT → LLM → TTS:

voice-pipeline.ts
import {
  GerbilLLM,
  GerbilEmbeddings,
  GerbilTTS,
  GerbilSTT,
} from "@tryhamster/gerbil/langchain";
import { MemoryVectorStore } from "langchain/vectorstores/memory";

const llm = new GerbilLLM({ model: "qwen3-0.6b" });
const tts = new GerbilTTS({ voice: "af_bella" });
const stt = new GerbilSTT({ model: "whisper-tiny.en" });

// Voice input → LLM → Voice output
async function voiceChat(audioInput: Uint8Array) {
  // 1. Transcribe user speech
  const { text: userMessage } = await stt.transcribe(audioInput);
  console.log("User said:", userMessage);

  // 2. Generate response
  const response = await llm.invoke(userMessage);
  console.log("AI response:", response);

  // 3. Speak response
  const { audio, sampleRate } = await tts.speak(response);

  return { audio, sampleRate, text: response };
}

// Combine with RAG for voice-enabled knowledge base
// (docs: string[] and metadata: object[] are assumed to be defined elsewhere)
const embeddings = new GerbilEmbeddings();
const vectorStore = await MemoryVectorStore.fromTexts(docs, metadata, embeddings);

async function voiceRAG(audioInput: Uint8Array) {
  // Transcribe question
  const { text: question } = await stt.transcribe(audioInput);

  // Retrieve relevant documents
  const relevantDocs = await vectorStore.similaritySearch(question, 3);
  const context = relevantDocs.map(d => d.pageContent).join("\n");

  // Generate answer with context
  const answer = await llm.invoke(
    `Context: ${context}\n\nQuestion: ${question}\n\nAnswer:`
  );

  // Speak the answer
  const { audio } = await tts.speak(answer);
  return { audio, answer };
}