LangChain
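LangChain integration for local text generation (with vision) and embeddings. Build chains, agents, and voice-enabled pipelines with local models; text-to-speech and speech-to-text come from the native WebGPU engine rather than from a LangChain wrapper.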
| Class | Capability |
|---|---|
| GerbilLLM | Text generation + Vision |
| GerbilEmbeddings | Vector embeddings |
Note: The LangChain integration runs in Node.js.
GerbilLLM and GerbilEmbeddings use native models (Qwen3.5-0.8B and EmbeddingGemma-300M) running on the WebGPU engine. For speech (Kani-TTS-2 for TTS, Moonshine for STT) and browser inference, see the WebGPUEngine docs.
Installation
Terminal

    npm install @tryhamster/gerbil langchain

Quick Start
quick-start.ts

    import {
      GerbilLLM,
      GerbilEmbeddings,
    } from "@tryhamster/gerbil/langchain";

    // Text generation
    const llm = new GerbilLLM({ model: "qwen3.5-0.8b" });
    const result = await llm.invoke("Write a haiku about coding");

    // Embeddings
    const embeddings = new GerbilEmbeddings();
    const vector = await embeddings.embedQuery("Hello world");

GerbilLLM
Text generation with optional vision support:
llm-config.ts

    import { GerbilLLM } from "@tryhamster/gerbil/langchain";

    const llm = new GerbilLLM({
      // Model configuration
      model: "qwen3.5-0.8b",
      device: "auto", // "auto" | "gpu" | "cpu"
      dtype: "q4", // "q4" | "q8" | "fp16" | "fp32"

      // Generation options
      maxTokens: 500,
      temperature: 0.7,
      topP: 0.9,
      topK: 50,

      // Thinking mode (Qwen3)
      thinking: false,

      // Callbacks
      callbacks: [
        {
          handleLLMStart: async (llm, prompts) => {
            console.log("Starting generation...");
          },
          handleLLMEnd: async (output) => {
            console.log("Generation complete");
          },
        },
      ],
    });

invoke()
invoke.ts

    // Simple invocation
    const result = await llm.invoke("Explain recursion");

    // With options (distinct variable names avoid redeclaring `result`)
    const poem = await llm.invoke("Write a poem", {
      maxTokens: 200,
      temperature: 0.9,
    });

    // With stop sequences
    const list = await llm.invoke("List 3 items:\n1.", {
      stop: ["\n4."],
    });

Streaming
streaming.ts

    // Stream tokens
    const stream = await llm.stream("Tell me a story");

    for await (const chunk of stream) {
      process.stdout.write(chunk);
    }

    // With callbacks (named differently so `stream` is not redeclared)
    const hooksStream = await llm.stream("Explain hooks", {
      callbacks: [{
        handleLLMNewToken: async (token) => {
          console.log("Token:", token);
        },
      }],
    });

Vision
Use vision-capable models to analyze images:
vision.ts

    import { readFileSync } from "fs";
    import { GerbilLLM } from "@tryhamster/gerbil/langchain";

    // Use a vision-capable model
    const llm = new GerbilLLM({ model: "qwen3.5-0.8b" });

    // Check if model supports vision
    const hasVision = await llm.supportsVision(); // true

    // Analyze an image
    const description = await llm.invokeWithImages(
      "Describe this image in detail",
      [{ source: "https://example.com/photo.jpg" }]
    );

    // Compare multiple images
    // (beforeScreenshot and afterScreenshot are image sources defined elsewhere)
    const diff = await llm.invokeWithImages(
      "What changed between these two screenshots?",
      [
        { source: beforeScreenshot },
        { source: afterScreenshot },
      ]
    );

    // Use with local files (base64)
    const imageData = readFileSync("photo.jpg").toString("base64");
    const result = await llm.invokeWithImages(
      "What's in this photo?",
      [{ source: `data:image/jpeg;base64,${imageData}` }]
    );

GerbilEmbeddings
embeddings.ts

    import { GerbilEmbeddings } from "@tryhamster/gerbil/langchain";

    const embeddings = new GerbilEmbeddings({
      // Optional: specify embedding model (defaults to EmbeddingGemma-300M)
      model: "embeddinggemma-300m",
    });

    // Single query
    const vector = await embeddings.embedQuery("What is the meaning of life?");
    // Returns: number[] (768 dimensions, L2-normalized)

    // Multiple documents
    const vectors = await embeddings.embedDocuments([
      "First document",
      "Second document",
      "Third document",
    ]);
    // Returns: number[][] (array of vectors)

Speech & Audio
Speech runs on the native WebGPU engine rather than a LangChain wrapper. Text-to-speech uses Kani-TTS-2 via engine.speak(), and speech-to-text uses Moonshine via MoonshineSTT — both running on-device on WebGPU. See the Text-to-Speech and Speech-to-Text docs.
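The voice pipeline later on this page passes raw 16 kHz mono PCM (a Float32Array) to stt.transcribe(). Captured audio is often 16-bit, stereo, and at a different sample rate, so a conversion step is usually needed first. The following is a minimal sketch of that step; the helper names (int16ToMonoFloat32, resampleLinear, toPcm16kMono) are hypothetical and not part of Gerbil, and the linear interpolation is a simplification that does no low-pass filtering.

```typescript
// Hypothetical helpers (not part of Gerbil): prepare captured audio for an
// STT model that expects 16 kHz mono Float32 samples in [-1, 1].

// Average interleaved 16-bit channels into one mono Float32 channel.
function int16ToMonoFloat32(samples: Int16Array, channels: number): Float32Array {
  const frames = Math.floor(samples.length / channels);
  const mono = new Float32Array(frames);
  for (let i = 0; i < frames; i++) {
    let sum = 0;
    for (let c = 0; c < channels; c++) {
      sum += samples[i * channels + c];
    }
    // Divide by 32768 to map the int16 range onto [-1, 1).
    mono[i] = sum / channels / 32768;
  }
  return mono;
}

// Resample by linear interpolation between neighbouring input samples.
// Assumption: good enough for a sketch; downsampling without a low-pass
// filter can alias, so production code would filter first.
function resampleLinear(input: Float32Array, fromRate: number, toRate: number): Float32Array {
  if (fromRate === toRate) return input.slice();
  const outLength = Math.floor((input.length * toRate) / fromRate);
  const out = new Float32Array(outLength);
  const step = fromRate / toRate;
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.floor(pos);
    const right = Math.min(left + 1, input.length - 1);
    const frac = pos - left;
    out[i] = input[left] * (1 - frac) + input[right] * frac;
  }
  return out;
}

// Full conversion: interleaved int16 at any rate -> 16 kHz mono Float32.
function toPcm16kMono(samples: Int16Array, channels: number, sampleRate: number): Float32Array {
  return resampleLinear(int16ToMonoFloat32(samples, channels), sampleRate, 16000);
}
```

The result of toPcm16kMono can then be handed to a transcription call that expects 16 kHz mono input.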
Chains
Use Gerbil with LangChain chains:
chains.ts

    import { GerbilLLM } from "@tryhamster/gerbil/langchain";
    import { PromptTemplate } from "@langchain/core/prompts";
    import { StringOutputParser } from "@langchain/core/output_parsers";

    const llm = new GerbilLLM({ model: "qwen3.5-0.8b" });

    // Create a simple chain
    const prompt = PromptTemplate.fromTemplate(
      "You are a helpful assistant. Answer this question: {question}"
    );

    const chain = prompt.pipe(llm).pipe(new StringOutputParser());

    const result = await chain.invoke({
      question: "What is the capital of France?",
    });

    console.log(result); // "The capital of France is Paris."

Structured Output
structured.ts

    import { GerbilLLM } from "@tryhamster/gerbil/langchain";
    import { z } from "zod";

    const llm = new GerbilLLM({ model: "qwen3.5-0.8b" });

    // Define schema
    const personSchema = z.object({
      name: z.string(),
      age: z.number(),
      city: z.string(),
    });

    // Create structured LLM
    const structuredLlm = llm.withStructuredOutput(personSchema);

    const result = await structuredLlm.invoke(
      "Extract: John is 32 years old and lives in New York"
    );

    console.log(result);
    // { name: "John", age: 32, city: "New York" }

Vector Stores
vector-stores.ts

    import { GerbilEmbeddings } from "@tryhamster/gerbil/langchain";
    import { MemoryVectorStore } from "langchain/vectorstores/memory";
    import { Document } from "@langchain/core/documents";

    const embeddings = new GerbilEmbeddings();

    // Create documents
    const docs = [
      new Document({ pageContent: "Gerbil is a local LLM library" }),
      new Document({ pageContent: "It supports WebGPU acceleration" }),
      new Document({ pageContent: "Works with the Vercel AI SDK" }),
    ];

    // Create vector store
    const vectorStore = await MemoryVectorStore.fromDocuments(docs, embeddings);

    // Similarity search
    const results = await vectorStore.similaritySearch("What is Gerbil?", 2);
    console.log(results);

RAG Pipeline
Build a complete Retrieval-Augmented Generation pipeline:
rag.ts

    import { GerbilLLM, GerbilEmbeddings } from "@tryhamster/gerbil/langchain";
    import { MemoryVectorStore } from "langchain/vectorstores/memory";
    import { createRetrievalChain } from "langchain/chains/retrieval";
    import { createStuffDocumentsChain } from "langchain/chains/combine_documents";
    import { ChatPromptTemplate } from "@langchain/core/prompts";

    // Initialize
    const llm = new GerbilLLM({ model: "qwen3.5-0.8b" });
    const embeddings = new GerbilEmbeddings();

    // Create vector store from documents
    const vectorStore = await MemoryVectorStore.fromTexts(
      [
        "Gerbil runs LLMs locally in Node.js",
        "It supports GPU acceleration via WebGPU",
        "Models are cached in IndexedDB",
        "Works offline after first download",
      ],
      [{}, {}, {}, {}],
      embeddings
    );

    // Create retriever
    const retriever = vectorStore.asRetriever({ k: 2 });

    // Create prompt
    const prompt = ChatPromptTemplate.fromTemplate(`
    Answer the question based on the context below.

    Context: {context}

    Question: {input}

    Answer:
    `);

    // Create chains
    const documentChain = await createStuffDocumentsChain({
      llm,
      prompt,
    });

    const retrievalChain = await createRetrievalChain({
      combineDocsChain: documentChain,
      retriever,
    });

    // Query
    const result = await retrievalChain.invoke({
      input: "Does Gerbil work offline?",
    });

    console.log(result.answer);
    // "Yes, Gerbil works offline after the first download..."

Agents
agents.ts

    import { GerbilLLM } from "@tryhamster/gerbil/langchain";
    import { initializeAgentExecutorWithOptions } from "langchain/agents";
    import { Calculator } from "@langchain/community/tools/calculator";

    const llm = new GerbilLLM({
      model: "qwen3.5-0.8b",
      thinking: true, // Enable for better reasoning
    });

    // Create tools
    const tools = [
      new Calculator(),
      // Add more tools as needed
    ];

    // Create agent
    const executor = await initializeAgentExecutorWithOptions(tools, llm, {
      agentType: "zero-shot-react-description",
      verbose: true,
    });

    // Run agent
    const result = await executor.invoke({
      input: "What is 25 * 4 + 10?",
    });

    console.log(result.output);

Conversation Memory
conversation.ts

    import { GerbilLLM } from "@tryhamster/gerbil/langchain";
    import { ConversationChain } from "langchain/chains";
    import { BufferMemory } from "langchain/memory";

    const llm = new GerbilLLM({ model: "qwen3.5-0.8b" });

    const memory = new BufferMemory();

    const chain = new ConversationChain({
      llm,
      memory,
    });

    // First message
    await chain.call({ input: "My name is Alice" });

    // Second message - remembers context
    const result = await chain.call({ input: "What's my name?" });
    console.log(result.response); // "Your name is Alice!"

Document Loaders
document-loaders.ts

    import { GerbilLLM, GerbilEmbeddings } from "@tryhamster/gerbil/langchain";
    import { TextLoader } from "langchain/document_loaders/fs/text";
    import { PDFLoader } from "langchain/document_loaders/fs/pdf";
    import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
    import { MemoryVectorStore } from "langchain/vectorstores/memory";

    // Load documents
    const textLoader = new TextLoader("./docs/readme.txt");
    const pdfLoader = new PDFLoader("./docs/manual.pdf");

    const textDocs = await textLoader.load();
    const pdfDocs = await pdfLoader.load();

    // Split into chunks
    const splitter = new RecursiveCharacterTextSplitter({
      chunkSize: 500,
      chunkOverlap: 50,
    });

    const splitDocs = await splitter.splitDocuments([...textDocs, ...pdfDocs]);

    // Create vector store
    const embeddings = new GerbilEmbeddings();
    const vectorStore = await MemoryVectorStore.fromDocuments(splitDocs, embeddings);

    // Query
    const results = await vectorStore.similaritySearch("How do I install?", 3);

Voice-Enabled Pipeline
Build a complete voice-to-voice agent with STT → LLM → TTS. The LangChain LLM handles text, while speech is handled by the native WebGPU engine (Moonshine for STT, Kani-TTS-2 for TTS):
voice-pipeline.ts

    import { GerbilLLM, GerbilEmbeddings } from "@tryhamster/gerbil/langchain";
    import { MoonshineSTT, WebGPUEngine } from "@tryhamster/gerbil/gpu";
    import { MemoryVectorStore } from "langchain/vectorstores/memory";

    const llm = new GerbilLLM({ model: "qwen3.5-0.8b" });
    const stt = await MoonshineSTT.create({ repo: "UsefulSensors/moonshine-base" });
    const tts = await WebGPUEngine.create({ repo: "nineninesix/kani-tts-450m-0.2-ft" });

    // Voice input → LLM → Voice output
    async function voiceChat(pcm16kMono: Float32Array) {
      // 1. Transcribe user speech (raw 16 kHz mono PCM)
      const { text: userMessage } = await stt.transcribe(pcm16kMono);
      console.log("User said:", userMessage);

      // 2. Generate response
      const response = await llm.invoke(userMessage);
      console.log("AI response:", response);

      // 3. Speak response
      const { pcm, sampleRate } = await tts.speak(response, { languageTag: "en_us" });

      return { pcm, sampleRate, text: response };
    }

    // Combine with RAG for voice-enabled knowledge base
    // (`docs` and `metadata` are your own texts and per-text metadata, defined elsewhere)
    const embeddings = new GerbilEmbeddings();
    const vectorStore = await MemoryVectorStore.fromTexts(docs, metadata, embeddings);

    async function voiceRAG(pcm16kMono: Float32Array) {
      // Transcribe question
      const { text: question } = await stt.transcribe(pcm16kMono);

      // Retrieve relevant documents
      const relevantDocs = await vectorStore.similaritySearch(question, 3);
      const context = relevantDocs.map(d => d.pageContent).join("\n");

      // Generate answer with context
      const answer = await llm.invoke(
        `Context: ${context}\n\nQuestion: ${question}\n\nAnswer:`
      );

      // Speak the answer
      const { pcm } = await tts.speak(answer, { languageTag: "en_us" });
      return { pcm, answer };
    }
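
Both functions above return raw PCM from tts.speak() rather than a playable file. To save or play the audio, the samples usually need a container such as WAV. The sketch below shows one way to do that; encodeWav is a hypothetical helper, not part of Gerbil, and it assumes pcm is a mono Float32Array with samples in [-1, 1], which this page does not confirm.

```typescript
// Hypothetical helper (not part of Gerbil): wrap mono Float32 PCM in a
// 16-bit PCM WAV container. Assumes samples are in [-1, 1]; values outside
// that range are clamped.
function encodeWav(pcm: Float32Array, sampleRate: number): Uint8Array {
  const bytesPerSample = 2;
  const headerSize = 44;
  const dataSize = pcm.length * bytesPerSample;
  const buffer = new ArrayBuffer(headerSize + dataSize);
  const view = new DataView(buffer);

  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  // RIFF header (all multi-byte fields are little-endian)
  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");

  // "fmt " chunk
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // chunk size for PCM
  view.setUint16(20, 1, true); // audio format 1 = uncompressed PCM
  view.setUint16(22, 1, true); // channel count (mono)
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample

  // "data" chunk
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  // Convert each float sample to a signed 16-bit integer.
  for (let i = 0; i < pcm.length; i++) {
    const clamped = Math.max(-1, Math.min(1, pcm[i]));
    const scaled = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    view.setInt16(headerSize + i * bytesPerSample, Math.round(scaled), true);
  }

  return new Uint8Array(buffer);
}
```

In Node.js, the returned bytes could be written out with writeFileSync from "fs", passing the sampleRate that tts.speak() returned rather than a hard-coded value.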