RAG System - Mako

Mako includes a lightweight retrieval-augmented generation (RAG) system that gives the model curated knowledge about the Base ecosystem — without relying on the model’s training data alone.

How it works

  User message
       │
       ▼
  ┌─────────────────┐     ┌──────────────────────┐
  │  Keyword scan   │────▶│  Knowledge entries   │
  │  (85+ triggers) │     │  (15 JSON files)     │
  └─────────────────┘     └──────────┬───────────┘
                                     │
                               Top 2 matches
                               (~500 tokens)
                                     │
                                     ▼
                          ┌──────────────────────┐
                          │  Injected as system  │
                          │  message before      │
                          │  inference           │
                          └──────────────────────┘

Before every inference call, the gateway scans the user’s message for known project keywords. If a match is found, relevant context is injected as a system message — so the model has accurate, up-to-date information without needing a separate retrieval step.

Two retrieval modes

Auto-injection

Keywords in the user’s message trigger context injection automatically, before the model sees the message. Because the entries are already loaded at startup, this adds no separate retrieval call and negligible latency.

Explicit search

The model can call the knowledge_search tool directly to query the knowledge base on demand.

Auto-injection

The getContextForMessage() function runs on every request:

1. Scans the user’s message against 85+ keyword triggers built from each project’s keywords array and name
2. Scores matches by keyword hit count
3. Injects the top 2 project summaries (~500 tokens) as a system message
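
The three steps above can be sketched in TypeScript as follows. This is a minimal illustration rather than the gateway's actual code: the `ProjectEntry` shape, `scoreEntry`, and `selectContext` are hypothetical names, and the short-keyword and `$` rules are left out for brevity.

```typescript
// Hypothetical, trimmed-down entry shape (assumption; the real schema has more fields).
interface ProjectEntry {
  name: string;
  summary: string;
  keywords: string[];
}

// Escape regex metacharacters so a trigger is matched literally.
function escapeRegex(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Steps 1 and 2: count how many distinct triggers (keywords plus name) occur as whole words.
function scoreEntry(entry: ProjectEntry, message: string): number {
  const triggers = entry.keywords
    .concat(entry.name)
    .map((t) => t.toLowerCase())
    .filter((t, i, all) => all.indexOf(t) === i); // drop duplicates
  let score = 0;
  for (const trigger of triggers) {
    const pattern = new RegExp("\\b" + escapeRegex(trigger) + "\\b", "i");
    if (pattern.test(message)) score += 1;
  }
  return score;
}

// Step 3: keep entries with at least one hit and return the best `limit` of them.
function selectContext(entries: ProjectEntry[], message: string, limit = 2): ProjectEntry[] {
  return entries
    .map((entry) => ({ entry, score: scoreEntry(entry, message) }))
    .filter((scored) => scored.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((scored) => scored.entry);
}

// Illustrative sample data, not the shipped knowledge files.
const sampleEntries: ProjectEntry[] = [
  { name: "Base Chain", summary: "Sample summary.", keywords: ["base", "base chain"] },
  { name: "Moonwell", summary: "Sample summary.", keywords: ["moonwell", "lending"] },
  { name: "Aerodrome", summary: "Sample summary.", keywords: ["aerodrome", "aero", "dex", "liquidity"] },
];

const picked = selectContext(
  sampleEntries,
  "Is the Aerodrome dex better than Moonwell? My database is fine."
);
console.log(picked.map((e) => e.name)); // [ 'Aerodrome', 'Moonwell' ]
```

Aerodrome scores 2 (`aerodrome`, `dex`), Moonwell scores 1, and Base Chain scores 0 because “database” does not contain `base` as a whole word.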

Keyword matching rules

| Rule | Behavior |
| --- | --- |
| Word boundary | Keywords are matched as whole words using regex word boundaries to avoid false positives (e.g., “base” won’t match “database”) |
| Short keywords (≤3 chars) | Require a `$` prefix or strict word-boundary matching to prevent noise (except common tickers like `eth`, `bnb`, `uni`) |
| Token symbols | `$AERO`, `$VIRTUAL`, etc. are matched with the `$` prefix |
| Scoring | Each keyword hit adds to the entry’s score. Top 2 entries by score are injected |
| Truncation | Project details are capped at 600 characters to conserve tokens |
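
One possible reading of these rules, as a hedged sketch. The function names are hypothetical, the ticker allowlist contains only the three examples named above, and treating short keywords as "`$` prefix required" is an assumption about how the short-keyword rule is applied.

```typescript
// Tickers exempt from the short-keyword rule (only the three named above; the full list is unknown).
const COMMON_TICKERS = ["eth", "bnb", "uni"];
// Cap taken from the truncation rule.
const MAX_DETAILS_CHARS = 600;

// Escape regex metacharacters so a keyword is matched literally.
function escapeRegex(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Plain keyword: whole-word match. Short keyword outside the allowlist: only "$xyz" counts (assumption).
function keywordPattern(keyword: string): RegExp {
  const lower = keyword.toLowerCase();
  const needsDollar = lower.length <= 3 && COMMON_TICKERS.indexOf(lower) === -1;
  const prefix = needsDollar ? "\\$" : "\\b";
  return new RegExp(prefix + escapeRegex(lower) + "\\b", "i");
}

// Token symbol such as "AERO": matched only in its "$AERO" form.
function tokenPattern(symbol: string): RegExp {
  return new RegExp("\\$" + escapeRegex(symbol) + "\\b", "i");
}

// Cap details so the injected context stays small.
function truncateDetails(details: string): string {
  return details.length <= MAX_DETAILS_CHARS ? details : details.slice(0, MAX_DETAILS_CHARS);
}

console.log(keywordPattern("base").test("my database is slow")); // false
console.log(keywordPattern("base").test("deploying on Base today")); // true
console.log(keywordPattern("op").test("please stop the job")); // false
console.log(keywordPattern("op").test("thoughts on $OP?")); // true
console.log(tokenPattern("AERO").test("is $aero a good buy")); // true
```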

Injected format

[Knowledge Context]
Aerodrome (DeFi): The central DEX and liquidity hub on Base.
Aerodrome is a ve(3,3) DEX that serves as the primary...
Token: $AERO on base
Website: https://aerodrome.finance
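
A block like the one above could be assembled roughly as follows. This is a sketch under assumptions: `ContextEntry`, `formatEntry`, and `buildContextMessage` are hypothetical names, the `{ role, content }` message shape is the common chat-message convention rather than something this page specifies, and separating multiple entries with a blank line is a guess.

```typescript
// Hypothetical view of an entry, holding only what the injected block shows.
interface ContextEntry {
  name: string;
  category: string;
  summary: string;
  details: string;
  token?: { symbol: string; chain: string };
  website?: string;
}

// Assumed chat message shape; the gateway's real type is not documented here.
interface SystemMessage {
  role: "system";
  content: string;
}

// Render one entry as the lines shown in the sample, skipping absent fields.
function formatEntry(entry: ContextEntry): string {
  const lines = [entry.name + " (" + entry.category + "): " + entry.summary, entry.details];
  if (entry.token) lines.push("Token: $" + entry.token.symbol + " on " + entry.token.chain);
  if (entry.website) lines.push("Website: " + entry.website);
  return lines.join("\n");
}

// Combine the selected entries under a single header (blank-line separator is an assumption).
function buildContextMessage(entries: ContextEntry[]): SystemMessage {
  const body = entries.map(formatEntry).join("\n\n");
  return { role: "system", content: "[Knowledge Context]\n" + body };
}

const contextMessage = buildContextMessage([
  {
    name: "Aerodrome",
    category: "DeFi",
    summary: "The central DEX and liquidity hub on Base.",
    details: "Aerodrome is a ve(3,3) DEX...",
    token: { symbol: "AERO", chain: "base" },
    website: "https://aerodrome.finance",
  },
]);
console.log(contextMessage.content);
```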

Explicit search

The model can also call knowledge_search directly when it needs structured information. This uses a weighted scoring system:

| Field | Score weight |
| --- | --- |
| Project name | +10 |
| Keywords | +5 per match |
| Category | +3 |
| Summary | +2 |
| Details | +1 |

Returns up to 5 results, each with name, category, summary, details, token, and website. If called with an empty query, it returns all entries (name, category, and summary only) — useful for browsing.
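
The weights and return behavior described above could be implemented along these lines. Treat it as a sketch: the names are hypothetical, the flattened `token` and `website` fields are a simplification, and matching each whitespace-separated query term as a case-insensitive substring is an assumption, since the matching rule is not specified here.

```typescript
// Hypothetical searchable shape with token and website flattened to strings.
interface SearchEntry {
  name: string;
  category: string;
  summary: string;
  details: string;
  keywords: string[];
  token?: string;
  website?: string;
}

interface SearchResult {
  name: string;
  category: string;
  summary: string;
  details?: string;
  token?: string;
  website?: string;
}

// Weights from the table above.
const WEIGHTS = { name: 10, keyword: 5, category: 3, summary: 2, details: 1 };
const MAX_RESULTS = 5;

// Case-insensitive substring test (assumed matching rule); `term` is already lowercase.
function containsTerm(text: string, term: string): boolean {
  return text.toLowerCase().indexOf(term) !== -1;
}

function scoreForSearch(entry: SearchEntry, terms: string[]): number {
  let score = 0;
  for (const term of terms) {
    if (containsTerm(entry.name, term)) score += WEIGHTS.name;
    score += WEIGHTS.keyword * entry.keywords.filter((k) => containsTerm(k, term)).length;
    if (containsTerm(entry.category, term)) score += WEIGHTS.category;
    if (containsTerm(entry.summary, term)) score += WEIGHTS.summary;
    if (containsTerm(entry.details, term)) score += WEIGHTS.details;
  }
  return score;
}

function knowledgeSearch(entries: SearchEntry[], query: string): SearchResult[] {
  const terms = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  if (terms.length === 0) {
    // Empty query: browse mode, brief fields only.
    return entries.map((e) => ({ name: e.name, category: e.category, summary: e.summary }));
  }
  return entries
    .map((entry) => ({ entry, score: scoreForSearch(entry, terms) }))
    .filter((scored) => scored.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, MAX_RESULTS)
    .map(({ entry }) => ({
      name: entry.name,
      category: entry.category,
      summary: entry.summary,
      details: entry.details,
      token: entry.token,
      website: entry.website,
    }));
}

// Illustrative sample data; the Moonwell text is a placeholder, not the shipped entry.
const searchEntries: SearchEntry[] = [
  {
    name: "Moonwell",
    category: "DeFi",
    summary: "Placeholder summary about lending.",
    details: "Placeholder details.",
    keywords: ["moonwell", "lending"],
    token: "WELL",
  },
  {
    name: "Aerodrome",
    category: "DeFi",
    summary: "The central DEX and liquidity hub on Base.",
    details: "Aerodrome is a ve(3,3) DEX...",
    keywords: ["aerodrome", "aero", "dex", "liquidity", "ve33"],
    token: "AERO",
    website: "https://aerodrome.finance",
  },
];

console.log(knowledgeSearch(searchEntries, "dex").map((r) => r.name)); // [ 'Aerodrome' ]
```

For the query `dex`, Aerodrome scores 8 (keyword +5, summary +2, details +1) and Moonwell scores 0, so only Aerodrome is returned.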

Knowledge entries

Each entry is a JSON file in gateway/src/knowledge/. The gateway loads all entries at startup.

Entry schema

{
  "name": "Aerodrome",
  "category": "DeFi",
  "summary": "The central DEX and liquidity hub on Base.",
  "keywords": ["aerodrome", "aero", "dex", "liquidity", "ve33"],
  "token": {
    "symbol": "AERO",
    "chain": "base"
  },
  "links": {
    "website": "https://aerodrome.finance",
    "docs": "https://docs.aerodrome.finance"
  },
  "details": "Aerodrome is a ve(3,3) DEX...",
  "notable_agents": [
    {
      "name": "AeroBot",
      "description": "Automated LP management"
    }
  ]
}
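
For reference, the schema above might be typed and checked at load time like this. It is a sketch, not the gateway's own definitions: `KnowledgeEntry` and `isKnowledgeEntry` are hypothetical names, and treating `token`, `links`, and `notable_agents` as optional is an assumption based on some listed entries having no token.

```typescript
// Hypothetical type mirroring the example entry; optionality is assumed.
interface KnowledgeEntry {
  name: string;
  category: string;
  summary: string;
  keywords: string[];
  details: string;
  token?: { symbol: string; chain: string };
  links?: { website?: string; docs?: string };
  notable_agents?: { name: string; description: string }[];
}

function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.length > 0;
}

// Runtime guard for parsed JSON; checks only the fields assumed to be required.
function isKnowledgeEntry(value: unknown): value is KnowledgeEntry {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    isNonEmptyString(v.name) &&
    isNonEmptyString(v.category) &&
    isNonEmptyString(v.summary) &&
    isNonEmptyString(v.details) &&
    Array.isArray(v.keywords) &&
    v.keywords.length > 0 &&
    v.keywords.every((k) => isNonEmptyString(k))
  );
}

const parsedEntry: unknown = JSON.parse(
  '{"name":"Aerodrome","category":"DeFi","summary":"The central DEX and liquidity hub on Base.",' +
    '"keywords":["aerodrome","aero"],"details":"Aerodrome is a ve(3,3) DEX..."}'
);
console.log(isKnowledgeEntry(parsedEntry)); // true
console.log(isKnowledgeEntry({ name: "Incomplete" })); // false
```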

Current entries

The knowledge base ships with 15 curated entries:

| Project | Category | Token |
| --- | --- | --- |
| Aerodrome | DeFi | $AERO |
| Moonwell | DeFi | $WELL |
| Morpho | DeFi | $MORPHO |
| Aave (Base) | DeFi | $AAVE |
| Uniswap (Base) | DeFi | $UNI |
| Virtuals Protocol | AI Agents | $VIRTUAL |
| Venice AI | AI Agents | - |
| Clanker | AI Agents | - |
| AIXBT | AI Agents | $AIXBT |
| Luna | AI Agents | $LUNA |
| Bankr | AI Agents | $BANKR |
| Farcaster | Social | - |
| Zora | Social | - |
| Base Chain | Infrastructure | - |
| Mako | Infrastructure | $MAKO |

Adding a new entry

Create a JSON file in gateway/src/knowledge/ following the schema above. The gateway loads all .json files from this directory at startup — no code changes required, just restart the server.
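
Directory-based loading of this kind is commonly written with Node's `fs` module, roughly as below. This is an illustrative sketch, not the gateway's loader: `loadEntries` and `LoadedEntry` are hypothetical names, and failing fast on a malformed file is an assumed policy.

```typescript
import * as fs from "fs";
import * as path from "path";

// Minimal hypothetical shape; only the fields this sketch touches.
interface LoadedEntry {
  name: string;
  keywords: string[];
}

// Read every .json file in `dir` and parse it. A file with invalid JSON aborts
// loading with an error that names the file (assumed policy).
function loadEntries(dir: string): LoadedEntry[] {
  return fs
    .readdirSync(dir)
    .filter((file) => path.extname(file).toLowerCase() === ".json")
    .sort()
    .map((file) => {
      const raw = fs.readFileSync(path.join(dir, file), "utf8");
      try {
        return JSON.parse(raw) as LoadedEntry;
      } catch (err) {
        throw new Error("Invalid JSON in " + file + ": " + String(err));
      }
    });
}
```

Called once at startup, for example as `loadEntries("gateway/src/knowledge")`, a loader like this picks up a newly added file on the next restart without any code change.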

Choose keywords carefully. Each keyword becomes a trigger for auto-injection, so avoid overly generic terms that would match irrelevant queries.