The Mako API is OpenAI-compatible. If you’ve used the OpenAI SDK, you already know how to use this.
Authentication
Requests use two headers. The wallet address is always required; the API key is only needed when free mode is off:
| Header | Value |
|---|---|
| x-wallet-address | Any EVM address (e.g. 0x1234...) |
| Authorization | Bearer YOUR_API_KEY (optional in free mode) |
The gateway is currently running in free mode — no API key required. Just send your wallet address.
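The header rules above can be collected into a small helper. This is a minimal sketch: `build_headers` is a hypothetical name, and the check only enforces the general EVM address shape (`0x` followed by 40 hex characters). Whether the gateway validates addresses the same way is an assumption.

```python
import re
from typing import Optional

# General EVM address shape: "0x" + 40 hex characters (20 bytes).
_EVM_ADDRESS = re.compile(r"0x[0-9a-fA-F]{40}")


def build_headers(wallet_address: str, api_key: Optional[str] = None) -> dict:
    """Build request headers (hypothetical helper, not part of any SDK)."""
    if not _EVM_ADDRESS.fullmatch(wallet_address):
        raise ValueError(f"not an EVM address: {wallet_address!r}")
    headers = {
        "Content-Type": "application/json",
        "x-wallet-address": wallet_address,
    }
    # Authorization is optional while the gateway runs in free mode.
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return headers
```

Calling `build_headers(address)` without a key matches free mode; passing a key adds the `Authorization` header for when keys are enforced.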
Your first request
curl -X POST https://gateway.deepmako.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-wallet-address: 0x0000000000000000000000000000000000000001" \
  -d '{
    "model": "conductor",
    "messages": [{"role": "user", "content": "what is eth at?"}],
    "stream": false
  }'
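# Python, using the official OpenAI SDK (pip install openai)
from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.deepmako.com/v1",
    api_key="not-needed",  # placeholder; no key is required in free mode
    default_headers={"x-wallet-address": "0x0000000000000000000000000000000000000001"},
)

# stream is left at its default (False), which "conductor" requires
r = client.chat.completions.create(
    model="conductor",
    messages=[{"role": "user", "content": "what is eth at?"}],
)
print(r.choices[0].message.content)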
// JavaScript, using fetch; top-level await requires an ES module context
const r = await fetch(
  "https://gateway.deepmako.com/v1/chat/completions",
  {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "x-wallet-address": "0x0000000000000000000000000000000000000001",
    },
    body: JSON.stringify({
      model: "conductor",
      messages: [{ role: "user", content: "what is eth at?" }],
      stream: false,
    }),
  }
);
const data = await r.json();
console.log(data.choices[0].message.content);
Responses follow the standard OpenAI Chat Completions format:
{
  "id": "chatcmpl-1718464968543",
  "object": "chat.completion",
  "model": "mako-8b-operator",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "eth is at $1,665 right now."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 1931,
    "completion_tokens": 63,
    "total_tokens": 1994
  }
}
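Because the shape matches Chat Completions, the reply and token counts can be read with plain dictionary access. The sketch below uses a trimmed copy of the sample above as input; `summarize` is a hypothetical helper name, not something the gateway or SDK provides.

```python
import json

# Trimmed copy of the sample response above.
SAMPLE = """
{
  "model": "mako-8b-operator",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "eth is at $1,665 right now."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 1931, "completion_tokens": 63, "total_tokens": 1994}
}
"""


def summarize(response: dict) -> dict:
    """Pull out the fields most callers need (hypothetical helper)."""
    choice = response["choices"][0]
    return {
        "model": response["model"],
        "content": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": response["usage"]["total_tokens"],
    }


summary = summarize(json.loads(SAMPLE))
print(summary["content"])
```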
Model selection
There are two models. The model field controls which one you get:
| Value | Model | When to use |
|---|---|---|
| "conductor" | Mako-32B | Tool calling, complex questions. Requires stream: false. |
| "operator" | Mako-8B | Fast chat, streaming. Supports stream: true. |
If you set model: "conductor" with stream: true, the gateway silently falls back to the 8B model. The 32B model only supports non-streaming requests via RunPod Serverless.
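Since the fallback is silent, it may help to mirror the rule client-side so the downgrade is visible before the request is sent. A minimal sketch, assuming the rule is exactly as stated above; `effective_model` is a hypothetical helper and does not query the gateway.

```python
import warnings


def effective_model(requested: str, stream: bool) -> str:
    """Return the model alias the request is expected to reach.

    Mirrors the documented rule only: "conductor" with stream=True
    falls back to the 8B model ("operator"). Hypothetical helper.
    """
    if requested == "conductor" and stream:
        warnings.warn(
            'model "conductor" does not stream; expect the 8B "operator" model',
            stacklevel=2,
        )
        return "operator"
    return requested
```

A caller that needs the 32B model can treat the warning as a cue to set `stream` to false instead of accepting the smaller model.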
This is where Mako gets interesting. She’ll chain tools automatically:
curl -X POST https://gateway.deepmako.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-wallet-address: 0x0000000000000000000000000000000000000001" \
  -d '{
    "model": "conductor",
    "messages": [{"role": "user", "content": "what is vitalik.eth balance in usd?"}],
    "stream": false
  }'
The gateway handles the entire tool chain server-side — ENS resolution, balance fetch, price lookup, calculation — and returns a single final answer.
Credits
When credits are enabled, each request costs:
- 1 credit per 1,000 input tokens
- 2 credits per 1,000 output tokens
- Minimum charge: 1 credit per request
Check your balance at GET /credits/balance.
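The pricing rules above translate into a short estimate. This sketch assumes usage is prorated per token and that the minimum applies to the request total. The rounding behavior of the gateway is not documented here, so actual charges may differ. `estimate_credits` is a hypothetical name.

```python
INPUT_CREDITS_PER_1K = 1   # 1 credit per 1,000 input tokens
OUTPUT_CREDITS_PER_1K = 2  # 2 credits per 1,000 output tokens
MIN_CREDITS = 1.0          # minimum charge per request


def estimate_credits(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the cost of one request (assumes prorated, unrounded billing)."""
    cost = (
        prompt_tokens * INPUT_CREDITS_PER_1K
        + completion_tokens * OUTPUT_CREDITS_PER_1K
    ) / 1000
    return max(MIN_CREDITS, cost)


# Token counts taken from the sample response earlier in this guide.
print(estimate_credits(1931, 63))
```

Under these assumptions the sample request (1,931 input tokens, 63 output tokens) comes to about 2.06 credits, and a very short exchange is billed at the 1-credit minimum.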