Documentation
Sharyx OS is the open-source orchestration engine for building production-grade AI voice agents. It connects Speech-to-Text (STT), Large Language Models (LLM), and Text-to-Speech (TTS) into a seamless, low-latency conversation loop.
New in v1.2: native support for Gemini 1.5 Pro and multimodal tool calling.
Installation
Install the core package via npm to get started:
npm install sharyx-os
Quickstart
Set up a basic agent in under 2 minutes.
import { createAgent, OpenAILLM, DeepgramSTT, ElevenLabsTTS } from 'sharyx-os';
const agent = createAgent({
  stt: new DeepgramSTT({ apiKey: process.env.DEEPGRAM_API_KEY }),
  llm: new OpenAILLM({ apiKey: process.env.OPENAI_API_KEY }),
  tts: new ElevenLabsTTS({ apiKey: process.env.ELEVEN_LABS_API_KEY }),
  systemPrompt: "You are a helpful assistant for a medical clinic."
});
agent.start({ port: 3000 });
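The quickstart reads three API keys from the environment, so it can help to confirm they are set before creating the agent. The following is a minimal sketch in plain Node.js. The helper name findMissingKeys is hypothetical and is not part of the sharyx-os API; only the variable names come from the quickstart.

```javascript
// Hypothetical preflight helper; not part of the sharyx-os API.
// The variable names are the ones the quickstart reads from process.env.
const REQUIRED_KEYS = [
  'DEEPGRAM_API_KEY',
  'OPENAI_API_KEY',
  'ELEVEN_LABS_API_KEY',
];

// Returns the names in `required` that are unset or blank in `env`.
function findMissingKeys(env, required = REQUIRED_KEYS) {
  return required.filter((name) => {
    const value = env[name];
    return typeof value !== 'string' || value.trim() === '';
  });
}
```

Calling findMissingKeys(process.env) before createAgent and stopping when the returned array is non-empty gives a readable error up front instead of a provider authentication failure during the first call.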
Built-in Providers
Sharyx OS ships with built-in integrations for the following providers.
STT
- Deepgram
- OpenAI Whisper
- Google Speech-to-Text
LLM
- OpenAI (GPT-4o)
- Anthropic (Claude 3.5)
- Google Gemini
TTS
- ElevenLabs
- Cartesia
- Play.ht
Implementation Procedures
Follow these standard procedures to deploy your agent across different communication channels.
Telephony Procedure
- Install: Run npm install sharyx-os in your project.
- Config: Add your API keys and provider credentials to .env.
- Start: Execute npm run start to spin up the voice server.
- Connect: Point your Twilio/Plivo webhook to your server URL.
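For the Connect step, the telephony provider calls your webhook over the public internet, so a localhost URL will not work without a tunnel or a deployed host. The sketch below checks a candidate URL before you paste it into the Twilio or Plivo console. The helper checkWebhookUrl and the example path are hypothetical; the source does not state which path Sharyx OS exposes for webhooks.

```javascript
// Hypothetical helper; not part of sharyx-os or of the Twilio/Plivo SDKs.
// A few common hostnames that only resolve on the local machine.
const LOCAL_HOSTNAMES = new Set(['localhost', '127.0.0.1', '0.0.0.0']);

// Returns { ok, reason } describing whether `candidate` looks usable
// as a webhook URL for an external telephony provider.
function checkWebhookUrl(candidate) {
  let url;
  try {
    url = new URL(candidate); // WHATWG URL class, a global in Node.js
  } catch {
    return { ok: false, reason: 'not a valid absolute URL' };
  }
  if (url.protocol !== 'https:' && url.protocol !== 'http:') {
    return { ok: false, reason: 'must use http or https' };
  }
  if (LOCAL_HOSTNAMES.has(url.hostname)) {
    return { ok: false, reason: 'points at a local-only address' };
  }
  return { ok: true, reason: '' };
}

// Example with a placeholder host and path (both are assumptions):
// checkWebhookUrl('https://voice.example.com/webhook')
```

This check only inspects the URL's shape. It does not confirm that the server is running or that the provider can actually reach it.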
Webcall Procedure
- Install: Run npm install sharyx-os in your project.
- Config: Add your API keys to the .env file.
- Start: Execute npm run start to launch the local environment.
- Test: Open your browser and navigate to the local dashboard.
Offline Documentation
Take the Sharyx OS documentation with you. Download the official PDF guide for offline reference.
Sharyx OS Full Guide (PDF format, 1.2 MB)