Documentation

Sharyx OS is the open-source orchestration engine for building production-grade AI voice agents. It connects Speech-to-Text (STT), Large Language Models (LLM), and Text-to-Speech (TTS) into a seamless, low-latency conversation loop.

New in v1.2: Native support for Gemini 1.5 Pro and multimodal tool calling.

Installation

Install the core package via npm to get started:

npm install sharyx-os

Quickstart

Set up a basic agent in under 2 minutes.

import { createAgent, OpenAILLM, DeepgramSTT, ElevenLabsTTS } from 'sharyx-os';

// Wire one provider per stage; API keys are read from environment variables.
const agent = createAgent({
  stt: new DeepgramSTT({ apiKey: process.env.DEEPGRAM_API_KEY }),     // speech-to-text
  llm: new OpenAILLM({ apiKey: process.env.OPENAI_API_KEY }),         // language model
  tts: new ElevenLabsTTS({ apiKey: process.env.ELEVEN_LABS_API_KEY }), // text-to-speech
  systemPrompt: "You are a helpful assistant for a medical clinic."
});

// Start the agent, listening on port 3000.
agent.start({ port: 3000 });
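
To make the STT, LLM, TTS loop concrete, here is a minimal, self-contained sketch of a single conversational turn. The interfaces, the runTurn function, and the stub providers are hypothetical names invented for illustration; they are not the sharyx-os types or internals. The sketch runs the three stages strictly in sequence, whereas a low-latency engine would typically stream partial results between stages.

```typescript
// Hypothetical interfaces for illustration only; NOT the sharyx-os API.
interface SpeechToText {
  transcribe(audio: Uint8Array): Promise<string>;
}
interface LanguageModel {
  reply(systemPrompt: string, userText: string): Promise<string>;
}
interface TextToSpeech {
  synthesize(text: string): Promise<Uint8Array>;
}

interface Pipeline {
  stt: SpeechToText;
  llm: LanguageModel;
  tts: TextToSpeech;
  systemPrompt: string;
}

// One conversational turn: caller audio in, agent audio out.
async function runTurn(p: Pipeline, audioIn: Uint8Array): Promise<Uint8Array> {
  const userText = await p.stt.transcribe(audioIn);              // 1. speech -> text
  const replyText = await p.llm.reply(p.systemPrompt, userText); // 2. text -> reply
  return p.tts.synthesize(replyText);                            // 3. reply -> speech
}

// Stub providers so the sketch runs without network access or API keys.
const stubPipeline: Pipeline = {
  stt: {
    async transcribe(audio) {
      return `caller sent ${audio.length} bytes`;
    },
  },
  llm: {
    async reply(_systemPrompt, userText) {
      return `You said: ${userText}`;
    },
  },
  tts: {
    // Returns a zero-filled buffer with one byte per character as fake audio.
    async synthesize(text) {
      return new Uint8Array(text.length);
    },
  },
  systemPrompt: "You are a helpful assistant for a medical clinic.",
};
```

Calling runTurn(stubPipeline, new Uint8Array(4)) resolves to a buffer produced by the stub TTS. Swapping a provider in this sketch means replacing one field of the pipeline object, which mirrors how the Quickstart passes stt, llm, and tts to createAgent.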

Built-in Providers

Sharyx OS supports the following providers out of the box.

STT

Deepgram
OpenAI Whisper
Google Speech-to-Text

LLM

OpenAI (GPT-4o)
Anthropic (Claude 3.5)
Google Gemini

TTS

ElevenLabs
Cartesia
Play.ht

Implementation Procedures

Follow these standard procedures to deploy your agent across different communication channels.

📞 Telephony Procedure

Install: Run npm install sharyx-os in your project.
Config: Add your API keys and provider credentials to .env.
Start: Execute npm run start to spin up the voice server.
Connect: Point your Twilio/Plivo webhook to your server URL.
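
A missing key in the Config step often surfaces only when the first call arrives. The sketch below shows one way to fail fast at startup instead. The variable names come from the Quickstart; the missingKeys and assertEnv helpers are hypothetical and not part of sharyx-os. The helpers take the environment as a parameter, so they run anywhere and are easy to test.

```typescript
// Variable names taken from the Quickstart; adjust to the providers you use.
const REQUIRED_KEYS: readonly string[] = [
  "DEEPGRAM_API_KEY",
  "OPENAI_API_KEY",
  "ELEVEN_LABS_API_KEY",
];

type Env = Record<string, string | undefined>;

// Returns the names of required variables that are unset or blank.
// Hypothetical helper, not part of sharyx-os.
function missingKeys(env: Env, required: readonly string[] = REQUIRED_KEYS): string[] {
  return required.filter((name) => (env[name] ?? "").trim() === "");
}

// Throws one descriptive error listing every missing variable.
function assertEnv(env: Env): void {
  const missing = missingKeys(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

In a Node.js project you would call assertEnv(process.env) before createAgent. Note that Node.js does not read a .env file by default; it has to be loaded first, for example with the dotenv package or with the --env-file flag available in recent Node.js releases.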

🌐 Webcall Procedure

Install: Run npm install sharyx-os in your project.
Config: Add your API keys to the .env file.
Start: Execute npm run start to launch the local environment.
Test: Open your browser and navigate to the local dashboard.

Offline Documentation

Take the Sharyx OS documentation with you. Download the official PDF guide for offline reference.

Sharyx OS Full Guide (PDF, 1.2 MB)