Documentation

Sharyx OS is the open-source orchestration engine for building production-grade AI voice agents. It connects Speech-to-Text (STT), Large Language Models (LLM), and Text-to-Speech (TTS) into a seamless, low-latency conversation loop.
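To make the loop concrete, here is a minimal sketch of a single conversational turn, assuming each stage can be modeled as an async function. The `runTurn` function, the `stages` shape, and the stub providers are hypothetical and are not part of the Sharyx OS API; they only illustrate the order in which data moves through STT, LLM, and TTS.

```javascript
// One turn of the loop: caller audio -> transcript -> LLM reply -> reply audio.
// `stages` is a hypothetical shape for illustration, not the Sharyx OS API:
//   stt.transcribe(audio)             -> Promise<string>
//   llm.reply(systemPrompt, userText) -> Promise<string>
//   tts.synthesize(text)              -> Promise<Uint8Array>
async function runTurn(stages, systemPrompt, callerAudio) {
  const transcript = await stages.stt.transcribe(callerAudio);
  const replyText = await stages.llm.reply(systemPrompt, transcript);
  const replyAudio = await stages.tts.synthesize(replyText);
  return { transcript, replyText, replyAudio };
}

// Stub stages so the sketch runs without network access or API keys.
const stubStages = {
  stt: { transcribe: async (audio) => new TextDecoder().decode(audio) },
  llm: { reply: async (_systemPrompt, text) => `You said: ${text}` },
  tts: { synthesize: async (text) => new TextEncoder().encode(text) },
};
```

This sketch runs the three stages strictly one after another. Low-latency systems typically stream partial results between stages instead of waiting for each one to finish, so treat it as a mental model rather than a description of the engine's internals.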

New in v1.2: native support for Gemini 1.5 Pro and multimodal tool calling.

Installation

Install the core package via npm to get started:

npm install sharyx-os

Quickstart

Set up a basic agent in under 2 minutes.

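// Import the agent factory and one provider class per pipeline stage.
import { createAgent, OpenAILLM, DeepgramSTT, ElevenLabsTTS } from 'sharyx-os';

// API keys are read from environment variables rather than hard-coded.
const agent = createAgent({
  stt: new DeepgramSTT({ apiKey: process.env.DEEPGRAM_API_KEY }),      // speech-to-text
  llm: new OpenAILLM({ apiKey: process.env.OPENAI_API_KEY }),          // language model
  tts: new ElevenLabsTTS({ apiKey: process.env.ELEVEN_LABS_API_KEY }), // text-to-speech
  systemPrompt: "You are a helpful assistant for a medical clinic."
});

// Start the agent's server on port 3000.
agent.start({ port: 3000 });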
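The quickstart reads three API keys from the environment, and an unset key usually surfaces only later as an authentication error from the provider. A fail-fast check before creating the agent can make that easier to diagnose. The sketch below assumes a hypothetical helper named `missingKeys`; it is not part of sharyx-os.

```javascript
// Environment variable names taken from the quickstart snippet.
const REQUIRED_KEYS = ["DEEPGRAM_API_KEY", "OPENAI_API_KEY", "ELEVEN_LABS_API_KEY"];

// Hypothetical helper, not part of sharyx-os: returns the names of required
// variables that are unset or contain only whitespace.
function missingKeys(env, required = REQUIRED_KEYS) {
  return required.filter((name) => (env[name] ?? "").trim() === "");
}
```

One way to use it is to call `missingKeys(process.env)` at startup and throw an error listing the returned names if the array is not empty.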

Built-in Providers

Sharyx OS ships with built-in support for the following providers.

STT

  • Deepgram
  • OpenAI Whisper
  • Google Speech-to-Text

LLM

  • OpenAI (GPT-4o)
  • Anthropic (Claude 3.5)
  • Google Gemini

TTS

  • ElevenLabs
  • Cartesia
  • Play.ht

Implementation Procedures

Follow these standard procedures to deploy your agent across different communication channels.

📞 Telephony Procedure

  1. Install: Run npm install sharyx-os in your project.
  2. Config: Add your API keys and provider credentials to .env.
  3. Start: Execute npm run start to spin up the voice server.
  4. Connect: Point your Twilio/Plivo webhook to your server URL.
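Step 4 only works if the telephony provider can reach your server over the public internet, so a localhost address will not work as a webhook target without a tunnel or a deployed host. The sketch below is a small sanity check for the URL you plan to register. `isReachableWebhookUrl` is a hypothetical helper, not part of sharyx-os.

```javascript
// Hypothetical helper, not part of sharyx-os: returns true when the URL is an
// absolute http(s) URL whose host is not an obvious local-only address.
function isReachableWebhookUrl(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "https:" && url.protocol !== "http:") return false;
  const localOnlyHosts = ["localhost", "127.0.0.1", "0.0.0.0", "[::1]"];
  return !localOnlyHosts.includes(url.hostname);
}
```

The check is deliberately shallow: it does not detect private network ranges or confirm that the server actually responds, so it only catches the most common local-development mistake.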

🌐 Webcall Procedure

  1. Install: Run npm install sharyx-os in your project.
  2. Config: Add your API keys to the .env file.
  3. Start: Execute npm run start to launch the local environment.
  4. Test: Open your browser and navigate to the local dashboard.

Offline Documentation

Take the Sharyx OS documentation with you. Download the official PDF guide for offline reference.

Sharyx OS Full Guide (PDF, 1.2 MB)