feat: Twitter Spaces Integration #1550

Merged: 6 commits into elizaOS:develop on Jan 1, 2025
Conversation

@slkzgm (Contributor) commented Dec 29, 2024

Risks

Low. Existing users who relied on Deepgram by default will still see no change unless they explicitly define a new TRANSCRIPTION_PROVIDER. Fallback logic preserves original behavior (Deepgram → OpenAI → Local).

Background

What does this PR do?

  • Adds an optional TRANSCRIPTION_PROVIDER setting (deepgram, openai, or local) with fallback logic.
    • If not set, old behavior remains: Deepgram → OpenAI → Local.
  • Moves Twitter Spaces plugins from agent-twitter-client into this repo for better flexibility and less friction in plugin development.
  • Introduces an AI-driven Twitter Spaces flow:
    1. Automatic Space launch decisions (random chance, business hours, cooldown intervals).
    2. Multi-speaker logic with queue management (maxSpeakers).
    3. GPT-based filler/idle messages, optional STT/TTS bridging, local audio recording.
    4. Graceful shutdown and cooldown for repeated Spaces.

Transcription Service Changes

  • We introduced a new TranscriptionProvider enum (Deepgram, OpenAI, or Local) to replace string flags.
  • In initialize(), the provider is chosen in this order:
    1. character.settings.transcription (if the API keys exist),
    2. .env (TRANSCRIPTION_PROVIDER),
    3. Old fallback logic (Deepgram → OpenAI → Local) if neither is configured.
  • For example, in your character.json, you can specify:
    {
      // ...
      "settings": {
        "transcription": "Deepgram"
      }
    }
    If you have DEEPGRAM_API_KEY set, the service will use Deepgram; otherwise it continues to the next check.
  • processQueue() switches on this.transcriptionProvider to pick the final method (transcribeWithDeepgram, transcribeWithOpenAI, or transcribeLocally); a sketch of the resolution logic follows this list.
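
For illustration, here is a minimal sketch of that resolution order in TypeScript. The enum values and transcription method names come from the bullets above; the helper function, its parameters, and the key checks are hypothetical and not the exact code in transcription.service.ts:

enum TranscriptionProvider {
    Deepgram = "deepgram",
    OpenAI = "openai",
    Local = "local",
}

// Hypothetical helper mirroring the priority described above:
// character settings -> .env -> old fallback (Deepgram -> OpenAI -> Local).
function resolveTranscriptionProvider(
    characterSetting: string | undefined, // character.settings.transcription
    envSetting: string | undefined,       // process.env.TRANSCRIPTION_PROVIDER
    hasDeepgramKey: boolean,
    hasOpenAIKey: boolean
): TranscriptionProvider {
    const pick = (value?: string): TranscriptionProvider | undefined => {
        switch (value?.toLowerCase()) {
            case "deepgram":
                return hasDeepgramKey ? TranscriptionProvider.Deepgram : undefined;
            case "openai":
                return hasOpenAIKey ? TranscriptionProvider.OpenAI : undefined;
            case "local":
                return TranscriptionProvider.Local;
            default:
                return undefined;
        }
    };

    return (
        pick(characterSetting) ??
        pick(envSetting) ??
        (hasDeepgramKey
            ? TranscriptionProvider.Deepgram
            : hasOpenAIKey
              ? TranscriptionProvider.OpenAI
              : TranscriptionProvider.Local)
    );
}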

Flow Recap

  1. Periodic Check

    • If no Space is running, shouldLaunchSpace() decides whether to launch one (random chance, business hours, cooldown); see the sketch after this recap.
    • If a Space is running, manageCurrentSpace() handles speaker timeouts, occupancy updates, queue acceptance, etc.
  2. Space Creation

    • Generates a SpaceConfig (topics from config or GPT).
    • Attaches plugins: audio recording, STT/TTS, idle monitor, etc.
    • Hooks into speakerRequest, occupancyUpdate, idleTimeout, etc.
  3. Speaker Logic

    • Maintains an activeSpeakers array + a queue if at capacity (maxSpeakers).
    • Enforces speakerMaxDurationMs per speaker.
    • If a speaker is removed, accept next in queue if available.
  4. Stopping

    • stopSpace() finalizes the Space, logs completion, clears states, etc.
    • Resumes periodic checks at a slower interval until the next launch is decided.
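
As a rough illustration of the step 1 decision, here is what shouldLaunchSpace() might check. The field names follow the twitterSpaces config shown below; the 9:00–17:00 business-hours window and the timestamp bookkeeping are assumptions, not the actual implementation:

interface LaunchSettings {
    randomChance: number;                    // e.g. 0.3
    businessHoursOnly: boolean;
    minIntervalBetweenSpacesMinutes: number;
}

function shouldLaunchSpace(
    cfg: LaunchSettings,
    lastSpaceEndedAt: Date | undefined,
    now: Date = new Date()
): boolean {
    // Business-hours gate (assumed here to mean 9:00-17:00 local time).
    if (cfg.businessHoursOnly) {
        const hour = now.getHours();
        if (hour < 9 || hour >= 17) return false;
    }

    // Cooldown gate since the previous Space ended.
    if (lastSpaceEndedAt) {
        const minutesSince = (now.getTime() - lastSpaceEndedAt.getTime()) / 60_000;
        if (minutesSince < cfg.minIntervalBetweenSpacesMinutes) return false;
    }

    // Random-chance gate, evaluated on every periodic check.
    return Math.random() < cfg.randomChance;
}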

Configuration

A) .env / Environment Variables

# Transcription Provider
TRANSCRIPTION_PROVIDER=         # Optional (deepgram, openai, or local); if unset, the old fallback applies (Deepgram → OpenAI → Local)
OPENAI_API_KEY=sk-...
DEEPGRAM_API_KEY=...

B) character.json "twitterSpaces" Field

{
  // ...
  "settings": {
    // ...
    "transcription": "Deepgram"
  },
  "twitterSpaces": {
    "maxSpeakers": 2,
    "topics": [
      "Blockchain Trends",
      "AI Innovations"
    ],
    "typicalDurationMinutes": 45,
    "idleKickTimeoutMs": 300000,
    "minIntervalBetweenSpacesMinutes": 60,
    "businessHoursOnly": true,
    "randomChance": 0.3,
    "enableIdleMonitor": true,
    "enableSttTts": true,
    "enableRecording": false,
    "voiceId": "21m00Tcm4TlvDq8ikWAM",
    "sttLanguage": "en",
    "gptModel": "gpt-3.5-turbo",
    "systemPrompt": "You are a helpful AI co-host assistant.",
    "speakerMaxDurationMs": 240000
  }
}
  • maxSpeakers: number of concurrent speakers allowed.
  • topics: if none are provided, GPT generates them dynamically.
  • randomChance: probability for each check cycle to spawn a new Space.
  • speakerMaxDurationMs: maximum time each speaker can speak before removal (a sketch of the queue behavior follows this list).
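
A minimal sketch of the queue behavior that maxSpeakers and speakerMaxDurationMs drive. The class and method names here are hypothetical; the real plugin also has to call into the Space itself (mute, remove, announce), which is omitted:

interface Speaker {
    userId: string;
    startedAt: number; // epoch ms when the speaker was accepted
}

class SpeakerManager {
    private activeSpeakers: Speaker[] = [];
    private queue: string[] = [];

    constructor(
        private maxSpeakers: number,
        private speakerMaxDurationMs: number
    ) {}

    // speakerRequest handler: accept if a slot is free, otherwise enqueue.
    request(userId: string): void {
        if (this.activeSpeakers.length < this.maxSpeakers) {
            this.activeSpeakers.push({ userId, startedAt: Date.now() });
        } else {
            this.queue.push(userId);
        }
    }

    // Called from the periodic manageCurrentSpace() check.
    enforceTimeouts(now: number = Date.now()): void {
        for (const speaker of [...this.activeSpeakers]) {
            if (now - speaker.startedAt > this.speakerMaxDurationMs) {
                this.remove(speaker.userId);
            }
        }
    }

    // Removing a speaker frees a slot; the next queued user (if any) is accepted.
    remove(userId: string): void {
        this.activeSpeakers = this.activeSpeakers.filter((s) => s.userId !== userId);
        const next = this.queue.shift();
        if (next) this.request(next);
    }
}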

What kind of change is this?

  • Features (new Twitter Spaces integration and optional transcription provider).
  • Improvements (unified plugin development, more config options, fallback logic maintained).

Documentation changes needed?

Yes, minimal. We must mention:

  • The new TRANSCRIPTION_PROVIDER in .env (optional).
  • The new twitterSpaces config section in character.json.

Testing

Where should a reviewer start?

  • Check transcription.service.ts to review how it resolves the provider: character settings first, then .env, then the old fallback.
  • Check new or relocated Twitter Spaces integration files for the Space lifecycle (launch, speaker management, idle detection, etc.).

Detailed testing steps

  1. Define TRANSCRIPTION_PROVIDER in .env (or leave it empty to keep old fallback).
  2. Provide valid API keys if choosing deepgram or openai.
  3. Set twitterSpaces.randomChance to 1 in the character JSON so a Space launches on every check (see the snippet after these steps).
  4. Run the agent; verify that Spaces launch automatically, respect the chosen transcription provider, and handle multi-speaker logic as expected.
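
For step 3, the override in the character file is minimal; the other twitterSpaces fields from the example above can stay as they are:

{
  "twitterSpaces": {
    "randomChance": 1
  }
}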

No special database migrations are needed. Basic local runs and logs confirm correct functioning.

Future Improvements

  • More robust decision logic for accepting speakers, switching, and timeouts.
  • Realtime API plugin for smoother, on-the-fly conversation handling.
  • Solo Broadcast Mode: launch Spaces focused on a single host monologue with no external speakers.
  • True VAD (Voice Activity Detection) to detect when a speaker finishes talking, instead of relying on manual mute/unmute cues.
  • Advanced scheduling triggers (e.g., event-based or calendar-based).
  • Analytics & insights for post-Space summaries or usage metrics.

@github-actions bot left a comment

Hi @slkzgm! Welcome to the ai16z community. Thanks for submitting your first pull request; your efforts are helping us accelerate towards AGI. We'll review it shortly. You are now an ai16z contributor!

@odilitime odilitime changed the base branch from main to develop December 29, 2024 19:45
@odilitime (Collaborator) left a comment

please add back the documentation

@slkzgm slkzgm requested a review from odilitime December 30, 2024 20:19
@lalalune (Member) commented Jan 1, 2025

Some conflicts need review; we should prioritize getting this in since it's a pretty big push.

@lalalune lalalune merged commit 6f576b6 into elizaOS:develop Jan 1, 2025
4 checks passed
1to3for5vi7ate9x pushed a commit to 1to3for5vi7ate9x/eliza that referenced this pull request Jan 26, 2025
feat: Twitter Spaces Integration