TikTok Video Transcription API with Speech-to-Text
Transcribe any TikTok video using automatic speech recognition. Get accurate, timestamped text output in clean JSON format for LLMs and search applications.
What is the TikTok Video Transcription API?
The TikTok Video Transcription API provides two methods to get text from TikTok videos. Use the Transcript API to retrieve native captions when they exist (faster and lower cost), or use the Transcribe API for speech-to-text when captions are not available. Both return the same JSON format with timestamps and metadata.
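The "captions first, speech-to-text second" choice described above can be sketched as a small fallback routine. This is a minimal illustration, not the official client: the `post` callable is a hypothetical transport you supply, and the page does not say how the API signals that captions are missing, so the sketch assumes any response without `"status": "success"` and a non-empty `transcript` list counts as a miss.

```python
from typing import Callable, Optional

# Endpoint paths as listed on this page.
TRANSCRIPT_PATH = "/v1/transcript"  # native caption retrieval
TRANSCRIBE_PATH = "/v1/transcribe"  # speech-to-text

# Hypothetical transport: takes (path, video_url) and returns the parsed
# JSON response as a dict, or None when the request failed.
PostFn = Callable[[str, str], Optional[dict]]


def has_transcript(response: Optional[dict]) -> bool:
    """True when a response reports success and carries at least one segment.

    Assumption: a missing-captions result is anything that fails this check.
    """
    if not response or response.get("status") != "success":
        return False
    data = response.get("data") or {}
    return bool(data.get("transcript"))


def get_transcript(video_url: str, post: PostFn) -> Optional[dict]:
    """Try the cheaper caption endpoint first, then fall back to speech-to-text."""
    response = post(TRANSCRIPT_PATH, video_url)
    if has_transcript(response):
        return response
    response = post(TRANSCRIBE_PATH, video_url)
    return response if has_transcript(response) else None
```

Passing the transport in as a parameter keeps the fallback logic independent of any particular HTTP library and makes it easy to exercise with a stub.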
Two methods for TikTok videos
Caption retrieval
Fast extraction of native captions when available. Lower cost per request.
POST /v1/transcript
Speech-to-text
Generate transcripts from audio when no captions exist. Works on any video.
POST /v1/transcribe
What data is returned?
- Full transcript with word-level timestamps (start/end times)
- Video metadata: title, description, thumbnail URL
- Creator information: username, profile URL
- Video statistics: views, likes, shares, duration
- Structured segments for easy integration
- Audio quality indicators
- Video URL and creation timestamp
Example use cases
Monitor TikTok content at scale by transcribing viral videos for trend detection and competitive intelligence.
Extract spoken content from TikTok videos to create blog posts, social captions, or searchable archives.
Analyze creator content by transcribing their videos for topic analysis, keyword extraction, or sentiment tracking.
API workflow
1. Send POST request with TikTok URL
2. API downloads and processes audio
3. Speech-to-text generates transcript
4. Response returns transcript + metadata
Code examples
Integrate the TikTok Transcription API with these examples.
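As a starting point, here is a minimal Python sketch of the POST request using only the standard library. Only the endpoint paths (`/v1/transcript`, `/v1/transcribe`) come from this page. The base URL, the `X-API-Key` header name, and the `video_url` body field are placeholders assumed for illustration; check the official API reference for the real values.

```python
import json
import urllib.request

# Placeholder values: none of these three are given on this page.
BASE_URL = "https://api.example.com"  # placeholder host, not the real one
API_KEY_HEADER = "X-API-Key"          # assumed auth header name
URL_FIELD = "video_url"               # assumed request-body field name


def build_request(path: str, video_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for one of the endpoints."""
    body = json.dumps({URL_FIELD: video_url}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json", API_KEY_HEADER: api_key},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A call would then look like `send(build_request("/v1/transcribe", "https://www.tiktok.com/@user/video/1234567890", api_key))`, once the placeholders are replaced with real values.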
Response structure
{
  "status": "success",
  "data": {
    "video_info": {
      "title": "TikTok video by @creator",
      "description": "Check out this amazing content! #viral #fyp",
      "thumbnail": "https://p16-sign-va.tiktokcdn.com/...",
      "url": "https://www.tiktok.com/@user/video/1234567890",
      "channel": "@creator",
      "channel_url": "https://www.tiktok.com/@creator",
      "duration": 45.2,
      "views": 2500000,
      "likes": 150000
    },
    "transcript": [
      { "start": 0.0, "end": 2.5, "text": "Hey everyone, welcome back!" },
      { "start": 2.5, "end": 6.8, "text": "Today I'm going to show you something incredible." },
      { "start": 6.8, "end": 11.2, "text": "You won't believe how easy this is." }
    ]
  }
}
Why use VidNavigator for TikTok transcription?
Our speech-to-text engine transcribes 1 hour of video in under 30 seconds. No waiting around.
Transcribe TikTok videos in over 99 languages with high accuracy.
Pricing scales down to as low as $0.000025 per transcript for caption retrieval, or $0.0041 per minute for speech-to-text. Enterprise rates are lower still.
Each transcript segment includes precise start and end times for citation and navigation.
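To illustrate how the per-segment start and end times support citation, the sketch below turns a response into timestamped lines. It relies only on the `data.transcript` fields shown in the response structure on this page (`start`, `end`, `text`); the helper names are hypothetical, and the `M:SS` label format is just one reasonable choice.

```python
from typing import List


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as M:SS, truncating fractions."""
    total = int(seconds)
    return f"{total // 60}:{total % 60:02d}"


def cite_segments(response: dict) -> List[str]:
    """Return one '[start-end] text' line per transcript segment."""
    lines = []
    for segment in response["data"]["transcript"]:
        start = format_timestamp(segment["start"])
        end = format_timestamp(segment["end"])
        lines.append(f"[{start}-{end}] {segment['text']}")
    return lines


# Sample segments copied from the response structure on this page.
sample = {
    "status": "success",
    "data": {
        "transcript": [
            {"start": 0.0, "end": 2.5, "text": "Hey everyone, welcome back!"},
            {"start": 2.5, "end": 6.8, "text": "Today I'm going to show you something incredible."},
            {"start": 6.8, "end": 11.2, "text": "You won't believe how easy this is."},
        ]
    },
}

for line in cite_segments(sample):
    print(line)
# [0:00-0:02] Hey everyone, welcome back!
# [0:02-0:06] Today I'm going to show you something incredible.
# [0:06-0:11] You won't believe how easy this is.
```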
Pricing
Start free. Caption retrieval scales down to as low as $0.000025 per transcript. Speech-to-text scales down to as low as $0.0041 per minute. Enterprise plans are available at even lower rates.
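For rough budgeting, the two "as low as" rates above can be combined into a lower-bound estimate. This is only a sketch: it assumes the best-case rates apply to every request, so actual costs on lower-volume tiers will be higher. The function name is hypothetical, and `Decimal` is used to avoid floating-point rounding on such small amounts.

```python
from decimal import Decimal

# Best-case rates quoted on this page, in USD.
CAPTION_RATE = Decimal("0.000025")  # per caption transcript
STT_RATE = Decimal("0.0041")        # per minute of speech-to-text


def estimate_floor_cost(caption_requests: int, stt_minutes: float) -> Decimal:
    """Lower-bound cost in USD, assuming the lowest quoted rates throughout."""
    caption_cost = CAPTION_RATE * caption_requests
    stt_cost = STT_RATE * Decimal(str(stt_minutes))
    return caption_cost + stt_cost


# 10,000 caption retrievals plus 500 minutes of speech-to-text.
print(estimate_floor_cost(10_000, 500))
```

At these rates, 10,000 caption retrievals contribute $0.25 and 500 minutes of speech-to-text contribute $2.05, so the floor for that workload is $2.30.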