Create Transcription
POST /automation/create-transcription
curl --request POST \
  --url https://public.reap.video/api/v1/automation/create-transcription \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "sourceUrl": "<string>",
  "uploadId": "<string>",
  "language": "<string>",
  "translationLanguage": "<string>",
  "transcriptionScript": "native"
}
'
{
  "id": "<string>",
  "title": "<string>",
  "thumbnail": "<string>",
  "billedDuration": 123,
  "status": "queued",
  "projectType": "clipping",
  "source": "Upload",
  "genre": "talking",
  "topics": [
    "<string>"
  ],
  "clipDurations": [
    [
      123
    ]
  ],
  "selectedStart": 123,
  "selectedEnd": 123,
  "reframeClips": true,
  "exportResolution": 123,
  "exportOrientation": "landscape",
  "captionsPreset": "<string>",
  "enableCaptions": true,
  "enableEmojis": true,
  "enableHighlights": true,
  "language": "<string>",
  "dubbingLanguage": "<string>",
  "translateTranscription": true,
  "translationLanguages": [
    "<string>"
  ],
  "transcriptionScript": "native",
  "metadata": {
    "width": 123,
    "height": 123,
    "aspectRatio": "<string>",
    "size": 123,
    "bitrate": 123,
    "fps": 123,
    "duration": 123,
    "rotation": 123,
    "resolution": 123,
    "codec": "<string>",
    "codecFullName": "<string>",
    "codecTag": "<string>",
    "format": "<string>",
    "formatFullName": "<string>"
  },
  "urls": {},
  "createdAt": 123,
  "updatedAt": 123
}

Overview

Extract accurate transcriptions from your videos using AI-powered speech recognition. This endpoint generates timestamped transcriptions with support for multiple languages, translation, and script format options. Transcription output is available in multiple formats including SRT, VTT, CSV, and TXT.

Rate Limiting

This endpoint is rate limited to 10 requests per minute per API key.
You must provide either sourceUrl or uploadId, but not both.
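A client-side throttle is one way to stay under the rate limit stated above. The following is a minimal sketch, not part of any Reap SDK: `SlidingWindowLimiter` is a hypothetical helper, and the injectable `clock` exists only to make it testable.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side guard for a 'max_calls per period' rate limit.

    Hypothetical helper, not part of any Reap SDK.
    """

    def __init__(self, max_calls=10, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self._sent = deque()  # timestamps of recent requests

    def wait_time(self):
        """Return the seconds to wait before the next request is allowed."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self._sent and now - self._sent[0] >= self.period:
            self._sent.popleft()
        if len(self._sent) < self.max_calls:
            return 0.0
        return self.period - (now - self._sent[0])

    def record(self):
        """Record that a request was just sent."""
        self._sent.append(self.clock())
```

Before each request, sleep for `limiter.wait_time()` seconds, send the request, then call `limiter.record()`.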

Video Requirements

Duration

Minimum: 3 seconds
Maximum: 15 minutes

File Size

Maximum: 5 GB

Format

MP4 or MOV with valid audio streams

Audio Quality

Clear speech produces the best transcription results
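Checking a file against these requirements before submitting it can avoid a rejected request. This is a sketch under stated assumptions: `preflight_problems` is a hypothetical helper, "5 GB" is read as 5 GiB, and the caller supplies the duration and size (for example from a local probe of the file). It does not verify that the file has a valid audio stream.

```python
from pathlib import PurePath

MIN_DURATION_S = 3              # minimum duration from the requirements above
MAX_DURATION_S = 15 * 60        # 15 minutes
MAX_SIZE_BYTES = 5 * 1024 ** 3  # assumption: "5 GB" read as 5 GiB
ALLOWED_EXTENSIONS = {".mp4", ".mov"}


def preflight_problems(filename, duration_s, size_bytes):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if PurePath(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("format must be MP4 or MOV")
    if duration_s < MIN_DURATION_S:
        problems.append("duration is below 3 seconds")
    elif duration_s > MAX_DURATION_S:
        problems.append("duration exceeds 15 minutes")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file is larger than 5 GB")
    return problems
```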

Plan Limits

Plan      Concurrent Projects
Creator   5
Studio    15
Higher-tier plans allow you to process more videos simultaneously.
The Automation API requires an active subscription. View pricing to compare plans.

Response

id
string
Unique project identifier
title
string
Project title (usually the filename)
thumbnail
string
Thumbnail URL for the project
billedDuration
number
Duration in seconds that will be billed to your account
status
string
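Current processing status. The most common values are listed here; the response schema at the end of this page gives the full set, including queued.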
  • processing - Audio is being transcribed
  • completed - Transcription has been generated successfully
  • failed - Processing failed due to an error
projectType
string
Type of project (always “transcription” for this endpoint)
source
string
Source of the video content
  • Upload - Uploaded file
  • Youtube - YouTube URL
  • Generic - External URL
language
string
Primary language of the video content
translateTranscription
boolean
Whether transcription will be translated
translationLanguages
array
Array of languages for translation
transcriptionScript
string
Script format for transcription (“native” or “roman”)
metadata
object
Video file metadata including duration, resolution, format, etc.
urls
object
Project URLs and assets (populated when processing completes). Includes transcription files in multiple formats: SRT, VTT, CSV, and TXT.
createdAt
integer
Unix timestamp when the project was created
updatedAt
integer
Unix timestamp when the project was last updated

Example Request

curl -X POST "https://public.reap.video/api/v1/automation/create-transcription" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "uploadId": "65f1a2b3c4d5e6f7a8b9c0d1",
    "language": "en",
    "transcriptionScript": "native"
  }'
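The same request can be built in Python with only the standard library. This is an illustrative equivalent of the curl call above, not an official client; `build_request` is a hypothetical helper name, and the request is constructed without being sent.

```python
import json
import urllib.request

API_URL = "https://public.reap.video/api/v1/automation/create-transcription"


def build_request(api_key, payload):
    """Build (but do not send) the POST request for create-transcription."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request(
    "YOUR_API_KEY",
    {
        "uploadId": "65f1a2b3c4d5e6f7a8b9c0d1",
        "language": "en",
        "transcriptionScript": "native",
    },
)
# To send it and decode the JSON response:
#     with urllib.request.urlopen(request) as resp:
#         project = json.load(resp)
```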

Processing Workflow

  1. Audio Extraction - Audio is extracted from the video file
  2. Speech Recognition - AI transcribes the speech with word-level timing
  3. Translation - If a translation language is specified, the transcription is translated
  4. Format Generation - Output is generated in multiple formats (SRT, VTT, CSV, TXT)
  5. Completion - The project status changes to completed; use Get Project Status to monitor progress until then

Output Formats

When transcription completes, the urls object in Get Project Details includes:
Format   Field               Description
SRT      transcription_srt   SubRip subtitle format
VTT      transcription_vtt   WebVTT subtitle format
CSV      transcription_csv   Comma-separated values
TXT      transcription_txt   Plain text transcript
Audio    audioFile           Extracted audio file
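Once a project response is in hand, the transcript links can be picked out of the urls object by field name. The sketch below uses the field names from the table above and assumes each value is a plain URL string; `transcript_urls` is a hypothetical helper.

```python
# Field names come from the table above.
TRANSCRIPT_FIELDS = {
    "srt": "transcription_srt",
    "vtt": "transcription_vtt",
    "csv": "transcription_csv",
    "txt": "transcription_txt",
}


def transcript_urls(project):
    """Map format name -> URL for every transcript format present."""
    # urls may be missing or empty while the project is still processing.
    urls = project.get("urls") or {}
    return {
        fmt: urls[field]
        for fmt, field in TRANSCRIPT_FIELDS.items()
        if urls.get(field)
    }
```

Formats that are absent are simply omitted from the result, so the same function works on a project that has not finished processing.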

Best Practices

  • Audio Quality: Clear audio with minimal background noise produces more accurate results
  • Language Selection: Specify the language explicitly for better transcription accuracy
  • Script Format: Use “roman” for romanized output of non-Latin script languages
  • Translation: Set translationLanguage in the request to also receive a translated transcription

Use Cases

Content Indexing

Generate searchable text from video libraries at scale

Subtitle Generation

Create SRT/VTT files for video players and platforms

Meeting Notes

Transcribe recorded meetings and webinars

Accessibility

Make video content accessible with accurate transcriptions

Next Steps

After creating a transcription project:
  1. Monitor progress with Get Project Status
  2. Retrieve the full project with transcription URLs via Get Project Details
  3. Download transcription files in your preferred format
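The monitoring step can be sketched as a simple polling loop. The Get Project Status request itself is not shown on this page, so `fetch_status` is a caller-supplied, hypothetical function that returns the status string. Treating invalid, expired, and error as terminal is an assumption based on their names, and the interval and timeout defaults are arbitrary.

```python
import time

# "completed" and "failed" are documented outcomes; treating "invalid",
# "expired", and "error" as terminal is an assumption based on their names.
TERMINAL_STATUSES = {"completed", "failed", "invalid", "expired", "error"}


def wait_for_project(fetch_status, project_id, interval_s=10.0,
                     timeout_s=1800.0, sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status(project_id) until a terminal status or timeout.

    fetch_status is a caller-supplied function (hypothetical here) that
    calls Get Project Status and returns the status string.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status(project_id)
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"project {project_id} still {status!r}")
        sleep(interval_s)
```

When the function returns completed, fetch the project via Get Project Details and read the transcription URLs from its urls object.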

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json
sourceUrl
string

URL to a video or audio file (alternative to uploadId)

uploadId
string

Upload ID from a previously uploaded file (alternative to sourceUrl)

language
string | null

Primary language of the video content (auto-detected if not provided)

translationLanguage
string | null

Language to translate the transcription to

transcriptionScript
enum<string>
default:native

Script format for transcription output

Available options:
native,
roman
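The body rules above (exactly one of sourceUrl or uploadId, and a transcriptionScript of native or roman) can be enforced before the request is sent. This is a minimal sketch; `build_payload` and its snake_case parameter names are hypothetical, and only the JSON keys come from this page.

```python
def build_payload(source_url=None, upload_id=None, language=None,
                  translation_language=None, transcription_script="native"):
    """Build the JSON body, enforcing the documented constraints."""
    if (source_url is None) == (upload_id is None):
        raise ValueError("provide exactly one of sourceUrl or uploadId")
    if transcription_script not in ("native", "roman"):
        raise ValueError("transcriptionScript must be 'native' or 'roman'")

    payload = {"transcriptionScript": transcription_script}
    if source_url is not None:
        payload["sourceUrl"] = source_url
    else:
        payload["uploadId"] = upload_id
    # Optional fields are omitted when unset so the API can apply its
    # defaults (for example, language auto-detection).
    if language is not None:
        payload["language"] = language
    if translation_language is not None:
        payload["translationLanguage"] = translation_language
    return payload
```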

Response

200 - application/json

Successful response

id
string
title
string
thumbnail
string
billedDuration
number
status
enum<string>
Available options:
queued,
prepped,
draft,
processing,
finalizing,
completed,
invalid,
expired,
failed,
error
projectType
enum<string>
Available options:
clipping,
captions,
reframe,
dubbing,
transcription
source
enum<string>
Available options:
Upload,
Youtube,
Vimeo,
TwitchVod,
Twitter,
RumbleEmbed,
Generic
genre
enum<string>
default:talking
Available options:
talking,
screenshare,
gaming
topics
string[]
clipDurations
integer[][]
selectedStart
number | null
selectedEnd
number | null
reframeClips
boolean
exportResolution
integer
exportOrientation
enum<string>
Available options:
landscape,
portrait,
square
captionsPreset
string | null
enableCaptions
boolean
enableEmojis
boolean
enableHighlights
boolean
language
string | null
dubbingLanguage
string | null
translateTranscription
boolean
translationLanguages
string[]
transcriptionScript
enum<string>
default:native
Available options:
native,
roman
metadata
object
urls
object
createdAt
integer
updatedAt
integer