Automate Transcribing your YouTube Videos with the Shipyard API
Last week, we released a blog post discussing the new functionality of our API. In that post, we walked through a workflow that emailed a CSV generated by a Snowflake query. The Shipyard API lets you change variables at runtime, so a single Fleet can handle many variations of a workflow instead of requiring multiple Fleets that do the same thing. That use case is a popular one among our customers.
In this post, I am excited to share how we applied that API enhancement to a project we are currently working on inside Shipyard. If you didn't already know, we have a YouTube channel where our team shows off use cases and tutorials for Shipyard. (Shameless plug.)
YouTube provides an automatically generated transcription, but we've noticed that it doesn't always do a great job. The accuracy is okay most of the time, but it does no cleanup, such as removing filler words like "uh" or adding punctuation.
We know that adding accurate transcriptions to our YouTube videos makes them easier to watch without sound and can boost their performance in the YouTube algorithm. We manually added transcripts to a couple of our videos using OpenAI's Whisper and saw that they were much more accurate than the ones YouTube generates automatically.
With that in mind, I wanted to create a system where I could send through the ID of a video when we upload it to YouTube and have the transcriptions from Whisper be automatically created and added to the video. Since this is a process that will be continually repeated with minimal changes, I knew this would be a perfect use case for our new API endpoint. Let's dive in and see how I built it. Check out a video version of this post below:
Building the Fleet Template
Similar to the blog post from last week, we need to start by building a Fleet in Shipyard that we can use to send parameters through. The inputs aren't important at this point for this Fleet. The Fleet will need to accomplish 3 things:
- Download the audio of the YouTube video.
- Transcribe the video using Whisper's API.
- Upload the created transcription to YouTube as a caption.
Thankfully, Shipyard has low-code Blueprints pre-made to handle the first two tasks. We will need to write a Python script that handles uploading captions to YouTube. You can see the code for that task along with the rest of the Fleet's setup in the YAML below:
name: Youtube Flow
vessels:
  Upload Captions to YouTube:
    source:
      language: PYTHON
      version: "3.9"
      file:
        name: test.py
        content: |-
          print('Importing')
          import os
          import googleapiclient.discovery
          from google_auth_oauthlib.flow import InstalledAppFlow
          from google.oauth2.credentials import Credentials
          import googleapiclient.errors
          from googleapiclient.http import MediaFileUpload
          from google.auth.transport.requests import Request
          print('Finished Importing')
          video_id = os.environ.get('VIDEO_ID')
          SCOPES = ['https://www.googleapis.com/auth/youtube.force-ssl']
          TRANSCRIPT_FILE = 'transcription.txt'
          VIDEO_ID = video_id
          YOUTUBE_TRANSCRIPT_NAME = 'Shipyard Transcription'
          CLIENT_ID = os.environ.get('GOOGLE_CLIENT_ID')
          CLIENT_SECRET = os.environ.get('GOOGLE_CLIENT_SECRET')
          def authenticate():
              creds = None
              refresh_token = os.environ.get('REFRESH_TOKEN')
              if refresh_token:
                  creds = Credentials.from_authorized_user_info({
                      'refresh_token': refresh_token,
                      'client_id': CLIENT_ID,
                      'client_secret': CLIENT_SECRET,
                  }, SCOPES)
                  creds.refresh(Request())
              else:
                  flow = InstalledAppFlow.from_client_secrets_file('google_creds.json', SCOPES)
                  creds = flow.run_local_server(port=0)
                  print('Your refresh token is: {}'.format(creds.refresh_token))
              return googleapiclient.discovery.build('youtube', 'v3', credentials=creds)
          youtube = authenticate()
          # Read transcript file
          with open(TRANSCRIPT_FILE, 'r') as file:
              transcript = file.read()
          # Set up the media file upload
          media = MediaFileUpload(TRANSCRIPT_FILE, mimetype='application/octet-stream')
          # Call the captions.insert method
          request = youtube.captions().insert(
              part="snippet",
              body={
                  "snippet": {
                      "videoId": VIDEO_ID,
                      "language": "en",
                      "name": YOUTUBE_TRANSCRIPT_NAME,
                      "isDraft": False
                  }
              },
              media_body=media
          )
          response = request.execute()
          print(response)
      file_to_run: test.py
      environment:
        - name: GOOGLE_APPLICATION_CREDENTIALS
          value: SHIPYARD_HIDDEN
        - name: REFRESH_TOKEN
          value: SHIPYARD_HIDDEN
        - name: VIDEO_ID
          value: SHIPYARD_HIDDEN
        - name: GOOGLE_CLIENT_ID
          value: SHIPYARD_HIDDEN
        - name: GOOGLE_CLIENT_SECRET
          value: SHIPYARD_HIDDEN
      packages:
        - name: google-api-python-client
          version: ==1.7.2
        - name: google-auth
          version: ==1.8.0
        - name: google-auth-httplib2
          version: ==0.0.3
        - name: google-auth-oauthlib
          version: ==0.4.1
      type: CODE
    guardrails:
      retry_count: 0
      retry_wait: 0s
      runtime_cutoff: 1h0m0s
    notifications:
      emails:
      after_error: true
      after_on_demand: false
  Whisper Transcribe Audio:
    source:
      blueprint: Whisper - Transcribe Audio
      inputs:
        WHISPER_DESTINATION_FILE_NAME: transcription.txt
        WHISPER_FILE: youtube.webm
      type: BLUEPRINT
    guardrails:
      retry_count: 0
      retry_wait: 0s
      runtime_cutoff: 1h0m0s
    notifications:
      emails:
      after_error: true
      after_on_demand: false
  Youtube Download Video To Shipyard:
    source:
      blueprint: Youtube - Download Video to Shipyard
      inputs:
        YOUTUBE_DOWNLOAD_TYPE: audio
        YOUTUBE_FILE_NAME: youtube.webm
        YOUTUBE_VIDEO_ID: EkWW4tlzjMU
      type: BLUEPRINT
    guardrails:
      retry_count: 0
      retry_wait: 0s
      runtime_cutoff: 1h0m0s
    notifications:
      emails:
        - blake@shipyardapp.com
      after_error: true
      after_on_demand: false
connections:
  Whisper Transcribe Audio:
    Upload Captions to YouTube: SUCCESS
  Youtube Download Video To Shipyard:
    Whisper Transcribe Audio: SUCCESS
notifications:
  emails:
  after_error: true
  after_on_demand: false
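The connections block at the end of the YAML is what chains the three Vessels together. Reading each entry as "upstream Vessel: {downstream Vessel: required status}" (an interpretation based on the step order described earlier, not on a documented schema), the resulting run order can be sketched in Python. The run_order function is a hypothetical helper for illustration only:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Copied from the `connections` block of the Fleet YAML.
connections = {
    "Whisper Transcribe Audio": {"Upload Captions to YouTube": "SUCCESS"},
    "Youtube Download Video To Shipyard": {"Whisper Transcribe Audio": "SUCCESS"},
}


def run_order(connections):
    """Return Vessel names ordered so each one follows its upstream Vessel."""
    sorter = TopologicalSorter()
    for upstream, downstreams in connections.items():
        sorter.add(upstream)  # register the upstream even if nothing precedes it
        for downstream in downstreams:
            sorter.add(downstream, upstream)  # downstream waits on upstream
    return list(sorter.static_order())


order = run_order(connections)
print(order)
```

Under that reading, the download runs first, the Whisper transcription second, and the caption upload last, which matches the three tasks listed for the Fleet.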
Choosing Fields to Change at Runtime
The beauty of Shipyard's new Trigger Fleet Run API endpoint really shows up here. Before this endpoint was available, I would have had to create a separate Fleet for every video, or go back and edit the Fleet's inputs by hand for each video that I wanted to caption. Now I can send the parameters I want to change to the endpoint, and those values are overridden at runtime for me.
The variables that we want to change at runtime are:
Youtube Download Video To Shipyard:
- YOUTUBE_VIDEO_ID
Upload Captions to YouTube:
- VIDEO_ID
Run the Fleet with the API
Using the sample code in the Shipyard documentation as a starting point, we can now run the template Fleet we built earlier with the custom variables we chose above. The overrides are passed as a dictionary:
json_data = {
    "vessel_overrides": [
        {
            "name": "Youtube Download Video To Shipyard",
            "environment_variable_overrides": {
                "YOUTUBE_VIDEO_ID": "mcoQPPHdsPo",
            }
        },
        {
            "name": "Upload Captions to YouTube",
            "environment_variable_overrides": {
                "VIDEO_ID": "mcoQPPHdsPo"
            }
        }
    ]
}
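Because the same video ID has to reach both Vessels, it can help to build the payload from a single value. The helper below is a minimal sketch: build_overrides is a hypothetical name, not part of the Shipyard API, and it assumes the Vessel and variable names match the Fleet shown earlier.

```python
def build_overrides(video_id: str) -> dict:
    """Return the override payload for one video (hypothetical helper).

    Assumes the Vessel names and variable names used in the template Fleet.
    """
    if not video_id:
        raise ValueError("video_id must be a non-empty string")
    return {
        "vessel_overrides": [
            {
                "name": "Youtube Download Video To Shipyard",
                "environment_variable_overrides": {"YOUTUBE_VIDEO_ID": video_id},
            },
            {
                "name": "Upload Captions to YouTube",
                "environment_variable_overrides": {"VIDEO_ID": video_id},
            },
        ]
    }


# Same payload as the hand-written dictionary, built from one ID.
json_data = build_overrides("mcoQPPHdsPo")
```

Calling the helper once per video ID gives one payload per Fleet run, and the two Vessels can never end up with mismatched IDs.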
We will now send the json_data variable to the Shipyard API, which triggers the Fleet and replaces the template values from earlier with the custom ones. To run the code below, fill in your Shipyard API key and your organization, project, and Fleet IDs:
import requests
headers = {
    'Accept': 'application/json',
    'X-Shipyard-API-Key': 'YOUR_API_KEY',
    'Content-Type': 'application/json',
}
response = requests.post(
    'https://api.app.shipyardapp.com/orgs/<YOUR_ORG_ID>/projects/<YOUR_PROJECT_ID>/fleets/<YOUR_FLEET_ID>/fleetruns',
    headers=headers,
    json=json_data,
)
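The call above does not check whether the request succeeded. A slightly more defensive variant is sketched below. It swaps requests for Python's standard-library urllib so it runs without extra packages; the function names are assumptions for illustration, while the URL pattern and headers are the ones used above.

```python
import json
import urllib.request

API_ROOT = "https://api.app.shipyardapp.com"


def build_fleet_run_request(org_id, project_id, fleet_id, payload, api_key):
    """Build (but do not send) the POST request that triggers a Fleet run."""
    url = (
        f"{API_ROOT}/orgs/{org_id}/projects/{project_id}"
        f"/fleets/{fleet_id}/fleetruns"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Accept": "application/json",
            "X-Shipyard-API-Key": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def trigger_fleet_run(request, timeout=30):
    """Send the request; urlopen raises HTTPError on 4xx/5xx responses."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

Building the request separately from sending it makes the URL and payload easy to inspect before a run is started. Reading the API key from an environment variable rather than pasting it into the script also keeps it out of version control.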
The Fleet runs with the variables we supplied, and the resulting captions are uploaded to the video. With this setup, we can keep running the exact same Fleet with different inputs, so we no longer need to duplicate a Fleet for every video that needs captions.