Subtitles

There can come a time in a household when having closed captions or subtitles on movies is very desirable. We have reached that time.

We watch our movies using a Jellyfin server running on our low-end Network Attached Storage (NAS) device, which does not have enough performance to merge subtitles with video in real time. The Jellyfin client running on our Roku TV handles only a limited set of subtitle streams and formats. That leaves us one choice: a separate SRT text file for each video file.

So our task has been to get good SRT subtitle files for our movie collection.

A more powerful server or a more flexible player would offer more choices, but some of the issues would be the same.

In our case, the originals of our movies are mostly on DVDs, but a few are VHS tapes. The DVDs were ripped using HandBrake. The VHS tapes were captured from the VHS player’s composite output, which fed a composite-to-HDMI converter which, in turn, fed an HDMI capture device.

One more item that affects my workflow is that I use a Macintosh, which has a different set of GUI tools than the more common PC.

Subtitles

DVDs may contain a closed caption (text) subtitle stream. Or they may contain a VobSub track which is basically a video stream containing rasterized text.

If a DVD contains either kind of track, that track is the best source for creating SRT files: it is typically accurate and properly synchronized with the movie.

When first capturing the files, I only worried about getting the video and audio tracks. Later, when subtitles were desired, the DVDs with closed caption subtitles were re-ripped and the closed captions were converted to SRT files. Unfortunately, it seems that only about half of our DVDs had closed caption data.

Later, as subtitles became even more desirable for us, I found a way to convert the VobSub tracks. Then the DVDs with VobSub tracks were scanned and converted to SRT files.

First Observation

I wish that I’d grabbed all available tracks the first time I ripped each DVD. While ripping a DVD is not horribly slow, it does take a while, and doing it multiple times for each movie is not desirable. The additional subtitle tracks don’t make the final video file much larger, so leaving them out doesn’t save much storage and only adds work later.

Easiest – DVD Contains a Closed Caption Track

If the DVD contains a closed caption (CC) track, that track can be extracted from the ripped video file using either ffmpeg or a graphical app. With ffmpeg, the command line is simply:

ffmpeg -i video.mp4 video.srt
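
For a whole folder of ripped movies, the same call can be scripted. The following is only a minimal Python sketch: it assumes ffmpeg is on the PATH, that the rips are .mp4 files, and that the text track is the first subtitle stream in each file (selected with ffmpeg's "-map 0:s:0"). The function names are made up for illustration.

```python
import subprocess
from pathlib import Path


def srt_command(video: Path, stream: int = 0) -> list[str]:
    """Build the ffmpeg argument list that writes one subtitle stream as SRT."""
    return [
        "ffmpeg",
        "-nostdin",                # do not wait for keyboard input
        "-n",                      # never overwrite an existing output file
        "-i", str(video),
        "-map", f"0:s:{stream}",   # assumption: the Nth subtitle stream is the text track
        str(video.with_suffix(".srt")),
    ]


def extract_all(folder: Path) -> None:
    """Run ffmpeg once for every .mp4 in folder that has no .srt file yet."""
    for video in sorted(folder.glob("*.mp4")):
        if video.with_suffix(".srt").exists():
            print(f"{video.name}: skipped, .srt already present")
            continue
        result = subprocess.run(srt_command(video), capture_output=True)
        status = "ok" if result.returncode == 0 else "ffmpeg failed (no text track?)"
        print(f"{video.name}: {status}")
```

Calling extract_all(Path("Movies")) would then leave a matching .srt next to each video that has a convertible text track. A VobSub track is an image stream, so ffmpeg cannot write it out as SRT this way.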

The Subler app is also capable of extracting a text subtitle track from a video file.

Almost as Easy – DVD Has VobSub Tracks

The Subler app has the ability to use optical character recognition (OCR) to convert the VobSub stream into a closed caption text track.

The procedure was not obvious to me, but it is not difficult.

First you need to set the “VobSub to Tx3g” check box in the Advanced tab of Subler’s preferences (called settings on newer macOS versions). Then, in the OCR tab of the settings, get the language data file for the language(s) of interest. This is a one-time setup. Then for each video file:

  1. In Subler, create a new file.
  2. Drag the video to the new file window and drop it in.
  3. An option box appears. Make sure that the check box for converting the VobSub to Tx3g and including it is checked.
  4. Save the “new” video file somewhere to trigger the conversion. The OCR conversion will take a few minutes.
  5. Once converted, select the Tx3g track and under the “File” menu select “Export...”.

The documentation suggests that you review the resulting VTT text file to verify the accuracy of the conversion but, so far, my results have not required touch-up editing. VTT is a similar format to SRT and can easily be converted to SRT with a little scripting.
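
As an illustration of how little scripting is needed, here is a minimal Python sketch of such a conversion. It assumes plain WebVTT input: the header, NOTE, and STYLE blocks are dropped, cue settings after the timing are ignored, and any styling tags inside the cue text are passed through untouched. The function names are made up for illustration.

```python
import re

# A WebVTT cue timing line; the hours field is optional in WebVTT.
TIMING = re.compile(
    r"^((?:\d+:)?\d{2}:\d{2})\.(\d{3})\s+-->\s+((?:\d+:)?\d{2}:\d{2})\.(\d{3})"
)


def _srt_time(clock: str, millis: str) -> str:
    """Convert 'MM:SS' or 'H:MM:SS' plus milliseconds to 'HH:MM:SS,mmm'."""
    parts = clock.split(":")
    if len(parts) == 2:          # hours were left out
        parts.insert(0, "00")
    parts[0] = parts[0].zfill(2)
    return ":".join(parts) + "," + millis


def vtt_to_srt(vtt_text: str) -> str:
    """Return SRT text for the cues found in a WebVTT document."""
    cues = []
    # In both formats, blocks are separated by blank lines.
    for block in re.split(r"\n\s*\n", vtt_text.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        # Find the timing line; header, NOTE and STYLE blocks have none.
        for i, line in enumerate(lines):
            match = TIMING.match(line.strip())
            if match:
                break
        else:
            continue
        start = _srt_time(match.group(1), match.group(2))
        end = _srt_time(match.group(3), match.group(4))
        text = "\n".join(lines[i + 1:])
        # SRT cues are numbered from 1 and use a comma before the milliseconds.
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)
```

Reading a file, passing its text through vtt_to_srt(), and writing the result with an .srt extension is all that remains.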

Harder – No CC or VobSub Tracks

If this is a video captured from a VHS tape or an out-of-copyright movie downloaded from Archive.org or another site, there will be no CC or VobSub tracks. If this movie was ever released on a DVD, go out and buy it.

Even if the DVD for that movie does not have a CC or VobSub track it will have far better video and audio quality than a VHS capture.

If the DVD has CC or VobSub tracks, then you are back to the easiest or almost as easy steps above. If, as in the case of some of our movies, there is no CC or VobSub track on the DVD, then things get much harder.

Much Harder – Find a Good Subtitle File Online

If logged into Jellyfin using the web interface as an administrator, and if Jellyfin has been configured with your opensubtitles.org account information, you can search for appropriate subtitles very easily.

You can, of course, simply do an Internet search for subtitles for your movie. Most hits seem to be on opensubtitles.org, but the OpenSubtitles website does give you better search options than Jellyfin does.

Unfortunately, different releases of a movie over time or in different markets may have slightly different dialog or timing. Many of the subtitle files seem to be created by low-budget operations or volunteers and are of poor quality. I have not figured out the optimal way to find the best subtitle file if multiple versions are available. I generally download several, spot-check them for accuracy in dialog and timing, and keep the best.

The best will, almost always, be mediocre. At a minimum, dialog will be time shifted from my copy of the movie. Almost always there is missing dialog, garbled dialog, and sound-alike words in place of the ones actually spoken in the movie.

The Subshift 2 app makes it pretty easy to shift the timing of the whole file by a fixed amount. If the video runs at a different speed, a single shift is not enough: the error grows as the movie goes on, so the time of each subtitle needs to be adjusted individually. I have not found an easy GUI-based app for that, so I wrote myself a script.
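
One way such a script could look, as a minimal Python sketch: it treats the correction as a linear mapping, new time = old time x scale + offset, applied to every timestamp on a timing line. The function names are made up for illustration, and the scale and offset are estimated from two subtitles whose correct times were noted by hand while watching.

```python
import re

STAMP = re.compile(r"(\d{2,}):(\d{2}):(\d{2}),(\d{3})")


def _to_ms(h, m, s, ms):
    """Convert the four captured timestamp fields to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def _from_ms(total):
    """Format milliseconds as an SRT timestamp, clamping negatives to zero."""
    total = max(0, round(total))
    s, ms = divmod(total, 1000)
    m, s = divmod(s, 60)
    h, m = divmod(m, 60)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def fit_two_points(old_a, new_a, old_b, new_b):
    """Return (scale, offset_ms) that maps old_a to new_a and old_b to new_b."""
    scale = (new_b - new_a) / (old_b - old_a)
    return scale, new_a - old_a * scale


def retime(srt_text, scale=1.0, offset_ms=0):
    """Map every timestamp t (in ms) on a timing line to t * scale + offset_ms."""
    def fix_line(line):
        if "-->" not in line:        # leave cue numbers and dialog alone
            return line
        return STAMP.sub(
            lambda m: _from_ms(_to_ms(*m.groups()) * scale + offset_ms), line
        )
    return "\n".join(fix_line(line) for line in srt_text.split("\n"))
```

Picking one subtitle near the start and one near the end for fit_two_points() gives the most reliable scale, since a small error in either noted time is spread over the whole running time.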

Fortunately, SRT files are a pretty simple text format so you can use any text editor to make corrections. If I have the time and desire, I will use a text editor to make corrections and use Subshift to view the results. Between my poor transcription speed and needing multiple passes to get the timing and dialog correct, it can take me several multiples of the video run time to correct a typical downloaded SRT file.

Very Hard – Auto Generation of Subtitles

If Internet searches come up empty then creating a subtitle file from scratch is the only option. This can happen for many older films that are not popular nowadays.

For example, I have a couple of DVD “value packs” of older movies where none of the films have subtitles embedded, and I have been unable to find subtitle files online.

There are a bunch of websites that claim to be able to generate subtitles for videos. Many bandy the “AI” moniker. Maybe they do use AI, maybe they don’t. But while I have the legal right to make digital copies of a movie for my own backup purposes, I don’t believe I have the legal right to upload it to a website for any reason. So I have not used any of these services and cannot comment on how well they work.

The paid macOS VideoSubs app uses the speech recognition built into the operating system to generate subtitles. It generates VTT text files, but those are very easy to convert to SRT files with a little scripting.

I have found that this app works best with a single voice with excellent enunciation, like a narrator on a documentary film. But it can create a rough outline for a more typical movie.

In either case, it will take a lot of editing to make a usable subtitle file. The only films I have completed this process on are a couple of home videos and one industrial film from the early 1930s. For a full-length feature film with multiple voices over a background music track, the generated text may not be good enough even as a starting point.

I suspect that as AI capabilities get better on home computers, the options will open up. But at present, this is a fairly difficult way to get a good subtitle track.

Hardest – Manually Create Subtitles

SRT files are simply text files. With a text editor and any video player, it is possible to manually create a subtitle file. Perhaps if you are trained as a court reporter and can transcribe spoken language in real time, you can do this fairly easily. But for me, and I suspect most people, this is a horribly long and tedious process.

I have only ever completed this process on a short (10 minute) home video. I would not even consider trying this on a movie.
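
For reference, the format itself is small: each subtitle is a sequence number, a timing line, and one or more lines of text, followed by a blank line. A minimal Python sketch, with made-up helper names, that turns hand-noted start and end times (in seconds) into that layout:

```python
def srt_stamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    s, ms = divmod(total_ms, 1000)
    m, s = divmod(s, 60)
    h, m = divmod(m, 60)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def build_srt(cues) -> str:
    """cues: iterable of (start_seconds, end_seconds, text) in playback order."""
    blocks = []
    for number, (start, end, text) in enumerate(cues, start=1):
        # Sequence number, timing line, then the dialog itself.
        blocks.append(f"{number}\n{srt_stamp(start)} --> {srt_stamp(end)}\n{text}\n")
    return "\n".join(blocks)


# Hypothetical notes taken while watching a home video.
notes = [
    (1.5, 4, "Hello."),
    (65.25, 67, "Second cue"),
]
print(build_srt(notes))
```

The tedious part is noting the times and typing the dialog; producing the file from those notes is the easy step.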

Jubler seems to be the best free subtitle editing app, although it is ugly and its mplayer integration is a little technical to set up. Once the mplayer integration is set up, there is a pretty nice visual display of the sound track that really helps in getting the timing and dialog correct.