A script is a file containing the text of the subtitles you want to display along with the exact times that each subtitle is to be displayed on the screen. Different subtitling programs use different file formats for their scripts. The subtitling program that we use, called transcode, supports a single script format, called PPML. Unfortunately, nobody outside of the Linux world uses PPML scripts, so we will first need to convert scripts to PPML format before we can use them. Don't worry about this too much; it is all covered in the following sections.
There are basically three ways to obtain a script for a DVD: download one from the internet, rip the subtitles from the DVD itself, or make your own.
Occasionally it is possible to find the scripts that you want on the internet. For example, scripts for many of the most popular anime titles are available at http://www.scriptclub.org/.
Because of the great diversity of users on the internet, the various scripts available on the internet come in a wide variety of file formats. The best strategy here is to use mplayer to convert the script to SRT format, and then afterwards convert the SRT script to PPML format as described in this section. Mplayer supports about a dozen different formats, which are listed here. You can convert any of these formats to SRT by using the -dumpsrtsub option in mplayer. For example,
mplayer -sub example.ssa -dumpsrtsub -dvd 1
The conversion process is not perfect because mplayer (and transcode for that matter) only supports one subtitle stream. Newer versions of mplayer make some attempt to simulate multiple subtitle streams within a single stream, but in most cases overlapping subtitles still require manual intervention to reconstruct. Also, mplayer does not support any formatting capabilities, so as a rule all formatting in the original script is lost as well.
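Because mplayer writes the converted script to a file named dumpsub.srt in the current directory, it can be convenient to wrap the command above in a small shell function that renames the result. The following is only a sketch: convert_to_srt is a made-up helper name, and it assumes the same -dvd playback source as the example above.

```shell
#!/bin/sh
# Sketch: convert one subtitle script to SRT with mplayer's -dumpsrtsub option.
# convert_to_srt is a hypothetical helper, not something shipped with mplayer.
convert_to_srt() {
    input=$1          # script in a format mplayer can read, e.g. example.ssa
    output=$2         # file name under which to keep the SRT result
    title=${3:-1}     # DVD title, same role as "-dvd 1" in the text

    # Remove any stale result so an old file is never mistaken for a new one.
    rm -f dumpsub.srt

    # Same command as in the text; the result lands in ./dumpsub.srt.
    mplayer -sub "$input" -dumpsrtsub -dvd "$title"

    # Keep the result under the requested name, or report failure.
    if [ -f dumpsub.srt ]; then
        mv dumpsub.srt "$output"
    else
        echo "no dumpsub.srt was produced for $input" >&2
        return 1
    fi
}
```

For example, convert_to_srt example.ssa example.srt would leave the converted script in example.srt instead of dumpsub.srt, so a later conversion does not overwrite it.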
A lot of DVDs are starting to include subtitles on the disc itself. If your DVD comes with subtitles in the language you want, then you can rip those subtitles off of the disc and essentially get yourself a script for free. This procedure is useful even if the subtitles on the DVD are not in the same language as the language you want to subtitle in, since the timings on the disc will still be accurate even if the language is wrong.
The mplayer program comes with a subrip.c program in the TOOLS/ subdirectory, which is what we will use to rip the subtitles. You have to change a couple of lines in this program in order to get it to work:
Change the #define GOCR_PROGRAM ... line to point to the actual installed location of the gocr program on your system.
In the sprintf(cmd, GOCR_PROGRAM" ... line, get rid of all the -m options.
Then compile it according to the instructions at the top of the file. You also need to have gocr installed.
Now you are ready to rip the subtitles off of your DVD:
rm frameno.avi
mencoder -dvd T -vobsubout subtitles -vobsuboutindex 0 -sid N -o frameno.avi -ovc frameno -nosound
subrip subtitles 0 script.srt
where T is the title you want to rip, N is the subtitle number you want (see mplayer manpage), and script.srt is the output file for the subtitles.
Note that the gocr program is not very good at OCR, so you will definitely have to edit script.srt with a text editor and correct all of the character recognition mistakes that gocr makes. If the subtitles are not in a language that at least uses the Roman alphabet, then the gocr output is completely useless, but the times listed in the file should still be accurate.
The script.srt file is in SubRip (SRT) format, which can be converted to PPML as described in the conversion section.
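Before converting, a quick sanity check of the SRT file can be worthwhile. An SRT file is a sequence of entries, each consisting of a number, a timing line of the form HH:MM:SS,mmm --> HH:MM:SS,mmm, one or more lines of text, and a blank line. The sketch below counts the timing lines in a file; sample_srt and count_srt_entries are hypothetical helper names, and the sample text is made up.

```shell
#!/bin/sh
# Print a tiny made-up SRT script with two entries, to show the layout.
sample_srt() {
    cat <<'EOF'
1
00:00:01,000 --> 00:00:03,500
Hello there.

2
00:00:04,000 --> 00:00:06,250
This is the second subtitle.

EOF
}

# Count the entries in an SRT file by counting its timing lines.
# The pattern is not anchored at the end so that DOS line endings still match.
count_srt_entries() {
    grep -c -E '^[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3} --> [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}' "$1"
}
```

Running sample_srt > sample.srt followed by count_srt_entries sample.srt prints 2. Comparing the count with the number of the last entry in the file is a cheap way to notice a timing line that was damaged during hand editing.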
In order to make your own script, you need to record all the lines of dialogue in the movie, translate them, and then record the start time and end time of each line so that you know when to have the subtitle appear and disappear. While the translation process is straightforward (assuming you know the language...), the task of timing the dialogue is almost impossible without specialized software to help you out. Here you have two options. You can use Windows software, which is what everybody else does, or you can be brave and try to do it in Linux as I describe below.
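To make the timing step concrete: each translated line ends up as one entry with a start time and an end time. The sketch below formats such an entry in SRT, the intermediate format used elsewhere in this guide. The names to_srt_time and srt_entry are hypothetical helpers, and the times are assumed to be given in milliseconds from the start of the title.

```shell
#!/bin/sh
# Convert a time in milliseconds to the HH:MM:SS,mmm form used in SRT files.
# Pass plain decimal numbers without leading zeros.
to_srt_time() {
    ms=$1
    h=$((ms / 3600000))        # whole hours
    m=$((ms / 60000 % 60))     # minutes within the hour
    s=$((ms / 1000 % 60))      # seconds within the minute
    f=$((ms % 1000))           # remaining milliseconds
    printf '%02d:%02d:%02d,%03d\n' "$h" "$m" "$s" "$f"
}

# Print one SRT entry: srt_entry NUMBER START_MS END_MS TEXT
srt_entry() {
    printf '%s\n%s --> %s\n%s\n\n' \
        "$1" "$(to_srt_time "$2")" "$(to_srt_time "$3")" "$4"
}
```

For example, srt_entry 1 83500 86000 "Where are we?" prints an entry numbered 1 whose timing line reads 00:01:23,500 --> 00:01:26,000, followed by the text and a blank line.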
You need to capture the audio from your DVD into a WAV file in order to be able to use the audio for timing. To do this, put the DVD into your drive and type:
mplayer -dvd 1 -vc null -vo null -ao pcm
to extract the audio from title number 1 into the file audiodump.wav (for another title number, change the number 1 to whatever number it is that you want). If your DVD title has multiple audio tracks, you may need to use some combination of the -alang or -aid options to get mplayer to extract the right audio track. You might also need to use the -chapter option if you only want to extract specific chapters.
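The options mentioned above can be combined as in the following sketch, which only prints the mplayer command line so that you can inspect it before running it. The name audio_dump_cmd is a hypothetical helper, and the audio stream id and chapter range in the usage example are placeholders, not values for any particular disc.

```shell
#!/bin/sh
# Print (do not run) an mplayer command line that dumps DVD audio to
# audiodump.wav. Usage: audio_dump_cmd TITLE [AUDIO_ID] [CHAPTERS]
audio_dump_cmd() {
    title=$1       # DVD title number
    aid=$2         # optional audio stream id, passed to -aid
    chapters=$3    # optional chapter or chapter range, passed to -chapter

    # Base command from the text above.
    cmd="mplayer -dvd $title -vc null -vo null -ao pcm"

    # Append the optional selections only when they were given.
    if [ -n "$aid" ]; then
        cmd="$cmd -aid $aid"
    fi
    if [ -n "$chapters" ]; then
        cmd="$cmd -chapter $chapters"
    fi

    printf '%s\n' "$cmd"
}
```

For example, audio_dump_cmd 2 129 3-5 prints mplayer -dvd 2 -vc null -vo null -ao pcm -aid 129 -chapter 3-5. The mplayer manpage describes which audio ids a given disc is likely to use.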
To time the dialogue in Windows, use the freeware program Sub Station Alpha to time the dialogue from the above .wav file as described in the Sub Station Alpha documentation. The Karinkuru Guide To Subtitling also has many useful tips on how to use Sub Station Alpha for timing.
If you use Sub Station Alpha for timing then you will end up with a script in Sub Station Alpha format. You can then use mplayer to convert the resulting script into SRT format, as described here.
I have a separate page explaining the setup that I use for WAV timing in Linux. This procedure does work, and it is very easy for me to use, but most people will probably find it too hard to get running.
Another alternative is to use xste, which claims to perform WAV timing, but I have never gotten this software working.