Linux Digital Fansubbing Guide

David Jao
djao@dominia.org

July 3, 2003

Revision History
Revision v2.0, 2003-07-03
Major rewrite, focusing on transcode and DVD burning
Revision v1.3, 2002-10-19
WAV timing in Linux is now possible.

This article describes how to fansub DVD video on a Linux platform.

1. Introduction

2. Overview of the process

3. Scripts

4. Subtitling

5. Mastering the DVD

6. Questions & Answers

1. Introduction

1.1 What is this guide about?

Fansubbing, a.k.a. fan subtitling, is the process of taking a video or movie in a foreign language and adding your own subtitles to it so that people who don't know the original language can watch and enjoy the work. This guide describes how to produce DVD-to-DVD fansubs on the Linux platform using free software. Because I am only familiar with NTSC video, the discussion is restricted to the NTSC format.

1.2 System requirements

Here is what you need to get started.

Hardware requirements

You need a computer with a DVD burner. My expertise is with DVD-R, because that's what I use, but most of it applies equally well to DVD+R once you have dvd+rw-tools working (in fact only the last step is different). DVD-RAM is not supported by most DVD players, and is not discussed here.

If you don't have a DVD burner, you might want to read the horribly outdated previous version of this guide, which focuses on fansubbing to CD media. The current version and all future versions of the guide will cover DVD media only.

Your computer needs to have reasonably modern CPU speed and disk capacity. Reasonable minimums might be a Pentium III 500MHz CPU and 10 GB of free disk space (this is just an estimate; I have never tried actually using such a system for fansubbing). There is no upper limit on what is useful--video work is a very demanding application capable of using up whatever resources you can throw at it.

Software requirements

To keep the scope of the document reasonable, this guide assumes that you use Red Hat Linux or Fedora Linux.

Install the appropriate apt package matching your Red Hat or Fedora distribution, and run as root the command

apt-get install mplayer transcode dvdrecord mjpegtools

Fedora users should replace dvdrecord with cdrecord. You will also need to install the dvdauthor program.

1.3 Copyright & author information

Copyright © 2003 David Jao <djao@dominia.org>.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts.

All comments, questions, and suggestions are welcome and should be sent to djao@dominia.org.

1.4 Legal disclaimer

Some of the programs whose usage is documented in this guide can optionally make use of CSS decryption routines for decrypting the contents of encrypted DVDs. Although CSS decryption software is legal in most countries, its legal status in the US is in doubt in light of the MPAA v. Reimerdes, Corley and Kazan court ruling. THE AUTHOR OF THIS GUIDE DOES NOT SANCTION THE ILLEGAL USE OF CSS DECRYPTION SOFTWARE IN THE UNITED STATES. READERS WHO ARE SUBJECT TO US LAW MUST AGREE TO LIMIT THE APPLICATION OF THE TECHNIQUES IN THIS GUIDE TO UNENCRYPTED OR LAWFULLY ACCESSIBLE DISCS ONLY.

Regardless of the legal status of CSS decryption, the copyright laws in most countries do not allow unlicensed copying of DVD videos except in very limited "fair use" contexts. THE AUTHOR OF THIS GUIDE DOES NOT SANCTION THE ILLEGAL COPYING OF COPYRIGHTED DVDS.

2. Overview of the process

Making a DVD fansub consists of the following steps:

  1. Rip the DVD video program to the hard disk.
  2. Obtain a subtitle script with timings.
  3. Using the subtitle script, encode a new copy of the video with the subtitles included.
  4. Burn the subtitled video to DVD.

We will give very basic methods for doing all of the above.

2.1 How well does it work?

I am a pretty devoted user of Linux (I have to be, or I wouldn't be writing this guide), but even I have to admit that Linux will always trail other platforms such as Windows and especially Macintosh for multimedia work. Making Linux do fansubbing is like watching a dog walk on its hind legs: one does not marvel that it can be done well, one marvels that it can be done at all.

You will not get fancy cutting edge special effects or large selections of fonts and character sets in Linux. What you do get is rock solid basic text subtitling that looks unbelievably good on a TV screen.

3. Scripts

A script is a file containing the text of the subtitles you want to display along with the exact times that each subtitle is to be displayed on the screen. Different subtitling programs use different file formats for their scripts. The subtitling program that we use, called transcode, supports a single script format, called PPML. Unfortunately, nobody outside of the Linux world uses PPML scripts, so we first need to convert scripts to PPML format before we can use them. Don't worry about this too much; it is all covered in the following sections.

There are basically three ways to obtain a script for a DVD:

3.1 Downloading timed scripts

Occasionally it is possible to find the scripts that you want on the internet. For example, scripts for many of the most popular anime titles are available at http://www.scriptclub.org/.

Because of the great diversity of users on the internet, the scripts available there come in a wide variety of file formats. The best strategy is to use mplayer to convert the script to SRT format, and then convert the SRT script to PPML format as described in section 4.1.

Mplayer supports about a dozen different subtitle formats, which are listed in the mplayer documentation. You can convert any of these formats to SRT by using the -dumpsrtsub option in mplayer. For example,

mplayer -sub example.ssa -dumpsrtsub -dvd 1

The conversion process is not perfect because mplayer (and transcode for that matter) only supports one subtitle stream. Newer versions of mplayer make some attempt to simulate multiple subtitle streams within a single stream, but in most cases overlapping subtitles still require manual intervention to reconstruct. Also, mplayer does not support any formatting capabilities, so as a rule all formatting in the original script is lost as well.
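Since overlapping subtitles need manual attention, it can help to locate them before converting. The following is a minimal Python sketch and not part of mplayer or transcode; the function name find_overlaps and the representation of subtitles as (start, end) pairs in seconds are assumptions made for illustration.

```python
def find_overlaps(cues):
    """Return index pairs (i, j) of cues whose display intervals overlap.

    `cues` is a list of (start, end) times in seconds, sorted by start time.
    This representation is hypothetical; a real script must be parsed first.
    """
    overlaps = []
    for i, (_, end_i) in enumerate(cues):
        for j in range(i + 1, len(cues)):
            start_j, _ = cues[j]
            if start_j >= end_i:
                break  # input is sorted, so no later cue can overlap cue i
            overlaps.append((i, j))
    return overlaps
```

Calling find_overlaps([(0.0, 2.0), (1.5, 3.0), (4.0, 5.0)]) reports that the first two subtitles overlap, which are the ones to merge or re-time by hand.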

3.2 Ripping subtitles off of the DVD

A lot of DVDs are starting to include subtitles on the disc itself. If your DVD comes with subtitles in the language you want, then you can rip those subtitles off of the disc and essentially get yourself a script for free. This procedure is useful even if the subtitles on the DVD are not in the same language as the language you want to subtitle in, since the timings on the disc will still be accurate even if the language is wrong.

The mplayer program comes with a subrip.c program in the TOOLS/ subdirectory, which is what we will use to rip the subtitles. You have to change a couple of lines in this program in order to get it to work:

  1. Change the #define GOCR_PROGRAM ... line to point to the actual installed location of the gocr program on your system.
  2. In the sprintf(cmd, GOCR_PROGRAM" ... line, get rid of all the -m options.

Then compile it according to the instructions at the top of the file.

You also need to have gocr installed.

Now you are ready to rip the subtitles off of your DVD:

rm frameno.avi
mencoder -dvd T -vobsubout subtitles -vobsuboutindex 0 -sid N -o frameno.avi -ovc frameno -nosound
subrip subtitles 0 script.srt
where T is the title you want to rip, N is the subtitle number you want (see mplayer manpage), and script.srt is the output file for the subtitles.

Note that the gocr program is not very good at OCR, so you will definitely have to edit script.srt with a text editor and correct all of the character recognition mistakes that gocr makes. If the subtitles are not in a language that at least uses the Roman alphabet, then the gocr output is completely useless, but the times listed in the file should still be accurate.

The script.srt file is in SubRip (SRT) format, which can be converted to PPML as described in section 4.1.

3.3 Making your own script

In order to make your own script, you need to record all the lines of dialogue in the movie, translate them, and then record the start time and end time of each line so that you know when to have the subtitle appear and disappear. While the translation process is straightforward (assuming you know the language...), the task of timing the dialogue is almost impossible without specialized software to help you out. Here you have two options. You can use Windows software, which is what everybody else does, or you can be brave and try to do it in Linux as I describe below.

Obtaining a WAV file

You need to capture the audio from your DVD into a WAV file in order to be able to use the audio for timing. To do this, put the DVD into your drive and type:

mplayer -dvd 1 -vc null -vo null -ao pcm

to extract the audio from title number 1 into the file audiodump.wav (for another title number, change the number 1 to whatever number it is that you want). If your DVD title has multiple audio tracks, you may need to use some combination of the -alang or -aid options to get mplayer to extract the right audio track. You might also need to use the -chapter option if you only want to extract specific chapters.

Timing in Windows

To time the dialogue in Windows, use the freeware program Sub Station Alpha with the above .wav file, as described in the Sub Station Alpha documentation. The Karinkuru Guide To Subtitling also has many useful tips on how to use Sub Station Alpha for timing.

If you use Sub Station Alpha for timing then you will end up with a script in Sub Station Alpha format. You can then use mplayer to convert the resulting script into SRT format, as described in section 3.1.

Timing in Linux

I have a separate page explaining the setup that I use for WAV timing in Linux. This procedure does work, and it is very easy for me to use, but most people will probably find it too hard to get running.

Another alternative is to use xste, which claims to perform WAV timing, but I have never gotten this software working.

4. Subtitling

4.1 Converting SRT to PPML

If you followed any one of the procedures in the previous section, you should have in your possession a timed script of your subtitles in the SRT format. To convert the SRT script to a PPML script for use in transcode, use the srt-to-ppml.pl Perl program. Note that this program was a weekend hack (and I didn't even spend the whole weekend on it), so don't expect too much out of it.

The program reads in an SRT script from standard input and outputs the converted PPML script on standard output. It takes a single command line argument -framerate for the frame rate of the video you are subtitling. The default value of the frame rate is 29.97, the NTSC video standard. If you are subtitling film source that has undergone 3:2 pulldown, you will need to use a frame rate of 23.976 instead (see section 6.1, How do I tell what frame rate to use?).

In summary, for normal NTSC TV source material, use:

srt-to-ppml.pl -framerate=29.97 < script.srt > script.ppml

For 3:2 pulldown converted film source material, use:

srt-to-ppml.pl -framerate=23.976 < script.srt > script.ppml
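The frame rate matters because SRT scripts store clock times while the conversion needs frame numbers. The Python sketch below illustrates that arithmetic only; it is not the srt-to-ppml.pl program, the function name is made up for this example, and rounding to the nearest frame is an assumption (the real script may truncate instead).

```python
def srt_time_to_frame(timestamp, framerate=29.97):
    """Convert an SRT timestamp 'HH:MM:SS,mmm' to a frame number."""
    clock, millis = timestamp.split(",")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    total_seconds = hours * 3600 + minutes * 60 + seconds + int(millis) / 1000.0
    # Assumption of this sketch: round to the nearest frame.
    return round(total_seconds * framerate)
```

For example, srt_time_to_frame("00:00:10,000") gives 300 at the default 29.97 frame rate, while the same timestamp with framerate=23.976 gives 240, so using the wrong rate makes subtitles drift steadily out of sync.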

The resulting PPML script needs an additional header containing the configuration parameters for the subtitles. This header needs to be prepended to the script, so that it comes before the first subtitle. I typically use the following header:

*subtitle subtitle
1 *subtitle font_dir=/usr/share/mplayer/iso-8859-1/arial-28
2 *subtitle color=290 sat=70 contr=100 outline=5 vfactor=.114 hfactor=.09

which configures the subtitling engine to use yellow 28-point Arial subtitles with 11.4% vertical clearance and 9% horizontal clearance from the screen edges.
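Adding the header can be done in any text editor, but it is easy to script. Here is a minimal Python sketch that places the header shown above in front of a converted script. The function names and the iso-8859-1 file encoding are assumptions made for this example (the encoding is guessed from the font directory name).

```python
HEADER = """\
*subtitle subtitle
1 *subtitle font_dir=/usr/share/mplayer/iso-8859-1/arial-28
2 *subtitle color=290 sat=70 contr=100 outline=5 vfactor=.114 hfactor=.09
"""


def prepend_header(script_text, header=HEADER):
    """Return the PPML script text with the configuration header placed first."""
    if not header.endswith("\n"):
        header += "\n"
    return header + script_text


def add_header_to_file(in_path, out_path, header=HEADER):
    """Read a PPML script from in_path and write header + script to out_path."""
    # Assumption: the script is stored as iso-8859-1 text.
    with open(in_path, "r", encoding="iso-8859-1") as src:
        body = src.read()
    with open(out_path, "w", encoding="iso-8859-1") as dst:
        dst.write(prepend_header(body, header))
```

Write to a new file rather than overwriting script.ppml, so that the conversion step can be rerun without stacking a second header on top of the first.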

4.2 Ripping the DVD

You should use tccat, which is part of the transcode package. Use it like this:

tccat -i /dev/dvd -T A,B-C,D > video.vob
where A is the title number, B-C is the range of chapters to rip, and D is the angle number.

Don't worry about selecting the audio track; the tccat program rips all of the audio tracks for you at once.

4.3 Encoding the subtitles

For this task we will use transcode together with our newly made PPML script file. Note that this step only encodes video, not audio. See the next section for directions regarding the audio.

The basic formula is something like:

transcode -J                             \
ivtc,32detect=force_mode=3,decimate,\
subtitler="subtitle_file=script.ppml     \
    color_depth=32"                      \
    -i video.vob                         \
    -M 0                                 \
    -o output                            \
    --a52_drc_off                        \
    --no_split                           \
    -V                                   \
    -x vob                               \
    -y mpeg2enc,null                     \
    --pulldown                           \
    -F 1,\
"-I1                                     \
    -b6000                               \
    -q4                                  \
    -s                                   \
    -4 2 -2 1"                           \
    -f8                                  \
    -a2

This long command can be a bit intimidating, so we break it down line by line:

ivtc,32detect=force_mode=3,decimate,

This line activates the inverse pulldown filter in transcode. Inverse pulldown is usually needed if you are subtitling anime and if you are using an input frame rate of 29.97. Omit this line if you are not using inverse pulldown.

See also sections 6.2 (What is inverse pulldown?) and 6.3 (How do I tell if inverse pulldown is working?).

subtitler="subtitle_file=script.ppml

Replace script.ppml with the actual filename of the PPML script containing your timed subtitles.

color_depth=32"

Leave this alone.

-i video.vob

Replace video.vob with the actual filename of the VOB file that you ripped in the previous step.

-M 0

This flag controls NTSC audio/video synchronization. Change this to -M 2 if you are not using inverse pulldown.

-o output

Replace output with the desired filename for the encoded output video.

--a52_drc_off
--no_split
-V
-x vob

Leave these settings alone.

-y mpeg2enc,null

These two parameters control the video and audio codecs used for output. In this example we have the mpeg2 video codec combined with the null audio codec, because we will be handling the audio separately.

--pulldown

This flag should be set if you are using 23.976 framerate film source material, or if you are using inverse pulldown. Omit this line otherwise.

-F 1,

This line indicates the output framerate of the video. Use -F 1 when you are encoding film source material having input framerate of 23.976, or when you are using inverse pulldown. Change this to -F 4 if you are encoding NTSC TV source material and not using inverse pulldown.

"-I1

This flag indicates whether or not the source material is interlaced. For interlaced input material, you should use -I1. Note that -I1 will also work correctly even on uninterlaced input, at the cost of unnecessarily increasing encoding time. Thus -I1 is the safest choice.

If you are using inverse pulldown and you are absolutely sure your inverse pulldown is perfect, then you might use -I0 to save time.

-b6000

This number controls the bitrate of the output video. In this example a bitrate of 6000 kbps is used. Higher numbers give higher quality but increase the size of the output file.

-q4
-s
-4 2 -2 1

Leave these alone.

-f8

This flag sets the output format of the video. DVD uses the -f8 flag; SVCD uses -f4, and VCD uses -f1 (but the use of the latter two is not covered in this guide).

-a2

This setting controls the aspect ratio of the output. Standard TV programs use -a2. Change this to -a3 if your input material is (anamorphic) widescreen.
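Several of the settings above change together depending on the kind of source material. The Python sketch below simply restates those rules as a lookup, as a reading aid; the three source labels and the dictionary keys are names invented for this example and are not transcode options.

```python
def transcode_settings(source):
    """Return the source-dependent settings described in this section.

    `source` is one of (labels invented for this sketch):
      "telecined" - 29.97 input with inverse pulldown applied
      "film"      - 23.976 film source, no inverse pulldown
      "ntsc"      - 29.97 NTSC TV source, no inverse pulldown
    """
    if source == "telecined":
        # Keep the ivtc filter line, -M 0, --pulldown and -F 1.
        return {"ivtc_filter": True, "M": 0, "pulldown": True, "F": 1}
    if source == "film":
        # Drop the ivtc filter line, use -M 2, keep --pulldown and -F 1.
        return {"ivtc_filter": False, "M": 2, "pulldown": True, "F": 1}
    if source == "ntsc":
        # Drop the ivtc filter line and --pulldown, use -M 2 and -F 4.
        return {"ivtc_filter": False, "M": 2, "pulldown": False, "F": 4}
    raise ValueError("unknown source type: %r" % (source,))
```

The remaining options (bitrate, output format, aspect ratio) do not depend on the pulldown decision and are chosen separately.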

4.4 Extracting the audio

When fansubbing from DVD to DVD, the best way to handle the audio is to extract the original compressed audio from the original DVD. That way, no reencoding is needed and features like surround sound will be unaffected by the transfer.

Typically, extraction of audio is done with tcextract, part of the transcode package. For example,

tcextract -i video.vob -a 0 -x ac3 -t vob > audio.ac3
will extract AC3 audio track #0 from the VOB file video.vob into the file audio.ac3. If you want a different audio track, change the -a option. If your DVD uses MPEG audio or PCM instead of AC3 audio, use -x mp3 or -x pcm instead. You get the idea.

5. Mastering the DVD

5.1 Multiplex

By now, you should have assembled a script file in PPML format, encoded the video into an mpeg video file (in our example, output.m2v), and extracted the audio to a separate file (audio.ac3 in our example). The next step is to combine the audio and video tracks into a single mpeg file, using mplex from the mjpegtools suite:

mplex -f 8 -M -V -o output-%d.mpg output.m2v audio.ac3

By default, mplex will split up files into chunks that are no larger than 2GB. In the above example, the chunks will be saved as output-1.mpg, output-2.mpg, etc. This behavior is an outdated defensive tactic intended to protect against systems with inadequate support for large files. Since modern Linux is not such a system, I recommend recombining the files into one large output.mpg file to simplify the next step:

cat output-*.mpg > output.mpg
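To get a feel for how large the multiplexed file will be, and therefore how many chunks to expect, a rough estimate can be computed from the bitrates. The Python sketch below rests on several assumptions: it treats the -b value as the average video rate, ignores multiplexing overhead, takes a chunk to be 2 GiB, and the 448 kbps audio rate in the usage note is only a hypothetical figure.

```python
import math


def estimate_mux_size(video_kbps, audio_kbps, duration_s):
    """Estimate the multiplexed size in bytes, ignoring container overhead."""
    total_bits_per_second = (video_kbps + audio_kbps) * 1000
    return total_bits_per_second * duration_s // 8


def estimate_chunks(size_bytes, chunk_bytes=2 * 1024**3):
    """Number of chunks if output is split at chunk_bytes (assumed 2 GiB)."""
    return max(1, math.ceil(size_bytes / chunk_bytes))
```

With 6000 kbps video and 448 kbps audio, a 25 minute episode (1500 seconds) comes to about 1.2 GB and fits in a single chunk, while 100 minutes (6000 seconds) comes to about 4.8 GB, or three chunks, which is also more than a 4.7 GB disc can hold.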

5.2 DVD layout

Find or make a directory with lots of free space (such as /dvd), and run the following series of commands from the dvdauthor suite:

dvddirgen -o /dvd
dvdauthor -o /dvd -a ac3+ja output.mpg
dvdauthor -o /dvd -T
Replace ac3 with mp2 or pcm if your audio track is one of these formats, and replace ja with the two letter language code of the actual language used in the audio track (en for English, ja for Japanese, fr for French, de for German, and so on).

Multiple chapters

There are two ways to create a DVD having multiple chapters. The simplest is to give multiple MPEG files on the dvdauthor command line, e.g.:

dvdauthor -o /dvd -a ac3+ja output1.mpg output2.mpg output3.mpg output4.mpg

The other way is to pass a list of comma-separated H:MM:SS.SS times to dvdauthor using the --chapter flag, e.g.:

dvdauthor -o /dvd -a ac3+ja --chapter 0:24:36.52,0:48:15.98,1:12:42.10 output.mpg
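If the chapter start times are known in seconds, the comma-separated H:MM:SS.SS list can be generated rather than typed by hand. This is a minimal Python sketch; the function names are made up for this example, and it only builds the text of the argument without calling dvdauthor.

```python
def format_chapter_time(seconds):
    """Format a time in seconds as H:MM:SS.SS."""
    # Work in whole hundredths of a second to avoid results like 0:00:60.00.
    centiseconds = round(seconds * 100)
    hours, rest = divmod(centiseconds, 360000)
    minutes, rest = divmod(rest, 6000)
    secs, hundredths = divmod(rest, 100)
    return "%d:%02d:%02d.%02d" % (hours, minutes, secs, hundredths)


def chapter_argument(chapter_starts):
    """Join chapter start times (in seconds) into one comma-separated string."""
    return ",".join(format_chapter_time(t) for t in chapter_starts)
```

For instance, chapter_argument([1476.52, 2895.98, 4362.10]) reproduces the list used in the example command above.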

5.3 DVD burning

Make an ISO filesystem:

mkisofs -dvd-video -udf -r -o /dvd.iso /dvd

and burn it using dvdrecord if you have a DVD-R drive:

dvdrecord -v -dao -speed=4 -dev=0,0,0 -driveropts=burnproof /dvd.iso

or using growisofs if you have a DVD+R drive:

dvd+rw-format -f /dev/scd0
growisofs -Z /dev/scd0=/dvd.iso

You're done. Enjoy!

6. Questions & Answers

6.1 How do I tell what frame rate to use?

Play the DVD using mplayer -dvd N where N is the title number. Pay attention to the status line, which will look something like this:

A: 391.8 V: 391.8 A-V: -0.002 ct: -0.104  200/197  21%  9% 25.0% 6 0 0%

The first two numbers tell you the time elapsed in the DVD. The two numbers in the middle separated by a slash tell you the number of frames played (they may be slightly different, because the first number includes NTSC synchronization compensation and the second number does not). Pause the movie, note the frame number, play the movie for ten more seconds, note the new frame number, and see whether the two differ by about 240 frames or about 300 frames. In the first case, your frame rate is 23.976; in the second case, 29.97. If you need more time, play the movie for twenty or thirty seconds and multiply accordingly.

It is important that you perform this experiment on the actual movie content itself and not the introductory splash screen that may be present on the DVD, since the splash screens very often use a different frame rate from the movie itself.
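The comparison described above amounts to dividing the frame difference by the elapsed time and picking the nearer of the two NTSC rates. Here is a minimal Python sketch of that arithmetic; the function name is made up for this example, and the frame numbers still have to be read off the mplayer status line by hand.

```python
NTSC_RATES = (23.976, 29.97)


def guess_frame_rate(frames_before, frames_after, elapsed_seconds):
    """Return whichever NTSC rate is closest to the measured frames per second."""
    measured = (frames_after - frames_before) / elapsed_seconds
    return min(NTSC_RATES, key=lambda rate: abs(rate - measured))
```

A reading that goes from frame 200 to frame 440 in ten seconds yields 23.976, and one that goes from 200 to 500 yields 29.97. Because the two rates are far apart, a timing error of a second or so does not change the answer.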

6.2 What is inverse pulldown?

To understand inverse pulldown, you first need to understand pulldown.

Regular television video is interlaced, meaning that the odd numbered scanlines are displayed first, followed by the even numbered scanlines, then the odd ones again, then the even ones again, etc. Each individual line is displayed 30 times a second, but because of the interlacing, the television image as a whole is refreshed 60 times a second, with only half of the total lines being refreshed each time.

For regular television video, there is no way to recover perfect video frames, because no matter where you are in the video the odd numbered lines are always offset 1/60th of a second from the even numbered lines. There is a technique called deinterlacing which can approximately reconstruct original video frames based on the limited information available in the interlaced video, but deinterlacing is only an approximation and using it will result in some loss of quality.

However, most theatrical movies and anime shows are not filmed using television cameras. Instead they are filmed with film cameras (or, in the case of anime, drawn by hand) at a non-interlaced frame rate of 24 frames per second. In order to display these on TV screens, the 24 frames per second material is deliberately interlaced up to 30 frames per second using a process called pulldown, also known as telecine, which I will not explain in detail because it is much better explained elsewhere.

What makes all this important is that, unlike regular television interlacing, the pulldown process can usually be perfectly reversed provided that you know it is there and you have a filter specifically designed to reverse it (that is, NOT a deinterlace filter--most deinterlace filters are not specifically designed to handle telecine). The reversal process is called inverse telecine or inverse pulldown.

[Frame grabs in the original illustrate the above concepts: a pulldown-interlaced frame; the frame after deinterlacing is applied; the frame after inverse pulldown is applied.]
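The reason pulldown can be reversed is easier to see with a toy model. In the Python sketch below, film frames are just letters and each video frame is a pair of fields; every other film frame contributes two fields and the ones in between contribute three, so four film frames become five video frames (24 becomes 30). This is a simplification for illustration only: it ignores top/bottom field parity, works on labels instead of pixels, and is not how the transcode ivtc filter is implemented.

```python
def telecine(film_frames):
    """Toy 2:3 pulldown: emit 2, 3, 2, 3, ... fields per film frame, then pair them."""
    fields = []
    for index, frame in enumerate(film_frames):
        fields.extend([frame] * (2 if index % 2 == 0 else 3))
    # Pair consecutive fields into video frames; a trailing odd field is dropped.
    return [tuple(fields[i:i + 2]) for i in range(0, len(fields) - 1, 2)]


def inverse_telecine(video_frames):
    """Recover the film frames by collapsing repeated fields.

    Assumes neighbouring film frames carry different labels.
    """
    film = []
    for first_field, second_field in video_frames:
        for field in (first_field, second_field):
            if not film or film[-1] != field:
                film.append(field)
    return film
```

Running telecine on the frames A, B, C, D gives AA, BB, BC, CD, DD. The two mixed frames (BC and CD) are the ones that look interlaced on screen, and since every original frame is still present in full, inverse_telecine gets A, B, C, D back without loss.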

6.3 How do I tell if inverse pulldown is working?

Follow the directions in the video encoding section, but change the line 32detect=force_mode=3 to 32detect=force_mode=3:verbose instead, leaving everything else the same. You will get a bunch of output on the screen that looks something like:

(0) frame [044265]: (1) =   243 | (2) =   245 | (3) =   1 | interlaced = no
(0) frame [044266]: (1) =   199 | (2) =   240 | (3) =   1 | interlaced = no
(0) frame [044267]: (1) =   215 | (2) =   230 | (3) =   1 | interlaced = no
(0) frame [044268]: (1) =   235 | (2) =   229 | (3) =   1 | interlaced = no
(0) frame [044269]: (1) =   220 | (2) =   244 | (3) =   1 | interlaced = no
(0) frame [044270]: (1) =   243 | (2) =   249 | (3) =   1 | interlaced = no
(0) frame [044271]: (1) =   230 | (2) =   256 | (3) =   1 | interlaced = no
(0) frame [044272]: (1) =   206 | (2) =   218 | (3) =   1 | interlaced = no
(0) frame [044273]: (1) =   213 | (2) =   200 | (3) =   1 | interlaced = no
(0) frame [044274]: (1) =   223 | (2) =   180 | (3) =   1 | interlaced = no
(0) frame [044275]: (1) =   188 | (2) =   192 | (3) =   1 | interlaced = no
(0) frame [044276]: (1) =   210 | (2) =   219 | (3) =   1 | interlaced = no
(0) frame [044277]: (1) =   238 | (2) =   218 | (3) =   1 | interlaced = no
(0) frame [044278]: (1) =   211 | (2) =   202 | (3) =   1 | interlaced = no
(0) frame [044279]: (1) =   211 | (2) =   199 | (3) =   1 | interlaced = no
(0) frame [044280]: (1) =   223 | (2) =   224 | (3) =   1 | interlaced = no
(0) frame [044281]: (1) =   178 | (2) =   241 | (3) =   1 | interlaced = no
If every row has interlaced = no in the right hand column, then you have succeeded in reversing the pulldown and removing the interlacing.

For comparison, here is an example of an unsuccessful attempt at inverse pulldown:

(0) frame [001042]: (1) = 55619 | (2) = 56592 | (3) = 324 | interlaced = yes
(0) frame [001043]: (1) = 27924 | (2) = 28296 | (3) = 162 | interlaced = yes
(0) frame [001044]: (1) = 26425 | (2) = 26147 | (3) = 152 | interlaced = yes
(0) frame [001045]: (1) = 30923 | (2) = 31739 | (3) = 181 | interlaced = yes
(0) frame [001046]: (1) = 21860 | (2) = 22136 | (3) = 127 | interlaced = yes
(0) frame [001047]: (1) = 22056 | (2) = 22412 | (3) = 128 | interlaced = yes
(0) frame [001048]: (1) = 16845 | (2) = 16770 | (3) =  97 | interlaced = yes
(0) frame [001049]: (1) = 13148 | (2) = 12792 | (3) =  75 | interlaced = yes
(0) frame [001050]: (1) = 13294 | (2) = 12528 | (3) =  74 | interlaced = yes
(0) frame [001051]: (1) = 12787 | (2) = 12408 | (3) =  72 | interlaced = yes
(0) frame [001052]: (1) = 12216 | (2) = 11771 | (3) =  69 | interlaced = yes
(0) frame [001053]: (1) = 15243 | (2) = 15207 | (3) =  88 | interlaced = yes
(0) frame [001054]: (1) = 18621 | (2) = 18998 | (3) = 108 | interlaced = yes
(0) frame [001055]: (1) = 17195 | (2) = 17149 | (3) =  99 | interlaced = yes
(0) frame [001056]: (1) = 16599 | (2) = 16748 | (3) =  96 | interlaced = yes
(0) frame [001057]: (1) = 13009 | (2) = 12705 | (3) =  74 | interlaced = yes
(0) frame [001058]: (1) = 10831 | (2) = 11430 | (3) =  64 | interlaced = yes
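With thousands of frames scrolling past, it is easier to save the verbose output to a file and let a program check it. The Python sketch below scans text in the format of the sample output shown above and lists the frames reported as interlaced. The function name is made up for this example, and the pattern is based only on those samples, so other transcode versions may print something different.

```python
import re

# Matches lines such as:
# (0) frame [001042]: (1) = 55619 | (2) = 56592 | (3) = 324 | interlaced = yes
LINE_RE = re.compile(r"frame \[(\d+)\]:.*interlaced = (yes|no)")


def interlaced_frames(log_text):
    """Return the frame numbers that the log reports as interlaced."""
    flagged = []
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match and match.group(2) == "yes":
            flagged.append(int(match.group(1)))
    return flagged
```

An empty result means every frame in the log was reported as interlaced = no. A handful of flagged frames around scene changes may be worth inspecting by eye before deciding that the inverse pulldown has failed.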

6.4 How do I make a timecoded video?

Create a PPML script containing the following four lines:

*counter frame_counter
1 *counter font_dir=/usr/share/mplayer/iso-8859-1/arial-28
2 *counter xpos=20 ypos=20
3 *counter sat=80.0 contr=70
Subtitle your video using this script, and it will place a frame counter in the upper left hand corner of the video.
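A frame number read off the timecoded video can be turned back into a clock time for an SRT script by dividing by the frame rate. Here is a minimal Python sketch of that conversion; the function name is made up for this example, and it assumes that frame numbering starts at zero, which may not match what the counter displays.

```python
def frame_to_srt_time(frame, framerate=29.97):
    """Convert a frame number to an SRT-style timestamp HH:MM:SS,mmm."""
    # Assumption: frame 0 corresponds to time zero.
    total_ms = round(frame / framerate * 1000)
    hours, rest = divmod(total_ms, 3600000)
    minutes, rest = divmod(rest, 60000)
    seconds, millis = divmod(rest, 1000)
    return "%02d:%02d:%02d,%03d" % (hours, minutes, seconds, millis)
```

For example, frame 300 at 29.97 and frame 240 at 23.976 both correspond to 00:00:10,010.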

6.5 What are some other ways to subtitle in Linux?

You can use the mencoder program included with mplayer to subtitle using the mplayer engine. Most of this material is covered pretty thoroughly in the previous version of this guide. The subtitles produced by mencoder look better on a computer screen than those produced by transcode, but worse on a TV screen. On the other hand, mencoder is for now the only program that supports subtitles in Asian-language character sets.

The xste program combined with submux-dvd allows you to generate soft (switchable) DVD subtitles that can be turned on or off by the DVD player. If anyone who speaks English figures out how to get this combination working, let me know.

The subtitler-yuv program is a standalone alternative to the subtitler plugin in transcode. It uses the same PPML script format, but instead of being a transcode filter it functions as a standalone program taking input video on STDIN and writing subtitled video on STDOUT. This strategy has the major advantage that the same video can be chained through multiple instantiations of subtitler-yuv, to imprint multiple subtitle streams onto a single video.