Linux Digital Fansubbing Guide
djao@dominia.org
Revision History | | |
---|---|---|
Revision v2.0 | 2003-07-03 | Major rewrite, focusing on transcode and DVD burning |
Revision v1.3 | 2002-10-19 | WAV timing in Linux is now possible. |
Fansubbing, a.k.a. fan subtitling, is the process of taking a video or movie in a foreign language and adding your own subtitles to it so that people who don't know the original language can watch and enjoy the work. This guide describes how to produce DVD to DVD fansubs on the Linux platform using free software. Because I am only familiar with NTSC video, the discussion is restricted to the NTSC format only.
Here is what you need to get started.
You need a computer with a DVD burner. My expertise is with DVD-R, because that's what I use, but most of it applies equally well to DVD+R once you have dvd+rw-tools working (in fact only the last step is different). DVD-RAM is not supported by most DVD players, and is not discussed here.
If you don't have a DVD burner, you might want to read the horribly outdated previous version of this guide, which focuses on fansubbing to CD media. The current version and all future versions of the guide will cover DVD media only.
Your computer needs to have reasonably modern CPU speed and disk capacity. Reasonable minimums might be a Pentium III 500MHz CPU and 10 GB of free disk space (this is just an estimate; I have never tried actually using such a system for fansubbing). There is no upper limit on what is useful--video work is a very demanding application capable of using up whatever resources you can throw at it.
To keep the scope of the document reasonable, this guide assumes that you use Red Hat Linux or Fedora Linux.
Install the appropriate apt package matching your Red Hat or Fedora distribution, and run as root the command
apt-get install mplayer transcode dvdrecord mjpegtools
Fedora users should replace dvdrecord with cdrecord. You will also need to install the dvdauthor program.
Copyright © 2003 David Jao <djao@dominia.org>.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts.
All comments, questions, and suggestions are welcome and should be sent to djao@dominia.org.
Some of the programs whose usage is documented in this guide can optionally make use of CSS decryption routines for decrypting the contents of encrypted DVDs. Although CSS decryption software is legal in most countries, its legal status in the US is in doubt in light of the MPAA v. Reimerdes, Corley and Kazan court ruling. THE AUTHOR OF THIS GUIDE DOES NOT SANCTION THE ILLEGAL USE OF CSS DECRYPTION SOFTWARE IN THE UNITED STATES. READERS WHO ARE SUBJECT TO US LAW MUST AGREE TO LIMIT THE APPLICATION OF THE TECHNIQUES IN THIS GUIDE TO UNENCRYPTED OR LAWFULLY ACCESSIBLE DISCS ONLY.
Regardless of the legal status of CSS decryption, the copyright laws in most countries do not allow unlicensed copying of DVD videos except in very limited "fair use" contexts. THE AUTHOR OF THIS GUIDE DOES NOT SANCTION THE ILLEGAL COPYING OF COPYRIGHTED DVDS.
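Making a DVD fansub consists of the following steps:
- Obtain or create a timed script of the subtitles, and convert it to PPML format.
- Rip the video from the DVD.
- Encode the video with the subtitles.
- Extract the audio.
- Multiplex the audio and video.
- Author the DVD filesystem.
- Burn the DVD.
We will give very basic methods for doing all of the above.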
I am a pretty devoted user of Linux (I have to be, or I wouldn't be writing this guide), but even I have to admit that Linux will always trail other platforms such as Windows and especially Macintosh for multimedia work. Making Linux do fansubbing is like watching a dog walk on its hind legs: one does not marvel that it can be done well, one marvels that it can be done at all.
You will not get fancy cutting edge special effects or large selections of fonts and character sets in Linux. What you do get is rock solid basic text subtitling that looks unbelievably good on a TV screen.
A script is a file containing the text of the subtitles you want to display along with the exact times that each subtitle is to be displayed on the screen. Different subtitling programs use different file formats for their scripts. The subtitling program that we use, called transcode, supports a single script format, called PPML. Unfortunately, nobody outside of the Linux world uses PPML scripts, so we will first need to convert scripts to PPML format before we can use them. Don't worry about this too much; it will all be covered in the following sections.
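There are basically three ways to obtain a script for a DVD:
- Find an existing script on the internet.
- Rip the subtitles from the DVD itself.
- Make your own script.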
Occasionally it is possible to find the scripts that you want on the internet. For example, scripts for many of the most popular anime titles are available at http://www.scriptclub.org/.
Because of the great diversity of users on the internet, the various scripts available on the internet come in a wide variety of file formats. The best strategy here is to use mplayer to convert the script to SRT format, and then afterwards convert the SRT script to PPML format as described in this section.
Mplayer supports about a dozen different formats, which are listed here. You can convert any of these formats to SRT by using the -dumpsrtsub option in mplayer. For example,
mplayer -sub example.ssa -dumpsrtsub -dvd 1
The conversion process is not perfect because mplayer (and transcode for that matter) only supports one subtitle stream. Newer versions of mplayer make some attempt to simulate multiple subtitle streams within a single stream, but in most cases overlapping subtitles still require manual intervention to reconstruct. Also, mplayer does not support any formatting capabilities, so as a rule all formatting in the original script is lost as well.
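Since overlapping subtitles have to be repaired by hand, it helps to know where they are before you start. The following is only a minimal sketch and is not part of mplayer or transcode. The function name find_overlaps and the representation of each subtitle as a (start, end) pair in seconds are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (start_seconds, end_seconds) for each subtitle.
Cue = Tuple[float, float]


def find_overlaps(cues: List[Cue]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) of subtitles whose display times overlap."""
    # Visit the cues in order of start time, remembering original positions.
    order = sorted(range(len(cues)), key=lambda i: cues[i][0])
    overlaps = []
    for pos, i in enumerate(order):
        for j in order[pos + 1:]:
            # Every later cue starts even later, so stop at the first one
            # that begins after cue i has already ended.
            if cues[j][0] >= cues[i][1]:
                break
            overlaps.append((i, j))
    return overlaps
```

Any pair reported here marks a spot in the converted SRT file that probably needs manual attention.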
A lot of DVDs are starting to include subtitles on the disc itself. If your DVD comes with subtitles in the language you want, then you can rip those subtitles off of the disc and essentially get yourself a script for free. This procedure is useful even if the subtitles on the DVD are not in the same language as the language you want to subtitle in, since the timings on the disc will still be accurate even if the language is wrong.
The mplayer program comes with a subrip.c program in the TOOLS/ subdirectory, which is what we will use to rip the subtitles. You have to change a couple of lines in this program in order to get it to work:
- Change the #define GOCR_PROGRAM ... line to point to the actual installed location of the gocr program on your system.
- In the sprintf(cmd, GOCR_PROGRAM" ... line, get rid of all the -m options.
Then compile it according to the instructions at the top of the file. You also need to have gocr installed.
Now you are ready to rip the subtitles off of your DVD:
rm frameno.avi
mencoder -dvd T -vobsubout subtitles -vobsuboutindex 0 -sid N -o frameno.avi -ovc frameno -nosound
subrip subtitles 0 script.srt
where T is the title you want to rip, N is the subtitle number you want (see mplayer manpage), and script.srt is the output file for the subtitles.
Note that the gocr program is not very good at OCR, so you will definitely have to edit script.srt with a text editor and correct all of the character recognition mistakes that gocr makes. If the subtitles are not in a language that at least uses the Roman alphabet, then the gocr output is completely useless, but the times listed in the file should still be accurate.
The script.srt file is in SubRip (SRT) format, which can be converted to PPML as described in the conversion section.
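If you want to go through the ripped script programmatically (for example, to list every line for proofreading), the SRT layout is simple enough to parse by hand. The sketch below assumes the usual SRT layout of an index line, a timing line of the form HH:MM:SS,mmm --> HH:MM:SS,mmm, and one or more text lines, with entries separated by blank lines. The function name parse_srt is made up for this example, and the code makes no attempt to handle malformed files.

```python
import re
from typing import List, Tuple

# Timing line, e.g. "00:01:02,500 --> 00:01:04,000"
TIME_LINE = re.compile(
    r"(\d+):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2}),(\d{3})"
)


def parse_srt(text: str) -> List[Tuple[float, float, str]]:
    """Return (start_seconds, end_seconds, subtitle_text) for each entry."""
    cues = []
    # Entries are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        for n, line in enumerate(lines):
            m = TIME_LINE.match(line.strip())
            if m:
                h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, m.groups())
                start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
                end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
                # Everything after the timing line is the subtitle text.
                cues.append((start, end, "\n".join(lines[n + 1:])))
                break
    return cues
```

Even when the recognized text is useless, the start and end times returned here can be reused with your own translation.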
In order to make your own script, you need to record all the lines of dialogue in the movie, translate them, and then record the start time and end time of each line so that you know when to have the subtitle appear and disappear. While the translation process is straightforward (assuming you know the language...), the task of timing the dialogue is almost impossible without specialized software to help you out. Here you have two options. You can use Windows software, which is what everybody else does, or you can be brave and try to do it in Linux as I describe below.
You need to capture the audio from your DVD into a WAV file in order to be able to use the audio for timing. To do this, put the DVD into your drive and type:
mplayer -dvd 1 -vc null -vo null -ao pcm
to extract the audio from title number 1 into the file audiodump.wav (for another title number, change the number 1 to whatever number it is that you want). If your DVD title has multiple audio tracks, you may need to use some combination of the -alang or -aid options to get mplayer to extract the right audio track. You might also need to use the -chapter option if you only want to extract specific chapters.
To time the dialogue in Windows, use the freeware Windows program Sub Station Alpha to time the dialogue from the above .wav file as described in the Sub Station Alpha documentation. The Karinkuru Guide To Subtitling also has many useful tips on how to use Sub Station Alpha for timing.
If you use Sub Station Alpha for timing then you will end up with a script in Sub Station Alpha format. You can then use mplayer to convert the resulting script into SRT format, as described here.
I have a separate page explaining the setup that I use for WAV timing in Linux. This procedure does work, and it is very easy for me to use, but most people will probably find it too hard to get running.
Another alternative is to use xste, which claims to perform WAV timing, but I have never gotten this software working.
If you followed any one of the procedures in the previous section, you should have in your possession a timed script of your subtitles in the SRT format. To convert the SRT script to a PPML script for use in transcode, use this Perl program. Note that this program was a weekend hack (and I didn't even spend the whole weekend on it), so don't expect too much out of it.
The program reads in an SRT script from standard input and outputs the converted PPML script on standard output. It takes a single command line argument -framerate for the frame rate of the video you are subtitling. The default value of the frame rate is 29.97, the NTSC video standard. If you are subtitling film source that has undergone 3:2 pulldown, you will need to use a frame rate of 23.976 instead (see Which frame rate should I use?).
In summary, for normal NTSC TV source material, use:
srt-to-ppml.pl -framerate=29.97 < script.srt > script.ppml
For 3:2 pulldown converted film source material, use:
srt-to-ppml.pl -framerate=23.976 < script.srt > script.ppml
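To give a feel for why the frame rate matters, here is a minimal sketch of the timestamp arithmetic, written in Python rather than Perl. It is not the conversion program itself and it does not emit PPML. The function name srt_time_to_frame and the choice to round to the nearest frame are assumptions made for illustration.

```python
def srt_time_to_frame(timestamp: str, framerate: float = 29.97) -> int:
    """Convert an SRT timestamp 'HH:MM:SS,mmm' to the nearest frame number."""
    clock, millis = timestamp.split(",")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    total_seconds = hours * 3600 + minutes * 60 + seconds + int(millis) / 1000
    # Assumption: round to the nearest frame; the real converter may differ.
    return round(total_seconds * framerate)
```

The same timestamp lands on a different frame depending on the rate: ten seconds is about frame 300 at 29.97 but about frame 240 at 23.976, which is why a script converted with the wrong rate drifts steadily out of sync.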
The resulting PPML script needs an additional header containing the configuration parameters for the subtitles. This header needs to be prepended to the script, so that it comes before the first subtitle. I typically use the following header:
*subtitle subtitle
1 *subtitle font_dir=/usr/share/mplayer/iso-8859-1/arial-28
2 *subtitle color=290 sat=70 contr=100 outline=5 vfactor=.114 hfactor=.09
which configures the subtitling engine to use yellow 28 point arial subtitles with 11.4% vertical clearance and 9% horizontal clearance from the screen edges.
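Adding the header can be done in any text editor, but if you convert many scripts it is easy to automate. The following sketch simply puts the three header lines shown above in front of an existing script. The function name prepend_header is hypothetical, and the header values are the ones from this guide, which you may want to change.

```python
# Header lines copied from the example above; adjust font and colors to taste.
HEADER_LINES = [
    "*subtitle subtitle",
    "1 *subtitle font_dir=/usr/share/mplayer/iso-8859-1/arial-28",
    "2 *subtitle color=290 sat=70 contr=100 outline=5 vfactor=.114 hfactor=.09",
]


def prepend_header(script_text: str, header_lines=HEADER_LINES) -> str:
    """Return the PPML script text with the configuration header in front."""
    return "\n".join(header_lines) + "\n" + script_text
```

To use it, read the converted script.ppml into a string, pass it through prepend_header, and write the result back out.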
You should use tccat, which is part of the transcode package. Use it like this:
tccat -i /dev/dvd -T A,B-C,D > video.vob
where
- A is the title that you want to rip
- B is the chapter that you want to start from
- C is the chapter that you want to end with
- D is the angle that you want (typically, 1)
- video.vob is the file where you want the ripped output to go
Don't worry about selecting the audio track; the tccat program rips all of the audio tracks for you at once.
For this task we will use transcode together with our newly made PPML script file. Note that this step only encodes video, not audio. See the next section for directions regarding the audio.
The basic formula is something like:
transcode -J \
ivtc,32detect=force_mode=3,decimate,\
subtitler="subtitle_file=script.ppml \
color_depth=32" \
-i video.vob \
-M 0 \
-o output \
--a52_drc_off \
--no_split \
-V \
-x vob \
-y mpeg2enc,null \
--pulldown \
-F 1, \
"-I1 \
-b6000 \
-q4 \
-s \
-4 2 -2 1" \
-f8 \
-a2
This long command can be a bit intimidating, so we break it down line by line:
ivtc,32detect=force_mode=3,decimate,
This line activates the inverse pulldown filter in transcode. Inverse pulldown is usually needed if you are subtitling anime and if you are using an input frame rate of 29.97. Omit this line if you are not using inverse pulldown.
subtitler="subtitle_file=script.ppml
Replace script.ppml with the actual filename of the PPML script containing your timed subtitles.
color_depth=32"
Leave this alone.
-i video.vob
Replace video.vob with the actual filename of the VOB file that you ripped in the previous step.
-M 0
This flag controls NTSC audio/video synchronization. Change this to -M 2 if you are not using inverse pulldown.
-o output
Replace output with the desired filename for the encoded output video.
--a52_drc_off
--no_split
-V
-x vob
Leave these settings alone.
-y mpeg2enc,null
These two parameters control the video and audio codecs used for output. In this example we have the mpeg2 video codec combined with the null audio codec, because we will be handling the audio separately.
--pulldown
This flag should be set if you are using 23.976 framerate film source material, or if you are using inverse pulldown. Omit this line otherwise.
-F 1,
This line indicates the output framerate of the video. Use -F 1 when you are encoding film source material having input framerate of 23.976, or when you are using inverse pulldown. Change this to -F 4 if you are encoding NTSC TV source material and not using inverse pulldown.
"-I1
This flag indicates whether or not the source material is interlaced. For interlaced input material, you should use -I1. Note that -I1 will also work correctly even on uninterlaced input, at the cost of unnecessarily increasing encoding time. Thus -I1 is the safest choice.
If you are using inverse pulldown and you are absolutely sure your inverse pulldown is perfect, then you might use -I0 to save time.
-b6000
This number controls the bitrate of the output video. In this example a bitrate of 6000 kbps is used. Higher numbers give higher quality but increase the size of the output file.
-q4
-s
-4 2 -2 1
Leave these alone.
-f8
This flag sets the output format of the video. DVD uses the -f8 flag; SVCD uses -f4, and VCD uses -f1 (but the use of the latter two is not covered in this guide).
-a2
This setting controls the aspect ratio of the output. Standard TV programs use -a2. Change this to -a3 if your input material is (anamorphic) widescreen.
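Several of the settings above depend on the same two questions: is the source 23.976 film material, and are you using inverse pulldown? The sketch below only restates the rules from the breakdown in one place. The function name and the dictionary keys are made up for this example, and the code does not run transcode or build a command line.

```python
from typing import Dict, Union


def rate_settings(film_source: bool, inverse_pulldown: bool) -> Dict[str, Union[str, bool]]:
    """Summarize the frame-rate-dependent choices described in this guide.

    film_source: the input already has a frame rate of 23.976.
    inverse_pulldown: the input is 29.97 and the ivtc filter line is used.
    """
    film_like = film_source or inverse_pulldown
    return {
        # Keep the ivtc,32detect,decimate filter line only for inverse pulldown.
        "ivtc_filter": inverse_pulldown,
        # -M 0 with inverse pulldown, -M 2 otherwise.
        "M": "0" if inverse_pulldown else "2",
        # --pulldown for 23.976 film source or inverse pulldown.
        "pulldown": film_like,
        # -F 1 in the same two cases, -F 4 for plain NTSC TV material.
        "F": "1" if film_like else "4",
    }
```

For plain NTSC TV material, rate_settings(False, False) says to drop the filter line and --pulldown and to use -M 2 and -F 4.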
When fansubbing from DVD to DVD, the best way to handle the audio is to extract the original compressed audio from the original DVD. That way, no reencoding is needed and features like surround sound will be unaffected by the transfer.
Typically, extraction of audio is done with tcextract, part of the transcode package. For example,
tcextract -i video.vob -a 0 -x ac3 -t vob > audio.ac3
will extract AC3 audio track #0 from the VOB file video.vob into the file audio.ac3. If you want a different audio track, change the -a option. If your DVD uses MPEG audio or PCM instead of AC3 audio, use -x mp3 or -x pcm instead. You get the idea.
By now, you should have assembled a script file in PPML format, encoded the video into an mpeg video file (in our example, output.m2v), and extracted the audio to a separate file (audio.ac3 in our example). The next step is to combine the audio and video tracks into a single mpeg file, using mplex from the mjpegtools suite:
mplex -f 8 -M -V -o output-%d.mpg output.m2v audio.ac3
By default, mplex will split up files into chunks that are no larger than 2GB. In the above example, the chunks will be saved as output-1.mpg, output-2.mpg, etc. This behavior is an outdated defensive tactic intended to protect against systems with inadequate support for large files. Since modern Linux is not such a system, it is recommended to recombine the files into one large output.mpg file to simplify the next step:
cat output-*.mpg > output.mpg
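Whether the multiplexed output ends up in more than one chunk depends mostly on the bitrates and the running time. Here is a back-of-the-envelope estimate, offered as a rough sketch only: it ignores container overhead, treats "2GB" as 2 * 1024**3 bytes (an assumption; the exact limit used by mplex may differ), and the 448 kbps audio bitrate in the usage note is just a hypothetical value.

```python
# Assumed meaning of "2GB"; the exact limit used by mplex may differ.
TWO_GB = 2 * 1024 ** 3


def estimated_size_bytes(video_kbps: float, audio_kbps: float, seconds: float) -> float:
    """Rough size of the combined stream, ignoring container overhead."""
    # Bitrates are in kilobits per second, taken here as 1000 bits each.
    return (video_kbps + audio_kbps) * 1000 / 8 * seconds


def likely_to_split(video_kbps: float, audio_kbps: float, seconds: float) -> bool:
    """True if the estimate exceeds the assumed 2GB chunk limit."""
    return estimated_size_bytes(video_kbps, audio_kbps, seconds) > TWO_GB
```

For example, a 24 minute episode (1440 seconds) at the 6000 kbps video bitrate used earlier with 448 kbps audio comes out a little under 1.2 billion bytes, so a single chunk is likely; a two hour movie at the same rates would not fit in one.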
Find or make a directory with lots of free space (such as /dvd), and run the following series of commands from the dvdauthor suite:
dvddirgen -o /dvd
dvdauthor -o /dvd -a ac3+ja output.mpg
dvdauthor -o /dvd -T
Replace ac3 with mp2 or pcm if your audio track is one of these formats, and replace ja with the two letter language code of the actual language used in the audio track (en for English, ja for Japanese, fr for French, de for German, and so on).
There are two ways to create a DVD having multiple chapters. The simplest is to give multiple MPEG files on the dvdauthor command line, e.g.:
dvdauthor -o /dvd -a ac3+ja output1.mpg output2.mpg output3.mpg output4.mpg
The other way is to pass a list of comma-separated H:MM:SS.SS times to dvdauthor using the --chapter flag, e.g.:
dvdauthor -o /dvd -a ac3+ja --chapter 0:24:36.52,0:48:15.98,1:12:42.10 output.mpg
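If your chapter points are written down as plain seconds, the comma-separated H:MM:SS.SS list can be generated rather than typed by hand. This is a small sketch of the formatting only; the function names are made up for the example and nothing here calls dvdauthor.

```python
from typing import Iterable


def format_chapter_time(seconds: float) -> str:
    """Format a time in seconds as H:MM:SS.SS."""
    # Work in whole hundredths of a second to avoid floating point surprises.
    centis = round(seconds * 100)
    hours, rest = divmod(centis, 360000)
    minutes, rest = divmod(rest, 6000)
    secs, hundredths = divmod(rest, 100)
    return f"{hours}:{minutes:02d}:{secs:02d}.{hundredths:02d}"


def chapter_argument(times: Iterable[float]) -> str:
    """Join several chapter times into one comma-separated string."""
    return ",".join(format_chapter_time(t) for t in times)
```

Calling chapter_argument([1476.52, 2895.98, 4362.10]) reproduces the list of times used in the command above.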
Make an ISO filesystem:
mkisofs -dvd-video -udf -r -o /dvd.iso /dvd
and burn it using dvdrecord if you have a DVD-R drive:
dvdrecord -v -dao -speed=4 -dev=0,0,0 -driveropts=burnproof /dvd.iso
or using growisofs if you have a DVD+R drive:
dvd+rw-format -f /dev/scd0
growisofs -Z /dev/scd0=/dvd.iso
You're done. Enjoy!
Play the DVD using mplayer -dvd N where N is the title number. Pay attention to the status line, which will look something like this:
A: 391.8 V: 391.8 A-V: -0.002 ct: -0.104 200/197 21% 9% 25.0% 6 0 0%
The first two numbers tell you the time elapsed in the DVD. The two
numbers in the middle separated by a slash tell you the number of
frames played (they may be slightly different, because the first
number includes NTSC synchronization compensation and the second
number does not). Pause the movie, note the frame number, play
the movie for ten more seconds, note the new frame number, and see
whether the two differ by 240 frames or 300 frames. In the first
case, your frame rate is 23.976; in the second case, 29.97. If you
need more time, play the movie for twenty or thirty seconds and
multiply accordingly.
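The arithmetic of this experiment can be written down in a few lines. The sketch below is only an illustration of the comparison described above; the function name guess_frame_rate is made up, and it assumes the source is one of the two NTSC rates discussed in this guide.

```python
def guess_frame_rate(first_frame: int, second_frame: int, seconds: float) -> float:
    """Pick 23.976 or 29.97, whichever is closer to the measured rate."""
    measured = (second_frame - first_frame) / seconds
    # Assumption: the source is one of the two NTSC rates covered here.
    return min((23.976, 29.97), key=lambda rate: abs(rate - measured))
```

A difference of 240 frames over ten seconds gives 23.976, and a difference of 300 frames gives 29.97. A measurement that falls far from both values suggests the test was run on the wrong part of the disc.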
It is important that you perform this experiment on the actual movie content itself and not the introductory splash screen that may be present on the DVD, since the splash screens very often use a different frame rate from the movie itself.
To understand inverse pulldown, you first need to understand pulldown.
Regular television video is interlaced, meaning that the odd numbered scanlines are displayed first, followed by the even numbered scanlines, then the odd ones again, then the even ones again, etc. Each individual line is displayed 30 times a second, but because of the interlacing, the television image as a whole is refreshed 60 times a second, with only half of the total lines being refreshed each time.
For regular television video, there is no way to recover perfect video frames, because no matter where you are in the video the odd numbered lines are always offset 1/60th of a second from the even numbered lines. There is a technique called deinterlacing which can approximately reconstruct original video frames based on the limited information available in the interlaced video, but deinterlacing is only an approximation and using it will result in some loss of quality.
However, most theatrical movies and anime shows are not filmed using television cameras. Instead they are filmed with film cameras (or, in the case of anime, drawn by hand) at a non-interlaced frame rate of 24 frames per second. In order to display these on TV screens, the 24 frames per second is deliberately interlaced to 30 frames per second using a process called pulldown, also known as telecine, which I will not explain because it is much better explained here or here.
What makes all this important is that, unlike regular television interlacing, the pulldown process can usually be perfectly reversed provided that you know it is there and you have a filter specifically designed to reverse it (that is, NOT a deinterlace filter--most deinterlace filters are not specifically designed to handle telecine). The reversal process is called inverse telecine or inverse pulldown.
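For readers who prefer code to diagrams, here is a toy model of the common 2:3 field pattern, in which film frames alternately contribute two and three fields. It works on frame labels rather than real images and makes no attempt to represent how any particular filter does the job. The function names are made up for this sketch, and the inverse function assumes neighboring film frames have different labels.

```python
from typing import List, Sequence, Tuple


def telecine(film_frames: Sequence[str]) -> List[Tuple[str, str]]:
    """Turn film frames into interlaced video frames using a 2:3 pattern."""
    fields: List[str] = []
    for index, frame in enumerate(film_frames):
        # Film frames alternately contribute two fields and three fields.
        repeat = 2 if index % 2 == 0 else 3
        fields.extend([frame] * repeat)
    # Each video frame is built from two consecutive fields.
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields) - 1, 2)]


def inverse_telecine(video_frames: Sequence[Tuple[str, str]]) -> List[str]:
    """Recover the film frames by dropping repeated fields (toy model)."""
    film: List[str] = []
    for first, second in video_frames:
        for field in (first, second):
            if not film or film[-1] != field:
                film.append(field)
    return film
```

Four film frames A, B, C, D become five video frames AA, BB, BC, CD, DD, which is how 24 frames per second turns into 30. Two of every five video frames mix fields from different film frames, and those are the ones that look interlaced. Because no film frame is lost, the original four can be recovered exactly.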
Here are some frame grabs illustrating the above concepts (click for full frames):
Follow the directions in the video encoding section, but change the line 32detect=force_mode=3 to 32detect=force_mode=3:verbose instead, leaving everything else the same. You will get a bunch of output on the screen that looks something like:
(0) frame [044265]: (1) = 243 | (2) = 245 | (3) = 1 | interlaced = no
(0) frame [044266]: (1) = 199 | (2) = 240 | (3) = 1 | interlaced = no
(0) frame [044267]: (1) = 215 | (2) = 230 | (3) = 1 | interlaced = no
(0) frame [044268]: (1) = 235 | (2) = 229 | (3) = 1 | interlaced = no
(0) frame [044269]: (1) = 220 | (2) = 244 | (3) = 1 | interlaced = no
(0) frame [044270]: (1) = 243 | (2) = 249 | (3) = 1 | interlaced = no
(0) frame [044271]: (1) = 230 | (2) = 256 | (3) = 1 | interlaced = no
(0) frame [044272]: (1) = 206 | (2) = 218 | (3) = 1 | interlaced = no
(0) frame [044273]: (1) = 213 | (2) = 200 | (3) = 1 | interlaced = no
(0) frame [044274]: (1) = 223 | (2) = 180 | (3) = 1 | interlaced = no
(0) frame [044275]: (1) = 188 | (2) = 192 | (3) = 1 | interlaced = no
(0) frame [044276]: (1) = 210 | (2) = 219 | (3) = 1 | interlaced = no
(0) frame [044277]: (1) = 238 | (2) = 218 | (3) = 1 | interlaced = no
(0) frame [044278]: (1) = 211 | (2) = 202 | (3) = 1 | interlaced = no
(0) frame [044279]: (1) = 211 | (2) = 199 | (3) = 1 | interlaced = no
(0) frame [044280]: (1) = 223 | (2) = 224 | (3) = 1 | interlaced = no
(0) frame [044281]: (1) = 178 | (2) = 241 | (3) = 1 | interlaced = no
If every row has interlaced = no in the right hand column, then you have succeeded in reversing the pulldown and removing the interlacing.
For comparison, here is an example of an unsuccessful attempt at inverse pulldown:
(0) frame [001042]: (1) = 55619 | (2) = 56592 | (3) = 324 | interlaced = yes
(0) frame [001043]: (1) = 27924 | (2) = 28296 | (3) = 162 | interlaced = yes
(0) frame [001044]: (1) = 26425 | (2) = 26147 | (3) = 152 | interlaced = yes
(0) frame [001045]: (1) = 30923 | (2) = 31739 | (3) = 181 | interlaced = yes
(0) frame [001046]: (1) = 21860 | (2) = 22136 | (3) = 127 | interlaced = yes
(0) frame [001047]: (1) = 22056 | (2) = 22412 | (3) = 128 | interlaced = yes
(0) frame [001048]: (1) = 16845 | (2) = 16770 | (3) = 97 | interlaced = yes
(0) frame [001049]: (1) = 13148 | (2) = 12792 | (3) = 75 | interlaced = yes
(0) frame [001050]: (1) = 13294 | (2) = 12528 | (3) = 74 | interlaced = yes
(0) frame [001051]: (1) = 12787 | (2) = 12408 | (3) = 72 | interlaced = yes
(0) frame [001052]: (1) = 12216 | (2) = 11771 | (3) = 69 | interlaced = yes
(0) frame [001053]: (1) = 15243 | (2) = 15207 | (3) = 88 | interlaced = yes
(0) frame [001054]: (1) = 18621 | (2) = 18998 | (3) = 108 | interlaced = yes
(0) frame [001055]: (1) = 17195 | (2) = 17149 | (3) = 99 | interlaced = yes
(0) frame [001056]: (1) = 16599 | (2) = 16748 | (3) = 96 | interlaced = yes
(0) frame [001057]: (1) = 13009 | (2) = 12705 | (3) = 74 | interlaced = yes
(0) frame [001058]: (1) = 10831 | (2) = 11430 | (3) = 64 | interlaced = yes
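Reading thousands of these lines by eye gets tedious, so you may prefer to save the output to a file and scan it. The sketch below assumes the lines look exactly like the samples shown above; the function name interlaced_frames and the idea of capturing the output into a string are assumptions made for illustration.

```python
import re
from typing import List

# Matches lines such as:
# (0) frame [044265]: (1) = 243 | (2) = 245 | (3) = 1 | interlaced = no
DETECT_LINE = re.compile(r"frame \[(\d+)\]:.*interlaced = (yes|no)")


def interlaced_frames(log_text: str) -> List[int]:
    """Return the frame numbers that were reported as interlaced."""
    flagged = []
    for line in log_text.splitlines():
        match = DETECT_LINE.search(line)
        if match and match.group(2) == "yes":
            flagged.append(int(match.group(1)))
    return flagged
```

An empty result means every reported frame came out as interlaced = no. Otherwise the returned frame numbers show where the inverse pulldown went wrong.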
Create a PPML script containing the following four lines:
*counter frame_counter
1 *counter font_dir=/usr/share/mplayer/iso-8859-1/arial-28
2 *counter xpos=20 ypos=20
3 *counter sat=80.0 contr=70
Subtitle your video using this script, and it will place a frame
counter in the upper left hand corner of the video.
You can use the mencoder program included with mplayer to subtitle using the mplayer engine. Most of this material is covered pretty thoroughly in the previous version of this guide.
The subtitles produced by mencoder look better on a computer screen than those produced by transcode, but worse on a TV screen. On the other hand, mencoder is for now the only program that supports subtitles in Asian language character sets.
The xste program combined with submux-dvd allows you to generate soft (switchable) DVD subtitles that can be turned on or off by the DVD player. If anyone who speaks English figures out how to get this combination working, let me know.
The subtitler-yuv program is a standalone alternative to the subtitler plugin in transcode. It uses the same PPML script format, but instead of being a transcode filter it functions as a standalone program taking input video on STDIN and writing subtitled video on STDOUT. This strategy has the major advantage that the same video can be chained through multiple instantiations of subtitler-yuv, to imprint multiple subtitle streams onto a single video.