This program illustrates three non-standard ways of supplying subtitle
tracks for videos in HTML5. The cues for tracks are normally stored in a WebVTT file with a
.vtt file type. These files are simply text files with a specified format. The correct MIME type for
these files is text/vtt. Unfortunately, many servers do not supply the correct MIME type. While
most browsers ignore this problem, Internet Explorer 11 refuses to use a .vtt file with the wrong
MIME type. This means that many users will be unable to see valid subtitle track cues. This
leads to the question, "Are there any workarounds for this problem?"
This tutorial looks at three workarounds. In addition to the standard way assumed by HTML5,
we will look at creating cues with code, by reading them from the .html file, and by reading the
.vtt file with XMLHttpRequest.
These techniques work because the HTML5 standard provides a means
of creating tracks and adding cues to them.
This tutorial also demonstrates how multiple tracks and videos can be implemented in HTML5
using Processing.js.
It suggests some ways to solve problems when trying to use tracks. If you
only want to add a single track to a video, you are familiar with the WebVTT file format, you
don't want any sophisticated styling, and
your server supplies the correct MIME type for .vtt files, you may not need this tutorial. It is
intended for developers who are having problems.
The coding is done in Processing.js. Much of it is very close to the coding that would be
used in JavaScript. Most programmers who prefer JavaScript should be able to convert it
with little difficulty. JavaScript coders would probably prefer to use different buttons. The
code handling the videos is similar to that used in the KISVid3 tutorial,
adapted to the array structure used in the KISVid5 tutorials.
Just a quick word about the Spanish and German tracks. Unfortunately I am not proficient
with either of these languages. The Spanish and German tracks for "developerStories"
simply count in that language. The German track for "Kitten" was translated by an online translator. I have no
idea how good the translation is.
Much of the sketch is similar to the other sketches in the tutorial series. One of the new features
is the "Track source" line below the "Source file" line that gives the source of the current track. The
three lines of blue buttons are as in some of the other tutorials. Clicking an up or down button changes
the setting
by 10%. At the very bottom on the left is a comment about the status of the video which
unfortunately has not been 100% accurate.
On the left near the bottom are three radio button choices for the video. Above them are 1 to 3
(depending on the video) radio button choices for the track language. The
number before the language encodes information about the status of the track. Numbers greater
than 100 indicate the .vtt file for the track has not been processed by the browser yet.
Negative numbers also mean that the track has
not been created yet. "-2" indicates that the source will be created from code. "-3" indicates the track
will be created by reading cues from the .html file. Finally, numbers from 0 to 99 indicate
the track has been processed and give the number of the track in the array of tracks in the video
object.
Let's consider the three radio buttons in the lower right. When active, the "Hide subtitles"
button will conceal the subtitles.
The "Show information" button will add a considerable amount
of information about the current track just below the video frame. On the right there is a listing
of the current cues together with their optional identifiers, start time and end time. On the left
is a significant amount of information about the status of the track and different structures
as specified by the standards. In some cases, this information can be helpful during debugging.
Finally, consider the "Read VTT files" button. When this button is active, any further reads of a
track will be done by using XMLHttpRequest even if the browser
is capable of handling the track in the normal way. It is important to note that, at least on
my server at the present time, IE 11 always uses this method even if the button has not been
selected. Also note that tracks that have already been read are not reread when the button
is selected, and tracks that are generated by code or from HTML cannot be processed by this
method.
The state of tracks in HTML5
The state of tracks has improved significantly in the last couple of years. It appears that the
concept of tracks was added to HTML5 before much thought was given to how it would be
implemented. But now all the major browsers support tracks, or at least parts of the feature. For example,
the developerStories video in
http://www.html5rocks.com/en/tutorials/track/basics/?redirect_from_locale=it
has an English track that works in the major browsers except IE11. (That site does not include
the .mp4 version of the video, which is the only video type supported by IE11.) However (at
least as of February 2015), if you
try to use tracks on your own, you are very apt to find that there are significant problems.
We have already mentioned that Internet Explorer 11 requires the correct MIME type for .vtt files.
Unfortunately
my server (as of February 2015) does not supply the correct MIME type for .vtt files. If your server
does not supply the correct MIME type, there are some possible workarounds. This tutorial
suggests three workarounds that I haven't seen elsewhere.
First a comment about using browsers. Only Firefox allows running Processing.js code without
using a server. Hence the author has done most of the development with it. Unfortunately, having
a sketch that works properly in Firefox doesn't mean that the code will work properly in other
browsers when it is moved to a server. In addition, Firefox has not implemented some features
available in some other browsers. The previously mentioned Internet Explorer requirement
of the correct MIME type is only one such problem. Clicking the "+/-" link below will show some
comparisons of the features implemented by various browsers.
The author has a PC and hence has been unable to do testing with the latest versions of
Safari. It is his understanding that it handles tracks. The sketch has been tested in Firefox 35,
Opera 26, Chrome 40, and Internet Explorer 11 running on a Windows 7 computer.
This tutorial is very long and readers may be interested in only certain parts. To
simplify finding a particular part, readers can just click the [+/-] links to expand the part
they are interested in.
HTML5 text tracks come in several kinds, including:
subtitles: Transcription or translation of the dialogue, suitable for when the sound
is available but not understood
captions: Transcription or translation of the dialogue, sound effects, relevant musical
cues, and other relevant audio information, suitable for when the soundtrack is unavailable
descriptions: Textual descriptions of the video component of the media resource,
intended for audio synthesis when the visual component is unavailable
chapters: Chapter titles, intended to be used for navigating the media resource.
This tutorial only discusses subtitle tracks. A subtitle track consists of a number of cues
with some text that appear and disappear at specified times.
The WebVTT file format is discussed in many webpages (e.g.
http://dev.w3.org/html5/webvtt/) so only a quick
discussion is given here. (One needs to be aware that some older sources suggest styling techniques
that are out-of-date.) Please note that WebVTT files use UTF-8 which may be important when
using non-standard characters or languages with special characters.
The general format of a track file is as follows:
WEBVTT optional comment (first line of file)
blank line
optional cue identifier
startTime --> endTime optional "WebVTT cue settings list"
Text for the cue
(repeat from the blank line for each cue)
This tutorial ignores "Region:" lines which can follow the WEBVTT line. The diagram also ignores
Notes (comments). The following is an example of a track with three cues and two NOTES.
WEBVTT file

sp0
00:00:00.000 --> 00:01:11.000 line:5% align:start size:50%
This cue is shown for the first 71 seconds.

NOTE This is a 2 line comment
2nd comment line

sp1
00:00.100 --> 00:04.000
This is the second cue. It appears at 0.100 second
and disappears at 4.000 seconds.

00:00:04.000 --> 00:00:08.123
This is the third cue. It appears at 4.000 seconds
and disappears at 8.123 seconds.

NOTE This comment uses 1 line
The example contains the following parts: required text, comments (optional), cue identifiers (optional),
start times, end times, a WebVTT cue settings list (optional), and the text for each cue.
The blank lines are required.
All WebVTT files must begin with "WEBVTT" optionally followed by a comment. Each cue begins with a blank line.
The optional id is similar to ids in HTML tags. In the example, sp0 and sp1 are ids. The id
was omitted from the third cue. It is common to number cues 1, 2, 3, ... .
Times must have the following format:
hh:mm:ss.sss or mm:ss.sss
where hh is a two digit hour, mm is the minute, and
ss.sss is the seconds. The number of digits in each part must be exactly
as shown! mm is limited to the range 00 to 59 and
ss.sss cannot be more than 59.999.
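The time format above can be checked and converted to seconds with a short function. The following is only a sketch in JavaScript, not code from the sketch itself; the name parseVttTime is hypothetical, and accepting more than two hour digits is an assumption that goes beyond the two digit form shown above.

```javascript
// Convert a WebVTT timestamp ("hh:mm:ss.sss" or "mm:ss.sss") to seconds.
// Returns NaN when the text does not match either form.
// Assumption: hours may use two or more digits.
function parseVttTime(text) {
  const pattern = /^(?:(\d{2,}):)?([0-5]\d):([0-5]\d)\.(\d{3})$/;
  const match = pattern.exec(text.trim());
  if (match === null) {
    return NaN;
  }
  const hours = match[1] === undefined ? 0 : Number(match[1]);
  const minutes = Number(match[2]);
  const seconds = Number(match[3]);
  const millis = Number(match[4]);
  return hours * 3600 + minutes * 60 + seconds + millis / 1000;
}
```

For the first cue of the example, parseVttTime("00:01:11.000") gives 71, and a time such as "1:02.000" is rejected because the minutes do not have two digits.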
The optional "WebVTT cue settings list" specifies where the cue is to appear
in the video viewport. The default is centered near the bottom. The first cue will be shown 5% from the
top, is aligned on the left side of the video viewport, and can use only 50% of the video viewport (wrapping if
needed).
The text portion gives the text that will be shown for the cue. If necessary, the cue is word wrapped.
The cue can have multiple lines and new lines are "respected".
The text can include <b>, <i>, and <u> tags, which are used exactly the same
way as in HTML. In addition, there are tags that are not part of HTML. Class tags
<c.classname> are used with CSS classes. <ruby> and <rt>
are used with ruby characters. Voice tags <v name> can be used with CSS to give
a distinct style to various speakers. <lang> tags are used to specify a language. All of the tags
(except possibly the <v name> tag) require the corresponding closing tag.
It is important to understand that any errors in the WebVTT file can prevent individual
cues or the entire track from being processed.
"NOTE"s can have one or more lines. They are followed by a blank line.
The following code shows the <video ...>, <source..> and <track ...> tags used in the
html file for the sketch.
<video id="videotag" preload="auto" width="420">
  <source src="vid/developerStories-en.mp4" type='video/mp4' />
  <source src="vid/developerStories-en.webm" type='video/webm' />
  <source src="vid/developerStories-en.ogg" type='video/ogg' />
  <track id = "enTrack0" src = "vid/developerStories-subtitles-en.vtt" label="English" kind="subtitles"
         srclang="en" default />
  <track id = "spTrack0" src = "vid/counting-subtitles-sp.vtt" label="Spanish" kind="subtitles" srclang="es" />
  <track id = "deTrack1" src = "vid/Kitten-de.vtt" label="German" kind="subtitles" srclang = "de" />
  <track id = "enTrack2" src = "vid/jwplayerFeatures.vtt" label="English" kind="subtitles" srclang="en" />
  <track id = "frTrack2" src = "vid/junk.vtt" label="French" kind="subtitles" srclang="fr" />
  Your browser does not support HTML 5 video. You may want to update your
  browser to a current version.
</video>
The track with id = "enTrack0" is marked
default. Only one subtitle track can have this designation.
The <video ...> element is somewhat longer than needed because the source files
could be replaced by code in the sketch as shown in KISVid5.
Unfortunately, it appears that the HTML5 standard does not provide a way to replace the track tags by
Processing.js code.
The HTML5 standard provides some functions that allow programs to create a track and add cues to it.
They include:
textTrack = media.addTextTrack(kind [, label [, language]]) This method creates and returns a new textTrack object. kind is one of the 5
track kinds (e.g. "subtitles") mentioned earlier. label is the name that is used for the track.
language is a two letter code for a language. The track is added to the media's
list of text tracks.
textTrack.addCue(cue) This method is used to add a cue to the track.
textTrack.removeCue(cue) One can use this method to remove a cue from the track.
The standard specifies creating cues with the constructor:
VTTCue(startTime, endTime, text)
But unfortunately, Internet Explorer 11 provides a different constructor:
TextTrackCue(startTime, endTime, text)
There is one additional complication. IE 11 doesn't allow cues to be added to an existing track
that was created from a .vtt file. So the
question is, how does one determine which of the constructors to use? KISVidT
declares a variable, useVTTCue, which specifies whether VTTCue can
be used, and has code to determine its value.
boolean useVTTCue = true; // Can browser use VTTCue (IE 11 can't)
...
void determineVTTCue() {
  try {
    if (VTTCue) // This may cause an error (e.g. IE 11)
      useVTTCue = true;
    else
      useVTTCue = false;
  } catch (Exception err) {
    useVTTCue = false;
  }
} // determineVTTCue
In the if statement, VTTCue will be true if it exists. If not, it will be false or cause an error.
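In plain JavaScript the same decision can be made without try/catch by testing the type of each constructor. The following is a sketch only, not the KISVidT code; pickCueConstructor and makeCue are hypothetical names, and the scope parameter (normally window in a browser) is an assumption made so that the functions can be tried outside a browser.

```javascript
// Return the cue constructor offered by the given global scope,
// preferring the standard VTTCue over the older TextTrackCue.
// Returns null when neither is available.
function pickCueConstructor(scope) {
  if (typeof scope.VTTCue === "function") {
    return scope.VTTCue;
  }
  if (typeof scope.TextTrackCue === "function") {
    return scope.TextTrackCue;
  }
  return null;
}

// Create a cue with whichever constructor is available.
function makeCue(scope, startTime, endTime, text) {
  const CueConstructor = pickCueConstructor(scope);
  if (CueConstructor === null) {
    throw new Error("This browser offers no cue constructor");
  }
  return new CueConstructor(startTime, endTime, text);
}
```

In a browser one might call makeCue(window, 0, 9.9, "Hello"). Testing VTTCue first matters because some browsers expose TextTrackCue without allowing it to be used as a constructor.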
The techniques used suffer from one common problem. The standards do not provide a simple way to
specify a "WebVTT cue settings list", that is, the information on the start and end time line
that specifies where the text is to be located. Instead, each type of cue setting must be handled
individually. The standard specifies the attributes vertical,
snapToLines, line,
position, positionAlign,
size, and align.
(See
http://dev.w3.org/html5/webvtt/#the-vttcue-interface)
To avoid these complications, this tutorial will ignore the cue settings list. For much the same
reason, this tutorial ignores the region attribute.
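Although the sketch ignores the settings list, handling each setting individually could look roughly like the following JavaScript. This is an illustration under assumptions, not part of KISVidT: applyCueSettings is a hypothetical name, only the most common settings are covered, and any alignment suffix after a comma (as in "line:5%,start") is simply dropped.

```javascript
// Copy the settings from a "WebVTT cue settings list" such as
// "line:5% align:start size:50%" onto a cue-like object.
// Settings that are not recognized are skipped.
function applyCueSettings(cue, settingsText) {
  const parts = settingsText.trim().split(/\s+/);
  for (const part of parts) {
    const colon = part.indexOf(":");
    if (colon <= 0) {
      continue; // not a name:value pair
    }
    const name = part.slice(0, colon);
    // Drop any alignment suffix that follows a comma.
    const value = part.slice(colon + 1).split(",")[0];
    if (name === "vertical" || name === "align") {
      cue[name] = value;
    } else if (name === "size" || name === "position") {
      cue[name] = parseFloat(value); // "50%" becomes 50
    } else if (name === "line") {
      // A percentage is a position in the viewport; a bare number is a line count.
      cue.snapToLines = !value.endsWith("%");
      cue.line = parseFloat(value);
    }
  }
  return cue;
}
```

Because the function only assigns properties, it can be tried on a plain object before being used on a real cue.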
We will now turn our attention to the 3 workarounds. The KISVidT code for the three methods is
stored in separate files. Users will normally only use one of the methods in any particular
application.
The first workaround creates the cues with code. The German track for the developerStories-en video uses this technique. The code is in
codeNewTrack.pde.
The methods that we discuss here are extremely simple but they provide a basis for the other
two workarounds. When the main part of the sketch wants to create a new track using
code, it calls this method.
label and languageCode are passed
to this method so they can be used in the addTextTrack call to create
the track.
In this case there is only one track that can be created. If there were multiple tracks,
the createTrackWithCode line could be replaced by logic to pick
the proper procedure. The KISVidTBookKeeping method connects
the new track with the rest of the KISVidT sketch.
The createTrackWithCode method does the actual work of
creating the text track. It is a model for what is done in the other two workarounds.
19 void createTrackWithCode(String label, String languageCode) {
20 double startTime, endTime;
21 String message, id;
22 Track newTextTrack;
23 int i;
24 msgGenerateCodeCue();
25 cueStr = "started creating track with code";
26 showInfo("in createTrackWithCode + useTrack[languageNo]: " + useTrack[videoNo][languageNo]);
27 // create a new track
28 newTextTrack = video.addTextTrack("subtitles", label, languageCode);
29 newTextTrack.mode = "hidden";
30 // create the cues
31 for (i = 0; i < 7; i++) {
32 id = "" + i; // (optional)
33 startTime = 10 * i;
34 endTime = startTime + 9.9;
35 message = "Dies ist Deutsch - " + startTime;
36 newTextTrack.addCue(newCue(startTime, endTime, message));
37 try { // (optional) the id must be added independently
38 newTextTrack.cues[i].id = id;
39 } catch (Exception e) {// ignore
40 }
41 }
42 cueStr = "Finished creating cues with code";
43 showInfo("Completed creating cues with code");
44 } // createTrackWithCode
Lines 24 - 26 and 42 - 43 are simply progress reports in case debugging is needed and can be
ignored. The important code begins in lines 28 and 29 where the new track is created. The
variable video is simply the video object being used in the
sketch. The track mode is set to "hidden" because in the default mode, the track cannot hold
any cues. Lines 31 - 41 are kept extremely simple to illustrate the concept. Seven cues are created
which appear at 10 second intervals. addCue is the HTML5 standard
method for adding a cue to a track but newCue is a
method in KISVidT.pde that creates the cue. Recall that Internet Explorer 11 uses a nonstandard method
for creating cues, so
newCue simply uses the appropriate constructor: either the standard
VTTCue or the TextTrackCue required by
IE 11.
The methods for creating a cue unfortunately do not include the optional identifier so
it is added separately in line 38. The try/catch construction is added just in case there are any problems.
As indicated earlier, the method ignores the "WebVTT cue settings list" which specifies
where the cue is located. One could add code to the for loop to specify that information. The "region" lines
are also ignored.
Clearly this method, which pretends to add a "German" track, is overly simplified. In
real sketches, one could add arrays for the id, startTime, endTime and message to make
the method actually useful. The next two workarounds use techniques to read the cues from
files. They get increasingly complicated, so this simple method was used to
introduce the concept.
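The idea of driving the loop from stored data can be sketched in JavaScript as follows. This is not code from KISVidT: addCuesFromList and germanCues are hypothetical names, the cue times and text are made up for illustration, and a single array of records is used here in place of four parallel arrays.

```javascript
// Add one cue per entry of cueData to a track-like object.
// makeCue(start, end, text) is supplied by the caller so that the
// constructor that suits the browser can be used.
function addCuesFromList(track, cueData, makeCue) {
  let added = 0;
  for (const item of cueData) {
    const cue = makeCue(item.start, item.end, item.text);
    if (item.id !== undefined) {
      cue.id = item.id; // the constructors take no identifier
    }
    track.addCue(cue);
    added++;
  }
  return added;
}

// Hypothetical cue data; the third cue has no identifier.
const germanCues = [
  { id: "de0", start: 0, end: 4.5, text: "Guten Tag." },
  { id: "de1", start: 5, end: 9.5, text: "Wie geht es Ihnen?" },
  { start: 10, end: 14.5, text: "Auf Wiedersehen." }
];
```

In a browser that supports VTTCue, the track could come from video.addTextTrack("subtitles", "German", "de") and makeCue could be (s, e, t) => new VTTCue(s, e, t).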
The English track of the Kitten video uses this technique. The code is in
readHTMLFile.pde.
This workaround will read the cues directly from the .html file. An advantage is that we
know the information needed for the cues is available after the .html file is processed instead of
possibly having to wait for the .vtt file to be loaded. There are some disadvantages that we
will discuss later. One question is whether the cues should be added to the .html file in exactly the
same format as in a .vtt file. While that is certainly possible, as will be shown in the third
workaround, it requires a significant amount
of code to extract the required information. To avoid those complications, a simpler format will
be used. The cues are stored in a division marked "display: none",
which hides the cues. Each cue is represented as follows:
id~start time in seconds~end time in seconds~text for the cue~
For example, the sample file with three cues used earlier would
appear as follows:
<div id="simpleExample" style = "display: none">
sp0~0~71~This cue is for the first 71 seconds.~
sp1~.1~4~This is the second cue. It appears at 0.100 second
and disappears at 4.000 seconds.~
~4~8.123~This is the third cue. It appears at 4.000 seconds
and disappears at 8.123 seconds.
</div>
The division identifier can be named as desired. The line structure does not have to be as strict as shown
but new lines in the text part of the cue will be respected. In accordance with the earlier
comment that the "WebVTT cue settings list" and "regions" will be ignored because of additional complications,
there are no provisions for the settings list or regions although they could be added if desired.
The basic idea is to use
document.getElementById("simpleExample").innerHTML
to read the entire division as a single string and then use split("~")
to break the string up into individual units which can be used for cues.
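The splitting step can be tried on its own, without a browser, with a small JavaScript function. This is a sketch rather than the sketch's own code: parseTildeCues is a hypothetical name, and trimming the identifier and the cue text is an assumption made here so that the line breaks between cues in the division do not end up inside them.

```javascript
// Split text in the form id~start~end~text~ into cue records.
function parseTildeCues(divText) {
  const pieces = divText.split("~");
  const cues = [];
  // Each cue needs 4 pieces; whatever is left after the last
  // complete group of 4 (e.g. a trailing line break) is ignored.
  for (let i = 0; i + 3 < pieces.length; i += 4) {
    cues.push({
      id: pieces[i].trim(),
      start: parseFloat(pieces[i + 1]),
      end: parseFloat(pieces[i + 2]),
      text: pieces[i + 3].trim() // line breaks inside the text are kept
    });
  }
  return cues;
}
```

In a browser the argument would be the innerHTML string of the division. A cue without an identifier simply yields an empty id string.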
The code used to read and split the cues for the English track for the Kitten video is as follows:
34 void readCuesFromHTML(String label, String languageCode, String divId) {
35 String startTime, endTime, message;
36 String aStr;
37 String[] result;
38 int i;
39 msgStr += NEW_LINE + "***Read cues from HTML file";
40 // read from the .html file, make changes, and split it into items
41 aStr = document.getElementById(divId).innerHTML;
42 // VTT tags (e.g. <v..>) that are not in HTML5 are messed up by the above read.
43 // So replace "`l" by "<" and "`g" by ">".
44 aStr = replaceAll(aStr, "`l", "<");
45 aStr = replaceAll(aStr, "`g", ">");
46 result = aStr.split("~");
47 showInfo("readCuesFromHTML");
48 // create the track using the cues in result
49 createNewTrack(label, languageCode, result);
50 // do things required by the sketch
51 KISVidTBookKeeping(video.textTracks.length - 1, "HTML file");
52 } // readCuesFromHTML
In addition to the label and languageCode, divId, the identifier used
in the division tag, is passed into the method.
Line 41 reads the entire division into aStr and line 46 splits it into parts separated by "~". Line 49 calls
the method that actually creates the track while line 51 calls the KISVidT method that does the bookkeeping
that allows the sketch to recognize the track. But what about lines 44 and 45? Surprisingly,
getElementById(...).innerHTML handles WebVTT tags that are also
HTML5 tags as expected but really messes up the tags that are new in WebVTT. Hence
the WebVTT tags <b>, <i>, and <u> are processed as expected. Even the <c. ...>
tag is read properly. (However, the HTML5 file checker thinks it is an error.) But the other
WebVTT tags are destroyed. To get around this, "<" can be replaced in the .html file by "`l" and ">" by "`g". Lines
44 and 45 restore "<" and ">".
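The restoring step of lines 44 and 45 can be written in JavaScript with split and join, which works in older browsers as well. This is only a sketch; restoreVttTags is a hypothetical name and is not the replaceAll helper used by the sketch.

```javascript
// Turn the placeholders "`l" and "`g" back into "<" and ">".
function restoreVttTags(text) {
  return text.split("`l").join("<").split("`g").join(">");
}
```

For example, a voice tag stored in the division as `lv Mary`gHello`l/v`g comes back as <v Mary>Hello</v>.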
The createNewTrack method is very similar to the
createTrackWithCode of the previous workaround.
57 void createNewTrack(String label, String languageCode, String[] result) {
58 String id, startTime, endTime, message; // the pieces split from the HTML text are strings
59 Track newTextTrack;
60 int i;
61 newTextTrack = video.addTextTrack("subtitles", label, languageCode);
62 newTextTrack.mode = "hidden"; // temporarily hide the track
63 showInfo("createNewTrack " + result.length + " ");
64 // each cue requires 4 pieces of info
65 try {
66 for (i = 0; i < result.length; i += 4) {
67 cueNo = i/4;
68 startTime = result[i+1];
69 endTime = result[i+2];
70 message = result[i+3];
71 newTextTrack.addCue(newCue(startTime, endTime, message));
72 try { // the id must be added independently
73 newTextTrack.cues[cueNo].id = result[i];
74 } catch (Exception e) {
75 // ignore
76 }
77 }
78 } catch (Exception e) {
79 alert ("Error: while creating a track and adding cues. " + e);
80 }
81 cueStr = "Finished adding English track for video 1" + trackLangs[videoNo][languageNo] + " cues";
82 showInfo("Completed the English track for Kitten video");
83 } // createNewTrack
The for loop counts by "4" because the data for each cue has 4 parts.
The code is very similar to that in the last section. Lines 63 and 81-82 are there only for possible
debugging and can be ignored. Lines 61 and 62 create the track and then "hide" it so that it can actually
hold the cues. The for loop in lines 66 - 77 goes through the cue info in result
4 pieces at a time
and adds the cue to the track. The id is added after the cue has been added. Once again,
the method ignores the "WebVTT cue settings list" and the region option.
One could add code to specify that information.
The third workaround reads the .vtt file with XMLHttpRequest. This method is always used by Internet Explorer 11 and optionally by other browsers except for
the tracks mentioned in the first two alternatives. The code is in readVTTFile.pde.
The code for this alternative is much more complicated for a couple of reasons.
One reason is that it takes some time to load the file and the code that interprets the file can't
start processing until the file has been loaded. The second reason is just the complexity of decoding
a WebVTT file.
When KISVidT wants to read and process a VTT file, it first calls a method that sets up the XMLHttpRequest.
In line 38 of that method, oReq gets the XMLHttpRequest object. The next several lines set up event handlers.
The first two are commented out because they are not normally needed. The loadstart event
is fired when the file begins to load. The progress event gets updates about the loading
progress but generally that happens so fast that all one sees is 100%. The load
event is critical because it is fired when the loading is complete and the sketch can begin processing the file. As could be
expected, the error handler is used if there is an error. oReq.open
specifies the operation ("get"), the source file (url), and that the load should be asynchronous (true). (Synchronous loads
are discouraged.) oReq.send actually starts the loading process.
Because the load event handler specifies it, the
transferComplete method is called when the .vtt file loading is
complete.
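A request of the kind described above might look like the following in JavaScript. This is a sketch and not the method from readVTTFile.pde: loadVttText, onText, onError, and the optional XhrCtor parameter are hypothetical names. The XhrCtor parameter is there only so the function can be tried with a stand-in object outside a browser.

```javascript
// Load a .vtt file as text and hand it to onText when loading is complete.
// onError is called for a network error or a non-success HTTP status.
// Assumption: XhrCtor defaults to the browser's XMLHttpRequest.
function loadVttText(url, onText, onError, XhrCtor) {
  const Ctor = XhrCtor || XMLHttpRequest;
  const req = new Ctor();
  // "load" fires when the transfer has finished.
  req.addEventListener("load", function () {
    if (req.status >= 200 && req.status < 300) {
      onText(req.responseText);
    } else {
      onError(new Error("HTTP status " + req.status));
    }
  });
  // "error" fires when the request could not be completed.
  req.addEventListener("error", function () {
    onError(new Error("Unable to load " + url));
  });
  req.open("GET", url, true); // true requests an asynchronous load
  req.send();
  return req;
}
```

Because the text is requested directly, the MIME type reported by the server does not prevent the cues from being read.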
69 void transferComplete(Object evt) {
70 try {
71 // process the file
72 processWebVTT_File(oReq.responseText);
73 // some bookkeeping required for KISVidT.pde
74 // (trackNo must be set to allow modifyCues to work)
75 trackNo = video.textTracks.length - 1;
76 KISVidTBookKeeping(trackNo,
77 aSourceFile + " (XMLHttpRequest)");
78 } catch (Exception e) {
79 alert("Error in transferComplete: " + e);
80 }
81 } // transferComplete
This method asks processWebVTT_File in line 72 to decode the file whose text is stored
in oReq.responseText and to create the track.
After that there is some coding to set up things with the main part of the sketch including calling the
infamous "KISVidTBookKeeping" method. After decoding the .vtt file,
processWebVTT_File calls the following method to actually create the track.
This method is called from processWebVTT_File instead of
transferComplete in order to simplify passing the 4 arrays to it. As before,
lines 214 and 229 - 232 are intended to simplify debugging should it be needed. Line 215 adds the
track using the label and language code specified. Lines 220 - 225 add the cues to the track using the information passed by the
startTime, endTime,
cueText, and id arrays. The constructors
used by newCue do not handle the id so it must be added separately.
newCue calls the appropriate constructor.
The method that decodes the VTT file is long and complicated. So we will look at it in pieces.
(As pointed out earlier, it is assumed that there are no region lines following the WEBVTT line.)
After a number of declarations, the procedure begins by breaking up the string containing the file text
into individual lines.
lines = fileText.split("\n");
Then it checks to make sure the first line begins with the required "WEBVTT".
if (lines[0].indexOf("WEBVTT") < 0) {
  alert("Error: This is not a WEBVTT file. Line 0 is:\n" + lines[0]);
  return;
}
This is followed by the loop that processes cues and notes. That loop begins by skipping
blank lines (checking for the end of the file) and ends by incrementing the cue number.
while (true) {
  // skip blank lines checking for EOF
  while (lineNo < lines.length && lines[lineNo].trim().length == 0) {
  ...
  cueNo++;
} // while
Items processed by the loop could be a "NOTE" (comment) or a cue. First we check for a
NOTE which could contain 1 or more lines.
if (lines[lineNo].substr(0, 4) == "NOTE") {
  lineNo++;
  while (lineNo < lines.length && lines[lineNo].trim().length > 0) {
    lineNo++;
  }
  if (lineNo >= lines.length)
    break;
The only alternative is a cue. The cue may begin with an optional identifier. The identifier
cannot contain "-->", which indicates the time line.
Next come the required start and end times separated by "-->".
// process times
if (lines[lineNo].indexOf(" --> ") < 0) {
  alert("Error: Invalid WEBVTT file. Invalid time line " + lineNo
    + "\n" + lines[lineNo]);
  return;
} else {
  startTime[cueNo] = getTime(lines[lineNo], 0);
  loc = lines[lineNo].indexOf(" --> ") + 5;
  // skip any extra spaces before the end time
  while (lines[lineNo].charAt(loc) == " ")
    loc++;
  endTime[cueNo] = getTime(lines[lineNo], loc);
This method uses a function getTime to return the time in
seconds. That function is supplied the line and the character location of the
beginning of the time. As in the other methods we will ignore any "WebVTT cue settings list".
The loop concludes with code to determine cue text which has 1 or more lines.
s = "";
lineMarker = "";
while (lineNo < lines.length && lines[lineNo].trim().length() != 0) {
  s += lineMarker + lines[lineNo];
  lineMarker = "\n";
  lineNo++;
}
// drop a trailing control character (e.g. a carriage return)
if (s.charAt(s.length()-1) < " ") {
  t = s.substring(0, s.length()-1);
}
else
  t = s;
cueText[cueNo] = t;
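The pieces above can be pulled together into a compact decoder. The following JavaScript is a simplified sketch and not the processWebVTT_File method: parseWebVtt and toSeconds are hypothetical names, the file is split into blocks at blank lines instead of being walked line by line, and, as in the sketch, NOTE blocks, cue settings, and regions are skipped. The times are not validated here.

```javascript
// Convert "hh:mm:ss.sss" or "mm:ss.sss" to seconds (no validation).
function toSeconds(stamp) {
  let total = 0;
  for (const field of stamp.split(":")) {
    total = total * 60 + Number(field);
  }
  return total;
}

// Parse WebVTT text into an array of { id, start, end, text } records.
function parseWebVtt(fileText) {
  const normalized = fileText.replace(/\r\n?/g, "\n").trim();
  // Blocks are separated by one or more blank lines.
  const blocks = normalized.split(/\n(?:[ \t]*\n)+/);
  if (blocks[0].indexOf("WEBVTT") !== 0) {
    throw new Error("This is not a WEBVTT file: " + blocks[0]);
  }
  const cues = [];
  for (const block of blocks.slice(1)) {
    const lines = block.split("\n");
    if (lines[0].indexOf("NOTE") === 0) {
      continue; // a comment of 1 or more lines
    }
    let id = "";
    let row = 0;
    if (lines[0].indexOf("-->") < 0) {
      id = lines[0].trim(); // optional cue identifier
      row = 1;
    }
    if (row >= lines.length || lines[row].indexOf("-->") < 0) {
      throw new Error("Invalid time line in block: " + block);
    }
    const timing = lines[row].split("-->");
    cues.push({
      id: id,
      start: toSeconds(timing[0].trim()),
      // the end time stops at the first space; settings are ignored
      end: toSeconds(timing[1].trim().split(/\s+/)[0]),
      text: lines.slice(row + 1).join("\n")
    });
  }
  return cues;
}
```

Each record could then be passed to the cue constructor and added to a track, with the id assigned separately as described earlier.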
The three workarounds add cues to a track. It has been mentioned that it is possible to delete
cues. It is also possible to modify a cue. There is an exception. Internet Explorer 11 only allows
adding cues to tracks that were not originally created from a .vtt file. The code is in KISVidT.pde.
We begin looking at modifying cues using the modifyCues method in the KISVidT.pde file.
424 // After modification there will be 21 or 23
425 modifySpanish();
426 showInfo("modifyCues");
427 } else if (videoNo == 2 && languageNo == 0) {
428 // attempt to get jwplayersFeatures video to pause at end of cue 14.
429 // does not work in Firefox 34
430 aTrack = arrayTracks[trackNo];
431 msgStr += NEW_LINE + "kkkk aTrack.length: " + aTrack.cues.length + " trackNo: " + trackNo;
432 aTrack.cues[14].pauseOnExit = true;
433 msgStr += NEW_LINE + "***Marked cue for pause on exit: "
434 + aTrack.cues[14].pauseOnExit;
435 }
436 } // modifyCues
437
438 /**
439 * It is possible to add cues to a track but the method depends on
440 * the browser. Firefox 35 creates new cues with VTTCue but
The method modifySpanish, called in line 425, both adds and deletes
cues from the Spanish track for the developerStories-en video. Then modifyCues modifies
cue 14 in the jwplayerFeatures' English track. (Actually cues are numbered starting with 0 so
the cue is the 15th cue.) One of the attributes specified by the standard
is the boolean valued pauseOnExit attribute. When it is true,
the video should pause when the cue closes. We said "should" because this is not implemented
in some browsers. The change is made in line 432.
We will now look at the modifySpanish method which modifies the Spanish track for the
developerStories-en video.
522 msgStr += NEW_LINE + "***Modifying the Spanish track";
523 try { // This doesn't work in IE 11
524 aTrack.addCue(newCue(21.000, 21.999, "veinte y dos"));
525 aTrack.addCue(newCue(20.000, 20.999, "veinte y one"));
526 msgStr += INDENT_NEW_LINE + "Added two cues to Spanish track";
527 } catch (Exception cueErr) {
528 msgStr += INDENT_NEW_LINE + "Error in modifySpanish: Unable to add newCue: " + cueErr;
529 }
530 }
531 try { // IE 11, Firefox 34, Opera 26, and Chrome 39 do not support
532 // getCueById( )
533 deleteCueId("sp21");
534 msgStr += INDENT_NEW_LINE + "Deleted cue <u>by id</u> from Spanish track";
535 } catch (Exception e) {
536 deleteCue("delete this cue");
537 msgStr += INDENT_NEW_LINE + "Deleted one cue <u>by message</u> from Spanish track";
538 }
539 showInfo("modifySpanish");
540 } // modifySpanish
541
542 /**
543 * bookingkeeping
544 */
Lines 522, 526, 528, 534, 537, and 539 are intended to be helpful should debugging be necessary.
Lines 524 and 525 attempt to add two cues to the track. Because this process will fail in
IE11, they are enclosed in a try/catch structure.
Two methods are supplied to delete cues. deleteCueId uses
a standard method to delete the cue by its id. Unfortunately it is not always implemented, so an
alternative method deleteCue has been provided. Let's begin
by looking at deleteCueId.
515 /**
516 * Modify the Spanish track by adding and deleting cues
517 */
518 void modifySpanish() {
It takes advantage of a method specified by the standards, getCueById,
in line 516. Unfortunately that method is not always implemented. The next line uses the standard
method removeCue. The alternative method is
deleteCue.
500 return;
501 }
502 }
503 } // deleteCue
504
505 /**
506 * Deletes a cue given its id.
507 * the getCueById function appears in the 28 October 2014 draft but does
508 * not seem to be implemented yet.
This method has to search for the cue with the specified message (assuming that only one
such cue exists). After finding the cue, it can be deleted using the same method used by the
previous method. The loop is terminated if and when the cue is found.
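The two ways of finding a cue can be combined so that the search by id is tried first and the search by message is the fallback. The JavaScript below is a sketch, not the deleteCueId and deleteCue methods of the sketch; findCueById, findCueByText, and deleteCueFromTrack are hypothetical names. Checking for getCueById with typeof is an assumption made here in place of the try/catch used by the sketch.

```javascript
// Find a cue by id, using getCueById when the cue list offers it
// and a plain search otherwise. Returns null when nothing matches.
function findCueById(track, id) {
  const list = track.cues;
  if (typeof list.getCueById === "function") {
    return list.getCueById(id);
  }
  for (let i = 0; i < list.length; i++) {
    if (list[i].id === id) {
      return list[i];
    }
  }
  return null;
}

// Find the first cue whose text equals the given message.
function findCueByText(track, text) {
  for (let i = 0; i < track.cues.length; i++) {
    if (track.cues[i].text === text) {
      return track.cues[i]; // stop at the first match
    }
  }
  return null;
}

// Remove the cue with the given id or, failing that, the given text.
// Returns true when a cue was removed.
function deleteCueFromTrack(track, id, fallbackText) {
  const cue = findCueById(track, id) || findCueByText(track, fallbackText);
  if (cue === null) {
    return false;
  }
  track.removeCue(cue);
  return true;
}
```

With the Spanish track of the sketch, a call could look like deleteCueFromTrack(aTrack, "sp21", "delete this cue").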
What remains is how the sketch keeps track of the tracks in view of the fact that there are
three different videos each with anywhere from 1 to 3 tracks. We need to keep track of the
language and the code for the language. We also need to know the number of the track in the
video's array of tracks and the source for the track. It takes several arrays. These arrays
take advantage of doubly subscripted arrays with a different number of elements per row.
41 String[][] trackLangs = {{"English", "Spanish", "German"}, // video number 0
42 {"English", "German"}, // video number 1
43 {"English", "French"}}; // video number 2
44 String[][] trackIds = {{"enTrack0", "spTrack0", "deTrack0"}, // video number 0
45 {"enTrack1", "deTrack1"}, // video number 1
46 {"enTrack2", "frTrack2"}}; // video number 2
47 int[][] useTrack = {{OFFSET_TRACKS+0, OFFSET_TRACKS+1, USECODE}, // for video 0
48 {USEHTML, OFFSET_TRACKS+2}, // for video number 1
49 {OFFSET_TRACKS+3, OFFSET_TRACKS+4}}; // for video number 2
...
54 String[][] trackSource = {{UNKNOWN, UNKNOWN, UNKNOWN}, // for video number 0
55 {UNKNOWN, UNKNOWN}, // for video number 1
56 {UNKNOWN, UNKNOWN}}; // for video number 2
The trackLangs array holds the language displayed for a
track. The trackIds array
holds a code that includes the two-letter code for the language and the video number. The
id is used to determine the name of the .vtt file or the division in the HTML file for tracks
read from that file. The useTrack array serves a double purpose. Initially
it tells the program where it
can find the cues for the track. USECODE tells the program
to use the code alternative to create the cues. USEHTML says to
read the cues from the HTML file. Codes of the form
OFFSET_TRACKS + n say that the .vtt file exists. The number n
specifies the position of the file in the list of tracks in the <video ... > tags.
After a track is used the first time, these original numbers are replaced by the position of the
track in the video object's array of tracks. Finally, the trackSource array
holds the source of the track (often the name of the .vtt file) that is displayed in the sketch.
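The encoding used by useTrack can be summarized in a small Java sketch. The constant values are the ones declared in the sketch; the describe helper and its wording are hypothetical and exist only to show how a single integer can carry all four meanings.

```java
class UseTrackCodes {
    static final int USECODE = -2;        // cues generated by code
    static final int USEHTML = -3;        // cues read from the HTML file
    static final int OFFSET_TRACKS = 100; // added to the number of a <track ...> tag

    // Hypothetical helper: explains what a useTrack entry means.
    static String describe(int code) {
        if (code >= 0 && code < OFFSET_TRACKS) {
            // The track was used before; the code is its position in arrayTracks.
            return "ready: arrayTracks[" + code + "]";
        } else if (code == USECODE) {
            return "generate cues in code";
        } else if (code == USEHTML) {
            return "read cues from the HTML file";
        } else if (code >= OFFSET_TRACKS) {
            // Not used yet; the remainder is the position of the <track ...> tag.
            return "<track> tag number " + (code - OFFSET_TRACKS);
        }
        return "unknown code";
    }

    public static void main(String[] args) {
        System.out.println(describe(OFFSET_TRACKS + 2));
        System.out.println(describe(USECODE));
        System.out.println(describe(1));
    }
}
```

Because the ready positions occupy 0 to 99 and the untouched <track ...> entries start at 100, the two ranges never collide as long as a video has fewer than 100 tracks.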
Additional arrays and variables are listed below along with comments explaining their use.
23 final int UNKNOWN = -1; // current track number source has not been determined
24 final int USECODE = -2; // the track will be generated by Processing.js code
25 final int USEHTML = -3; // the track will be created from info in HTML file
26 final int OFFSET_TRACKS = 100; // Added to a track number that has <track ...>
27
28 // *********** Global fields for tracks ***********
29 int videoNo = 0; // current video number, initialized to the first video
30 int languageNo; // current language number
31 int trackNo; // current track number
32 HTMLTrackElement htmlTrack; // an HTML TrackElement object
33 Track aTrack; // the current track
34 String trackId; // current track identifier
35 TextTrackCue[] cues; // the list of cues for the current track
36 TextTrackCue cue; // the current cue
37 boolean showTheInfo; // if true, show the extra information
38 boolean readVTT; // if true, use XMLHttpRequest to read .vtt file in the future
39 boolean useVTTCue = true; // Can browser use VTTCue (IE 11 can't)
40 Track[] arrayTracks = null; // An array of available tracks. Set up on first use
...
58 int numTracks = trackLangs[videoNo].length; // number of tracks for the current video
59 ToggleButton[] trackBtn = new ToggleButton[numTracks]; // buttons for identifiers
60 ToggleButton hideSubtitlesBtn; // Button for hiding subtitles
61 ToggleButton showInfoBtn; // Button for showing the extra info
62 ToggleButton readVTTBtn; // Button for saying that XMLHttpRequest should be used
The video number was set in the variable declaration section shown above. The language
number is the only other piece of information needed to completely determine which track
will be used. initializeTracks will be called after a two-second delay
to allow the .vtt file to be read.
Let's turn our attention to the initializeTracks method. The first
time the method is called, it does some initialization of the track tags.
299 void initializeTracks(){
...
309 if (arrayTracks == null) { // first time
310 initArrayTracks();
311 for (i = 0; i < arrayTracks.length; i++) {
312 arrayTracks[i].mode = "hidden";
...
The mode of tracks in arrayTracks (which is given a value in
initArrayTracks) is set to "hidden" because the tracks cannot
hold cues in the default mode.
The initArrayTracks method determines whether
VTTCue is defined
and then assigns the video's array of tracks to arrayTracks.
372 void initArrayTracks() {
373 // determine if VTTCue is defined (it isn't in IE 11)
374 determineVTTCue();
...
377 arrayTracks = video.textTracks;
...
379 } // initArrayTracks
Next, the trackId is obtained from the trackIds array. It is followed by a call to
getTrackId, which just gathers information about the track and
isn't actually required.
Continuing in the initializeTracks method, we reach
the selection of the way to process the track. We will consider 5 cases.
Real applications would normally have only 3 of them (already processed, process the
.vtt file in the normal HTML5 fashion, and an alternative to take care of IE 11 when the
server doesn't provide the correct MIME type). In fact, one could ignore
the standard method and always use one of the alternatives. Then only two cases would be needed.
We look first at the cases where
the track has already been used, is to be generated by code, or is to be read from the .html file.
321 useTrackCode = useTrack[videoNo][languageNo];
322 if (useTrackCode >= 0 && useTrackCode < OFFSET_TRACKS) {
323 // The track is already set up and ready to be played
324 useExistingTrack();
325 } else if (useTrackCode == USECODE) {
326 // generate cues with Processing.js code
327 generateCodeCues(trackLangs[videoNo][languageNo],
328 trackId);
329 } else if (useTrackCode == USEHTML) {
330 // create cues by reading them from the .html file
331 readCuesFromHTML(trackLangs[videoNo][languageNo],
332 trackId, "EnglishData1");
If the track has already been used, then the useTrackCode
stored in the useTrack array holds the subscript of the track
in arrayTracks and is between 0 and 99 (= OFFSET_TRACKS - 1). On the other hand, if the
useTrack entry is USECODE,
then that track hasn't been used yet and must be generated by code. If the
useTrackCode is USEHTML, then
we will need to read the cues from the HTML file.
There are two remaining cases. We need to decide whether we are going to use the standard
HTML5 method or use alternative 3, which uses XMLHttpRequest to read the file. In
either case, there is a <track ...> tag in the .html file that specifies the
.vtt file and some other information.
333 } else { // the <track...> tag should exist
334 trackNo = useTrackCode - OFFSET_TRACKS;
335 try { // required because cues == null if
336 arrayTracks[trackNo].mode = "hidden";
337 } catch (Exception e) {
338 }
339 if (arrayTracks[trackNo].cues != null
340 && arrayTracks[trackNo].cues.length > 0
341 && !readVTT) {
342 // the cues have been read from a .vtt file
343 useStandardMethod();
344 } else {
345 // the <track ...> exists but will not be read in the normal
346 // fashion. Use XMLHttpRequest() to read file.
Recall that arrayTracks is the array of tracks from the video
structure. Assuming the .vtt file is processed in the normal HTML5 way, the
useTrackCode is OFFSET_TRACKS larger than the sequence number of
the track tag, which is also its position in arrayTracks. The .vtt file will be
processed in the standard HTML5 way if the track cues are not null, the number of cues is not 0, and
the ReadVtt files button has not been set. The last alternative is to use workaround 3. Line 348
gets the name of the .vtt file, sourceFile, from the track tag. An
entry in the trackSource array gives the source of the track. Then the
readCuesFromVttFile method is called to read the file.
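The test that chooses between the standard method and workaround 3 can be isolated in a few lines of Java. This is a sketch only: the method name canUseStandardMethod is hypothetical, and a null Integer stands in for a track whose cues property is null.

```java
class StandardMethodCheck {
    // cueCount is null when the track has no cue list at all.
    // readVTT is true when the user asked for XMLHttpRequest to be used.
    static boolean canUseStandardMethod(Integer cueCount, boolean readVTT) {
        return cueCount != null && cueCount > 0 && !readVTT;
    }

    public static void main(String[] args) {
        System.out.println(canUseStandardMethod(12, false));   // browser loaded the cues
        System.out.println(canUseStandardMethod(null, false)); // e.g. wrong MIME type in IE 11
        System.out.println(canUseStandardMethod(12, true));    // user forced the workaround
    }
}
```

Whenever the check fails, the sketch falls back to reading the .vtt file itself, so a track the browser loaded correctly can still be routed through the workaround for testing.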
There are just a few details left.
361 for (i = 0; i < useTrack[videoNo].length; i++) {
362 trackBtn[i].setLabel(useTrack[videoNo][i] + " " + trackLangs[videoNo][i]);
363 }
364 hideSubtitlesBtn.setState(false); // set true or false as desired
...
368 video.play();
369 } // initializeTracks
This section sets the language buttons to the languages for this video, makes sure the
hideSubtitlesBtn is set to display the subtitles, and finally
tells the video to start playing.
After the special processing for any of the 5 ways to process a track, there is a call to the
KISVidTBookKeeping method. There are several things that must
be done after a track is created and/or selected before it can be used. These are carried out by
this method.
550 trackSource[videoNo][languageNo] = source;
551 aTrack.mode = "showing"; // now show the track
552 cues = aTrack.cues;
553
554 // Make changes in cues (Make sure this is done after the VTT file has been read)
555 modifyCues();
556
557 msgATrack();
558 msgCue(); // show cue[0]
559 listCues(); // create list of cues
560 showInfo("bookkeeping");
...
566 */
The track number and the track source are passed in. The procedure begins by
setting trackNo and then selecting the track
aTrack from the array of tracks. The corresponding element in
the useTrack array is given the track number to signal that the
track is ready to be used if the user wants to use it again, and the source is stored in the
trackSource array. The mode of the track is set to
"showing" so its cues will be displayed. Finally, cues
gets the list of cues from aTrack.
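The effect of this bookkeeping on the two arrays can be sketched in plain Java. The array contents follow the pattern of the sketch's declarations for a single video, while recordTrack, isReady, and the source string are hypothetical names chosen for the illustration.

```java
class BookkeepingSketch {
    static final int USECODE = -2;
    static final int OFFSET_TRACKS = 100;

    // One video with three tracks, in the style of the sketch's arrays.
    int[][] useTrack = {{OFFSET_TRACKS + 0, OFFSET_TRACKS + 1, USECODE}};
    String[][] trackSource = {{null, null, null}};

    // Records that a track now lives at position trackNo in the track array.
    void recordTrack(int videoNo, int languageNo, int trackNo, String source) {
        useTrack[videoNo][languageNo] = trackNo;
        trackSource[videoNo][languageNo] = source;
    }

    // True once the entry holds a position in the track array, not a setup code.
    boolean isReady(int videoNo, int languageNo) {
        int code = useTrack[videoNo][languageNo];
        return code >= 0 && code < OFFSET_TRACKS;
    }

    public static void main(String[] args) {
        BookkeepingSketch b = new BookkeepingSketch();
        System.out.println(b.isReady(0, 2));
        b.recordTrack(0, 2, 2, "generated by code");
        System.out.println(b.isReady(0, 2));
    }
}
```

After the update, selecting the same language again takes the "already set up" branch, so the cues are not created a second time.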
After all this is done, the cues can be modified by modifyCues
if needed for the selected track.
Some new buttons were added to help process tracks, and the processing for the video file
select buttons had to be modified slightly. For example, the three radio buttons in the lower right
must be processed when clicked.
In each case, the radio button is toggled and an appropriate method is called.
The following code is used if one of the video file or track language buttons is clicked.
264 } else {
265 for (i = 0; i < fileBtn.length; i++) {
266 if (fileBtn[i].isOver()) {
267 videoNo = i;
268 setSource(files[i]);
269 setRadioButton(fileBtn,i);
270 setupTrack(0);
271 return;
272 }
273 }
274 for (i = 0; i < useTrack[videoNo].length; i++) {
275 if (trackBtn[i].isOver()) {
276 setupTrack(i);
277 return;
278 }
279 }
280 }
281 } // mouseClicked
The array of video file buttons is named fileBtn. A new step is
needed to handle the tracks. Line 270 calls setupTrack to specify
that language track 0 for that video is to be used.
That method is also executed if one of the trackBtn buttons has been
clicked to select a particular language.
When the track is changed, the old track must be hidden. As we have seen
in the setup method, the language number must be changed before
initializeTracks can be called.
The appropriate track buttons must be displayed in
case the video file was changed. Then the initialization of the track can begin.
The useExistingTrack method is called when the initializeTracks method finds that
trackNo < OFFSET_TRACKS. The method is not really needed; the call to it could be replaced by the call to
KISVidTBookKeeping in line 400. The one advantage provided by the
method is the call to msgUseExistingTrack, which adds a message to the
"Show information" display saying that an existing track is being used.
The useStandardMethod method is called to specify a track that is processed by the browser using the standard HTML5
method. Like the previous method, this method is not really needed. The call to this
method could be replaced by the call to KISVidTBookKeeping in lines 409
and 410. Using the method has two advantages. The try/catch structure clearly identifies the problem should there
be an error, and line 408 adds a line to the "Show information" display saying the cues in the .vtt
file are being used.
We will now shift our attention to the information presented when the "Show information" button
is selected. When the work on this tutorial began, there were lots of questions and problems associated
with what parts of the standards were actually supported by various browsers and the coding needed
to implement them. This led to a rather
elaborate system that shows details of the track system and helps in debugging. Two global
variables and several methods are used. The variables msgStr and
cueStr hold the text shown when the "Show information" button is
selected. Several msg... methods are used to add information to
msgStr. The method listCues adds the cues to
cueStr. The showInfo method writes these strings
to the web page. The methods were added to help separate the code used for the "information"
system from the code actually needed to handle the tracks. Hopefully this will make it easier to create
actual applications that don't need the added "information".
Let's begin by looking at the right side of the display, which shows the cues for the current
track. As mentioned, the list is created by the showInfo method, which is
called near the end of the initializeTracks method.
The left side of the display shows a large amount of information about the tracks being used.
Depending on the way the track was created, the following information may be included:
A list of the tracks that are currently available. The information includes the label, the
id (if the track was created by the HTML5 method), and the number of cues in that track.
Attributes for the current track, including the kind of track ("subtitles"), its label, its language
code, the mode (which is always "showing"), and the number (length) of the cues. The source
for the track is also included.
Attributes for cue[0] of the current track. This information includes not only its id,
startTime, endTime, and text that are included in the list of cues, but also information about
the cue location and related properties that are not displayed in the list of cues.
Other information shown on the left side may include things like the subscript of the track in the
list of tracks and the source of the track (e.g. the .vtt file, the HTML file, code, whether
XMLHttpRequest was used, or whether an already existing track is being used). Information is
also provided concerning any modifications to the cues carried out by the sketch.
101~0.000~1.990~Kitty - watch out - that is a doberman~
102~2.000~3.999~He has a big mouth and sharp teeth~
103~4.000~5.999~`lvc.myclass`lgBe careful`lv/c`lg~
104~6.000~7.999~Jane says "`lv Jane`gThis is dangerous`l/v`g"~
105~8.000~9.999~Tarzan says "`lv Tarzan`gBetter run away`l/v`g"~
106~10.000~19.999~Are you sure that this is safe?~
107~20.000~24.999~The kitten doesn't seem afraid of the dog~
108~25.000~26.599~Maybe the kitten can just sneak up on the dog~
109~26.600~39.999~Well sneaking up didn't work~
110~40.000~44.999~Things seem to be going fine for the playmates~
111~45.000~49.999~Mr. Doberman could eat kitty alive if he wanted to~
112~50.000~70.000~But the fearless kitty wants to play some more