Important tips for effective operation
Capturing audio material
• The only way to load audio material into this version of VocALign Project 3 is by dragging and dropping a sound file onto the relevant track.
• Audio files for use with this version of VocALign can be in AIFF or WAVE (including Broadcast WAVE) format, at sampling frequencies between 44.1 and 192 kHz, and with resolutions from 16 to 24 bits. The Guide and Dub audio must share the same sampling frequency and resolution, but can have different file formats.
• Interleaved stereo files can be loaded and processed. The channel waveforms will be displayed one above the other. VocALign uses the left channel of the Guide and the Dub to calculate the alignment, then processes both channels identically when the aligned stereo output is rendered.
• VocALign works best on relatively short regions, for example from 20 seconds to a minute. Effective alignment depends on many things, such as gaps in the signals and how similar the signals are. Some experimentation will help you find the best length to process for each signal.
• Try to capture around 0.25 to 0.5 seconds of ‘background’ or ‘leader’ audio at the beginning of your Guide audio selection. This helps VocALign set the noise floor levels and allows better alignment at the start of the signal. (In the current version of VocALign Project 3, the selected audio can be up to two minutes long and must be longer than 0.25 seconds.)
• The Dub does not have to start at the same time as the Guide. In other words, the timecode position of the Dub is not used. The Guide’s timecode is important and so is the amount of ‘background’ captured before the Guide and the Dub.
• In general, it is common for either the Guide or Dub to be part of a much longer audio region than the other is. In most cases, the Guide region is longer (being part of a long transfer), while the Dub is a shorter section. Before processing, the relevant Dub content should be aligned approximately to the Guide content by the user, and start/end points selected to identify the relevant regions concerned.
• Try to leave as much or very slightly more background audio before the Dub signal starts than for the Guide.
• If you capture an audio region that begins in digital silence, VocALign may generate an error as it needs to be able to detect a signal to enable it to set its analysis parameters. Try to start capturing where there is at least some low level background noise within the audio region.
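The constraints listed above can be gathered into a quick pre-flight check. The following Python sketch is not part of VocALign. The `Selection` class, the `preflight` function and the 10 ms window used to detect digital silence are hypothetical names and values chosen for illustration; the numeric limits are the ones quoted in this section.

```python
from dataclasses import dataclass

MIN_RATE_HZ, MAX_RATE_HZ = 44_100, 192_000   # 44.1 to 192 kHz
MIN_BITS, MAX_BITS = 16, 24                  # 16 to 24 bits
MIN_LENGTH_S, MAX_LENGTH_S = 0.25, 120.0     # longer than 0.25 s, up to two minutes
SILENCE_WINDOW_S = 0.01                      # assumption: inspect the first 10 ms


@dataclass
class Selection:
    """Hypothetical description of one selected region (Guide or Dub)."""
    sample_rate_hz: int
    bit_depth: int
    samples: list  # left-channel sample values of the selected region


def check_selection(sel, name):
    """Return a list of problems found in a single selection."""
    issues = []
    if not MIN_RATE_HZ <= sel.sample_rate_hz <= MAX_RATE_HZ:
        issues.append(f"{name}: sampling frequency outside 44.1-192 kHz")
    if not MIN_BITS <= sel.bit_depth <= MAX_BITS:
        issues.append(f"{name}: resolution outside 16-24 bits")
    if sel.sample_rate_hz > 0:
        length_s = len(sel.samples) / sel.sample_rate_hz
        if not MIN_LENGTH_S < length_s <= MAX_LENGTH_S:
            issues.append(f"{name}: length must be over 0.25 s and at most 120 s")
        # A region that opens with exact zeros gives no background noise to analyse.
        window = max(1, round(sel.sample_rate_hz * SILENCE_WINDOW_S))
        head = sel.samples[:window]
        if head and all(x == 0 for x in head):
            issues.append(f"{name}: region begins in digital silence")
    return issues


def preflight(guide, dub):
    """Check both selections, then the rules that apply to the pair."""
    issues = check_selection(guide, "Guide") + check_selection(dub, "Dub")
    if guide.sample_rate_hz != dub.sample_rate_hz:
        issues.append("Guide and Dub sampling frequencies differ")
    if guide.bit_depth != dub.bit_depth:
        issues.append("Guide and Dub resolutions differ")
    return issues
```

An empty list from `preflight` means none of these particular checks failed. It does not guarantee a good alignment, which still depends on the audio content.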
Trimming the captured material
• Captured audio material may need to be trimmed before alignment in order to ensure that the extracts are optimised for processing. VocALign will work best if Guide and Dub material start at a similar point in their energy profiles, and both have a small period of background noise before they begin.
• Remember that Undo (command-Z) is available if incorrect changes have been made.
• It is preferable to trim the starts and ends of extracts by looking at the energy display. You can help VocALign do a good job by trimming the audio so that the initial energy profiles match, which gives the alignment a good chance of starting accurately. To convert a captured audio waveform into the energy display, press Align. Although this will attempt a trial alignment at the same time, you can then trim the material and press Align again to redo the operation.
• Choose an appropriate VocALign setting before pressing Align (e.g. if the Dub is very long compared to the Guide, try ‘Maximum Compression’). Guidance is offered in the following section ‘Alignment settings’.
• After clicking Align, visually inspect the results in the Guide window. The peaks and troughs of the yellow (Aligned Dub) energy trace should line up generally with those of the Guide.
• If you make any changes to the selected waveform or energy profile region after initial alignment (such as modifying the start or end points), or you change any settings affecting alignment, the red bar appears above the Aligned track and Align will need to be pressed again to render new aligned audio.
• If the alignment does not look or sound satisfactory, there are a few options:
a. Select another setting or alignment mode, click Align again, and examine the results.
b. Adjust the ‘leader’ audio before the start of the Guide and Dub to be roughly equivalent, with the Dub leader being slightly longer.
c. Adjust the start or end of the Guide or Dub (see previous section).
d. Choose alternative Guide or Dub audio, if the original selection is thought to be causing the problem.
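To make the terms ‘energy profile’ and ‘leader’ more concrete, the following Python sketch computes a short-time level trace and estimates how much background audio precedes the signal. It is an illustration only and does not reproduce VocALign’s internal analysis. The function names, the 10 ms frame length, the 12 dB rise threshold and the use of the first frame as the noise floor are all assumptions.

```python
import math


def energy_profile_db(samples, sample_rate_hz, frame_s=0.01):
    """RMS level in dB for consecutive, non-overlapping frames.

    Samples are assumed to be floats in the range -1.0 to 1.0.
    """
    frame_len = max(1, round(sample_rate_hz * frame_s))
    profile = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        mean_square = sum(x * x for x in frame) / frame_len
        # Floor the value so that digital silence does not cause log10(0).
        profile.append(10.0 * math.log10(max(mean_square, 1e-12)))
    return profile


def leader_length_s(profile_db, frame_s=0.01, rise_db=12.0):
    """Time before the level first rises rise_db above the opening frame."""
    if not profile_db:
        return 0.0
    floor_db = profile_db[0]  # assumption: the first frame is background only
    for index, level in enumerate(profile_db):
        if level > floor_db + rise_db:
            return index * frame_s
    return len(profile_db) * frame_s  # the level never rose above the floor
```

Running both functions on the Guide and on the Dub gives two leader estimates that can be compared before trimming, bearing in mind the advice that the Dub leader should be equal to or slightly longer than the Guide leader.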
Alignment settings
You can control the alignment settings, which greatly affect how the alignment performs. Use the alignment settings menu to select which preset is active, as shown in Figure 8-1.
The main setting characteristics are described in Table 8-1.
Fig. 8-1 Selecting the alignment mode
Alignment is not very flexible, sound quality may be best.
(Default): it is recommended to try this first as it works best in most cases.
Alignment is the most flexible of the settings, but may compromise sound quality.
Tries to match the Guide by time compressing the aligned audio as much as possible.
Tries to match the Guide by time-expanding the aligned audio as much as possible.
Table 8-1 Alignment settings
Further tips and tricks
To align just the start of the Dub with the Guide (and leave the rest of the Dub unprocessed), use the Guide End Point Selector to select only 0.25 to 1.0 second of the Guide audio for processing, and keep the Dub signal at full length.
To stop the end of a Dub from being stretched to wrongly match a noisy or reverberant Guide, set the end of the Guide for processing to be 0.25 to 1.0 second before the Guide signal of interest ends, and use the entire Dub.
Foreign Dialog Synchronization
This section discusses the problems that arise in foreign language dialog replacement (dubbing) and how and when VocALign can be used to assist this process.
The quality of lip synch that can be achieved in foreign dialog replacement (often called dubbing, doublage, etc.) depends on many factors including:
• The quality of the translation.
• The accuracy of the timing of the new dialog recording.
• The ability of the editor to modify the new dialog.
VocALign will attempt to align one set of audio modulations to another, no matter what the audio signals are. Therefore, it can be used to align the modulations of recorded dialog in one language to recorded dialog in another language. Thus, when there is an audio Guide Track that is in good synch with the lip movements in the picture, VocALign can generally be used to improve the accuracy of the lip synch of the replacement foreign dialog.
Sometimes, unfortunately, the Guide Track will not be in close synch with the picture. This can occur if the Guide Track is itself already badly dubbed or in a different language from the original location recording. In this case, even if VocALign matches the new audio modulations to the Guide, they will be out of synch, just as the Guide is. An experienced editor must align the audio by ear and eye in this case, and VocALign is not likely to be of much assistance.
Also, the Guide track used for dubbing can sometimes contain music and effects. This makes the job for VocALign harder, but not impossible, since it must ‘ignore’ the music and effects in the Guide Track and match only to the dialog.
Lastly, the translation may make the two audio signals so incompatible that the result can never be entirely satisfactory. In this case, the best result sometimes comes from making the two audio signals start together and, if possible, end together. VocALign can help in this case too.
Dialog editors should be very familiar with the use of VocALign on dialog replacement in the same language before attempting to use it for foreign dialog replacement. The following tips are suggestions and not rules, so the user should not only try these techniques, but also experiment further. Every line of dialog is likely to need individual attention.
When to use VocALign
If the Guide Track dialog is in synch with the picture, and the replacement foreign dialog has been translated and recorded to achieve good lip synch, then VocALign should be able to improve the quality of lip synch.
Break the dialog into appropriate length sections.
Remember that VocALign is deliberately restricted to stretching a part of a signal by a factor of 2.0 (100% expansion) and compressing it by a factor of 1/2 (50% compression). If VocALign is trying to expand or compress a gap in the speech, and it has used up its allowance, it may try to expand or compress the neighbouring speech.
This means, for example, that if gaps in the Guide and replacement dialog appear in different places or are of very different durations, then after modifying the gap region, VocALign might also have to expand or compress the replacement speech signal near the gap to best align the modulations. This may lead to unwanted effects. In this case, the user should break the signal into sections that can be treated individually.
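The effect of these limits can be shown with some simple arithmetic. The Python sketch below is a simplified model and not VocALign’s actual algorithm: it treats a single gap in isolation, and the function name `gap_budget` is hypothetical. Only the limits of 2.0 and 0.5 come from the text above.

```python
MAX_STRETCH = 2.0  # 100% expansion, as stated above
MIN_STRETCH = 0.5  # 50% compression, as stated above


def gap_budget(guide_gap_s, dub_gap_s):
    """Return (applied_factor, leftover_s) for matching one Dub gap to a Guide gap.

    leftover_s is the time difference that the gap alone cannot absorb.
    A positive value means neighbouring audio would need expanding,
    a negative value means it would need compressing.
    """
    if dub_gap_s <= 0:
        raise ValueError("dub_gap_s must be positive")
    wanted = guide_gap_s / dub_gap_s
    # Clamp the factor that the gap would need to the permitted range.
    applied = min(MAX_STRETCH, max(MIN_STRETCH, wanted))
    leftover_s = guide_gap_s - dub_gap_s * applied
    return applied, leftover_s
```

For example, a 0.2 second gap in the Dub that corresponds to a 0.7 second gap in the Guide would need a factor of 3.5. With the factor limited to 2.0, the gap grows to only 0.4 seconds, so about 0.3 seconds of the difference would have to come from the speech on either side. A large leftover value suggests that the line should be split at that gap and the sections processed separately.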
Sync up only the beginning of a line
To get only the beginning of a new line of dialog to synch up with the Guide, use the ‘latching’ technique described earlier. This means selecting only about a second at the start of the Guide, which VocALign will try to synch the start of the Dub to, and selecting the entire replacement line for the Dub.
Minimize the amount of VocALign’s Time Compression and Expansion
If VocALign is making too many timing modifications and creating unnatural-sounding speech, try using the ‘Low Flexibility’ setting.
For help and advice, visit the Synchro Arts support website at: http://www.synchroarts.com/index.php?PAGEID=support&ID=vocalign