eXtensions: AMITIAE

AMITIAE - Wednesday 22 October 2014

System Preferences in OS X 10.10, Yosemite: Dictation & Speech

By Graham K. Rogers

With the latest release of OS X, 10.10 Yosemite, there are a number of changes to System Preferences. The Dictation & Speech Preferences panel is for producing text from speech, and speech from highlighted text. While it appears basically the same as before, there are a number of improvements below the surface, particularly with the addition of new voices.

The Dictation & Speech panel has two sections: Dictation and Text to Speech.

Dictation

The dictation feature can be used in any application where typing is normally required, although there are limits to output. On the left side of the panel is a microphone icon (updated from before). This indicates input levels in white and black. As the sound increases, the white rises.

Below this icon, the currently selected input method is shown, for example, "Internal Microphone." Clicking on this reveals a small menu. By default, it is set to Automatic, which uses the most suitable input method. This menu will change if other devices are connected to the computer.

There are two radio buttons in the center top of the Dictation panel: On and Off. A check box below is marked, "Use Enhanced Dictation". When this is checked for the first time, a download of software that will allow offline use and continuous dictation takes place (see below). Below a short text description of the feature are two buttons for Language and Shortcut.

The Language button showed English (US) and English (UK) in my settings. Using a Customize option there are now 41 languages to choose from: a significant increase from what had been available in Mavericks. These are grouped in sections (e.g. Catalan, Chinese, Croatian, Czech). Users in South-east Asia are now represented with selections for Indonesia, Malaysia, Thailand and Vietnam.

Results may vary with user input. While I had better success by changing Siri to a UK English voice on the iPhone, selecting "English (United Kingdom)" did not provide the accuracy I wanted until I turned on Enhanced Dictation.

Pressing the Shortcut button reveals a small menu with a number of options for starting Dictation. Keys that may be specified include Fn (on some computers) and the Command keys. In the same menu, shortcuts may also be turned Off, or a user may select Customize: a text box appears for the user's own choice(s) to be entered. Not all key combinations will be accepted.

Using Customize, I was able to use the F6 key (F11 and F12 were already allocated). If a key or combination cannot work as a shortcut, instead of a warning triangle as before, the feature that already uses it (e.g. Volume) is shown. If the shortcut is turned off, Dictation may be started from an item in the Edit menu of suitable applications. Changing the key combination in the Dictation pane immediately changes the entry in the Edit menu of applications.

When Dictation is started, a small microphone icon appears on the screen that shows the language option, with a button marked "Done" which is pressed when dictation is completed. There is a limit to the amount of dictation a user is able to do in the basic setup. I found that with the standard settings, the input was stopped at about 30 seconds. "Done" then changes to "Cancel".

However, when Enhanced Dictation was selected, a download of almost 500 MB took place and the text in the panel was then changed to remove mention of the download. There were a number of improvements. When I tried the microphone, I was first asked to confirm the language. Adding another language later required a further download: the Thai package was some 714 MB.

The microphone icon is now grey. Sound levels are indicated by white, as on the main panel. I was able to produce text output in each language I tried. When Enhanced Dictation was off, the feature worked in the limited way that it had before. A user may select another dictation language by clicking on the language shown on the microphone icon. There is a delay while the software is loaded.

Sometimes output may need editing and correcting. For example, slight pauses may cause a capital letter to be typed. Accuracy is another problem: while spellings are generally correct, homophones (four, for, fore) may need some fixing.

As an additional note, using the Thai option, I was able to write quite effectively, although I am only able to speak street Thai. I cannot read or write the language, but using the dictation feature, I was able to produce some sentences which were correct.

Used in conjunction with System Preferences > Accessibility, there are also several commands available that enable a user not only to dictate, but also to correct text or enter system commands. The commands available in Accessibility may be deselected, or users may add their own commands.

At the bottom of the screen is a button marked "About Dictation and Privacy". This opens a panel with a text explanation of the use of online access and the storage of data on remote servers.

Text to Speech

The features available in this panel may be useful for those with limited eyesight, for second language learners, or for those whose time is limited. It allows highlighted text to be read out by the System Voice.

The main button at the top of the panel is marked System Voice, with the default being Alex. Considerable work has gone into the output of this voice and it is perhaps the most natural voice available in OS X.

Several other voices are available. On my installation, I already have:

English (Australia) - Karen
English (India) - Veena
English (Ireland) - Moira
English (Scottish Standard English) - Fiona
English (South Africa) - Tessa
English (United Kingdom) - Daniel
English (United States) - Female - Kathy, Vicki, Victoria
English (United States) - Male - Alex, Bruce, Fred
Thai - Narisa

At the bottom of the list is a "Customize" option, which shows another 41 regions, some of which have more than one voice (the Thai selection has "Kanya" and "Narisa"). There is also a section marked, English (United States) - Novelty. This contains a number of voices, such as "Bad News", "Deranged" and "Hysterical" (14 in all). Selecting any voice in the list makes a "Play" button live, so the voice can be tested right away.
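The installed voices can also be listed from Terminal with the say command (say -v '?'). What follows is only a rough Python sketch of reading that list. It assumes each output line has the form "name, locale, #, demo sentence"; the sample text, the Voice type and the function names are made up for illustration and are not part of OS X.

```python
import re
import shutil
import subprocess
from typing import List, NamedTuple


class Voice(NamedTuple):
    """Hypothetical record for one line of the voice list."""
    name: str
    locale: str


# Made-up sample in the ASSUMED layout: name, locale, "#", demo sentence.
# It is not captured from a real machine.
SAMPLE_OUTPUT = """\
Alex                en_US    # (demo sentence)
Bad News            en_US    # (demo sentence)
Narisa              th_TH    # (demo sentence)
"""

# Name (may contain single spaces), two or more spaces, locale, then "#".
_LINE = re.compile(r"^(?P<name>.+?)\s{2,}(?P<locale>\S+)\s+#")


def parse_voices(text: str) -> List[Voice]:
    """Turn voice-list text into Voice records, skipping unrecognised lines."""
    voices = []
    for line in text.splitlines():
        match = _LINE.match(line)
        if match:
            voices.append(Voice(match.group("name"), match.group("locale")))
    return voices


def installed_voices() -> List[Voice]:
    """Ask `say` for its voices; return an empty list where `say` is missing."""
    if shutil.which("say") is None:
        return []
    result = subprocess.run(
        ["say", "-v", "?"], capture_output=True, text=True, check=True
    )
    return parse_voices(result.stdout)


if __name__ == "__main__":
    # Fall back to the made-up sample when not running on a Mac.
    for voice in installed_voices() or parse_voices(SAMPLE_OUTPUT):
        print(f"{voice.locale:8} {voice.name}")
```

Voices that have not yet been downloaded through Customize would not be expected to appear in this list.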

I had already installed two: Fiona, which is a female voice with a Scottish accent; and Narisa, female with a Thai accent. "Narisa" has some interesting "errors", some of which are close to how some Thai speakers read out English. There are a number of inaccuracies, however, that are not simply down to accented speech.

Below the voice selector button is a slider to adjust the rate at which the voice speaks text. The Fast setting is perhaps only understandable by a native speaker, but even then is too fast for full understanding. The Slow setting is almost painfully slow but may well be useful for those learning to read, or for non-native speakers of English. The Normal setting produces output at a reasonable speed for a native speaker. Adjustment using the slider is easy and users should experiment to find the output that suits them best.

Beside the slider on the Text to Speech pane is a Play button which produces an example of the voice selected, at the set speed.
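The voice and rate chosen here have rough equivalents in the say command in Terminal, which takes -v for the voice and -r for the rate in words per minute. The Python sketch below only builds such a command. The function names are hypothetical, and the three rates are illustrative guesses: how the Slow, Normal and Fast positions of the slider map to words per minute is not stated in the panel.

```python
import shutil
import subprocess
from typing import List, Optional


def build_say_command(text: str, voice: Optional[str] = None,
                      rate_wpm: Optional[int] = None) -> List[str]:
    """Build the argument list for `say` without running it."""
    if rate_wpm is not None and rate_wpm <= 0:
        raise ValueError("rate_wpm must be a positive number of words per minute")
    command = ["say"]
    if voice:
        command += ["-v", voice]          # e.g. "Alex", "Fiona", "Narisa"
    if rate_wpm is not None:
        command += ["-r", str(rate_wpm)]  # speaking rate in words per minute
    command.append(text)
    return command


def speak(text: str, voice: Optional[str] = None,
          rate_wpm: Optional[int] = None) -> bool:
    """Speak the text on a Mac; return False where `say` is not available."""
    if shutil.which("say") is None:
        return False
    subprocess.run(build_say_command(text, voice, rate_wpm), check=True)
    return True


if __name__ == "__main__":
    # Illustrative rates only; they are NOT the values behind the slider.
    for label, rate in [("slow", 90), ("normal", 180), ("fast", 300)]:
        print(label, build_say_command("Hello from Yosemite", "Alex", rate))
```

Calling speak() with different rates is a quick way to compare settings side by side, much as the Play button does for the current slider position.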

In the middle of the panel is a faint horizontal line. Below this are features that work with the System.

The first is a checkbox that allows Alerts to be announced (for example when the battery is low). A button alongside (live only when the box is checked) allows options to be set. Options are:
- Voice: The System Voice is selected as default, but this item reveals all voices installed (plus Customize);
- Phrase: this is what the selected voice will announce when there is a need for user attention. If "Application name" (default) is used, the voice will announce something like, "Aperture needs your attention". Options include the next phrase in the list and a random phrase. There are specific phrases (Alert, Attention, Excuse me, Pardon me) and users may create their own using the editing panel provided;
- Delay: this uses a slider to set a delay of 0 to 60 seconds between the need for action and the announcement.

Below the slider are three buttons: Play (to test the voice), Cancel and OK.

A second checkbox is used for announcing selected text using key commands. Alongside is a button to change the key combination from the default. Some keys may not work if they are already allocated to other functions (e.g. F12 - speaker volume or Dashboard).

Text to Speech may not work successfully with some non-Apple applications. However, Services (in the application's main menu) may provide an alternative by adding the highlighted text to iTunes.

There are two other options available that use the System Voice, but these need to be activated by other System Preferences: Have the clock announce the time; and Change VoiceOver Settings. Buttons beside these options open Date & Time Preferences or Accessibility Preferences respectively.
