marcus welz

Android: Ducking Audio with TextToSpeech

Posted on January 22, 2012

The Android app I'm working on makes use of TextToSpeech. It's also very likely that the user is playing music in the background. In order for the music to not drown out the TTS output, we'll make use of audio ducking, a feature that's "new" as of API level 8 (Android 2.2).

The ingredients needed for this are an instance of the TextToSpeech class and the AudioManager.

In order to get spoken text, we use TextToSpeech.speak(String text, int mode, HashMap<String,String> params);

Now, in order to properly get any background music to duck when we start speaking, and raise the volume when we're done speaking, we'll need to know when the TTS is done blabbering. And for this, we can register a callback via TextToSpeech.setOnUtteranceCompletedListener(). There are, however, a few pitfalls that we'll need to watch out for:

  1. TextToSpeech.setOnUtteranceCompletedListener() MUST be invoked after onInit() was called. Otherwise it won't necessarily register properly, and onUtteranceCompleted() may never be called.
  2. When invoking TextToSpeech.speak(), we must provide an "Utterance ID" via the params HashMap using TextToSpeech.Engine.KEY_PARAM_UTTERANCE_ID. If there's no ID, onUtteranceCompleted() will not be invoked.
  3. Because it is possible to queue up more utterances via TextToSpeech.speak() before the previous one has completed, we'll need to track how many we've queued up, so we know to only release audio focus after all of them have been completed. Otherwise we've got a race condition where we queued up two things to speak, but lose focus after the first one has completed. The result is a mess where music will cut in and out while text is being spoken.
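
To make pitfall 3 concrete, here is a minimal sketch of the bookkeeping in plain Java, with no Android classes involved. The names DuckingGate and Focus are made up for this illustration; in a real app, request() and abandon() would wrap the AudioManager calls. The sketch also assumes that the completion callback may arrive on a different thread than speak(), which is why both methods are synchronized.

```java
/** Hypothetical helper: holds audio focus for as long as any utterance is pending. */
final class DuckingGate {

    /** Stand-in for the two AudioManager calls (assumed wrapper, not an Android type). */
    interface Focus {
        void request();
        void abandon();
    }

    private final Focus focus;
    private int pending = 0;

    DuckingGate(Focus focus) {
        this.focus = focus;
    }

    /** Call for every utterance handed to speak(). */
    synchronized void onQueued() {
        if (pending == 0) {
            focus.request(); // first utterance: duck the music
        }
        pending++;
    }

    /** Call from the utterance-completed callback. */
    synchronized void onCompleted() {
        if (pending == 0) {
            return; // stray callback, nothing to release
        }
        pending--;
        if (pending == 0) {
            focus.abandon(); // last utterance done: let the music come back up
        }
    }
}
```

Queueing two utterances and then completing them one by one calls request() once and abandon() once, and only after the second completion.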

With this in mind, we'll create a wrapper class which will encapsulate this functionality.

public class DuckingTTS {

    // Log tag
    private static final String TAG = DuckingTTS.class.getName();

    // Debugging is on (set to false to cut down on spam)
    private static final boolean D = false;

    /**
     * Which audio stream to use. We use the music one here,
     * so that it'll be "in line" with the music volume from whatever is playing.
     */
    private final int STREAM_TYPE = AudioManager.STREAM_MUSIC;

    /**
     * Parameters we're feeding with each invocation of speak()
     */
    private HashMap<String, String> ttsParams;

    /**
     * How many utterances are playing at a particular moment.
     */
    private int mUtterancesPlaying = 0;

    /**
     * The text-to-speech engine
     */
    private TextToSpeech mTts;

    /**
     * Whether TTS is initialized (onInit() called yet?)
     */
    private boolean mIsInitialized = false;

    /**
     * Queue up chatter when TTS is not initialized yet
     */
    private List<String> queue = new ArrayList<String>();

    /**
     * AudioManager injected via RoboGuice
     */
    @Inject
    private AudioManager mAm;

    /**
     * Context will be injected via RoboGuice
     * @param context
     */
    @Inject
    public DuckingTTS(Context context) {

         ttsParams = new HashMap<String, String>();
         ttsParams.put(TextToSpeech.Engine.KEY_PARAM_STREAM, String.valueOf(STREAM_TYPE));
         ttsParams.put(TextToSpeech.Engine.KEY_PARAM_UTTERANCE_ID, "ID");

         mTts = new TextToSpeech(context, ttsOnInitListener);
         mIsInitialized = false;
    }

    public boolean speak(String text) {

        if (!mIsInitialized) {
            if (D) Log.d(TAG, "speak(\"" + text + "\") - queued");
            queue.add(text);
            return false;
        }

        if (mAm != null) {
            // If this is the first utterance (i.e. we're not already talking), request audio focus w/ducking
            // before speaking, so the music is ducked by the time speech starts.
            if (mUtterancesPlaying < 1) {
                int status = mAm.requestAudioFocus(audioFocus, STREAM_TYPE, AudioManager.AUDIOFOCUS_GAIN_TRANSIENT_MAY_DUCK);

                if (status == AudioManager.AUDIOFOCUS_REQUEST_FAILED) {
                    Log.e(TAG, "speak() audio focus request failed.");
                }
            }
            mUtterancesPlaying++;
        }

        // Tell it to speak
        if (D) Log.d(TAG, "speak(\"" + text + "\") - sending to TTS");
        mTts.speak(text, TextToSpeech.QUEUE_ADD, ttsParams);

        return true;
    }

    public void shutdown() {
        mTts.shutdown();
    }

    private TextToSpeech.OnInitListener ttsOnInitListener =
            new TextToSpeech.OnInitListener() {

        /**
         * Callback for when the TextToSpeech engine was initialized.
         * Result will tell us whether this was successful or not.
         *
         * @param status
         */
        public void onInit(int status) {

            if (status != TextToSpeech.SUCCESS) {
                // We just abort on failure; the engine is never fully initialized.
                // This can be bad, by the way, because every speak() call will now add something to the queue.
                return;
            }

            mIsInitialized = true;
            mTts.setOnUtteranceCompletedListener(ttsOnUtteranceCompletedListener);

            // Also speak anything that was queued up so far, then empty the queue
            // so the same text is never replayed.
            for (String text : queue) {
                speak(text);
            }
            queue.clear();
        }

    };

    private TextToSpeech.OnUtteranceCompletedListener ttsOnUtteranceCompletedListener =
            new TextToSpeech.OnUtteranceCompletedListener() {

        /**
         * Callback when TTS has completed an utterance.
         */
        public void onUtteranceCompleted(String utteranceId) {
            if (D) Log.d(TAG, "onUtteranceCompleted(\"" + utteranceId + "\")");
            mUtterancesPlaying--;

            if (mAm == null) return;

            // once we're done speaking, lose audio focus.
            if (mUtterancesPlaying < 1) {
                mUtterancesPlaying = 0;
                mAm.abandonAudioFocus(audioFocus);
            }
        }

    };

    private AudioManager.OnAudioFocusChangeListener audioFocus =
            new AudioManager.OnAudioFocusChangeListener() {

        public void onAudioFocusChange(int focusChange) {

            // I don't think we actually care.
            if (D) Log.d(TAG, "onAudioFocusChange(" + focusChange + ")");

        }
    };

}

That's it. I use it in a service, and I wire it in with a simple annotation using RoboGuice:

public class GpsService extends RoboService {

    @Inject
    private DuckingTTS mDuckingTTS;

    // ... rest of the class implementation ...
}

As a final disclaimer: the class works for me and my purposes at the moment, but it doesn't handle every error scenario. Also, don't forget to call shutdown() to release the TTS resources.
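
One of those unhandled scenarios is the failed onInit() mentioned in the code comment: the queue keeps growing because nothing ever drains it. A rough sketch of one way to deal with it, in plain Java and independent of the TTS classes, is to cap the queue and stop accepting text once initialization has failed. PendingSpeech and its method names are hypothetical, and the capacity is whatever the caller picks.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: holds text until the TTS engine reports its init status. */
final class PendingSpeech {

    private final int capacity;
    private final List<String> queue = new ArrayList<String>();
    private boolean initialized = false;
    private boolean failed = false;

    PendingSpeech(int capacity) {
        this.capacity = capacity;
    }

    /**
     * Returns true if the engine is ready and the caller should speak right away.
     * Otherwise the text is queued (while there is room) or dropped.
     */
    synchronized boolean offer(String text) {
        if (initialized) return true;
        if (!failed && queue.size() < capacity) queue.add(text);
        return false;
    }

    /** Call once from onInit(). Returns the texts to speak now; empty on failure. */
    synchronized List<String> onInit(boolean success) {
        List<String> toSpeak = new ArrayList<String>();
        if (success) {
            initialized = true;
            toSpeak.addAll(queue);
        } else {
            failed = true;
        }
        queue.clear(); // either way, nothing stays queued
        return toSpeak;
    }
}
```

Whether dropping text silently is acceptable depends on the app; logging the dropped text would be an easy addition.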

Filed under: Android

Android: Getting started with RoboGuice 2.0 (beta 3)

Posted on January 8, 2012

When I started messing around with Android, it consisted mostly of copying and pasting example code together to quickly get some results. That works, but the unfortunate side effect is that the Activity or Service class balloons with functionality and features that are better off encapsulated according to proper object-oriented concepts, best practices, and whatnot.

However, once I started spending more time modeling classes, I realized that there are an awful lot of cases where you'll need to pass around contexts in order to get access to service providers. AudioManager for TextToSpeech, LocationManager for GPS, SensorManager for accelerometer information, PowerManager for wake locks: just about anything worth doing requires accessing a service provider. So as I started encapsulating functionality in classes, I wasn't sure how best to go about initializing them. Do I keep passing around the context via the constructor, and provide setters and getters to inject mock services for unit testing? Do I use factories?

Luckily, I stumbled across RoboGuice, which extends Google's Guice dependency injection framework. Although the current "production ready" version of RoboGuice is at 1.1 (and uses Guice 2.0), RoboGuice version 2.0 uses Guice 3.0 and is simpler to set up, because it doesn't need a custom Application class. I'm all about simplicity (everything should be made as simple as possible, but not simpler).

Quick note here: I'm writing this from the perspective of retrofitting it into an already existing application. So I already have my application set up. I just want to take advantage of RoboGuice now to simplify it a bit.

Why?

Alright, so a good first question is, what the heck is RoboGuice, and why do I want to use it?

Essentially, RoboGuice is a dependency injection framework and allows for inversion of control. That's almost saying the same thing in two different ways. If that doesn't tell you anything, you should read up on those concepts. It would be silly for me to explain it here, since there are far better resources for that out there. In a nutshell, it helps streamline how objects are wired together by convention and configuration, allows for better separation of concerns, and, as the first and easiest benefit to grasp, reduces the amount of boilerplate code that needs to be written.
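
Without attempting a full explanation, a tiny sketch in plain Java may help if the concept is new. Nothing here is RoboGuice-specific, and Speaker, PaceAnnouncer and RecordingSpeaker are hypothetical names. The point is only that the class receives its collaborator instead of looking it up, so a test can hand in a fake. A framework like RoboGuice takes over the job of supplying the real object.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical dependency: anything that can speak a piece of text. */
interface Speaker {
    void say(String text);
}

/** The collaborator is handed in through the constructor, not created or looked up here. */
final class PaceAnnouncer {

    private final Speaker speaker;

    PaceAnnouncer(Speaker speaker) {
        this.speaker = speaker;
    }

    void announce(int minutes, int seconds) {
        speaker.say("Pace: " + minutes + " minutes " + seconds + " seconds");
    }
}

/** Test double that records what was said instead of producing audio. */
final class RecordingSpeaker implements Speaker {

    final List<String> said = new ArrayList<String>();

    public void say(String text) {
        said.add(text);
    }
}
```

In a unit test, PaceAnnouncer gets a RecordingSpeaker; in the app, it would get an implementation backed by the real TTS engine. The class itself doesn't change either way.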

Take a look at A Simple Example that RoboGuice provides.

So right off the bat, their example shows that it's dead simple to wire up view objects and system service providers, simply using annotations.

How?

As I said, I'm not even bothering with RoboGuice 1.1. Upgrading to RoboGuice 2.0 is explained on the RoboGuice Wiki. If you're completely new to it, however, it can be a bit overwhelming. To start from scratch, you need the following:

  1. Download the latest RoboGuice 2 snapshot. Currently, that's version 2.0b3. You'll want to drop this into your project's "/libs/" directory, which is where most other JAR files go as well, if you use any (e.g. the fragments support backport android-support-v4.jar, or the Google Maps maps.jar).
  2. Download Guice 3.0. You want the guice-3.0-no_aop.jar. Again, this goes into the "/libs/" directory of your project.
  3. Not immediately obvious is that you'll also want to grab the guice-3.0.zip, because you need the javax.inject.jar from it. Yes, it also goes into the "/libs/" directory.
  4. The JARs need to be added to your project, so in Eclipse, go to the Project menu, Properties, Java Build Path, Libraries tab, then "Add JARs", and add all three JARs (guice-3.0-no_aop.jar, javax.inject.jar, and roboguice-2.0b3.jar).

Okay, now your project has RoboGuice added to it, but nothing is using it yet.

Putting it to use

One of the first things you'll want to do is go into one of your application's existing Activities. If you're doing it the simple / old way, your class probably just extends Activity. Just change it so it extends RoboActivity instead. If you're using fragments and your activity is a FragmentActivity class, just change it to be a RoboFragmentActivity. If you're using any services, and have a class that extends Service, modify the class to extend RoboService instead.

Then go through your onCreate() methods, rip out the findViewById() calls, and replace them with @InjectView annotations in front of your field declarations. It's easy to just check A Simple Example for reference again.

Instead of a setContentView() call in onCreate(), you can use the @ContentView(R.layout.layoutname) annotation right before your class definition.

For example:

@ContentView(R.layout.record)
public class RecordActivity extends RoboFragmentActivity
{
    @InjectView(R.id.txtDistance)	TextView txtDistance;
    @InjectView(R.id.txtTime)		TextView txtTime;
    @InjectView(R.id.txtPace)		TextView txtPace;
    @InjectView(R.id.btnStart)		Button btnStart;
}

I hope that helps you get started quickly and painlessly.

Filed under: Android, Development

The Amazon Appstore

Posted on March 24, 2011

So a few days ago Amazon opened their own Android Appstore, a direct competitor to Google's Android Market. There are other "app stores" out there, for instance, AppBrain, which has provided a much nicer web experience than the official Market until about two months ago, and SlideME, which is offering more payment method coverage, geographically.

So with Amazon entering the fray, the question is how successful they will be. This isn't just another application distribution channel, run by a small start-up hoping to make their mark. This is Amazon, after all, a company that's streamlined an online market for books and expanded it to cover just about anything these days. Not to mention their cloud services such as EC2 and S3.

For one, Amazon came in with quite a bang: an exclusive release of Angry Birds Rio, the third release in the Angry Birds game series that many "mobile gamers" are so gung-ho about. Not to mention, the Angry Birds Rio release was free, at least for us consumers. I'm speculating that Amazon is absorbing most if not all of the purchase price for each distribution and treats every download as a conversion.

Second, although the game itself is free, one is required to set up payment information (e.g. entering credit card details to be kept on file) before the download is granted. This really is the magic key, as it sets up users to be able to make impulse purchases. After all, it was Amazon who "invented" (or at least patented) the 1-click purchase.

Going forward, it seems that Amazon is offering a "free app of the day" in order to attract new accounts.

But will it work? Are consumers aware that their actual purchases are made against Amazon's Appstore, and that this fosters a dependency on this Appstore instead of the more official (Google) Market? I, for one, think I'll continue to look at the free apps that are being offered, but ultimately, if I wanted to buy an app, I'd want to purchase it through the "more official" Market. Losing access to (in my eyes) a bunch of freebies I've acquired on the Amazon Appstore would be preferable to having to worry about where I bought something and how to get it all back after upgrading to another phone, for instance.

And ultimately, should Amazon decide to enter the Android tablet market, things may become even more interesting. A color version of the Kindle, similar to the Nook Color, could run Android but not include the Google licensed apps (which include the Market), and in that case, Amazon's devices would rely solely on their own Appstore.

The danger then is the fragmented Android market, and potentially annoying consumers that are trying to get their previously (Google Market) purchased Android phone applications working on their Amazon tablet only running the Appstore.

Pure speculation, of course. But an interesting scenario nonetheless, and not too far-fetched as far as I'm concerned.
