Software Patent of the Week: Speech Recognition

by Tim Lee on February 6, 2007 · 10 comments

Every week, I look at a software patent that’s been in the news. You can see previous installments in the series here. There haven’t been any major patent controversies this week, so our patent of the week comes from last November. VoiceSignal Technologies has sued Nuance Communications over its voice patent. I’ll discuss the patent below the fold.

Here’s a key description of how the patent works:

The speech segment data processing part 2203 analyzes the speech segment data generated by the speech segment data generating part 2202 and recognizes the meaning. Conventional methods for recognizing the meaning are specifically described in “Digital Speech Process” (by S. Furui, Tokai University Publishing Association)(published in English translation as “Digital Speech Processing Synthesis and Recognition” (Marcel Dekker 1989)). Generally, a speech recognition dictionary storing part 2204 includes a phoneme dictionary and a word dictionary as dictionaries for speech recognition. In the speech recognition process, a phoneme is recognized based on the distance or the similarity between the short time spectrum of an input speech and that of the reference pattern, and the meaning of the speech is identified by a word matching the recognized phoneme sequence in the word dictionary. However, conventional speech recognizers posed the problem that it is not easy to correct an error in the recognition of speech data. More specifically, in the actual speech recognition made by humans, the speech data that was not recognized correctly at first can be corrected later in the context of the conversation for understanding, and the action of the people who are talking is also corrected accordingly. However, in conventional speech recognizers, it is not easy to correct the meaning of the speech once it has been recognized wrongly. Therefore, for example, in an apparatus in which a command is input by speech, it is difficult to correct an operation in the case that an erroneous command was input due to an error in the recognition of the speech data. Thus, the range for the application of speech recognizers is limited.
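The conventional approach the patent describes (match each short-time spectrum to the nearest reference pattern, then look up the resulting phoneme sequence in a word dictionary) can be sketched roughly as follows. This is only a toy illustration, not the patent's implementation: the three-number "spectra", the phoneme labels, the dictionary entries, and all function names are assumptions invented for the example, and Euclidean distance stands in for whatever distance or similarity measure a real recognizer would use.

```python
import math

# Hypothetical reference patterns: one toy "short-time spectrum" per phoneme.
PHONEME_REFERENCES = {
    "k": [0.9, 0.1, 0.0],
    "ae": [0.1, 0.9, 0.2],
    "t": [0.0, 0.2, 0.9],
}

# Hypothetical word dictionary keyed by phoneme sequence.
WORD_DICTIONARY = {
    ("k", "ae", "t"): "cat",
    ("t", "ae", "k"): "tack",
}


def euclidean_distance(a, b):
    """Distance between two equal-length spectra (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize_phoneme(frame):
    """Return the phoneme whose reference pattern is closest to the frame."""
    return min(
        PHONEME_REFERENCES,
        key=lambda p: euclidean_distance(frame, PHONEME_REFERENCES[p]),
    )


def recognize_word(frames):
    """Recognize each frame as a phoneme, then look the sequence up.

    Returns None when the phoneme sequence is not in the dictionary.
    """
    phonemes = tuple(recognize_phoneme(f) for f in frames)
    return WORD_DICTIONARY.get(phonemes)


if __name__ == "__main__":
    frames = [[0.8, 0.2, 0.1], [0.2, 0.8, 0.1], [0.1, 0.1, 0.8]]
    print(recognize_word(frames))
```

Note that once `recognize_word` returns, nothing is kept that would let a later step revisit the decision, which is the limitation the passage complains about.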

I think it’s worth conceding that this patent seems to have more merit than most of the ones I’ve looked at to date. At a minimum, the detailed description contains a lot of technical details that could add up to an original and innovative voice recognition system, although I don’t know the relevant literature to judge that for sure. But it’s also worth noting how broad the resulting claim is. Because they’ve come up with one mechanism for re-evaluating previously-recognized speech, they believe they’re entitled to a legal monopoly on any software product that includes a re-evaluation mechanism. This strikes me as unreasonably broad; they’re claiming a general software strategy, not a particular invention.
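To make the breadth question more concrete, here is a minimal sketch of one possible re-evaluation mechanism, loosely modeled on the language of the patent's first claim, which describes re-evaluating speech data by switching to a recognition dictionary corresponding to speaker categorization results. Everything in it is hypothetical: the category names, the dictionary contents, and the `Recognizer` class are made up for illustration, and the speaker category is simply passed in rather than computed. It should not be read as the patent's actual method.

```python
# Hypothetical per-category dictionaries mapping phoneme sequences to words.
DICTIONARIES = {
    "default": {("r", "ay", "t"): "right"},
    "category_b": {("r", "ay", "t"): "write"},
}


class Recognizer:
    """Toy recognizer that keeps earlier speech data so it can be revisited."""

    def __init__(self):
        self.category = "default"
        self.history = []  # list of (phoneme sequence, recognized word)

    def recognize(self, phonemes):
        """Look up a phoneme sequence in the current category's dictionary."""
        word = DICTIONARIES[self.category].get(phonemes)
        self.history.append((phonemes, word))
        return word

    def reevaluate(self, new_category):
        """Switch dictionaries and re-interpret everything heard so far.

        Sequences missing from the new dictionary keep their earlier result.
        """
        self.category = new_category
        dictionary = DICTIONARIES[new_category]
        self.history = [(p, dictionary.get(p, w)) for p, w in self.history]
        return [w for _, w in self.history]


if __name__ == "__main__":
    recognizer = Recognizer()
    print(recognizer.recognize(("r", "ay", "t")))
    print(recognizer.reevaluate("category_b"))
```

The point of the sketch is how little is needed to count as "re-evaluation": store the earlier input, change the dictionary, and look it up again.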

  • http://weblog.ipcentral.info/ Noel Le

    Tim, you raise the issue of claim construction in this Patent of the Week post.

    There is one recent patent case where the court limited the scope of the patent to the technical merits as reflected in the disclosure (for the life of me, I can’t recall it offhand).

    It would probably be a good idea to see how patents generally are limited in scope by factors inside and outside their claim construction.

  • http://bennett.com/blog Richard Bennett

    Tim, isn’t it time you put this series of dubious articles on dubious patents to rest? In each instance, you’re forced to admit that you’re not capable of evaluating the claims, and you seem not even to understand the difference between the actual claims and the summary and description. And you don’t even bother to include a link to the actual patent itself so that more qualified people might instruct you.

    If your point is to show that there’s a lot of litigation around patents, that’s well taken, but neither you nor your audience is qualified to pass judgment on the merits or demerits of any particular patent. You’ve demonstrated that by presenting strawman simplifications over and over again.

    Isn’t there a more effective way for you to communicate your frustration with the patent system than by trying to play amateur patent examiner?

  • http://bennett.com/blog Richard Bennett

    For example, you say: “Because they’ve come up with one mechanism for re-evaluating previously-recognized speech, they believe they’re entitled to a legal monopoly on any software product that includes a re-evaluation mechanism.”

    But in fact the patent makes no such claim. It simply relates to one particular method of correcting speech recognition mistakes that uses a user-specific dictionary and specific contextual clues. The claim is therefore much narrower than what you allege, and the patent is much less alarming than it would be if your allegations were correct.

  • http://www.techliberation.com/ Tim Lee

    Richard,

    I ordinarily do include a link to the patent in question, although in this case, it was the first link in the news story I linked to above, so it’s not exactly hard to find.

    “But in fact the patent makes no such claim. It simply relates to one particular method of correcting speech recognition mistakes that uses a user-specific dictionary and specific contextual clues.”

    I don’t think that’s quite right. Here’s the relevant part of the first claim:

    …wherein the speech data is reevaluated based on at least one of the speaker categorization results at a point of time by switching to a recognition dictionary corresponding to the speaker categorization results, and the speech data corresponds to the speech recognition results that have been obtained at that point of time, at that point of time and before, or before that point of time.

    So you’re right that this doesn’t apply to absolutely any reevaluation. However, it seems to me that the vast majority of re-evaluations that a speech-recognition system might actually be interested in making would be covered by these claims.

    As for your point about me playing “amateur patent examiner,” a couple of points. First: the time constraints of blogging necessarily require that I opine on subjects on which I’m not necessarily an expert. I try to deal with that by being up front about when I’m venturing beyond my area of expertise, and by counting on readers to point out when I go off the rails. But if I thoroughly researched every sentence I posted here, there would be a lot less content.

    Secondly, you’re right that I’m not always qualified to evaluate each patent in detail. In fact, I acknowledged that in my post! However, I don’t think that means that we should all shut up and let the patent bar run things. The fact that “neither you nor your audience is qualified to pass judgment on the merits or demerits of any particular patent” is precisely the problem. The lawyers have made the patent system so Byzantine that most people just throw up their hands and take the lawyers’ word for it when they claim there’s rhyme and reason to it. I think it would be a mistake to allow the complexity of the patent system to scare us off from giving it serious scrutiny.

  • http://bennett.com/blog Richard Bennett

    Yes, those damn lawyers. I’m not suggesting you shouldn’t complain about patents, I’m saying your method leaves much to be desired.
