Computer learns sign language

Researchers have made progress toward enabling a computer to teach itself British Sign Language by analyzing video footage. The scientists, from the University of Oxford and the University of Leeds, first programmed a machine vision algorithm so the computer could identify the shapes of hands in the video. From New Scientist:
Once the team were confident the computer could identify different signs in this way, they exposed it to around 10 hours of TV footage that was both signed and subtitled. They tasked the software with learning the signs for a mixture of 210 nouns and adjectives that appeared multiple times during the footage.

The program did so by analysing the signs that accompanied each of those words whenever it appeared in the subtitles. Where it was not obvious which part of a signing sequence related to the given keyword, the system compared multiple occurrences of a word to pinpoint the correct sign.

Starting without any knowledge of the signs for those 210 words, the software correctly learnt 136 of them, or 65 per cent, says Everingham. "Some words have different signs depending on the context – for example, cutting a tree has a different sign to cutting a rose," he says, so this is a high success rate given the complexity of the task.
"Computer learns sign language by watching TV"


  1. Congratulations! With enough work a production company can have their very own computer-generated signer – no need for those pesky human beings for that job anymore.

  2. @ Ian70 #1:

    I’m guessing that’s a joke, but I still remember when news broadcasts sometimes included a little picture-in-picture view of a person signing whatever the anchor was saying for the benefit of deaf viewers. They must have stopped doing that around the same time that somebody invented closed-captioning.

  3. Learning signs is one thing; learning sign language is another thing entirely.

    Sign language has a very complex grammar and inflectional system (there are a number of regional and national sign languages, but I feel fairly sure that most if not all of them fit that description; I know American Sign Language does). What the computer has done is akin to learning individual lexemes in a spoken language without knowing how to correctly put them together.

  4. Lords, this was done (non-computer) for ASL back in the eighties or nineties.

    Signs have three vectors: hand shape, position, and movement. Once you get those down, only close homonyms (as it were) can be confused.

    So, cool that it was done by computer, but I’m curious as to why it was only 65% successful.

  5. @ bazzargh #3:

    That’s interesting. Is there some advantage to BSL over closed captioning? Are there many people who are fluent in sign language but text-illiterate?

  6. #6: Sounds like a basic courtesy to present it in the signer’s native language, instead of in written English.

    Sign languages are, with very few exceptions, only vaguely related or even completely unrelated to spoken languages. ASL is not “American English, transcribed to gestures”, and as far as I know, BSL isn’t either. They are their own languages, with their own grammatical structures and distinct vocabularies.

    Written English is a second language for deaf children – and a very difficult one, because the child often has to learn the written form before learning the spoken form through lip reading.

    To my knowledge, there is no written form of any sign language yet – short of video recordings. A tad inconvenient for filling out forms and writing books…

  7. @8 Jerril

    There was, or is, but it came about in the late eighties/early nineties, just in time to be eclipsed by the internet. That’s what I was referencing in my previous post.

    Unfortunately, my google-fu is weak with this one…

  8. (@2 … from old SNL:)

    Chevy Chase: And now, as a public service to those of our viewers who have difficulty with their hearing, I will repeat the top story of the day, aided by the Headmaster of the New York School for the Hard of Hearing, Garrett Morris.

    [Garrett’s face appears in a circle to Chevy’s right]

    Chevy Chase: Our top story tonight…

    Garrett Morris: [screaming] OUR TOP STORY TONIGHT…

  9. To me, it’s not surprising at all. Everything I know I learned from watching television.

  10. I believe that this will be a useful tool in advancing the rights and culture of the deaf. However, I find it insulting that I cannot share this newscast with my deaf friends because there are no subtitles!

  11. I believe the narrator must have it wrong at 0:40.

    They want to use this software in order to generate “a very realistic signer”? Wouldn’t this technology be better applied to translating FROM sign language to text or spoken word, rather than the opposite? No optical recognition is necessary to do what she says at 0:40; I don’t see how it applies, and it’s pretty unimaginative.

  12. Nah, it isn’t that amazing.

    The most amazing thing was that monitor mind reader; if you’re not familiar with it, search the net and think about how frequencies can read your mind.

    Facial expression and lip pattern are crucial components of British Sign Language syntax. Until this software analyses the face and lips along with the arms and hands, it’s never going to be very accurate.

    That’s only scratching the surface of the subtleties and complexities of the language that software designed solely to identify vocabulary would miss. You could literally write a book on what this software misses…

  14. @6 “Lords, this was done (non-computer) for ASL back in the eighties or nineties.

    Signs have three vectors: hand shape, position, and movement. Once you get those down, only close homonyms (as it were) can be confused.”

    There’s more now. If I remember correctly, it’s now accepted that there are about six orthogonal vectors in sign language, and they don’t think they’ve found all of them yet. IIRC the list includes handshape, orientation, direction, repetition, positioning, and another one I can’t remember.

    Also don’t forget that hands only convey about 50% of the meaning in sign language – the other 50% is facial expressions, eye gaze, head position, etc etc.

    And that’s without going into grammatical elements like indexing, referencing, topic marking, time lines, etc.

    Basically, computer sign translation is always gonna be crap. Computers have a hard time translating spoken English into written English, or from one language into another, and that’s with the billions pumped into these use cases.

    The quality of these translations is so dire that they’re accepted straight only with severe reluctance. Computerised sign translation is even worse, with far less money going into it. To say deaf people aren’t keen on it is a massive understatement.
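
To make the point about what a hands-only system leaves out a bit more concrete, here is a rough Python sketch of what a fuller sign representation might look like, using the parameters named in comments 4, 13 and 14. The field names are illustrative guesses rather than an established linguistic scheme; software that only models the `ManualFeatures` half is blind to everything in `NonManualFeatures`.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ManualFeatures:
    # Hand-based parameters named in comments 4 and 14 (illustrative names).
    handshape: str
    orientation: str
    location: str          # position relative to the body
    movement: str
    repetition: int = 1

@dataclass
class NonManualFeatures:
    # The roughly-half of the meaning carried off the hands (comments 13/14).
    facial_expression: Optional[str] = None
    lip_pattern: Optional[str] = None
    eye_gaze: Optional[str] = None
    head_position: Optional[str] = None

@dataclass
class Sign:
    gloss: str                                    # e.g. "TREE"
    manual: ManualFeatures
    non_manual: NonManualFeatures = field(default_factory=NonManualFeatures)
```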
