At the Streaming Media East 2018 conference, David Clevinger, Senior Director, Product Management & Strategy, IBM Cloud Video, discussed how Watson’s AI technology was used at the recently concluded US Open Tennis Championship:
The typical use case we’ve been seeing is media entities with large back catalogs of content that was originally created when they didn’t have complex metadata toolsets, didn’t necessarily have the right people applying metadata, and didn’t think of all the use cases on the output side, such as historical content.
Teaching Watson About Tennis
A very concrete example is work that we’ve done for the US Open. We actually took hundreds of thousands of video clips, photos, news articles, vocabulary terms, and proper names, fed them to Watson, and helped Watson understand what tennis was about. This was so that when it heard the word “ash,” it knew that it was capital-A “Ashe,” as in Arthur Ashe, as opposed to lowercase “ash.” There was a lot of training around that.
The output then became our ability to create clips based on what was happening within an event, but also to describe historical video. That’s critical for companies with large media back catalogs that need to optimize that content. You can apply it to live, of course, but that’s a typical use case that we see.
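To make the “ash” versus “Ashe” example concrete, here is a minimal Python sketch of a domain-vocabulary pass over a transcript. It is not Watson's API or IBM's method. The names `TENNIS_NAMES`, `CONTEXT_WORDS`, and `apply_vocabulary` are hypothetical, and the rule used (swap in the proper name only when a neighboring word suggests tennis) is an assumption chosen for illustration.

```python
# Hypothetical domain vocabulary: spoken form -> proper-name spelling.
TENNIS_NAMES = {"ash": "Ashe"}
# Hypothetical context words that suggest the tennis sense of the term.
CONTEXT_WORDS = {"arthur", "stadium"}


def _core(word: str) -> str:
    """Lowercase a token and drop surrounding punctuation."""
    return word.strip(".,!?").lower()


def apply_vocabulary(transcript: str) -> str:
    """Replace ambiguous words with proper names when a neighbor gives context."""
    words = transcript.split()
    result = []
    for i, word in enumerate(words):
        core = _core(word)
        # Look one word to the left and one word to the right.
        neighbors = words[max(0, i - 1):i] + words[i + 1:i + 2]
        has_context = any(_core(n) in CONTEXT_WORDS for n in neighbors)
        if core in TENNIS_NAMES and has_context:
            # Keep any trailing punctuation attached to the original token.
            result.append(word.lower().replace(core, TENNIS_NAMES[core]))
        else:
            result.append(word)
    return " ".join(result)


if __name__ == "__main__":
    print(apply_vocabulary("the night session at arthur ash stadium"))
    print(apply_vocabulary("the fire left only ash behind"))
```

A trained system would learn this kind of context from the clips, articles, and names it was fed rather than from a hand-written word list. The sketch only shows why domain vocabulary matters for captions and metadata.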
It’s a Recursive Learning System
It’s a recursive learning system where we took a cross-section of a set of video assets, described it to Watson, said this is what’s going on, this is who this player is, and this is what is being said. We were then able to turn it loose really on other unstructured assets, have it say what it thought it was finding, and then we were able to correct it.
We were able to basically train it up to understand tennis specifically.
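The describe, predict, and correct cycle above can be sketched as a small human-in-the-loop routine. This is a toy illustration under stated assumptions, not how Watson is trained. `KeywordTagger` is a hypothetical stand-in for a real model, and `reviewer` is a hypothetical callback representing the person who confirms or corrects each guess.

```python
from collections import Counter, defaultdict


class KeywordTagger:
    """Toy stand-in for a trainable model: counts which labels each word appears with."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def train(self, text, label):
        for word in text.lower().split():
            self.counts[word][label] += 1

    def predict(self, text):
        votes = Counter()
        for word in text.lower().split():
            votes.update(self.counts[word])
        # Return None when no known word gives any evidence.
        return votes.most_common(1)[0][0] if votes else None


def review_loop(tagger, seed, unlabeled, reviewer):
    """Train on described assets, then predict, correct, and retrain on the rest."""
    for text, label in seed:
        tagger.train(text, label)
    corrections = 0
    for text in unlabeled:
        guess = tagger.predict(text)
        truth = reviewer(text, guess)  # a human confirms or fixes the guess
        if truth != guess:
            corrections += 1
        tagger.train(text, truth)  # feed the reviewed label back in
    return corrections


if __name__ == "__main__":
    seed = [("ace down the line", "serve"), ("crowd cheers the winner", "celebration")]
    answers = {"big ace on match point": "serve", "double fault": "serve"}
    tagger = KeywordTagger()
    fixed = review_loop(tagger, seed, list(answers), lambda text, guess: answers[text])
    print("corrections needed:", fixed)
```

Each corrected label goes back into training, so later guesses on similar assets should need fewer corrections.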
Teaching Watson to Score Excitement
Then we could turn it loose on a bunch of different kinds of outputs for the client: closed captioning, video clips, and excitement scoring. We were able to do things like listen for crowd noise and say this must be really exciting because the crowd is making a lot of noise at this moment, and we turned that into an excitement score.
We wouldn’t be able to do that if we didn’t really help the algorithm understand what it was looking at and how it should be thinking about that body of work. Then we just turned it loose and let it go.
That’s the idea, to get it to the point where you can just turn it loose and let it run.
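As a rough illustration of the crowd-noise idea, the sketch below turns audio loudness into a score between 0 and 1 per time window. It assumes loudness alone stands in for excitement and that the audio arrives as a plain list of samples. The function names and the choice of RMS loudness normalized by the loudest window are assumptions for illustration, not the scoring method used for the US Open.

```python
import math


def rms(samples):
    """Root-mean-square level of a window of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def excitement_scores(samples, window):
    """Score each full, non-overlapping window from 0.0 to 1.0 relative to the loudest one."""
    levels = [
        rms(samples[i:i + window])
        for i in range(0, len(samples) - window + 1, window)
    ]
    peak = max(levels, default=0.0)
    if peak == 0:
        # Silence (or no full window): nothing stands out.
        return [0.0 for _ in levels]
    return [level / peak for level in levels]


if __name__ == "__main__":
    quiet = [0.1, -0.1, 0.1, -0.1]
    roar = [0.8, -0.8, 0.8, -0.8]
    print(excitement_scores(quiet + roar, window=4))
```

The windows with the highest scores would be the natural candidates for highlight clips.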