Microsoft has been using deep neural networks for a while now to power the speech recognition technologies bundled into Windows and Skype, which handle voice commands and speech translation, respectively. This technology is part of Microsoft's Computational Network Toolkit. Last April, the company made the toolkit available to academic researchers on CodePlex, and it is now opening it up even further by moving the project to GitHub and placing it under an open source license.
Led by chief speech and computer scientist Xuedong Huang, a team of Microsoft researchers built the Computational Network Toolkit (CNTK) to power all of their speech-related projects. CNTK is a deep neural network toolkit for machine learning that is built to be fast and scalable across multiple systems and, more importantly, multiple GPUs, which excel at these kinds of parallel processing workloads and algorithms. Microsoft focused heavily on scalability with CNTK. According to the company's own benchmarks (which should be taken with a healthy grain of salt), the major competing neural network toolkits offer similar performance when running on a single GPU, but once more than one graphics card is added CNTK is far more efficient, delivering almost four times the performance of Google's TensorFlow and a bit more than 1.5 times that of Torch 7 and Caffe.

Where CNTK really leans into deep learning is its ability to scale beyond a single system by tapping into Microsoft's Azure GPU Lab for access to numerous GPUs in remote datacenters. It isn't free, but you don't need to purchase, house, and power the hardware locally, and you can ramp the GPU count up and down based on how much muscle you need. The example Microsoft provided showed two similarly spec'd Linux systems, each with four GPUs and running on Azure cloud hosting, delivering close to twice the performance of a single four-GPU system (a 75% increase). Microsoft claims that "CNTK can easily scale beyond 8 GPUs across multiple machines with superior distributed system performance."
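As a quick sanity check on that scaling claim, here is a minimal back-of-the-envelope sketch of what a 75% throughput increase from 4 to 8 GPUs works out to as a fraction of ideal linear scaling. The throughput values are normalized placeholders taken from the figures Microsoft quoted, not measured benchmark data, and the helper function is just an illustration.

```python
def scaling_efficiency(baseline_throughput, scaled_throughput,
                       baseline_gpus, scaled_gpus):
    """Fraction of ideal (linear) speedup actually achieved."""
    speedup = scaled_throughput / baseline_throughput
    ideal_speedup = scaled_gpus / baseline_gpus
    return speedup / ideal_speedup

# One 4-GPU machine normalized to 1.0; two 4-GPU machines at +75%,
# per the example Microsoft provided.
single_node = 1.00   # 4 GPUs
dual_node = 1.75     # 8 GPUs across two machines

eff = scaling_efficiency(single_node, dual_node,
                         baseline_gpus=4, scaled_gpus=8)
print(f"Scaling efficiency going 4 -> 8 GPUs: {eff:.0%}")  # ~88%
```

In other words, if the quoted numbers hold, CNTK retains roughly 88% of ideal scaling when doubling the GPU count across two machines, which is the basis for the "superior distributed system performance" claim.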
Using GPU-based Azure machines, Microsoft was able to increase the performance of Cortana's speech recognition by 10 times compared to the local systems it was previously using.
It is always cool to see GPU compute in practice, and now that CNTK is available to everyone, I expect to see a lot of new uses for the toolkit beyond speech recognition. Moving to an open source license is certainly good PR, but I think it was actually done more for Microsoft's own benefit than for its users', which isn't necessarily a bad thing since both stand to benefit from it. I am really interested to see what researchers are able to do with a deep neural network toolkit that reportedly offers so much performance thanks to GPUs, and I'm curious what new kinds of machine learning opportunities the extra speed will enable.
If you are interested, you can check out CNTK on GitHub!