Moving speech recognition from software to silicon: the in silico vox project

Lin, Edward C.; Yu, Kai; Rutenbar, Rob A.; Chen, Tsuhan

doi:10.21437/Interspeech.2006-103

To achieve much faster decoding or much lower power consumption, we need to liberate speech recognition from the artificial constraints of its current software-only form and move the essential computations directly into silicon. There are vast efficiencies waiting to be unlocked in this application; unlocking them requires the proper architecture. We report results from a first-generation hardware architecture simulated at the bit level, and from a complete, working FPGA-based prototype. Simulation results show that rather modest hardware designs, running 10-20X slower than conventional processors, can already decode at 0.6 xRT on the standard 5K Wall Street Journal benchmark.
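
For readers unfamiliar with the xRT figure quoted above, the following is a minimal sketch of how a real-time factor is conventionally computed, assuming the usual definition (decoding time divided by audio duration, so values below 1.0 mean faster than real time). The function names and the 10-second utterance are hypothetical illustrations and do not come from the paper.

```python
def real_time_factor(decode_seconds: float, audio_seconds: float) -> float:
    """Return the real-time factor (xRT): decoding time per second of audio.

    Assumes the common convention that xRT < 1.0 means the decoder
    keeps up with, or runs ahead of, the incoming speech.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return decode_seconds / audio_seconds


def decode_time(audio_seconds: float, xrt: float) -> float:
    """Return the time needed to decode audio of the given length at a given xRT."""
    return audio_seconds * xrt


# Hypothetical example: a 10-second utterance decoded at 0.6 xRT.
utterance_seconds = 10.0
seconds_needed = decode_time(utterance_seconds, 0.6)
print(f"{utterance_seconds:.0f} s of audio at 0.6 xRT takes about {seconds_needed:.1f} s to decode")
```

Under this convention, a decoder running at 0.6 xRT finishes each utterance in roughly 60% of the utterance's duration.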

Cite as: Lin, E.C., Yu, K., Rutenbar, R.A., Chen, T. (2006) Moving speech recognition from software to silicon: the in silico vox project. Proc. Interspeech 2006, paper 1942-Thu1CaP.12, doi: 10.21437/Interspeech.2006-103

@inproceedings{lin06_interspeech,
  author={Edward C. Lin and Kai Yu and Rob A. Rutenbar and Tsuhan Chen},
  title={{Moving speech recognition from software to silicon: the in silico vox project}},
  year=2006,
  booktitle={Proc. Interspeech 2006},
  pages={paper 1942-Thu1CaP.12},
  doi={10.21437/Interspeech.2006-103}
}