A free fortnightly newsletter on Taxonomy, Thesauri & Ontology and Semantic Publishing
Getting one AI to teach another
To improve speech processing, researchers at the Massachusetts Institute of Technology (MIT) used two machine-learning networks in tandem. In an example of multimodal training, where vision and sound reinforce one another, the researchers paired image recognition with speech processing. According to Tiernan Ray, a contributing writer at ZDNET, this development could radically simplify the task of natural language processing.
Machine learning has made computers good at recognizing images, which led researchers at MIT to examine how image-recognition capability could be used to teach computers other things. They therefore hooked up natural language processing to image recognition.
MIT coordinated the activity of two machine-learning systems, one for image recognition and one for speech parsing. The image network learned to pick out the exact place of an object in a picture; simultaneously, the speech network picked out the exact moment in a sentence containing the word for that object. The two networks learned together, reinforcing one another until their answers converged. The converging point paired the location of the object with the moment of the spoken word: the two networks “co-localized,” spatially and temporally.
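As a rough illustration of the idea (not the researchers' actual code), the convergence can be pictured as a grid of similarity scores between every image region and every audio frame: the highest-scoring cell names both a place in the picture and a moment in the speech. The embedding functions below are random placeholders standing in for the two trained networks.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed_image_regions(image, dim=16):
    """Placeholder for the image network: one embedding per spatial region.

    Here: a 4x4 grid of regions, each represented by a dim-vector.
    """
    return rng.normal(size=(4, 4, dim))

def embed_audio_frames(audio, dim=16):
    """Placeholder for the speech network: one embedding per time frame."""
    return rng.normal(size=(10, dim))  # 10 audio frames

image_feats = embed_image_regions(None)  # shape (rows, cols, dim)
audio_feats = embed_audio_frames(None)   # shape (frames, dim)

# Similarity of every image region to every moment of the audio.
matchmap = np.einsum('rcd,td->rct', image_feats, audio_feats)

# Co-localization: the strongest cell picks out both a spatial region
# (r, c) in the image and a temporal frame t in the speech.
r, c, t = np.unravel_index(np.argmax(matchmap), matchmap.shape)
print(f"object at region ({r},{c}) co-localizes with audio frame {t}")
```

During training, such a model would be pushed to score matching image/audio pairs above mismatched ones, so that real objects and their spoken names end up close in the shared embedding space.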
The researchers based their approach on the learning process of babies, who pick up language by associating the sights and sounds around them.