Sun Research Developing Audio Search Tools

Several months ago I blogged about an audio search research project at Sun Labs called Search Inside the Music. A recent Investor's Business Daily article, Search Firms: Google, Yahoo … And Sun?, offers a bit more info.

[Paul] Lamere is working on a system that studies the sound qualities of a song a user likes. It uses that information to find similar or diverse music, depending on the listener’s mood. Analyzing the acoustics, tempo and melody of a song by, say, rapper Eminem could let the system recommend a song by his protege, 50 Cent.
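To make the idea concrete, here is a rough sketch of how "similar or diverse" recommendations could work. It is not Sun's method. It assumes each song has already been reduced to a few numeric features, and the track names, the three features (acoustic brightness, tempo, melodic range) and their values are all invented for illustration.

```python
import math

def recommend(library, seed, mode="similar"):
    """Pick the track nearest to the seed ("similar") or farthest from it ("diverse")."""
    others = [name for name in library if name != seed]
    # Rank the other tracks by Euclidean distance between feature vectors.
    ranked = sorted(others, key=lambda name: math.dist(library[seed], library[name]))
    return ranked[0] if mode == "similar" else ranked[-1]

# Hypothetical features scaled to 0..1: (acoustic brightness, tempo, melodic range)
library = {
    "seed_rap_track": (0.30, 0.90, 0.20),
    "other_rap_track": (0.35, 0.85, 0.25),
    "string_quartet": (0.90, 0.20, 0.80),
}

print(recommend(library, "seed_rap_track", "similar"))  # other_rap_track
print(recommend(library, "seed_rap_track", "diverse"))  # string_quartet
```

A real system would need to extract such features from the audio itself, which is the hard part and is not shown here.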

The IBD story also highlights two other research projects at Sun Labs.

Sun is also researching a new way to search recorded speech.

In the speech search project, Sun researchers are using speech recognition technology to create a searchable index of text. It combines that with time stamps on the recording so users can zero in on the audio snippet they’re looking for.

Companies like Nexidia and StreamSage are also in this space. HP’s Speechbot has offered voice recognition searching for many years.

However, the article points out that what Sun is working on is not “exactly” speech recognition.

Rather than wasting computing power on translating the speech with 100% accuracy, the software takes note of all the things the speaker might be saying. Taking note of all those possibilities results in a document that’s bigger than a perfect transcript would be, admits Paul Martin, who’s leading the speech search project at Sun Labs. But it works because storage has gotten much cheaper, and computers can search lots of text easily.
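Here is a minimal sketch of that approach, under my own assumptions rather than anything Sun has published. It supposes the recognizer emits, for each time-stamped stretch of audio, every word it thinks might have been spoken. The `segments` data and the function names are hypothetical.

```python
from collections import defaultdict

def build_index(segments):
    """Map each candidate word to the start times (in seconds) where it may occur."""
    index = defaultdict(set)
    for start_time, candidates in segments:
        for word in candidates:
            index[word.lower()].add(start_time)
    return index

def search(index, query):
    """Return sorted start times of segments that may contain every query word."""
    words = query.lower().split()
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)

# Hypothetical recognizer output: (start time in seconds, all candidate words heard)
segments = [
    (0.0, ["sun", "son", "labs", "laps"]),
    (4.5, ["speech", "beach", "search", "church"]),
    (9.0, ["audio", "search", "tools"]),
]

index = build_index(segments)
print(search(index, "speech search"))  # [4.5]
print(search(index, "search"))         # [4.5, 9.0]
```

The index stores more words than a perfect transcript would, which matches the trade-off Martin describes: extra storage in exchange for not having to settle on a single best guess.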

The other project discussed is a tool that automatically classifies documents by looking at “similarities” between old and new documents.

First, the user would train the system by creating a set of new folders — one labeled “cars,” for example, and another named “animals.”

The user would put a few sample documents in each. When a new document needs to find the right folder, the system would compare keywords.

A report on how to groom a Labrador retriever would probably use a lot of the same words as a previously filed paper on cats: “fur,” “pet,” “feed” and “vet.” The system would know to put the new document in the “animals” folder.
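The folder example can be sketched in a few lines. This is only an illustration of keyword comparison, assuming simple word counts and cosine similarity. The article does not say what measure Sun uses, and the sample documents and stopword list are made up.

```python
from collections import Counter
import math
import re

# Small, hypothetical stopword list so common words do not drive the match.
STOPWORDS = {"the", "a", "and", "to", "your", "its", "how", "of"}

def word_counts(text):
    """Count the keywords in a text, ignoring case and stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def train(folders):
    """Build one keyword profile per folder from its sample documents."""
    return {name: sum((word_counts(d) for d in docs), Counter())
            for name, docs in folders.items()}

def classify(profiles, text):
    """Return the folder whose profile is most similar to the new document."""
    counts = word_counts(text)
    return max(profiles, key=lambda name: cosine(counts, profiles[name]))

# Hypothetical training folders with a few sample documents each.
folders = {
    "cars": ["Change the engine oil and check the tire pressure every month.",
             "The dealer replaced the brake pads and the engine filter."],
    "animals": ["Feed the cat twice a day and brush its fur. "
                "Take your pet to the vet each year."],
}

profiles = train(folders)
new_doc = "How to groom a Labrador retriever: brush the fur, feed your pet well, and visit the vet."
print(classify(profiles, new_doc))  # animals
```

The Labrador document lands in “animals” because it shares “fur,” “pet,” “feed” and “vet” with the cat paper and shares no keywords with the car documents.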

According to the article, Sun has no plans to market these products as standalone tools. Instead, it will use them as “showcases” by adding them to other Sun products.
