Sound recognition API, SDK (Android) [closed]

I need to make an Android app that can recognize certain sound files created by me and perform an action on recognition. So, something similar to Shazam/SoundHound, but with my own sound files. Is there any API or SDK for this? I've read about Echoprint, but I understand it is for Windows and iOS, and it seems quite difficult for me. Would that work? Or are there any other options?

PS: To make it clear, I don't want voice recognition or text-to-speech. My sound files can contain music, distorted voice, effects, etc.

Alloy answered 20/6, 2013 at 7:10 Comment(5)

What's this? You don't want voice recognition? Then how can you recognize the sound file? – Argillite 20/6, 2013 at 7:12

As I said, I want it to recognize sound files like Shazam or SoundHound do, not somebody's voice commands. – Alloy 20/6, 2013 at 7:55

This was also used in the Star Trek Into Darkness app (Qualcomm's Gimbal), but that SDK feature hasn't been released to the public yet. – Alloy 20/6, 2013 at 7:59

This is an audio feature extraction and audio fingerprinting problem. There is no shortage of academic research into different approaches. Robust approaches (e.g. against playback speed adjustment, EQ, distortion, compression) tend to be proprietary (essentially, Shazam's main asset is its algorithm). There are plenty of far less robust and non-systematic approaches that are published, however, possibly with source code. Sonic Visualiser is a good place to plunder for both approaches and source code. This is a particularly difficult problem. – Circumjacent 20/6, 2013 at 8:55
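
To make "audio fingerprinting" concrete, here is a minimal toy sketch of the general idea: take the strongest frequency bin of each frame, hash pairs of consecutive peaks, and compare the resulting hash sets. It is not Shazam's or Echoprint's algorithm. All function names are hypothetical, the naive DFT is only suitable for tiny inputs, and the sketch assumes both signals are frame-aligned, which real over-the-air audio never is.

```python
import cmath
import hashlib
import math


def dft_magnitudes(frame):
    """Naive O(N^2) DFT magnitudes for bins 0..N/2-1 (toy sizes only)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2)
    ]


def peak_bins(samples, frame_size=64):
    """Return the strongest non-DC frequency bin of each non-overlapping frame."""
    peaks = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        mags = dft_magnitudes(samples[start:start + frame_size])
        peaks.append(max(range(1, len(mags)), key=mags.__getitem__))
    return peaks


def fingerprint(samples, frame_size=64):
    """Hash each pair of consecutive peak bins into a short hex string."""
    peaks = peak_bins(samples, frame_size)
    return {
        hashlib.sha1(f"{a}:{b}".encode()).hexdigest()[:10]
        for a, b in zip(peaks, peaks[1:])
    }


def similarity(fp_a, fp_b):
    """Jaccard overlap of two fingerprint hash sets (0.0 to 1.0)."""
    if not fp_a or not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def tone_sequence(bins, frame_size=64):
    """Synthesize test audio: one frame per entry, a sine at exactly that bin."""
    return [
        math.sin(2 * math.pi * b * t / frame_size)
        for b in bins
        for t in range(frame_size)
    ]


if __name__ == "__main__":
    reference = tone_sequence([3, 7, 12, 5, 9])
    # Same "recording" with a small deterministic disturbance added.
    noisy = [x + 0.1 * math.sin(0.37 * i) for i, x in enumerate(reference)]
    other = tone_sequence([4, 10, 2, 8, 6])
    print("noisy vs reference:", similarity(fingerprint(reference), fingerprint(noisy)))
    print("other vs reference:", similarity(fingerprint(reference), fingerprint(other)))
```

Robust systems add the parts this sketch leaves out: overlapping windows, several peaks per frame, time offsets stored with each hash, and an index that tolerates misalignment and distortion. That is where most of the difficulty lies.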

I know it's about audio fingerprinting, but I don't want to create the system. That's a whole project by itself. I want to use a system that has already been built for this, which is why I was asking about any APIs or SDKs that might be around. – Alloy 21/6, 2013 at 21:43

One year later, I've ended up using Echoprint compiled for Android as explained here. It gets some results, but in general it works pretty poorly, especially with custom sound files. Echoprint is not designed for over-the-air (OTA) recognition, i.e. matching audio captured through a microphone. I would recommend it for testing or prototyping, but not for production. Unfortunately, so far it's the only option that lets you run your own server with your own sound files.

Alloy answered 4/8, 2014 at 11:17 Comment(2)

ACRCloud is an audio/music recognition service that supports user-defined search databases, which means users can upload their own audio/music files to build their own index. Please see: github.com/acrcloud/webapi_example and console.acrcloud.com/demo – Brufsky 6/9, 2015 at 11:50

I am working on a speaker recognition/speaker identification project based on pre-stored sound. Would this be helpful for that? – Unipersonal 25/1, 2018 at 8:3

ACRCloud provides a music/audio search engine that supports a catalog of 50 million songs as well as user-uploaded content. SDKs for iOS, Android, and Linux can be downloaded after registration (http://console.acrcloud.com/signup). There are three tiers for customers:

Free tier, for demo/prototyping
Accelerating tier, for startups
Commercial tier

Hope this helps.

Brufsky answered 24/6, 2015 at 11:51 Comment(2)

I just threw together a test app for this service and it looks very promising. The docs definitely need an overhaul, and all around it looks like a service that is still maturing, but it was very easy to get started, and it worked on the first attempt. – Plural 11/7, 2015 at 5:6

Thanks. We have improved the console, and sample code can be found here: github.com/acrcloud/webapi_example – Brufsky 6/9, 2015 at 11:45
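
For anyone trying the web API from the repository linked above, here is a rough sketch of how the signed form fields for an identify request might be built. The field names, the `/v1/identify` path, and the string-to-sign layout are assumptions based on the v1 example in that repository as best understood, so check them against the current ACRCloud documentation. The function name `build_identify_fields` and the credentials are hypothetical placeholders, and no network request is made.

```python
import base64
import hashlib
import hmac
import time


def build_identify_fields(access_key, access_secret, sample_bytes, timestamp=None):
    """Build the signed form fields for an identify request (assumed v1 layout)."""
    http_method = "POST"
    http_uri = "/v1/identify"      # assumption: v1 identify endpoint path
    data_type = "audio"            # raw audio sample rather than a fingerprint
    signature_version = "1"
    timestamp = str(time.time() if timestamp is None else timestamp)

    # Assumed layout: fields joined by newlines, signed with HMAC-SHA1.
    string_to_sign = "\n".join(
        [http_method, http_uri, access_key, data_type, signature_version, timestamp]
    )
    digest = hmac.new(
        access_secret.encode("ascii"),
        string_to_sign.encode("ascii"),
        hashlib.sha1,
    ).digest()

    return {
        "access_key": access_key,
        "sample_bytes": str(sample_bytes),
        "timestamp": timestamp,
        "signature": base64.b64encode(digest).decode("ascii"),
        "data_type": data_type,
        "signature_version": signature_version,
    }


if __name__ == "__main__":
    # Placeholder credentials; real ones come from the ACRCloud console.
    fields = build_identify_fields("your_access_key", "your_access_secret", 160000)
    for name, value in fields.items():
        print(name, "=", value)
```

These fields would then be sent as a multipart POST together with the audio sample to the identify host assigned to your project. On Android, the same signing can be done with `javax.crypto.Mac` using `HmacSHA1`, although the official SDK is likely the easier route.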
