Sound recognition API, SDK (Android) [closed]
A

2

11

I need to make an Android app that can recognize certain sound files I have created and perform an action when one is recognized. So, something similar to Shazam/SoundHound, but with my own sound files. Is there an API or SDK for this? I've read about Echoprint, but as I understand it, it is for Windows and iOS, and it seems quite difficult for me. Would that work? Or are there any other options?

PS: To make it clear, I don't want voice recognition or text-to-speech. My sound files can contain music, distorted voice, effects, etc.

Alloy answered 20/6, 2013 at 7:10 Comment(5)
What's this? You don't want voice recognition? Then how will you recognize the sound file? - Argillite
As I said, I want it to recognize sound files like Shazam or SoundHound does, not somebody's voice commands. - Alloy
This was also used in the Star Trek Into Darkness app (Qualcomm's Gimbal), but the SDK feature hasn't been released to the public yet. - Alloy
This is an audio feature extraction and audio fingerprinting problem. There is no shortage of academic research into different approaches. Robust approaches (e.g., robust against playback speed adjustment, EQ, distortion, and compression) tend to be proprietary (essentially, Shazam's main asset is its algorithm). There are plenty of far less robust and less systematic approaches that are published, however, possibly with source code. Sonic Visualiser is a good place to plunder for both approaches and source code. This is a particularly difficult problem. - Circumjacent
I know it's about audio fingerprinting, but I don't want to create the system. That's a whole project by itself. I want to use a system that has already been built for this, which is why I was asking about any APIs or SDKs that might be around. - Alloy
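To make the "feature extraction and fingerprinting" idea from the comments concrete, here is a toy sketch in Java. It is not Shazam's, Echoprint's, or any other product's algorithm: it assumes clean, frame-aligned audio, uses a naive DFT, and hashes pairs of consecutive dominant frequency bins. All names (`FingerprintSketch`, `peakBins`, `fingerprint`, `similarity`, `twoTone`) are hypothetical and exist only for illustration. A real system would need overlapping windows, several peaks per frame, and tolerance to noise and time offsets.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy audio-fingerprinting sketch: hashes pairs of dominant frequency bins. */
public class FingerprintSketch {

    /** Returns the strongest frequency bin of each non-overlapping frame (naive DFT). */
    static int[] peakBins(double[] samples, int frameSize) {
        int frames = samples.length / frameSize;
        int[] peaks = new int[frames];
        for (int f = 0; f < frames; f++) {
            double best = -1.0;
            // Skip bin 0 (DC offset) and scan up to the Nyquist bin.
            for (int k = 1; k <= frameSize / 2; k++) {
                double re = 0.0, im = 0.0;
                for (int n = 0; n < frameSize; n++) {
                    double angle = 2 * Math.PI * k * n / frameSize;
                    re += samples[f * frameSize + n] * Math.cos(angle);
                    im -= samples[f * frameSize + n] * Math.sin(angle);
                }
                double power = re * re + im * im;
                if (power > best) {
                    best = power;
                    peaks[f] = k;
                }
            }
        }
        return peaks;
    }

    /** Hashes each pair of consecutive peak bins into a single integer. */
    static Set<Integer> fingerprint(double[] samples, int frameSize) {
        int[] peaks = peakBins(samples, frameSize);
        Set<Integer> hashes = new HashSet<>();
        for (int i = 0; i + 1 < peaks.length; i++) {
            hashes.add(peaks[i] * 4096 + peaks[i + 1]);
        }
        return hashes;
    }

    /** Fraction of the query's hashes that also occur in the reference (0.0 to 1.0). */
    static double similarity(Set<Integer> query, Set<Integer> reference) {
        if (query.isEmpty()) {
            return 0.0;
        }
        int hits = 0;
        for (int h : query) {
            if (reference.contains(h)) {
                hits++;
            }
        }
        return (double) hits / query.size();
    }

    /** Demo helper: a tone alternating between two bin-aligned frequencies each frame. */
    static double[] twoTone(int binA, int binB, int frameSize, int frames) {
        double[] out = new double[frameSize * frames];
        for (int i = 0; i < out.length; i++) {
            int bin = (i / frameSize) % 2 == 0 ? binA : binB;
            out[i] = Math.sin(2 * Math.PI * bin * i / frameSize);
        }
        return out;
    }

    public static void main(String[] args) {
        Set<Integer> reference = fingerprint(twoTone(5, 9, 64, 4), 64);
        Set<Integer> same = fingerprint(twoTone(5, 9, 64, 4), 64);
        Set<Integer> other = fingerprint(twoTone(7, 12, 64, 4), 64);
        System.out.println("same clip:  " + similarity(same, reference));
        System.out.println("other clip: " + similarity(other, reference));
    }
}
```

Even this toy version hints at why robust recognition is hard: any shift in frame alignment, added noise, or change in playback speed alters the peak bins and therefore the hashes, which is the gap that the production-grade systems mentioned above are built to close.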
A
1

One year later, I've ended up using Echoprint compiled for Android, as explained here. It gets some results, but in general it works pretty poorly, especially with custom sound files. Echoprint is not designed for over-the-air (OTA) recognition. I would recommend it for testing or prototyping, but not for production. Unfortunately, so far it's the only option that lets you have your own server and sound files.

Alloy answered 4/8, 2014 at 11:17 Comment(2)
ACRCloud is an audio/music recognition service that supports a user-defined search database, meaning users can upload their own audio/music files to build the index. Please see: github.com/acrcloud/webapi_example and console.acrcloud.com/demo - Brufsky
I am working on a speaker recognition/speaker identification project based on pre-stored sound. Would this be helpful for that? - Unipersonal
B
4

ACRCloud provides a music/audio search engine that supports 50 million songs as well as user-uploaded content. SDKs for iOS, Android, and Linux can be downloaded after registration (http://console.acrcloud.com/signup). There are three customer tiers:

  • Free tier, for demo/prototyping
  • Accelerating tier, for startups
  • Commercial tier

Hope this helps.

Brufsky answered 24/6, 2015 at 11:51 Comment(2)
I just threw together a test app for this service and it looks very promising. The docs definitely need an overhaul, and all around it looks like a service that is still maturing, but it was very easy to get started, and it worked on the first attempt. - Plural
Thanks. We have improved the console, and sample code can be found here: github.com/acrcloud/webapi_example - Brufsky