Things I have learned along the Way today! Speech to Text project log 1

theknowledgejack

First things first, I know that phonemes are the first thing I need to research. Well, actually, I probably need to figure out how I am going to structure the program. But I definitely need to understand phonemes. From what I understood from the first video I watched, phonemes seem to make speech-to-text recognition work so much better. I think the people who worked on end-to-end style ASR systems were doing character-by-character recognition, which sounds super inefficient to me. – 4:58 PM

So I am going to do some research about phonemes! – 5:00 PM

What is a phoneme? According to Google:

any of the perceptually distinct units of sound in a specified language that distinguish one word from another, for example p, b, d, and t in the English words pad, pat, bad, and bat.

5:02 PM
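To make that definition concrete, here is a minimal sketch of the pad/pat/bad/bat example. The tiny dictionary and the `differing_phonemes` helper are hypothetical and hand-written for illustration, using simplified ARPAbet-style symbols without stress markers; a real project would pull pronunciations from a proper dictionary.

```python
# Hypothetical mini pronunciation dictionary, hand-written for illustration.
# Symbols are simplified ARPAbet-style phonemes (no stress markers).
PRONUNCIATIONS = {
    "pad": ["P", "AE", "D"],
    "pat": ["P", "AE", "T"],
    "bad": ["B", "AE", "D"],
    "bat": ["B", "AE", "T"],
}


def differing_phonemes(word_a, word_b):
    """Return (position, phoneme_a, phoneme_b) wherever the two words differ."""
    a = PRONUNCIATIONS[word_a]
    b = PRONUNCIATIONS[word_b]
    if len(a) != len(b):
        raise ValueError("this sketch only compares words of equal phoneme length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# "pad" and "pat" differ in a single phoneme, which is what makes
# D and T distinct phonemes in English.
print(differing_phonemes("pad", "pat"))  # [(2, 'D', 'T')]
print(differing_phonemes("pad", "bat"))  # [(0, 'P', 'B'), (2, 'D', 'T')]
```

Word pairs like pad/pat that differ in exactly one phoneme are the "distinguish one word from another" part of the definition.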

https://www.education.vic.gov.au/Documents/school/teachers/teachingresources/discipline/english/literacy/44SoundsofAusEnglish.pdf

5:03 PM

https://realpython.com/python-speech-recognition/

https://pypi.org/project/assemblyai/

https://pypi.org/project/pocketsphinx/

https://pypi.org/project/SpeechRecognition/

https://pypi.org/project/wit/

So I found this webpage that seems to go into more depth about how to do it. However, it seems like almost everything I would need to do requires internet access and API keys to paid services! So that’s cool, thanks but no thanks! I will still continue to research it, because I do not yet understand whether some of the libraries it mentions can be used completely offline. This will be fun! I can already tell! – 5:13 PM

Can assemblyai work offline? No. You need an API key! Scratch that one. The Python SpeechRecognition library? Maybe? I was reading about it, and it just seems to be a wrapper. That means the library that supposedly does speech recognition really actually doesn’t! WOOOOO! It simply calls other libraries that can! Oh, and those ones use API keys and wifi! WOOOO! There are two things it says it supports offline: CMU Sphinx and Snowboy. Snowboy doesn’t look like it exists anymore though, haha! More reading to do. – 5:25 PM
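For the CMU Sphinx option, here is a rough sketch of what an offline call through the SpeechRecognition wrapper might look like. It assumes both the `SpeechRecognition` and `pocketsphinx` packages are already installed; `transcribe_offline` is a hypothetical helper name, and none of this has been tried on the offline machine yet.

```python
def transcribe_offline(wav_path):
    """Transcribe a WAV file with CMU Sphinx via the SpeechRecognition wrapper.

    Assumes the SpeechRecognition and pocketsphinx packages are installed.
    The import lives inside the function so this file still loads without them.
    """
    import speech_recognition as sr

    recognizer = sr.Recognizer()

    # Read the whole audio file into memory.
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)

    try:
        # recognize_sphinx runs locally: no API key, no network.
        return recognizer.recognize_sphinx(audio)
    except sr.UnknownValueError:
        # Sphinx could not make sense of the audio.
        return None
    except sr.RequestError as error:
        # Raised when the PocketSphinx backend is missing or misconfigured.
        raise RuntimeError(f"Sphinx backend unavailable: {error}") from error
```

Calling `transcribe_offline("clip.wav")` (with `clip.wav` standing in for any recording) would return the recognized text, or `None` if nothing was understood.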

No idea what “wit 6.0.1” is but it doesn’t look like what I need. Pocketsphinx looks like something I might want. Time to dig deeper. I really hope there is a library for this stuff already and I can adapt it rather than having to start everything from scratch. But if I do have to, so be it!

5:30 PM

https://cmusphinx.github.io/wiki/

I think we have found our winner! Time to go reading all the documentation! WOOOOOO!

5:35 PM

https://github.com/cmusphinx/pocketsphinx

https://github.com/cmusphinx/sphinxtrain

https://cmusphinx.github.io/wiki/tutorialsphinx4/

https://cmusphinx.github.io/wiki/tutorialpocketsphinx/

So I was doing some reading on it, and I think I will be using pocketsphinx for this project. Sphinx4 is great but it is purely Java, and I love Java, but I wanted to do this stuff with Python, not Java. Java is messy and slow in my opinion. – 5:48 PM

So I seem to have run into a problem with this kind of stuff, and also an oversight. I am not entirely sure how I am going to install all the necessary packages on my offline computer. I can try to import them directly from a USB flash drive, but I do not know if that will work. We shall have to see. Pip probably will not work on a machine that does not have pip or anything else installed. I might be stuck between a rock and a hard place. It seems like pocketsphinx is going to be way more complicated, as there doesn’t seem to be active support for it right now. So I might just be back to developing everything myself. Most of the dependencies I have to install look fairly shady. Yeah, I think I’ll just have to say screw it, we are developing from scratch. The CMU Sphinx website might still be very helpful for this though. I might still need to use another programming language in tandem with Python. This stuff is going to be fun, I can tell. Well, back to square one! Wooooo! Maybe I can google how neural networks work and how to implement one for speech to text. This is gonna be funnnnnnnn! Chicken nugget break first though. – 6:20 PM
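On the USB flash drive idea, one route that might work is pip's own offline workflow: `pip download` on a connected machine, then `pip install --no-index --find-links` on the offline one. The sketch below only builds the two commands. It assumes pip is present on both machines and that the two machines share the same operating system and Python version, since downloaded wheels are platform specific. The helper names and the `wheels` folder are hypothetical.

```python
import sys


def download_command(packages, wheel_dir):
    """Command for the ONLINE machine: fetch packages and dependencies into wheel_dir."""
    return [sys.executable, "-m", "pip", "download", "--dest", wheel_dir, *packages]


def offline_install_command(packages, wheel_dir):
    """Command for the OFFLINE machine: install only from wheel_dir, never the network."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-index",               # do not contact the package index
        "--find-links", wheel_dir,  # look for packages in this local folder
        *packages,
    ]


# "wheels" is a hypothetical folder that gets copied onto the USB drive.
print(" ".join(download_command(["pocketsphinx"], "wheels")))
print(" ".join(offline_install_command(["pocketsphinx"], "wheels")))
```

Either list can be passed to `subprocess.run(command, check=True)` to actually execute it. If a package only ships as source, the offline machine would also need a compiler, so this is not guaranteed to solve the pocketsphinx case.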

After a hot minute of working on things, I received an email at my personal address stating that my application for a position with a writing association had been approved. Curious, I went to look through some things and determined that the organization is a scam. I am now investigating their website and their organization in order to report them/mess with them. Work on my project will have to stop for today, as I must endeavor elsewhere. I leave today’s research with the understanding that I will probably need to understand hidden Markov models (HMMs) and neural networks. Join me some other day as I take that research further. This has been a most interesting day of work. I feel I have been productive but also distracted. No matter. My endeavors on this front will continue at a later time.

https://en.wikipedia.org/wiki/Hidden_Markov_model

7:32 PM

Thank you for reading through my research and stream of consciousness. Hopefully you will endeavor to read the other things I have written!

-Ben