Browsed by
Month: December 2022

Dispatch Tone Out Decoder Part 5: Adding Speech to Text Conversion in Python

I want the scanner tone out decoder to show a real-time transcript of the dispatch audio in line with the departments that were toned out, so that a quick skim of the output display tells me whether something important is going on. Another great feature I can incorporate is checking for trigger keywords that alert me when something really needs immediate attention, such as a wildland fire in my area.

I have spent most of today researching what is available for converting speech to text, and there are several options. I have chosen to go down the path of using the Google Cloud API, mainly because it includes 60 minutes of free transcription per month, because the trial includes a $300 credit to use within 90 days on any of the APIs, and finally because once you go over 60 minutes per month of transcription the cost is only 2.4 cents per minute. So it is worth trying out.

After I set up my Google Cloud account, I had trouble figuring out how to really get started until I happened across this tutorial: https://www.hellocodeclub.com/python-speech-recognition-create-program-with-google-api/ . The tutorial worked up until the point of adding an environment variable. My main Python debug setup is still running on Windows 10. I saved my Google credentials JSON file to my C:\ directory to make the environment variable easy to edit. To add the system environment variable for the JSON file location, hit the Windows key and type ‘environment’. The search brings up ‘edit environment variables control panel’; click that.

Click Environment Variables…

Click New…

Add GOOGLE_APPLICATION_CREDENTIALS in the Variable name

Add your JSON file name and location in the Variable value:

Restart your Python IDLE session for the environment variable to take effect.
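Before running the tutorial code, it can be worth confirming from inside Python that the variable is actually visible to the interpreter. This is only a minimal sketch: the helper name credentials_ready is made up for illustration, and it checks that the variable is set and points at an existing file, not that the credentials themselves are valid.

```python
import os

def credentials_ready(env=None):
    """Return True if GOOGLE_APPLICATION_CREDENTIALS names an existing file."""
    # env defaults to the real process environment; a dict can be passed for testing
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS", "")
    return bool(path) and os.path.isfile(path)

if __name__ == "__main__":
    if credentials_ready():
        print("Credentials file found:", os.environ["GOOGLE_APPLICATION_CREDENTIALS"])
    else:
        print("GOOGLE_APPLICATION_CREDENTIALS is not set or the file is missing")
```

If this prints the "not set" message right after editing the variable, the Python session was probably started before the change was saved.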

I ran the code in the tutorial against a snippet of recorded audio from the Broadcastify Boulder County dispatch channel:

The output is not 100% correct but is good enough to see what is going on:

Transcript: 2621 2601 respond with Lafayette please to the outside entrance of Exempla Good Sam 200 Exempla Circle
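The trigger keyword idea mentioned at the top of this post could be sketched roughly like this once a transcript string is available. The keyword list and the function name find_triggers are assumptions for illustration, not part of the decoder yet.

```python
import re

# Hypothetical trigger list; tune it to what matters in your area.
TRIGGER_KEYWORDS = ["wildland fire", "structure fire", "evacuation"]

def find_triggers(transcript, keywords=TRIGGER_KEYWORDS):
    """Return the keywords found in the transcript as whole words or phrases, ignoring case."""
    text = transcript.lower()
    hits = []
    for keyword in keywords:
        # \b keeps a keyword from matching in the middle of a longer word
        pattern = r"\b" + re.escape(keyword.lower()) + r"\b"
        if re.search(pattern, text):
            hits.append(keyword)
    return hits

if __name__ == "__main__":
    sample = ("2621 2601 respond with Lafayette please to the outside "
              "entrance of Exempla Good Sam 200 Exempla Circle")
    print(find_triggers(sample))
    print(find_triggers("Units respond to a reported wildland fire"))
```

Since the transcription is not 100% correct, an exact phrase match like this will miss misheard words, so the keyword list may need common mis-transcriptions added over time.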

The next step will be to integrate this into the tone out decoder display.

Dispatch Tone Out Decoder Part 4: Display the Unit or Department that was Toned Out to the Output

To display the actual unit or department that was toned out we first need to make a CSV with each department/unit and its tones. To start I used what was available here: https://wiki.radioreference.com/index.php/Boulder_County_(CO)

This is what the departments.csv file should resemble, with one row per department and columns named Department, Tone1, and Tone2:


Fire Department Tone list
Rocky Mountain Rescue
U.S. Forest Service
Boulder Emergency Services
Boulder Mountain
Four Mile
Gold Hill
Indian Peaks
Lefthand
Rocky Mountain
Sugar Loaf
AMR 1
Allenspark
Boulder Rural
Jamestown
Lafayette
Lyons
Pinewood Springs
St. Vrain Rapid Intervention Team
Sunshine
AMR 2
Timberline
Boulder Emergency Squad (Group 1)
Boulder Emergency Squad (Group 2)
Longmont Emergency Unit
Hygiene
Big Elk Meadows
Mountain View Station 6
Nederland
AMR 3
Front Range Rescue Dogs
Boulder Mountain
AMR 4
Lafayette Battalion Chief
American Medical Response
AMR 5+
Coal Creek Canyon
Mountain View Station 1
Rocky Mountain
Louisville
Louisville Ambulance
Mountain View Station 4
Mountain View Station 3
2602
2260
2718

Next, to read in the CSV and compare its contents to the detected tones, we need to add pandas to the code:

import pandas as pd   

Then we add a line of code that reads in the CSV file as a dataframe:

df = pd.read_csv("departments.csv")

Next we compare the two detected tones, first_tone and second_tone, to the dataframe's Tone1 and Tone2 values with +/-15Hz of tolerance around each tone. We index through the dataframe rows with iloc, adding and subtracting 15Hz from each stored tone to compare against what was detected, and we add a check for the case where no matching tone out pair is in the CSV:

tone_found = 0
if first_tone_detect_count >= 5 and second_tone_detect_count > 25:
    print("")
    # look for a CSV row whose tone pair is within +/-15Hz of the detected pair
    for i in range(len(df)):
        if first_tone > df.iloc[i]['Tone1']-15 and first_tone < df.iloc[i]['Tone1']+15 and second_tone > df.iloc[i]['Tone2']-15 and second_tone < df.iloc[i]['Tone2']+15:
            print(df.iloc[i]['Department'],": Tone out on", time.strftime("%m-%d-%y at %H:%M:%S"))
            tone_found = 1

    # no match in the CSV: print the raw tones so the pair can be identified later
    if tone_found == 0:
        print ("****** Tone out on", time.strftime("%m-%d-%y at %H:%M:%S"))
        print ("       1: {:0.1f}Hz 2:{:1.1f}Hz C1:{:2.0f} C2{:3.0f}".format (first_tone,second_tone,first_tone_detect_count,second_tone_detect_count))
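For anyone who wants to try the matching logic without the rest of the decoder, here is a small stand-alone sketch. It swaps pandas for the standard library csv module so it runs with nothing installed, and the function names are made up for illustration. The 2718 row is the tone pair added later in this post; the second row is a placeholder, not a real Boulder County tone pair.

```python
import csv
import io

TOLERANCE_HZ = 15.0

# Illustrative CSV contents with the column names the decoder code expects.
SAMPLE_CSV = """Department,Tone1,Tone2
2718,1675,1340
Example Department,1000,1500
"""

def load_departments(csv_file):
    """Read rows into (name, tone1, tone2) tuples with the tones as floats."""
    reader = csv.DictReader(csv_file, skipinitialspace=True)
    return [(row["Department"], float(row["Tone1"]), float(row["Tone2"]))
            for row in reader]

def match_department(first_tone, second_tone, departments, tol=TOLERANCE_HZ):
    """Return the first department whose tone pair is within tol Hz, else None."""
    for name, tone1, tone2 in departments:
        if abs(first_tone - tone1) < tol and abs(second_tone - tone2) < tol:
            return name
    return None

if __name__ == "__main__":
    departments = load_departments(io.StringIO(SAMPLE_CSV))
    print(match_department(1675.2, 1338.1, departments))
    print(match_department(539.3, 538.2, departments))
```

Returning None for an unknown pair keeps the "no CSV reference" case explicit, which is the case that prints the ****** line in the decoder.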

When running the code against this test case recorded audio dispatch:

We get this output:

****** Tone out on 12-27-22 at 15:40:28
       1: 1675.2Hz 2:1338.1Hz C1:23 C2 26

Lafayette : Tone out on 12-27-22 at 15:40:33

2602 : Tone out on 12-27-22 at 15:40:37

Lafayette Battalion Chief : Tone out on 12-27-22 at 15:40:41

When we get a tone out without a CSV reference, as shown at 15:40:28, we can listen to the recorded audio and then add that unit or department to the CSV, so the next time the tone pair is encountered the display shows who was actually toned out:

2718, 1675, 1340 is added to the CSV

Running the test case audio once again, the display now shows:

2718 : Tone out on 12-27-22 at 15:50:26

Lafayette : Tone out on 12-27-22 at 15:50:31

2602 : Tone out on 12-27-22 at 15:50:36

Lafayette Battalion Chief : Tone out on 12-27-22 at 15:50:39

As we learn more about the dispatches we may change the description for each tone pair to suit what we want to see displayed for each dispatch.

I have both of the Python routines running concurrently on the same laptop, one recording the dispatch audio into mp3 files with timestamps and the other decoding tones to display who was toned out and when. It will be easy to cross reference the timestamps of the displayed units/departments with the associated mp3 file and all of the files that follow it.

Update 12/30/22: I spent a little time making a few changes here and there after leaving the Python code running, listening for tone outs on the Broadcastify channel. (I am still waiting for the USB sound card from Amazon so that I can attach either my scanner or a cheap Baofeng UV-5R radio.) I changed the script to use 4x the sample size while searching for valid tones. This minimizes the detection of invalid tones by taking an FFT on ~93 milliseconds of audio (4096 samples at 44.1kHz) instead of ~23 milliseconds, so the dominant tone can be pulled out with more distinction:

CHUNK_SIZE = 1024*4
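The trade-off behind this change can be worked out directly from the sample rate. The helper names below are just for illustration; RATE is the same 44100 used in the script.

```python
RATE = 44100  # samples per second, as in the decoder script

def chunk_duration_ms(chunk_size, rate=RATE):
    """Length of audio covered by one FFT, in milliseconds."""
    return 1000.0 * chunk_size / rate

def bin_spacing_hz(chunk_size, rate=RATE):
    """Spacing between adjacent FFT bins, in Hz."""
    return rate / chunk_size

if __name__ == "__main__":
    for chunk_size in (1024, 1024 * 4):
        print(chunk_size, "samples:",
              round(chunk_duration_ms(chunk_size), 1), "ms per FFT,",
              round(bin_spacing_hz(chunk_size), 1), "Hz between bins")
```

A larger chunk gives finer spacing between FFT bins at the cost of fewer readings per second, which is why the detection counts in the output drop after this change.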

The CSV was changed to include the unit number groups in the department description:

The display was changed to show a little more info, some for debugging the code:

2200: Mountain View Fire St.-6 : Tone out on 12-29-22 at 22:56:00
       1: 1130.4Hz 2:871.9Hz C1: 8 C2:  8

2700: Louisville Fire Department : Tone out on 12-30-22 at 01:30:51
       1: 1669.3Hz 2:948.4Hz C1: 5 C2:  8

2700: Louisville Ambulance : Tone out on 12-30-22 at 01:30:56
       1: 1669.2Hz 2:1130.4Hz C1: 9 C2:  8

2300: Boulder Rural  : Tone out on 12-30-22 at 01:48:00
       1: 948.7Hz 2:1529.2Hz C1: 9 C2:  8

2300: Boulder Rural Fire  : Tone out on 12-30-22 at 01:48:05
       1: 948.7Hz 2:1983.7Hz C1: 9 C2:  8

2200: Mountain View Fire St.-4 : Tone out on 12-30-22 at 01:56:07
       1: 1669.0Hz 2:871.9Hz C1: 8 C2:  8

2200: Mountain View Fire  : Tone out on 12-30-22 at 01:56:12
       1: 1498.5Hz 2:1087.5Hz C1: 7 C2:  8

2600: Lafayette Fire Department : Tone out on 12-30-22 at 01:56:16
       1: 948.3Hz 2:1345.4Hz C1: 9 C2:  8

2700: Louisville Fire Department : Tone out on 12-30-22 at 03:14:37
       1: 1669.3Hz 2:1345.7Hz C1: 9 C2:  8

2600: Lafayette Fire Bat Chief : Tone out on 12-30-22 at 03:14:45
       1: 1304.5Hz 2:2074.4Hz C1: 7 C2:  8

5200: Allenspark Fire Department : Tone out on 12-30-22 at 04:08:11
       1: 948.8Hz 2:1231.0Hz C1: 8 C2:  8

5200: Allenspark Fire Department : Tone out on 12-30-22 at 04:08:15
       1: 948.5Hz 2:1230.9Hz C1: 8 C2:  8

I have also noticed that when a transmitting unit has a faulty radio putting out a lot of noise, the code picks out frequencies less than 600Hz, so I added a note to the display that the tone out is not valid. I will reject these low confidence tone outs later on, but for now I am leaving the print in just to see how often this occurs:

****** Tone out on 12-29-22 at 23:39:00
       1: 539.3Hz 2:538.2Hz C1: 6 C2:  8
       Actual tone out confidence low
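The low confidence check could look something like the sketch below. The function name is made up, and flagging when either tone is under 600Hz (rather than requiring both) is an assumption on my part.

```python
LOW_TONE_LIMIT_HZ = 600.0  # threshold taken from the observation above

def tone_out_confidence_low(first_tone, second_tone, limit=LOW_TONE_LIMIT_HZ):
    """Flag a detection when either averaged tone falls below the limit (assumed rule)."""
    return first_tone < limit or second_tone < limit

if __name__ == "__main__":
    for pair in ((539.3, 538.2), (948.3, 1345.4)):
        note = "Actual tone out confidence low" if tone_out_confidence_low(*pair) else "ok"
        print(pair, note)
```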

For the next project I want to tackle: it is time to take the audio recorded from dispatch and try converting speech to text…hmmm, it will be interesting. If successful I can add a snippet of the dispatch info to the display too. For now I hope you enjoyed this Mad Scientist Hut Python series, and thanks for visiting. I hope to see you soon!

A future version of this dispatch decoder project will put this on a Raspberry Pi (someday they will be back in stock…🙄 ) coupled with a cheap Baofeng UV-5R radio tuned to dispatch.

Dispatch Tone Out Decoder Part 3: Making the Dispatch Tone Detection Robust Against Radio Traffic

After getting the code to work yesterday I ran it against live audio on the Broadcastify Boulder County dispatch channel with overlapping audio and found the detection of dispatch tone outs to be really lacking. So I spent time going through recorded dispatch audio and pulling more test cases out to run the code against. This was a great mental exercise for me to figure out why the tones were not being detected. I have now tweaked the detection algorithm so that it detects two tone dispatches with excellent results.

I had to create several rejection criteria that apply while ‘looking’ for a valid tone; once a candidate is rejected, the search starts over again. The code uses the FFT to simply look for the peak tone in the 500-2100Hz range every ~1/44th of a second, and sometimes other events, such as a voice or chirp in the range of tones, cause the algorithm to “think” it has a valid tone to look at. Examples of the changes and the reasons for them follow:

Boulder County Dispatch uses a TX chirp of ~1950Hz that is within the band of frequencies that need to be examined. It is rejected with a simple test that checks whether the detected tone is between 1936Hz and 1960Hz and occurs within the first 25 milliseconds of a tone detection. Rejecting this frequency range is not an issue because it is not a valid dispatch two tone frequency. This is the line of code where the rejection occurs:

if time.time()- tone_start_time <= 0.025 and first_tone-12 < 1948.0 and first_tone+12 > 1948.0:
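Pulled out into a function, the same test reads as follows. The constant and function names are mine, and the elapsed time is passed in as a parameter so the check can be tried without live audio.

```python
CHIRP_HZ = 1948.0            # centre of the TX chirp, from the line above
CHIRP_TOLERANCE_HZ = 12.0    # gives the 1936Hz to 1960Hz window
CHIRP_WINDOW_SECS = 0.025    # chirp only counts at the very start of a detection

def is_tx_chirp(first_tone, elapsed_secs):
    """Return True when a new detection looks like the TX chirp rather than a tone."""
    return (elapsed_secs <= CHIRP_WINDOW_SECS
            and abs(first_tone - CHIRP_HZ) < CHIRP_TOLERANCE_HZ)

if __name__ == "__main__":
    print(is_tx_chirp(1950.0, 0.010))   # chirp frequency, right at the start
    print(is_tx_chirp(1950.0, 0.100))   # same frequency, but too late to be the chirp
    print(is_tx_chirp(1982.0, 0.010))   # outside the chirp window of frequencies
```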

Sometimes the silence detection routine, while listening to the audio of the radio transmissions, is delayed in handing the baton off to the tone detection routine. This causes the first tone to be truncated, so I had to reduce the required number of valid 1st tone detections to 5 within the first ~700 milliseconds of ‘looking’ at the first tone. This becomes one of the decision points for rejecting an invalid tone before going on to look at the second tone:

if time.time()- tone_start_time >= 0.850 and time.time()- tone_start_time <= 1.000:
   second_tone_sum = second_tone_sum + thefreq
   second_tone_detect_count += 1
   second_tone = second_tone_sum / second_tone_detect_count
     
   if first_tone_detect_count <=4:
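The sum / count averaging that both tones use can be sketched on its own like this. The class and function names are made up for illustration; the 50Hz window is the tone_error value from the full script, and the readings in the example are loosely based on the Lyons test output.

```python
class ToneAverager:
    """Running average of frequency readings that stay near the current average."""

    def __init__(self, tone_error=50.0):
        self.tone_error = tone_error  # same 50Hz window as tone_error in the script
        self.total = 0.0
        self.count = 0

    @property
    def average(self):
        return self.total / self.count if self.count else 0.0

    def add(self, freq):
        """Accept freq if it is the first reading or within tone_error of the average."""
        if self.count and abs(freq - self.average) > self.tone_error:
            return False
        self.total += freq
        self.count += 1
        return True

def average_readings(readings, tone_error=50.0):
    """Feed a list of readings through ToneAverager and return (count, average)."""
    averager = ToneAverager(tone_error)
    for freq in readings:
        averager.add(freq)
    return averager.count, averager.average

if __name__ == "__main__":
    # The 1743.5 reading is too far from the ~948Hz average, so it is ignored
    count, average = average_readings([947.5, 948.8, 1743.5, 948.0])
    print(count, round(average, 1))
```

The count is what the rejection test looks at: a first tone with 4 or fewer accepted readings is thrown out before the second tone is considered.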

One of the most important rules in this algorithm for monitoring Boulder County Dispatch is that if any silence occurs while ‘looking’ at a tone to check whether it is valid, the tone is rejected. Dispatch transmits tone one as a continuous wave with no break before going to tone two, as seen here:

The rejection point line of code is here (voice should be changed to audio_level for better clarity):

if not voice and second_tone_detect_count == 0:

There are several other tweaks that were put in, and the code seems to be much better at detecting a valid dispatch. I have run the current code against every test case of dispatch two tone audio that I have and this is what the output looks like:

****** Tone out on 12-27-22 at 11:00:10
       1: 948.0Hz 2:1338.3Hz
****** Tone out on 12-27-22 at 11:00:13
       1: 948.3Hz 2:1982.0Hz
****** Tone out on 12-27-22 at 11:00:16
       1: 948.2Hz 2:1338.1Hz
****** Tone out on 12-27-22 at 11:00:55
       1: 1675.6Hz 2:1123.6Hz
****** Tone out on 12-27-22 at 11:01:05
       1: 1676.0Hz 2:1337.9Hz
****** Tone out on 12-27-22 at 11:01:10
       1: 1401.6Hz 2:1533.1Hz
****** Tone out on 12-27-22 at 11:01:15
       1: 1238.4Hz 2:771.3Hz
****** Tone out on 12-27-22 at 11:01:23
       1: 1595.1Hz 2:864.7Hz
****** Tone out on 12-27-22 at 11:01:31
       1: 948.3Hz 2:1982.1Hz
****** Tone out on 12-27-22 at 11:01:35
       1: 1123.9Hz 2:864.7Hz
****** Tone out on 12-27-22 at 11:01:41
       1: 1675.4Hz 2:947.8Hz
****** Tone out on 12-27-22 at 11:01:46
       1: 1123.6Hz 2:1123.6Hz
****** Tone out on 12-27-22 at 11:01:56
       1: 1294.0Hz 2:1080.9Hz
****** Tone out on 12-27-22 at 11:02:27
       1: 865.6Hz 2:1813.4Hz
****** Tone out on 12-27-22 at 11:02:31
       1: 950.5Hz 2:1896.6Hz
****** Tone out on 12-27-22 at 11:02:34
       1: 1401.8Hz 2:1533.0Hz
****** Tone out on 12-27-22 at 11:02:55
       1: 864.8Hz 2:1813.4Hz
****** Tone out on 12-27-22 at 11:02:58
       1: 948.1Hz 2:1896.6Hz
****** Tone out on 12-27-22 at 11:03:01
       1: 1401.7Hz 2:1532.9Hz
****** Tone out on 12-27-22 at 11:03:19
       1: 866.6Hz 2:1813.3Hz
****** Tone out on 12-27-22 at 11:03:22
       1: 948.1Hz 2:1896.6Hz
****** Tone out on 12-27-22 at 11:03:25
       1: 1402.6Hz 2:1533.0Hz
****** Tone out on 12-27-22 at 11:03:37
       1: 864.8Hz 2:1813.4Hz
****** Tone out on 12-27-22 at 11:03:40
       1: 948.1Hz 2:1896.6Hz
****** Tone out on 12-27-22 at 11:03:43
       1: 1401.7Hz 2:1532.9Hz
****** Tone out on 12-27-22 at 11:03:51
       1: 1079.1Hz 2:1337.9Hz
****** Tone out on 12-27-22 at 11:04:00
       1: 948.5Hz 2:1741.6Hz
****** Tone out on 12-27-22 at 11:04:03
       1: 1078.9Hz 2:1337.9Hz
****** Tone out on 12-27-22 at 11:04:11
       1: 948.4Hz 2:1741.8Hz
****** Tone out on 12-27-22 at 11:04:15
       1: 1401.7Hz 2:1532.9Hz
****** Tone out on 12-27-22 at 11:04:26
       1: 948.4Hz 2:1741.3Hz

Here is the modified code. I left all of the commented out print statements in just in case I find some better test cases to run against.

"""
   Tone Out Decoder for dispatch tone decoding
   Copyright (C) 2022  Kirk Garrison <madscientist@madscientisthut.com>

   Voice activated audio recorder intended for scanner radio use
   Copyright (C) 2018  Kari Karvonen <oh1kk at toimii.fi>

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software Foundation,
   Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301  USA
                                       
"""
from sys import byteorder
from array import array
from struct import pack

import time
import pyaudio
import wave
import os

import numpy as np


SILENCE_THRESHOLD = 3000
RECORD_AFTER_SILENCE_SECS = 1

RATE = 44100
CHANNELS = 1
MAXIMUMVOL = 32767
CHUNK_SIZE = 1024
FORMAT = pyaudio.paInt16

tone_silent = True
tone_error = 50.0 # expected maximum tone error in hz from fft detect
first_tone_detect_count = 0
first_tone_sum = 0
second_tone_detect_count = 0
second_tone_sum = 0

def voice_detected(snd_data):
    return max(snd_data) > SILENCE_THRESHOLD

def wait_for_activity():
    p = pyaudio.PyAudio()
    stream = p.open(format = FORMAT,
                channels = CHANNELS,
                rate = RATE,
                input = True,
                input_device_index = 2,
                frames_per_buffer = 512)
    
    record_started_stamp = 0
    wav_filename = ''
    record_started = False
    #print("waiting for audio...", end = '')
    while 1:
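        try:
            snd_data = array('h', stream.read(512))
            #print (type(snd_data))
        except Exception:
            #print("Exception:wait_for_activity")
            # The read failed: treat this chunk as silence, then restart PyAudio
            # and reopen the stream (a terminated instance cannot open streams)
            snd_data = array('h', [0, 0])
            p.terminate()
            p = pyaudio.PyAudio()
            stream = p.open(format = FORMAT,
                channels = CHANNELS,
                rate = RATE,
                input = True,
                input_device_index = 2,
                frames_per_buffer = 512)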

            
        if byteorder == 'big':
            snd_data.byteswap()

        voice = voice_detected(snd_data)       
        del snd_data

        if voice:
            break
        
    stream.stop_stream()
    stream.close()
    p.terminate()
    return True

def decode_tone():
    global tone_silent
    global first_tone_detect_count
    global second_tone_detect_count
    global first_tone_sum
    global second_tone_sum 
    global tone_start_time
    first_tone = 0
    second_tone = 0
    test_count = 0
    tone_start_time = 0
    callback_output = []
    p = pyaudio.PyAudio()
    stream = p.open(format = FORMAT,
                channels = CHANNELS,
                rate = RATE,
                input = True,
                input_device_index = 2,
                frames_per_buffer = CHUNK_SIZE,
                )
    
    record_started_stamp = 0
    last_voice_stamp = 0
    wav_filename = ''
    record_started = False

    r = array('h')
    #print ("Checking Audio for tones on:", time.strftime("%m-%d-%y at %H:%M:%S"))

    while 1:
        snd_data = array('h', stream.read(CHUNK_SIZE))
        if byteorder == 'big':
            snd_data.byteswap()
        r.extend(snd_data)

##########################################################

        if max(snd_data) > SILENCE_THRESHOLD:
                     
            fftData=abs(np.fft.rfft(snd_data))**2
            which = fftData[1:].argmax() + 1
            if which != len(fftData)-1:
                y0,y1,y2 = np.log(fftData[which-1:which+2:])
                x1 = (y2 - y0) * .5 / (2 * y1 - y2 - y0)
                thefreq = (which+x1)*RATE/CHUNK_SIZE
                #print(f"The freq is {thefreq} Hz.")
            else:
                thefreq = which*RATE/CHUNK_SIZE
                #print (f"The freq is {thefreq} Hz.")
                
            #print (tone_silent)
            # Boulder county tone out freqs are between 500 and 2100
            if thefreq > 500 and thefreq < 2100:
                #print ("Possible tone",thefreq,"Hz @", time.time())
                #print (tone_silent)
                if tone_silent:
                    #print("Tone detect",thefreq,"Hz")
                    first_tone_detect=thefreq
                    tone_start_time = time.time()
                    tone_silent = False
                    first_tone_detect_count = 1
                    first_tone = thefreq
                    first_tone_sum = thefreq
                    second_tone_sum = 0
                    test_count = 0
                    #print(tone_silent)

                #filter tx chirp - Boulder County has a TX chirp running close to 1948Hz
                if time.time()- tone_start_time <= 0.025 and first_tone-12 < 1948.0 and first_tone+12 > 1948.0:
                    #print("tx chirp detect, tone:",first_tone,"time:",time.time()- tone_start_time)
                    tone_start_time = 0
                    tone_silent = True
                    first_tone_detect_count = 0
                    second_tone_detect_count = 0
                    test_count = 0
               
            
                #first half of tone
                if time.time()- tone_start_time <= 0.700:
                    #print ("First Time",time.time()- tone_start_time)
                    if first_tone_detect_count > 0:
                        #print("First tone",thefreq,"Hz @ count",first_tone_detect_count)
                        #print( first_tone-tone_error, thefreq,first_tone+tone_error )
                        if thefreq <= first_tone+tone_error and thefreq >= first_tone-tone_error:
                            first_tone_sum = first_tone_sum + thefreq
                            first_tone_detect_count += 1
                            first_tone = first_tone_sum / first_tone_detect_count
                            #print ("t1",first_tone, first_tone_detect_count)
                            
                    
                #second half of tone
                if time.time()- tone_start_time >= 0.850 and time.time()- tone_start_time <= 1.000:
                    second_tone_sum = second_tone_sum + thefreq
                    second_tone_detect_count += 1
                    second_tone = second_tone_sum / second_tone_detect_count
     
                    if first_tone_detect_count <=4:
                        #print("reset start second tone with t1 ct:", first_tone_detect_count,"t1",first_tone)
                        tone_start_time = 0
                        tone_silent = True
                        first_tone_detect_count = 0
                        second_tone_detect_count = 0
                        test_count = 0
   
     
                if time.time()- tone_start_time >= 1.000 and time.time()- tone_start_time <= 2.0: 
                    if thefreq <= second_tone+tone_error and thefreq >= second_tone-tone_error:
                        second_tone_sum = second_tone_sum + thefreq
                        second_tone_detect_count += 1
                        second_tone = second_tone_sum / second_tone_detect_count
                        #print ("t2",second_tone,second_tone_detect_count)
                               
      

                if time.time()- tone_start_time >= 2.250:
                    #print("reset timeout at 2.25 secs, first tone",first_tone,"Count",first_tone_detect_count,)
                    #print ("t2",second_tone,second_tone_detect_count)
                    tone_start_time = 0
                    tone_silent = True
                    first_tone_detect_count = 0
                    second_tone_detect_count = 0
                    test_count = 0 
 
                        
                if first_tone_detect_count >=5 and second_tone_detect_count > 12:
                    print ("****** Tone out on", time.strftime("%m-%d-%y at %H:%M:%S"))
                    print ("       1: {:0.1f}Hz 2:{:1.1f}Hz".format (first_tone,second_tone))
 
                    tone_start_time = 0
                    tone_silent = True
                    first_tone_detect_count = 0
                    second_tone_detect_count = 0
                    test_count = 0
                    #print("reset valid tone detected")
       

##########################################################
    
        voice = voice_detected(snd_data)
 
        if voice and record_started:
            last_voice_stamp = time.time();
        elif voice and not record_started:
            record_started = True
            record_started_stamp = last_voice_stamp = time.time();
        
        if record_started and time.time() > (last_voice_stamp + RECORD_AFTER_SILENCE_SECS):
            break

    tone_start_time = 0
    tone_silent = True
    first_tone_detect_count = 0
    second_tone_detect_count = 0
    test_count = 0
    #print("reset end of sound")
     

    return

#########################################################

        
while 1:
    idle=wait_for_activity()
    decode_tone()

So it is time to let it listen to live dispatch again, before the Mad Scientist Hut moves on to the next step of this project: displaying department/unit info for each tone out…

Dispatch Tone Out Decoder Part 2: Picking Out Tone Frequencies from Radio Traffic

For the last few days at the Mad Scientist Hut I have been recording radio traffic from the Broadcastify channel for Boulder County dispatch. This channel contains overlapping audio from both sheriff and fire dispatch and responding units. I know that on my phone scanner apps I can separate the fire from the sheriff dispatches by choosing the left or right audio channel, but for some reason that does not work on my Win10 laptop. So while I wait for the USB audio card that will attach to my RadioShack Pro-2052 scanner I am just going to live with what I have from Broadcastify.

The program running in the background has recorded a lot of mp3 files, about 300MB worth. I have picked through a few of the recorded mp3 files to find tone out portions without overlapping audio for this next part of the project. Here is a sample of the picked files:

This is the file we will use for debugging:

Just to keep things simple for debugging I am going to use an audio editor to remove everything but the tone portion.

Here is a view in Audacity of the Lyons Fire and AMR tone out mp3 file:

After cutting everything but the Lyons Fire tone we see this:

Looking closely at the tones generated by dispatch, we can see the first tone is ~750 milliseconds long and the second tone follows without a gap and is ~1250 milliseconds long. This gives us an idea of what to search for when analyzing the audio signal.

Then we export the file as Lyons.mp3 for playback while looking at the tones:

I have created a separate version of the project program just for analyzing the tones:

"""
   Tone Out Decoder for dispatch tone decoding
   Copyright (C) 2022  Kirk Garrison <madscientist@madscientisthut.com>

   Voice activated audio recorder intended for scanner radio use
   Copyright (C) 2018  Kari Karvonen <oh1kk at toimii.fi>

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software Foundation,
   Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301  USA
                                       
"""
from sys import byteorder
from array import array
from struct import pack

import time
import pyaudio
import wave
import os

###################################################

import numpy as np

####################################################
SILENCE_THRESHOLD = 3000
RECORD_AFTER_SILENCE_SECS = 0

RATE = 44100
CHANNELS = 1
MAXIMUMVOL = 32767
CHUNK_SIZE = 1024
FORMAT = pyaudio.paInt16

def voice_detected(snd_data):
    return max(snd_data) > SILENCE_THRESHOLD

def wait_for_activity():
    p = pyaudio.PyAudio()
    stream = p.open(format = FORMAT,
                channels = CHANNELS,
                rate = RATE,
                input = True,
                input_device_index = 2,
                frames_per_buffer = CHUNK_SIZE)
    
    record_started_stamp = 0
    wav_filename = ''
    record_started = False
    print("waiting for audio...")
    while 1:
        snd_data = array('h', stream.read(CHUNK_SIZE))
        if byteorder == 'big':
            snd_data.byteswap()

        voice = voice_detected(snd_data)       
        del snd_data

        if voice:
            break
        
    stream.stop_stream()
    stream.close()
    p.terminate()
    return True

def decode_audio():
    
    callback_output = []
    p = pyaudio.PyAudio()
    stream = p.open(format = FORMAT,
                channels = CHANNELS,
                rate = RATE,
                input = True,
                input_device_index = 2,
                frames_per_buffer = CHUNK_SIZE,
                )
    
    record_started_stamp = 0
    last_voice_stamp = 0
    wav_filename = ''
    record_started = False

    r = array('h')

    while 1:
        snd_data = array('h', stream.read(CHUNK_SIZE))
        if byteorder == 'big':
            snd_data.byteswap()
        r.extend(snd_data)

##########################################################
                 
        fftData=abs(np.fft.rfft(snd_data))**2
        # find the peak audio
        which = fftData[1:].argmax() + 1
        #print (which)
        if which != len(fftData)-1:
            y0,y1,y2 = np.log(fftData[which-1:which+2:])
            x1 = (y2 - y0) * .5 / (2 * y1 - y2 - y0)
            thefreq = (which+x1)*RATE/CHUNK_SIZE
            print(f"The freq is {thefreq} Hz.")
        else:
            thefreq = which*RATE/CHUNK_SIZE
            print (f"The freq is {thefreq} Hz.")

##########################################################
    
        voice = voice_detected(snd_data)
 
        if voice and record_started:
            last_voice_stamp = time.time();
        elif voice and not record_started:
            record_started = True
            record_started_stamp = last_voice_stamp = time.time();
        
        if record_started and time.time() > (last_voice_stamp + RECORD_AFTER_SILENCE_SECS):
            break

    return

#########################################################

      
while 1:
    idle=wait_for_activity()
    decode_audio()

When the code is run against the Lyons.mp3 file we get this output:

waiting for audio...
The freq is 947.5150342473756 Hz.
The freq is 948.7854148771141 Hz.
The freq is 948.0139136733909 Hz.
The freq is 948.1462187442012 Hz.
The freq is 948.0086656342902 Hz.
The freq is 947.6618726923299 Hz.
The freq is 947.740627066793 Hz.
The freq is 947.7693989218812 Hz.
The freq is 948.0775335043835 Hz.
The freq is 947.8162893928887 Hz.
The freq is 948.1886012230533 Hz.
The freq is 947.8380049287326 Hz.
The freq is 947.8730128374611 Hz.
The freq is 948.2286984816029 Hz.
The freq is 948.1418436942806 Hz.
The freq is 947.9948079399098 Hz.
The freq is 948.007548493049 Hz.
The freq is 948.2537731002493 Hz.
The freq is 947.8326519252762 Hz.
The freq is 946.7854508462983 Hz.
The freq is 1743.4608182619825 Hz.
The freq is 1742.0425412978634 Hz.
The freq is 1741.9231741686472 Hz.
The freq is 1742.371826070488 Hz.
The freq is 1742.1494317507695 Hz.
The freq is 1742.2122620549544 Hz.
The freq is 1742.4424680868483 Hz.
The freq is 1742.3125170711137 Hz.
The freq is 1742.7119612099782 Hz.
The freq is 1741.459887860244 Hz.
The freq is 1742.0782854459844 Hz.
The freq is 1741.8200041956156 Hz.
The freq is 1741.807972001664 Hz.
The freq is 1741.672695794544 Hz.
The freq is 1741.202896219085 Hz.
The freq is 1741.4826458269972 Hz.
The freq is 1741.1146482429865 Hz.
The freq is 1741.267362389106 Hz.
The freq is 1741.006884735257 Hz.
The freq is 1740.9436611977917 Hz.
The freq is 1741.1155199144227 Hz.
The freq is 1741.2925550824457 Hz.
The freq is 1741.0301588575699 Hz.
The freq is 1741.5367101331321 Hz.
The freq is 1741.6710684894783 Hz.
The freq is 1741.6596282508099 Hz.
The freq is 1741.758766463525 Hz.
The freq is 1741.7507043035375 Hz.
The freq is 1741.7147682463435 Hz.
The freq is 1741.6926434472268 Hz.
The freq is 1742.0094989947552 Hz.
The freq is 1742.0332111678813 Hz.
The freq is 1742.0986274227405 Hz.
The freq is 1741.8256500887956 Hz.
The freq is 1742.1461570611496 Hz.
The freq is 1742.029012463372 Hz.
The freq is 1741.6497173883001 Hz.
The freq is 1741.9912900146207 Hz.
The freq is 1741.9744676040787 Hz.
The freq is 1741.8872850525745 Hz.
The freq is 1742.0255865123036 Hz.
The freq is 1741.7520628653326 Hz.
The freq is 1741.189106519698 Hz.
The freq is 1741.4242117738945 Hz.
The freq is 1741.4270616083645 Hz.
The freq is 1741.3258431496656 Hz.
The freq is 1741.4909123608206 Hz.
The freq is 1741.2091183766859 Hz.
The freq is 1741.0765715236573 Hz.
The freq is 1741.0187414372963 Hz.
The freq is 1740.977703436862 Hz.
The freq is 1741.3637533784447 Hz.
The freq is 1741.0890592580956 Hz.
The freq is 1741.4511296221801 Hz.
The freq is 1748.346815451805 Hz.
The freq is 100.6997152620881 Hz.
waiting for audio...

If we compare it to the actual frequencies for dispatch, we can see we are right in the ballpark:

Now that we can see the frequencies are ‘matching’, we can write code to find the tone outs.
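
Before the full listing, the matching idea can be tried offline on numbers already printed in the log. The sketch below is not the decoder's method (the listing that follows uses time windows). It only groups consecutive readings that stay within a tolerance of their running average. The names `group_tones` and `min_count` are made up for illustration, and the sample values are rounded from the log above.

```python
def group_tones(readings, tolerance_hz=10.0, min_count=3):
    """Group consecutive readings that stay within tolerance_hz of the running mean."""
    tones = []                 # finished (mean_hz, count) pairs
    total, count = 0.0, 0      # running sum and size of the current group
    for freq in readings:
        if count and abs(freq - total / count) <= tolerance_hz:
            total += freq
            count += 1
        else:
            # The reading does not fit the current group: keep the group only
            # if it is long enough, then start a new group at this reading.
            if count >= min_count:
                tones.append((total / count, count))
            total, count = freq, 1
    if count >= min_count:
        tones.append((total / count, count))
    return tones

# A handful of readings rounded from the log above, including the stray last value.
sample = [947.5, 948.8, 948.0, 948.1, 946.8,
          1743.5, 1742.0, 1741.9, 1742.4, 1748.3, 100.7]

for mean_hz, count in group_tones(sample):
    print(f"{mean_hz:.1f} Hz over {count} readings")
```

Short groups, such as the single low reading at the end of the log, are dropped by `min_count`, which plays the same role as the count thresholds in the real decoder.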

Here I have modified the code to detect the actual tones. There may be a much more elegant way to do this, but it works:


"""
   Tone Out Decoder for dispatch tone decoding
   Copyright (C) 2022  Kirk Garrison <madscientist@madscientisthut.com>

   Voice activated audio recorder intended for scanner radio use
   Copyright (C) 2018  Kari Karvonen <oh1kk at toimii.fi>

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software Foundation,
   Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301  USA
                                       
"""
from sys import byteorder
from array import array
from struct import pack

import time
import pyaudio
import wave
import os

###################################################

import numpy as np

####################################################
SILENCE_THRESHOLD = 3000
RECORD_AFTER_SILENCE_SECS = 5

RATE = 44100
CHANNELS = 1
MAXIMUMVOL = 32767
CHUNK_SIZE = 1024
FORMAT = pyaudio.paInt16

tone_silent = True
tone_error = 10.0 # expected maximum tone error in hz from fft detect
first_tone_detect_count = 0
first_tone_sum = 0
second_tone_detect_count = 0
second_tone_sum = 0


def voice_detected(snd_data):
    return max(snd_data) > SILENCE_THRESHOLD

def wait_for_activity():
    p = pyaudio.PyAudio()
    stream = p.open(format = FORMAT,
                channels = CHANNELS,
                rate = RATE,
                input = True,
                input_device_index = 2,
                frames_per_buffer = CHUNK_SIZE)
    
    record_started_stamp = 0
    wav_filename = ''
    record_started = False
    print("waiting for audio...")
    while 1:
        snd_data = array('h', stream.read(CHUNK_SIZE))
        if byteorder == 'big':
            snd_data.byteswap()

        voice = voice_detected(snd_data)       
        del snd_data

        if voice:
            break
        
    stream.stop_stream()
    stream.close()
    p.terminate()
    return True


def decode_tone():
    global tone_silent
    global first_tone_detect_count
    global second_tone_detect_count
    global first_tone_sum
    global second_tone_sum 
    global tone_start_time
    first_tone = 0
    second_tone = 0
    callback_output = []
    p = pyaudio.PyAudio()
    stream = p.open(format = FORMAT,
                channels = CHANNELS,
                rate = RATE,
                input = True,
                input_device_index = 2,
                frames_per_buffer = CHUNK_SIZE,
                )
    
    record_started_stamp = 0
    last_voice_stamp = 0
    wav_filename = ''
    record_started = False

    r = array('h')

    while 1:
        snd_data = array('h', stream.read(CHUNK_SIZE))
        if byteorder == 'big':
            snd_data.byteswap()
        r.extend(snd_data)

##########################################################
                 
        fftData=abs(np.fft.rfft(snd_data))**2
        which = fftData[1:].argmax() + 1
        if which != len(fftData)-1:
            y0,y1,y2 = np.log(fftData[which-1:which+2:])
            x1 = (y2 - y0) * .5 / (2 * y1 - y2 - y0)
            thefreq = (which+x1)*RATE/CHUNK_SIZE
            #print(f"The freq is {thefreq} Hz.")
        else:
            thefreq = which*RATE/CHUNK_SIZE
            

        # Boulder county tone out freqs are between 500 and 2100
        if thefreq > 500 and thefreq < 2100:
            #print ("Possible tone",thefreq,"Hz @", time.time())
            #print (tone_silent)
            if tone_silent:
                first_tone_detect=thefreq
                tone_start_time = time.time()
                tone_silent = False
                first_tone_detect_count = 1
                first_tone = thefreq
                first_tone_sum = thefreq
                second_tone_sum = 0
                
            #first half of tone
            if time.time()- tone_start_time <= 0.700:
                if first_tone_detect_count > 0:
                    if first_tone-tone_error <= thefreq <= first_tone+tone_error:
                        first_tone_sum = first_tone_sum + thefreq
                        first_tone_detect_count += 1
                        first_tone = first_tone_sum / first_tone_detect_count
                        
        
            #second half of tone
            if time.time()- tone_start_time >= 0.800 and time.time()- tone_start_time <= 1.000:
                second_tone_sum = second_tone_sum + thefreq
                second_tone_detect_count += 1
                second_tone = second_tone_sum / second_tone_detect_count

 
            if time.time()- tone_start_time >= 1.000 and time.time()- tone_start_time <= 2.0: 
                if thefreq <= second_tone+tone_error and thefreq >= second_tone-tone_error:
                    second_tone_sum = second_tone_sum + thefreq
                    second_tone_detect_count += 1
                    second_tone = second_tone_sum / second_tone_detect_count

            if time.time()- tone_start_time >= 2.250:
                if first_tone_detect_count >10 and second_tone_detect_count > 20:
                    print ("Tone out on:", time.strftime("%m-%d-%y at %H:%M:%S"))
                    print ("First tone",first_tone,"Hz , Count",first_tone_detect_count)
                    print ("second tone",second_tone,"Hz , Count",second_tone_detect_count)

                tone_start_time = 0
                tone_silent = True
                first_tone_detect_count = 0
                second_tone_detect_count = 0
 
##########################################################
    
        voice = voice_detected(snd_data)
 
        if voice and record_started:
            last_voice_stamp = time.time();
        elif voice and not record_started:
            record_started = True
            record_started_stamp = last_voice_stamp = time.time();
        
        if record_started and time.time() > (last_voice_stamp + RECORD_AFTER_SILENCE_SECS):
            break

   
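    # Release the audio device before returning so repeated calls do not leak streams
    stream.stop_stream()
    stream.close()
    p.terminate()
    return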

#########################################################

while 1:
    idle=wait_for_activity()
    decode_tone()
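
The frequency estimate inside decode_tone() can be checked without a radio or sound card. This is a minimal sketch that pulls the same FFT peak and parabolic fit into a standalone function and feeds it a synthetic sine wave. The function name `estimate_freq` and the test tone are my own additions for illustration. `RATE` and `CHUNK_SIZE` match the listing.

```python
import numpy as np

RATE = 44100        # samples per second, as in the listing
CHUNK_SIZE = 1024   # samples per FFT block, as in the listing

def estimate_freq(samples, rate=RATE):
    """Return the dominant frequency in Hz using a parabolic fit around the FFT peak."""
    power = np.abs(np.fft.rfft(samples)) ** 2
    peak = int(power[1:].argmax()) + 1           # skip the DC bin
    if peak == len(power) - 1:
        return peak * rate / len(samples)        # no right-hand neighbour to fit
    # Fit a parabola through the log power of the peak bin and its two neighbours.
    y0, y1, y2 = np.log(power[peak - 1:peak + 2])
    offset = (y2 - y0) * 0.5 / (2 * y1 - y2 - y0)
    return (peak + offset) * rate / len(samples)

# Synthetic 948 Hz tone, one chunk long (an assumed test signal, not radio audio).
t = np.arange(CHUNK_SIZE) / RATE
test_tone = 10000 * np.sin(2 * np.pi * 948.0 * t)
print(f"Estimated {estimate_freq(test_tone):.1f} Hz")
```

With 1024 samples at 44100 Hz each FFT bin is about 43 Hz wide, so the fit between bins is what lets the readings land within a few Hz of the real tone.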

I have run it against this dispatch recording:

And this is the resulting output:

waiting for audio...
Tone out on: 12-26-22 at 12:52:03
First tone 948.2939742995145 Hz , Count 24
second tone 1741.6734801240202 Hz , Count 42
Tone out on: 12-26-22 at 12:52:08
First tone 1401.6512285269284 Hz , Count 32
second tone 1532.9641830719565 Hz , Count 51

waiting for audio...

The next part of the project will be to create a CSV lookup table that maps tones to departments and units.
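
As a rough idea of what that lookup could look like, here is a small sketch using Python's csv module. Everything about the table is an assumption: the column names `first_hz`, `second_hz` and `unit`, the placeholder unit labels, and the function names are hypothetical. The tone values are just the averages from the output above, rounded.

```python
import csv
import io

# Hypothetical table contents: placeholder labels, tone values rounded from the output above.
EXAMPLE_CSV = """first_hz,second_hz,unit
948.3,1741.7,Example Unit A
1401.7,1533.0,Example Unit B
"""

def load_tone_table(csv_file):
    """Read rows of (first tone, second tone, unit label) from an open CSV file."""
    return [(float(row["first_hz"]), float(row["second_hz"]), row["unit"])
            for row in csv.DictReader(csv_file)]

def lookup_unit(table, first_hz, second_hz, tolerance_hz=10.0):
    """Return the first unit whose two tones both fall within tolerance_hz, else None."""
    for first, second, unit in table:
        if abs(first - first_hz) <= tolerance_hz and abs(second - second_hz) <= tolerance_hz:
            return unit
    return None

# io.StringIO stands in for a real file opened with open("tones.csv", newline="").
table = load_tone_table(io.StringIO(EXAMPLE_CSV))
print(lookup_unit(table, 948.29, 1741.67))
```

The tolerance here reuses the 10 Hz `tone_error` value from the decoder, which may need tightening if two departments have tones close together.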