Measurements and Classification

ROCCA

Classification technique

ROCCA classifies whistles, clicks and encounters using Random Forest classifier models built with the open-source statistical software package Weka. For more information on Random Forests and Weka, see the book Data Mining: Practical Machine Learning Tools and Techniques.

Whistle Contour Measurement and Classification

Table 1 lists the 50 variables measured from each whistle contour.

| Variable Code | Variable Name | Units | Explanation |
|---|---|---|---|
| FREQBEGSWEEP | Beginning sweep | Categorical variable | Slope of the beginning sweep (1 = positive, -1 = negative, 0 = zero) |
| FREQBEGUP | Positive beginning sweep | Binary variable | 1 = beginning slope is positive, 0 = beginning slope is negative |
| FREQBEGDWN | Negative beginning sweep | Binary variable | 1 = beginning slope is negative, 0 = beginning slope is positive |
| FREQENDSWEEP | Ending sweep | Categorical variable | Slope of the ending sweep (1 = positive, -1 = negative, 0 = zero) |
| FREQENDUP | Positive ending sweep | Binary variable | 1 = ending slope is positive, 0 = ending slope is negative |
| FREQENDDWN | Negative ending sweep | Binary variable | 1 = ending slope is negative, 0 = ending slope is positive |
| FREQBEG | Beginning frequency | Hz | Beginning frequency |
| FREQEND | Ending frequency | Hz | Ending frequency |
| FREQMIN | Minimum frequency | Hz | Minimum frequency |
| DURATION | Duration | Seconds | Duration of the whistle |
| FREQRANGE | Frequency range | Hz | Maximum frequency - minimum frequency |
| FREQMAX | Maximum frequency | Hz | Maximum frequency |
| FREQMEAN | Mean frequency | Hz | Mean frequency |
| FREQMEDIAN | Median frequency | Hz | Median frequency |
| FREQSTDDEV | Standard deviation of the frequency | Hz | Standard deviation of the frequency |
| FREQSPREAD | Frequency spread | Hz | Difference between the 75th and the 25th percentiles of the frequency |
| FREQQUARTER1 | First quarter frequency | Hz | Frequency at one-quarter of the duration |
| FREQQUARTER2 | Half frequency | Hz | Frequency at one-half of the duration |
| FREQQUARTER3 | Third quarter frequency | Hz | Frequency at three-quarters of the duration |
| FREQCENTER | Center frequency | Hz | Minimum frequency + (maximum frequency - minimum frequency) / 2 |
| FREQRELBW | Relative bandwidth | None | (Maximum frequency - minimum frequency) / center frequency |
| FREQMAXMINRATIO | Maximum-minimum ratio | None | Maximum frequency / minimum frequency |
| FREQBEGENDRATIO | Beginning-ending ratio | None | Beginning frequency / ending frequency |
| FREQCOFM | Coefficient of frequency modulation | None | Take 20 frequency measurements equally spaced in time, then subtract each frequency value from the one before it. COFM is the sum of the absolute values of these differences, divided by 10,000. |
| FREQNUMSTEPS | Number of steps | None | Number of steps, where a step is a 10 percent or greater increase or decrease in frequency over two contour points |
| NUMINFLECTIONS | Number of inflection points | None | Number of changes in slope from positive to negative or from negative to positive |
| INFLMAXDELTA | Maximum delta | Seconds | Maximum time between inflection points |
| INFLMINDELTA | Minimum delta | Seconds | Minimum time between inflection points |
| INFLMAXMINDELTA | Maximum-minimum delta ratio | None | Maximum delta / minimum delta |
| INFLMEANDELTA | Mean delta | Seconds | Mean time between inflection points |
| INFLSTDDEVDELTA | Standard deviation delta | Seconds | Standard deviation of the time between inflection points |
| INFLMEDIANDELTA | Median delta | Seconds | Median of the time between inflection points |
| FREQSLOPEMEAN | Mean slope | Hz/second | Overall mean slope |
| FREQPOSSLOPEMEAN | Positive slope | Hz/second | Mean positive slope |
| FREQNEGSLOPEMEAN | Negative slope | Hz/second | Mean negative slope |
| FREQABSSLOPEMEAN | Absolute slope | Hz/second | Mean absolute value of the slope |
| FRQESLOPERATIO | Positive-negative slope ratio | None | Mean positive slope / mean negative slope |
| FREQSWEEPUPPERCENT | Percent positive | None | Percent of the whistle that has a positive slope |
| FREQSWEEPDWNPERCENT | Percent negative | None | Percent of the whistle that has a negative slope |
| FREQSWEEPFLATPERCENT | Percent flat | None | Percent of the whistle that has a zero slope |
| NUMSWEEPSUPDWN | Positive-negative slope | None | Number of inflection points that change from positive slope to negative slope |
| NUMSWEEPSDWNUP | Negative-positive slope | None | Number of inflection points that change from negative slope to positive slope |
| NUMSWEEPSUPFLAT | Positive-flat slope | None | Number of times the slope changes from positive to zero |
| NUMSWEEPSDWNFLAT | Negative-flat slope | None | Number of times the slope changes from negative to zero |
| NUMSWEEPSFLATDWN | Flat-negative slope | None | Number of times the slope changes from zero to negative |
| NUMSWEEPSFLATUP | Flat-positive slope | None | Number of times the slope changes from zero to positive |
| FREQSTEPUP | Steps up | None | Number of steps that have increasing frequency |
| FREQSTEPDOWN | Steps down | None | Number of steps that have decreasing frequency |
| INFLDUR | Inflection points / duration | None | Number of inflection points / duration |
| STEPDUR | Steps / duration | None | Number of steps / duration |
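Two of the Table 1 measures, FREQCOFM and NUMINFLECTIONS, may be easier to follow as code. The following is a minimal sketch and not ROCCA's own implementation. It assumes the contour is supplied as a list of frequencies in Hz whose points are equally spaced in time, and that a flat segment between a rise and a fall still counts as a single inflection. The function names `cofm` and `count_inflections` are hypothetical.

```python
def cofm(freqs_hz, n_points=20):
    """Coefficient of frequency modulation, following the Table 1 wording."""
    if len(freqs_hz) < 2:
        return 0.0
    # Assumption: contour points are equally spaced in time, so evenly
    # spaced indices give measurements that are equally spaced in time.
    last = len(freqs_hz) - 1
    samples = [freqs_hz[round(i * last / (n_points - 1))] for i in range(n_points)]
    # Absolute difference between each measurement and the one before it.
    diffs = [abs(b - a) for a, b in zip(samples, samples[1:])]
    return sum(diffs) / 10000.0


def count_inflections(freqs_hz):
    """Count slope sign changes (positive to negative or negative to positive)."""
    count = 0
    prev_sign = 0
    for a, b in zip(freqs_hz, freqs_hz[1:]):
        sign = (b > a) - (b < a)  # +1 rising, -1 falling, 0 flat
        if sign == 0:
            continue  # assumption: flat segments do not reset the direction
        if prev_sign != 0 and sign != prev_sign:
            count += 1
        prev_sign = sign
    return count
```

For a linear upsweep of 20 points that rises 100 Hz per point, the 19 differences are each 100 Hz, so `cofm` returns 1,900 / 10,000 = 0.19 and `count_inflections` returns 0.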

To classify a whistle, the vector of variables measured from that whistle is analysed with the random forest model, which contains hundreds of classification trees. Each tree in the forest classifies the whistle, and the final classification is the species that the greatest percentage of trees voted for. If the greatest percentage of tree votes falls below the whistle threshold (as specified in the ROCCA Parameters window), the whistle is classified as Ambiguous.
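The voting and threshold rule can be sketched as follows. This is an illustration only: ROCCA obtains its tree votes from a Weka model, whereas the sketch simply takes a list containing one species label per tree. The function name `classify_from_votes`, the parameter `threshold_percent` and the species labels are hypothetical.

```python
from collections import Counter


def classify_from_votes(tree_votes, threshold_percent):
    """Return (label, vote_percentages) for one detection.

    tree_votes: one species label per tree in the forest.
    threshold_percent: stand-in for the threshold set in the parameters window.
    """
    counts = Counter(tree_votes)
    total = len(tree_votes)
    percentages = {species: 100.0 * n / total for species, n in counts.items()}
    # Species with the greatest percentage of tree votes.
    # Assumption: ties go to whichever species was seen first.
    best_species, best_percent = max(percentages.items(), key=lambda kv: kv[1])
    if best_percent < threshold_percent:
        return "Ambiguous", percentages
    return best_species, percentages


# Ten trees: six vote for species_a, four for species_b.
votes = ["species_a"] * 6 + ["species_b"] * 4
label, percentages = classify_from_votes(votes, threshold_percent=50.0)
```

With a threshold of 50 the example is classified as `species_a` (60 percent of the votes); raising the threshold to 70 would make the same detection Ambiguous.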

Click Classification

Table 2 lists the 17 variables measured from each click.

| Variable Code | Variable Name | Units | Explanation |
|---|---|---|---|
| DURATION | Duration | Seconds | Duration of the click |
| FREQPEAK | Peak frequency | Hz | Frequency with the highest amplitude |
| BW3DBLOW | -3 dB bandwidth lower limit | Hz | First frequency lower than the peak frequency at which the amplitude has dropped by 3 dB |
| BW3DBHIGH | -3 dB bandwidth upper limit | Hz | First frequency higher than the peak frequency at which the amplitude has dropped by 3 dB |
| BW3DB | -3 dB bandwidth | Hz | BW3DBHIGH - BW3DBLOW |
| BW10DBLOW | -10 dB bandwidth lower limit | Hz | First frequency lower than the peak frequency at which the amplitude has dropped by 10 dB |
| BW10DBHIGH | -10 dB bandwidth upper limit | Hz | First frequency higher than the peak frequency at which the amplitude has dropped by 10 dB |
| BW10DB | -10 dB bandwidth | Hz | BW10DBHIGH - BW10DBLOW |
| RMSSIGNAL | Signal RMS | dB | Root-mean-square of the click amplitude |
| RMSNOISE | Noise RMS | dB | Root-mean-square of the noise amplitude |
| SNR | Signal-to-noise ratio | dB | RMSSIGNAL - RMSNOISE |
| NCROSSINGS | Number of zero crossings | None | Number of times the waveform crosses zero |
| SWEEPRATE | Sweep rate | kHz/ms | Sweep rate of the zero crossings |
| MEANTIMEZC* | Zero crossing mean time | ms | Mean time between zero crossings |
| MEDIANTIMEZC* | Zero crossing median time | ms | Median time between zero crossings |
| VARIANCETIMEZC | Zero crossing variance | ms² | Variance of the time between zero crossings |
| ICI | Inter-click interval | Seconds | Time from the end of one click to the start of the next click |

*Mean and median zero crossing times are not used in the current classifier, but are still calculated by the ROCCA algorithms. ROCCA ignores these variables during classification.
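Some of the Table 2 measures can be illustrated with a short sketch. It is not ROCCA's implementation. It assumes the click spectrum is available as parallel lists of frequencies (Hz) and levels (dB), that the waveform is a list of samples, and that RMS levels are expressed as 20·log10 of the RMS amplitude with an arbitrary reference. If the level never drops far enough, the sketch returns the edge of the spectrum. All function names are hypothetical.

```python
import math


def bandwidth_limits(freqs_hz, levels_db, drop_db):
    """Return (low, high): first frequencies below and above the peak at
    which the level has dropped by at least drop_db."""
    peak = max(range(len(levels_db)), key=lambda i: levels_db[i])
    cutoff = levels_db[peak] - drop_db
    lo = peak
    while lo > 0 and levels_db[lo] > cutoff:
        lo -= 1
    hi = peak
    while hi < len(levels_db) - 1 and levels_db[hi] > cutoff:
        hi += 1
    return freqs_hz[lo], freqs_hz[hi]


def rms_db(samples):
    """RMS level in dB (assumption: reference amplitude of 1.0)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms)


def count_zero_crossings(samples):
    """Count sign changes between consecutive samples (zero counts as positive)."""
    return sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))


# Made-up spectrum with its peak at 30 kHz.
freqs = [10000, 20000, 30000, 40000, 50000]
levels = [-12.0, -4.0, 0.0, -2.0, -11.0]
bw3_low, bw3_high = bandwidth_limits(freqs, levels, 3.0)
bw3 = bw3_high - bw3_low  # corresponds to BW3DB = BW3DBHIGH - BW3DBLOW

# SNR is the difference of the two RMS levels in dB.
snr = rms_db([1.0, -1.0, 1.0, -1.0]) - rms_db([0.1, -0.1, 0.1, -0.1])
```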

To classify a click, the vector of variables measured from that click is analysed with the random forest model, which contains hundreds of classification trees. Each tree in the forest classifies the click, and the final classification is the species that the greatest percentage of trees voted for. If the greatest percentage of tree votes falls below the click threshold (as specified in the ROCCA Parameters window), the click is classified as Ambiguous.

School Classification

Table 3 lists the 17 variables calculated based on whistle and click detections for each encounter (if specified by the user in the ROCCA parameters window):

| Variable Code | Variable Name | Units | Explanation |
|---|---|---|---|
| Encounter_Duration_s | Encounter duration | Seconds | Duration from the start of the first whistle/click to the end of the last whistle/click |
| Number_of_whistles | Number of whistles | None | Number of whistles |
| Whistle_Duration_s | Whistle duration | Seconds | Duration from the start of the first whistle to the end of the last whistle |
| Min_Time_Between_Whistle_Detections_s | Minimum time between whistles | Seconds | Minimum time between whistles |
| Max_Time_Between_Whistle_Detections_s | Maximum time between whistles | Seconds | Maximum time between whistles |
| Ave_Time_Between_Whistle_Detections_s | Average time between whistles | Seconds | Average time between whistles |
| Whistle_Detections_per_Second | Whistles per second | Counts/s | Number of whistles / whistle duration |
| Whistle_Density | Whistle density | None | Sum of the whistle durations / encounter duration |
| Ave_Whistle_Overlap | Average whistle overlap | None | Total duration during which whistles overlap / encounter duration |
| Number_of_Clicks | Number of clicks | None | Number of clicks |
| Click_Duration_s | Click duration | Seconds | Duration from the start of the first click to the end of the last click |
| Min_Time_Between_Click_Detections | Minimum time between clicks | Seconds | Minimum time between clicks |
| Max_Time_Between_Click_Detections | Maximum time between clicks | Seconds | Maximum time between clicks |
| Ave_Time_Between_Click_Detections | Average time between clicks | Seconds | Average time between clicks |
| Click_Detections_per_Second | Clicks per second | Counts/s | Number of clicks / click duration |
| Click_Density | Click density | None | Sum of the click durations / encounter duration |
| Ave_Click_Overlap | Average click overlap | None | Total duration during which clicks overlap / encounter duration |
| Lat* | Latitude | Deg | Latitude |
| Long* | Longitude | Deg | Longitude |

*Latitude and longitude are not measured from the whistle and click data, but are taken from the GPS source specified in the Source tab of the ROCCA Parameters window.
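The whistle rate, density and overlap variables in Table 3 can be sketched as below. This is an illustration under stated assumptions, not ROCCA's code: whistles are given as hypothetical `(start_s, end_s)` tuples, the encounter contains whistles only (so the encounter duration equals the whistle duration unless passed in explicitly), and "overlap" is taken to mean time during which at least two whistles are present.

```python
def encounter_whistle_stats(whistles, encounter_duration_s=None):
    """Compute whistle rate, density and overlap for one encounter.

    whistles: list of (start_s, end_s) tuples.
    encounter_duration_s: optional; defaults to the whistle duration.
    """
    first_start = min(start for start, _ in whistles)
    last_end = max(end for _, end in whistles)
    whistle_duration = last_end - first_start
    if encounter_duration_s is None:
        encounter_duration_s = whistle_duration

    # Sweep over start/end events to total the time with 2+ active whistles.
    events = []
    for start, end in whistles:
        events.append((start, 1))
        events.append((end, -1))
    events.sort()  # at equal times an end (-1) sorts before a start (+1)
    active = 0
    overlap = 0.0
    prev_t = events[0][0]
    for t, delta in events:
        if active >= 2:
            overlap += t - prev_t
        active += delta
        prev_t = t

    return {
        "detections_per_second": len(whistles) / whistle_duration,
        "density": sum(end - start for start, end in whistles) / encounter_duration_s,
        "ave_overlap": overlap / encounter_duration_s,
    }


# Three one-second whistles over a four-second encounter; the first two
# overlap for half a second.
stats = encounter_whistle_stats([(0.0, 1.0), (0.5, 1.5), (3.0, 4.0)])
```

In this example the rate is 3 / 4 = 0.75 whistles per second, the density is 3.0 / 4.0 = 0.75 and the average overlap is 0.5 / 4.0 = 0.125.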

Each encounter number holds a list of possible species based on the whistle/click classifier models used. There are two values stored for each species: the number of times a whistle/click has been classified to that species (displayed), and a cumulative total of all the percentage tree votes for the species (not displayed). When a new whistle/click classification is saved to an encounter number, the count of the classified species is increased by one and the percentage tree votes for each species are added to the corresponding cumulative totals.

The encounter is classified in one of two ways:

  1. If an encounter classifier has been loaded, the vector of encounter parameters and the random forest probabilities from the whistle and click classifiers are analysed with the encounter random forest and the encounter is classified as the species with the highest percentage of tree votes.

  2. If no encounter classifier has been selected by the user, the encounter is classified as the species with the highest cumulative percentage of tree votes. Note that this may differ from the species most often classified, which is the value shown in the sidebar species list. If the highest cumulative percentage of tree votes falls below the school threshold (as specified in the ROCCA Parameters window), the encounter is classified as Ambiguous.
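The bookkeeping for the second case (no encounter classifier loaded) can be sketched as follows. This is a hypothetical illustration, not ROCCA's implementation. In particular, the text does not say how the cumulative total is compared with the school threshold; the sketch assumes the total is divided by the number of detections to give a mean percentage first. The class name `EncounterTally` and the species labels are invented for the example.

```python
class EncounterTally:
    """Per-encounter species counts and cumulative tree-vote percentages."""

    def __init__(self):
        self.counts = {}            # times each species was the classification
        self.cumulative_votes = {}  # summed vote percentages per species
        self.n_detections = 0

    def add_detection(self, classified_as, vote_percentages):
        # Increase the count of the classified species by one.
        self.counts[classified_as] = self.counts.get(classified_as, 0) + 1
        # Add every species' vote percentage to its cumulative total.
        for species, pct in vote_percentages.items():
            self.cumulative_votes[species] = self.cumulative_votes.get(species, 0.0) + pct
        self.n_detections += 1

    def classify(self, school_threshold_percent):
        best = max(self.cumulative_votes, key=self.cumulative_votes.get)
        # Assumption: compare the mean vote percentage with the threshold.
        mean_pct = self.cumulative_votes[best] / self.n_detections
        if mean_pct < school_threshold_percent:
            return "Ambiguous"
        return best


tally = EncounterTally()
tally.add_detection("species_a", {"species_a": 45.0, "species_b": 40.0, "species_c": 15.0})
tally.add_detection("species_a", {"species_a": 45.0, "species_b": 40.0, "species_c": 15.0})
tally.add_detection("species_b", {"species_a": 10.0, "species_b": 85.0, "species_c": 5.0})
result = tally.classify(school_threshold_percent=50.0)
```

Here `species_a` is the species most often classified (two of three detections), but `species_b` has the higher cumulative total (165 against 100), so the encounter is classified as `species_b`.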
