Saturday, February 18, 2012

Calculating the Mode of a Dataset

Scenario
SQL script to calculate the mode of a dataset.

Approach
Determine the frequency of occurrence of each value. Sort from largest to smallest. Select the first frequency so long as there is more than one occurrence. Allow for ties in case of bimodal distributions. The script for the parts table can be found in the post for the median.
Script
USE parts
SELECT TOP 1 WITH TIES COUNT(*) occurrences, part_wgt mode
FROM parts
GROUP BY part_wgt
HAVING count(*) > 1
ORDER BY COUNT(*) DESC;

Result

No comments:

Post a Comment