Calculating Averages from a CSV with Perl

Posted by hank, Sat Dec 08 18:09:00 UTC 2007

Code

Here’s a quick one-liner that uses a few UNIX utilities and Perl to compute per-column averages from CSV data:


for i in `seq 2 20`; do cat crim_rate_2005_by_state.csv | cut -d , -f $i | perl -e '$c=$d=0;$e="";while(<>){if(/^\d/){$c+=$_;$d+=1}else{s/\s{2,}/ /g;s/"//g;chomp($e=$_);}} print $e, ": ", $c/$d, "\n"'; done

And now, the spaced-out version:


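#!/bin/bash
# Average each of columns 2 through 20, one column per pass.
for i in `seq 2 20`; do
  cat crim_rate_2005_by_state.csv | \
  cut -d , -f $i | \
  perl -e '$c=$d=0;   # $c: running sum, $d: count of numeric rows
    $e="";            # $e: label for this column
    while(<>){
      if(/^\d/){      # row starts with a digit: add it to the sum
        $c+=$_;
        $d+=1
      } else {        # anything else is treated as the column header
        s/\s{2,}/ /g; # collapse runs of whitespace
        s/"//g;       # strip double quotes
        chomp($e=$_);
      }
    }
    print $e, ": ", $c/$d, "\n"';
done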

Output

  • Population: 5775431.88461538
  • Violent crime rate: 418.930769230769
  • Murder/manslaughter rate: 5.59038461538462
  • Forcible rape rate: 33.1634615384615
  • Robbery rate: 114.455769230769
  • Assault rate: 265.728846153846
  • Property crime rate: 3339.50961538462
  • Burglary rate: 685.671153846154
  • Larceny/theft rate: 2273.43269230769
  • Motor vehicle theft rate: 380.417307692308
  • Violent crime: 26928.3461538462
  • Murder and nonnegligent manslaughter: 335.730769230769
  • Forcible rape: 1809.67307692308
  • Robbery: 8128.30769230769
  • Aggravated assault: 16654.6346153846
  • Property crime: 196569.711538462
  • Burglary: 41756.0961538462
  • Larceny-theft: 130880.442307692
  • Motor vehicle theft: 23933.1730769231

So now we have our averages, though there is more work to be done. The data file used is available here:

crim_rate_2005_by_state.csv
