publishing nike runs, part 1: numeric lists

| 2 Comments | No TrackBacks
I've been told to post something interesting soon or I'll lose my last reader (wassup broofa). It's probably obvious by now that the insufferably boring recent posts to this blog were generated by some program (call it nikepub). It would only be fair to write a blog entry that describes nikepub. This might also qualify as an interesting post and keep broofa around a little longer. I'll break it down into several parts and milk it as much as I can.

The data used in these posts is collected during runs by a Nike+Sportsband. Through USB and a clunky uploader app from Nike+ the data eventually makes it to the Nike+ website where it can be viewed in a reasonably nice way. Unfortunately that website is fantastically slow to load (it's all flash) and hard to navigate. The actionscript makes subsequent http requests that return XML with the actual data, and those requests are very fast. If ones Nike+ profile is public it's very simple to do the same requests and get the run data. With a little XML processing, some numeric list processing, the Google Charts API and some xml-rpc to metaWeblog.newPost nikepub can then pollute a blog with running charts.

The first version of nikepub was written in Scala. Scala is for academics and egomaniacs, so how could I resist. Scala can invoke java code without hassles which means the universe of java libraries out there is immediately available, so progress happens fast. On the other hand I found that it's still too easy to slip back into imperative style in Scala and I wanted to be as close to pure functional fun as possible which can only mean Haskell.

Quoting from the brief and mostly wrong history of programming languages: A committee formed by Simon Peyton-Jones, Paul Hudak, Philip Wadler, Ashton Kutcher, and People for the Ethical Treatment of Animals creates Haskell, a pure, non-strict, functional language. Haskell gets some resistance due to the complexity of using monads to control side effects. Wadler tries to appease critics by explaining that "a monad is a monoid in the category of endofunctors, what's the problem?"

This series of posts will describe the Haskell version of nikepub, which is still a work in progress. Some knowledge of Haskell is assumed and can be obtained here and here among many other places. In this first part we look at processing the running data to prepare it for google charts.

The data is a list of doubles representing the distance (in km) measured at intervals of 10 secs. The values are the cummulative distances, in other words each value is the total distance traversed so far at the time of the measurement. We want to chart how fast we were during the run. Runners don't talk in speed (distance / unit of time). Instead they use pace (time / unit of distance). In our case pace will be minutes it took to run one km.

To get to pace we apply a series of transformation to the original list of doubles.

First we subtract from each value the previous value to get speed:

diff :: (Num a) => [a] -> [a]
diff [] = []
diff xs = zipWith (-) (tail xs) xs

Then we filter out speed values of zero, since that would lead to infinite pace:

filter (/=0)

We then convert speed to pace:

map (\x -> (1.0 / (6.0 * x)))

 

Smaller pace numbers are faster, ie better. When viewing a chart though one has the tendency to associate values in the upper y-regions with better. That's why we want to flip the y axis making it point downwards. We cannot actually do this with google charts so instead we flip the values upside down. We compute the min and max values in the list and replace each value v with min + max - v:

minInList :: (Ord a) => [a] -> a
minInList = foldr1 min

maxInList :: (Ord a) => [a] -> a
maxInList = foldr1 max

flipInRange :: (Ord a, Num a) => [a] -> [a]
flipInRange [] = []
flipInRange xs = map (\x -> minV + maxV - x) xs
                 where
                   minV = minInList xs
                   maxV = maxInList xs

One surprise awaits us (when we actually chart the data, pretend we already have). The run data is noisy, the chart looks somewhat ugly. So we want to smooth it out a little. To do that we will use convolution and correlation with some kernels. First we define a kernel operation:

kernelOp :: (Num a) => [(a, a)] -> a
kernelOp = (foldr (+) 0) . (map (uncurry (*)))

This takes a list of tuples, multiplies the numbers in each tuple and adds up all the multiplication results.

Next we define convolution and correlation:

correlate :: (Num a) => [a] -> [a] -> [a]
correlate ks xs | length xs < length ks  = []
correlate ks xs = kernelOp (zip ks xs) : correlate ks (tail xs)

convolve :: (Num a) => [a] -> [a] -> [a]
convolve ks xs = correlate (reverse ks) xs

We also define two useful kernels:

gaussianKernel :: (Floating a) => Int -> [a]
gaussianKernel n = 
  map (g . fromIntegral) [(-n)..n]
  where
     y = fromIntegral n
     g x = (exp (-(x^2) / y^2)) / (y * (sqrt (2 * pi)))

movingAverageKernel :: (Floating a) => Int -> [a]
movingAverageKernel n = take n (repeat (1.0 / (fromIntegral n)))

Convolve and correlate will eat into our list making it shorter. The usual way to deal with this is to pad the list on the right with some value like the last element of the original list. We need a pad function:

pad :: (Num a) => Int -> a -> [a] -> [a]
pad n p [] = take n (repeat p)
pad n p xs = xs ++ (take k (repeat p))
             where 
               l = length xs
               k = (n - (l `mod` n)) `mod` n

In the next part we will see how to feed this into google charts. In the meantime here is the first module with all the functions defined so far:

module Codemanic.NumericLists
(
 correlate,
 convolve,
 gaussianKernel,
 movingAverageKernel,
 scale,
 diff,
 minInList,
 maxInList,
 pad,
 padPeriodic,
 flipInRange
) 
where

kernelOp :: (Num a) => [(a, a)] -> a
kernelOp = (foldr (+) 0) . (map (uncurry (*)))

correlate :: (Num a) => [a] -> [a] -> [a]
correlate ks xs | length xs < length ks  = []
correlate ks xs = kernelOp (zip ks xs) : correlate ks (tail xs)

convolve :: (Num a) => [a] -> [a] -> [a]
convolve ks xs = correlate (reverse ks) xs

gaussianKernel :: (Floating a) => Int -> [a]
gaussianKernel n = 
  map (g . fromIntegral) [(-n)..n]
  where
     y = fromIntegral n
     g x = (exp (-(x^2) / y^2)) / (y * (sqrt (2 * pi)))

movingAverageKernel :: (Floating a) => Int -> [a]
movingAverageKernel n = take n (repeat (1.0 / (fromIntegral n)))

scale :: (Num a) => [a] -> a -> [a]
scale s xs = map (\x -> x * s) xs

diff :: (Num a) => [a] -> [a]
diff [] = []
diff xs = zipWith (-) (tail xs) xs

minInList :: (Ord a) => [a] -> a
minInList = foldr1 min

maxInList :: (Ord a) => [a] -> a
maxInList = foldr1 max

pad :: (Num a) => Int -> a -> [a] -> [a]
pad n p [] = take n (repeat p)
pad n p xs = xs ++ (take k (repeat p))
             where 
               l = length xs
               k = (n - (l `mod` n)) `mod` n

padPeriodic :: (Num a) => Int -> [a] -> [a]
padPeriodic _ [] = []
padPeriodic n xs = xs ++ (take k (foldr (++) [] (take m (repeat xs))))
                   where
                     l = length xs
                     k = (n - (l `mod` n)) `mod` n
                     m = (k `div` l) + 1 

flipInRange :: (Ord a, Num a) => [a] -> [a]
flipInRange [] = []
flipInRange xs = map (\x -> minV + maxV - x) xs
                 where
                   minV = minInList xs
                   maxV = maxInList xs

No TrackBacks

TrackBack URL: http://www.codemanic.com/cgi-bin/mt4/mt-tb.cgi/5

2 Comments

Maybe get rid of {min,max}InList in favor of minimum and maximum, which are in Prelude.

Will do, thanks.

Leave a comment

Archives

Recent Comments

  • Uwe Hoffmann: Will do, thanks. read more
  • brian: Maybe get rid of {min,max}InList in favor of minimum and read more