Parsing Sigalert

Been wanting to do this for a while:

sigalert.com has wonderful data that can be mined for patterns in freeway traffic. It’d be great to harvest and analyze traffic trends.

TODO:

  • email Ken (urban planning/transportation PhD at UCI) to see if they have any data i can steal
  • email sigalert to see if they can give me access to their raw data (probably not… or should i just covertly spider it?)

Pages to gather from:

  • http://www.sigalert.com/speeds.asp?Region=Greater+Los+Angeles&Road=405%20South
  • http://www.sigalert.com/speeds.asp?Region=Greater+Los+Angeles&Road=405%20North
  • http://www.sigalert.com/speeds.asp?Region=Greater+Los+Angeles&Road=10%20East
  • http://www.sigalert.com/speeds.asp?Region=Greater+Los+Angeles&Road=10%20West
  • http://www.sigalert.com/speeds.asp?Region=Greater+Los+Angeles&Road=5%20North
  • http://www.sigalert.com/speeds.asp?Region=Greater+Los+Angeles&Road=5%20South
  • (and perhaps others…though don’t want to overload their servers too much or alert them to my presence)
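
a rough sketch of how the page list above could be generated and fetched politely, in Python rather than cron + wget. The road names and URL pattern come straight from the list; `speeds_url`, `fetch_all` and the 5-second pause between requests are my own made-up names and numbers, and i haven't checked that the pages still answer at these addresses.

```python
import time
from urllib.parse import quote, quote_plus
from urllib.request import urlopen

BASE = "http://www.sigalert.com/speeds.asp"
REGION = "Greater Los Angeles"
ROADS = ["405 South", "405 North", "10 East", "10 West", "5 North", "5 South"]


def speeds_url(road, region=REGION):
    """Build one speeds page URL in the same shape as the list above."""
    # The listed URLs encode the region with '+' and the road with '%20',
    # so quote_plus is used for one and quote for the other.
    return f"{BASE}?Region={quote_plus(region)}&Road={quote(road)}"


def fetch_all(delay_seconds=5.0):
    """Fetch every page once, pausing between requests to go easy on the server."""
    pages = {}
    for road in ROADS:
        with urlopen(speeds_url(road), timeout=30) as response:
            pages[road] = response.read()
        time.sleep(delay_seconds)  # assumed pause, not a documented limit
    return pages
```

Running `fetch_all()` from a scheduler every 25 minutes or so would give one snapshot per road per run.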

Programming TODO:

  1. figure out how to use cron and wget
  2. start sucking down pages, every 25 minutes or so
  3. write a parser to grab the necessary data from the pages in question
  4. find a good DB backend to store my data in
  5. cron extract freeway speeds and write them to DB (get it all working together)
  6. data processing
    • current speeds
    • average speeds (model this after web-traffic data: avg per day-of-week, month, weekday/weekend, etc)
  7. figure out distance measures between points on the map
  8. use the prior to calculate a numerical value of travel-time (e.g. “25 minutes if you leave now”)
  9. explore equations that fit traffic
    • accidents
      • measure severity by time-to-normal
      • clustering of severity
      • resolution with respect to location (because the front cars are accelerating and the back cars are slowing down, congestion would move backwards along a road like a wave, even once the original accident site is resolved, huh?)
      • resolution with respect to time (how long does it take an accident to dissolve?)
    • rush-hour trends (can i treat rush-hour and congestion as an accident without a specific center?)
    • smoothing, with respect to day-of-week, accident resolution, hour-of-day
    • The effects of reverse commutes (e.g. the 405 where both sides are backed up)
    • Average traffic speeds for specific exits
  10. adapt estimated time with averages, to see if I can get an estimate of “how long will it take me if i leave +5 minutes, +10 minutes, +20 minutes, etc…”
  11. measure a minutes-here-waiting-vs.-minutes-on-the-road ratio
  12. put all of this into a web frontend
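
a minimal sketch of steps 6 through 8, assuming the parser already hands back speed readings and that i somehow know the distance between measurement points. Everything here is hypothetical: the function names, the (length, speed) segment format, and the 1 mph floor that keeps a stopped segment from dividing by zero. It treats each segment as driven at its current speed, which ignores how traffic changes while you're on the road.

```python
from collections import defaultdict


def average_by_weekday_hour(samples):
    """Average speed per (weekday, hour) bucket.

    samples: iterable of (datetime, speed_mph) pairs for one measurement point.
    Returns a dict mapping (weekday 0-6, hour 0-23) to the mean speed.
    """
    totals = defaultdict(lambda: [0.0, 0])  # bucket -> [sum of speeds, count]
    for timestamp, speed_mph in samples:
        bucket = (timestamp.weekday(), timestamp.hour)
        totals[bucket][0] += speed_mph
        totals[bucket][1] += 1
    return {bucket: total / count for bucket, (total, count) in totals.items()}


def travel_time_minutes(segments, min_speed_mph=1.0):
    """Estimate trip time from per-segment distance and speed.

    segments: iterable of (length_miles, speed_mph) pairs along the route.
    """
    total_hours = 0.0
    for length_miles, speed_mph in segments:
        # Clamp the speed so a reading of 0 mph does not divide by zero.
        total_hours += length_miles / max(speed_mph, min_speed_mph)
    return total_hours * 60.0
```

Feeding `travel_time_minutes` the bucketed averages for a later hour instead of the current speeds would be one crude way to get at the "what if i leave +10 minutes" question in step 10.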

Leonard recommends

  • This for an example of WWW::Mechanize and HTML::TokeParser
  • DBI to hook perl into mysql
  • output to web using php linking into mysql
  • jpgraph as an alternative to gnuplot which i was thinking of using
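
for comparison, a small sketch of the storage side. This is not Leonard's perl + DBI + mysql setup; it swaps in Python's built-in sqlite3 module just to show the shape of the table. The table name, columns and the location names in the usage lines are all assumptions of mine.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) the speeds database; ':memory:' keeps it throwaway."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS speeds (
               fetched_at TEXT NOT NULL,  -- ISO timestamp of the page fetch
               road       TEXT NOT NULL,  -- e.g. '405 South'
               location   TEXT NOT NULL,  -- measurement point on that road
               speed_mph  REAL
           )"""
    )
    return conn


def record(conn, fetched_at, road, readings):
    """Store one fetch; readings is a list of (location, speed_mph) pairs."""
    conn.executemany(
        "INSERT INTO speeds VALUES (?, ?, ?, ?)",
        [(fetched_at, road, location, mph) for location, mph in readings],
    )
    conn.commit()


def average_speed(conn, road):
    """Mean of all stored readings for one road, or None if there are none."""
    row = conn.execute(
        "SELECT AVG(speed_mph) FROM speeds WHERE road = ?", (road,)
    ).fetchone()
    return row[0]
```

Usage would look something like `conn = open_db("sigalert.db")`, then `record(conn, "2004-06-28T08:00", "405 South", [("Wilshire", 20.0), ("Sunset", 40.0)])` after each parse.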

Ken says:

  • The Transportation Research Board has a bunch of good papers
  • The University of California Transportation Center does too, specifically

    • http://www.uctc.net/access/access.asp
  • But, analyzing traffic dynamics just by knowing traffic speeds at points on the freeway is a pipedream…there are way too many other variables involved.
  • UCI’s ITS (and UCLA, too) should have all the data i could ever want…
    • Brian Taylor is a “transportation guru” at UCLA
    • Hiro Iseki is a PhD candidate at UCLA (hiseki @ ucla.edu) (his background is in transportation engineering )

stuff i want to learn

  • vi
  • procmail
  • mutt
  • cron
  • wget
  • arabic
  • hindi
  • german
  • lisp or scheme
  • the use of subliminal psychology in day-to-day interpersonal interactions
  • how to use cards as deadly throwing weapons

firefox extensions

from a /. thread this morning: firefox extensions (here are some hacks to get them to work in 0.9.1 again like they did in 0.8)

email processing

USC (because they don’t want to go to too much trouble when they’re subpoena’d, i guess) is deleting all 6-month-or-older mail as of Sept 2. I need to find some other solution for syncing my mail on both my home and work machines.

  • gmail is providing good caching, though it would be nice if i could automate tagging of mass, personal, etc. (why can’t i define lists of people, and say “anyone from this group, label it as ‘personal’ or ‘family’ or ‘isi’”?)
  • my friend leonard has a nice setup with getmail, crm114, Maildir, procmail, mysql, running on a machine at work. he has automatic statistic generation (at http://webster.usc.edu/~lhl/mail/ and the procmail locs at http://webster.usc.edu/~lhl/mail/0628), and uses squirrelmail as a webmail frontend. That guy is on top of things…
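
the "lists of people" idea is easy enough to sketch outside of gmail. This is just an illustration in Python, not how gmail, procmail or Leonard's setup does it; the group names come from the wish above, while the addresses, `GROUPS` and `label_for` are invented for the example.

```python
from email.utils import parseaddr

# Hypothetical sender groups: label -> set of lowercase addresses.
GROUPS = {
    "family": {"mom@example.com", "dad@example.com"},
    "isi": {"boss@example.org"},
}


def label_for(from_header, groups=GROUPS, default="mass"):
    """Return the label of the first group containing the sender's address."""
    _, address = parseaddr(from_header)  # splits 'Name <addr>' into its parts
    address = address.lower()
    for label, members in groups.items():
        if address in members:
            return label
    return default  # anyone not on a list is treated as mass mail
```

A delivery script could call `label_for(message["From"])` on each incoming message and file it into the matching Maildir folder.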

So here’s what i need

  • local mailserver (?)
  • webmail interface (squirrelmail?)
  • maildir/mbox local files to process on
  • popfile or crm114 as preprocessing for delivered mail (spam, and general sorting)
  • procmail (learn how to use it…)

Maybe i’ll bring a small computer to ISI to act as a webserver for personal hacking & projects. It’s a pain, with my current webserver only available when fairuz is booted into windows. And i want to transition more into linux…

Eclipse v3.0

Wow, my life is a bit more complete! Eclipse v3.0, a beautiful Java IDE made by IBM, was just released this morning…

The Prisoner of Azkaban

Just heard a replay of Elvis Mitchell’s “The Treatment” on NPR this evening, that started off with a wonderful interview of Alfonso Cuarón, director of the latest Potter movie (and also the director of Y Tu Mama Tambien and A Little Princess). The high point of the interview occurred when Cuarón discussed how he would talk to the young actors to get them into his vision of the movie:

Just picture Professor Lupin as “Your favorite gay uncle who does smack.”

Such a vivid, true, true image…

Listen to the interview here on KCRW’s web site.

hungry enough for dinner

Problem constraints:

  • It’s supposedly “breakfast” time here, but my stomach is still on Italian time and i’m hungry enough for dinner…
  • Nothing in the house but nonperishables, i’ve been gone for 3 weeks and my apartmentmate has an aversion to shopping…
  • Craving Chinese food–haven’t had any quality chinese food in ages, due to my time in Europe…

Solution:
a hacked-up attempt at 麻醬麵 (noodles with sesame sauce)

  • sesame paste (spoonful)
  • hot water (spoonful)
  • vinegar (spoonful–add more to taste later as needed)
  • soy sauce (a little less than a spoonful)
  • sugar (one teaspoon)
  • minced garlic & ginger
  • a bit of salt, if not salty enough
  • cool vegetables (like carrots, if i have any, or cucumbers, which i know i don’t…) thinly thinly sliced
  • mix it all together, trying to preserve an optimal balance of {salty, savoury, sour, sweet, and nutty}, using the hot water to dilute if it gets too strong

–update:
Very tasty, but the flavor balance is not as easy to achieve as i remember. very thirsty-making.
for next time:

  • add a little chili oil or paste for some spiciness (just not this time, not for breakfast)
  • add a little thyme (and the nation of Taiwan collectively shudders at the suggestion…but i think it could be tasty…)

UK music

all this good stuff we never got in the states… went to a club in Sheffield two nights ago, where they were playing 80s music. half of the songs were “oh yeah, i remember this”, and the other half were… well, they were 80’s sounding, but it was like i’d had a lobotomy and they’d extracted half the tracks from my brain so i’d never heard them before. Like a portion of my childhood was missing–except it wasn’t.

my cousin’s boyfriend here in bath, now, is giving me a musical tour of everything i’ve missed.

  • Stone Roses (predecessors to nearly every band that i like…why oh why had i not heard of them ’til now)
  • LTJ Bukem (he’s amazing…NEED to get some of his stuff before i go back to the states)
  • The Kinks (ok, i’ve heard them, but nothing beyond their classic “you’ve got me going”)
  • Feeder (like Oasis only quite good)
  • The Darkness (these guys should have been from the 80s… “I believe in a thing called love” is hilariously, embarrassingly, like something we woulda listened to back in the day… a Queen for the new era)
  • from the stateside of things: Johnny Cash’s remix of DM’s Personal Jesus

Schedule

  • 1405 : arrive Manchester
  • picked up from airport by Sarah
  • Check out Sheffield, hang w/ Maria & Sarah’s family (and Dan too)

photo album for my trip

http://fairuz.isi.edu/gallery/2004Europe