
Single-pass standard deviation in PHP

I recently needed a standard deviation function for a project. It wasn’t until after I’d already implemented it that I discovered the existence of a virtually undocumented stats_standard_deviation() in PHP’s Statistics package.

Anyway, I did some research and found the on-line, single-pass standard deviation algorithm that Don Knuth presents (he credits it to B. P. Welford) and implemented it in PHP.

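<?php
function stddev($array){
  //Don Knuth is the $deity of algorithms
  //http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#III._On-line_algorithm
  $n = 0;     //number of values seen so far
  $mean = 0;  //running mean
  $M2 = 0;    //running sum of squared differences from the mean
  foreach($array as $x){
      $n++;
      $delta = $x - $mean;
      $mean = $mean + $delta/$n;
      $M2 = $M2 + $delta*($x - $mean);
  }
  //Sample variance is undefined for fewer than two values; avoid dividing by zero
  if($n < 2){
      return NAN;
  }
  $variance = $M2/($n - 1);
  return sqrt($variance);
}
?>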

Please note that there are more efficient ways of doing this if you have already written a mean() or variance() function, as you should. There’s also a stats_variance() function in PHP.
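As a rough illustration of building on mean() and variance() helpers, here is a minimal two-pass sketch, assuming PHP 7 or later. The names mean(), variance(), and stddev_from_variance() are hypothetical helpers written for this example; they are not part of PHP or of the stats extension, and nothing here has been benchmarked.

```php
<?php
// Hypothetical helpers for illustration only; not part of PHP or PECL stats.

// Arithmetic mean. Assumes a non-empty array of numbers.
function mean(array $values): float {
    return array_sum($values) / count($values);
}

// Sample variance (divides by n - 1), computed in two passes over the data.
function variance(array $values): float {
    $n = count($values);
    if ($n < 2) {
        return NAN; // undefined for fewer than two values
    }
    $mean = mean($values);
    $sumSquares = 0.0;
    foreach ($values as $x) {
        $sumSquares += ($x - $mean) ** 2;
    }
    return $sumSquares / ($n - 1);
}

// Standard deviation is the square root of the variance.
function stddev_from_variance(array $values): float {
    return sqrt(variance($values));
}
```

Unlike the single-pass version, this walks the array twice (once for the mean, once for the squared differences), so it only suits data that is already held in memory.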

I think I might do some serious benchmarking of PHP’s implementation of stddev against mine to see which is faster. I’ll bet that the PHP one is faster because it’s written in C and mine’s being interpreted at runtime.

Update 19:40 stats is undocumented because it’s a PECL extension which isn’t included by default, and it’s not terribly clear how to install it. Install php-dev, then use pecl install stats to get it. Don’t forget to sudo if you need to.

Update 2009/11/02 14:52 I had a missing -1 on the variance calculation for nearly a year and finally someone caught it. Thanks, Shane.

Common functions of social news sites

I did a brief survey just now of a few of the more well-known social bookmarking (social news, really) sites. My intent was to find commonalities in features and functions in order to assess what’s been done already.

I inspected Digg, Reddit, Del.icio.us, and Newsvine. I did not consider Bloglines, StumbleUpon, or Technorati because none of those three show links the same way that the others do; they require account creation or other steps before you get to the links.

Obviously, the most common feature is voting. A registered user can mark his or her approval (or disapproval!) of a bookmark. If enough people like the bookmark within a certain amount of time, it gets “promoted” to the front page, where it sits until the number of users voting for it slows to a rate at which other bookmarks with a higher rate of votes overtake it.
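To make the promotion mechanic concrete, here is a minimal sketch that ranks bookmarks by how many votes each received inside a recent time window. It assumes PHP 7 or later, and everything in it is hypothetical: the function name rank_by_recent_votes(), the vote array layout, and the window size are invented for illustration and do not reflect how any of these sites actually rank stories.

```php
<?php
// Hypothetical sketch; not any real site's ranking algorithm.
// Each vote is assumed to look like ['bookmark' => id, 'time' => Unix timestamp].
function rank_by_recent_votes(array $votes, int $now, int $windowSeconds): array {
    $counts = [];
    foreach ($votes as $vote) {
        $age = $now - $vote['time'];
        // Only count votes cast within the window (and not in the future).
        if ($age >= 0 && $age <= $windowSeconds) {
            $id = $vote['bookmark'];
            $counts[$id] = ($counts[$id] ?? 0) + 1;
        }
    }
    arsort($counts); // highest recent vote count first, bookmark ids kept as keys
    return $counts;
}
```

A bookmark whose votes all fall outside the window drops out of the result entirely, which mirrors the way a story slides off the front page once voting slows down.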

Each has user registration as a requirement for voting. Non-registered users are permitted to peruse the bookmarks, but they cannot submit or vote on bookmarks. Registered users generally have control of a limited profile, which optionally lists at least the user’s nickname and web site URL.

On Digg, Reddit, and Newsvine, users can comment on bookmarks. Reddit’s and Newsvine’s comment systems are fully threaded: a user can reply to a comment, another user can reply to that reply, and so on, with each level visually separated. Digg’s commenting system is limited to two levels (a new comment, or a reply to a top-level comment only).
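The difference between fully threaded and two-level comments can be sketched as a nested array rendered with an optional depth limit. This is only an illustration, assuming PHP 7.1 or later; the function name render_comments() and the ['text', 'replies'] layout are hypothetical and are not taken from any of these sites.

```php
<?php
// Hypothetical structure: each comment is ['text' => string, 'replies' => array of comments].
// $maxDepth = null means unlimited nesting (fully threaded);
// $maxDepth = 2 allows top-level comments plus one level of replies.
function render_comments(array $comments, ?int $maxDepth = null, int $depth = 1): string {
    $out = '';
    foreach ($comments as $comment) {
        // Indent two spaces per nesting level to separate the levels visually.
        $out .= str_repeat('  ', $depth - 1) . $comment['text'] . "\n";
        $replies = $comment['replies'] ?? [];
        // Descend only while the depth limit (if any) has not been reached.
        if ($maxDepth === null || $depth < $maxDepth) {
            $out .= render_comments($replies, $maxDepth, $depth + 1);
        }
    }
    return $out;
}
```

In this sketch, replies nested deeper than the limit are simply not rendered; a real two-level system would more likely refuse to accept them in the first place.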

Another product, Pligg, an open source CMS that operates much like Digg and Newsvine combined, has these features as well.

What unique features does each have? Digg has a lot of tools (DiggSpy, SWARM, STACK, etc.) to visualize incoming bookmarks and stories. Newsvine pays its members for generating content (submitting stories/bookmarks, writing columns), thus creating more page views and more advertising revenue. Both Del.icio.us and Newsvine use tags to help users search for stories/bookmarks based on keywords. Digg and Newsvine both have friend systems, where you can add other users to a list of friends and be able to quickly view your friends’ submissions and links on which your friends have voted positively.

Reddit and Del.icio.us are far more lightweight, both visually and in page load size, than the other two. However, DiggRiver is even more lightweight (and meant for mobile phone browsers, actually).

What other features, both common and unique, am I missing? I care little about the size of each community, nor do I care about its stereotypical users’ behavior. I’m focusing on features.

Social bookmarking links gone wild

[Picture: Social bookmarking links gone wild]

I noticed on dailydomainer.com today the site’s veritable cornucopia of social bookmarking links at the bottom of single-post pages. The sheer number of them, all in one row, baffled me.

Are there really that many social bookmarking sites?

Do they really do well enough to merit keeping them open?

I realize the explosive success of Digg and Newsvine and Del.icio.us and Reddit (of which I only use the first two). I guess social bookmarking is yet another Internet phenomenon that everyone is trying to wrangle and profit from.

It would be cool to write a social bookmarking web application. PHP seems to be used for almost all of them, with Ruby for some and maybe something less popular for a few others. I wonder if any of them are done in Scheme or Smalltalk? Hmm.

In other news, classes start tomorrow and I don’t have any books yet. It’s my last semester of undergrad, and I’ve spent more on books this semester than I have for the past five. That tells you something about history and English classes, eh?

Things to do this year, 2007

JD tagged me. Ready? Go.

  • Graduate.
    • Key to success, here. The only people who could keep me from graduating are Dr. Shaffer and Dr. Hickman, who are teaching Capstone and Calculus, respectively. Passing capstone is a matter of simply doing my project, and passing calculus is a matter of getting enough people to help me even though I’m mildly terrified of it.
  • Go to grad school.
    • I want to get my M.S. in Business Education ASAP. Robert Morris, Bloomsburg, Gwynedd-Mercy, and Mercyhurst (just a certification there, though) are my options for that. Otherwise, Pitt has an M.Ed. in English/Communications to which I’m also applying, figuring that I can get a job as an English or journalism teacher and earn my business, computer, and information technology teaching certification for my required continuing education hours.
  • Keep writing
    • Evann Garrison, the woman behind my writing minor (had her for four of the six required classes!), once said, “Keep writing.” I intend to do just that. My new blog will help with that, as I’m going to keep most of my punditry there and my cutesy stuff on my Livejournal.
    • I want to get into writing about politics and technology a little more. Net Neutrality is a perfect example of the topics about which I want to write. I will continue to avoid religion, thank you.
    • I’m enjoying doing reviews for ThinkComputers, and may soon start doing them for ThinkGaming, too.
  • Keep coding
    • I finished a site in about 24 hours this week (thank you, Smarty). I need to make manifest more of my site ideas so that I can perhaps start pulling in a little bit of money from them.
    • Also, I want to get my idea for a social bookmarking site off the ground. I doubt that it’ll start prior to summer, perhaps launching this time next year.
    • I want to improve my Python and Smalltalk knowledge.
  • Learn more Esperanto and German
    • My Esperanto is slowly getting better. It’s difficult to really work on it because there’s no one else around here who speaks it. I’ll continue to use Kurso de Esperanto and lernu.net to expand my vocabulary.
    • I kinda wanted to take German in college, but I didn’t have enough time during the few semesters it was offered. My mother bought me a “learn German” program a few Christmases ago, but I have yet to even open it. Perhaps I’ll open it this summer.

I tag rosejoliefemme, morrigan32, jessbevan, billmeir, and dauphin1.

The dangers of the REPL

Auto response from Rhettigan: Why does “I’m caught in a REPL!” sound like a perfect excuse to miss something?
Jeremy: ctrl+d doesn’t work!!!

In other news, tomorrow’s Christmas and I’m not done shopping.

En aliaj novaĵoj, morgaŭ estas Kristnasko kaj mi ne finiĝas butikumanto.