As previously discussed, it's important that we monitor whether our work here is having a positive impact, a negative impact, or no impact at all.
There are two ways we will do this:
1 - site metrics. We'll grab a snapshot of certain site metrics at regular intervals (quarterly) and compare them over time.
2 - survey. We'll run a regular survey to capture attitudes toward contribution and see whether they change over time.
Survey:
I've created a survey here.
Can you please take the survey and give me any feedback on the questions/answers before we distribute it more widely?
Site Metrics
Here are some metrics that would be great to capture for the six months ending 31 March 2011 (in anticipation of quarterly snapshots).
Participation
Total no. of Drupal.org members who have logged in at least once (this is the count the Drupal.org front page uses):
Today (May 23, 2011): 592521
SELECT COUNT(uid) FROM users WHERE status = 1 AND login > 0;
March 31, 2011: 564507
SELECT COUNT(uid) FROM users WHERE status = 1 AND login > 0 AND created < UNIX_TIMESTAMP('2011-03-31 00:00:00');
(28,014 new members since March 31.)
Total no. of Drupal.org members who have been active in the last 6 months (active = posted to issue queue, discussion forum, updated documentation, posted to GDO, committed code, etc.):
This one's a bit trickier, because that information is kinda scattered all over the place.
This is intended to count the users who authored either a comment or a node, which would catch discussion forum and issue queue activity. It would not catch documentation revisions, GDO, or code commits.
Total: 16,332
SELECT COUNT(DISTINCT u.uid)
FROM users u
LEFT JOIN node n ON u.uid = n.uid
LEFT JOIN comments c ON c.uid = u.uid
WHERE u.login > 0
AND (n.uid = u.uid OR c.uid = u.uid)
AND u.status = 1
AND (n.status = 1 OR c.status = 0)
AND (n.created BETWEEN UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 6 MONTH)) AND UNIX_TIMESTAMP(NOW()))
AND (c.timestamp BETWEEN UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 6 MONTH)) AND UNIX_TIMESTAMP(NOW()));
(This query also takes like 2 minutes to complete, and I welcome a more performant/accurate one. :P)
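One possible alternative, offered as a sketch rather than a tested replacement: as far as I can tell, the query above joins every node row to every comment row for each user and then ANDs the two date ranges, so it only matches users who have both a recent node and a recent comment. A UNION of the two author lists avoids that cross join and counts users with either kind of activity. The table and column names are the ones used in the query above; the `recent` alias is just illustrative, and the resulting total will likely differ from the 16,332 reported.

```sql
-- Sketch: users who authored a published node OR a published comment
-- in the last six months. Status values mirror the original query
-- (node.status = 1, comments.status = 0).
SELECT COUNT(*) AS active_users
FROM users u
INNER JOIN (
  -- UNION (not UNION ALL) removes duplicate uids across both lists.
  SELECT n.uid
  FROM node n
  WHERE n.status = 1
    AND n.created >= UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 6 MONTH))
  UNION
  SELECT c.uid
  FROM comments c
  WHERE c.status = 0
    AND c.timestamp >= UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 6 MONTH))
) recent ON recent.uid = u.uid
WHERE u.status = 1
  AND u.login > 0;
```

Like the original, this still misses documentation revisions, GDO posts, and code commits.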
Collaboration
Of issues posted in the past six months:
Total: 61019
SELECT COUNT(n.nid) FROM node n WHERE n.type = 'project_issue' AND n.created > UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 6 MONTH));
- Total open issues
Total: 18558
SELECT COUNT(n.nid)
FROM node n
INNER JOIN project_issues pi ON pi.nid = n.nid
WHERE n.type = 'project_issue'
AND n.created > UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 6 MONTH))
/* 1 = active, 4 = postponed, 8 = needs review, 13 = needs work, 14 = rtbc, 15 = needs porting */
AND pi.sid IN (1, 4, 8, 13, 14, 15);
- Total issues resolved in past six months
- Average no. people contributing per issue
- Average comments per issue
This (thanks, mikey_p!) will get you a big "spreadsheet" of issue numbers + number of comments. It still needs to be boiled down into averages and filtered by date:
SELECT pi.nid as nid, count(c.cid) as count FROM project_issues pi LEFT JOIN comments c ON c.nid = pi.nid GROUP BY pi.nid ORDER BY count DESC;
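A rough sketch of how the per-issue list could be collapsed into averages and limited to issues created in the last six months. It wraps the same join in a subquery and averages over it. The `per_issue` alias and the output column names are illustrative. "People contributing" is approximated here as distinct comment authors, so an original poster who never comments is not counted; that is an assumption about the definition, not something settled above.

```sql
-- Sketch: average comments and average distinct commenters per issue,
-- for issues created in the last six months.
SELECT AVG(per_issue.comment_count)   AS avg_comments_per_issue,
       AVG(per_issue.commenter_count) AS avg_commenters_per_issue
FROM (
  SELECT pi.nid,
         COUNT(c.cid)          AS comment_count,    -- 0 for issues with no comments
         COUNT(DISTINCT c.uid) AS commenter_count
  FROM project_issues pi
  INNER JOIN node n ON n.nid = pi.nid
  LEFT JOIN comments c ON c.nid = pi.nid
  WHERE n.type = 'project_issue'
    AND n.created > UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 6 MONTH))
  GROUP BY pi.nid
) per_issue;
```

The LEFT JOIN keeps issues with zero comments in the average, which pulls the figure down compared with averaging only over issues that received replies.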
- Average duration (time from date opened to date closed)
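For the resolved count and the average duration, here is a tentative sketch restricted, like the counts above, to issues posted in the past six months. Two things in it are assumptions and should be checked before trusting the numbers: the status IDs 2 (fixed) and 7 (closed) are not listed anywhere above, and the closing time is approximated by the timestamp of the last comment on the issue, since none of the queries above show a column that records when an issue was closed.

```sql
-- Sketch: resolved issues and a rough average time-to-close, for issues
-- created in the last six months.
-- ASSUMPTION: sid 2 = fixed and sid 7 = closed; verify against the site's
-- issue status configuration.
-- ASSUMPTION: the last comment's timestamp is used as a proxy for the
-- closing time.
SELECT COUNT(*) AS resolved_issues,
       AVG(closed.last_comment - closed.created) / 86400 AS avg_days_open
FROM (
  SELECT n.nid,
         n.created,
         MAX(c.timestamp) AS last_comment  -- NULL if the issue has no comments
  FROM node n
  INNER JOIN project_issues pi ON pi.nid = n.nid
  LEFT JOIN comments c ON c.nid = n.nid
  WHERE n.type = 'project_issue'
    AND n.created > UNIX_TIMESTAMP(DATE_SUB(NOW(), INTERVAL 6 MONTH))
    AND pi.sid IN (2, 7)
  GROUP BY n.nid, n.created
) closed;
```

AVG skips issues whose `last_comment` is NULL, so resolved issues without any comments count toward `resolved_issues` but not toward `avg_days_open`.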