This is a summary of some of the discussions around automated performance testing of Drupal core at DrupalCon Chicago. It's very incomplete and a bit rushed, but this is a wiki, so please help fill in the gaps.
Test runs
Rather than trying to integrate this with PIFR, we want to run tests against git branches, triggered by git pushes. Initially this would allow post-commit/push testing of HEAD, which would only catch regressions after the fact, but from there we can extend it to forks of HEAD in sandboxes, per-issue branches, etc.
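As a rough illustration, a post-receive hook on the repository could queue a test run for each pushed branch. This is only a sketch: the queue location and the queue_test_run() helper are made up, and the real runner's API is still to be decided.

```php
#!/usr/bin/env php
<?php

/**
 * Hypothetical post-receive hook: queue a performance test run for each
 * pushed branch. queue_test_run() and the flat-file queue are placeholders
 * for whatever the real test runner ends up providing.
 */
function queue_test_run($branch, $commit) {
  $line = time() . ' ' . $branch . ' ' . $commit . "\n";
  file_put_contents('/var/lib/perftest/queue', $line, FILE_APPEND | LOCK_EX);
}

// git feeds post-receive one "<old-sha> <new-sha> <refname>" line per pushed ref.
while ($line = fgets(STDIN)) {
  list($old, $new, $ref) = explode(' ', trim($line));
  if (strpos($ref, 'refs/heads/') === 0) {
    queue_test_run(substr($ref, strlen('refs/heads/')), $new);
  }
}
```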
Measurement
To avoid issues with hardware inconsistencies, Damien had the idea of using control groups (http://www.kernel.org/doc/Documentation/cgroups/). These let you assign resources to groups of processes, audit their usage, and report on it, so we could measure CPU, memory, disk and potentially network traffic for PHP and the database separately, and do so reliably across different hardware (at least much more reliably than ab, JMeter or any equivalent that measures wall time).
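As a sketch of how the accounting could work, assuming 'php' and 'mysql' cgroups have already been created and the relevant daemons assigned to them (and assuming the cgroup v1 filesystem is mounted at /sys/fs/cgroup), the test runner could snapshot the counters before and after a run:

```php
<?php

/**
 * Sketch: read cgroup accounting counters before and after a test run.
 * The group names and the /sys/fs/cgroup mount point are assumptions;
 * creating the groups and assigning processes to them happens elsewhere.
 */
function cgroup_snapshot($group) {
  return array(
    // Cumulative CPU time consumed by the group, in nanoseconds.
    'cpu_ns' => (int) file_get_contents("/sys/fs/cgroup/cpuacct/$group/cpuacct.usage"),
    // High-water mark of memory usage, in bytes.
    'mem_peak' => (int) file_get_contents("/sys/fs/cgroup/memory/$group/memory.max_usage_in_bytes"),
  );
}

function cgroup_diff(array $before, array $after) {
  return array(
    'cpu_ms' => ($after['cpu_ns'] - $before['cpu_ns']) / 1e6,
    'mem_peak' => $after['mem_peak'],
  );
}

$before = array('php' => cgroup_snapshot('php'), 'mysql' => cgroup_snapshot('mysql'));
// ... drive the test plan here ...
$after = array('php' => cgroup_snapshot('php'), 'mysql' => cgroup_snapshot('mysql'));

foreach (array('php', 'mysql') as $group) {
  print_r(cgroup_diff($before[$group], $after[$group]));
}
```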
We could also profile the test runs with xhprof, then aggregate and diff the results across and between test runs; Narayan found a project that wraps some of this. xhprof measures CPU time as well as wall time, so again this should be mostly reliable across different environments (at least a lot more reliable than xdebug or microtime()).
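A minimal sketch of the profiling side, assuming the xhprof extension is loaded; run storage here is just serialized arrays rather than whatever the wrapping project uses:

```php
<?php

/**
 * Sketch: profile a chunk of work with xhprof and diff per-call CPU time
 * against a previous run.
 */
function profile($callback) {
  xhprof_enable(XHPROF_FLAGS_CPU | XHPROF_FLAGS_MEMORY);
  $callback();
  // Returns an array keyed by "parent==>child" with ct, wt, cpu, mu, pmu.
  return xhprof_disable();
}

function diff_cpu(array $old, array $new) {
  $diff = array();
  foreach ($new as $call => $metrics) {
    $previous = isset($old[$call]) ? $old[$call]['cpu'] : 0;
    $diff[$call] = $metrics['cpu'] - $previous;
  }
  // Biggest CPU regressions first.
  arsort($diff);
  return $diff;
}

// Example: profile a full bootstrap (run from the Drupal root) and compare
// against a run saved from a previous commit (path is a placeholder).
$new_run = profile(function () {
  define('DRUPAL_ROOT', getcwd());
  require_once DRUPAL_ROOT . '/includes/bootstrap.inc';
  drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);
});
$old_run = unserialize(file_get_contents('/var/lib/perftest/last-bootstrap.run'));
print_r(array_slice(diff_cpu($old_run, $new_run), 0, 20, TRUE));
```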
Test plans
Since we are trying to measure application performance rather than load test hardware, and we want test plans that can be contributed to and maintained by as many people as possible, we'd like to use one of the more recent browser automation APIs (Selenium, Watir, or one of the JavaScript ones like zombie.js) to have clients that can browse pages, submit forms, etc.
Doing it this way also gives us a start towards using the same overall framework for automated front-end testing (HTTP traffic, JavaScript execution, etc.) via something like browsermob.
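To make the test plan idea concrete, here's a minimal sketch of a login step using Selenium via the php-webdriver bindings (any of the clients mentioned above would do); the Selenium server address, site URL, and credentials are placeholders:

```php
<?php

use Facebook\WebDriver\Remote\RemoteWebDriver;
use Facebook\WebDriver\Remote\DesiredCapabilities;
use Facebook\WebDriver\WebDriverBy;

require 'vendor/autoload.php';

$driver = RemoteWebDriver::create('http://localhost:4444/wd/hub', DesiredCapabilities::firefox());

// Log in through the user-facing form rather than hitting APIs directly,
// so the measured work is the same as what a real browser triggers.
$driver->get('http://drupal.test/user/login');
$driver->findElement(WebDriverBy::id('edit-name'))->sendKeys('admin');
$driver->findElement(WebDriverBy::id('edit-pass'))->sendKeys('admin');
$driver->findElement(WebDriverBy::id('edit-submit'))->click();

// Then browse some representative pages as part of the test plan.
$driver->get('http://drupal.test/node/1');
$driver->quit();
```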
As well as full page requests, we'd also start a contrib project with some microbenchmark PHP scripts (e.g. different steps of the bootstrap, rendering a node to JSON, running check_plain() on some different strings), and test plans could request these to generate reporting for that kind of thing as well.
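A sketch of what one of those scripts could look like, assuming it's run from the Drupal root; the sample strings and iteration count are arbitrary, and microtime() is used only as a placeholder since the cgroup/xhprof measurement above would do the real accounting:

```php
<?php

/**
 * Sketch of one microbenchmark script: time check_plain() over a few
 * representative strings after a full bootstrap.
 */
define('DRUPAL_ROOT', getcwd());
require_once DRUPAL_ROOT . '/includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

$samples = array(
  'ascii' => str_repeat('The quick brown fox jumps over the lazy dog. ', 20),
  'markup' => str_repeat('<script>alert("xss")</script> & friends ', 20),
  'multibyte' => str_repeat('Iñtërnâtiônàlizætiøn ', 20),
);
$iterations = 10000;

foreach ($samples as $name => $string) {
  $start = microtime(TRUE);
  for ($i = 0; $i < $iterations; $i++) {
    check_plain($string);
  }
  printf("check_plain() %-10s %8.2f ms / %d calls\n", $name, (microtime(TRUE) - $start) * 1000, $iterations);
}
```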
Timeline
We'd like to get something up and running as soon as possible, since anything is better than nothing and the more that is in place as Drupal 8 picks up speed, the better; we can then improve the integration, reporting and test cases over time.