Transport for London has made 3 months of data available from the cycle counters installed in February on routes CS3 Embankment and CS6 Blackfriars Road.
Here are some initial thoughts. Note that I’m writing this on 12 September 2018, so things may change if the TfL data boffins tinker with the files …
The data files are a mess
TfL’s data boffins have taken the files provided by the counters’ manufacturer and published them in the cycling open data folder. They’re a mess.
- There are three months of data for the CS6 Blackfriars Road counter.
- The July data for the CS3 Embankment counter is in the Blackfriars folder and vice-versa.
- The May and June data for the CS3 Embankment counter is a duplicate of the CS6 Blackfriars data – there is no data for CS3 Embankment for May and June.
- The files are comma-delimited CSV files. The file names have, incorrectly, an XLS file type. Open source office packages such as LibreOffice will open them. MS Excel will initially give you an error message.
Summary: there are four discrete months of data – three for CS6 Blackfriars and one for CS3 Embankment.
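For anyone who would rather skip the spreadsheet error messages, here is a minimal sketch of reading one of these files as plain comma-delimited text, whatever its extension says. The sample rows are made up, and I am assuming the first row is a header containing the field names.

```python
import csv
import io

def read_counter_csv(source):
    """Read a counter export as comma-delimited text.

    `source` is any open text file object; the file extension is irrelevant.
    Returns a list of dicts keyed by the header row.
    """
    return list(csv.DictReader(source))

# In-memory stand-in for a real file; these values are invented for illustration.
sample = io.StringIO(
    "SITE_ID,SERIAL_NUMBER,CLASS,SPEED\n"
    "1,1001,CYCLE,22\n"
    "1,1002,M/C,31\n"
)
records = read_counter_csv(sample)
cycles = [r for r in records if r["CLASS"] == "CYCLE"]
print(len(records), len(cycles))
```

With a real download you would pass something like `open("counter_file.xls", newline="")` instead of the in-memory sample. That file name is hypothetical, and I have not checked which text encoding the files use.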
There’s lots of data fields
I guess the counters’ technology is based on that used for heavier vehicles, and there’s lots of data fields. There’s one record for each vehicle detected as it passes the counter:
- SITE_NUMBER, SITE_ID: likely to be identifiers from the manufacturer for each counter;
- SERIAL_NUMBER: is an incremental number for each record. Let’s assume this is the counter’s count;
- DATE, TIME, TimeString: three fields which give the date and time to 10ths of a second. They’re not well formatted but enough to play with. Looking at the distribution across 24 hours, it is possible the counters are using GMT (lots of people are counted from 0600-0800 rather than 0700-0900).
- LANE, LANE_DETAILS, DIRECTION, DIRECTION_NUMBER: these try to identify whether someone is moving north/south on CS6 Blackfriars, or east/west on CS3 Embankment. As we’ll see below, they’re not very accurate.
- SPEED, SPEED_MPH : calculation of vehicle speed initially in, I assume, KMH, then converted to MPH and rounded to the nearest whole number;
- CLASS_INDEX, CLASS : from fields here onwards, the counter tech tries to identify the type of passing vehicle – CYCLE (bicycles), M/C (motorcycles), MOPED (obvs), and 2N (unidentified). 2N is a classification used by the Transport and Road Research Laboratory and refers to unidentified 2-axle vehicles. As we’ll see below, it’s a good estimate but unlikely to be accurate;
- LENGTH, AXLES, WHEELBASE: assessment of vehicle length and wheelbase in, presumably, cm, and the number of axles (invariably 2!);
- VALIDITY, STRADDLE, OVERLOADED: little useful here;
- GROSS: appears to be sum of AX_WT1 and AX_WT2 axle weight fields below;
- HEADWAY, GAP, TIME_GAP: no idea – given the position in the field list, possibly something about time gap between axles crossing the detection mechanism. Includes negatives.
- LEGAL_STATUS, CHASSIS_CODE, TEMP: little useful here;
- AX_WT1 … AX_WT25: weight on each axle – not clear what the units are (possibly lbs), and GROSS (above) is the sum. Values seem to range from 0 to 950+, so not sure how useful this is;
- AXLE_TYPE_1 ..25: nothing useful;
- AX_SP1 … AX_SP24: appears to be spacing between axles, presumably in cm;
- FRONT_MIN_CHASSIS … REAR_MAX_CHASSIS: nothing useful.
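Two of the guesses in the field list are easy to test against a record: that SPEED_MPH is SPEED converted from KMH and rounded, and that GROSS is AX_WT1 plus AX_WT2. A rough sketch, using the field names above and a made-up record. Both relationships are my assumptions rather than anything documented, and Python's `round` may not match the manufacturer's rounding at exact halves.

```python
KMH_TO_MPH = 0.621371

def check_record(rec):
    """Test two assumed relationships between fields in a single record."""
    # Assumption: SPEED is in km/h and SPEED_MPH is the rounded conversion.
    speed_ok = round(float(rec["SPEED"]) * KMH_TO_MPH) == int(rec["SPEED_MPH"])
    # Assumption: GROSS is the sum of the two axle weights (small tolerance).
    axle_sum = float(rec["AX_WT1"]) + float(rec["AX_WT2"])
    gross_ok = abs(axle_sum - float(rec["GROSS"])) < 0.5
    return {"speed_mph_matches": speed_ok, "gross_matches": gross_ok}

# Invented record for illustration only.
example = {"SPEED": "24", "SPEED_MPH": "15",
           "AX_WT1": "110", "AX_WT2": "130", "GROSS": "240"}
print(check_record(example))
```

Running this over a whole file and counting the failures would show quickly whether either guess holds.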
“Nobody uses them, those cycle lanes are empty” – er, not true
It is no surprise that much of the cycle traffic along the CS3 and CS6 cycle routes is commuter-driven. Both counters are placed on major commuting routes to and from the City and Westminster. The counters are unlikely to track people cycling to commercial or entertainment centres such as Oxford Street or the West End. The distribution of use across 24/7 reflects this.
But both counters detect people cycling past in all 24 hours in all 7 days each week. And the majority of people cycling on both CS3 Embankment and CS6 Blackfriars Road do so outside the Monday-Friday 7am-9am and 5pm-7pm commuting hours.
Like a lot of London’s transport, there’s a slight POETS effect on Friday, and the sum of weekend usage is similar to that of a weekday, albeit with users liking a more relaxed start.
Graph and figures below based on records identifying passing vehicle as ‘Cycle’.
|Monday – Friday 0600-0800 (GMT assumed)||155351|
|Monday – Friday 1600-1800 (GMT assumed)||137606|
|Total peak hours||292957|
|Total cycle records||623807||47%|
Looking at the distribution, I assume the detectors’ clocks work on GMT (particularly, the Embankment counter). The displays’ clocks, however, appear to reset at BST midnight when ‘cyclists today’ count resets to zero.
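The peak-hour share is simple arithmetic on the figures above, and the peak windows are easy to express as a test on each record's timestamp. A small sketch, assuming the timestamps are GMT as discussed; `is_peak_gmt` is just a name I have made up for illustration.

```python
from datetime import datetime

AM_PEAK = 155351        # Monday-Friday 0600-0800 (GMT assumed)
PM_PEAK = 137606        # Monday-Friday 1600-1800 (GMT assumed)
TOTAL_CYCLES = 623807   # all records classed as cycles

peak_total = AM_PEAK + PM_PEAK
peak_share = peak_total / TOTAL_CYCLES
print(peak_total, f"{peak_share:.0%}")   # 292957 47%

def is_peak_gmt(ts):
    """True if a GMT timestamp falls inside the weekday peak windows above."""
    is_weekday = ts.weekday() < 5        # Monday is 0, Friday is 4
    in_window = 6 <= ts.hour < 8 or 16 <= ts.hour < 18
    return is_weekday and in_window

print(is_peak_gmt(datetime(2018, 7, 23, 7, 30)))   # a Monday morning
```

So 47% of cycle records fall in the peak windows and the remaining 53% fall outside them.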
What the detectors detect and what the displays display are different!
If we assume that SERIAL_NUMBER is the count, then the data records and the numbers we’ve seen on the displays are different. And behave differently for each counter!
Since February, and in the absence of real-time data from TfL, Londoners have been crowd-sourcing the counters’ numbers on Twitter using the #CS3Count and #CS6Count hashtags, with the accounts @CS3Count and @CS6Count retweeting and providing summaries.
Let’s compare the numbers in the data files with verified reports of the numbers on the displays. The examples below are based on the start-of-day (midnight) position, using the data files’ SERIAL_NUMBER minimums and the #CSxCount YTD-Day reports:
The CS6 Blackfriars detector leads its display by 22,000+ with the lead increasing.
The CS3 Embankment detector lags its display by 43,000+, although the gap is shrinking. (Presumably, the detector’s counter was reset earlier in the spring for some reason.) The Embankment counter’s display clicked over 1,000,000 on the afternoon of Monday 23 July – the data file’s serial number doesn’t do so until Friday 27 July.
The counter numbers above are based on all records – no filtering for vehicle type, although the numbers of non-bicycles are relatively small and do not explain the variances. It’s more likely that the counters’ detectors and displays frequently fall out of sync …
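For anyone wanting to repeat the comparison, this is roughly how the lead or lag can be worked out: take the lowest SERIAL_NUMBER for each date and subtract the display's start-of-day reading. It is only a sketch. The numbers are invented, and the date format is a placeholder because the real DATE field is formatted differently.

```python
def start_of_day_serials(records):
    """Map each date to the lowest SERIAL_NUMBER recorded on that date."""
    first = {}
    for rec in records:
        day, serial = rec["DATE"], int(rec["SERIAL_NUMBER"])
        if day not in first or serial < first[day]:
            first[day] = serial
    return first

def detector_lead(records, display_readings):
    """Detector minus display at start of day; positive means the detector leads."""
    serials = start_of_day_serials(records)
    return {day: serials[day] - shown
            for day, shown in display_readings.items() if day in serials}

# Made-up numbers, not taken from the TfL files or the Twitter reports.
records = [
    {"DATE": "2018-06-01", "SERIAL_NUMBER": "500120"},
    {"DATE": "2018-06-01", "SERIAL_NUMBER": "500100"},
    {"DATE": "2018-06-02", "SERIAL_NUMBER": "505700"},
]
display = {"2018-06-01": 478000, "2018-06-02": 483400}
print(detector_lead(records, display))
```

A gap that grows from one day to the next, as in this invented example, is what you would expect if the display sometimes misses increments that the detector records.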
Are the detectors and the displays separate mechanisms?
The differences above suggest that the counters’ detecting and display mechanisms are separate, with the detection mechanism sending the display a trigger to increment.
On 16 May, the CS6 counter’s display was not incrementing during the evening rush hour (tweet with video). However, the data file does have a continuous stream of records throughout the afternoon, with a total of 5,594 records for the day – a typical CS6 weekday total. This would help explain, in the table above, the increasing difference between the detected number stored in the data file versus the displayed number.
If the detectors and displays are separate mechanisms, then there’s hope for the currently out-of-order CS6 Blackfriars display, which has been stuck since 4 August. Could the detection mechanism still be counting people cycling by? We’ll have to wait for a future data drop to find out.
North, south, east and west
The counters are not very good at assessing the direction of travel. I wouldn’t expect a 50:50 split, but each counter appears to have a bias – especially, CS3 Embankment.
|CS6 Blackfriars (3 months)||364,425||157,762||206,663|
|CS3 Embankment (1 month)||271,916||264,004||7,912|
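To put the bias into percentages, here is a quick calculation on the figures in the table: the total for each counter, followed by the two directional counts in the order the table gives them. The check that the two directions sum to the total is just a sanity test on my reading of the columns.

```python
counts = {
    "CS6 Blackfriars (3 months)": (364425, 157762, 206663),
    "CS3 Embankment (1 month)": (271916, 264004, 7912),
}

def direction_shares(total, first, second):
    """Return each direction's share of the total as a percentage to 1 d.p."""
    if first + second != total:
        raise ValueError("directional counts should sum to the total")
    return round(100 * first / total, 1), round(100 * second / total, 1)

for name, row in counts.items():
    print(name, direction_shares(*row))
# CS6 Blackfriars (3 months) (43.3, 56.7)
# CS3 Embankment (1 month) (97.1, 2.9)
```

A 43:57 split on Blackfriars is plausible. A 97:3 split on Embankment looks much more like a detection problem than real behaviour.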
Vehicle types: ‘cycles’, ‘motorcycles’, ‘mopeds’ and ‘2N’
The counters attempt to identify the type of each passing vehicle, presumably from a combination of the speed, wheelbase and axle-weight data fields. I guess the counter was originally developed to count a stream of 4-wheeled vehicles travelling in single file along a carriageway.
However, people on bicycles don’t cycle past the counters in single file. From the #CS6count and #CS3count reports, we know the counters struggle to detect volumes of overlapping cyclists at peak times and under-count.
That said, while it’s not perfect, the counter makes a reasonably good fist of differentiating bicycles.
Unpacking the cycle versus moped figures, there may be some correlation between wheelbase length, speed and (not shown below) gross axle weight.
If you’re a fit Clydesdale riding a large bicycle, then you’re probably being detected as a moped.
Gross Axle Weights – are they reported in lbs?
If we make a broad assumption that most people using the cycle lanes weigh 60-100kg and ride 10-20kg bicycles (yes, I know Santander hire bikes are 23kg), then we’d expect the gross weight to be in the range of 70-120kg.
I’m not sure what units the counters are using to assess axle weights. Looking at Embankment for ‘cycle’ vehicles only, over 7000 have no weight at all, and the majority are clustered in the range of 160-360 units.
Good tweeps such as @commuter76 have suggested the axle weights might be expressed in lbs. That would be the best answer. If the detectors’ mechanisms are based on those for heavier motorised vehicles that might explain why relatively light cyclists are hard to detect and weigh accurately.
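A quick way to sanity-check the lbs idea is to convert both ranges and see how they line up. This is only arithmetic on the figures above, using the standard factor of roughly 2.20462 lb per kg. It does not prove what units the counters use.

```python
LB_PER_KG = 2.20462

def kg_to_lb(kg):
    return kg * LB_PER_KG

def lb_to_kg(lb):
    return lb / LB_PER_KG

# Expected rider-plus-bike range of 70-120kg, expressed in lbs.
expected_lb = (round(kg_to_lb(70)), round(kg_to_lb(120)))
# Observed cluster of 160-360 units, if those units are lbs, expressed in kg.
cluster_kg = (round(lb_to_kg(160)), round(lb_to_kg(360)))
print(expected_lb, cluster_kg)   # (154, 265) (73, 163)
```

So if the units are lbs, the bottom of the observed cluster sits about where we would expect, but the top end (around 163kg) is heavier than the broad assumption allows for.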
“They’re all doing 30mph” – er, no, they’re not
If you’re still reading, you’ve sussed that the counters are directionally correct but, understandably, not accurate. Speed figures are estimates based on a set of data points captured fleetingly. The upper and lower ends are probably noise, but the stuff in the middle is useful. Speed distributions between the two counters are similar, with CS6 Blackfriars being slightly slower.
|“Cycles” only||CS3 Embankment||CS6 Blackfriars|
|Peak speed kmh/ mph||22 / 13.6||22/ 13.6|
|Majority % kmh/ mph||52% inside 24 / 15||56% inside 23 / 14.3|
|32kmh/ 20mph||93% inside 32 / 20||96% inside 32 / 20|
|98% kmh / mph||98% inside 35 / 21.7||98% inside 33 / 20.5|
BTW, both counters have an issue with 35kmh – each happily detects dozens to hundreds at speeds either side, but only a handful at exactly 35kmh.
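The “% inside” figures in the table are cumulative shares: the proportion of cycle records at or below a given speed. A minimal sketch of that calculation, assuming “inside” means at or below the limit. The sample readings are invented.

```python
KMH_TO_MPH = 0.621371

def share_within(speeds_kmh, limit_kmh):
    """Fraction of readings at or below limit_kmh (0.0 for an empty list)."""
    if not speeds_kmh:
        return 0.0
    return sum(1 for s in speeds_kmh if s <= limit_kmh) / len(speeds_kmh)

# Made-up readings in km/h, for illustration only.
sample = [12, 18, 20, 22, 22, 23, 24, 26, 29, 34]
print(share_within(sample, 24))        # 7 of the 10 readings are at or below 24
print(round(32 * KMH_TO_MPH, 1))       # 32 km/h expressed in mph
```

Feeding in the real SPEED column for records classed as cycles, one counter at a time, should produce shares comparable to those in the table.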
Update 16 Sept: there is an interesting variance between the average speeds past the two counters in the morning. Looks like people cycling past the Blackfriars counter have a more leisurely approach than those on Embankment in the mornings, but they’re slightly quicker in the afternoon. Note, all the hourly averages are below 16.5mph/ 26.6kmh.
TfL has obtained 4 months’ worth of data from the counters’ manufacturer and given us something to play with. The notes on TfL’s tech forum say that live data feeds are not possible, so (ir)regular drops of daily files are the best we can expect.
Some questions answered, and lots more raised!