Cube Analyst Reference Guide

Cube Analyst

CUBE ANALYST
VERSION 6.1.0

Copyright 2007-2013 Citilabs, Inc. All rights reserved.


Citilabs is a registered trademark of Citilabs, Inc. All other brand names and product names used in this book are
trademarks, registered trademarks, or trade names of their respective holders.
The information contained in this document is the exclusive property of Citilabs. This work is protected under United
States copyright law and the copyright laws of the given countries of origin and applicable international laws, treaties,
and/or conventions. No part of this work may be reproduced or transmitted in any form or by any means, electronic or
mechanical, including photocopying or recording, or by any information storage or retrieval system, except as expressly
permitted in writing by Citilabs.
Citilabs has carefully reviewed the accuracy of this document, but shall not be held responsible for any omissions or
errors that may appear. Information in this document is subject to change without notice.

60-010-1
April 24, 2013

Contents

About This Document . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix


Chapter 1

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
What is Cube Analyst? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Scope of this document. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
What's new? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Common elements and variations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Reading this document . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Conventions used in this document . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Computing resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Cost information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

Chapter 2

Estimation System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Framework for handling different data consistently . . . . . . . . . . . . . . . . . . . . 12
Objectives. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Handling data variability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Options for users . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Considerations for users . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Deciding what information to input. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Inputting data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Estimating the matrix. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Analyzing the estimated matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Improving the estimated matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Estimating highway and public transport matrices. . . . . . . . . . . . . . . . . . . . . 20
Overview of Cube Analyst. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

Chapter 3

Possible Data Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25


Types of data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Link counts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Turning counts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Prior trip matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Trip cost matrix. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Partial O-D matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Trip ends. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Routing information. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Cost distribution function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Part-trip data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Sets of data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

Chapter 4

Mathematical Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Mathematical notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Explaining the letters and symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Notation used in the estimation equation . . . . . . . . . . . . . . . . . . . . . . . . . 34
Introduction to the mathematics in Cube Analyst . . . . . . . . . . . . . . . . . . . . . . 35
Main mathematical features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Estimation equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Model parameters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
Maximum likelihood objective function . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
Describing the variation in data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Optimizer: Finding the minimum value . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Mathematical summary. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
Maximum likelihood method: Background theory . . . . . . . . . . . . . . . . . 48
Application of maximum likelihood to Cube Analyst. . . . . . . . . . . . . . . 49
Cube Analyst objective function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
Cube Analyst trip estimation model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Estimating model parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Optimization procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
Parameter errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Cell reliability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Extensions to the calculations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

Chapter 5

Data Preparation and Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59


Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
Matrices. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
Trip ends . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Networks and traffic and passenger counts . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Screenlines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

Routings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Highways . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Public transport . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
Setting confidence levels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
Characteristics of the data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
Deciding on confidence values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Tuning estimation performance. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Control of routing information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Analyzing the results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

Chapter 6

Estimation Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Study area. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Estimating the matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
Evaluation: Sensitivity analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
Including part-trip data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

Chapter 7

Hierarchic Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
Introduction to hierarchic estimation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
Approaches to estimating very large matrices . . . . . . . . . . . . . . . . . . . . . 94
Different levels of detail: Districts and zones. . . . . . . . . . . . . . . . . . . . . . . 94
Different approaches to hierarchic estimation . . . . . . . . . . . . . . . . . . . . . 95
Alternative approaches to hierarchic estimation . . . . . . . . . . . . . . . . . . . . . . . 96
Estimation with mixed district and zonal detail . . . . . . . . . . . . . . . . . . . . 96
Local matrices. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
Summary of the hierarchic estimation process . . . . . . . . . . . . . . . . . . . . 99
Defining districts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
Running Cube Analyst for hierarchic estimation . . . . . . . . . . . . . . . . . . . . . . 106
Parameter ZCONF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107

Chapter 8

Using Cube Analyst . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109


Input data: overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
Outputs: overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
Estimating large matrices (hierarchic estimation) . . . . . . . . . . . . . . . . . . . . . 112
Estimation process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113

Chapter 9

Reports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Summary of Reports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
Sample reports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Average confidence level . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Final five iterations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

Matrix totals and zone generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119


Zone attractions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Average confidence level (part trip data) . . . . . . . . . . . . . . . . . . . . . . . . . 121
Part trip totals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
District matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
Local matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

Chapter 10

Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

Chapter 11

Control Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127


&PARAM keywords . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
Standard user control parameters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
Secondary user control parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
Tuning control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
&OPTION keywords. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136

Chapter 12

Program Specific Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139


Screenline file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
Link count format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
Turning count format. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
Trip end file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
Coordinate file. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
Model parameter file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
Local matrix control file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
District definition file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
Intercept file. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
Gradient search file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150

Chapter 13

Notes on Program Use . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151


Approaches to running Cube Analyst. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
Initial estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
Constrained model parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
Controlling the optimization process. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
Selection of model form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
Information in the optimization log file. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
Computation times . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
Running Cube Analyst from Cube Voyager . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
Running Cube Analyst from a VOYAGER script . . . . . . . . . . . . . . . . . . . . 160
Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160

Chapter 14

Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
Estimation with prior trip and count data only . . . . . . . . . . . . . . . . . . . . . . . . 162

Estimation with prior trip, count, and trip end data . . . . . . . . . . . . . . . . . . . 163
Estimation with warm start and cost data . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
Estimation with highways part trip data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
Estimation with public transport part-trip data . . . . . . . . . . . . . . . . . . . . . . . 166
Hierarchic estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
Example of screenline volumes report . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169

About This Document

Welcome to Cube Analyst!


This document provides detailed reference information about
Cube Analyst.
This document contains the following chapters:

Chapter 1, Introduction

Chapter 2, Estimation System

Chapter 3, Possible Data Inputs

Chapter 4, Mathematical Background

Chapter 5, Data Preparation and Analysis

Chapter 6, Estimation Process

Chapter 7, Hierarchic Estimation

Chapter 8, Using Cube Analyst

Chapter 9, Reports

Chapter 10, Files

Chapter 11, Control Data

Chapter 12, Program Specific Data

Chapter 13, Notes on Program Use

Chapter 14, Examples

Introduction

This chapter introduces you to Cube Analyst. Topics include:

What is Cube Analyst?

Scope of this document

What's new?

Background

Common elements and variations

Reading this document

Conventions used in this document

Computing resources

Cost information

What is Cube Analyst?


Cube Analyst is a program which estimates an origin-destination
(O-D) trip matrix. It is an optional, standalone and separately
licensed module in the Cube suite.
Cube Analyst estimates one matrix at a time, and the data should
form a set related to this particular matrix; that is, the data should
correspond to the same time period (hour(s) of day, day of week,
time of year) as the matrix. It should also correspond to the same
units of flow as the matrix (vehicles, pcus, passengers, etc.).
The characteristic common to all estimation options offered by
Cube Analyst is that they make the best use, in a flexible way, of
commonly available data sources to contribute to the estimation
process.
The user assigns levels of confidence (reliability) to the data, which
condition the influence of the various data sources in the
estimation. The estimation process is based on the maximum
likelihood technique, coupled with an optimization procedure.

Scope of this document


This document applies to all levels of functionality offered and
modes of operation of Cube Analyst. Features specific to a variant
are noted.
This document concentrates on Cube Analyst; wider matters on
matrix estimation, and the context within which Cube Analyst may
be used, are described in the Introduction to the Matrix Estimation
Programs. That document also explains the terms that have a specific
meaning for Cube Analyst and are used in this document.

What's new?
Cube Analyst can now estimate Cube Voyager Public Transport
matrices by using an intercept file output by the Cube Voyager PT
program.

Background
Cube Analyst enables transport planners to estimate origin-destination
(O-D) trip matrices and to maintain the currency of
existing O-D matrices, while minimizing survey costs.
As described in "Introduction to the mathematics in Cube
Analyst" on page 35, Cube Analyst is suitable for estimating present-day
matrices, but not for forecasting future-year trip matrices.
The software contains a number of novel and distinctive features. It
was first developed as a collaborative venture with the Dutch
Ministry of Transport, the Rijkswaterstaat. Subsequently, studies
and developments undertaken for Centro (the Passenger Transport
Executive for the West Midlands area of England) led to a
broadening of the software's capabilities to consider public
transport passenger matrices, as well as highway (vehicle) matrices,
and to estimate detailed matrices for very large study areas.

Common elements and variations


The characteristic common to all variants of Cube Analyst is that
they make the best use, in a flexible way, of most available data
sources in the estimation process. This includes not only vehicle
traffic or passenger flow counts and prior (old) matrices, but also
partially observed matrices, zonal trip end (generation and
attraction) data, vehicle routing, travel cost matrices, and even
previously calibrated trip cost distribution functions. An extension
is the use of a further form of data called part-trip data, described
in "Part-trip data" on page 29.
The user ascribes a confidence (or reliability) level to each data
item. This conditions the influence of the data when different data
items (inevitably) imply different trip matrix cell values. The estimation
process is based on a statistically rigorous procedure which takes
direct account of inherent traffic data variability. It uses the
maximum likelihood technique, coupled with a powerful
optimization procedure, to derive simultaneously an unusually
large set of model parameters. These then determine the estimated
trip cell values with correspondingly enhanced precision.
Nevertheless, the estimation process remains mathematically
underspecified and a feature of Cube Analyst is the information
available to assess the quality of the estimated matrix. This includes
comparative and sensitivity analyses, and reports which draw on a
range of graphical and tabular presentations. Statistical reports are
available which provide information on the standard errors of
model parameter values, and indicators of the stability of estimated
trip matrix cells (via a sensitivity matrix).
Cube Analyst provides a hierarchic approach to estimation, suited
for use with very large matrices, typically between 2,500 and 5,000
zones in size. Its basic approach is to estimate a general matrix, in
which zones are automatically grouped into districts. This area-wide
estimation is then used to control a set of detailed
estimations, which build up to provide a fully detailed estimate for
the entire study area.
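
The relationship between a zonal matrix and the district matrix used in a
hierarchic estimation can be illustrated with a short Python sketch. It
shows only the aggregation step, under the assumption that each zone
belongs to exactly one district. The function name, the sample trips, and
the hand-written zone-to-district mapping are hypothetical; Cube Analyst
groups zones automatically and this sketch does not reproduce that logic.

```python
def aggregate_to_districts(zone_matrix, zone_to_district, n_districts):
    """Sum zone-to-zone trips into district-to-district trips.

    zone_matrix      -- square list of lists, zone_matrix[i][j] = trips i -> j
    zone_to_district -- zone_to_district[i] = district index of zone i
    n_districts      -- number of districts (hypothetical, user supplied)
    """
    district_matrix = [[0.0] * n_districts for _ in range(n_districts)]
    for i, row in enumerate(zone_matrix):
        for j, trips in enumerate(row):
            district_matrix[zone_to_district[i]][zone_to_district[j]] += trips
    return district_matrix


# Hypothetical 4-zone matrix; zones 0-1 form district 0, zones 2-3 district 1.
zone_trips = [
    [0, 10, 5, 2],
    [8, 0, 4, 1],
    [6, 3, 0, 7],
    [2, 1, 9, 0],
]
zone_to_district = [0, 0, 1, 1]

district_trips = aggregate_to_districts(zone_trips, zone_to_district, 2)
print(district_trips)  # [[18.0, 12.0], [12.0, 16.0]]
```

The matrix total is unchanged by the aggregation (58 trips in both cases),
which is a useful check when preparing district definitions.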

Reading this document


The introductory chapters provide:

An overview of Cube Analyst

A set of Standardized Procedures, suitable for different types of
estimations

The document considers estimation of highway and public
transport matrices and all of the Cube Analyst features.
Highway and public transport estimation are very similar, apart
from obvious differences such as the use of line (service) data for
public transport. There are also differences in emphasis, for
example, count data is often more plentiful and reliable for
highways than for public transport. Where such differences arise,
they are noted.
When reading this document note that:

The next four chapters provide an essential overview of Cube
Analyst

Chapter 6, "Estimation Process," documents an example of
applying Cube Analyst

Chapter 7, "Hierarchic Estimation," is concerned with the
specialist topic of hierarchic estimation

Conventions used in this document


The following conventions are used in this document:

Parameters, options, and selections appear in upper case.
For example: COSTM

Technical terms introduced for the first time appear in upper and
lower case italics.
For example: Hessian

Terms and phrases with particular meaning in the context of
Cube Analyst appear in quotes. These phrases may also appear in
italics.
For example: "Sensitivity Matrix"

Computing resources
Cube Analyst is a major system. The programs ensure that the
mechanics of operation are straightforward for the user, but the
system requires familiarity with a number of programs, especially for
data preparation and analysis of results. This should be taken into
account when planning to use it for the first time.
Cube Analyst is designed around a number of rigorous principles,
including the calibration of the mathematical estimation model
which the program undertakes. One consequence is that it is
computationally intensive: the differing sets of data are considered
simultaneously, and this requires relatively large amounts of
random access memory (RAM) and disk space.

Cost information
For highways, cost data is produced by Citilabs products.
For public transport in TRIPS, cost data is produced by MVPUBM.

Estimation System

This chapter discusses the nature of the estimation system. Topics
include:

Framework for handling different data consistently

Objectives

Handling data variability

Options for users

Considerations for users

Estimating highway and public transport matrices

Overview of Cube Analyst

Framework for handling different data consistently


Cube Analyst provides a framework that is used to input a variety of
information to estimate an O-D matrix. The characteristics of the
system are that:

Some or all of the types of information introduced in "Common
elements and variations" on page 6 may be used.

The system can work with little data, but the accuracy of the
estimated matrix is improved as more data is input.

Different information is handled on a consistent basis.

The variability of data is explicitly accounted for.

Objectives
The aim of Cube Analyst is to maximize the value of existing data
and to limit the need for costly surveys. As such, it is mainly
concerned with processing information in the best (statistical)
manner; though the accuracy of the estimated matrix remains
strongly affected by the amount and the quality of the information
input by the user.
Besides estimating matrices for individual studies, Cube
Analyst is suited for use with regular surveys designed to keep
matrix information up to date.

Handling data variability


Cube Analyst explicitly considers the variability of data. Inevitably,
there are inconsistencies in what the different data suggest that the
estimated matrix should be. The inherent variability means that
collected data items are merely a sample, and hence the values
(even of simple traffic counts) may only be considered to fall within
a range (a distribution). The width of this range reflects the
confidence that may be placed in particular items.
Cube Analyst therefore requires the user to input information
about how confident they are that each data item is representative
of the situation for which the matrix is to be estimated. The
information is input as a nominal percentage sample value. In
restricted circumstances, this may be an actual sample obtained in
a survey. This information about the variability is used to determine
what relative influence each item of data has in the estimation
process: it acts algebraically as a weighting value, and is referred
to as a confidence level.
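
As an illustration of how a confidence level can act as a weighting value,
the following Python sketch combines two data items that imply different
values for the same matrix cell. It assumes a simple confidence-weighted
average. The function name, the sample values, and the averaging rule are
hypothetical and are not the Cube Analyst objective function, which is
described in Chapter 4, Mathematical Background.

```python
def weighted_estimate(observations):
    """Combine conflicting values using confidence levels as weights.

    observations -- list of (value, confidence) pairs, where confidence is a
                    nominal percentage sample value supplied by the user.
    This is an illustrative weighted average only (an assumption), not the
    maximum likelihood calculation performed by Cube Analyst.
    """
    total_weight = sum(confidence for _, confidence in observations)
    if total_weight <= 0:
        raise ValueError("at least one observation needs a positive confidence")
    weighted_sum = sum(value * confidence for value, confidence in observations)
    return weighted_sum / total_weight


# Hypothetical case: an old prior matrix suggests 120 trips (confidence 20),
# while recent count data implies 150 trips (confidence 80).
estimate = weighted_estimate([(120.0, 20.0), (150.0, 80.0)])
print(estimate)  # 144.0
```

The result lies closer to the value with the higher confidence level, which
is the qualitative behaviour described above.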

Options for users


The user does not have to use Cube Analyst in one manner, but
rather according to the information that is available and the
context within which the matrix is required. Typically, the user will
start with what information is to hand or may easily be collected.
This provides a fast means of obtaining an initial matrix that can
enable a study to proceed, at least for general investigations.
Analysis of the resulting matrix and estimation statistics will show
where there is greatest requirement for further quality data. Cube
Analyst is then used to integrate this new (and possibly different
type of) data to produce an improved estimated matrix.

Considerations for users


Cube Analyst involves the user in a number of stages:

Deciding what information to input

Inputting data

Estimating the matrix

Analyzing the estimated matrix

Improving the estimated matrix

Deciding what information to input


This will usually be all information already available, but new data
will normally be appropriate for those parts of the study area where
most change has taken place since previous surveys, or where
traffic schemes or policy proposals require detailed analysis.
Identify notable features and data sources

Feature (changes in)            Example                           Data
------------------------------  --------------------------------  --------------------------------
Car ownership                   Traffic growth                    Counts
Land use                        New industry, shops;              Trip ends (generations and
                                new car parking                   attractions)
Road/public transport network   New bypass; traffic management;   Travel times, routing
                                new bus/rail services
Travel habits                   Out-of-town shopping              Observed O-D patterns;
                                                                  PT operators' boarding and
                                                                  alighting surveys;
                                                                  vehicle licence plate surveys

Appreciating key land uses

Inputting data
Information may be input in the form of matrices, as trip ends, or as
network-related information. This data is prepared by the user
within Cube, which offers a variety of modes of data entry. Extra
information is required on data variability. This is input in the same
form as the information to which it corresponds. Each data item, for
example each count, trip end, etc., may have an individual
confidence level attached to it, but in many cases global values will
be used.
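
The idea of a global confidence value with optional per-item overrides can
be sketched in Python as follows. This is a minimal illustration only: the
class, field names, node numbers, and confidence values are hypothetical
and do not represent the Cube Analyst screenline file format, which is
documented in Chapter 12, Program Specific Data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LinkCount:
    """One observed link count (hypothetical structure for illustration)."""
    a_node: int
    b_node: int
    volume: float
    confidence: Optional[float] = None  # None means "use the global value"


def effective_confidence(item, global_confidence):
    """Return the item's own confidence level, or the global one if unset."""
    if item.confidence is None:
        return global_confidence
    return item.confidence


GLOBAL_COUNT_CONFIDENCE = 50.0  # assumed global value for all counts

counts = [
    LinkCount(101, 102, 1250.0),                   # falls back to the global value
    LinkCount(102, 103, 980.0, confidence=90.0),   # individually specified
]

resolved = [effective_confidence(c, GLOBAL_COUNT_CONFIDENCE) for c in counts]
print(resolved)  # [50.0, 90.0]
```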

Estimating the matrix


The matrix estimation stage simply requires the user to input the
prepared files into Cube Analyst. As described in "Overview of
Cube Analyst" on page 21, and in more detail in Chapter 4,
"Mathematical Background," Cube Analyst performs a set of
iterative calculations which automatically determine the
statistically most likely matrix for the set of input data values
provided.
The first time Cube Analyst is run, it creates a set of files which can
be used to reduce the run times of subsequent runs of Cube
Analyst. This is either because the need to restructure data is
avoided (the intercept file) or because an estimation can take
advantage of previously calculated results (the gradient search file
and the model parameter file).
This ability to benefit from a previous run of Cube Analyst (for the
same basic study) is usually used to assist in analyzing the
consequences of changes in data values but, for lengthy runs on
large matrices, it can also provide a means of breaking an
estimation into more than one run, for convenience.
With an improved optimizer in Cube Analyst and more powerful
computers, such staging of estimations is now rarer, but it remains a
typical feature of hierarchic estimations of extremely large
matrices. This is assisted by the local matrix control file, which is
open to editing so that estimations are staged in a manner
convenient to the user.

Analyzing the estimated matrix


It is natural and desirable to want to check the quality of the
estimated matrix. A typical approach to checking quality might be
to compare the estimated matrix with some observed data which
has not been used in the estimation process. However, this
approach is not usually appropriate for Cube Analyst, which is
designed to take advantage of all reasonably observed data. For
example, if the estimated matrix implies link flows across a
screenline that differ from those observed (this is easily checked
by assigning the estimated matrix to the network), then the
solution is to re-run the estimation, this time incorporating the
extra observed data.
The approach to analyzing the quality of the estimated matrix is,
therefore, based on:

Comparing the estimated results with input data values

Checking the sensitivity of the results if data values are altered

Analyzing the estimation calculations

Besides information output by Cube Analyst itself, extensive use is
made of other Citilabs programs for creating tabulations and
graphic displays which highlight different characteristics of the
estimated matrix.

Improving the estimated matrix


Deficiencies in the quality of the estimated matrix, when they are
signalled by the results of the analysis phase, are remedied by
improving the quality or quantity, or both, of the input data. The
analysis phase can provide strong pointers as to which data is
contributing to quality problems and hence where the user can
focus attention.

Estimating highway and public transport matrices


For much of the time, it is not necessary to distinguish between the
cases of estimating matrices for use with highways and public
transport analysis; the same principles apply to each. However,
there are a number of points to note. The first one is that the units
of the matrices are usually in terms of vehicles for highways, and in
terms of passengers for public transport.
Much of the data and methods of processing are identical for both
highways and public transport, but the routing information is
derived in quite different ways. There is also the concept of line
groups, which only applies to public transport and not to highways.
Assumptions about the quality and quantity of data vary between
the modes. Link count data is more readily, and accurately,
available for highways than for public transport. Public transport is
often more reliant on part-trip data, as obtained from boarding and
alighting surveys. This form of data may be obtained from licence
plate matching surveys for highways.

Overview of Cube Analyst


Cube Analyst's operations can be considered as a series of activities:
1. Data input and restructuring

For the most part, Cube Analyst simply reads the set of the user's
input data at this initial stage. However, Cube Analyst also
analyzes and restructures routing information (from the TRIPS
route choice probability (RCP) file or Cube Voyager path file)
and count data (from the screenline file) into a more concise
and efficient file, called the intercept file. This restructuring can
be relatively lengthy so, as noted in "Considerations for users"
on page 16, it is possible to re-use an intercept file once it has
been created. For Cube Voyager users, the creation of the
intercept file is handled by the HIGHWAY program.
2. Calculation initiation

The main Cube Analyst calculations may be viewed as a search
for the statistically most likely matrix, given the set of input
data values. As this search typically relates to many thousands
of matrix cell values, the manner of searching is a critical aspect
of Cube Analyst.
A calculation called the "method of scoring" directs the start of
the searching process. This calculation is always done as the
first stage of the estimation calculation, and it may be repeated
later, according to the setting of Cube Analyst's ITERH
parameter. (This determines the number of iterations between
gradient search matrix calculations.)
There is a strategy consideration here. The default method for
running Cube Analyst spends time with the "method of
scoring" calculation in order to limit subsequent calculations.
Cube Analyst also calculates a suitable value for ITERH.
However, it is open to the user to over-ride this strategy by:

Changing the setting of the IHTYPE parameter (used to
determine the optimization process) of Cube Analyst from
its default in order to avoid the method of scoring. This
reduces the associated calculation time, but means that the
searching process is initially less well directed and so the
net calculation time may still be longer.

Setting ITERH to a lower value than the default, which
means that the searching process is re-appraised by further
application of the method of scoring. This may be suitable
when there are signs that the optimizer is not able to
determine a convergent solution in a reasonable number of
iterations.

The user should note that these options for tuning the
performance of Cube Analyst exist, but should not necessarily
be concerned to apply them, as the default operation is usually
entirely satisfactory. It requires some experience with a
particular estimation problem to determine its best strategy.
3. Function evaluation

"Function evaluation" is the term used to describe the
calculation of a series of estimation results. These are calculated
by way of an estimation equation (function). The estimation
equation calculates the values of the estimated cells according
to the current values of a series of model parameters. There are
a large number of model parameters; the number is usually
two times the number of zones, plus the number of
screenlines.
These model parameters have an initial value of 1.0, which has
the consequence that the initial function evaluation (usually)
results in an estimated matrix which is identical to the old
(prior) matrix; see "Prior trip matrix" on page 27.
4. Optimization

The optimizer is a central feature of Cube Analyst; there are two
critical elements to it:
a. Objective function. This provides a criterion by which the
optimizer can determine whether one value of a particular
cell is better than another value. "Maximum likelihood
objective function" on page 40 explains how this criterion
is derived from statistical maximum likelihood theory
and rigorous mathematical calculation. Hence, Cube
Analyst defines "better" as "statistically more likely."
b. Set of search directions and a step length. The optimizer
alters the model parameter values, from their starting point
of 1.0, to seek an estimated matrix that is an improvement
on its current estimates. The search direction determines,
for any cell in the matrix, whether model parameters
should be increased or decreased, and the step length
defines by how much.
The final values of the model parameters are available to view
as the model parameter file, so it is possible to see how they
have been changed from 1.0.
5. Iterations and convergence

After the optimizer has calculated new model parameter
values, the function evaluation process is repeated to obtain
the latest estimated matrix (and its derivative values). This
overall process is repeated in a series of iterations; at each
iteration the optimizer ensures that the new estimated
matrix is an improvement on (more likely than) the previous one.
Because there are so many cells to estimate, which Cube
Analyst does not confine to integer values, it is nearly
always possible to make some improvement, however small.
Therefore, it is necessary to define a criterion to determine
when the iterations have reached an acceptable solution. In
Cube Analyst, this criterion is set by the UTOL (user tolerance)
control parameter. UTOL sets a minimum value on the step
length which the optimizer is allowed to use, as very small step
lengths indicate that the optimizer is making correspondingly
small changes to the estimated matrix. It is usual to leave UTOL
at its default value, and allow Cube Analyst to run until it
terminates with a "converged" message.
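
The cycle of function evaluation, optimization, and iteration until the step length falls below a tolerance can be sketched on a toy problem. This is a hedged illustration, not Cube Analyst's optimizer: it swaps in plain steepest descent with step halving in place of the method of scoring and gradient search, uses only trip-end data with an unweighted Poisson-type objective, and the names `trip_ends`, `objective`, `gradient`, `estimate`, and `utol` (a stand-in for the role UTOL plays), as well as all numbers, are assumptions.

```python
import math


def trip_ends(prior, a, b):
    """Function evaluation: row/column totals of T[i][j] = prior[i][j]*a[i]*b[j]."""
    est = [[t * a[i] * b[j] for j, t in enumerate(row)]
           for i, row in enumerate(prior)]
    rows = [sum(r) for r in est]
    cols = [sum(r[j] for r in est) for j in range(len(b))]
    return rows, cols


def objective(prior, a, b, obs_rows, obs_cols):
    """Poisson-type negative log-likelihood (constants dropped, no weights)."""
    rows, cols = trip_ends(prior, a, b)
    return sum(h - H * math.log(h)
               for h, H in zip(rows + cols, obs_rows + obs_cols))


def gradient(prior, a, b, obs_rows, obs_cols):
    """Analytic derivatives of the objective with respect to each a[i], b[j]."""
    rows, cols = trip_ends(prior, a, b)
    w = [[2.0 - obs_rows[i] / rows[i] - obs_cols[j] / cols[j]
          for j in range(len(b))] for i in range(len(a))]
    ga = [sum(w[i][j] * prior[i][j] * b[j] for j in range(len(b)))
          for i in range(len(a))]
    gb = [sum(w[i][j] * prior[i][j] * a[i] for i in range(len(a)))
          for j in range(len(b))]
    return ga, gb


def estimate(prior, obs_rows, obs_cols, utol=1e-6, max_iter=200):
    """Iterate until no step of length >= utol improves the objective."""
    a = [1.0] * len(obs_rows)   # model parameters start at 1.0,
    b = [1.0] * len(obs_cols)   # so the first estimate equals the prior
    history = [objective(prior, a, b, obs_rows, obs_cols)]
    for _ in range(max_iter):
        ga, gb = gradient(prior, a, b, obs_rows, obs_cols)
        step = 1.0
        while step >= utol:
            # Move against the gradient; keep parameters strictly positive.
            ta = [max(x - step * g, 1e-9) for x, g in zip(a, ga)]
            tb = [max(x - step * g, 1e-9) for x, g in zip(b, gb)]
            value = objective(prior, ta, tb, obs_rows, obs_cols)
            if value < history[-1]:
                break           # improvement found: accept this step
            step /= 2.0
        else:
            return a, b, history, True   # step length fell below utol
        a, b = ta, tb
        history.append(value)
    return a, b, history, False          # iteration limit reached first


# Illustrative data: a 2 x 2 prior and observed trip ends 20% above it.
prior = [[10.0, 20.0], [30.0, 40.0]]
a, b, history, converged = estimate(prior, [36.0, 84.0], [48.0, 72.0])
```

Each accepted iteration lowers the objective, mirroring the statement that every new estimated matrix must be an improvement on the previous one; the run stops either when no step of at least `utol` helps or when the iteration limit is reached.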

Possible Data Inputs

This chapter describes data inputs. Topics include:

Types of data

Sets of data

Types of data
Cube Analyst can operate using some or all of the following types
of data:

Link counts

Turning counts

Prior trip matrix

Trip cost matrix

Partial O-D matrix

Trip ends

Routing information

Cost distribution function

Part-trip data

NOTE: Cube Analyst requires confidence level information for all
data types except routing information and cost distribution
function.

Link counts
For highways, this information can be surveyed with considerable
accuracy, for example by using automatic counters, but it may not
show the current demand for travel (which the O-D matrix should
represent) if congestion has constricted flows.
For public transport, this data is often obtained from estimates of
passenger numbers in buses and rail carriages, and is of inherently
limited accuracy (but may still be usefully exploited by Cube
Analyst).
For both modes, it should be observed that matrices normally
apply to average situations for which individual counts will match
to only some extent.

Link counts which are spread randomly across the network
contribute relatively little information to the estimation of matrix
cells. This may be less of a problem for public transport networks
offering limited alternative routes than for highway networks with
inherently greater route choice options.

Turning counts
The same comments as for link counts apply. Note that turning
counts may only be applied when inputting a Cube Voyager path
file. They are not supported for an estimation using a TRIPS RCP file.

Prior trip matrix


This matrix might be an out-of-date matrix for the study area, or
possibly a previous study forecast for the present day. It is not
essential to input a prior trip matrix, but in practice a matrix is very
desirable for information about the pattern of trip movements.

Trip cost matrix


This matrix summarizes the cost of travel between zones, where
cost is normally defined as a user-specified combination of time
and distance, and any tolls or fares, etc. The trip cost matrix may be
used as a substitute when some or all of a prior matrix is not
available. The costs may be based on either modelled or surveyed
speed data.

Partial O-D matrix


This is simply another approach to providing the prior matrix that
makes it possible to use information that specifies some cells of the
matrix but not all. The user merely identifies a (relatively) high
confidence in those cells which have been observed and allows
other information to determine values in the remaining cells. This
may be data from the cost matrix, in which case the corresponding
prior matrix cells must be zero. Alternatively, non-observed cells are
given non-zero values with zero or low confidence levels. Zero
values in input matrices are taken to indicate that trips in
corresponding cells are impossible. Cost data are not used to
estimate trips for cells which have non-zero prior-trip values.
This approach makes Cube Analyst useful when surveys have been
conducted around critical parts of a study area (for example, town
centers, travel corridors, etc.), but there remains a need to estimate
the matrix for the rest of the area.

Trip ends
The total number of trips generated from and attracted to zones
(G&A) may be obtained either from surveys or from mathematical
land-use type models. Surveys are appropriate when zone
boundaries are such that traffic may be counted entering and
leaving zones on distinct trips, rather than merely passing through
the zone. This tends to occur only for some zones, for example a car
park or an industrial estate, but these are often important zones for
a study.
It is possible to use data derived from both methods, for example, a
few zones surveyed and the remainder derived from a model, with
the resulting trip ends distinguished through differing confidence
levels.

Routing information
It is possible to survey routing data, though this is rarely done. The
modelling of routing is often not a very good replication of actual
(erratic) driver or passenger routing, and it is often not possible to
place much reliability on this otherwise important data. Cube
Analyst is therefore designed to use routing information, as far as
possible, only where the precise routing does not matter. Thus, for
"skim" cost information, small variations in routes may be ignored,
while count information is used in "bottleneck" situations where
the number of routes is limited to a few alternative links (ideally
one).

Cost distribution function


Many areas which have been the subject of previous studies will
have a previously calibrated mathematical trip-cost distribution
function, as used in the gravity model. Because Cube Analyst
contains its own calibration procedures, the information implied by
the distribution function is not normally used directly, although the
a and b parameters, discussed later, may be fixed with reference to
a previously calibrated gravity model.

Part-trip data
This data is surveyed in the form of matrices where the recorded
origin and destination are not necessarily the ultimate origin and
destination of the trip. This is illustrated in the figure "Definition of
part-trip data," which shows the recorded part of the trip (S - E)
relative to the total trip (O - D). It is possible for one or both of
points S and E to coincide with the corresponding points O and D.
For highways, this data is typically obtained from licence plate
matching surveys; for public transport, it is obtained from on-board
surveys recording passenger boarding and alighting points.

Definition of part-trip data

Sets of data
Cube Analyst estimates one matrix at a time, and the data should
form a set related to this particular matrix, that is, the data should
correspond to the same time period (hour(s) of day, day of week,
time of year) as the matrix. It should also correspond to the same
units of flow (vehicles, pcus, passengers, etc.). Sometimes the user
will have to transform data (for example, by factoring) to achieve
this, and this will usually imply a reduction (small or large) in
confidence levels for the transformed data.
Also, only one set of information may be input into Cube Analyst for
an estimation. Hence, if multiple sets exist, say, several traffic counts
for the same link, then the user must derive a single set. This may
simply be to choose the most recently surveyed set, or it might be a
weighted average of all available sets. Multiple sets of data usually
allow confidence levels to be increased relative to single sets of
data.
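
Deriving a single value from several counts of the same link might look like the following sketch. It is an illustration only: the function name `combine_counts` and the weights are hypothetical, and how far the confidence level should be raised for the combined value remains a matter for the user's judgement.

```python
def combine_counts(counts, weights=None):
    """Weighted average of several counts taken on the same link.

    With no weights given, every survey counts equally.
    """
    if not counts:
        raise ValueError("at least one count is required")
    if weights is None:
        weights = [1.0] * len(counts)
    if len(weights) != len(counts):
        raise ValueError("one weight per count is required")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(c * w for c, w in zip(counts, weights)) / total_weight


# Two illustrative surveys of one screenline, in vph.
surveys = [1684.0, 1739.0]
simple_mean = combine_counts(surveys)                   # equal weights
favour_recent = combine_counts(surveys, [1.0, 3.0])     # newer survey weighted 3x
most_recent = surveys[-1]                               # or simply take the latest
```

Whichever rule is chosen, only the single derived value (with its confidence level) is then input to the estimation.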

Mathematical Background

This chapter describes the mathematics that Cube Analyst uses.
Topics include:

Mathematical notation

Introduction to the mathematics in Cube Analyst

Mathematical summary

Extensions to the calculations

Mathematical notation
This section discusses mathematical notation. Topics include:

Explaining the letters and symbols

Notation used in the estimation equation

Explaining the letters and symbols


This section uses mathematical notation, which can look daunting
to those who are not accustomed to it. So, first, a word of
background explanation. The notation can appear worse than it is
because of the use of Greek letters and some specialist
mathematical symbols. The problem is that the normal 26-letter
Roman alphabet is not sufficient, even considering upper and
lower case letters, and remembering that some letters have
traditional mathematical meanings and associations. The
mathematics presented here is only an extract of the full
Cube Analyst mathematics, which uses an even wider range of
letters. Also, some of the traditional mathematical notations are
cumbersome when used with vectors and matrices and their
elements, as Cube Analyst requires; hence it is better to use
alternative forms.
The following table is mainly a pronunciation guide, but some of
the symbols and letters are explained further:

Symbol   Description
α        alpha
β        beta
η        eta
θ        theta
λ        lambda
Ξ        xi (upper case)
ξ        xi (lower case)
Π        pi (upper case); symbol for multiplication (product)
Σ        sigma (upper case); symbol for summation
Φ        phi (upper case)
Ψ        psi (upper case)
∂        partial differential operator
∇        nabla; symbol for (partial) differentiation of matrix elements
e        exponent
!        factorial operator (for example, 4! = 4x3x2x1)

The notation P(x|X) implies the probability of x, given the value X.
Similarly, L(x|X) is the likelihood of x, given X; M(x|X) refers to the
log-likelihood of x, given X. Note that the use of bold in the last
example implies that x and X are multi-valued vectors (or matrices).

Notation used in the estimation equation


Notation   Description
i          Origin zone
j          Destination zone
k          Link count
K          Screenline count (from count sites .....)
           Model parameters
           Mean travel cost
           Any one of the model parameters
H          Observed data item
h          Estimated data item

NOTE: H (observed) and h (estimated) may take values as shown below:

Description
Number of trips from i to j
Number of trips from origin i
Number of trips to destination j
Number of trips through link k

This notation is used in "Introduction to the mathematics in Cube
Analyst" on page 35 and "Mathematical summary" on page 48.

Introduction to the mathematics in Cube Analyst


The design of Cube Analyst means that a user can estimate
matrices simply by supplying the program with the appropriate
input data and accepting the resulting matrix. However, it is
valuable to have some understanding of how Cube Analyst
calculates the value of the estimated matrix cells; this insight both
helps in providing confidence in the results and in guiding the
approach to input data, such as setting confidence levels and
considering the potential effects of extra data or improved data
quality.
This section provides additional information about how Cube
Analyst computes matrix cells. Topics include:

Main mathematical features

Estimation equation

Model parameters

Maximum likelihood objective function

Describing the variation in data

Optimizer: Finding the minimum value

Main mathematical features


This section is intended to cater to those Cube Analyst users who
are interested in the detailed mathematical and statistical
underpinnings of the estimation process. Users who are more
interested in other aspects of the model should proceed to
Chapter 5, Data Preparation and Analysis.
The basis of Cube Analyst's calculations is an application of the
standard statistical approach known as the maximum likelihood
method. This method allows estimates of a set of inputs to guide
the estimates of a corresponding set of outputs; the estimates of
the set of inputs are obtained from likelihood functions, which are
expressions of probability distribution functions (pdfs) associated
with the user's input data. The outputs are calculated from an
estimation equation, which must be provided. These points are
further explained below.
Given the range of possible input data, the full mathematical
expression of Cube Analyst is complex, but it involves some
principal components which we use to describe the essential
features of Cube Analyst. "Mathematical summary" on page 48
explains the standard Cube Analyst calculations by summarizing
the main mathematical steps. "Extensions to the calculations" on
page 57 shows how additional features are accommodated in the
calculations. This section continues by explaining Cube Analyst's
mathematics in largely descriptive terms, while introducing the
main equations. Throughout this section, the mathematical
notation is as defined in "Mathematical notation" on page 32,
where it is not otherwise clear from the text.

Estimation equation
The heart of the estimation is an equation (estimation model)
whose output, $T_{ij}$, corresponds to the values of the cells of Cube
Analyst's output matrix for trips between zones $i$ and $j$. The form
of this mathematical estimation model in Cube Analyst is:

$$T_{ij} = t_{ij} \, a_i \, b_j \prod_{K} X_K^{P_{ijK}} \qquad (1)$$

This equation contains the following elements:

its output, $T_{ij}$

some data items:

$t_{ij}$ - Prior observation of trips between $i$ and $j$

$P_{ijK}$ - Probability of trips between zones $i$ and $j$ using
screenline site $K$ (it is possible for a screenline to correspond
to a single count site, in suitable circumstances)

some model parameters: $a_i$, $b_j$, $X_K$

$\prod_K$ - Implies the product of $X_K^{P_{ijK}}$ over all the
screenline count sites

If there is no prior observation for movements between some or
all possible origin-destination zone pairs, $i$-$j$, then $t_{ij}$ may be
calculated by Cube Analyst from:
.....(2)
Equation (2) introduces further elements:

One data item:
$c_{ij}$ - the generalized cost of travel between zones $i$ and $j$

Two model parameters:
$\alpha$, $\beta$

It may be noted that screenlines are usually organized so that the
routing probability $P_{ijK}$ is 0 or 1. Also, because the prior value
$t_{ij}$ provides an estimator of the output, as well as possibly being
an input data item, it may also be considered as a model parameter.
Hence, the prior matrix value is referred to by two symbols, one as a
data item and one as a model parameter. (That is, the two are
numerically identical, but are logically distinct.)
The form of equation (1) has been chosen primarily for reasons of
convenience, and for the appropriateness of its form according to
the data used in the estimation (as we discuss below). It is designed
to be efficient in assisting information to be processed, but is not
behavioral in nature. This implies that Cube Analyst is suitable for
estimating present day matrices, but not for forecasting which
would require some behavioral assumptions.
Equation (2) is borrowed from the well-known gravity model that
makes the behavioral assumption that people prefer lower cost
journeys to higher cost ones, but are influenced by the level of trips
generated by and attracted to different zones. This is a broad
assumption; it means that cost data may be used where no other
source of prior matrix data is available, but it is not a precise
approach to estimating individual matrix cells.
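
As an illustration of how equation (1) is evaluated, the sketch below assumes the multiplicative reading of the element list above: each cell is the prior value multiplied by an origin parameter, a destination parameter, and each screenline parameter raised to the probability that the movement uses that screenline. The function name `estimate_matrix`, the array layout `p[i][j][K]`, and all values are hypothetical.

```python
def estimate_matrix(prior, a, b, x, p):
    """Evaluate T[i][j] = prior[i][j] * a[i] * b[j] * prod_K x[K] ** p[i][j][K]."""
    estimated = []
    for i in range(len(a)):
        row = []
        for j in range(len(b)):
            value = prior[i][j] * a[i] * b[j]
            for k, x_k in enumerate(x):
                # A probability of 0 leaves the cell untouched by screenline k;
                # a probability of 1 applies the full factor x_k.
                value *= x_k ** p[i][j][k]
            row.append(value)
        estimated.append(row)
    return estimated


# Two zones and one screenline crossed only by inter-zonal movements.
prior = [[10.0, 20.0], [30.0, 40.0]]
p = [[[0.0], [1.0]], [[1.0], [0.0]]]

# With every model parameter at 1.0 the estimate reproduces the prior.
initial = estimate_matrix(prior, [1.0, 1.0], [1.0, 1.0], [1.0], p)

# Raising the screenline parameter scales only the cells that cross it.
scaled = estimate_matrix(prior, [1.0, 1.0], [1.0, 1.0], [1.1], p)

# Two parameters per zone plus one per screenline.
n_parameters = 2 * len(prior) + 1
```

The sketch also shows why routing probabilities of 0 or 1 are convenient: a screenline parameter then either applies in full to a cell or not at all.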

Model parameters
For Cube Analyst, therefore, the estimated matrix is entirely
dependent on the values given to the model parameters. Cube
Analyst is thus, in effect, solely concerned with establishing the
most appropriate values for these model parameters. (Cube Analyst's
calculations are in "parameter space," which accounts for some of
the behavior that may be observed in Cube Analyst's output to the
screen and log file while it is computing, where the values of the
matrix may change in an apparently erratic manner.) Cube Analyst's
calculations are mainly in the nature of a search for the "best"
model parameter values. Apart from the estimation equation itself,
the main features of the Cube Analyst calculations are:

Directing the search for model parameter values (optimization)

Deciding whether the new model parameter values are the
"best" (function evaluation)

We now describe the general issues for Cube Analyst when setting
model parameter values.
Unless the user supplies an input model parameter file (created
by an earlier run of Cube Analyst), the model parameters are
automatically initialized to 1.0. From equation (1), it may be seen
that the initial estimate is then identical to the prior matrix (or
based on the cost matrix, equation (2), if no prior matrix value
exists).
It is possible to compare the estimated matrix with all of the items
of the user's input data. For example, the sums of the rows and
columns of the estimated matrix may be compared with input trip
ends ("Mathematical summary" on page 48 shows this in
mathematical terms for all data items). If the result of this
comparison indicates that the current estimate is too low, then an
improved estimated matrix may be achieved by increasing the value
of at least some model parameters.
The problem for Cube Analyst is that there are many items of user
data, implying many comparisons of the type just described; some
of these comparisons may require the current estimate to be
improved in one way (increased, say), while other comparisons
need the estimate to be altered another way (decreased, say). The
large number of model parameters provides the basis for
reconciling these apparent conflicts; by definition there are (2 x the
number of zones) model parameters provided by the $a_i$s and the
$b_j$s alone. It may be demonstrated that these are sufficient for
equation (1) to define any possible combination of positive,
non-zero matrix cell values. Hence, if, by some means, suitable
values of the model parameters may be found, equation (1) can
produce a matrix which is consistent with all of the user's input
data. That is, at least, if the input data is self-consistent in the
first place.
Of course, this consistency is never the case in real applications of
Cube Analyst, and the best that may be hoped for is to estimate the
matrix which is most likely, given the user's input data. Achieving
this most likely result is the next main topic to discuss, but we will
stay with model parameters to make a few more points.
In principle, there is nothing particular to distinguish the set of
model parameters ($a_i$, $b_j$, $X_K$, $\alpha$, $\beta$, $t_{ij}$);
mathematically, they are equal and each may be affected by any item
of data. However, the form of the estimation equation allows
parameters to be associated naturally with different types of data
such as:

$a_i$, $b_j$         Trip ends, for trips generated at zone i and
                     attracted to zone j
$X_K$                Counts on screenline site K
$\alpha$, $\beta$    Trip cost information ($c_{ij}$)
$t_{ij}$             Prior trip matrix

This association is useful to the optimizer in reflecting the different
(quality) characteristics of the data sets. The nominally redundant
parameters provide extra degrees of freedom to handle data
inconsistencies. This is useful, as the matrix cells affected by a set of
screenline data are precisely defined by the $P_{ijK}$ routing
information.

Maximum likelihood objective function


When Cube Analyst establishes values for the model parameters, it
requires a criterion to determine if the corresponding $T_{ij}$ estimates
either are correct or are better than another set of model
parameter values. This criterion is provided by a mathematical
equation called an objective function. The objective function, $M$,
for Cube Analyst has the following form:
.....(3)
where:
$h$ - is an estimated data item
$H$ - is an observed data item
$\lambda$ - is the confidence level associated with $H$

"Notation used in the estimation equation" on page 34 shows
which items $H$ and $h$ can represent but, in general terms, $H$ is the
input data which the user supplies and $h$ is the corresponding
value implied by the estimated matrix.
We have already discussed how the form of the estimation
equation (1) has been determined for reasons of effectiveness, but
which remain essentially arbitrary; also, how equation (2) derives a
weak behavioral basis from the gravity model. It is therefore
important to appreciate that in contrast, the objective function,
equation (3), is the result of a statistically rigorous procedure,
namely the maximum likelihood method.

The consequence of this is a guarantee, subject to some
qualifications which we consider below, that the estimated matrix
is the statistically most likely, given the data supplied by the user.
The correctness of the estimate remains, of course, dependent on
the quality of the input data. Maximum likelihood theory shows
that the most likely values are indicated when M in equation (3),
which is negative, reaches its minimum possible value. (For reasons
of computational convenience, Cube Analyst minimizes the
negative of the log-likelihood objective function, rather than
maximizing the positive version, as the name "maximum
likelihood" might suggest.)
The qualifications mentioned above are, first, that the input data
sets represent independent observations, which is not normally a
problem for Cube Analyst users, and, second, that the input data
can be described by a probability distribution function, which we
now discuss. The derivation of equation (3) for the objective
function is outlined in "Mathematical summary" on page 48.

Describing the variation in data


The maximum likelihood method assumes that each item of input
data represents an observation from a random distribution of
possible values, where the variation of values may be described
by a probability distribution function. That is, when the user
supplies Cube Analyst with, say, a screenline traffic count value of
1684 vph, this is not considered to be "the" count for that screenline
but, rather, a sample from a distribution. It is common experience
that counting the same screenline on another, but equivalent,
occasion (for example, the same time the following week) will
provide another count value, say 1739 vph, simply on account of
the random variation which is inherent in all traffic (and passenger)
data.

The assumption is made, therefore, that all input data for Cube
Analyst is subject to variation which may be described by the
Poisson probability distribution function (pdf ).

Illustration of a Poisson probability distribution function

The Poisson is a well-known pdf, often associated with data which


can involve many events (for example, 1684 vehicles passing an
observer in an hour). It has the statistical property that its mean
equals its variance. This is valuable for data such as count
information where the variation of 100 vph is significant when the
mean figure is 200 vph, but not when it is 1000 vph; alternatively, a
10% variation implies many vehicles on a mean of 5000 vph, but
not on 50 vph. The Poisson distribution reflects these changes in
significance in an appropriate way.
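
The effect of the mean-equals-variance property can be sketched numerically. The short Python example below is an illustration only, not part of Cube Analyst: it assumes that "significance" can be read as the size of a deviation measured in Poisson standard deviations (the square root of the mean), and the function name `poisson_z` is hypothetical.

```python
import math

def poisson_z(mean_vph, deviation_vph):
    """Express a deviation in units of the Poisson standard deviation.

    For a Poisson variable the variance equals the mean, so sd = sqrt(mean).
    """
    return deviation_vph / math.sqrt(mean_vph)

# (mean, deviation) pairs taken from the examples in the text:
# 100 vph on means of 200 and 1000 vph; 10% of 5000 vph and of 50 vph.
cases = [(200, 100), (1000, 100), (5000, 500), (50, 5)]
for mean, deviation in cases:
    z = poisson_z(mean, deviation)
    print(f"mean={mean:5d} vph  deviation={deviation:4d} vph  z={z:5.2f}")
```

Under this reading, the same absolute deviation carries less weight as the mean grows, and the same percentage deviation carries more weight as the mean grows, which is the behavior the paragraph above describes.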
During the original development of Cube Analyst, alternative
assumptions about the pdf used to describe data variation were
reviewed (the Log-Normal distribution, for example), but these were
considered only to add complexity, rather than accuracy. It is
usually the case that the Poisson is a good way of describing traffic
and passenger data. The Poisson distribution also has the
considerable merit that it leads to some mathematical relationships
where the role of confidence levels is clearly apparent. In particular,
"Mathematical summary" on page 48 shows an element of the
calculation concerned with calculating the optimum value of the
objective function which has the following general form (see
equation (18) later for details):

$$\lambda_1 \frac{H_1}{h_1} + \lambda_2 \frac{H_2}{h_2} + \lambda_3 \frac{H_3}{h_3} + \ldots$$

The $\lambda_1$, $H_1$, and $h_1$ represent, respectively, the confidence level ($\lambda$),
observed ($H$), and estimated ($h$) values for the first data item;
similarly for the second, third, etc., data items. The form of this
equation is directly attributable to the use of the Poisson pdf;
another pdf, the Normal pdf for example, would give a different
and more complex form.
The significance of equation (18) is two-fold. First, each and every
data item is represented in this equation, that is, each cell of the
prior matrix, each trip end, each screenline count, and so on. Thus,
all items of data are considered together, not in separate
categories. (It is not only equation (18) which shows this; most
significantly, so does equation (3), the objective function, amongst
others.) Second, the data contributes as:
1. A ratio of observed to estimated values
2. A linear combination (that is, simple addition (+)) of data items,
each multiplied (weighted) by its own confidence level

This enables the Cube Analyst user to view confidence levels as
simple weighting factors, even though the derivation of $\lambda$ is
originally from considerations of data sampling, as discussed in the
following section. This would not be the case if a non-Poisson pdf
had been used.
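
The weighting role of confidence levels can be illustrated with a deliberately small case. The sketch below is not Cube Analyst code. It assumes an objective of the form M(h) = -sum(lambda * (H * ln(h) - h)), which is the negative Poisson log-likelihood with constant terms dropped, and it assumes that two observed counts both estimate one shared value h. The names `objective` and `best_estimate` are hypothetical.

```python
import math

def objective(h, observed, confidence):
    """Negative Poisson log-likelihood with constant terms dropped:
    M(h) = -sum(lam * (H * ln(h) - h))."""
    return -sum(lam * (H * math.log(h) - h)
                for H, lam in zip(observed, confidence))

def best_estimate(observed, confidence):
    """Minimizer of M for a single shared estimate h.

    Setting dM/dh = -sum(lam * (H / h - 1)) to zero gives the
    confidence-weighted mean of the observations.
    """
    weighted_sum = sum(lam * H for H, lam in zip(observed, confidence))
    return weighted_sum / sum(confidence)

observed = [1684.0, 1739.0]   # two counts of the same screenline (vph)
for confidence in ([50.0, 50.0], [100.0, 10.0]):
    h_star = best_estimate(observed, confidence)
    print(confidence, round(h_star, 1), round(objective(h_star, observed, confidence), 1))
```

With equal confidences the estimate sits midway between the two counts; raising the confidence of the first count pulls the estimate towards it, which is the sense in which a confidence level acts as a simple weight.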


Optimizer: Finding the minimum value


We have already discussed how Cube Analyst is designed to adjust
the model parameters, from their initial value of 1.0, so that
equation (1) leads to a new value of the estimated trip matrix, which
provides a new set of estimated data values, h.
Equation (3) can then be used to determine if the new estimates are
more likely (more consistent with) the input data, H. Cube
Analyst therefore incorporates a powerful optimizer to amend the
model parameters so that the value of M is minimized as much as
possible. This minimum is defined mathematically by locating the
point at which the gradient of the objective function, with respect
to the set of model parameters, is zero. This
well-known approach to determining minimum or maximum
points is shown in the following figure, which shows in a schematic
fashion how the value of the objective function, M, varies
according to the value of a parameter.

Two-dimensional schematic view of variations in objective function according to model parameter values

It is at this stage, in particular, that Cube Analyst is operating in
"parameter space." The principle is, simply, to adjust each
parameter by an amount (the "step length") and in a "search
direction" (up or down). The optimizer ensures that Cube Analyst
only makes adjustments which improve the situation (that is,
further minimize the objective function, M). Once a set of
(improving) adjustments has been made, the Cube Analyst
optimizer performs another iteration of adjustments to determine
whether more improvements are possible, and so on, until no
further decrease in the (negative) value of the objective function is
possible.


This approach places several requirements on the optimizer:

Efficiency in determining optimum step lengths and directions

Avoidance of "local" minima and location of the "global" minimum (this means being sure that no values of step length and direction could lead to a better result)

Identification of the minimum point when in the neighborhood of one (this means achieving a stable convergence point)

There are several possible approaches to calculating optimum step
lengths and directions. These may be considered to represent a
spectrum characterized, at one end, by methods which use a
simple strategy to define a step length and direction, but spend
more time adjusting these elements through more iterations; at the
other end, the methods spend more effort calculating the optimum
step length and direction, but require fewer iterations.
The direction information is held by Cube Analyst in the gradient
search matrix file; this is also known as the Hessian matrix, as the
gradient search matrix is an approximation for the Hessian. The
degree of approximation depends on the method and certain
aspects of the calculation, notably the proximity to convergence
and the number of iterations since the gradient search was last recomputed (controlled in part by Cube Analyst control parameter
ITERH).
The significance of the Hessian matrix for Cube Analyst is that it
provides a mathematical description of the relationships between
model parameters; indeed the Hessian itself approximates to the
variance-covariance matrix. This can be exploited by the optimizer
to update the direction information in an optimum manner.
Through the Cube Analyst control parameter IHTYPE, the user can
select alternative methods. These are listed below in order of
increasing calculation effort given to the step length and direction:
1. Method of steepest descent
2. Newton's method
3. Quasi-Newton method
4. Method of scoring

The default procedure in Cube Analyst uses a combination of
methods 3 and 4. It starts by using the method of scoring to
calculate an approximation to the Hessian, which requires
considerable computational effort. Further improvements to the
solution are obtained by the quasi-Newton method, which needs
less computation. This method works well and requires very few
iterations if the solution is in the region of the optimum value.
Otherwise the gradient search matrix is recalculated using a
method to determine the exact Hessian matrix, a new step length is
adopted, and the process repeats itself. (If the exact Hessian cannot
be computed, perhaps because the results are still far from a
converged solution, the method of scoring is automatically reapplied.)
As the solution approaches the optimum, the step length is
reduced, allowing the optimum to be located more precisely. A
very small step length indicates a close proximity to the optimum
value and so the search is terminated when the step length is
beneath the threshold defined by Cube Analyst control parameter
UTOL. This is a more practical method of determining when the
calculation should finish than monitoring the gradients
approaching zero.
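
The idea of accepting only improving adjustments, shortening the step as the optimum is approached, and stopping once the step length falls below a tolerance can be sketched as follows. This is a minimal steepest-descent illustration, not the Cube Analyst optimizer: the step-halving rule, the test function, and the parameter name `utol` (used here only as a stand-in for a step-length threshold) are all assumptions.

```python
def minimize(objective, gradient, start, step=1.0, utol=1e-6, max_iterations=1000):
    """Steepest descent that only accepts improving steps.

    The step length is halved whenever a trial step fails to reduce the
    objective, and the search stops when the step length drops below utol.
    """
    x = list(start)
    value = objective(x)
    iterations = 0
    while step >= utol and iterations < max_iterations:
        g = gradient(x)
        trial = [xi - step * gi for xi, gi in zip(x, g)]
        trial_value = objective(trial)
        if trial_value < value:      # improving adjustment: accept it
            x, value = trial, trial_value
        else:                        # no improvement: shorten the step
            step *= 0.5
        iterations += 1
    return x, value, step, iterations

def quadratic(x):
    """Hypothetical two-parameter objective with its minimum at (3, -1)."""
    return (x[0] - 3.0) ** 2 + 2.0 * (x[1] + 1.0) ** 2

def quadratic_gradient(x):
    return [2.0 * (x[0] - 3.0), 4.0 * (x[1] + 1.0)]

solution, value, final_step, iterations = minimize(
    quadratic, quadratic_gradient, [0.0, 0.0])
print(solution, value, final_step, iterations)
```

Terminating on the step length rather than on the gradient means the loop ends cleanly even when the gradient is already zero and no trial step can improve the objective.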


Mathematical summary
This section presents a further explanation of Cube Analyst's
calculations, as given in "Introduction to the mathematics in Cube
Analyst" on page 35.
Topics include:

Maximum likelihood method: Background theory

Application of maximum likelihood to Cube Analyst

Cube Analyst objective function

Cube Analyst trip estimation model

Estimating model parameters

Optimization procedure

Parameter errors

Cell reliability

Maximum likelihood method: Background theory


Maximum likelihood is a standard method of estimating
parameters of mathematical modeling equations, based on sets of
relevant data observations. Given values of the model parameters,
the pdf defines the probability associated with the observed data.
When viewed as a function of the model parameters, the pdf is
called a likelihood function. The values of the parameters which
maximize this function are called maximum likelihood estimates.
They correspond to a model in which the probability of the
observed data is maximized. The estimation process has two
elements: establishing the likelihood function, and
determining the optimum parameter values to maximize it.
Mathematically, the theory may be expressed as:
$$P(X = x) = f(x, \theta)$$ .....(4)
where:
$X$ = random variable
$x$ = observation
$\theta$ = parameter (or function of a parameter)
The likelihood function is then defined to be:
$$L(\mathbf{x}, \theta) = \prod_{i=1}^{n} f(x_i, \theta)$$ .....(5)
where $\mathbf{x} = (x_1, x_2, \ldots, x_n)$, that is, $\mathbf{x}$ is a set of $n$ observations.

The optimization process is to find the value of $\theta$ that maximizes $L(\mathbf{x}, \theta)$.
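
As a concrete illustration of the theory above, the following sketch evaluates a Poisson log-likelihood over a grid of candidate parameter values and picks the maximizing value. It is an assumption-laden example rather than anything taken from Cube Analyst: the four counts are illustrative (the first two echo figures used earlier in this guide), and a simple grid search stands in for a real optimizer.

```python
import math

def log_likelihood(theta, observations):
    """ln L(x, theta): sum of ln f(x_i, theta) for a Poisson pdf f."""
    return sum(x * math.log(theta) - theta - math.lgamma(x + 1)
               for x in observations)

observations = [1684, 1739, 1702, 1655]             # illustrative counts (vph)
candidates = [1600 + 0.5 * k for k in range(401)]   # 1600.0, 1600.5, ..., 1800.0

# The maximum likelihood estimate is the candidate with the largest ln L.
theta_hat = max(candidates, key=lambda t: log_likelihood(t, observations))
sample_mean = sum(observations) / len(observations)
print(theta_hat, sample_mean)
```

For a Poisson pdf the maximum likelihood estimate of the mean coincides with the sample mean, so the grid search and the direct average should agree whenever the mean lies on the grid.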

Application of maximum likelihood to Cube Analyst


In accordance with the above theory, but with a slightly altered
notation, the following are defined:
$H$ = a data item ($= x$ above)
$h$ = an estimated item ($= \theta$ above)

It is assumed that the appropriate pdf is
$$f(H, h) = \frac{(\lambda h)^{\lambda H} e^{-\lambda h}}{(\lambda H)!}$$ .....(6)
where $\lambda$ is called the weighting factor. It can be seen that $\lambda H$ is
a Poisson random variable with mean $\lambda h$. Thus $\lambda$ can be
considered a scaling parameter which defines the time units in the
underlying Poisson process.
A likelihood function may thus be defined as:
$$L(H, h) = \frac{(\lambda h)^{\lambda H} e^{-\lambda h}}{(\lambda H)!}$$ .....(7)
Taking logarithms of equation (7) leads to:
$$\ln L(H, h) = \lambda H \ln(\lambda h) - \lambda h - \ln\left((\lambda H)!\right)$$ .....(8)
It may be noted that $\ln\left((\lambda H)!\right)$ = constant

Referring to equation (5), and considering all data items, H, a
likelihood function may be defined as:
.....(9)
For computational ease, the task of maximizing L may be converted
to the minimization of:
.....(10)
where
.....(11)
Equation (10) therefore represents the general form of the
objective function which is minimized by Cube Analyst.


Cube Analyst objective function


Cube Analyst allows varied data items to be used in the estimation,
that is, H and h may represent different data items, as shown in the
following table:
Observed data, H, and estimated equivalents, h

Observed data value, H | Estimated data value, h | Description
$N_{ij}$ | $T_{ij}$ | Number of trips with origin at zone i and destination at zone j
$O_i$ | $\sum_j T_{ij}$ | Number of trips with origin at zone i
$D_j$ | $\sum_i T_{ij}$ | Number of trips with destination at zone j
$Q_K$ | $\sum_{ij} R_{ijK} T_{ij}$ | Number of trips through screenline K

where: $R_{ijK}$ is the proportion of trips in matrix cell (i, j) using screenline K
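
The estimated equivalents in the table can be computed directly from an estimated trip matrix. The sketch below is illustrative only: the function name `estimated_equivalents`, the nested-list layout of `T` and `R`, and the three-zone figures are assumptions, not Cube Analyst file formats.

```python
def estimated_equivalents(T, R):
    """Return (origins, destinations, screenline_flows) for trip matrix T.

    T[i][j]     estimated trips from zone i to zone j
    R[k][i][j]  proportion of trips in cell (i, j) using screenline k
    """
    n_rows, n_cols = len(T), len(T[0])
    # Trip origins: row totals of the estimated matrix
    origins = [sum(T[i][j] for j in range(n_cols)) for i in range(n_rows)]
    # Trip destinations: column totals of the estimated matrix
    destinations = [sum(T[i][j] for i in range(n_rows)) for j in range(n_cols)]
    # Screenline flows: trips in each cell weighted by the routing proportion
    flows = [
        sum(R_k[i][j] * T[i][j] for i in range(n_rows) for j in range(n_cols))
        for R_k in R
    ]
    return origins, destinations, flows

T = [[0.0, 120.0, 80.0],
     [100.0, 0.0, 60.0],
     [90.0, 40.0, 0.0]]
# One screenline: intercepts all trips between the first and third zones,
# and half of the trips from the first zone to the second.
R = [[[0.0, 0.5, 1.0],
      [0.0, 0.0, 0.0],
      [1.0, 0.0, 0.0]]]

origins, destinations, flows = estimated_equivalents(T, R)
print(origins, destinations, flows)
```

Each of these estimated values would then be compared with its observed counterpart (trip ends and screenline counts) in the objective function.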

Substituting these observed and estimated data items into
equation (10) gives the objective function shown below, with the
source of the data indicated.
For reasons to do with function evaluation, the estimated $t_{ij}$ is
treated as a least squares minimization in the objective function.
The objective function then becomes:
Objective function, M =

Comment
Screenline counts
Trip origins

Trip destinations
Prior matrix
Cost matrix derived

.... (12)

where the summation in the cost-matrix term is over cells which are
zero in the prior matrix, but not the cost matrix.

Cube Analyst trip estimation model


The objective function, equation (12) above, is used to calibrate the
trip estimation model of the form:
.....(13)
where tij = Nij
or

Estimating model parameters


It follows, by differentiation of equation (11):
.....(14)
.....(15)
(Note: undefined for h=0)


The minimum value of the objective function, M, for a parameter
is found when the derivative of M with respect to that parameter is zero.

The remaining steps are to:

1. Calculate the estimated trip matrix using equation (13) and current values of the model parameters.
2. Use the table "Observed data, H, and estimated equivalents, h" on page 51 to calculate h for each set of input and estimated data.
3. Calculate the derivatives, as we show below, for each set of estimated data.

Substitutions for equation (15)

leads to
.....(16)
where
.....(17)
and
.....(18)
Note:

are constants


is undefined if

or

In equation (16) we need to substitute each set of model


parameters for . We start by determining
reproducing Model Equation (13),

for each parameter

.....(13)
where

= constant

or

let
Then differentiating (13) gives:

(for each )

.....(19)

(for each )

.....(20)

(for
( )

) .....(21)
.....(22)

.....(23)
Finally, we substitute (19) to (23) into (16) for each model parameter, and
use an optimization procedure to choose parameter values that
give values of h that minimize the objective function (9).


Optimization procedure
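Given an initial guess $\mathbf{a}^{(0)}$, Cube Analyst computes the maximum
likelihood estimates by generating a sequence of estimates $\mathbf{a}^{(k)}$
from
$$\mathbf{a}^{(k+1)} = \mathbf{a}^{(k)} + \alpha^{(k)} \mathbf{p}^{(k)}$$
where $\alpha^{(k)}$ is a suitable step length, and $\mathbf{p}^{(k)}$ denotes a search
vector given by
$$\mathbf{p}^{(k)} = -\mathbf{B}^{-1} \mathbf{g}$$
For the method of scoring used by Cube Analyst, $\mathbf{B}$ is equal to the
expected value of the Hessian matrix. It may be shown that this can
be represented as
$$\mathbf{B} = E\left[\mathbf{g}\,\mathbf{g}^{T}\right]$$
where $E$ indicates the expected value, and $\mathbf{g} = \nabla M$, which
denotes the gradient vector of the objective function, $M$, with
respect to the model parameters, $\mathbf{a}$.
The $(i, j)$ entry of the matrix $\mathbf{B}$ is given by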

.....(24)
From equation (16) we can write
.....(25)
This leads to

.....(26)


The formulae for

and

are given in equations (19) to (23).

When this matrix is calculated by the quasi-Newton method (as previously
described in "Introduction to the mathematics in Cube Analyst" on
page 35), the Hessian matrix updates the expected value
using the BFGS update formula.

Parameter errors
The optimization produces an estimate of the variance of each
parameter, and an estimate of the parameter value itself.

Therefore,
Standard Error = $\sqrt{\text{variance}}$ .....(27)

and the range within one Standard Error is the parameter estimate plus or minus the Standard Error.
Cell reliability
The sensitivity of the estimate of a matrix cell is defined to be

.....(28)
where M is the objective function, and the matrix in the expression represents a matrix of
second differentials.

Extensions to the calculations


Hierarchic estimation is described in Chapter 7, Hierarchic
Estimation.
Hierarchic estimation calculates two forms of matrix, the district
matrix and a set of local matrices. Apart from the aggregation of
information which is implied by converting a zonal matrix to a
district matrix, the estimation of a district matrix is entirely similar
to a standard estimation. The estimation of the local matrices is,
equally, similar, but it introduces a new set of data, derived from the
district matrix, which are referred to as side constraints.
To understand this side constraint information, we show a local
matrix in a schematic form in the following figure.

Relationship of side constraints with local matrices


The set of side constraint variables, in terms of prior observed (H)
and estimated (h) data, and associated confidence levels, $\lambda$, are:

H | h | $\lambda$
PZTZ | FZTZ = $\sum T_{ij}$ | $\lambda_{PZTZ}$
PZTR | FZTR = $\sum T_{i1}$ | $\lambda_{PZTR}$
PRTZ | FRTZ = $\sum T_{1j}$ | $\lambda_{PRTZ}$

NOTE: The specifications of PZTZ (observed), FZTZ (estimated), etc.,
are indicated in the figure, "Relationship of side constraints with
local matrices."
Note that the corresponding confidence levels, $\lambda_{PZTZ}$, $\lambda_{PZTR}$ and
$\lambda_{PRTZ}$, are all set by the user with Cube Analyst's ZCONF control
parameter.
(The confidence levels for the trip ends applied to the district
matrix are set according to the minimum values of the generation
and attraction trip ends confidence levels found in the trip end file.)
These values of H and h are then substituted in the same manner
as applies to other sets of data represented by H and h.


Data Preparation and Analysis
This chapter focuses on the tasks which the user undertakes as part
of the estimation process. Topics include:

Overview

Matrices

Trip ends

Networks and traffic and passenger counts

Screenlines

Routings

Setting confidence levels

Tuning estimation performance

Control of routing information

Analyzing the results


Overview
There is a series of data preparation tasks, which are discussed in
the following sections. Most of the tasks only require data files to
be created in a relatively mechanistic manner, but two of the tasks
require the user to make considered choices. These are discussed in
"Screenlines" on page 65 and in "Setting confidence levels" on
page 70.
The final sections in this chapter explain the estimation stage in
terms of tasks facing the user. As Cube Analyst usually requires
minimal input from the user, apart from the supply of prepared
data files, the estimation stage is very straightforward. However,
advice is given on possible ways of improving the speed of
estimation. This may be achieved through:

Influencing the strategy used to calculate the Hessian matrix, which is used in the optimization stages of Cube Analyst (see "Tuning estimation performance" on page 73)

Avoiding unnecessary detail in the routing files, which can be burdensome for the data processing elements of Cube Analyst (see "Control of routing information" on page 74)

The final set of activities for the user is to analyze the results to
assess the quality of the estimation, partly to determine if and how
they might need to be improved. This topic is discussed in
"Analyzing the results" on page 75.
The ideas introduced in this chapter are subsequently illustrated in
later chapters with an example application of Cube Analyst, based
on an actual study. Further details on points covered in this section
are provided in the standardized estimation procedures.


Matrices
Cube is used to set or modify individual cells or ranges of cells. This
also permits confidence levels to be easily set to global or
individual values. For example, you can use a prior matrix (Table
101) to give information about basic trip patterns.
Prior matrix (Table 101)
+--------------------------------------------------------+
| TABLE = 101 (Prior)                                    |
|         1    2    3    4    5    6    7    8    9   10 |
+--------------------------------------------------------+
|   1:    1    1    0    5   45  126   50   21   30   55 |
|   2:    1    5    0   70  125   36   38   50   58   14 |
|   3:    1    1    0    2  108  119   90   69  148   44 |
|   4:   69    3    0    1    6    7    6    3   25    3 |
|   5:  100    1    0  192   71   20   12   11   14    7 |
|   6:   36    2    0   88   52    6    3    7   16   13 |
|   7:   62    3    0   32   36   58    9   63    9   61 |
|   8:    0    1    0   64   65   30  119   19  121   64 |
|   9:    0    7    0   57  123   70  178  279    7   38 |
|  10:    0   10    0    7   31    3    1   10   21    3 |
|  11:    0   13    0   19   35    4   96  170   28   29 |
|  12:    0    5    0   41  286   52  103  117   29   56 |
|  13:    0    9    0   24   99   50   90   91   23   12 |
|  14:    4    3   14   20   56   19   67   58   21    7 |
|  15:   28    2   36    1  185    1    1    2   15    1 |
+--------------------------------------------------------+


You can use an associated confidence matrix (Table 102) to
discriminate between data reliability for different groups of
movements.
Confidence levels (Table 102)
+--------------------------------------------------------+
| TABLE = 102 (Confidences)                              |
|         1    2    3    4    5    6    7    8    9   10 |
+--------------------------------------------------------+
|   1:   20   20   20   20   40   40   20   20   20   20 |
|   2:   20   20   20   20   40   40   20   20   20   20 |
|   3:   20   20   20   20   40   40   20   20   20   20 |
|   4:   20   20   20   20   40   40   20   20   20   20 |
|   5:   40   40   40   40   40   40   40   40   40   40 |
+--------------------------------------------------------+

Intrazonals can be included in the matrix. Note that because
routings only cover inter-zonal trips, the intrazonals will not be
affected by the screenline counts; they affect only the trip
ends. As their role is limited, there is a case for omitting
intrazonals from the estimation. Note that if intrazonals are
included in the trip ends, then they should also be included in the
matrix. If the trip ends do not include intrazonals, the intrazonal
cells of the input matrices should be zero.
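
The consistency rule for intrazonals can be sketched in a few lines. The example below is an illustration with made-up figures, not a Cube procedure: the helper names `zero_intrazonals` and `trip_ends` are hypothetical, and the matrix is held as a plain nested list.

```python
def zero_intrazonals(matrix):
    """Return a copy of a square trip matrix with the diagonal cells set to zero."""
    return [[0 if i == j else value for j, value in enumerate(row)]
            for i, row in enumerate(matrix)]

def trip_ends(matrix):
    """Row totals (origins) and column totals (destinations) of a matrix."""
    origins = [sum(row) for row in matrix]
    destinations = [sum(column) for column in zip(*matrix)]
    return origins, destinations

prior = [[12, 45, 30],
         [50, 8, 21],
         [36, 14, 5]]

# If the trip ends exclude intrazonal trips, zero the diagonal first so that
# the matrix and the trip ends describe the same set of movements.
prior_no_intrazonals = zero_intrazonals(prior)
origins, destinations = trip_ends(prior_no_intrazonals)
print(prior_no_intrazonals, origins, destinations)
```

Deriving the trip ends from the already-zeroed matrix keeps the two inputs consistent by construction.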


Trip ends
Trip ends may be determined by reference to an existing
matrix, from surveys (for example, of parking), or by calculation
from equations.


Networks and traffic and passenger counts


Cube is used for preparing networks. Traffic and passenger counts,
together with confidence level information, are input into the
volume field storage areas associated with each link.


Screenlines
Screenlines are used to minimize the effects of assignment errors.
Screenlines are defined as the set of count sites which intercept
traffic/passenger flows between sets of zones which share the
same general corridors of movement (across which the screenlines
are suitably located).
The extent of a screenline is determined by the number of
alternative (reasonable) paths which are available. In many public
transport networks where services are sparse, or in rural highway
networks, there may only be a single reasonable route between
one general area and another. In this case, screenlines may
correspond to single links (although they are still treated as
screenlines in this context of Cube Analyst). In general, however, a
screenline will represent a set of links.
In the case of highways, a useful type of screenline is provided by a
river or a railway line, that has only a few crossing points. In this
case all traffic must be routed through known points, and so
assignment error associated with the screenline will be minimized.
For Cube Analyst, there is no difference between a group of traffic
counts on separate links (that form a logical screenline) and a single
link count amalgamating the flows on separate traffic lanes.
There will normally be few, if any, screenlines that entirely bisect a
study area and so intercept all trips either side of it. Cube Analyst
therefore employs the concept of partial screenlines. They are
"partial" in the sense that they do not extend between the
boundaries of a study area, but they intercept all trips between, at
least, certain defined pairs of zones.
The method for defining such partial screenlines is manual, and
based partly on judgement and the availability of count data sites.
The routing information, together with user-defined screenlines, is
used to define the set of O-D pairs whose routes they intercept. The
aim is to group count sites into screenlines that balance the
objectives to:


1. Maximize the number of O-D pairs that have all routes passing through a screenline.
2. Minimize the number of O-D pairs per screenline, as this maximizes the information value of the counts for the corresponding matrix cells.
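
Both objectives can be checked for a candidate screenline once routing information is available. The sketch below uses an assumed in-memory representation (a dictionary of O-D pairs, each with route probabilities and link lists) rather than any actual path or RCP file format; the function name `screenline_proportions` and the link labels are hypothetical.

```python
def screenline_proportions(routes, screenline_links):
    """For each O-D pair, sum the probabilities of routes crossing the screenline.

    routes: {(origin, destination): [(probability, [link, ...]), ...]}
    screenline_links: set of links (count sites) forming one screenline
    """
    proportions = {}
    for od_pair, alternatives in routes.items():
        proportions[od_pair] = sum(
            probability
            for probability, links in alternatives
            if screenline_links.intersection(links)
        )
    return proportions

routes = {
    (1, 2): [(0.6, ["A", "B"]), (0.4, ["C", "D"])],
    (1, 3): [(0.7, ["A", "E"]), (0.3, ["F"])],
    (2, 3): [(1.0, ["G"])],
}
screenline = {"B", "D"}   # two count sites on parallel routes

proportions = screenline_proportions(routes, screenline)
# O-D pairs with all routes passing through the screenline
fully_intercepted = [od for od, p in proportions.items() if abs(p - 1.0) < 1e-9]
print(proportions, fully_intercepted)
```

A screenline that fully intercepts a small number of O-D pairs, as here, gives each count a clear meaning for the corresponding matrix cells.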
The following figure shows an example of screenlines for a
typical urban area.

Typical screenline configuration for an urban area


Features that these screenline locations demonstrate are shown in
the following table.

Screenline location | Function
Northern | Screenline over a single link (for example, a bridge) intercepts all traffic to and from the North.
Western | Parallel, alternative routes from the West require a single screenline intercepting both routes for this corridor.
Southern Ring Road | Non-radial traffic is intercepted by (two) screenlines on orbital road.
Eastern | Similar parallel routes for long distance traffic to Western side, but parallel routes for local traffic require additional, shorter screenline. Note use of count location in more than one screenline.
Central Area | Detailed movements in centre intercepted with several short screenlines.


Routings
Matrix estimation requires information about which routes are
used to connect each pair of origin and destination zones, and the
probability that each route is used. Ideally this would come from
survey information, but this is onerous and not very practical, so
the method uses modeling instead. This routing information is one
of the outputs from the assignment process. For TRIPS users it is
stored in the route choice probability (RCP) file. For Cube Voyager
Highway users, it is stored in the Cube Voyager path file. For Cube
Voyager Public Transport users, it is stored in route files.
This section discusses two types of routings:

Highways

Public transport

Highways
The main requirement for Cube Analyst is for the routings to reflect
all reasonable alternative paths, without spreading so widely
that they become unrealistic.
For Cube Voyager users, the paths reflected in the Intercept file
derive from combining the all-or-nothing paths from each
assignment iteration into one set. This can be done directly in the
HIGHWAY program. Alternatively, HIGHWAY can be used to
generate a path file, and the appropriate path sets and volumes
selected from it for use in Cube Analyst.
TRIPS users could use a similar approach, or apply one of the
stochastic methods. When considering networks where congestion
is a factor, the assignment itself relies on the trip matrix that the
estimation is trying to provide. Hence it may be preferable to apply
routes derived using methods that can calculate multiple routes
between zones based on stochastic (statistical) methods, rather
than to rely on the paths from a capacity-restrained assignment.
TRIPS supports two such methods, known after their originators as
Burrell and Dial. Both methods can be used successfully with Cube
Analyst, but Burrell can have limitations in large networks when
routes traverse large numbers of links. In this case, the central limit
theorem of statistics means that the chances of routes having the
same cost for a different set of randomized link costs (which is the
approach used in Burrell) become higher the more links there are
on an average route. The consequence of this is that it is more
difficult to generate varied routes. (It can be noted, in passing, that
the length of routes in terms of distance is not a problem for the
implementation of Burrell used in TRIPS.) The Dial method is not
subject to this effect concerning routes with many links, so it is the
approach that is advised. Note that in cases where estimation is
being used to update a matrix that is not anticipated to have
changed by very much (for instance, because it was obtained from a
relatively recent survey), the RCP file from an existing
converged capacity restraint assignment may be used in
preference to Dial. The choice here is a matter of judgement on
the relative accuracies of the RCP information.

Public transport
Cube Voyager PT outputs to route files by user class. Many controls
affect the routing, but a factors file provides a means to determine
the extent of multirouting. TRIPS automatically produces
multiroute paths and can also store them in an RCP file. The
determination of which links are used to connect pairs of
origin-destination zones is a function of a path building algorithm which
generates a set of reasonable paths. These are based on
considerations of generalized cost, which reflect the user's data about
transit times, fares, boarding and transfer penalties, and so on. A
submode split model can be used to reflect passenger biases when
deciding if different modes (bus, metro, rail, etc.) are candidates for
inclusion into the set of reasonable paths.


Setting confidence levels


Mathematically, confidence levels have the dual facets of being
sampling rates and weighting factors. Confidence levels are
entered as percentages but, from both points of view, values of
greater than 100 are legitimate.
This section discusses:

Characteristics of the data

Deciding on confidence values

Characteristics of the data


The ability of a confidence level to help match an estimated data
item (trip end, screenline flow, matrix cell) to its corresponding
observed value is influenced by:
1. Data consistency

If data is consistent and free of errors, then the confidence
levels will have no influence as they, essentially, help to
mediate between different estimates implied by different data
items. Conversely, more discrepancies within the data increase
the importance of confidence levels.

2. Data quantity

As all data is present in the objective function (see "Maximum
likelihood objective function" on page 40), the quantity of data
is influential, besides the confidence levels. This means that, for
example, relatively large confidence levels applied to the prior
matrix, which has many data elements, will tend to restrict the
scope of a few count sites to influence the estimated matrix to a
significant degree. Of course, this may be the desired effect in
some circumstances.


An improved match with any data item can always be achieved
with an arbitrarily large confidence level, but it will normally be
necessary for users to check the appropriateness of confidence
levels that are input.

Deciding on confidence values


A practical approach to setting confidence levels is often to
establish a dataset as a reference benchmark, and then set the
confidence levels of other data relative to this. For example, if a
program of automatic counting means that traffic counts are well
and recently observed, then these may be given a high confidence
level, say 100, and confidences for other data set relative to that
value.
Note that an implied range of 1 - 100 (or of that order of
magnitude) has been found to be suitable for many studies. Large
applications (say, of 500 zones or more) will tend to encounter a
greater range of absolute data values, which can imply the need for
a wider range of confidence levels (see the discussion above). The
need for this is suitably assessed by means of sensitivity analysis on
the confidence levels.
Some general observations applying to confidence levels for
different categories of data are given below, in descending order of
magnitude of confidence levels for most applications:

At least some count sites should have observations made over


several days (weeks, etc.) to determine basic levels of variability
associated with single observations.

Count confidences should be set with respect to the time


period applying to the estimated matrix (for example, a series
of counts made on Tuesdays is only a partial observation if the
matrix is to correspond to an average working day).

In the case of highways, trip end confidences are unlikely to


exceed count confidences, and will usually be less due to
observational difficulties; in the case of public transport, the
two sets of confidences are more likely to be similar.

Cube Analyst Reference Guide 71

Data Preparation and Analysis


Setting confidence levels

72

Even when trip ends have been determined simply from the
row and column totals of the prior matrix, the aggregation of
the data means that the trip end confidences will be higher
than the corresponding individual cell confidences. For this
reason, trip end data should always be used when a prior
matrix is input.

Prior matrix cells are, individually, unlikely to have high
confidences even when collected by recent, good surveys
because there are so many elements of the matrix. This
becomes truer as the number of study area zones increases
(due to the difficulty of observing all possible movements
adequately).

Cost matrix data may be obtained reasonably reliably, but the
relevant confidence concerns the use of this data for trip
estimation and this normally only offers an approximation.
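
The observation above about deriving trip ends from the row and
column totals of the prior matrix can be illustrated with a minimal
sketch. The matrix values and the function name trip_ends are
hypothetical and are not part of Cube Analyst; the sketch shows only
the aggregation, not how confidences are assigned.

```python
# Hypothetical 3-zone prior matrix: prior[o][d] = trips from origin o to destination d.
prior = [
    [0.0, 120.0, 80.0],
    [150.0, 0.0, 60.0],
    [90.0, 40.0, 0.0],
]


def trip_ends(matrix):
    """Return (generations, attractions) as the row and column totals."""
    generations = [sum(row) for row in matrix]
    attractions = [sum(column) for column in zip(*matrix)]
    return generations, attractions


generations, attractions = trip_ends(prior)
print("Generations:", generations)
print("Attractions:", attractions)
```

Each trip end total pools several cells, which is why the text
expects trip end confidences to be higher than those of individual
cells.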

Tuning estimation performance


In general, Cube Analyst should be run with default parameter
settings. In the majority of cases this will lead to a converged
solution within a reasonable number of iterations.
In some cases an excessive number of iterations may be required, or
Cube Analyst may be unable to find a converged solution. In the
latter case Cube Analyst will report that it has halted optimization
for a reason such as "No further progress possible - linear search
failed", rather than the successful message "Convergence detected".
Such a message is usually caused by excessively inconsistent input
data, which pulls the optimizer in opposite directions to the extent
that no solution can be found.
To correct this, the user normally needs to check the input data.
However, Cube Analyst does provide an extra control in the form of
the parameter ITERH. This determines how often, in iterations, the
Hessian matrix is calculated (see "Optimizer: Finding the minimum
value" on page 44); the Hessian directs the optimizer towards the
solution. Although this calculation is time-consuming, it will
result in the optimizer converging in significantly fewer
iterations. For unconverged problems, recalculation of the Hessian
may provide the direction which the optimizer needs to find a
solution. For example, if a problem was halted after 58 iterations,
try setting ITERH=50 to see if a new Hessian will allow the
optimizer to converge.
In most cases, recalculation of the Hessian matrix will result in
longer run times. In particular, time will be wasted if ITERH is set
to low values such as 40 or less. Cube Analyst will itself determine
a suitable value for ITERH; users are recommended to set ITERH only
when attempting to solve convergence problems, which are encountered
only exceptionally.

Control of routing information


For many estimation runs, the production of the O-D intercepts for
screenlines and/or part-trip data takes as much time as the actual
estimation itself, or even more. Cube Analyst needs only the
reasonable paths, so controlling the routing to avoid producing
routes used by only a small proportion of trips is an important
aspect of achieving practical run times for the estimation. This is
particularly the case for public transport, which can often supply a
huge variety of routes. For large models, producing the intercepts
could take an excessive time to complete, possibly an order of
magnitude longer than when parameters are given appropriate
settings. Too many routes can also result in file sizes becoming too
large for practical use.
Routing information can be supplied to Cube Analyst in the form of
a TRIPS RCP file or a Cube Voyager path file. Cube Voyager can also
supply an intercept file via the Highway and PT programs. If an
intercept file is not input, then before starting the estimation
proper, Cube Analyst analyzes the routes through screenlines
and/or part-trip links to produce the intercepts, which it saves in
an ICP file. It is important to note that this intercept file can be
input back into subsequent estimations as long as the links of the
screenlines and/or part-trip data are not modified. This is achieved
by setting option INTCPT=T or WARMST=T as appropriate and will
result in a considerable time saving.

Analyzing the results


Cube Analyst produces its results as a set of tabulations for
printing or viewing, and as a set of files which may be subject to
further analysis; one of these files is the estimated matrix itself.
The tabulations in Cube Analyst's printout are ordered as follows,
after the standard header information:
1. Summary of input data characteristics, showing:

Data types used in the estimation

Average confidence levels, and their ranges

Number of data elements for each type of data

This information indicates the relative weighting of data in
the estimation process, which is important to know when
assessing the results.
2. A summary of the values of key indicators from the last five
iterations before the optimization halted. The indicators, and
their values, are the same as Cube Analyst outputs to the screen
during the course of its calculations. They are:

Iteration number

Step size

Value of the objective function

Estimated matrix total number of trips

The reason for halting is also shown, which will normally be
"Convergence detected".
This information is mainly provided for confirmation that the
estimation calculations operated in an appropriate manner (for
example, that the objective function value never increased).
These two elements of Cube Analyst's printout are shown in
"Results of estimation - including part trip data" on page 89
(and in an abbreviated form in "Confidence and convergence
summary" on page 83).

3. The remainder of Cube Analyst's tabulations are concerned
with comparisons between the user's input data and the
corresponding values derived from the estimated matrix.
Comparative information is output, when applicable, for:

Trip matrix totals

Part-trip data

Total trip generations from zones

Total trip attractions to zones

Screenline flow counts

The general pattern of this comparative information from Cube
Analyst is shown in "Results of estimation - including part trip
data" on page 90 ("Trip end comparison of prior (observed) and
estimated values" on page 84 and "Screenline comparison of
prior (observed) and estimated values" on page 85 contain this
information in a slightly altered format).
"Results of estimation - including part trip data" on page 89
and "Results of estimation - including part trip data" on
page 90 illustrate the case for Cube Analyst including part-trip
data. Hierarchic estimation output conforms to this same basic
pattern, but extra information is provided, as explained in
Chapter 7, "Hierarchic Estimation" and illustrated in Figures
8.12a - 8.12d.
As a rule, the user will be looking for good correspondences
between input data and estimated results. However, it is
important to note that a poor comparison between input and
estimated information is not, by itself, a sign of a poor quality
estimation. The reason is (or should be) that a data item with a
higher confidence level is dominating the estimation with
respect to data which is also relevant, but which has a lower
confidence level.
The approach to analyzing Cube Analyst's comparative results
is, therefore, to identify data which has not been matched well
in the estimation and to determine which other data might
be causing the discrepancy. Often this is straightforward, for
example, a screenline flow count with a markedly different
value from trip end values for adjacent zones. If the discrepancy
seems unwarranted then this may be a cause to review either
the data values themselves, or their confidence levels. (One
cause of discrepancies which may not be immediately
apparent is poor routing information, for example, on account
of inappropriate generalized cost parameters.)

Estimation Process

This chapter discusses the estimation process in the Cube Analyst
application. Topics include:

Study area

Data

Estimating the matrix

Evaluation: Sensitivity analysis

Including part-trip data

Study area
This section discusses a highways-based application of Cube
Analyst for an 82-zone study area for the town of Guildford in
Surrey, UK (population 100,000). The network shown has a major
bypass for the town, which is shown as a thicker line. Zone centroid
connectors are shown as pale blue lines. Eleven zones were
designated cordon-crossing zones at the study area boundary.

Guildford highway network

Data
The network was well provided with current traffic counts and these
were all given a confidence level of 80, which served as a
benchmark for other data confidences. Most of the trip end data
was synthesized, by disaggregation of UK Department of Transport
data with reference to zonal population and employment figures,
and was given confidence levels of 40. Higher confidence values of
80 were set for external trip ends, determined from a cordon
crossing survey, and for a set of five zones in the town center area
that were the subject of a car park survey. An out-of-date trip
matrix existed, which served as the prior matrix, and which was
given a uniformly low confidence of 5 for each cell. Sixteen
screenlines were defined, which are shown in the following figure.

Screenlines for Guildford

MVHWAY in TRIPS was used to calculate three sets of Burrell paths.
The degree of randomization was controlled by setting the SPREAD
parameter to 25, a relatively low value selected after viewing paths
for different values using MVGRAF and using local knowledge of
the network. MVHWAY was also used to prepare a cost matrix based
on minimum cost routes.
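
The confidence settings described in this section can be
cross-checked against the average trip end confidence of 47.8 that
Cube Analyst reports for the first estimation. The sketch below
assumes that the eleven cordon-crossing zones are the zones with
surveyed external trip ends, and that they do not overlap with the
five town center zones; both points are inferences from the text
rather than statements in it.

```python
ZONES = 82                    # zones in the Guildford study area
SURVEYED_ZONES = 11 + 5       # assumed: cordon-crossing zones + car park survey zones
SURVEYED_CONFIDENCE = 80.0    # confidence for surveyed trip ends
SYNTHESIZED_CONFIDENCE = 40.0  # confidence for synthesized trip ends

synthesized_zones = ZONES - SURVEYED_ZONES
average_confidence = (
    SURVEYED_ZONES * SURVEYED_CONFIDENCE
    + synthesized_zones * SYNTHESIZED_CONFIDENCE
) / ZONES

# A full 82 x 82 prior matrix has one confidence value per cell.
prior_matrix_cells = ZONES * ZONES

print(f"Average trip end confidence: {average_confidence:.1f}")
print(f"Prior matrix cells: {prior_matrix_cells}")
```

Under these assumptions the result agrees with the reported average
of 47.8 and with the 6724 trip matrix elements listed in the
confidence summary.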

Estimating the matrix


Cube Analyst offers a number of controls on the calculation process
and convergence criteria, but these were left to take default values
and the process of running Cube Analyst itself was entirely
straightforward. However, a series of estimation runs was
undertaken, as described below.
The results of the first estimation are shown below as extracts of
the Cube Analyst printed reports, from which a number of
observations can be made.

Because each data item enters the objective function, the
number of elements associated with each different type of data
is significant, as well as their confidence levels.
Confidence and convergence summary

AVERAGE CONFIDENCE LEVELS (EXCLUDING ZERO VALUES)

                                    Average  Maximum  Minimum  Number of
                                                               Elements
Trip matrix confidence levels           5.0      5.0      5.0       6724
Screen line confidence levels          80.0     80.0     80.0         16
Trip end (dest) confidence levels      47.8     80.0     40.0         82
Trip end (orig) confidence levels      47.8     80.0     40.0         82

Optimisation halted because:
Convergence detected

The optimizer adjusts model parameter values and evaluates
the resulting cell estimations in a series of iterations. The
mathematics of the optimizer implies that it will converge to a
solution in a number of iterations which is less than the number
of model parameters.

Trip end comparison of prior (observed) and estimated values


MVESTM with Counts, Input Prior Matrix and Trip Ends Only
REPORTING OBSERVED/ESTIMATED GENERATIONS AND ATTRACTIONS

                      GENERATIONS                         ATTRACTIONS
ZONE NO      OBS.      EST.  OBS-EST        %      OBS.      EST.  OBS-EST      %
      1    4869.0    4324.3    544.7    11.2%    3657.0    3591.6     65.4    1.8
      2    3825.0    3745.0     80.0     2.1%    2984.0    3571.1   -587.1  -19.7
      3    1798.0    2559.5   -761.5   -42.4%    5715.0    5710.1      4.9    0.1
      4     419.0     383.2     35.8     8.5%     558.0     528.3     29.7    5.3
      5    1256.0    1572.5   -316.5   -25.2%    2018.0    2156.1   -138.1   -6.8
      6    2045.0    1731.1    313.9    15.4%    2084.0    1998.6     85.4    4.1
      7    1935.0    1815.4    119.6     6.2%    2112.0    2194.3    -82.3   -3.9
      8    1794.0    1894.8   -100.8    -5.6%    2673.0    2815.2   -142.2   -5.3
      9    3662.0    3364.9    297.1     8.1%    4763.0    4247.7    515.3   10.8
     10     430.0     388.9     41.1     9.5%     273.0     307.1    -34.1  -12.5
Some missing....
     30    3870.0    3176.5    693.5    17.9%    2370.0    2375.0     -5.0   -0.2
     31    2778.0    2618.2    159.8     5.8%    1304.0    1616.1   -312.1  -23.9
     32    5450.0    4633.8    816.2    15.0%    3257.0    3175.7     81.3    2.5
     33    2943.0    2741.1    201.9     6.9%    3006.0    2807.4    198.6    6.6
     34     736.0     806.5    -70.5    -9.6%    1151.0    1107.2     43.8    3.8
     35     368.0     785.9   -417.9  -113.5%     930.0     909.7     20.3    2.2
     36    4042.0    4062.2    -20.2    -0.5%    1523.0    1570.5    -47.5   -3.1
     37    1821.0    1964.4   -143.4    -7.9%    2026.0    2083.9    -57.9   -2.9
     38    4719.0    4763.3    -44.3    -0.9%    2683.0    2326.7    356.3   13.3
     39    3116.0    3440.8   -324.8   -10.4%    6410.0    6234.9    175.1    2.7
     40    3030.0    3369.2   -339.2   -11.2%    5227.0    6016.3   -789.3  -15.1
Some missing....
     70    1829.0    1639.9    189.1    10.3%    1251.0    1214.6     36.4    2.9
     71    1089.0    1160.4    -71.4    -6.6%    1298.0    1364.2    -66.2   -5.1
     72    4396.0    4122.0    274.0     6.2%    4226.0    3952.8    273.2    6.5
     73   10600.0   11231.3   -631.3    -6.0%   11100.0   11146.4    -46.4   -0.4
     74    6950.0    6931.0     19.0     0.3%    5806.0    5720.2     85.8    1.5
     75    9200.0    9605.6   -405.6    -4.4%    9200.0    9384.8   -184.8   -2.0
     76   14423.0   15045.9   -622.9    -4.3%   14313.0   14109.6    203.4    1.4
     77    1008.0     824.4    183.6    18.2%     722.0     655.4     66.6    9.2
     78    2270.0    2217.8     52.2     2.3%    2270.0    2236.2     33.8    1.5
     79    5665.0    5396.6    268.4     4.7%    5665.0    5465.7    199.3    3.5
     80   26660.0   26727.9    -67.9    -0.3%   28912.0   27872.0   1040.0    3.6
     81    5310.0    5258.3     51.7     1.0%    5990.0    5940.6     49.4    0.8
     82    6033.0    6390.8   -357.8    -5.9%    6085.0    6601.1   -516.1   -8.5

Cube Analyst prints basic comparisons of input and estimated
data for:

Trip ends ("Trip end comparison of prior (observed) and
estimated values")

Screenline inputs ("Screenline comparison of prior
(observed) and estimated values")

This information must be interpreted with care, as a difference
may be a good feature, indicating that some other, more
reliable information has determined the estimated result.
Screenline comparison of prior (observed) and estimated values
MVESTM with Counts, Input Prior Matrix and Trip Ends Only
REPORTING OBSERVED/ESTIMATED SCREEN LINE COUNTS

SCRLINE NO    OBSERVED   ESTIMATED    OBS-ESTM        %
         1     11677.0     11301.1       375.9     3.2%
         2     11677.0     11925.8      -248.8    -2.1%
         3     27947.0     26234.4      1712.6     6.1%
         4     25504.0     25213.3       290.7     1.1%
         5     28539.0     31075.9     -2536.9    -8.9%
         6     28431.0     30261.4     -1830.4    -6.4%
         7     18981.0     15441.2      3539.8    18.6%
         8     18809.0     18445.5       363.5     1.9%
         9     24000.0     23770.1       229.9     1.0%
        10     24435.0     23585.0       850.0     3.5%
        11      7225.0      7635.8      -410.8    -5.7%
        12      7225.0      8479.7     -1254.7   -17.4%
        13     16285.0     16367.7       -82.7    -0.5%
        14     22670.0     23883.7     -1213.7    -5.4%
        15      6261.0      6511.4      -250.4    -4.0%
        16      6022.0      6886.0      -864.0   -14.3%
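
The comparison columns in this report can be reproduced with a few
lines of code, which is convenient when screening a long report for
items that deserve a closer look. The sketch below uses four of the
screenlines listed above. The 10% review threshold and the names
compare and REVIEW_THRESHOLD_PCT are hypothetical choices for
illustration; Cube Analyst does not apply such a threshold, and a
large difference is not in itself a sign of a poor estimation.

```python
# (observed, estimated) counts for a selection of the screenlines reported above.
screenlines = {
    7: (18981.0, 15441.2),
    12: (7225.0, 8479.7),
    13: (16285.0, 16367.7),
    16: (6022.0, 6886.0),
}

REVIEW_THRESHOLD_PCT = 10.0  # hypothetical screening threshold, not a Cube Analyst setting


def compare(observed, estimated):
    """Return (OBS-ESTM, percentage difference relative to the observed value)."""
    difference = observed - estimated
    return difference, 100.0 * difference / observed


flagged = []
for number, (observed, estimated) in sorted(screenlines.items()):
    difference, percent = compare(observed, estimated)
    needs_review = abs(percent) > REVIEW_THRESHOLD_PCT
    if needs_review:
        flagged.append(number)
    note = "  <- review data and confidences" if needs_review else ""
    print(f"{number:>3} {observed:>10.1f} {estimated:>10.1f} "
          f"{difference:>9.1f} {percent:>6.1f}%{note}")
```

Flagged items are candidates for the kind of review described in
the text: checking which higher-confidence data may be pulling the
estimate away from the observation.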

Evaluation: Sensitivity analysis


The estimated matrix was also evaluated by examining how
sensitive the results were to changes in the input data:

Alterations in confidence levels: The effect of assumptions in
setting confidence levels was tested by increasing the
confidence levels from 80 to 200 on two screenlines for the
major traffic-carrying road (the town bypass).
Using the previously calculated model parameters and gradient
search (Hessian) matrix, the re-estimation, in this case, required
only six iterations. The differences between observed and
estimated screenline counts were correspondingly improved.
               Flow differences    Flow % differences
Screenline        (i)     (ii)        (i)     (ii)
Before (80)      1713      291        6.1      1.1
After (200)      1094      219        3.9      0.9

Elsewhere, other screenlines were marginally affected, both
better and worse, apart from one screenline where the
improvement was much more noticeable.
In general, the results suggested that small changes in
confidence levels were not significant, but that improvements
were obtainable where it was possible to refine values of
confidence levels rationally.
Matches of estimated and input data can always be improved
for individual data items by increasing the corresponding
confidence, but this will only have a net improvement on the
estimated matrix when it does not exacerbate data
inconsistencies.
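
The percentage differences in the before and after comparison can
be recomputed from the flow differences. The sketch below assumes
that screenlines (i) and (ii) are screenlines 3 and 4 of the
screenline comparison report, with observed counts of 27947 and
25504; this is inferred from the matching "before" values and is not
stated in the text.

```python
# Assumed observed counts for screenlines (i) and (ii) (screenlines 3 and 4).
observed = {"(i)": 27947.0, "(ii)": 25504.0}

# Absolute flow differences before and after raising the confidence from 80 to 200.
flow_differences = {
    "Before (80)": {"(i)": 1713.0, "(ii)": 291.0},
    "After (200)": {"(i)": 1094.0, "(ii)": 219.0},
}


def percent_difference(difference, observed_count):
    """Express a flow difference as a percentage of the observed count."""
    return round(100.0 * difference / observed_count, 1)


percent_table = {
    run: {line: percent_difference(diff, observed[line]) for line, diff in diffs.items()}
    for run, diffs in flow_differences.items()
}

for run, row in percent_table.items():
    print(run, row)
```

Under this assumption the recomputed percentages agree with the
tabulated values.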

Including part-trip data


The original estimation of the Guildford matrix was later updated
using a set of data which corresponded to a license plate match
survey taken around the center of the town. The data was
preprocessed and converted into a set of link flows, as illustrated
in the following figure in terms of bandwidths; this also serves to
indicate the extent of the survey.

Part-trip data shown as link flows, using bandwidths

The estimation was re-run, now incorporating the following sets of
information:

Prior matrix

Trip ends

Link counts

Part-trip data

The figure "Part-trip data and link counts" shows the two sets of
link flow information which were used: link counts, shown as open
bandwidths, and part-trip data, shown as previously in the figure
"Part-trip data shown as link flows, using bandwidths" on page 87.
It may be noted that some links had both link count data and
part-trip data. In this application, the confidence levels for link
counts were set higher, at 80 or more, than those for part-trip
data, which were set at 60 in recognition of the sampling process
inherent in license plate surveys.

Part-trip data and link counts


Results of new estimation

Extracts of the Cube Analyst results of the new estimation are shown
in "Results of estimation - including part trip data" on page 89 and
"Results of estimation - including part trip data" on page 90. These
are similar to those presented in "Estimating the matrix" on
page 83, but with additional information concerning part-trip data,
and with some differences of presentation format. From "Results of
estimation - including part trip data" on page 90, it may be noted
that the estimated part-trip flows match the overall number of
observed part trips, in this case, to within 1.9%. This, of course,
partly reflects their relatively high confidence levels and number
of elements, which are reported near the top in "Results of
estimation - including part trip data". The number of elements for
part-trip data is the number of (one-way) links with part-trip data.
The figures "Part-trip data shown as link flows, using bandwidths"
on page 87 and "Part-trip data and link counts" on page 88 in fact
show estimated and observed part-trip data respectively, but the
difference is too small to make clear graphically in this particular
application. It is therefore useful to view the correspondence as a
tabulation. This report is shown in "Report on observed and
estimated part trip data" on page 92, which is headed by a key
explaining the storage of information in volume fields.
Results of estimation - including part trip data

AVERAGE CONFIDENCE LEVELS (EXCLUDING ZERO VALUES)
-------------------------------------------------
                                    Average  Maximum  Minimum  Number of
                                                               Elements
Trip matrix confidence levels           5.0      5.0      5.0       6642
Screen line confidence levels          95.0    200.0     80.0         16
Trip end (dest) confidence levels      47.8     80.0     40.0         82
Trip end (orig) confidence levels      47.8     80.0     40.0         82
Part Trip confidence levels            60.0     60.0     60.0        226

SUMMARY OF FINAL FIVE ITERATIONS
--------------------------------
Iteration   Stepsize                 Objective      Matrix
            (Tolerance=0.00010)      Value          Total
       34   0.0003559              -8859208.83    229655.8
       35   0.0001208              -8859208.83    229656.0
       36   0.0001890              -8859208.83    229655.9
       37   0.0001123              -8859208.83    229655.9
       38   0.0000580              -8859208.83    229655.9

Optimisation halted after 38 iterations because:
Convergence detected
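
The iteration summary above can be checked for the behavior that the
guide tells users to look for, namely that the objective function
value never increased. The sketch below also tests whether the final
step size is below the printed tolerance; treating that as the
convergence criterion is an assumption based on the report layout,
not a documented rule, and the function names are hypothetical.

```python
TOLERANCE = 0.00010  # as printed in the report header

# (iteration, step size, objective value, matrix total) from the report above.
final_iterations = [
    (34, 0.0003559, -8859208.83, 229655.8),
    (35, 0.0001208, -8859208.83, 229656.0),
    (36, 0.0001890, -8859208.83, 229655.9),
    (37, 0.0001123, -8859208.83, 229655.9),
    (38, 0.0000580, -8859208.83, 229655.9),
]


def objective_never_increased(rows):
    """True if each objective value is no greater than the one before it."""
    values = [row[2] for row in rows]
    return all(later <= earlier for earlier, later in zip(values, values[1:]))


def last_step_within_tolerance(rows, tolerance):
    """True if the step size of the final iteration is below the tolerance."""
    return rows[-1][1] < tolerance


print("Objective never increased:", objective_never_increased(final_iterations))
print("Final step below tolerance:",
      last_step_within_tolerance(final_iterations, TOLERANCE))
```

In this run only the step size at iteration 38 falls below the
tolerance, which is consistent with the optimization halting there.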

Results of estimation - including part trip data

REPORTING PRIOR/ESTIMATED MATRIX TOTALS
CONFIDENCE     PRIOR  ESTIMATED  ESTM-PRIOR  (ESTM-PRIOR)/PRIOR(%)
       5.0  238498.0   229655.9     -8842.1                  -3.7%

REPORTING OBSERVED/ESTIMATED PART TRIP FLOW TOTALS
CONFIDENCE  OBSERVED  ESTIMATED  ESTM-OBSV  (ESTM-OBSV)/OBSV(%)
      60.0  972944.0   991158.2    18214.2                 1.9%

REPORTING OBSERVED/ESTIMATED GENERATIONS AND ATTRACTIONS

GENERATIONS
ZONE NO CONFIDENCE  OBSERVED  ESTIMATED  ESTM-OBSV  (ESTM-OBSV)/OBSV(%)
      1       40.0    4869.0     4714.7     -154.3                -3.2%
      2       40.0    3825.0     3756.0      -69.0                -1.8%
      3       40.0    1798.0     2015.2      217.2                12.1%
      4       40.0     419.0      398.8      -20.2                -4.8%
      5       40.0    1256.0     1381.2      125.2                10.0%
      6       40.0    2045.0     1879.7     -165.3                -8.1%
      7       40.0    1935.0     1866.6      -68.4                -3.5%
      8       40.0    1794.0     1786.8       -7.2                -0.4%
      9       40.0    3662.0     3490.7     -171.3                -4.7%
     10       40.0     430.0      411.9      -18.1                -4.2%
Some missing....

ATTRACTIONS
ZONE NO CONFIDENCE  OBSERVED  ESTIMATED  ESTM-OBSV  (ESTM-OBSV)/OBSV(%)
      1       40.0    3657.0     3661.8        4.8                 0.1%
      2       40.0    2984.0     3142.7      158.7                 5.3%
      3       40.0    5715.0     5668.6      -46.4                -0.8%
      4       40.0     558.0      535.7      -22.3                -4.0%
      5       40.0    2018.0     2067.6       49.6                 2.5%
      6       40.0    2084.0     2000.6      -83.4                -4.0%
      7       40.0    2112.0     2092.6      -19.4                -0.9%
      8       40.0    2673.0     2629.0      -44.0                -1.6%
      9       40.0    4763.0     4437.3     -325.7                -6.8%
     10       40.0     273.0      279.5        6.5                 2.4%
Some missing....

REPORTING OBSERVED/ESTIMATED SCREEN LINE COUNTS

SCREENLINE            CONFIDENCE  OBSERVED  ESTIMATED  ESTM-OBSV  (ESTM-OBSV)/OBSV(%)  NO OF ODs
NO & NAME
1 A'shot Rd W-E             80.0   11677.0    11370.7     -306.3                -2.6%        219
2 A'shot Rd E-W             80.0   11677.0    11651.1      -25.9                -0.2%        221
3 A3-Hogs Back S-N         200.0   27947.0    26670.6    -1276.4                -4.6%        153
4 A3-Hogs Back N-S         200.0   25504.0    24896.4     -607.6                -2.4%        154
5 A3-Parkway W-E            80.0   28539.0    29956.5     1417.5                 5.0%        538
Some missing....

Report on observed and estimated part trip data


NETWORK IDENTIFIER <Network with Estimated Part Trip Flows>
VOLUME FIELD 1 NAME <Obsv> - Observed Link Counts
VOLUME FIELD 2 NAME <Conf> - Confidences Levels for Link Counts
VOLUME FIELD 3 NAME <PrtT> - Observed Part Trip Data
VOLUME FIELD 4 NAME <PrtC> - Confidence Levels for Part Trip Data
VOLUME FIELD 5 NAME <EPtr> - Estimated Part Trip Data
Print Comparisons of Part Trip Data and Estimates

REPORT 4: LINK VOLUME FIELDS
ANODE          2119
BNODE          2105       2112       2644
         ---------- ---------- ----------
1 <Obsv>      8552.     18809.         0.
2 <Conf>        80.        80.         0.
3 <PrtT>      6871.     17775.      9073.
4 <PrtC>        60.        60.        60.
5 <EPtr>      7139.     18420.     10808.

REPORT 4: LINK VOLUME FIELDS
ANODE          2120
BNODE          2127       2207       2212
         ---------- ---------- ----------
1 <Obsv>         0.      5387.         0.
2 <Conf>         0.        80.         0.
3 <PrtT>      4906.      3497.      2821.
4 <PrtC>        60.        60.        60.
5 <EPtr>      4565.      3356.      2822.

ANODE          2843
BNODE          2113       2194       2841
         ---------- ---------- ----------
1 <Obsv>      1226.         0.         0.
2 <Conf>        80.         0.         0.
3 <PrtT>      3635.       213.         0.
4 <PrtC>        60.        60.        60.
5 <EPtr>      3325.       231.         0.

Hierarchic Estimation

This chapter discusses hierarchic estimation. Topics include:

Introduction to hierarchic estimation

Alternative approaches to hierarchic estimation

Defining districts

Running Cube Analyst for hierarchic estimation

Introduction to hierarchic estimation


This section provides an overview of hierarchic estimation. Topics
include:

Approaches to estimating very large matrices

Different levels of detail: Districts and zones

Different approaches to hierarchic estimation

Approaches to estimating very large matrices


There are formidable data processing and computational issues to
be faced when estimating very large matrices, whose size may lie in
the range of 2,500 to 10,000 zones for major transport studies.
Theoretically, the matrices can have between 2,500² and 10,000²
(6,250,000 to 100,000,000) cells to estimate, although the practical
number of cells with non-zero trips will only be a fraction of this.
Nevertheless, the number of cells to be estimated in typical
applications will be of the order of 250,000 to 750,000.
The natural approach, which is used in hierarchic matrix estimation,
is to reduce the estimation problem to a more manageable size by
grouping information. However, it is necessary to recognize that
the pattern of trips across many large study areas, such as
conurbations, is not readily partitioned. For example, a data item
such as a flow count or a trip end may relate to trips with dispersed
origins and destinations which may not easily be grouped.
It is therefore a feature of Cube Analyst hierarchic estimation that
each of the different approaches to estimation offered, which are
described below, always considers all of the trips in the entire
study area.

Different levels of detail: Districts and zones


The approaches offered by Cube Analyst hierarchic estimation
consider the OD matrix at two levels of detail:

Fine level, which is the original zoning system and results in a
zonal matrix

Coarser level, which aggregates (groups) sets of zones into a
limited number of districts, from which a corresponding district
matrix may be produced

The total number of trips in the zonal and district matrices is the
same.
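
The relationship between the two levels can be illustrated with a
minimal sketch of the aggregation. The zone numbers, district names,
trip values, and the function name aggregate_to_districts are all
hypothetical; the sketch is not Cube Analyst's implementation, and
shows only that grouping zones into districts preserves the total
number of trips.

```python
from collections import defaultdict

# Hypothetical zone-to-district grouping for a 4-zone study area.
zone_to_district = {1: "D1", 2: "D1", 3: "D2", 4: "D2"}

# Hypothetical zonal matrix: (origin zone, destination zone) -> trips.
zonal_matrix = {
    (1, 2): 10.0, (1, 3): 20.0,
    (2, 1): 5.0, (2, 4): 15.0,
    (3, 1): 30.0, (3, 4): 25.0,
    (4, 2): 40.0, (4, 3): 35.0,
}


def aggregate_to_districts(zonal, grouping):
    """Sum zonal cells into (origin district, destination district) cells."""
    district = defaultdict(float)
    for (origin, destination), trips in zonal.items():
        district[(grouping[origin], grouping[destination])] += trips
    return dict(district)


district_matrix = aggregate_to_districts(zonal_matrix, zone_to_district)
for cell, trips in sorted(district_matrix.items()):
    print(cell, trips)
print("Zonal total:", sum(zonal_matrix.values()))
print("District total:", sum(district_matrix.values()))
```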

Different approaches to hierarchic estimation


The main method is called "hierarchic estimation", as the estimated
district matrix is used to control a series of estimations primarily
conducted at the zonal level. This process leads to a fully updated
zonal matrix.
Hierarchic estimation also allows a variant method in which the
district matrix is defined as a mixture of district and zonal detail.
The resulting "district" matrix which is estimated includes some
cells estimated at the zonal level. The output estimated matrix has
fewer rows and columns than the input matrix, but there will be a
direct correspondence between certain of the cells, as selected by
the user. This variant is valuable when the application needs to
update cells relating to only parts of the large study area, for
example, to update cells for an administrative borough within a
large city region. The method requires only a single estimation,
rather than the series of estimations used in the main hierarchic
estimation process. This hierarchic estimation variant is referred
to as "combined district and zonal estimation".
The underlying estimation process is common to all Cube Analyst
runs, but there are differences in how information is grouped in
hierarchic estimation. Apart from differences in information
grouping, the combined district and zonal estimation is very similar
to a standard estimation. The hierarchic estimation method
introduces a new concept, which is called a "local matrix". This is
explained in "Local matrices" on page 98.

Alternative approaches to hierarchic estimation


This section describes alternative approaches to hierarchic
estimation. Topics include:

Estimation with mixed district and zonal detail

Local matrices

Summary of the hierarchic estimation process

Estimation with mixed district and zonal detail


The majority of this section is concerned with hierarchic estimation,
but it begins with a view of the approach for combined district and
zonal estimation, shown in the figure "Combined estimation of
selected zones and districts" on page 97. This shows the estimated
matrix where the sides of the cells have been scaled according to
the geographical size of the areas to which they relate. That is, the
large sides correspond to districts and the small sides to zones. This
has resulted in three types of cells:

Large squares: All information is estimated at district level

Small squares: All information is estimated at zonal level

Rectangles: Information is estimated at a mixture of district
and zonal detail

Combined estimation of selected zones and districts

The user may choose whether to retain information at mixed levels
of detail, as shown, or (manually) to extract the cells fully
estimated at zonal detail (the small squares in the figure) to
update a portion of the zonal prior matrix.
As shown in the figure, the detailed estimation has been for trips
traveling from one part of the study area to another; if the small
squares were located on the diagonal of the main square shown,
then the detailed estimation would be for all trips within, and
traveling to and from, a particular part of the study area, such as a
town center area.
Some points to note about this approach are:

Although the terms "zonal" and "district" have been used to
indicate different levels of detail, Cube Analyst considers this
form of estimation as a special form of district estimation,
without recognizing that a selected number of districts are
simply individual zones.

There must be the same number of origin and destination
districts, which is not the case for hierarchic estimation.

This approach requires a single estimation.

Local matrices
When using hierarchic estimation, Cube Analyst first estimates a
district matrix, which is used to influence the calculation of a set of
local matrices. These local matrices contain a mixture of zonal detail
and district-based information. The estimated zonal detail is
captured automatically by Cube Analyst and, as each local matrix is
estimated, is used to develop progressively an update of the entire
matrix at the zonal level of detail. The district matrix simply
represents the zonal matrix aggregated into a district matrix,
although the district matrix may be non-square, that is, there may
be a different number of origin and destination districts. Further
information about districts is given later in this section.
Consider a local matrix that is an extension of the combined district
and zonal matrix shown and discussed in "Estimation with mixed
district and zonal detail" on page 96.

Zonal estimation controlled by district matrix

In this diagram all of the large squares, where information is only
estimated at district level, have been shaded. This is because this
portion of the matrix is treated in a local matrix as a single unit,
termed "Rest-of-(the)-World" (RoW).
A local matrix, therefore, has the following elements:

Detailed zonal level set of cells (the small squares)

Trips in the Rest-of-World (shaded area)

Trips from RoW to zonal level area (rectangular cells)

Trips to RoW from zonal level area (rectangular cells)

A local matrix is defined for each origin and destination district pair
(the unshaded part in the figure represents one such pair), and the
fully estimated (zonal) matrix is produced when all local matrices
have been estimated.
Information involving trips from the RoW is obtained from the
district matrix. This element, and the fact that the total number of
trips is the same (in principle) for each local matrix, ensures that
consistency is maintained across the entire study area, even though
detail is calculated separately in estimations for different parts.
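
The structure of a local matrix can be sketched as follows. This is
a simplified illustration, not Cube Analyst's internal layout: it
collapses everything outside the zonal block into a single RoW row
and column, and it fills the RoW cells from a zonal matrix, whereas
Cube Analyst obtains RoW information from the estimated district
matrix. All zone numbers, trip values, and the function name
build_local_matrix are hypothetical.

```python
ROW = "RoW"  # label for the Rest-of-World unit

# Hypothetical zonal matrix: (origin zone, destination zone) -> trips.
zonal_matrix = {
    (1, 2): 10.0, (1, 3): 20.0,
    (2, 1): 5.0, (2, 4): 15.0,
    (3, 1): 30.0, (3, 4): 25.0,
    (4, 2): 40.0, (4, 3): 35.0,
}


def build_local_matrix(zonal, origin_zones, destination_zones):
    """Keep zonal detail for one district pair; pool all other trips into RoW."""
    local = {}
    for (origin, destination), trips in zonal.items():
        row = origin if origin in origin_zones else ROW
        column = destination if destination in destination_zones else ROW
        local[(row, column)] = local.get((row, column), 0.0) + trips
    return local


# Local matrix for origin district {1, 2} and destination district {3, 4}.
local_matrix = build_local_matrix(zonal_matrix, {1, 2}, {3, 4})
for cell, trips in local_matrix.items():
    print(cell, trips)
print("Local matrix total:", sum(local_matrix.values()))
```

The total of the local matrix equals the total of the zonal matrix,
which mirrors the consistency property described above.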

Summary of the hierarchic estimation process


The hierarchic estimation process may be summarized in four
stages:

Creation of districts from zones

Estimate district matrix

Estimate local matrices

Build-up full estimated matrix

Creation of districts from zones

The following figure shows a study area divided into many (small)
zones (denoted by ij). These are grouped into a number of fewer
(and larger) districts (denoted by IJ). Subsequent topics in this
chapter give more information about creating districts.

Districts (I,J) and zones (i,j)


Estimate district matrix

This is the first operation by Cube Analyst, which estimates a small
matrix for the 5 to 15 origin and destination districts which are
typically defined.
One of the cells, corresponding to a pair of origin and destination
districts, which contribute to a local matrix, is referenced as Mij. The
figure Estimate district matrix on page 101 indicates the
information in the district matrix estimation: the prior matrix and
trip ends are automatically aggregated from the user's input
zonal-level information. Internally, Cube Analyst creates a condensed
network but does not aggregate the screenline count data. This
treatment of data is reflected in Cube Analyst's reports on the
district matrix (see Figure 7.12b).
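
The aggregation of a zonal prior matrix to district level can be sketched as follows. This is a minimal illustration, not the program's internal method: the function name, the district labels, and the trip values are hypothetical. It uses separate origin and destination zone-to-district mappings, so the resulting district matrix need not be square.

```python
def aggregate_to_districts(trips, origin_district, dest_district):
    """Sum zonal trips into district cells.

    trips           -- dict mapping (origin zone, destination zone) to trips
    origin_district -- dict mapping each origin zone to its origin district
    dest_district   -- dict mapping each destination zone to its destination district
    """
    district = {}
    for (i, j), t in trips.items():
        key = (origin_district[i], dest_district[j])
        district[key] = district.get(key, 0.0) + t
    return district


# Hypothetical data: three origin zones in two origin districts,
# two destination zones in a single destination district.
trips = {(1, 1): 5.0, (1, 2): 3.0,
         (2, 1): 4.0, (2, 2): 6.0,
         (3, 1): 7.0, (3, 2): 8.0}
origin_district = {1: "A", 2: "A", 3: "B"}
dest_district = {1: "X", 2: "X"}
district_matrix = aggregate_to_districts(trips, origin_district, dest_district)
```

Here the district matrix has two rows and one column, and its total equals the zonal total.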

Estimate district matrix


Estimate local matrices

Cube Analyst can estimate all local matrices in one run, but the
user may exercise considerable control over this process.
This example relates to a single local matrix, but this stage is
repeated for all local matrices. The example considers the same
structural elements introduced in the discussion on Zonal
estimation controlled by district matrix on page 98. The
information used to estimate zonal cells, referenced as Mij,
includes:

Prior matrix and trip ends are used at zonal level in the
estimation

Count data is used as input where relevant to the local matrix.

Other items are obtained from the corresponding district
matrix estimation.

This use of information is reflected in Cube Analyst's reports on
local matrices (see Figures 7.12c and 7.12d).

Estimate local matrix


Build-up full estimated matrix

This example indicates the construction of the fully estimated
matrix from detailed information (Mij) calculated from a set of local
matrices. When the matrix is in the form shown in the figure (with
only some of the cells estimated), it is referred to as the partially
estimated matrix. Those cells in the partially estimated matrix
which have not yet been estimated contain copies of the
corresponding prior matrix cells.


(This can provide another means of estimating just part of a study
area, namely, by restricting the estimation to selected
districts/zones of interest.)
When all cells of the partially estimated matrix have been
estimated, it, of course, becomes the final fully estimated zonal
matrix.
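
The build-up described above can be pictured with a short sketch. It is an illustration only: the function name and the data are hypothetical, and it assumes that local matrices do not estimate the same cell twice. Cells not yet estimated keep their prior values, so the result is a partially estimated matrix until every cell has been covered.

```python
def build_partial_matrix(prior, local_estimates):
    """Overlay estimated zonal cells from local matrices onto the prior matrix.

    prior           -- dict mapping (origin zone, destination zone) to prior trips
    local_estimates -- list of dicts, one per local matrix, holding estimated cells
    """
    partial = dict(prior)          # unestimated cells are copies of the prior
    for cells in local_estimates:
        partial.update(cells)      # estimated cells replace the prior values
    return partial


# Hypothetical 2 x 2 prior matrix with two local matrices estimated so far.
prior = {(1, 1): 10.0, (1, 2): 20.0, (2, 1): 30.0, (2, 2): 40.0}
local_estimates = [{(1, 1): 12.5}, {(2, 2): 38.0}]
partial = build_partial_matrix(prior, local_estimates)
```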

Combine local matrices in partially estimated matrix


Defining districts
Hierarchic estimation is a heuristic method which approximates the
formal mathematical methodology provided by a standard run of
Cube Analyst. It is most appropriate when the study area is large
enough to encompass sub-areas which can become districts where
the travel patterns are reasonably independent of one another.
The purpose of the estimated district matrix is, largely, to consider
the inter-district movements, while the focus of local matrices is the
intra-district movements. Because precision (greater detail) is
associated with the latter, it is desirable to minimize the amount of
inter-district movements.
The number of local matrices is approximately the square of the
number of districts. It therefore can make a considerable difference
to computational times whether, say, 10 districts are chosen (about
100 local matrix estimations) or 8 districts (about 64 local matrix
estimations).
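
As a rough worked example of this arithmetic, the sketch below counts local matrix estimations as origin districts multiplied by destination districts. The function name is hypothetical, and the count is only the approximation quoted in the text, not an exact figure reported by Cube Analyst.

```python
def local_matrix_count(origin_districts, dest_districts=None):
    """Approximate number of local matrix estimations.

    If only one number is given, the district matrix is assumed square.
    """
    if dest_districts is None:
        dest_districts = origin_districts
    return origin_districts * dest_districts


ten = local_matrix_count(10)                 # about 100 estimations
eight = local_matrix_count(8)                # about 64 estimations
saving = 100.0 * (ten - eight) / ten         # percentage fewer estimations
```

Under this approximation, moving from 10 to 8 districts removes roughly a third of the local matrix estimations.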
Not all study area zones may be allocated to districts in this way,
either because some or all trips from or to a zone do not pass
through a screenline, or because allocation of the zone to a district
would violate the maximum number of zones per district. Zones
are then allocated to the adjacent district, based on the coordinates
associated with zone centroids. The effect of allocating zones to
districts on a basis other than routing behavior is potentially to
worsen the effects of the approximation implicit in hierarchic
estimation. In many cases, this worsening may be negligible in
practice, but will be more significant if those zones involve
relatively large numbers of trips, or if a significant proportion of
zones are involved. It is this latter consideration which makes it
inadvisable to use hierarchic estimation on study areas with fewer
than 500 zones.
The considerations involved in defining districts may be
summarized as:

The fewer districts the better

The maximum local matrix size is determined by the maximum
size of standard estimation that may be conveniently run on
the available computer (say 1000 - 2500 zones)

The more allocation of zones to districts on the basis of
routings through screenlines the better

Note that it is a feature of hierarchic estimation districts that there
may be a different number of origin and destination districts (that
is, the district matrix may be non-square), and the allocation of
origin zones to origin districts is independent of the allocation of
that same zone to a destination district. This enables the
asymmetries of trip patterns to be reflected, as, for example, in a
morning peak matrix when trips originate from many zones in the
suburbs and head for only a few destination zones in the city
center. This is of value to the estimation process, but means that the
district matrix and the local matrices cannot be reported directly.


Running Cube Analyst for hierarchic estimation


Cube Analyst is run in a similar manner to non-hierarchic
estimation except that:

Option DSTRCT=T is set, to indicate calculation/use of a district
matrix

LMC and DDF files are input additionally

Parameter ZCONF is set

If Cube Analyst is run with an incomplete LMC file, then the
estimated matrix is a partially estimated matrix. This matrix
provides an additional input file when further local matrices are to
be estimated.
The model parameter file only ever contains information relating to
the district matrix (and not any local matrices), and the execution
log file contains brief summary information for both district and
local matrix estimations.
The printout file for hierarchic estimation contains the same type of
information as for non-hierarchic estimation, as illustrated in
Estimating the matrix on page 83. However, there may be many
sets of this information: the first set of information always refers to
the district matrix estimation. This is followed by a set of
information for each local matrix being estimated, noting that this
may be none in the case of a combined district and zone
estimation. (Because estimations involving many local matrices can
generate very large print files, it can be convenient to edit the local
matrix control file to create a series of runs of Cube Analyst in which
the size of individual print files is reduced.)
An additional item of information is provided for hierarchic
estimation concerning the influence of the district matrix on each
local matrix estimation. The table with this information, shown in
Figure 7.12c, is labeled Side constraints on matrix totals. This term
refers to the constraints of the district matrix on various sides (and
elements) of the local matrix, as illustrated previously in Estimate
local matrix on page 102. Reporting Hierarchic Estimation Results
discusses the printout for hierarchic estimation.

Parameter ZCONF
The extent of the constraining effect of the district matrix on the
local matrices is determined by Cube Analyst parameter ZCONF,
which acts as a confidence level, treating the district matrix as
observed data and the local matrix as estimated. For the local
matrix estimation, therefore, the district matrix is just another item
of observed data and ZCONF should be set in relation to
confidence levels for other items of observed data.
From the user's point of view, the setting of ZCONF should be a
reflection of the degree and importance of the interaction between
districts, in terms of trips which cross more than one origin or
destination district boundary. (An effect of the automatic
generation of districts is to minimize such boundary crossings.) The
district matrix contains information about these interactions; if they
are important then the district matrix should be made
correspondingly significant with a relatively high setting of ZCONF.
A low value of ZCONF allows local matrices to reflect local data
more precisely, at the expense of the larger picture across the
entire study area. A possible symptom of an inappropriate setting
of ZCONF might be an unwarranted distortion of the distribution of
trip costs/lengths in the estimated matrix.


Using Cube Analyst

This chapter discusses the process for using Cube Analyst. Topics
include:

Input data: overview

Outputs: overview

Estimating large matrices (hierarchic estimation)

Estimation process


Input data: overview


The data that can be used in estimating the new O-D matrix may
include some or all of the following types of data:

A prior (existing) trip matrix

Traffic generations and attractions of zones

Traffic counts on links and/or turns

Modeled (multiple) paths between zones

Cost of travel between zones

Parameters of a calibrated trip distribution function

Part-trip data, where trips are observed traveling between
points which are not necessarily their ultimate origins and
destinations


Outputs: overview
The outputs from Cube Analyst are:

The estimated O-D matrix

Summary Reports, in the form of a print (*.prn) file, describing
the differences between input data and corresponding values
implied by the estimated matrix. The print file also provides a
return code indicating problems during execution, or a
successful completion.
For more information, see Reports on page 115.

A set of files with information on:

Model parameter values

A log of the optimization steps

Internal gradient search and intercept data


Estimating large matrices (hierarchic estimation)


Cube Analyst provides a hierarchic approach to estimation for use
with very large matrices, typically those with more than 2,500 zones.
This approach is required to make the process more manageable and
less time consuming.
The basic approach is to estimate a general matrix, in which zones
are automatically grouped into districts. This area-wide estimation
is then used to control a set of detailed estimations, which build up
to provide a fully detailed estimate for the entire study area. This is
discussed in detail in Chapter 7, Hierarchic Estimation.


Estimation process
The only program directly involved in the estimation process itself
is Cube Analyst, although other Cube programs play an important
part in the pre- and post-processing of the data.
The data used may be some or all of the data described earlier in
Input data: overview on page 110.
Cube Analyst may also use model parameters, gradient search, and
intercept files from a previous run of Cube Analyst for the current
estimation to warm start the calculations.
Internally, Cube Analyst can be considered to be made up of two
main parts, each of which is executed alternately, namely:

Estimation model

The function of this is, given some particular values of the
model parameters, to calculate the estimated matrix, trip ends,
screenline volumes, etc., and also to perform the likelihood
calculation.

Optimization step

This procedure attempts to change the values of the model
parameters to improve the likelihood value (the objective
function).

These two stages are carried out alternately in a series of iterations
until no further improvement can be made.
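
The alternation of the two parts can be sketched as a simple loop. This is a toy illustration under stated assumptions, not Cube Analyst's algorithm: the functions estimate, evaluate, and update are hypothetical, the objective is a one-parameter quadratic chosen only so the loop has something to maximize, and the fixed step factor stands in for the real optimization step. The default tolerance and iteration limit mirror the documented defaults of UTOL and MXITER.

```python
def estimate(evaluate, update, params, utol=0.0001, mxiter=3000):
    """Alternate an estimation model and an optimization step until the step is small."""
    objective = None
    for iteration in range(1, mxiter + 1):
        objective, gradient = evaluate(params)           # estimation model
        step = update(gradient)                          # optimization step
        params = [p + s for p, s in zip(params, step)]   # apply the step
        if max(abs(s) for s in step) < utol:             # no useful improvement left
            return params, objective, iteration, True
    return params, objective, mxiter, False


def evaluate(params):
    """Toy objective -(p - 3)^2 and its gradient (an assumption for the example)."""
    (p,) = params
    return -(p - 3.0) ** 2, [-2.0 * (p - 3.0)]


def update(gradient):
    """Toy optimization step: move a fixed fraction along the gradient."""
    return [0.25 * g for g in gradient]


params, objective, iterations, converged = estimate(evaluate, update, [0.0])
```

The returned objective is the value computed before the final step was applied, which is adequate for a sketch but would need care in a real implementation.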


Reports

This chapter discusses reports you can prepare with Cube Analyst.
Topics include:

Summary of Reports

Sample reports


Summary of Reports
The Analyst reports described in this chapter are saved in a print
(*.prn) file during program execution. The reports include:

A listing of input parameters and options, and input binary
header information.

Mean, minimum, and maximum confidence levels set by the
user for each type of input data are given.

Memory requirements.

A report of each iteration of the optimization process, during
execution in interactive mode. This shows the current value of
the objective function, the gradient tolerance, and the sum of
all the estimated matrix elements. These values for the last five
iterations are always reported.

On completion, Cube Analyst provides summary reports on the
comparison between sets of input data and the corresponding
estimated values, with the confidence levels that apply. Where
relevant data is input to Cube Analyst, these reports are
produced giving comparisons for prior and estimated:

Matrices: Matrix totals

Trip ends: Zone generations and attractions, with input
zone generations and attractions

Link flows: Screenline volumes and input screenline
volumes

Part trips: Part trip matrix totals, distinguished by line
groups, where appropriate.

Finally, Analyst provides a return code indicating problems
during execution, or a successful completion. The codes are:

0 = Normal Termination

4 = Warning. Review the print file and find the (W) tag for
information on the warning(s).

8 = Fatal Error, Non Immediate Termination. Review the
print file for information on the error(s).

16 = Fatal Error, Immediate Termination. Review the print
file for information on the error(s).
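
A script that post-processes a run might interpret these codes along the following lines. This is a sketch only: the function names are hypothetical, and treating a warning as a usable run is an assumption that should be checked against the (W) entries in the print file.

```python
# Return codes as listed above.
RETURN_CODES = {
    0: "Normal Termination",
    4: "Warning",
    8: "Fatal Error, Non Immediate Termination",
    16: "Fatal Error, Immediate Termination",
}


def describe_return_code(code):
    """Return the documented meaning of a return code."""
    return RETURN_CODES.get(code, "Unknown return code")


def run_is_usable(code):
    """Assumption: normal termination and warnings both leave usable output."""
    return code in (0, 4)


messages = [describe_return_code(c) for c in (0, 4, 8, 16)]
```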

Further information may be obtained by using Cube programs to
report on the estimated matrix file.
For an estimation using part-trip data, the output network file
contains detailed information on estimated part-trip link flows
(equivalent to an assignment of the estimated Part Trip matrix).

Cube Analyst reporting for hierarchic estimations

Cube Analyst reporting for a hierarchic estimation varies according
to whether the estimation is for a district or a local matrix. The
reports for district estimation are the same as for other levels,
except, of course, the results apply to districts rather than zones.
For local matrices, Cube Analyst additionally provides summaries of
the row and column side constraints from the district matrix, and
equivalent values from the prior matrix. The first reported zone
corresponds to the Rest-of-the-World (RoW), while the other
reported zones are the set of zones relevant to that local matrix. No
screenline reports are produced for local matrices.
The execution log file is output by the optimization step of Cube
Analyst, and three levels of report may be produced. These are
controlled by the IREP parameter. The contents of the log file will
not normally be of interest to general users, but are of assistance in
summarizing the progress of the calculation should investigation
be required.


Sample reports
This section contains examples of reports:

Average confidence level

Final five iterations

Matrix totals and zone generation

Zone attractions

Average confidence level (part trip data)

Part trip totals

District matrix

Local matrix

Average confidence level


AVERAGE CONFIDENCE LEVELS (EXCLUDING ZERO VALUES)
-------------------------------------------------
                                    Average  Maximum  Minimum  Number of
                                                               Elements
Trip matrix confidence levels          20.0     20.0     20.0       6724
Screen line confidence levels          95.0    200.0     80.0         16
Trip end (dest) confidence levels      47.8     80.0     40.0         82
Trip end (orig) confidence levels      47.8     80.0     40.0         82


Final five iterations


SUMMARY OF FINAL FIVE ITERATIONS
--------------------------------
Iteration  Stepsize              Objective      Matrix
           (Tolerance= 0.0001)   Value          Total
149        0.0004152             -4735264.48    239547.4
150        0.0005055             -4735264.48    239548.6
151        0.0003342             -4735264.49    239550.3
152        0.0002368             -4735264.49    239551.0
153        0.0000781             -4735264.49    239551.0

Optimization halted after 153 iterations because:
Convergence detected
Final Value of Maximum Search Step, UMAX = 0.01

Matrix totals and zone generation


REPORTING PRIOR/ESTIMATED MATRIX TOTALS
CONFIDENCE     PRIOR  ESTIMATED  ESTM-PRIOR  (ESTM-PRIOR)/PRIOR(%)
      20.0  238498.0   239551.2      1053.2                   0.4%
REPORTING OBSERVED/ESTIMATED GENERATIONS AND ATTRACTIONS
GENERATIONS
ZONE NO  CONFIDENCE  OBSERVED  ESTIMATED  ESTM-OBSV  (ESTM-OBSV)/OBSV(%)
      1        40.0    4869.0     4387.9     -481.1                -9.9%
      2        40.0    3825.0     3763.3      -61.7                -1.6%
      3        40.0    1798.0     2562.3      764.3                42.5%
      4        40.0     419.0      386.7      -32.3                -7.7%
      5        40.0    1256.0     1574.9      318.9                25.4%
      6        40.0    2045.0     1743.0     -302.0               -14.8%
      7        40.0    1935.0     1827.1     -107.9                -5.6%
      8        40.0    1794.0     1904.4      110.4                 6.2%
      9        40.0    3662.0     3288.3     -373.7               -10.2%
     10        40.0     430.0      381.3      -48.7               -11.3%
     11        80.0    9200.0     9347.4      147.4                 1.6%
<continued>


Zone attractions
ATTRACTIONS
ZONE NO  CONFIDENCE  OBSERVED  ESTIMATED  ESTM-OBSV  (ESTM-OBSV)/OBSV(%)
      1        40.0    3657.0     3586.4      -70.6                -1.9%
      2        40.0    2984.0     3500.3      516.3                17.3%
      3        40.0    5715.0     5556.7     -158.3                -2.8%
      4        40.0     558.0      518.4      -39.6                -7.1%
      5        40.0    2018.0     2162.9      144.9                 7.2%
      6        40.0    2084.0     1948.0     -136.0                -6.5%
      7        40.0    2112.0     2129.7       17.7                 0.8%
      8        80.0     976.0     1030.0       54.0                 5.5%
      9        40.0    2673.0     2804.5      131.5                 4.9%
     10        40.0       0.0        0.0        0.0                 n/a%
     11        80.0    5665.0     5549.3     -115.7                -2.0%
<continued>
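
The difference columns in the trip end reports above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the rounding to one decimal place and the n/a% marker for a zero observed value are inferred from the sample report rather than taken from a program specification.

```python
def compare(observed, estimated):
    """Return (ESTM-OBSV, (ESTM-OBSV)/OBSV as a percentage string)."""
    diff = estimated - observed
    if observed == 0:
        # Percentage is undefined when nothing was observed.
        return round(diff, 1), "n/a%"
    return round(diff, 1), f"{100.0 * diff / observed:.1f}%"


zone_1 = compare(3657.0, 3586.4)    # attractions for zone 1 in the sample
zone_2 = compare(2984.0, 3500.3)    # attractions for zone 2 in the sample
zone_10 = compare(0.0, 0.0)         # zone with no observed attractions
```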

The trip end summaries can also be produced with the zone labels. Short
zone labels are printed if NODLAB=T, LNGLAB=F:
ATTRACTIONS
ZONE NO,NAME   CONFIDENCE  OBSERVED  ESTIMATED  ESTM-OBSV  (ESTM-OBSV)/OBSV(%)
1 <Beaumont>         40.0    3777.0     3382.2     -394.8               -10.5%
2 <Cross_Ro>         40.0    3482.0     3441.1     -400.9               -10.4%
3 <Binley_S>         40.0    5815.0     5220.2     -594.8               -10.2%
<continued>
Long zone labels are printed if NODLAB=T and LNGLAB=T. The example below
shows hierarchic zone numbers and long zone labels in the report:
ZONE                               CONFIDENCE  OBSERVED  ESTIMATED  ESTM-OBSV  (ESTM-OBSV)/OBSV(%)
NUMBER & NAME
28480 <Beaumont Avenue>                  40.0    5069.0     5544.5      475.5                 9.4%
28172 <Cross Roads, town centre>         40.0    4025.0     4392.2      367.2                 9.1%
27848 <Binley Street>                    40.0    1898.0     2076.7      178.7                 9.4%
<continued>


Average confidence level (part trip data)


AVERAGE CONFIDENCE LEVELS (EXCLUDING ZERO VALUES)
-------------------------------------------------
                                    Average  Maximum  Minimum  Number of
                                                               Elements
Trip matrix confidence levels          10.0     10.0     10.0       1083
Screen line confidence levels          80.0     80.0     80.0          2
Trip end (dest) confidence levels      46.7     80.0     40.0         95
Trip end (orig) confidence levels      46.7     80.0     40.0         95
Part Trip confidence levels             7.0      7.0      7.0        594

Part trip totals


This report is produced if option PRTTRP=T.
For Public Transport data, the report is as follows:
REPORTING OBSERVED/ESTIMATED PART TRIP FLOW TOTALS
GROUP      CONFIDENCE   OBSERVED  ESTIMATED  ESTM-OBSV  (ESTM-OBSV)/OBSV(%)
ALL               7.0  1386723.0  1232440.0  -154283.0               -11.1%
1 Local           7.0   624295.0   597702.1   -26592.9                -4.3%
2 Express         7.0   521925.0   532005.7    10080.7                 1.9%
For Highways data, the report is as follows:
REPORTING OBSERVED/ESTIMATED PART TRIP FLOW TOTALS
CONFIDENCE   OBSERVED  ESTIMATED  ESTM-OBSV  (ESTM-OBSV)/OBSV(%)
      20.0  1590478.0  1606103.8    15625.8                 1.0%


District matrix
REPORTING PRIOR/ESTIMATED MATRIX TOTALS
CONFIDENCE     PRIOR  ESTIMATED  ESTM-PRIOR  (ESTM-PRIOR)/PRIOR(%)
      20.0  238498.0   240291.2      1793.2                   0.8%
REPORTING OBSERVED/ESTIMATED GENERATIONS AND ATTRACTIONS
GENERATIONS
DISTRICT  CONFIDENCE  OBSERVED  ESTIMATED  ESTM-OBSV  (ESTM-OBSV)/OBSV(%)
       1        40.0   14616.0    13368.5    -1247.5                -8.5%
       2        40.0   48050.0    47995.1      -54.9                -0.1%
       3        40.0    7855.0     7711.1     -143.9                -1.8%
       4        40.0   40478.0    42008.8     1530.8                 3.8%
       5        40.0   62530.0    59877.4    -2652.6                -4.2%
       6        40.0   15462.0    16832.4     1370.4                 8.9%
       7        40.0   18734.0    19158.2      424.2                 2.3%
       8        40.0    6744.0     6707.7      -36.3                -0.5%
       9        40.0   26890.0    26631.9     -258.1                -1.0%
ATTRACTIONS
DISTRICT  CONFIDENCE  OBSERVED  ESTIMATED  ESTM-OBSV  (ESTM-OBSV)/OBSV(%)
       1        40.0   21562.0    22434.0      872.0                 4.0%
       2        40.0   43850.0    43476.7     -373.3                -0.9%
       3        40.0   43963.0    44217.7      254.7                 0.6%
       4        40.0   21627.0    20809.8     -817.2                -3.8%
       5        40.0   30926.0    30638.1     -242.9                -0.8%
       6        40.0   37198.0    39445.7     2247.7                 6.0%
       7        40.0    8070.0     7973.4      -96.6                -1.2%
       8        40.0   15332.0    16906.5     1574.5                10.3%
       9        40.0   14423.0    14344.4      -78.6                -0.5%
REPORTING OBSERVED/ESTIMATED SCREEN LINE COUNTS
SCREENLINE       CONFIDENCE  OBSERVED  ESTIMATED  ESTM-OBSV  OBSV(%)  NO OF ODs
NO & NAME
1 A'shot Rd W-E        80.0   11677.0    11303.8     -373.2    -3.2%        244
2 A'shot Rd E-W        80.0   11677.0    11947.6      270.6     2.3%        244
<continued>

Local matrix
REPORTING SIDE CONSTRAINTS ON MATRIX TOTALS
                   DISTRICT   IN-PRIOR  ESTIMATED  ESTM-DISTRICT  (ESTM-DISTRICT)
                 CONSTRAINT                                       /ZONAL(%)
WITHIN DISTRICT      1506.2      958.0     1456.7          -49.5      -3.3%
FROM DISTRICT        6204.9     6276.0     5966.2         -238.7      -3.8%
TO DISTRICT         19303.5    16610.0    19181.9         -121.6      -0.6%
NOT IN DISTRICT    213276.5   214654.0   214338.2         1061.7       0.5%
MATRIX TOTAL       240291.2   238498.0   240943.0
REPORTING OBSERVED/ESTIMATED GENERATIONS AND ATTRACTIONS
GENERATIONS
ZONE NO  CONFIDENCE  OBSERVED  ESTIMATED  ESTM-OBSV  (ESTM-OBSV)/OBSV(%)
R-o-W          40.0  233504.0   233520.1       16.1                -0.0%
     25        40.0    1557.0     1625.1       68.1                -4.4%
     26        40.0    1753.0     1654.5      -98.5                -5.6%
     27        40.0     378.0      338.8      -44.2               -11.7%
     28        40.0    1535.0     1339.8     -195.2               -12.7%
     55        40.0     211.0      232.5       21.5                10.2%
     56        40.0     875.0      878.7        3.7                 0.4%
     60        40.0     268.0      296.6       28.6                10.7%
     61        40.0    1278.0     1061.8     -216.2               -16.9%
ATTRACTIONS
ZONE NO  CONFIDENCE  OBSERVED  ESTIMATED  ESTM-OBSV  (ESTM-OBSV)/OBSV(%)
R-o-W          40.0  215324.0   220304.4     4980.4                 2.3%
     30        40.0    2370.0     2431.7       91.7                 3.9%
     44        40.0   12392.0    11794.8     -597.2                -4.8%
     48        80.0    1708.0     1757.0       49.0                 2.9%
     53        40.0     209.0      273.1       64.1                30.7%
     72        80.0    4226.0     3691.0     -535.0               -12.7%
     77        80.0     722.0      661.0      -61.0                -8.5%

REPORTING OBSERVED/ESTIMATED SCREEN LINE COUNTS
SCREENLINE       CONFIDENCE  OBSERVED  ESTIMATED  ESTM-OBSV  OBSV(%)  NO OF ODs
NO & NAME
1 A'shot Rd W-E        80.0   11677.0    11459.1     -217.9    -1.9%        244
2 A'shot Rd E-W        80.0   11677.0    12580.3      903.3     7.7%        244


Note that, as for standard estimations, short and long zone labels
can be shown in the trip end reports. The label for the R-o-W
(Rest-of-the-World) will be left blank.


10

Files

This chapter lists permanent files found in Cube Analyst.

Symbolic      File    I/O  File description          Required (R)/optional (O)(3)
parameter(1)  ext(2)

              .CTL         Control Data File         (R)
IMAT1         .MAT         Prior Trip/               (R)
                           Cost Matrix File
IDAT1         .DAT         Trip End Records          (R) If TRPEND=T
IDAT2         .DAT         Model Parameters File     (R) If MODPAR=T or if WARMST=T
IDAT3         .GDS         Gradient Search File      (R) If WARMST=T
IDAT4         .DAT         Screenline File           (R) If SCRFIL=T
IDAT5         .ICP         Intercept File            (R) If WARMST=T or If INTCPT=T
INET1         .NET         Network File              (R) If PRTTRP=T
PATH1         .PTH         VOYAGER Path File         (R) If WARMST=F and INTCPT=F and
                                                     no IDAF1 input
IDAF1         .RCP         Route Choice              (R) If WARMST=F and INTCPT=F and
                           Probability File          no PATH input
IDAF2         .PTL         Lines File                (R) If PRTTRP=T and if network
                                                     is Public Transport
IDAF3         .DDF         District Definition File  (R) If DSTRCT=T
IDAT6         .DAT         Local Matrix              (R) If DSTRCT=T
                           Control File
IMAT2         .MAT         Partial Estimated Matrix  (R) If DSTRCT=T and if WARMST=T
IDAT7         .DAT         Coordinate File           (R) If NODLAB=T or using
                                                     hierarchic numbering
OMAT1         .MAT         Estimated Trip Matrix     (R)
ODAT1         .DAT         Model Parameter File      (R)
ODAT2         .GDS         Gradient Search File      (R)
ODAT3         .DAT         Optimization Log File     (R)
ODAT4         .ICP         Intercept File            (R) If WARMST=F and INTCPT=T and
                                                     either SCRFIL=T or PRTTRP=T
ODAT5         .DAT         Text Intercept File       (O)
ONET1         .NET         Network File              (R) If PRTTRP=T and DSTRCT=F
OPRN          .PRN         Print File                (R)

1. The SYMBOLIC PARAMETERS are those that would appear in an &FILES
record to control file opening.
2. The file extension shown is that used conventionally when running in
Application Manager.
3. File requirements can vary according to the combination of program
PARAMETERS and OPTIONS chosen.
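
The conditions in the table can be collected into a small checklist function. This is a sketch for planning a run, not part of Cube Analyst: the function name and its two keyword arguments are hypothetical, options that are not supplied are treated as false, and, as note 3 warns, real requirements can vary with other parameter and option combinations.

```python
def required_inputs(options, public_transport=False, hierarchic_numbering=False):
    """List the input files that the table above marks as required.

    options maps option names such as "WARMST" to True or False;
    options that are missing are treated as False.
    """
    def on(name):
        return bool(options.get(name, False))

    files = ["control data file (.CTL)", "IMAT1"]   # always required
    if on("TRPEND"):
        files.append("IDAT1")
    if on("MODPAR") or on("WARMST"):
        files.append("IDAT2")
    if on("WARMST"):
        files.append("IDAT3")
    if on("SCRFIL"):
        files.append("IDAT4")
    if on("WARMST") or on("INTCPT"):
        files.append("IDAT5")
    if on("PRTTRP"):
        files.append("INET1")
    if not on("WARMST") and not on("INTCPT"):
        files.append("PATH1 or IDAF1")              # one of the two is needed
    if on("PRTTRP") and public_transport:
        files.append("IDAF2")
    if on("DSTRCT"):
        files.extend(["IDAF3", "IDAT6"])
    if on("DSTRCT") and on("WARMST"):
        files.append("IMAT2")
    if on("NODLAB") or hierarchic_numbering:
        files.append("IDAT7")
    return files


hierarchic_run = required_inputs({"TRPEND": True, "SCRFIL": True, "DSTRCT": True})
warm_start_run = required_inputs({"DSTRCT": True, "WARMST": True})
```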


11

Control Data

This chapter discusses control data. Topics include:

&PARAM keywords

&OPTION keywords


&PARAM keywords
It is usual to leave Cube Analyst control parameters to their default
values, with the user only setting the parameters associated with
data input and output file definition. These are described in
Standard user control parameters.
Of the remaining parameters, there is a set which is sometimes
changed (Secondary user control parameters on page 133) and
another which is rarely changed (Tuning control on page 135).
Most of the parameters in this last set are connected to the
operation of Cube Analyst's optimization process, and hence are
only of interest when there is evidence of poor performance in
achieving convergence.
Topics in this section include:

Standard user control parameters

Secondary user control parameters

Tuning control

Standard user control parameters


TABLES

Type = Integer(4)
Default = 101, 102, 0, 0
Example: TABLES=101,102,103,104
The input matrix numbers to be used. They are respectively the
prior trip matrix and confidence levels, and the cost matrix and
confidence levels.
MATID

Type = Character(60)
Default = Blank
Example: MATID='Estimated Matrix for Study Area'
Matrix identifier. Up to 60 alphanumeric characters can be used to
describe the contents of the output matrix. The identifier should be
enclosed in single quotes (').
WIDEND

Type = Integer
Default = 2 if hierarchic numbering, 0 otherwise
Range = 0-3
Example: WIDEND = 0
Indicates the format of the Screenline File:
0 = Cube Analyst establishes the format automatically. For this to
happen, all numeric entries in the file need to be right justified for
Cube Analyst to determine the file format unambiguously.
1 = Version six format. This supports the record format where both
link flow and link toll data are stored in the same record type.
2 = Version seven format. Two types of screenline record format can
be defined at version seven; link flow records (S in column one) or
link toll records (T in column one). The former type must be used
for Cube Analyst runs.
3 = Version seven format with a CNode column inserted to support
the input of turning counts in addition to link counts.
If hierarchic numbering is in use (HIERND = T), WIDEND must be set
to 2 to use the version seven format.
MFORM

Type = Integer
Default = 0
Range = 0-4
Example: MFORM = 0
Indicates the format of the output matrix:


0 = Save the matrix in the same format as the input matrix. This is
the default action.
1 = TRIPS
2 = TP+/VOYAGER
3 = TRANPLAN
4 = MINUTP
DEC

Type = Character(1)
Default = Blank
Range = '0' to '9', 'S', 'D', or blank
Example: DEC='4'
Defines the precision with which to store values in the output
matrix:
Blank = Uses same precision as in the input trip matrix. If just a cost
matrix is input, a value of 2 is used.
'0' to '9' = Store numbers in the matrix as integers representing
values to the specified number of decimal places.
'S' or 'D' = Store numbers as floating point numbers in either single
or double precision. Double precision gives more accuracy to a
greater number of decimal places than single. These values give the
best representation in the output matrix, but will generally produce
a bigger output file. This option is only available if the output
matrix format is TP+/VOYAGER.
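
The effect of a numeric DEC setting can be pictured with a short sketch. This illustrates the general idea of storing a value as a scaled integer; it is not the documented file format. The function names are hypothetical, and the rounding rule (Python's built-in round) is an assumption.

```python
def store_value(value, dec):
    """Scale a value to an integer holding dec decimal places (assumed rounding)."""
    return round(value * 10 ** dec)


def read_value(stored, dec):
    """Recover the value represented by a scaled integer."""
    return stored / 10 ** dec


stored_two = store_value(239551.2, 2)     # DEC='2'
stored_one = store_value(1053.27, 1)      # DEC='1' loses the second decimal
restored_one = read_value(stored_one, 1)
```

With fewer decimal places the stored matrix is coarser: in this toy case 1053.27 is read back as 1053.3.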
PSETS

Type = Integer(50)
Default = 1
Range = 1 to number of path sets in the input VOYAGER path file
Example: PSETS=1,3
Applies only when a VOYAGER path file is input. It defines the path
sets to apply when building the intercepts for the screenlines. At
least one set must be specified, and sets are referenced by their
number rather than by their name.


PVOLS

Type = Integer(50)
Default = 1
Range = 0 to number of volumes in the input VOYAGER path file
Example: PVOLS=1,2
Applies only when a VOYAGER path file is input. It defines the
volumes to apply when building the intercepts for the screenline. If
a value of 0 is specified, the volumes will be ignored, and the
weighting of alternative routes will be solely defined by the
iteration factors. Otherwise, PVOLS is a list of numbers that are 1 or
more, representing the selected volume.
NETID

Type = Character(40)
Default = Blank
Example: NETID='Network with estimated link flows'
Network identifier. Up to 40 alphanumeric characters can be used
to describe the contents of the output network. The identifier
should be enclosed in single quotes ('). NETID is only used if reading
part trip data (PRTTRP=T).
EFLOW

Type = Integer
Default = 2
Range = 1-20
Example: EFLOW = 4
The number of the volume field in the output network into which
the total link flows estimated by Cube Analyst will be written.
EFLOW is only used if PRTTRP=T.


ELINEn

Type = Integer
Default = ELINE(n)=2+n*2
PT Only
Range = 1-20
Example: ELINE1 = 6
ELINE(n) is the number of the volume field in the output network
into which the link flows estimated by Cube Analyst will be written
for line group n. ELINEn is only used if PRTTRP=T and doing a Public
Transport matrix estimation.
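
As a worked example of the default formula, the sketch below lists the default volume field for each line group. The function name is hypothetical, and the use of eight line groups is an assumption chosen for illustration; the check against the range 1-20 follows the Range entry above.

```python
def default_eline(n):
    """Default volume field number for line group n, from ELINE(n)=2+n*2."""
    field = 2 + n * 2
    if not 1 <= field <= 20:
        raise ValueError(f"line group {n} gives field {field}, outside 1-20")
    return field


# Defaults for an assumed eight line groups.
defaults = {f"ELINE{n}": default_eline(n) for n in range(1, 9)}
```

Under the formula, line group 1 defaults to field 4, so the example ELINE1 = 6 above is an override of the default.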
NFLOW

Type = Character(4)
Default = 'EFLW'
Example: NFLOW = 'EFLW'
Volume field identifier. Up to four alphanumeric characters can be
used to indicate the contents of the volume field specified by
EFLOW. The identifier should be enclosed in single quotes (').
NFLOW is only used if reading part trip data (PRTTRP=T).
NLINE

Type = Character(4)*8
Default = NLINEn='ELGn'
PT Only
Example: NLINE1 = 'ELG1'
Volume field identifier. Up to four alphanumeric characters can be
used to indicate the contents of the volume field specified by
ELINEn. The identifier should be enclosed in single quotes (').
NLINEn is only used if PRTTRP=T and doing a Public Transport
matrix estimation.


ZCONF

Type = Integer
Default = 100
Range = 1-10000
Example: ZCONF = 200
Confidence level for side constraints applied to local matrices and
derived from estimated district matrix.

Secondary user control parameters


The parameters described in this section would only be used to try
to reduce the processing times required to achieve convergence.
Refer to Computation times on page 159.
MXITER

Type = Integer
Default = 3000
Range = 1-999999
Example: MXITER = 1500
The maximum number of iterations. Cube Analyst will stop if this
number of iterations has been reached and no convergence has
been achieved. The model parameter and gradient search files are
written out and can be used to restart Cube Analyst (from the
position it was in when it stopped) and the optimization continued.
The currently estimated matrix is also output.
ITERH

Type = Integer
Default = Generated by Cube Analyst
Range = 1-9999
Example: ITERH = 4000
The number of iterations between recalculations of the estimated
Hessian matrix.

UTOL

Type = Real
Default = 0.0001
Range = 0.0-99.0
Example: UTOL = 0.05
The accuracy tolerance in detecting convergence or failure. When
the maximum absolute size of the search vector is less than this
value then the procedure will be deemed to have converged.
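The convergence test described for UTOL can be sketched as follows. The function name `has_converged` is hypothetical; the sketch only restates the rule above (largest absolute entry of the search vector compared with UTOL) and is not the program's actual code.

```python
def has_converged(search_vector, utol=0.0001):
    """True when the largest absolute entry of the search vector is below UTOL."""
    return max(abs(u) for u in search_vector) < utol


print(has_converged([0.00005, -0.00002]))  # all entries are below the default UTOL
print(has_converged([0.00005, -0.2]))      # one entry is still large
```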
IREP

Type = Integer
Default = 3
Range = 1-3
Example: IREP = 2
Reporting level for the optimization log file. See Information in the
optimization log file on page 157.
IHTYPE

Type = Integer
Default = 4
Range = 0-4
Example: IHTYPE = 2
This controls the type of optimization process used by Cube
Analyst, as shown in the following table. The differences between
values 1 to 4 correspond to differences in the way the initial
Hessian matrix, H0, is calculated.
Methods of optimization

Optimization process    Value of IHTYPE   Comments
Steepest Descent        0                 Simple searching
Quasi-Newton            1, 2, 3           1 = H0 set to unit matrix
                                          2 = H0 read from file (warm start)
                                          3 = H0 computed every iteration
Newton/Hybrid Newton    4                 Hessian calculated regularly,
                                          according to setting of ITERH

Tuning control
The parameters documented in this section would normally be
changed only in response to an error message generated by Cube
Analyst. In the event of this occurring, please contact
support@citilabs.com for advice.
MXCALL

Type = Integer
Default = 5000
Range = 1-999999
Example: MXCALL = 10000
The maximum number of function evaluations. (This should be
greater than MXITER. At least one function evaluation is required at
each iteration, possibly more.)
MXFREE

Type= Integer
Default= 4
Range= 1-10
Example: MXFREE = 7
The number of times a parameter may be freed from its bounds.
UMAX

Type = Real
Default = 1.0
Range = 0.0-1000.0
Example: UMAX = 0.5
The maximum allowed search step. If the maximum absolute value
of the search vector (called UNORMX in the log report) is greater
than this then the entire search vector is multiplied by a term
UMAX/UNORMX so that the new maximum entry is equal to UMAX.
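The step limiting rule for UMAX amounts to a simple rescaling. The sketch below illustrates it; `limit_search_step` is a hypothetical name, and the vector values are invented for the example.

```python
def limit_search_step(search_vector, umax=1.0):
    """Scale the search vector so its largest absolute entry does not exceed UMAX."""
    if not search_vector:
        return []
    unormx = max(abs(u) for u in search_vector)  # called UNORMX in the log report
    if unormx > umax:
        factor = umax / unormx
        return [u * factor for u in search_vector]
    return list(search_vector)


scaled = limit_search_step([0.5, -4.0, 2.0], umax=1.0)
print(scaled)  # the largest absolute entry is now equal to UMAX
```

A vector whose largest entry is already within UMAX is returned unchanged.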

&OPTION keywords
Note: Options TRIPM and COSTM work in conjunction with one
another.
TRIPM

Type = Logical
Default = True
If TRIPM = T then the input matrix file will contain at least two
tables. The first will be the prior trip matrix; the second will be the
associated confidence levels.
If COSTM = F then these will be the only two matrices present in
the file.
TRIPM = F is only allowed if COSTM = T.
COSTM

Type = Logical
Default = False
If COSTM = T and TRIPM = T, then the input matrix file will contain
four tables. The first two are as described above; the third will
be the cost matrix and the fourth will be the associated confidence
levels.
If COSTM = T and TRIPM = F, then the cost and confidence level
matrices will be the first and second supplied in the file.
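The way TRIPM and COSTM combine can be summarized in a short Python sketch. The function name `input_matrix_tables` and the table descriptions are illustrative only; the ordering and the rule that TRIPM = F requires COSTM = T come from the descriptions above.

```python
def input_matrix_tables(tripm=True, costm=False):
    """Return the expected order of tables in the input matrix file."""
    if not tripm and not costm:
        raise ValueError("TRIPM = F is only allowed if COSTM = T")
    tables = []
    if tripm:
        tables += ["prior trip matrix", "trip confidence levels"]
    if costm:
        tables += ["cost matrix", "cost confidence levels"]
    return tables


for tripm, costm in [(True, False), (True, True), (False, True)]:
    print(tripm, costm, input_matrix_tables(tripm, costm))
```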
SCRFIL

Type = Logical
Default = True
If SCRFIL = T then an input screenline file is supplied. See
Screenline file on page 140.

TRPEND

Type = Logical
Default = True
If TRPEND = T then an input trip end data file is supplied. See Trip
end file on page 142.
INTCPT

Type= Logical
Default= False
If INTCPT=T then an input Intercept file is supplied. See Intercept
file on page 149.
MODPAR

Type= Logical
Default= False
If MODPAR = T then an input model parameter file is supplied. See
Model parameter file on page 144.
WARMST

Type= Logical
Default= False
If WARMST = T then gradient search, model parameter, and
intercept files are supplied to warm start the estimation calculation.
The input of these files from a previous run of the same model
should assist the speed of optimization, but see Approaches to
running Cube Analyst on page 152.
HIERND

Type= Logical
Default= Set automatically from RCP input file; False if no RCP file
input.

HIERND=T indicates that a hierarchic node numbering system is in
use. This option only needs to be set if no RCP file is input. If there is
an RCP input file, the setting of HIERND in the control file will be
ignored.
NODLAB

Type= Logical
Default= False
If NODLAB=T, zone labels will be included in the Cube Analyst
reports. A coordinate file must be supplied containing node labels.
See Coordinate file on page 143.
LNGLAB

Type= Logical
Default= False
Set LNGLAB=T to include long zone labels in the Cube Analyst reports.
Note that NODLAB = T must also be set to use long labels. If
NODLAB=T and LNGLAB=F, the short zone labels will be used. A
coordinate file must be supplied containing node labels. See
Coordinate file on page 143.
PRTTRP

Type= Logical
Default= False
If PRTTRP = T then an input network file is supplied which contains
part-trip link flows.
DSTRCT

Type= Logical
Default= False
If DSTRCT = T then a local matrix control file and district definition
file must be supplied. See Local matrix control file on page 147
and District definition file on page 148.
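As a cross-check of the option descriptions above, the following sketch lists the input files implied by a set of &OPTION values. The names `DEFAULTS`, `REQUIRES` and `required_inputs` are hypothetical; the defaults and file requirements are those stated for each keyword in this section, and TRIPM, COSTM, HIERND and LNGLAB are left out because they do not call for an extra file of their own.

```python
# Defaults as documented for each &OPTION keyword.
DEFAULTS = {"SCRFIL": True, "TRPEND": True, "INTCPT": False, "MODPAR": False,
            "WARMST": False, "NODLAB": False, "PRTTRP": False, "DSTRCT": False}

# Input files each option calls for when set to T.
REQUIRES = {
    "SCRFIL": ["screenline file"],
    "TRPEND": ["trip end file"],
    "INTCPT": ["intercept file"],
    "MODPAR": ["model parameter file"],
    "WARMST": ["gradient search file", "model parameter file", "intercept file"],
    "NODLAB": ["coordinate file"],
    "PRTTRP": ["network file with part-trip link flows"],
    "DSTRCT": ["local matrix control file", "district definition file"],
}


def required_inputs(**options):
    """List the input files needed for the given option settings."""
    unknown = set(options) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown option(s): {sorted(unknown)}")
    settings = {**DEFAULTS, **options}
    needed = []
    for option, files in REQUIRES.items():
        if settings[option]:
            needed.extend(f for f in files if f not in needed)
    return needed


print(required_inputs())
print(required_inputs(WARMST=True))
```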

12

Program Specific Data

This chapter discusses files containing Cube Analyst data. Topics
include:

Screenline file

Trip end file

Coordinate file

Model parameter file

Local matrix control file

District definition file

Intercept file

Gradient search file

Screenline file
This file is required if SCRFIL = T.
The screenline file is used to supply link/turn count and confidence
level data to Cube Analyst.
Two formats of the file are supported. The original format
(indicated by parameter WIDEND=2) supports only link counts. An
alternative format (WIDEND=3) has an extra column to allow
turning counts to be specified.
This section describes both formats:

Link count format

Turning count format

Link count format


The format of the file containing just link counts is as follows:

Columns   Type        Contents
1         Character   S screenline record identifier
2 - 5     Integer     Screenline number
6 - 14    Integer     Anode of link
15 - 23   Integer     Bnode of link
24 - 33   Real        Link traffic volume count
34 - 40   Integer     Confidence level. A number between 1 and 10000,
                      but usually in the range 1-100, that expresses the
                      user's confidence in the link traffic volume count.
                      This is used only by Cube Analyst.
41 - 58   Character   Screenline name, up to 18 characters (optional)
60        Integer     Direction code. For purposes of matrix estimation
                      this must be set to 1.
61 - 70   Integer     X-coordinate (optional) at which to display
                      screenline name on the screen.
71 - 80   Integer     Y-coordinate at which to display screenline name
                      on the screen.

Turning count format


The format of the file that supports turning counts is as follows:

Columns   Type        Contents
1         Character   S screenline record identifier
2 - 5     Integer     Screenline number
6 - 14    Integer     Anode of link/turn
15 - 23   Integer     Bnode of link/turn
24 - 32   Integer     Cnode of turn (leave blank for link counts)
33 - 42   Real        Link/Turn traffic volume count
43 - 49   Integer     Confidence level. A number between 1 and 10000,
                      but usually in the range 1-100, that expresses the
                      user's confidence in the traffic volume count.
                      This is used only by Cube Analyst.
50 - 67   Character   Screenline name, up to 18 characters (optional)
70        Integer     Direction code. For purposes of matrix estimation
                      this must be set to 1.

Notes:

If a screenline contains more than one link/turn, then Cube
Analyst calculates the screenline count as the sum of the
counts for each link/turn in the screenline. Also, the screenline
confidence level is set as the weighted average of the
individual link/turn count confidence levels.

The file can contain a mixture of link and turning counts. For
link counts, the Cnode should be left blank.

Comment records, which have an asterisk (*) in column one,
may appear anywhere in the file.
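The two record layouts above can be read with simple column slicing. The following Python sketch is an illustration only: the function and dictionary names are hypothetical, the sample node numbers and counts are invented, and it assumes numeric fields may be padded with blanks within their columns. It sums the counts per screenline as described in the notes; the confidence weighting is not reproduced because the weights are not specified here.

```python
# 1-based, inclusive column ranges taken from the two format tables.
LINK_FIELDS = {"screenline": (2, 5), "anode": (6, 14), "bnode": (15, 23),
               "count": (24, 33), "confidence": (34, 40)}
TURN_FIELDS = {"screenline": (2, 5), "anode": (6, 14), "bnode": (15, 23),
               "cnode": (24, 32), "count": (33, 42), "confidence": (43, 49)}


def parse_screenline_record(line, widend=2):
    """Return a dict for one 'S' record, or None for a comment record."""
    if line.startswith("*"):
        return None
    if not line.startswith("S"):
        raise ValueError("expected 'S' or '*' in column 1")
    fields = LINK_FIELDS if widend == 2 else TURN_FIELDS
    record = {}
    for name, (first, last) in fields.items():
        text = line[first - 1:last].strip()
        if name == "count":
            record[name] = float(text)
        else:
            record[name] = int(text) if text else None  # blank Cnode = link count
    return record


def screenline_totals(lines, widend=2):
    """Sum the counts of all links/turns belonging to each screenline."""
    totals = {}
    for line in lines:
        record = parse_screenline_record(line, widend)
        if record is not None:
            key = record["screenline"]
            totals[key] = totals.get(key, 0.0) + record["count"]
    return totals


# Hypothetical link count records (WIDEND=2), built to the documented widths.
sample = [
    "* hypothetical counts for screenline 1",
    "S" + f"{1:4d}{101:9d}{102:9d}{1500.0:10.1f}{80:7d}",
    "S" + f"{1:4d}{103:9d}{104:9d}{700.0:10.1f}{50:7d}",
]
totals = screenline_totals(sample)
print(totals)

# A hypothetical turning count record (WIDEND=3) with a Cnode.
turn = parse_screenline_record(
    "S" + f"{2:4d}{101:9d}{102:9d}{105:9d}{300.0:10.1f}{60:7d}", widend=3)
print(turn)
```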

Trip end file


This file is required if TRPEND = T.
The trip end file format for Cube Analyst is as follows:

Columns   Type      Content
1 - 10    Integer   Zone Number
11 - 20   Real      Generations
21 - 40             unused
41 - 50   Real      Attractions
51 - 60   Integer   Confidence Level for Generations
61 - 70   Integer   Confidence Level for Attractions

Comment records, which have an asterisk (*) in column one, may
appear anywhere within the file.
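A trip end record can be assembled from the column layout above. The sketch below is an assumption-laden illustration: `format_trip_end_record` is a hypothetical helper, the sample values are invented, and right-justified fields with one decimal place are a choice made here rather than a documented requirement.

```python
def format_trip_end_record(zone, generations, attractions, conf_gen, conf_attr):
    """Build one 70-column trip end record; columns 21-40 are left blank."""
    return (f"{zone:10d}{generations:10.1f}{'':20}"
            f"{attractions:10.1f}{conf_gen:10d}{conf_attr:10d}")


record = format_trip_end_record(12, 1250.0, 980.5, 80, 60)
print(repr(record))
```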

Coordinate file
The input coordinate file must be supplied if option NODLAB has
been set to TRUE. The file supplies the correspondence between
node numbers and their hierarchic equivalents.
The format of the file is summarized below:
Columns     Type        Content
* 1 - 10    Integer     Node number (sequential)
11 - 20     Integer     X coordinate
21 - 30     Integer     Y coordinate
* 31 - 40   Integer     Hierarchic node number
41 - 48     Character   Text for node; short label (optional)
49 - 80     Character   Text for node; long label (optional)

Notes:

Items marked * must be coded for hierarchic processing.

Sequential node numbers must be unique.

If hierarchic node numbers are being used then:

    Hierarchic node numbers must be unique.

    Columns 31-40 must be coded on each record.

Text labels should normally be left justified in their respective
fields.

Blank records will be ignored.

Records with an asterisk (*) in column 1 will be treated as
comment records.

Node coordinates are used by the graphics programs and are
therefore optional.
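The coordinate file rules above (fixed columns, blank and comment records skipped, unique sequential node numbers) can be illustrated with a small reader. This is a sketch under assumptions: the names `parse_coordinate_file` and `optional_int` are hypothetical, the sample record is invented, and optional numeric fields are taken to be blank when not coded.

```python
def optional_int(text):
    """Convert a fixed-width field to int, or None when the field is blank."""
    text = text.strip()
    return int(text) if text else None


def parse_coordinate_file(lines):
    """Read coordinate records into a dict keyed by sequential node number."""
    nodes = {}
    for line in lines:
        if not line.strip() or line.startswith("*"):
            continue  # blank records and comment records are skipped
        padded = line.rstrip("\n").ljust(80)
        node = int(padded[0:10])
        if node in nodes:
            raise ValueError(f"sequential node number {node} is not unique")
        nodes[node] = {
            "x": optional_int(padded[10:20]),
            "y": optional_int(padded[20:30]),
            "hierarchic": optional_int(padded[30:40]),
            "short_label": padded[40:48].strip(),
            "long_label": padded[48:80].strip(),
        }
    return nodes


# Hypothetical records built to the documented widths.
sample = [
    "* node, x, y, hierarchic node, labels",
    f"{1:10d}{5000:10d}{7000:10d}{1001:10d}{'CENTRE':<8}{'Town centre zone':<32}",
    "",
    f"{2:10d}",
]
nodes = parse_coordinate_file(sample)
print(nodes[1])
```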

Model parameter file


This file is required if MODPAR = T.
This file contains data describing the model parameters and their
attributes. It would not normally be constructed by the user, as on
initiating a run of Cube Analyst the model parameters take a
default value, as shown in the table below.
However, at the end of the Cube Analyst run a file is generated
containing the new model parameters calculated. The amended
file, or indeed the unedited file, can be re-input to Cube Analyst to
invoke a warm start; that is to continue the estimation process
from where the last run finished.
The general format of the file is as follows:
Record                         Description
Record one                     Header record defining the number/type of
                               parameters in the file.
The next ZONES records         Values for the A(i) parameters
The next ZONES records         Values for the B(j) parameters
The next SCREENLINE records    Values for the X(k) parameters
The next two records           The α and β parameters of the Distribution
(if COSTM = T)                 Model

Where:

ZONES is defined as the number of zones in the matrix.

SCREENLINE is defined as the number of screenlines specified.

Note that comment records, which have an asterisk (*) in column
one, can appear anywhere in the file.

The format of the individual record types is as follows:

First record:

[Must not be edited]

Columns   Type        Content
1 - 23    Character   Model parameter file
24 - 31   Integer     Number of model parameters
32 - 39   Integer     Number of origin zones
40 - 47   Integer     Number of destination zones
48 - 55   Integer     Number of screenlines
56 - 63   Integer     1 if using cost data, otherwise 0.
64 - 77   Real        Value of objective function
78 - 91   Real        Step size
92 - 99   Integer     Number of iterations completed

Remaining records

Columns     Type      Content                      Default if parameter
                                                   not defined
1 - 8       Integer   Parameter number
10 - 22     Real      Parameter value              1.0
24 - 36     Real      Lower bound for parameter    0.1 E-6
38 - 50     Real      Upper bound for parameter    1.0 E10
52 - 64     Real      Scale factor for parameter   1.0
65 - 89               Reserved for Cube Analyst
100 - 107

If the file is not supplied by the user then it is created by Cube
Analyst and the default values shown above are used for each of
the model parameters.
If the second, third and fourth fields all have the same value then
the parameter is deemed to be fixed at this value. It is a
requirement of Cube Analyst that:

At least one parameter must be free otherwise a fatal error is
reported and the program will stop.

At least one parameter must be fixed. If not done by the user
then Cube Analyst will fix A(1).

An identical format file is created at the end of a Cube Analyst run,
but it will contain the revised parameter values. This is so that
Cube Analyst can be re-started from where the last run was finished
if required, or used as a basis for fixing parameter values. Note that
Cube Analyst adds up to three extra columns on the end of each
record which are for its own internal use. The information put there
should not be edited by the user.
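The fixing rule (value, lower bound and upper bound all equal) can be illustrated with a short reader and writer for the parameter records. This is a sketch only: the helper names are hypothetical, the exponent notation used when writing is an assumption (the manual does not state the numeric format), and the reserved columns beyond 64 are ignored.

```python
def parse_parameter_record(line):
    """Read the user-editable fields (columns 1-64) of a parameter record."""
    padded = line.ljust(64)
    return {
        "number": int(padded[0:8]),     # columns 1-8
        "value": float(padded[9:22]),   # columns 10-22
        "lower": float(padded[23:36]),  # columns 24-36
        "upper": float(padded[37:50]),  # columns 38-50
        "scale": float(padded[51:64]),  # columns 52-64
    }


def is_fixed(record):
    """A parameter is fixed when value, lower and upper bound coincide."""
    return record["value"] == record["lower"] == record["upper"]


def format_parameter_record(number, value, lower, upper, scale=1.0):
    """Write a record to the documented columns (number format is assumed)."""
    return f"{number:8d} {value:13.6E} {lower:13.6E} {upper:13.6E} {scale:13.6E}"


fixed_line = format_parameter_record(1, 1.0, 1.0, 1.0)         # fixed at 1.0
free_line = format_parameter_record(2, 1.0, 0.1e-6, 1.0e10)    # documented defaults
records = [parse_parameter_record(fixed_line), parse_parameter_record(free_line)]
print([is_fixed(r) for r in records])

# Both requirements listed above: at least one free and at least one fixed.
print(any(not is_fixed(r) for r in records), any(is_fixed(r) for r in records))
```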

Local matrix control file


This file is required if DSTRCT = T. The format of the file is as follows:
Columns   Type      Content
1 - 10    Integer   Origin District
11 - 20   Integer   Destination District

Comment records, which have an asterisk (*) in column one, may
appear anywhere in the file.

District definition file


This file is required if DSTRCT = T or WARMST = T.
The user may affect the operation of the estimation according to
the grouping of zones into origin and destination districts. The
district definition file which is input to Cube Analyst is a direct
access file, and so it is not amenable to direct alteration by the user.

Intercept file
This file is required if INTCPT = T or WARMST = T.
Output by Cube Analyst and Cube Voyager HIGHWAY and PT, this
binary file stores information on routings and screenlines in a
concise format. Once established, it may be re-input to Cube
Analyst to save (substantial) processing time when Cube Analyst is
estimating or re-estimating for data where neither the routings nor
the screenline location definitions have been altered. This file cannot
be edited by the user.
Note that there is also a text file version of the intercept file that can
be output. Its purpose is for information only; it is not intended for
subsequent input to Cube Analyst or any other program. The text file
is written only if a name is given for it, and is generated from either
the input or output binary intercept file, depending upon which is
used. For each screenline it shows:

The number of intercepting I-J pairs.

A sub-header under the screenline for each origin I that has
routes that intercept the screenline.

Under each origin, a list of pairs of numbers. The first number of
the pair represents the destination zone J. The second number
of the pair represents the percentage of traffic travelling from
the origin to the destination that routes through the screenline.

Gradient search file


This file is required if WARMST = T.
This is a binary file output by Cube Analyst which is re-read by Cube
Analyst when warm starting a run (WARMST = T). It contains
information used by Cube Analyst's optimizer and cannot be edited
by the user.

13

Notes on Program Use

This chapter contains information you might find helpful when
using Cube Analyst. Topics include:

Approaches to running Cube Analyst

Selection of model form

Information in the optimization log file

Computation times

Running Cube Analyst from Cube Voyager

Approaches to running Cube Analyst


There are several approaches that the user may adopt towards
running Cube Analyst, which vary according to the information
available to the user about the estimation of any particular matrix.
The approaches may be categorized as:

Initial estimation

Constrained model parameters

Controlling the optimization process

Initial estimation
Only basic input data is required, as contained in the routes (RCP for
TRIPS, PATH or ICP for Cube Voyager users), matrix, trip end, and
screenline files (as well as optionally a network file with part-trip
data). Program control parameters are allowed to take default
values. Occasionally, an input model parameter file may be
required to influence the model form by fixing some parameter
values.
Re-Estimation with altered data: Warm starting

In this case the model parameter, gradient search, and intercept
files from the last (or initial) estimation run are input, in addition to
the user input data.
Warm starting is only valid when the structure of the estimation is
unaltered; this means that the number of data items, screenline
locations, and routings should not be altered. However, data values
and confidence levels may be changed.
Warm starting is useful either to split an estimation into more than
one run of Cube Analyst, for the sake of convenience, or to
undertake sensitivity analysis on the effects of altered data or
confidence level values. When a run of Cube Analyst is split for
convenience, and no input data is changed, then it is efficient to set
the parameter UMAX to the value reported by Cube Analyst at the
final iteration of the previous run.

Constrained model parameters


Model parameters may be constrained to:

Reflect a user-specified value (for example, of the α and β
parameters)

Partition an estimation into sub-problems to be accommodated
within computing resources

Alter the nature of the trip estimation equation

If model parameters are fixed or freed from run to run then the
gradient search file from one run should not be used in the next.
Note that Cube Analyst may itself constrain model parameters. This
occurs when the estimated Hessian matrix is found to be,
mathematically speaking, non-positive definite, which arises
when one or more model parameters are degenerate. That is, a
model parameter is not contributing independently of another
model parameter.
In these circumstances, Cube Analyst gives a message reporting
how many such model parameters it is constraining, which is of the
form:
ME (I): XXX MODEL PARAMETERS ARE NOT CONTRIBUTING TO THE ESTIMATION

The constrained model parameters are listed in Cube Analyst's log
file.
It is not necessarily a cause for concern when Cube Analyst
constrains model parameters in this way, although it is a signal that
not all data is of value to the estimation because it is strongly
correlated with other data. For instance, link flow counts on
adjacent links of a main road may refer to substantially the same
trips and hence one count is (mathematically) redundant. It is thus
most frequently the case that X(k) model parameters are constrained
by Cube Analyst, although other model parameters may be
constrained too.

Controlling the optimization process


Normally program control parameters should be allowed to take
their default values. However, computation times may be improved
by judicious setting of the control parameters. This is discussed
further in Computation times on page 159 and in Tuning
estimation performance on page 73.

Selection of model form


Cube Analyst provides capability for the user to control the
structure of the solution and how it is achieved. This section
describes these possibilities. However, it may be observed that
Cube Analyst is usually run with the full, default model form. It may
be shown that in this form, the X(k) parameter is, strictly speaking, redundant,
although it is of value in providing extra degrees of freedom by
which Cube Analyst can handle the effects of errors and
inconsistencies in the input data.
The number of possible parameters in a model is:

Two for each zone (that is the a(i) and b(j))

One for each screenline (the X(k))

Two more if any cost data is to be used (α and β)
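The count of possible model parameters follows directly from the list above. A minimal sketch, with a hypothetical function name and invented problem sizes:

```python
def count_model_parameters(zones, screenlines, use_cost=False):
    """Two parameters per zone, one per screenline, two more with cost data."""
    return 2 * zones + screenlines + (2 if use_cost else 0)


print(count_model_parameters(100, 8))                 # without cost data
print(count_model_parameters(100, 8, use_cost=True))  # with cost data
```

For 100 zones and 8 screenlines this gives 208 parameters, or 210 when cost data is used.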

The model parameter file contains a value for the parameter, the
upper and lower bounds for the parameter, and a scaling factor.
The scaling factor is used only to assist the optimization process in
ensuring that maximum accuracy is obtained. It should be set
equal to the expected value of the parameter in the final
solution, although it is only necessary to make this scaling factor
of the same order of magnitude if there are difficulties in ensuring
convergence. If no such difficulties are apparent then the scaling
factor can just be set to 1.0.
The lower and upper bounds for the parameters allow the user to
specify the degrees of freedom which are permitted. In particular if
a model parameter is set to 1.0 and its lower and upper bounds are
also set to this value then a number of standard forms for the
matrix estimation process can be achieved. For example:

Setting all a(i), b(j) and X(k) equal to a fixed value (for example,
1.0), together with their bounds, reduces the problem to
a Gravity model driven only by the cost data (that is, only α and
β are allowed to vary). (Note that varying values of a(i) and b(j)
can be used to scale the numbers of trips per zone.)

Setting α and β and their bounds to a particular value allows
the estimation process to use cost data which has been
previously calibrated.

Setting a(i), b(j), α and β to a fixed value and allowing X(k) to
be free provides a link constraint model.

Setting X(k), α and β to a fixed value gives a growth factor model.

Setting α and β to a fixed value allows a growth factor link
model (note that α and β are only defined if cost data is
supplied).

Setting a(i) and b(j) to a fixed value defines a link gravity model.
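The reduced model forms listed above can be summarized as a lookup from the groups of parameters that are fixed to the resulting model name. The sketch below is illustrative only: the group labels "zone" (the a(i) and b(j) parameters), "screenline" (the X(k) parameters) and "cost" (the two cost parameters) are hypothetical shorthand, and the case of fixing the cost parameters at previously calibrated values is not distinguished from the growth factor link form because both fix the same group.

```python
# Keys are the parameter groups held fixed; values are the model forms above.
MODEL_FORMS = {
    frozenset(): "full (default) model form",
    frozenset({"zone", "screenline"}): "gravity",
    frozenset({"zone", "cost"}): "link constraint",
    frozenset({"screenline", "cost"}): "growth factor",
    frozenset({"cost"}): "growth factor link",
    frozenset({"zone"}): "link gravity",
}


def model_form(fixed_groups):
    """Name the model form obtained by fixing the given parameter groups."""
    return MODEL_FORMS.get(frozenset(fixed_groups), "not described in this section")


print(model_form({"screenline", "cost"}))
print(model_form({"zone"}))
```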

In some of these cases input data, although defined and requested
by the program, is not used by the estimation process. The data that
is used in these special cases is summarized in the following table.
Data used in different reduced model forms
[Table: rows list the data types (trip matrix and confidences; trip end
and confidences; routing information (RCP); trip costs); columns list
the model types (growth factor; gravity; growth factor link; link
gravity; link constraint). Cell entries missing.]

It should be observed that there are no specific model parameters
associated with part-trip data, so this form of data is not relevant to
discussions of model forms.

Information in the optimization log file


The levels of report, determined by the setting of the IREP
parameter, are as follows:

IREP=1 A report is only produced at the end of the run. This
shows:

    The reason that the optimization has been halted

    The value of the objective function (the maximum likelihood)

    The current step size

    The minimum tolerance step size

In addition a number of important variables defining the size of
the problem and the parameters input to Cube Analyst are also
displayed.

IREP=2 A report is produced as for IREP 1 and also a report at
each iteration. This shows:

    The iteration number and whether or not progress was
    made at this iteration.

    The number of evaluations made so far in this run. This is
    the number of times that a matrix and associated trip and
    screenline data have been calculated together with the
    likelihood function.

    The current step size.

    The step size tolerance.

IREP=3 A report is produced as for IREP 2 and also a report at
each time the model is evaluated to calculate the effects of a
particular choice of model parameter. This shows:

    The evaluation number

    The step size multiplier (used if no progress is made in the
    first evaluation to reduce the step size, ALPHA)

    The step size (STEP)

    The objective function at the last iteration and at the point
    at which the function evaluation is made. (FTRIAL and
    FBEST)

    A measure of the gradient at the last iteration at the point
    at which the function evaluation is made

    Other internal variables used only for intermediate
    calculations (FGOLD, FJJS)

In addition, reports may be output to the execution log file if the
gradient search matrix is found to be unstable and has to be
reinitialized.

Computation times
Program control parameters should usually take their default
values. However, computation times may be improved by adjusting
the parameters shown in the following table.
Parameters for influencing computation times

Control parameter   Comment
MXITER              MXITER should only be used to terminate an estimation
                    prematurely when there is some evidence that it is safe
                    to do so, that is, after Cube Analyst has been initially
                    allowed to reach convergence.
ITERH               There is a trade-off between the reduction in the
                    number of iterations and the number of times the
                    Hessian matrix must be re-calculated. The default value
                    of ITERH represents an average best value, but it is
                    worth experimenting with the value of ITERH for
                    different types of estimation problem.
                    In some cases, lowering the value of ITERH may guide
                    the optimizer to a solution which it otherwise could not
                    find. In other cases, estimating the Hessian too
                    frequently will add to the run time, sometimes
                    significantly.
UTOL                Examination of the log file, and the screen display, will
                    show how the convergence indicator, UNORM, is
                    approaching the target value set by UTOL. Larger values
                    of UTOL increase the risk of Cube Analyst terminating
                    significantly away from the most likely value of the
                    estimated matrix for the set of input data, while lower
                    values of UTOL imply lower standard errors for the
                    model parameters.

Running Cube Analyst from Cube Voyager


You can run Cube Analyst from Cube Voyager. This section
discusses:

Running Cube Analyst from a VOYAGER script

Files

Running Cube Analyst from a VOYAGER script


Cube Analyst can be executed via a RUN PGM statement, where the
program name needs to be specified as MVESTM71.
Cube Analyst needs to be told the name of the control file to use,
and how much memory it can take. The former is achieved via the
CTLFILE keyword. The memory setting is achieved by specifying a
PARAMETERS entry on the RUN PGM command of the form
PARAMETERS="/m=14" where the number 14 indicates the
amount of memory to use in MB. If no PARAMETERS entry is
specified, a default amount will be applied, which will be insufficient
for larger problems.
For example, the following command:
RUN PGM="MVESTM71", CTLFILE="C:\Test\Me_Test.ctl", PARAMETERS="/m=100"
ENDRUN

will run Cube Analyst, providing 100 MB of memory.
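When scripts are generated automatically, the statement above can be assembled from its parts. The following Python sketch is an illustration only: `run_pgm_statement` is a hypothetical helper, not a Cube Voyager facility, and it simply reproduces the keywords shown in the example (PGM, CTLFILE, PARAMETERS).

```python
def run_pgm_statement(ctl_file, memory_mb=None, program="MVESTM71"):
    """Build the RUN PGM ... ENDRUN text for a Cube Analyst control file."""
    parts = [f'RUN PGM="{program}"', f'CTLFILE="{ctl_file}"']
    if memory_mb is not None:
        # Omitting PARAMETERS leaves the memory allocation at the program default.
        parts.append(f'PARAMETERS="/m={memory_mb}"')
    return ", ".join(parts) + "\nENDRUN"


statement = run_pgm_statement(r"C:\Test\Me_Test.ctl", memory_mb=100)
print(statement)
```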

Files
For Cube Voyager users, the intercept file can be generated by
HIGHWAY and PT. In this case the intercept file name should always
be defined, along with the option INTCPT=T. Alternatively,
HIGHWAY can be used to generate a Cube Voyager Path file, which
is input to Cube Analyst, which will create the screenline Intercepts
from the paths.

14

Examples

This chapter contains a set of examples. Topics include:

Estimation with prior trip and count data only

Estimation with prior trip, count, and trip end data

Estimation with warm start and cost data

Estimation with highways part trip data

Estimation with public transport part-trip data

Hierarchic estimation

Example of screenline volumes report

Estimation with prior trip and count data only


The following control data would be appropriate for an initial
estimation when the only data available for updating a matrix is
from count sites.
Column
1...5...10...15...20...25...30...35..40..45..50..55..60
Estimate with Old Matrix and Count Data
&FILES IMAT1='PRIOR.MAT',
IDAT4='SCRL.DAT',
IDAF1='ROUTES.RCP',
OPRN='MVESTM.PRN',
OMAT1='ESTM.MAT',
ODAT1='PARM.DAT',
ODAT2='GRAD.GDS',
ODAT3='LOG.DAT',
ODAT4='INTCPT.ICP' &END
&PARAM MATID='Estimated Matrix' &END
&OPTION TRPEND=F &END
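Control data of this shape can also be produced programmatically. The Python sketch below is a hedged illustration: the helper names are hypothetical, it covers only the title line and the &FILES, &PARAM and &OPTION blocks as they appear in the example above, and it does not validate keyword names or value ranges.

```python
def value_text(value):
    """Render a value: logicals as T/F, text in single quotes, numbers as-is."""
    if isinstance(value, bool):
        return "T" if value else "F"
    if isinstance(value, str):
        return f"'{value}'"
    return str(value)


def render_block(name, entries):
    """Render one block such as &FILES ... &END, one keyword per line."""
    body = ",\n".join(f"{key}={value_text(value)}" for key, value in entries.items())
    return f"&{name} {body} &END"


def control_data(title, files, params, options):
    """Assemble the title line followed by the three keyword blocks."""
    return "\n".join([title,
                      render_block("FILES", files),
                      render_block("PARAM", params),
                      render_block("OPTION", options)])


text = control_data(
    "Estimate with Old Matrix and Count Data",
    files={"IMAT1": "PRIOR.MAT", "IDAT4": "SCRL.DAT"},
    params={"MATID": "Estimated Matrix"},
    options={"TRPEND": False},
)
print(text)
```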

Estimation with prior trip, count, and trip end data


This is a similar run to the previous example, but additionally trip end data is available.
Also, short zone labels are included in the reports.
Column
1...5...10...15...20...25...30...35..40..45..50..55..60
Estimate with an Old Matrix, Counts, and Trip End Data
&FILES IMAT1='PRIOR.MAT',
IDAT1='TEND.DAT',
IDAT4='SCRL.DAT',
IDAT7='COORD.DAT',
IDAF1='ROUTES.RCP',
OPRN='MVESTM.PRN',
OMAT1='ESTM.MAT',
ODAT1='PARM.DAT',
ODAT2='GRAD.GDS',
ODAT3='LOG.DAT',
ODAT4='INTCPT.ICP' &END
&PARAM MATID='Estimated Matrix - including Trip End Data' &END
&OPTION NODLAB=T, LNGLAB=F &END

Estimation with warm start and cost data


The following control data would be suitable if, say, some
confidence levels had been altered in data input files, and where
the data included both trip and cost matrices, as well as trip end
and count data. Long zone labels are to be included in the reports.
Column
1...5...10...15...20...25...30...35..40..45..50..55..60
Re-Estimation with altered Confidence Levels
&FILES IMAT1='PRIOR.MAT',
IDAT1='TEND.DAT',
IDAT2='PARM.DAT',
IDAT3='GRAD.DAT',
IDAT4='SCRL.DAT',
IDAT7='COORD.DAT',
IDAT5='INTCPT.ICP',
IDAF1='ROUTES.RCP',
OPRN='MVESTM.PRN',
OMAT1='ESTM2.MAT',
ODAT1='PARM2.DAT',
ODAT2='GRAD2.GDS',
ODAT3='LOG.DAT' &END
&PARAM TABLES=101, 102, 103, 104, MATID='Re-Estimated Matrix' &END
&OPTION TRIPM=T,
COSTM=T,
NODLAB=T,
LNGLAB=T,
WARMST=T &END

Estimation with highways part trip data


The following control data would be suitable where the data
included part-trip data as well as a trip matrix, trip ends, and count
data.
Column
1...5...10...15...20...25...30...35..40..45..50..55..60
Estimation including Part Trip data
&FILES IMAT1='PRIOR.MAT',
IDAT1='TEND.DAT',
IDAT4='SCRL.DAT',
IDAF1='ROUTES.RCP',
INET1='PTRIPS.NET',
OPRN='MVESTM.PRN',
OMAT1='ESTM.MAT',
ODAT1='PARM2.DAT',
ODAT2='GRAD2.GDS',
ODAT3='LOG.DAT',
ODAT4='INTCPT.ICP',
ONET1='ESTMPTRP.NET' &END
&PARAM MATID='Estimated Matrix using Part Trip data',
NETID='Estimated Flows (with Part Trip Flows)',
EFLOW=7,
NFLOW='ESTM' &END
&OPTION PRTTRP=T &END

Estimation with public transport part-trip data


The following control data would be suitable for estimating a
public transport matrix where the data included part-trip data,
organized into three line groups, as well as a trip matrix, trip ends,
and count data.
Column
1...5...10...15...20...25...30...35..40..45..50..55..60
Estimation including Part Trip data by line group
&FILES IMAT1='PRIOR.MAT',
IDAT1='TEND.DAT',
IDAT4='SCRL.DAT',
IDAF1='ROUTES.RCP',
IDAF2='LINES.PTL',
INET1='PTRIPS.NET',
OPRN='MVESTM.PRN',
OMAT1='ESTM.MAT',
ODAT1='PARM.DAT',
ODAT2='GRAD.GDS',
ODAT3='LOG.DAT',
ODAT4='INTCPT.ICP',
ONET1='ESTMPTRP.NET' &END
&PARAM MATID='Estimated Matrix using Part Trip data',
NETID='Estimated Flows (Prt Trp Flows by Line Group)',
ELINE1=4,
ELINE4=6,
ELINE5=8,
NLINE1='EXPR',
NLINE4='LOCL',
NLINE5='AIRP' &END
&OPTION PRTTRP=T &END


Hierarchic estimation
The following control data would be suitable for estimating a very
large public transport matrix where the data included part-trip data
as well as a trip matrix, trip ends, and count data.
Column
1...5...10...15...20...25...30...35..40..45..50..55..60
Large PT Estimation using Part Trip Total Link Flow
&FILES IMAT1='PRIOR.MAT',
IDAT1='TEND.DAT',
IDAT4='SCRL.DAT',
IDAF1='ROUTES.RCP',
IDAF2='LINES.PTL',
IDAF3='DISTRICT.DDF',
IDAT6='LMCTL.DAT',
OPRN='MVESTM.PRN',
OMAT1='ESTM.MAT',
ODAT1='PARM.DAT',
ODAT2='GRAD.GDS',
ODAT3='LOG.DAT',
ODAT4='INTCPT.ICP',
ONET1='ESTMPTRP.NET' &END
&PARAM MATID='Estimated Matrix using Part Trip data',
NETID='Estimated Flows (total Part Trip flow)',
EFLOW=2,
NFLOW='ETOT' &END
&OPTION PRTTRP=T,
DSTRCT=T &END


Example of screenline volumes report


REPORTING OBSERVED/ESTIMATED SCREEN LINE COUNTS
SCREENLINE            CONFIDENCE  OBSERVED  ESTIMATED  ESTM-OBSV  OBSV(%)  NO OF ODs
NO & NAME
1 A'shot Rd W-E             80.0   11677.0    11335.4     -341.6    -2.9%        244
2 A'shot Rd E-W             80.0   11677.0    11734.9       57.9     0.5%        244
3 A3-Hogs Back S-N         200.0   27947.0    26672.2    -1274.8    -4.6%        160
4 A3-Hogs Back N-S         200.0   25504.0    25160.5     -343.5    -1.3%        160
5 Onslow St S-N             80.0   18981.0    17479.2    -1501.8    -7.9%        383
6 Onslow St N-S             80.0   18809.0    18285.2     -523.8    -2.8%        687
7 Town Centre E-W           80.0   16285.0    16556.5      271.5     1.7%        904
8 Town Centre W-E           80.0   22670.0    22494.5     -175.5    -0.8%        870
<continued>
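
The two difference columns in the report above can be reproduced from the observed and estimated counts. The following Python sketch assumes that ESTM-OBSV is the estimated count minus the observed count and that the percentage is taken relative to the observed count; this reading agrees with all eight rows shown, but the function name `screenline_diff` is hypothetical and not part of Cube Analyst.

```python
def screenline_diff(observed, estimated):
    """Return (estimated - observed, that difference as a % of observed)."""
    diff = estimated - observed
    pct = 100.0 * diff / observed
    return round(diff, 1), round(pct, 1)


# Observed and estimated counts copied from two rows of the report.
rows = {
    "A'shot Rd W-E": (11677.0, 11335.4),
    "Onslow St S-N": (18981.0, 17479.2),
}

for name, (obs, est) in rows.items():
    diff, pct = screenline_diff(obs, est)
    print(f"{name:<16} {diff:>9.1f} {pct:>6.1f}%")
```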


Index

A
analysis
Cube Analyst results 75
C
calculations
hierarchic estimation 57
common elements 6
computation times 159
computing resources 9
confidence levels
setting 70
controlling routing information 74
conventions used 8
coordinate file 143
cost data
example using 164
sources of 10
cost distribution function 29
COSTM 136
count data, example using 162, 163
Cube Analyst
overview 21
running for hierarchic estimation 106
running, approaches to 152
D
data
Cube Analyst estimation process 81
sets, Cube Analyst 30
types, Cube Analyst 26
DDF 106
defining
districts 104
district definition file 148
district matrix, calculation of 57
districts, defining in hierarchic estimation 104
DSTRCT
&OPTION keyword 138
district definition file requirement 148
E
EFLOW, &PARAM control parameter 131
ELINEn, &PARAM control parameter 132
estimation
example 163
highway matrices 20
large matrices 112
matrix calculation process 83
performance, tuning 73
process, overview 113
public transport matrices 20
evaluating matrix sensitivity 86
examples
estimating part-trip highway data 165
estimating public transport matrix 166
hierarchic estimation 167
screenline volumes report 168
F
framework for inputting data, Cube Analyst 12
G
gradient search file 150
H
hierarchic estimation
alternatives to 96

approaches to 95
example 167
large matrices, approach for 94
levels of detail 94
matrices calculated 57
overview 94
HIERND 136
&OPTION keyword, Cube Analyst 137
highway matrix
estimating with part trip data, example 165
estimating, compared to public transport 20
I
ICP 74
IHTYPE
&PARAM keyword, secondary parameter 134
including part-trip data 87
inputs
estimating O-D matrix 110
INTCPT 136, 149
intercept file
description 149
introduction 2
IREP 128, 157
ITERH
&PARAM keyword, secondary parameter 133
influencing computation time with 159
L
large matrices, estimating 112
link counts, Cube Analyst input 26
LMC 106
LNGLAB 136
local matrices
control file 147
hierarchic estimation 98
M
mathematical notation
equation 34
letters and symbols 32
mathematics
calculations summary 48
introduction 35
MATID, &PARAM keyword 128
matrices
estimation process 83
preparing for analysis 61
maximum likelihood method 48
mixed districts 97

model form, selecting 155
model parameter file 144
MODPAR, &OPTION keyword 137
MODPAR, required file 144
MXCALL 128
MXFREE 128
MXITER 128, 159
N
NETID 128
networks
preparing data for 64
NFLOW, &PARAM keyword 132
NLINE 128
NODLAB 136, 143
O
objective of Cube Analyst 13
O-D 2
optimization log file 157
OPTION Keywords 136
options for using Cube Analyst 15
origin-destination matrix, estimating 2
outputs
Cube Analyst, overview 111
P
PARAM Keywords 128
partial O-D matrix, data type 27
part-trip data
including in estimation 87
inputs 29
routing information 74
passenger counts 64
permanent files 125
prior trip data, estimation with 163
prior trip matrix data 27
PRTTRP 136
public transport matrices 20
public transport part-trip data 166
R
RCP 74
reports
summary, Cube Analyst 116
results analysis 75
route choice probability 68
routing information 28, 74
routings 68
running Cube Analyst

approaches 152
from Cube Voyager 160
hierarchic estimation 106
running Cube Cargo
hierarchic estimation 106
S
screenline file
description, Cube Analyst 140
screenline volumes report 168
screenlines 65
SCRFIL 136, 140
selecting model form 155
Sensitivity Analysis 86
Setting 70
Confidence Levels 70
Data 30
Side Constraints 57
Study Area 80
summary of reports 116
T
TABLES user control parameter 128
The Estimation Model 113
The Optimization Step 113
traffic counts 64
Trip Cost Matrix 26
Trip End Data 163
Trip End File 142
trip ends
data description 28
determining 63
trip matrix
estimating in Cube Analyst 2
TRIPM 136
TRPEND 136, 142
Tuning 73
Estimation Performance 73
U
UMAX 128
user options 15
UTOL 128, 159
V
Variations 6
W
Warm Start 164
WARMST 136, 150
What's New in Version 7.1 4
WIDEND 128
Z
ZCONF 106, 128
Zonal Detail 96


Citilabs, Inc.
1211 Miccosukee
Tallahassee FL 32308 USA
World Wide Web
www.citilabs.com
