Distinguished Lecture Series in Statistical Science
September 28 and October 26, 2000
Peter G. Hall
Australian National University

1. Data Tuning
Until recently, altering one's data was sacrilegious. A major problem
was that we didn't know how to do it objectively. Altering the data
according to objective criteria turns out to be a surprisingly computer-intensive
business, and in many instances wouldn't have been feasible a decade
or two ago. Today, however, thanks to the ready availability of
computing power, we can do all sorts of complex things to the data.
Data-tuning methods alter the data so as to enhance performance
of a relatively elementary technique. The idea is to retain the
advantageous features of the simpler method, and at the same time
improve its performance in specific ways. Different approaches to
data tuning include physically altering the data (data sharpening),
reweighting or tilting the data (the biased bootstrap), adding extra
"pseudo data" derived from the original data, or a combination
of all three. Tilting methods date back to the 1950s, although
only recently have they become popular. Evidence is growing, however,
that sharpening is more effective than tilting, since it doesn't
reduce effective sample size.
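
To make the sharpening idea concrete, here is a minimal sketch in Python of the
general recipe (perturb the data slightly, then hand them to an elementary
estimator), assuming a Gaussian kernel density estimator as the elementary
technique. The function names kde and sharpen, the bandwidth, and the simulated
sample are illustrative assumptions, not the lecture's methodology; the map
x -> x + (h^2/2) fhat'(x)/fhat(x) is a commonly used bias-reducing sharpening
step for this setting.

import numpy as np

def kde(x_eval, data, h):
    # Ordinary Gaussian kernel density estimate: the "elementary technique".
    u = (x_eval[:, None] - data[None, :]) / h
    return np.exp(-0.5 * u ** 2).sum(axis=1) / (len(data) * h * np.sqrt(2 * np.pi))

def sharpen(data, h):
    # Nudge each observation a small step uphill in a pilot density estimate:
    # x -> x + (h^2 / 2) * fhat'(x) / fhat(x).  The sharpened sample is then
    # fed back into the same simple estimator.
    u = (data[:, None] - data[None, :]) / h
    k = np.exp(-0.5 * u ** 2)
    fhat = k.sum(axis=1) / (len(data) * h * np.sqrt(2 * np.pi))
    dfhat = (-u * k).sum(axis=1) / (len(data) * h ** 2 * np.sqrt(2 * np.pi))
    return data + 0.5 * h ** 2 * dfhat / fhat

rng = np.random.default_rng(0)
x = rng.normal(size=500)              # toy sample; any univariate data would do
h = 0.4
grid = np.linspace(-3.0, 3.0, 61)
raw = kde(grid, x, h)                 # estimate from the original data
tuned = kde(grid, sharpen(x, h), h)   # same estimator, physically altered data

Applying the same simple estimator to the nudged points retains its form while,
in principle, cancelling the leading bias term; that is the sense in which the
data, rather than the estimator, are tuned.
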
2. Estimating Fault Lines and Boundaries
A fault line in a regression model with bivariate design, Z_i = f(X_i, Y_i) + error,
is a curve in the (x, y)-plane along which the function z = f(x, y) has a
fault-type jump discontinuity. Such problems arise,
for example, in the measurement of benthic impacts or the estimation
of lines along which sea-surface temperatures change. The fault
is not necessarily the result of simple 'slippage', and in particular
gradients do not necessarily match at the top and bottom of the
fault. We shall describe methodology for both point and interval
estimation of fault lines, and for related problems such as estimation
of fault lines in density or intensity surfaces, or estimation of
support boundaries.
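
As a rough illustration of the point-estimation problem only (the lecture's
estimators rest on more refined local smoothing on either side of a candidate
curve), one can scan a narrow vertical strip around each design abscissa for
the level at which local averages of Z just above and just below differ most.
Everything below, the function crude_fault_line, the window constants, and the
simulated jump surface, is a hypothetical sketch.

import numpy as np

def crude_fault_line(x, y, z, x_grid, half_width=0.1, band=0.1):
    # For each x0, pick the level y0 whose one-sided local means of Z differ
    # most: a crude contrast estimator of the jump location.
    fault_y = []
    for x0 in x_grid:
        in_strip = np.abs(x - x0) < half_width
        ys, zs = y[in_strip], z[in_strip]
        if len(ys) < 20:                  # not enough data in this strip
            fault_y.append(np.nan)
            continue
        best_y, best_gap = np.nan, -np.inf
        for y0 in np.linspace(ys.min() + band, ys.max() - band, 50):
            above = zs[(ys > y0) & (ys <= y0 + band)]
            below = zs[(ys < y0) & (ys >= y0 - band)]
            if len(above) >= 5 and len(below) >= 5:
                gap = abs(above.mean() - below.mean())
                if gap > best_gap:
                    best_y, best_gap = y0, gap
        fault_y.append(best_y)
    return np.array(fault_y)

# Surface with a jump of size 2 along the line y = 0.5 + 0.2 x.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, 5000)
y = rng.uniform(0.0, 1.0, 5000)
z = np.where(y > 0.5 + 0.2 * x, 2.0, 0.0) + 0.3 * rng.normal(size=5000)
print(crude_fault_line(x, y, z, np.linspace(0.2, 0.8, 7)))  # roughly 0.5 + 0.2 * x_grid

Interval estimation, and the density, intensity and support-boundary versions
of the problem, need considerably more than this contrast scan; the sketch is
meant only to fix the geometry of the problem.
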