Experimental design techniques were first developed in agriculture where
the data collection process takes years and every bit of information counts.
We can best help our customers if we are involved from the early stages
of data collection, so that it is as cost-effective and efficient as
possible. We can
work closely with the engineers to solve difficult problems of calibration
and normalization. We use well-developed tools of statistical learning
theory to predict the minimum size data sets required to train the learning
machine and accurately predict its performance on unseen data.
Exploratory data analysis
We use the tools of classical statistics, including Principal Component
Analysis and clustering, to sanity-check the data and visualize its
intrinsic structure.
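
To make the idea concrete, here is a minimal sketch of Principal
Component Analysis that finds only the first principal component, using
power iteration on the sample covariance matrix. The function name
`first_principal_component`, the starting vector and the iteration count
are choices made for this example; a production analysis would normally
rely on a numerical library rather than plain Python lists.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, variance) of the first principal component.

    `rows` is a list of equal-length numeric sequences, one per example.
    The direction is a unit vector; the variance is the sample variance
    of the data projected onto it.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (divides by n - 1).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]

    # Uneven starting vector: power iteration cannot converge if the start
    # is exactly orthogonal to the leading direction, which this makes
    # unlikely but does not rule out.
    v = [1.0 / (j + 1) for j in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]

    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance along v (e.g. constant data)
            break
        v = [x / norm for x in w]

    variance = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                   for i in range(d))
    return v, variance


if __name__ == "__main__":
    # Hypothetical measurements lying close to a straight line.
    data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]
    direction, variance = first_principal_component(data)
    print(direction, variance)
```

Projecting each example onto the returned direction gives a
one-dimensional view of the data; outliers and unexpected groupings in
that view are often the first sign of a data-quality problem.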
Confidence intervals and hypothesis testing
The last step of data analysis is to assess with what confidence
certain claims can be made. Examples in machine learning include finding
with what confidence we can assert that one learning machine will make
better predictions than another or with what confidence we can assert that
the error rate of a learning machine will be less than a certain value.
Other examples in classical statistics include finding how much we can
trust a given correlation between variables or the invariance of a variable
with respect to a given parameter change. We can give you answers to these
questions using well-known hypothesis testing methods, including the t-test
and the analysis of variance. We also design our own tests as needed.
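
As an example of such a test, the sketch below computes the statistic
and degrees of freedom of Welch's two-sample t-test, a variant of the
t-test that does not assume equal variances. The function name
`welch_t` and the sample values are hypothetical. Turning the statistic
into a p-value requires the Student t distribution, which the Python
standard library does not provide, so that last step is left to a
statistical table or a statistics package.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's two-sample t-test.

    t  : difference of sample means divided by its standard error
    df : Welch-Satterthwaite approximation to the degrees of freedom
    Assumes independent samples with at least two values each and
    non-zero variance in at least one of them.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample.
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


if __name__ == "__main__":
    # Hypothetical error rates (in percent) of two learning machines,
    # each measured on five independent test sets.
    machine_a = [12.1, 11.4, 13.0, 12.6, 11.9]
    machine_b = [13.5, 12.8, 14.1, 13.3, 13.9]
    t, df = welch_t(machine_a, machine_b)
    print(f"t = {t:.3f}, df = {df:.1f}")
```

A large absolute value of `t` relative to the t distribution with `df`
degrees of freedom is evidence that the two machines differ. When both
machines are evaluated on the same test sets, a paired test is usually
the more appropriate choice.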