Scientific Methodology and Experimental Evaluation

Objective

The course aims to provide the fundamental basis for a sound scientific methodology of experimental evaluation in computer science. The lecture emphasizes methodological aspects of measurement and the statistics needed to analyze computer systems, human-computer interaction systems, and machine learning systems. We first sensitize the audience to reproducibility issues in empirical computer science research, as well as to ethics and scientific integrity. We then present tools that help address these issues and give the audience the basics of probability and statistics required to design sound experiments. The content of the lecture is therefore both theoretical and practical, illustrated by many case studies and practical sessions. The goal is not to provide analysis recipes or techniques that researchers can blindly apply, but to help students develop critical thinking and understand some simple (and possibly not-so-simple) tools that they can readily use now and explore further later on.

Teachers

  • Élise Arnaud (Design of Experiments, Data Assimilation, Sensitivity Analysis)
  • Céline Coutrix (Human Machine Interaction, Consent forms, Ethics)
  • Arnaud Legrand (Open Science, Reproducible Research, Statistics, Design of Experiments)
  • Jean-Marc Vincent (Markovian Models, Performance Evaluation, Epistemology, Tracing, Simulation)

Prerequisites

The lecture is self-contained and targets second-year master's students in computer science. We will mostly use the R language during the lecture, but most programs will be a few lines of script, and we will provide references for learning the basics.
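
For illustration, a typical practical-session script is only a few lines of R. The sketch below assumes a hypothetical CSV file with one measured duration per run; the file and column names are made up:

    # Hypothetical example: summarize measured durations and report a confidence interval
    df <- read.csv("measurements.csv")   # one row per run, column "duration" in seconds
    summary(df$duration)                 # quick sanity check of the distribution
    t.test(df$duration)$conf.int         # 95% confidence interval for the mean duration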

Evaluation

Marks are based on:

  • Several homework assignments and practical evaluations, counting for 50%,
  • and a three-hour final written exam, counting for 50%.

Program

Here are the topics that will be covered during the lecture.

  • Epistemology, publications, ethics, scientific integrity, deontology
    • Computer Science is an Experimental Science: randomness is unavoidable whenever human beings are involved, and it can no longer be ignored given the complexity of modern computer systems (networks, CPUs, hardware/software stacks) or in machine learning contexts, which rely on observational data and remain largely empirical.
    • Science is defined by its method, not by its results: Claude Bernard, Karl Popper, Thomas Kuhn, Imre Lakatos, ...
    • Credibility crisis, Ethics, scientific integrity, deontology
  • Open Science and Reproducible Research
    • Laboratory notebook
    • Version control and archiving
    • Data management
    • Computational documents (Jupyter, RStudio, Org mode)
    • Software environment control (containers, package management systems)
    • Ethical and legal data usage (data management plan, consent form, ...)
  • Exploratory Data Analysis
    • Data curation (missing data, outliers, typing issues)
    • Data visualisation and hypothesis checking
    • Data processing pipelines
    • Communicating results
  • Introduction to statistics
    • Random variables, central limit theorem, confidence interval, statistical test (a short illustration in R follows the program list)
    • Bayesian framework: Bayes' rule, Maximum likelihood vs. Posterior sampling, Credible intervals, Hierarchical modeling principles (example with clustering)
    • ANOVA, Linear regression and extensions (mostly logistic)
    • Gaussian Process
  • Observation vs. Experiment
    • Correlation vs. causation: mostly "don'ts"
    • Notions of bias (statistical, experimental, observational/sampling, etc.)
    • Metrology: measurement and tracing, precision, practical computer science issues and tools
    • Counter-factual/causal analysis
  • Experimental Design
    • Methodology (fishbone diagrams, experiment structure)
    • Difference between quantitative/qualitative observational/experimental data/analysis
    • Sequential vs. incremental approach
    • 2-level factorial designs, screening designs, LHS/MaxiMin designs
    • Active/online learning with bandits (ε-greedy, UCB, Thompson sampling) and extensions (surrogates: GP-UCB, EI); see the ε-greedy sketch after this list
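
As a small illustration of the confidence-interval material above, here is a sketch on simulated data (not part of the course material; the distribution and sample size are arbitrary):

    # Simulate 30 measurements and compute a 95% confidence interval for their mean
    set.seed(42)
    duration <- rexp(30, rate = 1/2)   # hypothetical exponential response times, true mean 2
    mean(duration)                     # point estimate of the mean
    t.test(duration)$conf.int          # interval expected to contain the true mean ~95% of the time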
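
And here is a minimal sketch of the ε-greedy bandit strategy mentioned in the last item, on two simulated Bernoulli arms (the reward probabilities are made up for illustration):

    # epsilon-greedy: explore with probability eps, otherwise pull the empirically best arm
    set.seed(1)
    p   <- c(0.4, 0.6)                  # hypothetical true success probabilities of the two arms
    n   <- c(0, 0)                      # pull counts
    s   <- c(0, 0)                      # cumulated rewards
    eps <- 0.1
    for (t in 1:1000) {
      a    <- if (runif(1) < eps || any(n == 0)) sample(2, 1) else which.max(s / n)
      r    <- rbinom(1, 1, p[a])        # Bernoulli reward from the chosen arm
      n[a] <- n[a] + 1
      s[a] <- s[a] + r
    }
    s / n                               # empirical mean reward of each arm (should approach p)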