For the slides and preprints, see [slides/papers/photos]
Morning invited speakers
Challenges in Medical Image Analysis: Comparison, Competition, Collaboration
Medical image analysis is a growing field in which machine learning is increasingly used as the core technology. Digital radiology has now been available for almost 20 years, and large repositories of digital radiological data have by now been established. The performance of computer algorithms for dedicated tasks is increasing and approaching that of human experts, and in some cases even surpassing it. Validation of algorithm performance and fair comparison between algorithms are therefore increasingly important. Starting in 2007, challenges in which different algorithms for a particular task are compared on a common test data set have become regular events at major medical image analysis conferences. A large number of challenges have been summarized in jointly written overview papers which are highly cited. We maintain a website, www.grand-challenge.org, that keeps track of these events and provides a platform for hosting them. In my presentation I will review major challenges and discuss lessons learned. New trends include combining algorithms in an automated manner and making algorithms available in virtual machines. I will discuss a few major upcoming high-profile challenges such as coding4cancer.org, as well as challenges in new imaging areas outside of radiology, such as retinal imaging and digital pathology.
Techniques and Technologies for Efficient and Realistic Benchmarks: Examples from the MediaEval Multimedia Benchmark and CLEF NewsREEL
A successful and effective machine learning challenge must be efficient to execute and must also maintain a strong connection to real-world use scenarios. This talk discusses how these issues are handled by two different benchmarks: the MediaEval Benchmarking Initiative for Multimedia Evaluation (http://multimediaeval.org) and the CLEF NewsREEL News Recommendation Evaluation Lab (http://www.clef-newsreel.org).
MediaEval offers challenges that are tightly tied to real-world use scenarios. A unique characteristic of MediaEval is that it is a “bottom-up benchmark”: a coalition of autonomous tasks independent of a single central authority. We discuss how MediaEval tasks are designed and organized. An example task is “Person Discovery”. This task requires participants to tag people speaking in broadcast TV content, without using a pre-defined list of speakers. The task is particularly interesting since it has given rise to the CAMOMILE platform (https://github.com/camomile-project/camomile-server) that supports the full benchmarking workflow (registration, distribution, submission, collaborative annotation, evaluation) and offers a live leaderboard. The platform can be used by any challenge that involves the association between a video fragment and an annotation.
NewsREEL offers a news recommendation challenge. This challenge is unique in that it does not use a predefined data set, but is rather run online. Challenge participants must implement systems capable of responding to a stream of recommendation requests while respecting a response time limit. Recommendations produced by the participant systems are presented to website visitors, and systems are judged by the number of clicks that they receive within the challenge time window. Research related to the NewsREEL challenge focuses on developing a recommender system framework, called Idomaar (http://rf.crowdrec.eu) that is capable of replaying streams, thus simulating online evaluation offline. Connecting online and offline evaluation allows participants to understand the challenge and more efficiently develop systems for online use.
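The replay idea behind Idomaar can be sketched in a few lines: logged recommendation requests are replayed in order, the recommender must answer each within a time budget, and the clicks recorded in the log score the answers. All names below are illustrative, not the actual Idomaar API, and the popularity recommender is a deliberately trivial stand-in for a participant system.

```python
import time

# Per-request response-time budget, mimicking the online constraint.
TIME_LIMIT = 0.1  # seconds

def most_popular(history, k=3):
    """Toy recommender: rank items by click count seen so far."""
    counts = {}
    for item in history:
        counts[item] = counts.get(item, 0) + 1
    return sorted(counts, key=counts.get, reverse=True)[:k]

def replay(stream):
    """Replay (user, clicked_item) pairs; count timely recommendation hits."""
    history, hits = [], 0
    for _, clicked in stream:
        start = time.perf_counter()
        recs = most_popular(history)
        # A hit only counts if the answer arrived within the time budget.
        if time.perf_counter() - start <= TIME_LIMIT and clicked in recs:
            hits += 1
        history.append(clicked)
    return hits

log = [("u1", "a"), ("u2", "a"), ("u3", "b"), ("u4", "a"), ("u5", "b")]
print(replay(log))  # → 3
```

Because the logged stream fixes both the request order and the ground-truth clicks, the same offline replay can be repeated to compare systems before deploying them against live website visitors.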
AutoML break out session
The AutoML challenge: design and first results
ChaLearn is organizing an Automatic Machine Learning challenge (AutoML) to solve classification and regression problems from given feature representations, without any human intervention. This is a challenge with code submission: the submitted code can be executed automatically on the challenge servers to train and test learning machines on new datasets. However, there is no obligation to submit code; half of the prizes can be won just by submitting prediction results. There are six rounds (Prep, Novice, Intermediate, Advanced, Expert, and Master) in which datasets of progressive difficulty are introduced (5 per round). There is no requirement to participate in previous rounds to enter a new round. The rounds alternate AutoML phases, in which submitted code is “blind tested” on datasets the participants have never seen before, and Tweakathon phases, which give participants time (several months) to improve their methods by tweaking their code on those datasets. This challenge will push the state of the art in fully automatic machine learning on a wide range of problems taken from real-world applications.
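The blind-tested code-submission setup can be pictured schematically: the organizers call a fit method on a never-before-seen training set and then a predict method on the test set, with no human in the loop. The class name and method signatures below are illustrative, not the official starter-kit API, and the nearest-class-mean learner is a deliberately trivial stand-in for a real AutoML system.

```python
import numpy as np

class Model:
    """Schematic code-submission entry: fit and predict, fully automatic."""

    def fit(self, X, y):
        # Trivially "automatic" learner: store the mean of each class.
        self.classes_ = np.unique(y)
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        # Assign each point to the nearest class mean.
        d = ((X[:, None, :] - self.means_[None, :, :]) ** 2).sum(axis=2)
        return self.classes_[d.argmin(axis=1)]

# Blind-test simulation on synthetic data the "participant" never saw:
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 3)), rng.normal(3, 1, (50, 3))])
y = np.array([0] * 50 + [1] * 50)
acc = (Model().fit(X, y).predict(X) == y).mean()
print(acc)
```

The point of the format is that everything between receiving `X, y` and emitting predictions must happen inside the submitted code, which is what makes fully automatic blind testing on fresh datasets possible.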
We are mid-way through the challenge. We will analyze the first results.
There are $30,000 in prizes donated by Microsoft. At the workshop, we will announce a new GPU track with three TitanX boards donated by NVIDIA for the three winners.
Team aaad_freiburg. First place AutoML1 phase, second place AutoML2 phase.
Automated Machine Learning: Successes & Challenges
Data scientists regularly need to select the best machine learning algorithm for the data at hand, tune its hyperparameters, and engineer & select features. While this process is tedious, error-prone, and often irreproducible when carried out manually, Automated Machine Learning (AutoML) aims to transform it into an engineering science with principled, effective, reproducible methods.
In this talk, we will discuss the current state-of-the-art in AutoML, recent successes of the field, and challenges we face in improving AutoML systems further. Finally, taking inspiration from the Configurable SAT Solver Competition (CSSC) we organized, we propose the design of a new type of AutoML competition that decouples the machine learning framework to be optimized from the methods to optimize it.
Team jrl44/backstreet.bayes. First place AutoML2 phase, second place AutoML1 phase.
Sensible allocation of computation for ensemble construction
In this talk I will describe the methods implemented in the first-place entry (auto-track) in the second round of the AutoML challenge. The methods are most succinctly described as an extension of freeze-thaw Bayesian optimization to ensemble construction. In particular, we assume that we have access to a finite number of iterative learning algorithms and some method of forming an ensemble prediction from several base predictions. We then consider the decision process of which iterative algorithms to run, and for how long, in order to maximise the performance of the ensembled predictions at the end of some time limit. Our approach to this Markov decision process is to model the reward function in a Bayesian fashion and then follow a mostly myopic strategy. I will also discuss some technical details such as memory management and asynchrony.
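The decision process can be illustrated with a toy sketch: several iterative learners share a compute budget, and at each step we greedily (myopically) run the learner whose predicted one-step gain is largest. The Bayesian reward model of the actual entry is replaced here by direct evaluation of synthetic learning curves, and the ensemble is a plain average of member scores; everything below is an illustration of the allocation idea, not the submitted code.

```python
import math

# Synthetic, saturating learning curves: score after t iterations.
curves = {
    "fast_saturating": lambda t: 0.80 * (1 - math.exp(-t / 2.0)),
    "slow_but_better": lambda t: 0.95 * (1 - math.exp(-t / 10.0)),
}

def allocate(budget):
    """Greedily spend `budget` iterations across the learners."""
    steps = {name: 0 for name in curves}
    for _ in range(budget):
        # Myopic choice: predicted gain from one more iteration of each.
        gain = {n: f(steps[n] + 1) - f(steps[n]) for n, f in curves.items()}
        best = max(gain, key=gain.get)
        steps[best] += 1
    # Stand-in "ensemble" score: average of the members' final scores.
    ens = sum(f(steps[n]) for n, f in curves.items()) / len(curves)
    return steps, ens

steps, ens = allocate(20)
print(steps, round(ens, 3))
```

The greedy policy naturally shifts compute from the fast-saturating learner to the slower but ultimately better one as the first curve flattens, which is the behaviour a freeze-thaw-style reward model is meant to predict rather than observe.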
Team ideal.intel.analytics. First place Final0 phase, second place Final1 phase.
Scalable ensemble learning with stochastic feature boosting
Large-scale industrial applications frequently deal with massive datasets of different natures, requiring flexible and robust learning engines that can be used with minimal supervision and tuning by the wider engineering community while providing state-of-the-art results. In this talk we present a framework that has demonstrated top results at all stages of the AutoML challenge so far. Our approach combines simultaneous tree and feature boosting, and is capable of handling noisy datasets that are very large in both dimensions and in which potentially only a small fraction of the features are relevant.
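The general idea of combining tree boosting with stochastic feature selection can be sketched as follows: each boosting round fits a depth-1 regression stump on a random subset of the features, so irrelevant columns are cheap to ignore. This is a textbook illustration of the combination, not the team's framework, and all parameter choices below are arbitrary.

```python
import numpy as np

def fit_boost(X, y, rounds=50, lr=0.3, frac=0.5, rng=None):
    """Boost depth-1 stumps, each fit on a random feature subset."""
    rng = rng if rng is not None else np.random.default_rng(0)
    n, d = X.shape
    F = np.zeros(n)  # current additive prediction
    stumps = []
    for _ in range(rounds):
        r = y - F  # residuals (squared-loss negative gradient)
        feats = rng.choice(d, max(1, int(frac * d)), replace=False)
        best = None
        for j in feats:
            t = np.median(X[:, j])  # median split threshold
            left = X[:, j] <= t
            lv = r[left].mean() if left.any() else 0.0
            rv = r[~left].mean() if (~left).any() else 0.0
            err = float(((r - np.where(left, lv, rv)) ** 2).sum())
            if best is None or err < best[0]:
                best = (err, j, t, lv, rv)
        _, j, t, lv, rv = best
        F = F + lr * np.where(X[:, j] <= t, lv, rv)
        stumps.append((j, t, lv, rv))
    return stumps

def predict(stumps, X, lr=0.3):
    F = np.zeros(len(X))
    for j, t, lv, rv in stumps:
        F += lr * np.where(X[:, j] <= t, lv, rv)
    return F

# Only feature 0 is informative; the other 9 columns are pure noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = (X[:, 0] > 0).astype(float)
model = fit_boost(X, y, rng=rng)
acc = ((predict(model, X) > 0.5) == (y > 0.5)).mean()
print(round(acc, 2))
```

Because each round only inspects a random fraction of the columns, the per-round cost scales with the subset size rather than the full dimensionality, which is what makes this family of methods attractive when only a small fraction of features carry signal.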
Afternoon invited speaker
Lessons Learned from the PASCAL VOC Challenges, and Improving the Data Analytics Process
The PASCAL Visual Object Classes (VOC) challenges ran from 2005 to 2012. I will discuss aspects of the challenges and lessons learned, including the augmentation of the dataset each year, approaches to counteract the reduction in the diversity of methods submitted over time, and assessing the statistical significance of performance differences. One interesting aspect of the VOC challenges was that the process of feature extraction from images was not defined by the organizers, allowing participants the freedom to use different representations of the raw data. In general, the task of going from raw data to actionable knowledge involves many stages and processes, with feedback cycles. It is widely acknowledged that often 80% of the effort in a data mining project is devoted to getting to know the data (exploration) and preparing it for analysis. Various academic disciplines study parts of the process, but there is a great need to look at the process as a whole. In the second part of the talk I will discuss the outcomes of a recent Alan Turing Institute scoping workshop on this topic, held in Edinburgh (UK) in November 2015.
Henry Z. Lo and Joseph Paul Cohen
Academic Torrents: Scalable Data Distribution
As competitions get more popular, transferring ever-larger data sets becomes infeasible and costly. For example, downloading the 157.3 GB 2012 ImageNet data set incurs about $4.33 in bandwidth costs per download, and downloading the full ImageNet data set takes 33 days. ImageNet has since become popular beyond the competition, and many papers and models now revolve around this data set. To share such an important resource with the machine learning community, the maintainers of ImageNet must shoulder a large bandwidth burden. Academic Torrents reduces this burden for disseminating competition data and also increases download speeds for end users. By augmenting an existing HTTP server with a peer-to-peer swarm, requests are re-routed to fetch data from other downloaders. Whereas existing systems slow down with more users, the benefits of Academic Torrents grow with them, with noticeable effects even when only one other person is downloading.
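A quick back-of-the-envelope calculation shows what the quoted figures imply for a single HTTP origin. The $4.33 and 157.3 GB numbers come from the abstract; the per-GB rate and the multi-download total are derived here, not official cloud prices.

```python
size_gb = 157.3           # ImageNet 2012 data set size (from the abstract)
cost_per_download = 4.33  # quoted bandwidth cost per download, in USD

# Implied egress price per gigabyte.
rate = cost_per_download / size_gb
print(f"implied egress price: ${rate:.4f}/GB")

# Serving N downloads from a single HTTP origin scales linearly in cost;
# this is the burden a peer-to-peer swarm offloads onto the downloaders.
downloads = 1000
print(f"cost of {downloads} downloads: ${cost_per_download * downloads:,.2f}")
```

The linear term is exactly what a swarm attacks: once other downloaders serve pieces to each other, the origin's traffic, and hence its cost, no longer grows in proportion to the number of users.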
CiML 2015