ICLR 2016
Basic Information
When
May 2 - 4, 2016
Where
Caribe Hilton, San Juan, Puerto Rico
We recommend landing at the Luis Munoz Marin International Airport. The best way to get from this airport to the hotel is by taxi, which is about a 15-minute drive and costs around $20.
Important:
Local transmission of the Zika virus has been reported in Puerto
Rico. While in most cases the symptoms of Zika are mild, women who
are pregnant and women and men who may conceive a child in the near
future have more reason to be concerned about the virus.
The US Centers for Disease Control has reliable and current
information on the Zika virus here:
http://www.cdc.gov/zika/
We recommend that anyone with concerns about Zika virus review this
information.
Call for Papers (Main Track)
For instructions on the submission process, go here.
Call for Papers (Workshop Track)
For instructions on the submission process, go here.
Registration and Hotel Reservations
To register and make hotel reservations, go here.
On the day of the meeting, come pick up your badge at the Grand Salon Los Rosales, which is right next to the hotel (ask the staff of the hotel for more directions).
Conference Wireless Access
network: hmeeting
password: iclr16
Video recordings of talks
Talks are now available on videolectures.net:
http://videolectures.net/iclr2016_san_juan/
Discussion, Forum, Pictures on the ICLR Facebook Page
https://www.facebook.com/iclr.cc
Feedback Poll
We've created a poll to gather feedback and suggestions on ICLR:
https://www.facebook.com/events/1737067246550684/permalink/1737070989883643/
Please participate by upvoting the suggestions you like or adding your own suggestions.
Committee
Senior Program Chair
Hugo Larochelle, Twitter and Université de Sherbrooke
Program Chairs
Samy Bengio, Google
Brian Kingsbury, IBM Watson Group
General Chairs
Yoshua Bengio, Université de Montréal
Yann LeCun, New York University and Facebook
Area Chairs
Ryan Adams, Twitter and Harvard
Antoine Bordes, Facebook
KyungHyun Cho, New York University
Adam Coates, Baidu
Aaron Courville, Université de Montréal
Trevor Darrell, University of California, Berkeley
Ian Goodfellow, Google
Roger Grosse, University of Toronto
Nicolas Le Roux, Criteo
Honglak Lee, University of Michigan
Julien Mairal, INRIA
Chris Manning, Stanford University
Roland Memisevic, Université de Montréal
Joelle Pineau, McGill University
John Platt, Google
Marc'Aurelio Ranzato, Facebook
Tara Sainath, Google
Ruslan Salakhutdinov, University of Toronto
Raquel Urtasun, University of Toronto
Contact
iclr2016.programchairs@gmail.com
Sponsors
We are currently taking sponsorship applications for ICLR 2016. Companies interested in sponsoring should contact us at iclr2016.programchairs@gmail.com.
Platinum
Gold
Silver
Bronze
Conference Schedule
Date | Start | End | Event | Details
May 2 | 7:30 | 8:50 | breakfast | San Cristobal Ballroom [Sponsored by Baidu Research]
May 2 | 8:50 | 12:30 | Oral Session - Los Rosales Grand Salon A&B |
May 2 | 8:50 | 9:00 | opening | Opening remarks
May 2 | 9:00 | 9:40 | keynote | Sergey Levine (University of Washington): Deep Robotic Learning
May 2 | 9:40 | 10:00 | oral | Neural Programmer-Interpreters, by Scott Reed, Nando de Freitas (Best Paper Award Recipient)
May 2 | 10:00 | 10:20 | oral | Regularizing RNNs by Stabilizing Activations, by David Krueger, Roland Memisevic
May 2 | 10:20 | 10:50 | coffee break |
May 2 | 10:50 | 11:30 | keynote | Chris Dyer (CMU): Should Model Architecture Reflect Linguistic Structure?
May 2 | 11:30 | 11:50 | oral | BlackOut: Speeding up Recurrent Neural Network Language Models With Very Large Vocabularies, by Shihao Ji, Swaminathan Vishwanathan, Nadathur Satish, Michael Anderson, Pradeep Dubey
May 2 | 11:50 | 12:10 | oral | The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations, by Felix Hill, Antoine Bordes, Sumit Chopra, Jason Weston
May 2 | 12:10 | 12:30 | oral | Towards Universal Paraphrastic Sentence Embeddings, by John Wieting, Mohit Bansal, Kevin Gimpel, Karen Livescu
May 2 | 12:30 | 14:00 | lunch | On your own
May 2 | 14:00 | 17:00 | posters | Workshop Track Posters (May 2nd); Los Rosales Grand Salon C,D,E and Garita & Cariba Salons [Sponsored by Facebook]
May 2 | 17:30 | 19:00 | dinner | San Cristobal Ballroom
May 3 | 7:30 | 9:00 | breakfast | Las Olas Terrace [Sponsored by NVIDIA]
May 3 | 9:00 | 12:30 | Oral Session - Los Rosales Grand Salon A&B |
May 3 | 9:00 | 9:40 | keynote | Anima Anandkumar (UC Irvine): Guaranteed Non-convex Learning Algorithms through Tensor Factorization
May 3 | 9:40 | 10:00 | oral | Convergent Learning: Do different neural networks learn the same representations?, by Yixuan Li, Jason Yosinski, Jeff Clune, Hod Lipson, John Hopcroft
May 3 | 10:00 | 10:20 | oral | Net2Net: Accelerating Learning via Knowledge Transfer, by Tianqi Chen, Ian Goodfellow, Jon Shlens
May 3 | 10:20 | 10:50 | coffee break |
May 3 | 10:50 | 11:30 | keynote | Neil Lawrence (University of Sheffield): Beyond Backpropagation: Uncertainty Propagation
May 3 | 11:30 | 11:50 | oral | Variational Gaussian Process, by Dustin Tran, Rajesh Ranganath, David Blei
May 3 | 11:50 | 12:10 | oral | The Variational Fair Autoencoder, by Christos Louizos, Kevin Swersky, Yujia Li, Max Welling, Richard Zemel
May 3 | 12:10 | 12:30 | oral | A note on the evaluation of generative models, by Lucas Theis, Aäron van den Oord, Matthias Bethge
May 3 | 12:30 | 14:00 | lunch | On your own
May 3 | 14:00 | 17:00 | posters | Workshop Track Posters (May 3rd); Los Rosales Grand Salon C,D,E and Garita & Cariba Salons [Sponsored by Google]
May 3 | 17:30 | 19:00 | dinner | San Cristobal Ballroom [Sponsored by Intel]
May 4 | 7:30 | 9:00 | breakfast | Las Olas Terrace
May 4 | 9:00 | 12:30 | Oral Session - Los Rosales Grand Salon A&B |
May 4 | 9:00 | 9:30 | town hall | ICLR town hall meeting (open discussion)
May 4 | 9:30 | 9:40 | break |
May 4 | 9:40 | 10:00 | oral | Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding, by Song Han, Huizi Mao, Bill Dally (Best Paper Award Recipient)
May 4 | 10:00 | 10:20 | oral | Neural Networks with Few Multiplications, by Zhouhan Lin, Matthieu Courbariaux, Roland Memisevic, Yoshua Bengio
May 4 | 10:20 | 10:50 | coffee break |
May 4 | 10:50 | 11:30 | keynote | Raquel Urtasun (University of Toronto): Incorporating Structure in Deep Learning
May 4 | 11:30 | 11:50 | oral | Order-Embeddings of Images and Language, by Ivan Vendrov, Ryan Kiros, Sanja Fidler, Raquel Urtasun
May 4 | 11:50 | 12:10 | oral | Generating Images from Captions with Attention, by Elman Mansimov, Emilio Parisotto, Jimmy Ba, Ruslan Salakhutdinov
May 4 | 12:10 | 12:30 | oral | Density Modeling of Images using a Generalized Normalization Transformation, by Johannes Ballé, Valero Laparra, Eero Simoncelli
May 4 | 12:30 | 14:00 | lunch | On your own
May 4 | 14:00 | 17:00 | posters | Conference Track Posters; Los Rosales Grand Salon C,D,E and Garita & Cariba Salons [Sponsored by Twitter]
May 4 | 17:30 | 19:00 | dinner | San Cristobal Ballroom
Keynote Talks
Sergey Levine
Deep Robotic Learning
The problem of building an autonomous robot has traditionally been viewed as one of integration: connecting together modular components, each one designed to handle some portion of the perception and decision making process. For example, a vision system might be connected to a planner that might in turn provide commands to a low-level controller that drives the robot's motors. In this talk, I will discuss how ideas from deep learning can allow us to build robotic control mechanisms that combine both perception and control into a single system. This system can then be trained end-to-end on the task at hand. I will show how this end-to-end approach actually simplifies the perception and control problems, by allowing the perception and control mechanisms to adapt to one another and to the task. I will also present some recent work on scaling up deep robotic learning on a cluster consisting of multiple robotic arms, and demonstrate results for learning grasping strategies that involve continuous feedback and hand-eye coordination using deep convolutional neural networks.
BIO:
Sergey Levine is an assistant professor at the University of Washington. His research focuses on robotics and machine learning. In his PhD thesis, he developed a novel guided policy search algorithm for learning complex neural network control policies, which was later applied to enable a range of robotic tasks, including end-to-end training of policies for perception and control. He has also developed algorithms for learning from demonstration, inverse reinforcement learning, efficient training of stochastic neural networks, computer vision, and data-driven character animation.
Chris Dyer
Should Model Architecture Reflect Linguistic Structure?
Sequential recurrent neural networks (RNNs) over finite alphabets are remarkably effective models of natural language. RNNs now obtain language modeling results that substantially improve over long-standing state-of-the-art baselines, as well as in various conditional language modeling tasks such as machine translation, image caption generation, and dialogue generation. Despite these impressive results, such models are a priori inappropriate models of language. One point of criticism is that language users create and understand new words all the time, challenging the finite vocabulary assumption. A second is that relationships among words are computed in terms of latent nested structures rather than sequential surface order (Chomsky, 1957; Everaert, Huybregts, Chomsky, Berwick, and Bolhuis, 2015).
In this talk I discuss two models that explore the hypothesis that more (a priori) appropriate models of language will lead to better performance on real-world language processing tasks. The first composes subword units (bytes, characters, or morphemes) into lexical representations, enabling more naturalistic interpretation and generation of novel word forms. The second, which we call recurrent neural network grammars (RNNGs), is a new generative model of sentences that explicitly models nested, hierarchical relationships among words and phrases. RNNGs operate via a recursive syntactic process reminiscent of probabilistic context-free grammar generation, but decisions are parameterized using RNNs that condition on the entire (top-down, left-to-right) syntactic derivation history, greatly relaxing context-free independence assumptions. Experimental results show that RNNGs obtain better results in generating language than models that don't exploit linguistic structures.
BIO:
Chris Dyer is an assistant professor in the Language Technologies Institute and Machine Learning Department at Carnegie Mellon University. He obtained his PhD in Linguistics at the University of Maryland under Philip Resnik in 2010. His work has been nominated for, and occasionally received, best paper awards at EMNLP, NAACL, and ACL.
Anima Anandkumar
Guaranteed Non-convex Learning Algorithms through Tensor Factorization
Modern machine learning involves massive datasets of text, images,
videos, biological data, and so on. Most learning tasks can be framed
as optimization problems which turn out to be non-convex and NP-hard
to solve. This hardness barrier can be overcome by: (i) focusing on
conditions which make learning tractable, (ii) replacing the given
optimization objective with better behaved ones, and (iii) exploiting
non-obvious connections that abound in learning problems.
I will discuss the above in the context of: (i) unsupervised learning
of latent variable models and (ii) training multi-layer neural
networks, through a novel framework involving spectral decomposition
of moment matrices and tensors. Tensors are rich structures that can
encode higher order relationships in data. Despite being non-convex,
tensor decomposition can be solved optimally using simple iterative
algorithms under mild conditions. In practice, tensor methods yield
enormous gains both in running times and learning accuracy over
traditional methods for training probabilistic models such as
variational inference. These positive results demonstrate that many
challenging learning tasks can be solved efficiently, both in theory
and in practice.
BIO:
Anima Anandkumar has been a faculty member in the EECS Dept. at U.C. Irvine since
August 2010. Her research interests are in the areas of large-scale
machine learning, non-convex optimization and high-dimensional
statistics. In particular, she has been spearheading the development
and analysis of tensor algorithms for a variety of learning problems.
She is the recipient of the Alfred P. Sloan Fellowship, Microsoft
Faculty Fellowship, Google research award, ARO and AFOSR Young
Investigator Awards, NSF CAREER Award, Early Career Excellence in
Research Award at UCI, Best Thesis Award from the ACM SIGMETRICS
society, IBM Fran Allen PhD fellowship, and best paper awards from the
ACM SIGMETRICS and IEEE Signal Processing societies. She received her
B.Tech in Electrical Engineering from IIT Madras in 2004 and her PhD
from Cornell University in 2009. She was a postdoctoral researcher at
MIT from 2009 to 2010, and a visiting faculty member at Microsoft Research
New England in 2012 and 2014.
Neil Lawrence
Beyond Backpropagation: Uncertainty Propagation
Deep learning is founded on composable functions that are structured to capture regularities in data and can have their parameters optimized by backpropagation (differentiation via the chain rule). Their recent success is founded on the increased availability of data and computational power. However, they are not very data efficient. In low data regimes parameters are not well determined and severe overfitting can occur. The solution is to explicitly handle the indeterminacy by converting it to parameter uncertainty and propagating it through the model. Uncertainty propagation is more involved than backpropagation because it involves convolving the composite functions with probability distributions and integration is more challenging than differentiation.
We will present one approach to fitting such models using Gaussian processes. The resulting models perform very well in both supervised and unsupervised learning on small data sets. The remaining challenge is to scale the algorithms to much larger data.
BIO:
Neil Lawrence is Professor of Machine Learning at the University of Sheffield. His expertise is in probabilistic modelling with a particular focus on Gaussian processes and a strong interest in bridging the worlds of mechanistic and empirical models.
Raquel Urtasun
Incorporating Structure in Deep Learning
Deep learning algorithms attempt to model high-level abstractions of the data using architectures composed of multiple non-linear transformations. A multiplicity of variants have been proposed and shown to be extremely successful in a wide variety of applications including computer vision, speech recognition as well as natural language processing. In this talk I’ll show how to make these representations more powerful by exploiting structure in the outputs, the loss function as well as in the learned embeddings.
Many problems in real-world applications involve predicting several random variables that are statistically related. Graphical models have been typically employed to represent and exploit the output dependencies. However, most current learning algorithm