ICLR 2016

March 7, 2026

Basic Information

When: May 2 - 4, 2016

Where: Caribe Hilton, San Juan, Puerto Rico

We recommend landing at the Luis Munoz Marin International Airport. The best way to get from this airport to the hotel is by taxi, which is about a 15 minute drive and costs around $20.

Important: Local transmission of the Zika virus has been reported in Puerto Rico. While in most cases the symptoms of Zika are mild, women who are pregnant and women and men who may conceive a child in the near future have more reason to be concerned about the virus. The US Centers for Disease Control have reliable and current information on the Zika virus here: http://www.cdc.gov/zika/ We recommend that anyone with concerns about Zika virus review this information.

Call for Papers (Main Track)

For instructions on the submission process, go here.

Call for Papers (Workshop Track)

For instructions on the submission process, go here.

Registration and Hotel Reservations

To register and make hotel reservations, go here. On the day of the meeting, come pick up your badge at the Grand Salon Los Rosales, which is right next to the hotel (ask the staff of the hotel for more directions).

Conference Wireless Access

network: hmeeting
password: iclr16

Video recordings of talks

Talks are now available on videolectures.net: http://videolectures.net/iclr2016_san_juan/

Discussion, Forum, Pictures on the ICLR Facebook Page

https://www.facebook.com/iclr.cc

Feedback Poll

We've created a poll to gather feedback and suggestions on ICLR: https://www.facebook.com/events/1737067246550684/permalink/1737070989883643/ Please participate by upvoting the suggestions you like or adding your own suggestions.
Committee

Senior Program Chair
Hugo Larochelle, Twitter and Université de Sherbrooke

Program Chairs
Samy Bengio, Google
Brian Kingsbury, IBM Watson Group

General Chairs
Yoshua Bengio, Université de Montréal
Yann LeCun, New York University and Facebook

Area Chairs
Ryan Adams, Twitter and Harvard
Antoine Bordes, Facebook
KyungHyun Cho, New York University
Adam Coates, Baidu
Aaron Courville, Université de Montréal
Trevor Darrell, University of California, Berkeley
Ian Goodfellow, Google
Roger Grosse, University of Toronto
Nicolas Le Roux, Criteo
Honglak Lee, University of Michigan
Julien Mairal, INRIA
Chris Manning, Stanford University
Roland Memisevic, Université de Montréal
Joelle Pineau, McGill University
John Platt, Google
Marc'Aurelio Ranzato, Facebook
Tara Sainath, Google
Ruslan Salakhutdinov, University of Toronto
Raquel Urtasun, University of Toronto

Contact

iclr2016.programchairs@gmail.com

Sponsors

We are currently taking sponsorship applications for ICLR 2016. Companies interested in sponsoring should contact us at iclr2016.programchairs@gmail.com.

Platinum, Gold, Silver, Bronze

Conference Schedule

May 2

| Start | End | Event | Details |
|---|---|---|---|
| 7:30 | 8:50 | breakfast | San Cristobal Ballroom [Sponsored by Baidu Research] |
| 8:50 | 12:30 | Oral Session | Los Rosales Grand Salon A&B |
| 8:50 | 9:00 | opening | Opening remarks |
| 9:00 | 9:40 | keynote | Sergey Levine (University of Washington): Deep Robotic Learning |
| 9:40 | 10:00 | oral | Neural Programmer-Interpreters by Scott Reed, Nando de Freitas (Best Paper Award Recipient) |
| 10:00 | 10:20 | oral | Regularizing RNNs by Stabilizing Activations by David Krueger, Roland Memisevic |
| 10:20 | 10:50 | coffee break | |
| 10:50 | 11:30 | keynote | Chris Dyer (CMU): Should Model Architecture Reflect Linguistic Structure? |
| 11:30 | 11:50 | oral | BlackOut: Speeding up Recurrent Neural Network Language Models With Very Large Vocabularies by Shihao Ji, Swaminathan Vishwanathan, Nadathur Satish, Michael Anderson, Pradeep Dubey |
| 11:50 | 12:10 | oral | The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations by Felix Hill, Antoine Bordes, Sumit Chopra, Jason Weston |
| 12:10 | 12:30 | oral | Towards Universal Paraphrastic Sentence Embeddings by John Wieting, Mohit Bansal, Kevin Gimpel, Karen Livescu |
| 12:30 | 14:00 | lunch | On your own |
| 14:00 | 17:00 | posters | Workshop Track Posters (May 2nd), Los Rosales Grand Salon C,D,E and Garita & Cariba Salons [Sponsored by Facebook] |
| 17:30 | 19:00 | dinner | San Cristobal Ballroom |

May 3

| Start | End | Event | Details |
|---|---|---|---|
| 7:30 | 9:00 | breakfast | Las Olas Terrace [Sponsored by NVIDIA] |
| 9:00 | 12:30 | Oral Session | Los Rosales Grand Salon A&B |
| 9:00 | 9:40 | keynote | Anima Anandkumar (UC Irvine): Guaranteed Non-convex Learning Algorithms through Tensor Factorization |
| 9:40 | 10:00 | oral | Convergent Learning: Do different neural networks learn the same representations? by Yixuan Li, Jason Yosinski, Jeff Clune, Hod Lipson, John Hopcroft |
| 10:00 | 10:20 | oral | Net2Net: Accelerating Learning via Knowledge Transfer by Tianqi Chen, Ian Goodfellow, Jon Shlens |
| 10:20 | 10:50 | coffee break | |
| 10:50 | 11:30 | keynote | Neil Lawrence (University of Sheffield): Beyond Backpropagation: Uncertainty Propagation |
| 11:30 | 11:50 | oral | Variational Gaussian Process by Dustin Tran, Rajesh Ranganath, David Blei |
| 11:50 | 12:10 | oral | The Variational Fair Autoencoder by Christos Louizos, Kevin Swersky, Yujia Li, Max Welling, Richard Zemel |
| 12:10 | 12:30 | oral | A note on the evaluation of generative models by Lucas Theis, Aäron van den Oord, Matthias Bethge |
| 12:30 | 14:00 | lunch | On your own |
| 14:00 | 17:00 | posters | Workshop Track Posters (May 3rd), Los Rosales Grand Salon C,D,E and Garita & Cariba Salons [Sponsored by Google] |
| 17:30 | 19:00 | dinner | San Cristobal Ballroom [Sponsored by Intel] |

May 4

| Start | End | Event | Details |
|---|---|---|---|
| 7:30 | 9:00 | breakfast | Las Olas Terrace |
| 9:00 | 12:30 | Oral Session | Los Rosales Grand Salon A&B |
| 9:00 | 9:30 | town hall | ICLR town hall meeting (open discussion) |
| 9:30 | 9:40 | break | |
| 9:40 | 10:00 | oral | Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding by Song Han, Huizi Mao, Bill Dally (Best Paper Award Recipient) |
| 10:00 | 10:20 | oral | Neural Networks with Few Multiplications by Zhouhan Lin, Matthieu Courbariaux, Roland Memisevic, Yoshua Bengio |
| 10:20 | 10:50 | coffee break | |
| 10:50 | 11:30 | keynote | Raquel Urtasun (University of Toronto): Incorporating Structure in Deep Learning |
| 11:30 | 11:50 | oral | Order-Embeddings of Images and Language by Ivan Vendrov, Ryan Kiros, Sanja Fidler, Raquel Urtasun |
| 11:50 | 12:10 | oral | Generating Images from Captions with Attention by Elman Mansimov, Emilio Parisotto, Jimmy Ba, Ruslan Salakhutdinov |
| 12:10 | 12:30 | oral | Density Modeling of Images using a Generalized Normalization Transformation by Johannes Ballé, Valero Laparra, Eero Simoncelli |
| 12:30 | 14:00 | lunch | On your own |
| 14:00 | 17:00 | posters | Conference Track Posters, Los Rosales Grand Salon C,D,E and Garita & Cariba Salons [Sponsored by Twitter] |
| 17:30 | 19:00 | dinner | San Cristobal Ballroom |

Keynote Talks

Sergey Levine: Deep Robotic Learning

The problem of building an autonomous robot has traditionally been viewed as one of integration: connecting together modular components, each one designed to handle some portion of the perception and decision making process. For example, a vision system might be connected to a planner that might in turn provide commands to a low-level controller that drives the robot's motors. In this talk, I will discuss how ideas from deep learning can allow us to build robotic control mechanisms that combine both perception and control into a single system. This system can then be trained end-to-end on the task at hand. I will show how this end-to-end approach actually simplifies the perception and control problems, by allowing the perception and control mechanisms to adapt to one another and to the task. I will also present some recent work on scaling up deep robotic learning on a cluster consisting of multiple robotic arms, and demonstrate results for learning grasping strategies that involve continuous feedback and hand-eye coordination using deep convolutional neural networks.

BIO: Sergey Levine is an assistant professor at the University of Washington. His research focuses on robotics and machine learning. In his PhD thesis, he developed a novel guided policy search algorithm for learning complex neural network control policies, which was later applied to enable a range of robotic tasks, including end-to-end training of policies for perception and control. He has also developed algorithms for learning from demonstration, inverse reinforcement learning, efficient training of stochastic neural networks, computer vision, and data-driven character animation.

Chris Dyer: Should Model Architecture Reflect Linguistic Structure?

Sequential recurrent neural networks (RNNs) over finite alphabets are remarkably effective models of natural language.
RNNs now obtain language modeling results that substantially improve over long-standing state-of-the-art baselines, as well as in various conditional language modeling tasks such as machine translation, image caption generation, and dialogue generation. Despite these impressive results, such models are a priori inappropriate models of language. One point of criticism is that language users create and understand new words all the time, challenging the finite vocabulary assumption. A second is that relationships among words are computed in terms of latent nested structures rather than sequential surface order (Chomsky, 1957; Everaert, Huybregts, Chomsky, Berwick, and Bolhuis, 2015). In this talk I discuss two models that explore the hypothesis that more (a priori) appropriate models of language will lead to better performance on real-world language processing tasks. The first composes subword units (bytes, characters, or morphemes) into lexical representations, enabling more naturalistic interpretation and generation of novel word forms. The second, which we call recurrent neural network grammars (RNNGs), is a new generative model of sentences that explicitly models nested, hierarchical relationships among words and phrases. RNNGs operate via a recursive syntactic process reminiscent of probabilistic context-free grammar generation, but decisions are parameterized using RNNs that condition on the entire (top-down, left-to-right) syntactic derivation history, greatly relaxing context-free independence assumptions. Experimental results show that RNNGs obtain better results in generating language than models that don’t exploit linguistic structures.

BIO: Chris Dyer is an assistant professor in the Language Technologies Institute and Machine Learning Department at Carnegie Mellon University. He obtained his PhD in Linguistics at the University of Maryland under Philip Resnik in 2010. His work has been nominated for, and has occasionally received, best paper awards at EMNLP, NAACL, and ACL.

Anima Anandkumar: Guaranteed Non-convex Learning Algorithms through Tensor Factorization

Modern machine learning involves massive datasets of text, images, videos, biological data, and so on. Most learning tasks can be framed as optimization problems which turn out to be non-convex and NP-hard to solve. This hardness barrier can be overcome by: (i) focusing on conditions which make learning tractable, (ii) replacing the given optimization objective with better behaved ones, and (iii) exploiting non-obvious connections that abound in learning problems. I will discuss the above in the context of: (i) unsupervised learning of latent variable models and (ii) training multi-layer neural networks, through a novel framework involving spectral decomposition of moment matrices and tensors. Tensors are rich structures that can encode higher order relationships in data. Despite being non-convex, tensor decomposition can be solved optimally using simple iterative algorithms under mild conditions. In practice, tensor methods yield enormous gains both in running times and learning accuracy over traditional methods for training probabilistic models such as variational inference. These positive results demonstrate that many challenging learning tasks can be solved efficiently, both in theory and in practice.

BIO: Anima Anandkumar has been a faculty member in the EECS Department at UC Irvine since August 2010. Her research interests are in the areas of large-scale machine learning, non-convex optimization and high-dimensional statistics. In particular, she has been spearheading the development and analysis of tensor algorithms for a variety of learning problems. She is the recipient of the Alfred P. Sloan Fellowship, Microsoft Faculty Fellowship, Google research award, ARO and AFOSR Young Investigator Awards, NSF CAREER Award, Early Career Excellence in Research Award at UCI, Best Thesis Award from the ACM SIGMETRICS society, IBM Fran Allen PhD fellowship, and best paper awards from the ACM SIGMETRICS and IEEE Signal Processing societies. She received her B.Tech in Electrical Engineering from IIT Madras in 2004 and her PhD from Cornell University in 2009. She was a postdoctoral researcher at MIT from 2009 to 2010, and a visiting faculty member at Microsoft Research New England in 2012 and 2014.

Neil Lawrence: Beyond Backpropagation: Uncertainty Propagation

Deep learning is founded on composable functions that are structured to capture regularities in data and can have their parameters optimized by backpropagation (differentiation via the chain rule). Their recent success is founded on the increased availability of data and computational power. However, they are not very data efficient. In low data regimes parameters are not well determined and severe overfitting can occur. The solution is to explicitly handle the indeterminacy by converting it to parameter uncertainty and propagating it through the model. Uncertainty propagation is more involved than backpropagation because it involves convolving the composite functions with probability distributions and integration is more challenging than differentiation. We will present one approach to fitting such models using Gaussian processes. The resulting models perform very well in both supervised and unsupervised learning on small data sets. The remaining challenge is to scale the algorithms to much larger data.

BIO: Neil Lawrence is Professor of Machine Learning at the University of Sheffield. His expertise is in probabilistic modelling with a particular focus on Gaussian processes and a strong interest in bridging the worlds of mechanistic and empirical models.

Raquel Urtasun: Incorporating Structure in Deep Learning

Deep learning algorithms attempt to model high-level abstractions of the data using architectures composed of multiple non-linear transformations. A multiplicity of variants have been proposed and shown to be extremely successful in a wide variety of applications including computer vision, speech recognition, and natural language processing. In this talk I’ll show how to make these representations more powerful by exploiting structure in the outputs, the loss function as well as in the learned embeddings. Many problems in real-world applications involve predicting several random variables that are statistically related. Graphical models have been typically employed to represent and exploit the output dependencies. However, most current learning algorithm

Sources

ICLR

ICLR 2016
