## Basic Information

## When
May 2 - 4, 2016

## Where
Caribe Hilton, San Juan, Puerto Rico
We recommend landing at the
The US Centers for Disease Control have reliable and current information on the Zika virus here. We recommend that anyone with concerns about Zika virus review this information.

## Registration and Hotel Reservations
To register and make hotel reservations, go here. On the day of the meeting, pick up your badge at the Grand Salon Los Rosales, right next to the hotel (ask the hotel staff for directions).

## Conference Wireless Access
network:

## Video recordings of talks
Talks are now available on videolectures.net.

## Discussion, Forum, Pictures on the ICLR Facebook Page

## Feedback Poll
We've created a poll to gather feedback and suggestions on ICLR: https://www.facebook.com/events/1737067246550684/permalink/1737070989883643/ Please participate by upvoting the suggestions you like or adding your own suggestions.

## Committee

## Senior Program Chair
Hugo Larochelle, Twitter and Université de Sherbrooke

## Program Chairs
Samy Bengio, Google

## General Chairs
Yoshua Bengio, Université de Montréal

## Area Chairs
Ryan Adams, Twitter and Harvard

## Contact

## Sponsors
We are currently taking sponsorship applications for ICLR 2016. Companies interested in sponsoring should contact us at iclr2016.programchairs@gmail.com.

## Platinum

## Gold

## Silver

## Bronze

## Conference Schedule
## Keynote Talks

## Sergey Levine
The problem of building an autonomous robot has traditionally been viewed as one of integration: connecting together modular components, each one designed to handle some portion of the perception and decision making process. For example, a vision system might be connected to a planner that might in turn provide commands to a low-level controller that drives the robot's motors. In this talk, I will discuss how ideas from deep learning can allow us to build robotic control mechanisms that combine both perception and control into a single system. This system can then be trained end-to-end on the task at hand. I will show how this end-to-end approach actually simplifies the perception and control problems, by allowing the perception and control mechanisms to adapt to one another and to the task. I will also present some recent work on scaling up deep robotic learning on a cluster consisting of multiple robotic arms, and demonstrate results for learning grasping strategies that involve continuous feedback and hand-eye coordination using deep convolutional neural networks.
## Chris Dyer
Sequential recurrent neural networks (RNNs) over finite alphabets are remarkably effective models of natural language. RNNs now obtain language modeling results that substantially improve over long-standing state-of-the-art baselines, and they power various conditional language modeling tasks such as machine translation, image caption generation, and dialogue generation. Despite these impressive results, such models are a priori inappropriate models of language. One point of criticism is that language users create and understand new words all the time, challenging the finite vocabulary assumption. A second is that relationships among words are computed in terms of latent nested structures rather than sequential surface order (Chomsky, 1957; Everaert, Huybregts, Chomsky, Berwick, and Bolhuis, 2015). In this talk I discuss two models that explore the hypothesis that more (a priori) appropriate models of language will lead to better performance on real-world language processing tasks. The first composes subword units (bytes, characters, or morphemes) into lexical representations, enabling more naturalistic interpretation and generation of novel word forms. The second, which we call recurrent neural network grammars (RNNGs), is a new generative model of sentences that explicitly models nested, hierarchical relationships among words and phrases. RNNGs operate via a recursive syntactic process reminiscent of probabilistic context-free grammar generation, but decisions are parameterized using RNNs that condition on the entire (top-down, left-to-right) syntactic derivation history, greatly relaxing context-free independence assumptions. Experimental results show that RNNGs obtain better results in generating language than models that don’t exploit linguistic structures.
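The subword-composition idea in the first model can be sketched with a toy character-level Elman RNN. The dimensions, random weights, and the `word_vector` helper below are illustrative assumptions for exposition, not the architecture from the talk:

```python
import numpy as np

rng = np.random.default_rng(1)
CHARS = "abcdefghijklmnopqrstuvwxyz"
emb = {c: rng.normal(size=8) for c in CHARS}  # character embeddings
Wh = 0.1 * rng.normal(size=(8, 8))            # recurrent weights
Wx = 0.1 * rng.normal(size=(8, 8))            # input weights

def word_vector(word):
    """Compose a word representation from its characters with a simple
    Elman RNN, so novel word forms still receive a vector."""
    h = np.zeros(8)
    for c in word:
        h = np.tanh(Wh @ h + Wx @ emb[c])
    return h

# A word absent from any fixed vocabulary still gets a representation.
v = word_vector("unfriend")
print(v.shape)  # (8,)
```

Because the recurrence is order-sensitive, anagrams such as "act" and "cat" map to different vectors, unlike a plain bag-of-characters sum.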
## Anima Anandkumar
Modern machine learning involves massive datasets of text, images, videos, biological data, and so on. Most learning tasks can be framed as optimization problems which turn out to be non-convex and NP-hard to solve. This hardness barrier can be overcome by: (i) focusing on conditions which make learning tractable, (ii) replacing the given optimization objective with better behaved ones, and (iii) exploiting non-obvious connections that abound in learning problems. I will discuss the above in the context of: (i) unsupervised learning of latent variable models and (ii) training multi-layer neural networks, through a novel framework involving spectral decomposition of moment matrices and tensors. Tensors are rich structures that can encode higher order relationships in data. Despite being non-convex, tensor decomposition can be solved optimally using simple iterative algorithms under mild conditions. In practice, tensor methods yield enormous gains both in running times and learning accuracy over traditional methods for training probabilistic models such as variational inference. These positive results demonstrate that many challenging learning tasks can be solved efficiently, both in theory and in practice.
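As a toy illustration of the spectral approach, a minimal sketch under simplifying assumptions (a noiseless rank-1 symmetric tensor, not code from the talk): tensor power iteration recovers the tensor's component with a few normalized contractions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground truth: a symmetric rank-1 third-order tensor T = 2 * (v ⊗ v ⊗ v),
# the kind of moment tensor that spectral methods decompose.
v = rng.normal(size=5)
v /= np.linalg.norm(v)
T = 2.0 * np.einsum('i,j,k->ijk', v, v, v)

# Tensor power iteration: u <- T(I, u, u), renormalized each step.
u = rng.normal(size=5)
u /= np.linalg.norm(u)
for _ in range(50):
    u = np.einsum('ijk,j,k->i', T, u, u)
    u /= np.linalg.norm(u)

print(abs(np.dot(u, v)))  # essentially 1.0: the component is recovered
```

In the noiseless rank-1 case each update is a positive multiple of `v`, so the iterate snaps onto the component despite the non-convexity of the underlying objective.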
## Neil Lawrence
Deep learning is founded on composable functions that are structured to capture regularities in data and can have their parameters optimized by backpropagation (differentiation via the chain rule). Their recent success is founded on the increased availability of data and computational power. However, they are not very data efficient. In low data regimes parameters are not well determined and severe overfitting can occur. The solution is to explicitly handle the indeterminacy by converting it to parameter uncertainty and propagating it through the model. Uncertainty propagation is more involved than backpropagation because it involves convolving the composite functions with probability distributions and integration is more challenging than differentiation. We will present one approach to fitting such models using Gaussian processes. The resulting models perform very well in both supervised and unsupervised learning on small data sets. The remaining challenge is to scale the algorithms to much larger data.
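The building block of this approach, Gaussian process regression, can be sketched as follows; this is a self-contained illustration with an RBF kernel and made-up data, while the deep GP models in the talk compose such layers:

```python
import numpy as np

def rbf(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel k(a, b) = variance * exp(-(a-b)^2 / (2 l^2))."""
    d2 = (A[:, None] - B[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

# A tiny 1-D regression problem: five observations of sin(x).
X = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = np.sin(X)
noise = 1e-4

Xs = np.array([0.5])                       # test input
Kxx = rbf(X, X) + noise * np.eye(len(X))   # training covariance (jittered)
Ksx = rbf(Xs, X)                           # test/train cross-covariance

# Posterior mean: K*x (Kxx + noise I)^-1 y.
mean = Ksx @ np.linalg.solve(Kxx, y)

# Posterior variance: K** - K*x (Kxx + noise I)^-1 Kx*.
var = rbf(Xs, Xs) - Ksx @ np.linalg.solve(Kxx, Ksx.T)
print(mean, var)
```

The posterior variance grows away from the data, which is precisely the uncertainty the talk argues should be propagated through a composite model rather than collapsed into point estimates.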
## Raquel Urtasun
Deep learning algorithms attempt to model high-level abstractions of the data using architectures composed of multiple non-linear transformations. A multiplicity of variants have been proposed and shown to be extremely successful in a wide variety of applications including computer vision, speech recognition, and natural language processing. In this talk I’ll show how to make these representations more powerful by exploiting structure in the outputs, the loss function, and the learned embeddings.

Many problems in real-world applications involve predicting several random variables that are statistically related. Graphical models have typically been employed to represent and exploit the output dependencies. However, most current learning algorithms assume that the models are log-linear in the parameters. In the first part of the talk I’ll show a variety of algorithms that can learn arbitrary functions while exploiting the output dependencies, unifying deep learning and graphical models.

Supervised training of deep neural nets typically relies on minimizing cross-entropy. However, in many domains we are interested in performing well on metrics specific to the application. In the second part of the talk I’ll show a direct loss minimization approach to train deep neural networks, which provably minimizes the task loss. This is often non-trivial, since these loss functions are neither smooth nor decomposable and thus are not amenable to optimization with standard gradient-based methods. I’ll demonstrate the applicability of this general framework in the context of maximizing average precision, a structured loss commonly used to evaluate ranking problems.

Deep learning has become a very popular approach to learn word, sentence and/or image embeddings. Neural embeddings have shown great performance in tasks such as image captioning, machine translation and paraphrasing.
In the last part of my talk I’ll show how to exploit the partial order structure of the visual semantic hierarchy over words, sentences and images to learn order embeddings. I’ll demonstrate the utility of these new representations for hypernym prediction and image-caption retrieval.
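My reading of the order-embedding idea can be sketched with a violation penalty that is zero exactly when the partial order holds coordinate-wise; the toy vectors and the `order_violation` helper are illustrative assumptions, not the talk's actual embeddings:

```python
import numpy as np

def order_violation(general, specific):
    """Penalty that is zero when `specific` dominates `general` in every
    coordinate, i.e. the hypernym sits below its instances in the order."""
    return float(np.sum(np.maximum(0.0, general - specific) ** 2))

# Toy 2-D embeddings: the more general concept has the smaller coordinates.
animal = np.array([0.1, 0.2])
dog = np.array([0.5, 0.3])   # dominates `animal` coordinate-wise
cat = np.array([0.05, 0.4])  # violates the order in dimension 0

print(order_violation(animal, dog))  # 0.0: ordering satisfied
print(order_violation(animal, cat))  # > 0: ordering violated
```

Training then pushes the penalty toward zero for true hypernym or caption-image pairs and enforces a margin for negative pairs, so the asymmetry of entailment is built into the geometry rather than learned from scratch.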
## Best Paper Awards
This year, the program committee decided to grant two Best Paper Awards to papers singled out for their impressive and original scientific contributions. The recipients of a Best Paper Award for ICLR 2016 are:

- Neural Programmer-Interpreters
  Scott Reed, Nando de Freitas
- Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
  Song Han, Huizi Mao, Bill Dally
The selection was made by the program chairs and was informed by feedback from the reviewers and area chairs.

## Workshop Track Posters (May 2nd)
- Deep Motif: Visualizing Genomic Sequence Classifications
Jack Lanchantin, Ritambhara Singh, Zeming Lin, & Yanjun Qi - Lookahead Convolution Layer for Unidirectional Recurrent Neural Networks
Chong Wang, Dani Yogatama, Adam Coates, Tony Han, Awni Hannun, Bo Xiao - Joint Stochastic Approximation Learning of Helmholtz Machines
Haotian Xu, Zhijian Ou - A Minimalistic Approach to Sum-Product Network Learning for Real Applications
Viktoriya Krakovna, Moshe Looks - Hardware-Oriented Approximation of Convolutional Neural Networks
Philipp Gysel, Mohammad Motamedi, Soheil Ghiasi - Neurogenic Deep Learning
Timothy J. Draelos, Nadine E. Miner, Jonathan A. Cox, Christopher C. Lamb, Conrad D. James, James B. Aimone - Deep Bayesian Neural Nets as Deep Matrix Gaussian Processes
Christos Louizos, Max Welling - Neural Network Training Variations in Speech and Subsequent Performance Evaluation
Ewout van den Berg, Bhuvana Ramabhadran, Michael Picheny - Neural Variational Random Field Learning
Volodymyr Kuleshov, Stefano Ermon - Improving Variational Inference with Inverse Autoregressive Flow
Diederik P. Kingma, Tim Salimans, Max Welling - Learning Genomic Representations to Predict Clinical Outcomes in Cancer
Safoora Yousefi, Congzheng Song, Nelson Nauata, Lee Cooper - Understanding Very Deep Networks via Volume Conservation
Thomas Unterthiner, Sepp Hochreiter - Fixed Point Quantization of Deep Convolutional Networks
Darryl D. Lin, Sachin S. Talathi, V. Sreekanth Annapureddy - CMA-ES for Hyperparameter Optimization of Deep Neural Networks
Ilya Loshchilov, Frank Hutter - Understanding Visual Concepts with Continuation Learning
William F. Whitney, Michael Chang, Tejas Kulkarni, Joshua B. Tenenbaum - ~~Input-Convex Deep Networks~~ (moved to May 3rd) Brandon Amos, J. Zico Kolter - Learning to SMILE(S)
Stanisław Jastrzębski, Damian Leśniak, Wojciech Marian Czarnecki - Learning Retinal Tiling in a Model of Visual Attention
Brian Cheung, Eric Weiss, Bruno Olshausen - Hardware-Friendly Convolutional Neural Network with Even-Number Filter Size
Song Yao, Song Han, Kaiyuan Guo, Jianqiao Wangni, Yu Wang - Do Deep Convolutional Nets Really Need to be Deep (Or Even Convolutional)?
Gregor Urban, Krzysztof J. Geras, Samira Ebrahimi Kahou, Ozlem Aslan, Shengjie Wang, Rich Caruana, Abdelrahman Mohamed, Matthai Philipose, Matt Richardson - Generative Adversarial Metric
Daniel Jiwoong Im, Chris Dongjoo Kim, Hui Jiang, Roland Memisevic - Revise Saturated Activation Functions
Bing Xu, Ruitong Huang, Mu Li - Multi-layer Representation Learning for Medical Concepts
Edward Choi, Mohammad Taha Bahadori, Jimeng Sun, Elizabeth Searles, Catherine Coffey - Alternative structures for character-level RNNs
Piotr Bojanowski, Armand Joulin, Tomas Mikolov - Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning
Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke - Revisiting Distributed Synchronous SGD
Jianmin Chen, Rajat Monga, Samy Bengio, Rafal Jozefowicz - A Differentiable Transition Between Additive and Multiplicative Neurons
Wiebke Koepp, Patrick van der Smagt, Sebastian Urban - Deep Autoresolution Networks
Gabriel Pereyra, Christian Szegedy - Unsupervised Learning with Imbalanced Data via Structure Consolidation Latent Variable Model
Fariba Yousefi, Zhenwen Dai, Carl Henrik Ek, Neil Lawrence - Robust Convolutional Neural Networks under Adversarial Noise
Jonghoon Jin, Aysegul Dundar, Eugenio Culurciello - GradNets: Dynamic Interpolation Between Neural Architectures
Diogo Almeida, Nate Sauder - Resnet in Resnet: Generalizing Residual Architectures
Sasha Targ, Diogo Almeida, Kevin Lyman - Doctor AI: Predicting Clinical Events via Recurrent Neural Networks
Edward Choi, Mohammad Taha Bahadori, Andy Schuetz, Walter F. Stewart, Joshua C. Denny, Bradley A. Malin, Jimeng Sun - On-the-fly Network Pruning for Object Detection
Marc Masana, Joost van de Weijer, Andrew D. Bagdanov - Deep Directed Generative Models with Energy-Based Probability Estimation
Taesup Kim, Yoshua Bengio - Rectified Factor Networks for Biclustering
Djork-Arné Clevert, Thomas Unterthiner, Sepp Hochreiter - RandomOut: Using a convolutional gradient norm to win The Filter Lottery
Joseph Paul Cohen, Henry Z. Lo, Wei Ding - Persistent RNNs: Stashing Weights on Chip
Greg Diamos, Shubho Sengupta, Bryan Catanzaro, Mike Chrzanowski, Adam Coates, Erich Elsen, Jesse Engel, Awni Hannun, Sanjeev Satheesh - Scale Normalization
Henry Z Lo, Kevin Amaral, Wei Ding - Close-to-clean regularization relates virtual adversarial training, ladder networks and others
Mudassar Abbas, Jyri Kivinen, Tapani Raiko - Guided Sequence-to-Sequence Learning with External Rule Memory
Jiatao Gu, Baotian Hu, Zhengdong Lu, Hang Li, Victor O.K. Li - Neural Text Understanding with Attention Sum Reader
Rudolf Kadlec, Martin Schmid, Ondřej Bajgar, Jan Kleindienst - Incorporating Nesterov Momentum into Adam
Timothy Dozat - Variational Inference for On-line Anomaly Detection in High-Dimensional Time Series
Maximilian Sölch, Justin Bayer, Marvin Ludersdorfer, Patrick van der Smagt - Sequence-to-Sequence RNNs for Text Summarization
Ramesh Nallapati, Bing Xiang, Bowen Zhou - Neural Generative Question Answering
Jun Yin, Xin Jiang, Zhengdong Lu, Lifeng Shang, Hang Li, Xiaoming Li - Learning Document Embeddings by Predicting N-grams for Sentiment Classification of Long Movie Reviews
Bofang Li, Tao Liu, Xiaoyong Du, Deyuan Zhang, Zhe Zhao - Autoencoding for Joint Relation Factorization and Discovery from Text
Diego Marcheggiani, Ivan Titov - Adaptive Natural Gradient Learning Based on Riemannian Metric of Score Matching
Ryo Karakida, Masato Okada, Shun-ichi Amari - Neural Enquirer: Learning to Query Tables in Natural Language
Pengcheng Yin, Zhengdong Lu, Hang Li, Ben Kao - End to end speech recognition in English and Mandarin
Dario Amodei, Rishita Anubhai, Eric Battenberg, Carl Case, Jared Casper, Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Adam Coates, Greg Diamos, Erich Elsen, Jesse Engel, Linxi Fan, Christopher Fougner, Tony Han, Awni Hannun, Billy Jun, Patrick LeGresley, Libby Lin, Sharan Narang, Andrew Ng, Sherjil Ozair, Ryan Prenger, Jonathan Raiman, Sanjeev Satheesh, David Seetapun, Shubho Sengupta, Yi Wang, Zhiqian Wang, Chong Wang, Bo Xiao, Dani Yogatama, Jun Zhan, Zhenyao Zhu - Lessons from the Rademacher Complexity for Deep Learning
Jure Sokolic, Raja Giryes, Guillermo Sapiro, Miguel R. D. Rodrigues - Coverage-based Neural Machine Translation
Zhaopeng Tu, Zhengdong Lu, Yang Liu, Xiaohua Liu, Hang Li - Feed-Forward Networks with Attention Can Solve Some Long-Term Memory Problems
Colin Raffel, Daniel P. W. Ellis - Learning stable representations in a changing world with on-line t-SNE: proof of concept in the songbird
Stéphane Deny, Emily Mackevicius, Tatsuo Okubo, Gordon Berman, Joshua Shaevitz, Michale Fee
## Workshop Track Posters (May 3rd)
- Mixtures of Sparse Autoregressive Networks
Marc Goessling, Yali Amit - Comparative Study of Caffe, Neon, Theano, and Torch for Deep Learning
Soheil Bahrampour, Naveen Ramakrishnan, Lukas Schott, Mohak Shah - Action Recognition using Visual Attention
Shikhar Sharma, Ryan Kiros, Ruslan Salakhutdinov - Improving performance of recurrent neural network with relu nonlinearity
Sachin S. Talathi, Aniket Vartak - Visualizing and Understanding Recurrent Networks
Andrej Karpathy, Justin Johnson, Li Fei-Fei - Learning to Decompose for Object Detection and Instance Segmentation
Eunbyung Park, Alexander C. Berg - Learning visual groups from co-occurrences in space and time
Phillip Isola, Daniel Zoran, Dilip Krishnan, Edward H. Adelson - Spatio-Temporal Video Autoencoder with Differentiable Memory
Viorica Patraucean, Ankur Handa, Roberto Cipolla - Task Loss Estimation for Structured Prediction
Dzmitry Bahdanau, Dmitriy Serdyuk, Philémon Brakel, Nan Rosemary Ke, Jan Chorowski, Aaron Courville, Yoshua Bengio - Conditional computation in neural networks for faster models
Emmanuel Bengio, Pierre-Luc Bacon, Joelle Pineau, Doina Precup - A metric learning approach for graph-based label propagation
Pauline Wauquier, Mikaela Keller - Bidirectional Helmholtz Machines
Jorg Bornschein, Samira Shabanian, Asja Fischer, Yoshua Bengio - A Controller-Recognizer Framework: How Necessary is Recognition for Control?
Marcin Moczulski, Kelvin Xu, Aaron Courville, Kyunghyun Cho - Online Batch Selection for Faster Training of Neural Networks
Ilya Loshchilov, Frank Hutter - Nonparametric Canonical Correlation Analysis
Tomer Michaeli, Weiran Wang, Karen Livescu - Document Context Language Models
Yangfeng Ji, Trevor Cohn, Lingpeng Kong, Chris Dyer, Jacob Eisenstein - Unsupervised Learning of Visual Structure using Predictive Generative Networks
William Lotter, Gabriel Kreiman, David Cox - Convolutional Clustering for Unsupervised Learning
Aysegul Dundar, Jonghoon Jin, Eugenio Culurciello - ParseNet: Looking Wider to See Better
Wei Liu, Andrew Rabinovich, Alexander C. Berg - Why are deep nets reversible: A simple theory, with implications for training
Sanjeev Arora, Yingyu Liang, Tengyu Ma - Binding via Reconstruction Clustering
Klaus Greff, Rupesh Srivastava, Jürgen Schmidhuber - Dynamic Capacity Networks
Amjad Almahairi, Nicolas Ballas, Tim Cooijmans, Yin Zheng, Hugo Larochelle, Aaron Courville - Learning Representations of Affect from Speech
Sayan Ghosh, Eugene Laksana, Louis-Philippe Morency, Stefan Scherer - Neural Variational Inference for Text Processing
Yishu Miao, Lei Yu, Phil Blunsom - A Deep Memory-based Architecture for Sequence-to-Sequence Learning
Fandong Meng, Zhengdong Lu, Zhaopeng Tu, Hang Li, Qun Liu - Deconstructing the Ladder Network Architecture
Mohammad Pezeshki, Linxi Fan, Philemon Brakel, Aaron Courville, Yoshua Bengio - Neural network-based clustering using pairwise constraints
Yen-Chang Hsu, Zsolt Kira - LSTM-based Deep Learning Models for non-factoid answer selection
Ming Tan, Cicero dos Santos, Bing Xiang, Bowen Zhou - Using Deep Learning to Predict Demographics from Mobile Phone Metadata
Bjarke Felbo, Pål Sundsøy, Alex 'Sandy' Pentland, Sune Lehmann, Yves-Alexandre de Montjoye - Efficient Inference in Occlusion-Aware Generative Models of Images
Jonathan Huang, Kevin Murphy - Convolutional Models for Joint Object Categorization and Pose Estimation
Mohamed Elhoseiny, Tarek El-Gaaly, Amr Bakry, Ahmed Elgammal - Basic Level Categorization Facilitates Visual Object Recognition
Panqu Wang, Garrison Cottrell - Learning to Represent Words in Context with Multilingual Supervision
Kazuya Kawakami, Chris Dyer - Fine-grained pose prediction, normalization, and recognition
Ning Zhang, Evan Shelhamer, Yang Gao, Trevor Darrell - Scalable Gradient-Based Tuning of Continuous Regularization Hyperparameters
Jelena Luketina, Mathias Berglund, Tapani Raiko - Unitary Evolution Recurrent Neural Networks
Martin Arjovsky, Amar Shah, Yoshua Bengio - Temporal Convolutional Neural Networks for Diagnosis from Lab Tests
Narges Razavian, David Sontag - PerforatedCNNs: Acceleration through Elimination of Redundant Convolutions
Michael Figurnov, Dmitry Vetrov, Pushmeet Kohli - How far can we go without convolution: Improving fully-connected networks
Zhouhan Lin, Roland Memisevic, Kishore Konda - Learning Dense Convolutional Embeddings for Semantic Segmentation
Adam W. Harley, Konstantinos G. Derpanis, Iasonas Kokkinos - Generating Sentences from a Continuous Space
Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz, Samy Bengio - Stacked What-Where Auto-encoders
Junbo Zhao, Michael Mathieu, Ross Goroshin, Yann LeCun - Bayesian Convolutional Neural Networks with Bernoulli Approximate Variational Inference
Yarin Gal, Zoubin Ghahramani - Blending LSTMs into CNNs
Krzysztof J. Geras, Abdel-rahman Mohamed, Rich Caruana, Gregor Urban, Shengjie Wang, Ozlem Aslan, Matthai Philipose, Matthew Richardson, Charles Sutton - Empirical performance upper bounds for image and video captioning
Li Yao, Nicolas Ballas, Kyunghyun Cho, John R. Smith, Yoshua Bengio - Adversarial Autoencoders
Alireza Makhzani, Jonathon Shlens, Navdeep Jaitly, Ian Goodfellow - Deep Reinforcement Learning with an Action Space Defined by Natural Language
Ji He, Jianshu Chen, Xiaodong He, Jianfeng Gao, Lihong Li, Li Deng, Mari Ostendorf - Universum Prescription: Regularization using Unlabeled Data
Xiang Zhang, Yann LeCun - Variance Reduction in SGD by Distributed Importance Sampling
Guillaume Alain, Alex Lamb, Chinnadhurai Sankar, Aaron Courville, Yoshua Bengio - Adding Gradient Noise Improves Learning for Very Deep Networks
Arvind Neelakantan, Luke Vilnis, Quoc V. Le, Ilya Sutskever, Lukasz Kaiser, Karol Kurach, James Martens - Black Box Variational Inference for State Space Models
Evan Archer, Il Memming Park, Lars Buesing, John Cunningham, Liam Paninski - Input-Convex Deep Networks
Brandon Amos, J. Zico Kolter
## Accepted Papers (Conference Track)
- Multi-Scale Context Aggregation by Dilated Convolutions
Fisher Yu, Vladlen Koltun - The Variational Fair Autoencoder
Christos Louizos, Kevin Swersky, Yujia Li, Max Welling, Richard Zemel - A note on the evaluation of generative models
Lucas Theis, Aäron van den Oord, Matthias Bethge - Learning to Diagnose with LSTM Recurrent Neural Networks
Zachary Lipton, David Kale, Charles Elkan, Randall Wetzel - Prioritized Experience Replay
Tom Schaul, John Quan, Ioannis Antonoglou, David Silver - Importance Weighted Autoencoders
Yuri Burda, Ruslan Salakhutdinov, Roger Grosse - Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
Song Han, Huizi Mao, Bill Dally - Variationally Auto-Encoded Deep Gaussian Processes
Zhenwen Dai, Andreas Damianou, Javier Gonzalez, Neil Lawrence - Training Convolutional Neural Networks with Low-rank Filters for Efficient Image Classification
Yani Ioannou, Duncan Robertson, Jamie Shotton, Roberto Cipolla, Antonio Criminisi - Neural Networks with Few Multiplications
Zhouhan Lin, Matthieu Courbariaux, Roland Memisevic, Yoshua Bengio - Reducing Overfitting in Deep Networks by Decorrelating Representations
Michael Cogswell, Faruk Ahmed, Ross Girshick, Larry Zitnick, Dhruv Batra - Pushing the Boundaries of Boundary Detection using Deep Learning
Iasonas Kokkinos - Generating Images from Captions with Attention
Elman Mansimov, Emilio Parisotto, Jimmy Ba, Ruslan Salakhutdinov - Reasoning about Entailment with Neural Attention
Tim Rocktäschel, Edward Grefenstette, Karl Moritz Hermann, Tomáš Kočiský, Phil Blunsom - Convolutional Neural Networks With Low-rank Regularization
Cheng Tai, Tong Xiao, Yi Zhang, Xiaogang Wang, Weinan E - Unifying distillation and privileged information
David Lopez-Paz, Leon Bottou, Bernhard Schölkopf, Vladimir Vapnik - Particular object retrieval with integral max-pooling of CNN activations [code]
Giorgos Tolias, Ronan Sicre, Hervé Jégou - Bayesian Representation Learning with Oracle Constraints
Theofanis Karaletsos, Serge Belongie, Gunnar Rätsch - Neural Programmer: Inducing Latent Programs with Gradient Descent
Arvind Neelakantan, Quoc Le, Ilya Sutskever - Towards Universal Paraphrastic Sentence Embeddings [code]
John Wieting, Mohit Bansal, Kevin Gimpel, Karen Livescu - Regularizing RNNs by Stabilizing Activations
David Krueger, Roland Memisevic - SparkNet: Training Deep Networks in Spark
Philipp Moritz, Robert Nishihara, Ion Stoica, Michael Jordan - Unsupervised and Semi-supervised Learning with Categorical Generative Adversarial Networks
Jost Tobias Springenberg - The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations
Felix Hill, Antoine Bordes, Sumit Chopra, Jason Weston - MuProp: Unbiased Backpropagation For Stochastic Neural Networks
Shixiang Gu, Sergey Levine, Ilya Sutskever, Andriy Mnih - Data Representation and Compression Using Linear-Programming Approximations
Hristo Paskov, John Mitchell, Trevor Hastie - Diversity Networks
Zelda Mariet, Suvrit Sra - Deep Reinforcement Learning in Parameterized Action Space [code] [data]
Matthew Hausknecht, Peter Stone - Learning Visual Predictive Models of Physics for Playing Billiards
Katerina Fragkiadaki, Pulkit Agrawal, Sergey Levine, Jitendra Malik - Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks [code] [data]
Jason Weston, Antoine Bordes, Sumit Chopra, Sasha Rush, Bart van Merrienboer, Armand Joulin, Tomas Mikolov - Evaluating Prerequisite Qualities for Learning End-to-end Dialog Systems [data]
Jesse Dodge, Andreea Gane, Xiang Zhang, Antoine Bordes, Sumit Chopra, Alexander Miller, Arthur Szlam, Jason Weston - Better Computer Go Player with Neural Network and Long-term Prediction
Yuandong Tian, Yan Zhu - Distributional Smoothing with Virtual Adversarial Training [code]
Takeru Miyato, Shin-ichi Maeda, Masanori Koyama, Ken Nakae, Shin Ishii - Multi-task Sequence to Sequence Learning
Minh-Thang Luong, Quoc Le, Ilya Sutskever, Oriol Vinyals, Lukasz Kaiser - A Test of Relative Similarity for Model Selection in Generative Models
Eugene Belilovsky, Wacha Bounliphone, Matthew Blaschko, Ioannis Antonoglou, Arthur Gretton - Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications
Yong-Deok Kim, Eunhyeok Park, Sungjoo Yoo, Taelim Choi, Lu Yang, Dongjun Shin - Neural Programmer-Interpreters
Scott Reed, Nando de Freitas - Session-based recommendations with recurrent neural networks [code]
Balázs Hidasi, Alexandros Karatzoglou, Linas Baltrunas, Domonkos Tikk - Continuous control with deep reinforcement learning
Timothy Lillicrap, Jonathan Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra - Recurrent Gaussian Processes
César Lincoln Mattos, Zhenwen Dai, Andreas Damianou, Jeremy Forth, Guilherme Barreto, Neil Lawrence - Modeling Visual Representations: Defining Properties and Deep Approximations
Stefano Soatto, Alessandro Chiuso - Auxiliary Image Regularization for Deep CNNs with Noisy Labels
Samaneh Azadi, Jiashi Feng, Stefanie Jegelka, Trevor Darrell - Convergent Learning: Do different neural networks learn the same representations?
Yixuan Li, Jason Yosinski, Jeff Clune, Hod Lipson, John Hopcroft - Policy Distillation
Andrei Rusu, Sergio Gomez, Caglar Gulcehre, Guillaume Desjardins, James Kirkpatrick, Razvan Pascanu, Volodymyr Mnih, Koray Kavukcuoglu, Raia Hadsell - Neural Random-Access Machines
Karol Kurach, Marcin Andrychowicz, Ilya Sutskever - Gated Graph Sequence Neural Networks
Yujia Li, Daniel Tarlow, Marc Brockschmidt, Richard Zemel, CIFAR - Metric Learning with Adaptive Density Discrimination
Oren Rippel, Manohar Paluri, Piotr Dollar, Lubomir Bourdev - Censoring Representations with an Adversary
Harrison Edwards, Amos Storkey - Order-Embeddings of Images and Language [code]
Ivan Vendrov, Ryan Kiros, Sanja Fidler, Raquel Urtasun - Variable Rate Image Compression with Recurrent Neural Networks
George Toderici, Sean O'Malley, Damien Vincent, Sung Jin Hwang, Michele Covell, Shumeet Baluja, Rahul Sukthankar, David Minnen - Delving Deeper into Convolutional Networks for Learning Video Representations
Nicolas Ballas, Li Yao, Chris Pal, Aaron Courville - Data-dependent initializations of Convolutional Neural Networks [code]
Philipp Kraehenbuehl, Carl Doersch, Jeff Donahue, Trevor Darrell - Order Matters: Sequence to sequence for sets
Oriol Vinyals, Samy Bengio, Manjunath Kudlur - High-Dimensional Continuous Control Using Generalized Advantage Estimation
John Schulman, Philipp Moritz, Sergey Levine, Michael Jordan, Pieter Abbeel - BlackOut: Speeding up Recurrent Neural Network Language Models With Very Large Vocabularies [code]
Shihao Ji, Swaminathan Vishwanathan, Nadathur Satish, Michael Anderson, Pradeep Dubey - Deep Multi Scale Video Prediction Beyond Mean Square Error
Michael Mathieu, Camille Couprie, Yann LeCun - Grid Long Short-Term Memory
Nal Kalchbrenner, Alex Graves, Ivo Danihelka - Net2Net: Accelerating Learning via Knowledge Transfer
Tianqi Chen, Ian Goodfellow, Jon Shlens - Predicting distributions with Linearizing Belief Networks
Yann Dauphin, David Grangier - Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)
Djork-Arné Clevert, Thomas Unterthiner, Sepp Hochreiter - Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning
Emilio Parisotto, Jimmy Ba, Ruslan Salakhutdinov - Segmental Recurrent Neural Networks
Lingpeng Kong, Chris Dyer, Noah Smith - Large-Scale Approximate Kernel Canonical Correlation Analysis
Weiran Wang, Karen Livescu - Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
Alec Radford, Luke Metz, Soumith Chintala - Learning Representations from EEG with Deep Recurrent-Convolutional Neural Networks [code]
Pouya Bashivan, Irina Rish, Mohammed Yeasin, Noel Codella - Digging Deep into the layers of CNNs: In Search of How CNNs Achieve View Invariance
Amr Bakry, Mohamed Elhoseiny, Tarek El-Gaaly, Ahmed Elgammal - An Exploration of Softmax Alternatives Belonging to the Spherical Loss Family
Alexandre De Brébisson, Pascal Vincent - Data-Dependent Path Normalization in Neural Networks
Behnam Neyshabur, Ryota Tomioka, Ruslan Salakhutdinov, Nathan Srebro - Reasoning in Vector Space: An Exploratory Study of Question Answering
Moontae Lee, Xiaodong He, Wen-tau Yih, Jianfeng Gao, Li Deng, Paul Smolensky - ACDC: A Structured Efficient Linear Layer
Marcin Moczulski, Misha Denil, Jeremy Appleyard, Nando de Freitas - Density Modeling of Images using a Generalized Normalization Transformation
Johannes Ballé, Valero Laparra, Eero Simoncelli - Adversarial Manipulation of Deep Representations [code]
Sara Sabour, Yanshuai Cao, Fartash Faghri, David Fleet - Geodesics of learned representations
Olivier Hénaff, Eero Simoncelli - Sequence Level Training with Recurrent Neural Networks
Marc'Aurelio Ranzato, Sumit Chopra, Michael Auli, Wojciech Zaremba - Super-resolution with deep convolutional sufficient statistics
Joan Bruna, Pablo Sprechmann, Yann LeCun - Variational Gaussian Process
Dustin Tran, Rajesh Ranganath, David Blei
## Presentation Guidelines

## Conference Orals
Talks should be no longer than 17 minutes, leaving 2-3 minutes for questions from the audience. The author giving the talk must find the oral session chair in advance to test their personal laptop for presenting the slides. Speakers scheduled before the morning coffee break should do a laptop test before the morning session starts; other speakers can test during the coffee break.

## Poster Presentations
The poster boards are 4 ft. high by 8 ft. wide. Poster presenters are encouraged to put up their posters as early as the day's morning coffee break (10:20 to 10:50). Each poster is assigned a number, shown above; presenters should use the poster board corresponding to the number for their work. Once the poster session is over, presenters have until the end of the day to remove their posters from their assigned boards.