NIPS 2016 papers

Originally by @karpathy, updated by Finlay Maguire
Source available on GitHub
Below each paper are the 100 most frequently occurring words in that paper; each word's color is based on an LDA topic model with k = 7.
(It looks like 0 = ?, 1 = ?, 2 = ?, 3 = ?, 4 = ?, 5 = ?, 6 = ? etc.)
Toggle LDA topics to sort by: TOPIC0 TOPIC1 TOPIC2 TOPIC3 TOPIC4 TOPIC5 TOPIC6
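As a rough illustration of how the per-paper word lists could be produced, here is a minimal sketch of the display step only: count the most frequent words in a paper's text, then bucket them by topic id into seven bracketed groups. It is not the page's actual source code. The function names, the tokenization rule, the minimum word length, and the `word_topic` mapping are all assumptions made for illustration; in practice the mapping would come from a fitted LDA model (for example, the most probable topic for each word), which is not shown here.

```python
import re
from collections import Counter

NUM_TOPICS = 7  # k = 7, as stated in the page header


def top_words(text, n=100, stopwords=frozenset()):
    """Return the n most frequent lowercase alphabetic tokens in text.

    Tokenization and the minimum length of 3 are assumptions for this sketch.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if len(t) > 2 and t not in stopwords)
    return [word for word, _ in counts.most_common(n)]


def group_by_topic(words, word_topic, num_topics=NUM_TOPICS):
    """Bucket words by topic id, keeping frequency order within each bucket.

    word_topic is a hypothetical mapping from word to topic id (0..k-1),
    e.g. derived from a fitted LDA model. Words missing from it are skipped.
    """
    groups = [[] for _ in range(num_topics)]
    for word in words:
        topic = word_topic.get(word)
        if topic is not None:
            groups[topic].append(word)
    return groups


def format_groups(groups):
    """Render the buckets in the bracketed style used in the listing."""
    return " ".join("[" + ", ".join(g) + "]" for g in groups)


# Toy usage with made-up text and a made-up topic assignment.
text = "coreset coreset coreset matrix matrix sparse algorithm algorithm the"
word_topic = {"coreset": 4, "matrix": 3, "sparse": 3, "algorithm": 1}
words = top_words(text, n=100, stopwords={"the"})
print(format_groups(group_by_topic(words, word_topic)))
```

Each bracketed group corresponds to one topic, in order from TOPIC0 to TOPIC6, so an empty pair of brackets means none of the paper's top words fell into that topic.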
Dimensionality Reduction of Massive Sparse Datasets Using Coresets
Dan Feldman, Mikhail Volkov, Daniela Rus


[weighted, sum, subset, whose, thus, complete, acm] [algorithm, theorem, every, proof, let, show, set, prove, problem, general, case, cardinality, since] [approximation, size, reduction, large, compute, computing, sampling, supplementary, latent, efficient, log, end] [matrix, sparse, can, error, wikipedia, rank, computation, first, vector, svd, analysis, small, see, low, running, vit, row, pca, symposium, solution, subspace, main, relative, paper, sparsity, synthetic, sketch, eps] [coreset, dimensionality, coresets, data, random, squared, section, independent, given, result, based, project, practical] [time, research, english] [input, using, used, weight, approach, use, original, different, please, recent]
Nearly Isometric Embedding by Relaxation
James McQueen, Marina Meila, Dominique Joncas


[graph, average, find] [loss, algorithm, set, function, learning, will, let, convex, choose, define, case, obtain, every] [gradient, compute, smooth, method, convergence, operator, initial, step, dual, coordinate, size, requires, machine, computed, end, subsample] [dimension, can, noise, matrix, subspace, principal, one, proposition, isometry, norm, denote, rank, spectral, symmetric, existing, via, order, low, noisy, linear] [data, embedding, metric, riemannian, point, manifold, distortion, laplacian, space, isomap, embeddings, lossk, isometric, given, pushforward, measure, mvu, journal, kernel, hlle, euclidean, sphere, sample, tangent, rincipal, hourglass, distance, based, geometric, dimensionality, null, estimator, embedded, definite] [intrinsic, optimizing, along] [using, propose, use, output, figure]
Deep Submodular Functions: Definitions and Learning
Brian W. Dolhansky, Jeff A. Bilmes


[partition, represent, cycle, strictly, many, graph, represented, include, possible, normalized, decomposable, number, called] [submodular, function, set, learning, matroid, dsfs, concave, dsf, modular, laminar, may, show, monotone, since, defined, general, might, theorem, greedy, antitone, scms, matroids, arbitrary, define, scmms, class, maximizing, even, maximization, summarization, context, lemma, submodularity] [machine, learnt, size, large, optimization] [can, rank, one, via, also, linear, analysis, require] [given, family, useful, data, based, result, section, associated] [form, within, extend, showing, value] [feature, ground, used, learn, approach, deep, figure, generalize, neural, layer, training, use, using, trained, image]
Causal meets Submodular: Subset Selection with Directed Information
Yuxun Zhou, Costas J. Spanos


[causal, directed, subset, structure, definition, monotonic, larger, covariate, normalized, theory, degree, denoted] [greedy, function, submodular, smi, set, submodularity, algorithm, bound, guarantee, cardinality, problem, index, learning, general, maximizing, show, causally, defined, lemma, causality, optimal, dependence, since, theorem, maximization, lower, placement, close, provide, class, possibly, every, near] [objective, constrained, derivative, although] [can, also, analysis, first, one, much, solution, note, second, proposition, reconstructed, ieee, local] [random, selection, two, theoretical, data, covariates, given, selected, universal, journal, provides, measure] [information, sensor, location, research, process, value, series] [performance, figure, network, conditioned, work, used, proposed, better]
A Consistent Regularization Approach for Structured Prediction
Carlo Ciliberto, Lorenzo Rosasco, Alessandro Rudi


[consistency, number, induced, probability] [loss, surrogate, learning, problem, least, considered, set, algorithm, case, prove, risk, let, satisfy, function, inequality, since, conference, consider, general, minimizer, binary, lemma, ask] [argmin, large, processing, machine, step] [can, regression, comparison, following, rank, ranking, note, linear, paper, decoding, sampled, solution, analysis, regularization, also, vector] [kernel, estimator, generalization, given, statistical, universal, reproducing, space, robust, finite, derivation, gaussian, corresponding, infinite, particular, kde, empirical, result, hellinger, hilbert] [whether, information, framework] [structured, prediction, approach, proposed, training, work, using, natural, question, neural, classifier, image, output, input, classification, reconstruction, compact]
Bayesian latent structure discovery from multi-neuron recordings
Scott Linderman, Ryan P. Adams, Jonathan W. Pillow


[structure, adjacency, block, number, probability] [function, binary, bernoulli, consider, since, algorithm, provide] [latent, bayesian, inference, glm, auxiliary, stochastic, sampling, standard, likelihood, posterior, fit, efficient, logistic, scalability] [can, matrix, linear, synthetic, correlated, one, via, also] [true, distribution, gaussian, distance, data, given, discrete, underlying, random, conditional, gibbs, population, mean, collapsed, functional, journal, joint, conditionally, independent] [spike, model, cell, time, neuron, retinal, ganglion, inferred, binomial, spiking, activity, response, neuronal, modeling, interpretable] [neural, network, connection, activation, table, approach, shown, figure, augmentation, using, receptive, weight, like, capture]
Lifelong Learning with Weighted Majority Votes
Anastasia Pentina, Ruth Urner


[majority, base, weighted, number, probability, theory, vote, total] [learning, set, algorithm, lifelong, complexity, hypothesis, every, class, theorem, conference, will, lsi, encountered, learner, setting, upper, show, assume, let, exists, bounded, consider, implies, max, formulate, case] [machine, solved, marginal, draw, end] [can, error, linear, one, note, assumption, first, dimension, arg, observed, related, solving, small, also, analysis, need, following, min] [sample, section, data, well, distribution] [information, new, previously, current, model, therefore, autonomous, required] [task, learned, used, training, sequence, ground, previous, neural, transfer, representation, classifier, work, international, performance, relatedness, using, feature, similar, prediction]
Global Optimality of Local Search for Low Rank Matrix Recovery
Srinadh Bhojanapalli, Behnam Neyshabur, Nati Srebro


[number, present, probability] [problem, theorem, consider, case, lemma, proof, close, satisfies, learning, show, conference, convex, let, even, now, general, algorithm, property, best] [optimization, gradient, convergence, descent, saddle, operator, machine, showed, efficient, stochastic] [local, matrix, global, rank, first, order, can, also, recovery, noiseless, spurious, condition, linear, initialization, low, noisy, orthonormal, optimality, measurement, second, minimum, isometry, optimum, rip, via, following, svd, sensing, error, success, semidefinite, require, hai, restricted] [random, point, result, gaussian, given, section, stationary, space, well] [search, information, significant] [work, using, preprint, used, similar, arxiv, recent, neural, international]
More Supervision, Less Computation: Statistical-Computational Tradeoffs in Weakly Supervised Learning
Xinyang Yi, Zhaoran Wang, Zhuoran Yang, Constantine Caramanis, Han Liu


[detection, definition, number, thus, probability] [query, problem, oracle, minimax, bound, function, lower, sup, consider, algorithm, theorem, learning, defined, label, rate, asymptotically, weakly, hypothesis, setting, constant, define, binary, uncorrupted, assume, risk, complexity, focus, let, set, exists, known, optimal, absolute, upper] [log, parameter, tractable, efficient] [computational, sparse, denote, high, gap, first, matrix, phase, analysis, satisfying, via, detecting, principal, condition, observed, also] [statistical, test, testing, random, two, sample, computationally, estimation, given, dimensional, data, covariance, polynomial, efficiency, distribution] [model, information, characterize] [supervised, sequence, classification, arxiv, preprint, compared, use, unsupervised, accuracy, work, supervision]
Disease Trajectory Maps
Peter Schulam, Raman Arora


[number, probabilistic, include, many, obtained, clustering] [lower, learning, let, may, known, study, set, make, depends, focus] [variational, inducing, log, posterior, inference, approximate, bayesian, compute, stochastic, respect] [can, sparse, denote, vector, also, matrix, principal, see, first, sampled, analysis, column] [clinical, data, gaussian, basis, two, functional, distribution, longitudinal, given, covariance, test, mean, important, kernel, space, multivariate] [disease, dtm, trajectory, time, model, lmm, process, across, scleroderma, pfvc, tss, fpca, pulmonary, series, lung, skin, ini, complex, observation, prior, new] [using, use, representation, similar, figure, marker, learned, used, learn, capture]
Edge-exchangeable graphs and sparsity
Diana Cai, Trevor Campbell, Tamara Broderick


[graph, edge, number, exchangeability, exchangeable, vertex, probability, many, kallenberg, fox, caron, collection, multiplicity, crane, greater, projective, growing, infinity, grows, added, multigraph, according] [binary, set, consider, active, let, every, since, setting, may, notion, will, theorem, function, drawn, slope, rate] [step, latent, bayesian, beta, iid] [frequency, sparse, can, sparsity, via, order, invariant, power, one, see] [random, measure, distribution, section, finite, given, data, point, infinite, nonparametric, permutation, based, asymptotic] [model, process, new, poisson, form, behavior, demonstrate, framework] [sequence, network, arxiv, dense, figure, single, use, generated, generate, generative, work, different]
Density Estimation via Discrepancy Based Adaptive Sequential Partition
Dangna Li, Kun Yang, Wing Hung Wong


[partition, piecewise, number, find, cluster, tree, total, split, probability] [function, constant, algorithm, defined, set, binary, let, opt, theorem, bound, bounded, problem, achieves, learning, class] [method, carlo, convergence, monte, bayesian, size, objective, supplementary] [can, one, good, error, first, following, see, dimension, denote, second, order, initialization, also, analysis] [density, estimation, based, dsp, discrepancy, data, distance, hellinger, star, true, given, section, estimate, bsp, nonparametric, distribution, estimator, kde, two, random, kernel, sample, estimated, computationally, underlying, ingredient] [time, demonstrate, along, location] [used, table, use, domain, figure, propose, three, different]
Supervised learning through the lens of compression
Ofir David, Shay Moran, Amir Yehudayoff


[equivalence, theory, whose] [compression, scheme, agnostic, pac, learning, loss, learnability, theorem, hypothesis, let, implies, learnable, every, proof, function, general, class, rate, equivalent, show, learner, multiclass, appears, version, exists, compactness, subsection, property, case, follows, dichotomy, algorithm, realizable, study, consider, constant, combinatorial, manfred, defined, inf, context, setting, may, even, israel, set, label, argument, known] [size, log, convergence, approximate, full, machine, shai] [dimension, following, linear, error, can, first] [sample, selection, uniform, statistical, distribution, given, based, two, finite, section, empirical, journal] [showing, statement] [part, use, input, work, categorization, similar, using, compressing]
Clustering with Same-Cluster Queries
Hassan Ashtiani, Shrinu Kushagra, Shai Ben-David


[clustering, number, cluster, probability, subset, conforms, clusterability, definition, ssac] [algorithm, set, query, oracle, problem, satisfies, show, let, complexity, property, will, notion, optimal, hardness, theorem, instance, access, provide, prove, lemma, proof, lower, bound, case, least, setting, even, appendix, assume, cost, function, exists, niceness, balcan, consider] [log, efficient] [can, solution, one, computational, also, note, phase, first, following, condition, hard] [euclidean, data, two, result, given, center, polynomial, provided, mean] [framework, target, another, whether, form, time, expert] [domain, supervision, without, using, work, natural, different, approach, help]
Deep ADMM-Net for Compressive Sensing MRI
Yan Yang, Jian Sun, Huibin Li, Zongben Xu


[graph, number, average] [function, fast, defined, algorithm, loss, learning, general, ratio, achieves, set] [stage, sampling, method, update, computed, optimization, gradient, compute] [admm, mri, can, reconstructed, regularization, sensing, nmse, computational, sparse, magnetic, multiplier, resonance, compressive, imaging, following, ieee, also, significantly, first, min, iterative, chest, dictionary, arg, pano] [data, nonlinear, test, corresponding, shrinkage, given] [brain, operation, time, direction] [reconstruction, layer, image, deep, network, flow, transform, using, output, convolution, accuracy, figure, different, learned, initialized, training, shown, filter, psnr, four, three, learn, architecture, novel, compared, train, dct, net]
Temporal Regularized Matrix Factorization for High-dimensional Time Series Prediction
Hsiang-Fu Yu, Nikhil Rao, Inderjit S. Dhillon


[graph, tar, forecasting, among, dependency, many, structure, incorporate, negative] [consider, set, theorem, conference, appendix, learning, let, general, kxt, problem, might, even] [latent, standard, large, autoregressive, method, usually] [can, matrix, missing, regularizer, trmf, factorization, regularization, existing, handle, also, see, dlm, following, min, regularizers, gar, highly, yit, lag, note, electricity, regularized, formulation, linear, one] [data, section, two, gaussian, corresponding, given, embeddings, covariance, dimensional] [time, temporal, series, model, framework, unlike, simple, traffic] [use, approach, figure, international, used, weight, table, proposed, novel, scale, learn, prediction]
Rényi Divergence Variational Inference
Yingzhen Li, Richard E. Turner


[negative, definition, propagation] [bound, learning, conference, algorithm, special, consider, case, theorem, obtain, function, now] [variational, approximation, log, inference, bayesian, posterior, stochastic, approximate, divergence, likelihood, method, optimisation, datasets, machine, energy, importance, expectation, vae, however, frey, monte, carlo, lvi, iwae, marginal, processing, sep, wmax] [can, also, exact, one, following, first, local] [test, section, gaussian, distribution, fixed, family, mle, point, finite, bias, theoretical, sample, useful, alpha] [new, model, framework, information, considers, upon] [figure, neural, using, different, international, proposed, used, face, applied, network, recent, work]
Fast Algorithms for Robust PCA via Gradient Descent
Xinyang Yi, Dohyung Park, Yudong Chen, Constantine Caramanis


[partial, probability, number, thus] [algorithm, complexity, consider, setting, problem, let, show, guarantee, theorem, convex, assume, case, fast, constant, set, annual, even, satisfies] [gradient, method, log, processing, step, deterministic, descent, faster] [matrix, sparse, running, can, observed, rank, phase, svd, via, pca, sinit, corruption, completion, linear, error, norm, following, ieee, first, row, suppose, alternating, projected, altproj, exact, sujay, singular, min, partially, note, initialization, nonzero, low, symposium, ialm, emmanuel, factorized] [robust, given, robustness, two, based, estimator, statistical, result] [time, information, model, observation] [using, figure, preprint, neural, arxiv, propose, compared, fully, produce]
Memory-Efficient Backpropagation Through Time
Audrunas Gruslys, Remi Munos, Ivo Danihelka, Marc Lanctot, Alex Graves


[definition, total, number, established, reverse] [cost, algorithm, optimal, define, will, strategy, may, general, budget, case, learning, every] [core, step, size, consumption, standard, capacity, argmin, respect] [computational, can, one, also, computation, suppose, following, comparison] [length, fixed, given, curve, particular, two] [state, internal, time, forward, backpropagation, operation, policy, backward, remember, usage, measured, dynamic, next, memorization, execution, value, simple, within, typically] [memory, hidden, rnn, single, sequence, figure, neural, network, recurrent, approach, using, input, used, previous, per, different, position, proposed, output, alex, storing, shown, store, deepmind, able, google, deep]
Learning and Forecasting Opinion Dynamics in Social Networks
Abir De, Isabel Valera, Niloy Ganguly, Sourangshu Bhattacharya, Manuel Gomez Rodriguez


[opinion, sentiment, average, social, forecasting, message, steady, number, eht, find, april, influence, simulate, bvu] [set, may, show, appendix, consider, theorem, algorithm, will, posted, depends, property] [stochastic, efficient, parameter, compute, latent, forecast, consensus, key, sampling] [can, user, accurate, first, linear, following, real, one, matrix, leverage] [given, estimation, data, conditional, two, based, differential, journal, denotes, multivariate, theoretical, twitter, converge] [model, time, hawkes, simulation, poisson, modeling, history, temporal, state, intensity, framework, predictive, evolution, information, process, markov, value, recorded] [using, figure, network, performance, proposed]
Professor Forcing: A New Algorithm for Training Recurrent Networks
Alex M. Lamb, Anirudh Goyal ALIAS PARTH GOYAL, Ying Zhang, Saizheng Zhang, Aaron C. Courville, Yoshua Bengio


[loop, number, either, quality, negative] [learning, set, algorithm] [sampling, likelihood, method, evaluation, sequential, processing, gradient, step, objective, stochastic] [can, also, one, running, observed, note, much] [distribution, data, sample, two, length, conditioning, given, selected] [forcing, professor, teacher, model, behavior, time, human, mode, information, van, raw] [training, sequence, neural, generative, network, rnn, recurrent, hidden, discriminator, arxiv, using, used, scheduled, generator, generated, input, output, preprint, validation, language, handwriting, use, generation, adversarial, better, prediction, generating, task, figure, open, classifier, trained, randomly, synthesis, train, architecture, layer, image, mnist]
Full-Capacity Unitary Recurrent Neural Networks
Scott Wisdom, Thomas Powers, John Hershey, Jonathan Le Roux, Les Atlas


[number, quality, average, represent] [set, learning, best, loss, rate, achieves, show, problem, consider, since, function, theorem, argument, differentiable] [unitary, urnn, gradient, urnns, parameterization, recurrence, evaluation, capacity, stiefel, stft, permuted, descent, processing, optimization, suggests, intelligibility] [matrix, dimension, can, lie, restricted, synthetic, solution, vector] [manifold, test, space, data, section, given, theoretical] [state, system, optimize, determine, model, baseline, time] [using, recurrent, neural, speech, use, lstm, hidden, training, performance, memory, network, figure, table, validation, task, lstms, consists, sequence, proposed, natural, prediction, perceptual, used, shown, mnist, output]
Learned Region Sparsity and Diversity Also Predicts Visual Attention
Zijun Wei, Hossein Adeli, Minh Hoai, Greg Zelinsky, Dimitris Samaras


[detection, number] [set, consider, return, svm, diversity, show, function, might] [method, divergence, evaluation] [auc, also, local, sparse, can, sparsity, significantly, localization, ranking, one, first, computational] [center, bias, selected, test, mean, people, distribution, gaussian, based, searching] [model, human, search, target, information, eye, new, failure, evidence] [visual, sdr, image, object, rrsvm, attention, region, priority, classification, multiple, dataset, map, poet, used, using, predict, different, figure, inhibition, score, training, fixation, mechanism, feature, performance, pascal, predicting, saliency, generated, prediction, spatial, learned, three, use, trained, pet, resized, vision, computer, work, voc, table, category, neural]
Interaction Networks for Learning about Objects, Relations and Physics
Peter Battaglia, Razvan Pascanu, Matthew Lai, Danilo Jimenez Rezende, Koray Kavukcuoglu


[interaction, string, relation, potential, whose, external, many, contained, represents, represent, represented, relational] [learning, show, learnable, constant] [mse, applies] [can, also, one, matrix, first] [two, generalization, test, well, data] [model, complex, time, across, abstract, gravitational, future, system, velocity, receiver, spring, simulation, baseline, reason, control, effect, state, another, exploit, ability] [physical, input, reasoning, object, neural, predict, network, prediction, used, learn, training, deep, trained, different, output, mlp, using, rigid, applied, three, engine, cnns, static, use, novel, ground, truth, hidden, scene]
Discriminative Gaifman Models
Mathias Niepert


[gaifman, knowledge, relational, base, neighborhood, tuple, number, relation, structure, formula, graph, tuples, probability, connected, logical, probabilistic, negative, whose, path, include] [learning, set, conference, every, query, problem, now, class, let, complexity, theorem, confidence, algorithm] [inference, machine, large, compute, free, size] [can, local, one, also, locality, completion] [positive, data, embeddings, embedding, given, locally, random, corresponding] [model, within, form, artificial, target, complex] [figure, learn, training, neural, perform, feature, object, generated, domain, representation, work, per, used, generate, language, table, learned, input]
Generative Adversarial Imitation Learning
Jonathan Ho, Stefano Ermon


[causal, interaction] [learning, cost, function, will, algorithm, problem, optimal, max, lemma, conference, convex, minimize, show, constant, defined, cloning, appendix, set] [large, dual, step, gradient, respect, primal, machine, optimization, efficient, expectation, due, objective] [can, proposition, linear, arg, regularizer, following, exactly, certain, min, optimum, running] [measure, entropy, data, given, true, section, sample, maximum, random] [policy, expert, occupancy, reinforcement, irl, imitation, apprenticeship, gail, inverse, behavioral, control, environment, directly, trpo, behavior, guided] [performance, using, neural, generative, adversarial, international, dataset, discriminator, approach, matching, work, used, learned, training]
Probabilistic Inference with Generating Functions for Poisson Latent Variable Models
Kevin Winner, Daniel R. Sheldon


[probability, variable, runtime, number, detection, degree, resulting, enough, graphical] [algorithm, eliminate, class, set, let, function, theorem, will, fast, show, known] [pgf, posterior, inference, parameter, line, pgfs, factor, compute, latent, nmax, likelihood, hmms, unnormalized, marginals, abundance, end, approximate, faster, countably, implementation, develop, standard, marginal, apply] [can, exact, tail, first, truncated, also, observed, see, proposition, related, one] [joint, population, infinite, polynomial, discrete, based, given, data, two, estimation, mean, finite, true, hmm, journal] [poisson, forward, model, time, elimination, form, value, series, simple, count] [generating, figure, using, perform, representation, approach, use, instead, previous]
Active Nearest-Neighbor Learning in Metric Spaces
Aryeh Kontorovich, Sivan Sabato, Ruth Urner


[number, probability, possible] [active, label, set, sin, let, theorem, learning, compression, passive, gottlieb, learner, algorithm, complexity, marmann, general, selectscale, guarantee, distmon, setting, kontorovich, defined, binary, lemma, uin, hnn, depends, since, competitive, show, define, generatennset, margin, provide, smallest, proof, requested] [size, end] [error, can, denote, following, analysis, first, also] [metric, sample, procedure, nearest, neighbor, given, selection, empirical, section, estimate, generalization, based, random, selected, bayes, consistent, two] [search, model, therefore, left, value, information, new] [scale, classification, output, using, labeled, different, prediction, similar, propose, classifier, approach, input, unlabeled]
Regularization With Stochastic Transformations and Perturbations for Deep Semi-Supervised Learning
Mehdi Sajjadi, Mehran Javanmardi, Tolga Tasdizen


[combination, contains, number] [loss, set, function, learning, randomized, achieve, rate, conference, example] [large, five, method, processing, size, minimizes] [can, error, sparse, also, one, first, see, vector] [data, sample, two, based, random, test, difference] [model, another, information] [labeled, training, unlabeled, use, using, unsupervised, convolutional, accuracy, neural, different, proposed, dropout, network, trained, dataset, prediction, computer, used, similar, train, perform, multiple, svhn, mnist, augmentation, norb, improve, randomly, international, available, layer, table, feature, convnets, deep, vision, ladder, single, supervised, classification, convnet, last, task, pattern, arxiv]
Generalization of ERM in Stochastic Convex Optimization: The Dimension Strikes Back
Vitaly Feldman


[probability, number, many, describe, computable] [convex, bound, lower, let, complexity, erm, every, function, set, will, setting, define, general, sco, exists, algorithm, known, defined, binary, dependence, even, theorem, now, problem, constant, show, lipschitz, bpd, upper, imply, asymptotically, learning, case, consider, class, max, risk, prove, mink, vitaly] [stochastic, convergence, optimization, smooth, term, standard, additional, efficiently, gradient, log] [can, stability, regularization, linear, also, one, necessary, note, analysis, following, denote, support, convexity] [sample, uniform, distribution, generalization, construction, given, maximum, based, empirical, radius] [therefore, ball, code, whether, time] [use, approach, work, unit, question, natural]
On Explore-Then-Commit strategies
Aurelien Garivier, Tor Lattimore, Emilie Kaufmann


[rule, possible, identification, number] [regret, strategy, etc, algorithm, optimal, lower, arm, bandit, theorem, known, bound, upper, stopping, unknown, minimax, show, asymptotically, best, let, ucb, lai, proof, kaufmann, uniformly, inf, set, choose, problem, learning, bounded, appendix, garivier, case, even, now, confidence, conference, prove, chooses, suboptimal, perchet, chosen, horizon, exploitation, setting, robbins, active] [sequential, log, end, factor, efficient, sampling] [can, also, arg, gap, relative, one, phase, analysis, following] [based, given, gaussian, two, lim, asymptotic, section, empirical, difference, uniform, mean] [exploration, action, time, simple, design] [using, fully, improve, presented]
Improved Deep Metric Learning with Multi-class N-pair Loss Objective
Kihyuk Sohn


[negative, number, mining, clustering, identification, many] [loss, learning, example, verification, class, since, online, function, set, observe, improvement] [batch, large, log, standard, efficient] [can, one, product, hard, also, formulation, following, comparison] [metric, embedding, data, positive, distance, construction, two, equation, well, selected, section, test] [model] [triplet, deep, recognition, face, performance, training, different, using, proposed, softmax, evaluate, accuracy, per, multiple, dataset, object, trained, network, train, output, image, table, use, contrastive, figure, similarity, feature, tri, visual, propose, neural, classification, input, composed, pair, better, vrf, randomly, nmi, instead, representation]
Greedy Feature Construction
Dino Oglic, Thomas Gärtner


[number, definition, subset, probability] [function, set, algorithm, learning, hypothesis, greedy, let, banach, considered, problem, defined, theorem, appendix, rate, convex, expected, bounded] [descent, gradient, step, method, convergence, incremental, constructive, fitting, machine, capacity, respect, optimization, large, smooth, walltime] [linear, can, error, also, regression, good, regularization, following, one] [space, data, functional, ridge, kernel, constructed, empirical, hilbert, squared, measure, sample, generalization, construct, construction, section, provided, smoothness, random, specified, carte, basis] [choice, target, model, information, hyperparameter, goal, time] [feature, approach, sequence, training, using, representation, proposed, residual, neural, single, compact, performance, presented]
Learning HMMs with Nonparametric Emissions via Spectral Decompositions of Continuous Matrices
Kirthevasan Kandasamy, Maruan Al-Shedivat, Eric P. Xing


[number, chebyshev, variable, theory, probability] [algorithm, will, bound, lemma, let, learning, theorem, case, setting, concentration] [method, hmms, latent, efficient, compute] [can, error, spectral, singular, first, via, matrix, one, denote, linear, analysis, recover, following, main, perturbation, rank] [nonparametric, density, discrete, joint, parametric, true, observable, conditional, estimation, hmm, estimated, estimating, kernel, emission, two, given, kde, qmatrix, distribution, sample, data, space, estimate, procedure] [continuous, state, time, markov, predictive, model] [hidden, used, using, training, use, representation, prediction, recent, sequence, compare]
Multivariate tests of association based on univariate tests
Ruth Heller, Yair Heller


[independence, resulting, third, aggregation, number, partition, level, fact] [let, will, hypothesis, may, show, set, since, consider, every, best, function] [bivariate, method, processing, log] [can, power, one, see, following, vector, order, computational, first, also, good] [test, univariate, center, multivariate, sample, distribution, point, consistent, based, null, statistic, random, two, measure, testing, fxy, minp, significance, result, data, permutation, journal, retton, distance, section, specific, useful, procedure, versus, annals, comparing, population, normal, powerful, joint] [value, choice, information, carry] [using, single, approach, figure, use, pooling, table, multiple, different, referred, novel, category, neural]
Generating Long-term Trajectories Using Deep Hierarchical Networks
Stephan Zheng, Yisong Yue, Jennifer Hobbs


[quality] [learning, study, class, may, set, case, general, every, function] [latent, large] [can, one, also, preference, via] [two, data, space] [hpn, policy, hierarchical, model, basketball, raw, rollouts, rollout, player, behavior, agent, state, action, towards, gti, modeling, planning, trajectory, court, reinforcement, move, issn, expert, depicts, ait, sit, macro, straight, velocity, generates, spatiotemporal] [attention, using, network, figure, gru, cnn, approach, ground, neural, weak, output, realistic, truth, training, use, work, transfer, deep, accuracy, instead, conv, used, trained, generated, predicted, recurrent, prediction, sequence, natural]
R-FCN: Object Detection via Region-based Fully Convolutional Networks
Jifeng Dai, Yi Li, Kaiming He, Jian Sun


[detection, mining, vote, bin, average] [set, loss, example, learnable, bank, microsoft, fast] [faster, method, proposal, standard, size] [following, can, one, computation, also, paper, hard] [test, data, result] [roi, box, information, time, evaluated, design] [convolutional, object, layer, score, fully, pooling, training, image, map, table, using, train, rpn, subnetwork, voc, region, use, figure, classification, pascal, deep, shared, spatial, feature, depth, per, accuracy, entire, trainval, bounding, last, semantic, neural, single, candidate, architecture, val, translation, evaluate, trous, scale, network, adopt, position, conv, used]
Large-Scale Price Optimization via Network Flow
Shinji Ito, Ryohei Fujimaki


[cut, programming, number, find, graph] [price, algorithm, problem, optimal, function, submodular, revenue, property, binary, supermodular, strategy, upper, set, define, cost, relaxation, might, best, lower, bound, even, satisfies] [optimization, method, approximate, objective, quadratic, compute, efficient, large, gradient, solved] [can, demand, computational, gross, profit, qpbo, regression, minimum, qpboi, solution, sdprelax, elasticity, following, product, exact, paper, substitute, ieee, satisfying, assumption, prescriptive, also, regarding, good, analysis, fij] [basis, data, construct] [value, time, subject, model] [proposed, using, network, flow, cross, pattern, table, complementary, use]
A Theoretically Grounded Application of Dropout in Recurrent Neural Networks
Yarin Gal, Zoubin Ghahramani


[many, sentiment, probabilistic, probability] [learning, will, appendix, rate] [variational, bayesian, variant, standard, inference, approximate, approximating, large, technique, naive, experiment, posterior, log, step] [can, matrix, error, one, following, analysis, existing, see, small, note] [distribution, test, embedding, word, given, embeddings, random] [model, time, decay, new, early, medium, next, identical, prior] [dropout, recurrent, neural, lstm, weight, rnn, different, using, use, perplexity, layer, language, used, gru, input, deep, single, applied, compared, network, validation, proposed, output, performance, moon, approach, sequence, modelling, rnns, overfitting, regularisation, penn, mask, randomly, figure, lstms, untied, seen]
Unsupervised Learning for Physical Interaction through Video Prediction
Chelsea Finn, Ian Goodfellow, Sergey Levine


[interaction] [learning, show, conference] [method, processing] [can, also, one, interactive] [test, two, transformation, well] [model, future, human, information, prior, robot, action, state, new, internal, next, robotic, time, previously, raw, directly, predictive] [prediction, video, motion, image, previous, object, pixel, figure, neural, predicted, lstm, cdna, predict, using, physical, different, multiple, dataset, convolutional, proposed, trained, vision, performance, predicting, conv, applied, predicts, used, without, use, frame, international, approach, computer, stp, learn, mask, including, unsupervised, training, spatial, produce, explicitly, million, appearance, feedforward, pushing, work, reconstruct, evaluate]
Convolutional Neural Fabrics
Shreyas Saxena, Jakob Verbeek


[number, connected, obtained, node, grows, many, sum, thus, contains] [learning, show, best, exponentially, since, obtain, consider] [large, size, factor] [can, sparse, one, also, error, first, connectivity, see, related, signal] [two, data, embedded] [across, response, state, model, process, along] [convolutional, input, network, fabric, using, channel, use, scale, neural, figure, dense, used, layer, image, per, pooling, part, semantic, table, training, last, cnn, architecture, output, deep, filter, trellis, coarser, single, classification, three, work, maxout, convolution, memory, segmentation, dataset, resolution, augmentation]
Multiple-Play Bandits in the Position-Based Model
Paul Lagrée, Claire Vernade, Olivier Cappe


[cascade, list, expression, number] [lower, bound, regret, algorithm, optimal, bandit, learning, arm, theorem, learner, pbm, will, asymptotically, inf, expected, round, upper, appendix, obtain, click, considered, problem, provide, examination, uniformly, assume, confidence, best, may, decreasing, chosen, close, suboptimal, general, every] [efficient, log, machine, step, parameter, stochastic, end] [can, one, user, item, first, also, order, analysis, following, proposition, observed] [section, given, corresponding, two, data, based, estimator, lim, independent, associated] [model, feedback, reward, time, information, search, action, simple] [position, using, multiple, figure, use, content, performance]
Sublinear Time Orthogonal Tensor Decomposition
Zhao Song, David Woodruff, Huan Zhang


[number, total, probability, variable] [algorithm, theorem, let, achieve, show, constant, fast, smaller, set, since, wang, even, lemma] [sampling, importance, method, inner, latent, approximate, faster, datasets, large, factor] [tensor, can, one, sublinear, power, symmetric, also, running, preprocessing, decomposition, norm, pprox, dimension, sketching, small, main, need, synthetic, orthogonal, spectral, analysis, much, order, denote, first, noise, still, suppose, contraction, asymmetric, product] [slice, random, robust, based, squared, two, dirichlet, estimate, sample, theoretical] [time, take, reading, median] [residual, work, using, without, use, previous, different, generating, input, table, instead]
Generating Videos with Scene Dynamics
Carl Vondrick, Hamed Pirsiavash, Antonio Torralba


[often] [learning, show, since, may, will] [large, usually, latent] [can, one, also, order, first, relative, believe] [two, random, data, interested, people, suggest] [model, temporal, future, action, except, human] [video, generative, network, use, frame, generation, unlabeled, adversarial, generated, using, unsupervised, generator, generate, train, labeled, figure, static, discriminator, background, representation, architecture, convolutional, scene, work, learn, stream, foreground, input, deep, recognition, prefer, vgan, image, motion, visual, performance, approach, realistic, training, instead, antonio, golf, layer, hidden, strided, learned, better, initialized, experimented, accuracy, object, outperforms, promising]
Differential Privacy without Sensitivity
Kentaro Minami, Hitomi Arai, Issei Sato, Hiroshi Nakagawa


[probability, adjacent, strong, proportional, definition, obtained, becomes, whose] [privacy, satisfies, loss, function, let, theorem, private, inequality, boundedness, lipschitz, convex, lsi, bound, randomized, upper, learning, constant, corollary, set, algorithm, general, proof, assume, satisfy, bounded, differentially, problem, ratio, defined, risk, example, arbitrary, said, consider, depends, define, concentration, property, tokyo, case] [log, posterior, parameter, size, dkl, twice, university, gradient, sampling, datasets, respect, monte, requires, langevin] [can, following, analysis, assumption, condition, suppose, certain, proposition, sufficient, related] [gibbs, exponential, differential, space, density, estimator, measure, two, distribution, difference, sensitivity, given, gaussian, statistical, exp, empirical, provides, shrinkage, classical] [prior] [mechanism, dataset]
SPALS: Fast Alternating Least Squares via Implicit Leverage Scores Sampling
Dehua Cheng, Richard Peng, Yan Liu, Ioakeim Perros


[amazon, review, many, number, probability] [algorithm, least, randomized, cost, optimal, will, theorem] [sampling, factor, approximation, efficient, importance, size, method, gradient, stochastic, kronecker, step, due, efficiently] [tensor, matrix, can, leverage, row, decomposition, krp, linear, alternating, rank, spals, error, product, via, computational, sparse, first, solving, also, related, low, main, regression, sublinear, one, spectral, sketching, singular, running, much, square] [statistical, data, estimation, two, sample, section, numerical, random, provides] [time, design, allows, form, directly, value, target] [score, input, work, recent, using, previous, similar]
Estimating Nonlinear Neural Response Functions using GP Priors and Kronecker Methods
Cristina Savin, Gasper Tkacik


[] [rate, learning, function, defined, example] [log, inference, kronecker, additional, posterior, approximate, sampling, due] [can, spectral, also, linear, stability, matrix, see, overall, sparse] [place, data, estimate, covariance, gaussian, mixture, mean, functional, estimator, space, kernel, multidimensional, way, estimation, familiar, distribution] [tuning, firing, complex, information, selectivity, animal, time, speed, model, within, direction, cell, process, modulation, exploration, environment, found, simple, nature, poisson, across, hippocampal, behavior, temporal, activity, prior, artificial, grid] [field, spatial, neural, traditional, input, network, using, use, position, used, motion, several, open, scale, figure, recent, representation]
Learning What and Where to Draw
Scott E. Reed, Zeynep Akata, Santosh Mohan, Samuel Tenka, Bernt Schiele, Honglak Lee


[] [set, learning, show, case, gray, class, provide] [black, draw, variational] [can, global, also, local, vector, noise, small, tensor, addition] [conditioning, embedding, conditional] [box, model, location, white, human, control, long, blue, form] [bird, image, text, keypoints, part, spatial, keypoint, bounding, generative, generate, figure, beak, deep, using, red, network, conditioned, recurrent, adversarial, gawwn, pointy, use, depth, convolutional, yellow, discriminator, neural, pose, object, caption, generator, generating, face, shown, trained, pathway, realistic, training, concat, man, visual, gan, fed, synthesis, position, several, final, feature, generation, generated, cub, orange, encoder, approach, golf]
A Multi-step Inertial Forward-Backward Splitting Method for Non-convex Optimization
Jingwei Liang, Jalal Fadili, Gabriel Peyré


[identification, partial, definition] [mifb, function, example, algorithm, theorem, let, scheme, bnd, optimal, problem, rate, near, property, kxk, proper, set, since, satisfies, loss, bound, case, choose, consider, class, bounded, show, will] [convergence, smooth, optimization, siam, gradient, method, faster, proximal, splitting, objective, iterates, supplementary] [inertial, def, can, local, linear, also, rank, following, partly, relative, thm, see, condition, global, critical, sparse, submanifold, ifb, analysis, lsc, noted, min, proposition, comparison] [journal, finite, locally, point, given, empirical, smoothness, section, numerical, mathematical, riemannian, space] [choice, continuous] [figure, sequence, generated]
Depth from a Single Image by Harmonizing Overcomplete Local Network Predictions
Ayan Chakrabarti, Jingyu Shao, Greg Shakhnarovich


[find, number, coefficient] [set, confidence, since, learning] [derivative, standard, log, method, inference, efficient, large, zeroth, factor, respect] [local, can, also, order, global, relative, second, qij, error, first, vector, note, high, related] [estimation, corresponding, estimate, mixture, mean, geometry, useful, geometric] [form, location, value, characterize, individual, along, information] [depth, network, map, scene, different, image, neural, single, training, using, use, approach, output, globalization, monocular, accuracy, surface, train, used, overcomplete, trained, figure, able, various, performance, color, convolutional, multiple, produce, predicted, instead, validation, predicting, task, nyu, input, better]
Efficient High-Order Interaction-Aware Feature Selection Based on Conditional Mutual Information
Alexander Shishkin, Anastasia Bezzubtseva, Alexey Drutsa, Ilia Shishkov, Ekaterina Gladkikh, Gleb Gusev, Pavel Serdyukov


[cmicot, team, number, sfs, scoring, interaction, cmi, subset, among, variable, whose, iwfs, account, rcdfs, cmim, identify] [binary, set, greedy, since, complexity, optimal, problem, function, strategy, appendix, algorithm, learning, best, will, let, class] [method, datasets, technique, select, approximation, opposing, large, optimization, size, step, end] [one, following, can, computational, dimension, also, popular, note, existing, order] [selection, based, mutual, selected, two, joint, given, conditional, sample, equal, distribution] [information, search, target, built] [feature, score, approach, complementary, novel, performance, candidate, used, able, already, classification, using, different, several, higher, neural]
Training and Evaluating Multimodal Word Embeddings with Large-scale Web Annotated Images
Junhua Mao, Jiajing Xu, Kevin Jing, Alan L. Yuille


[negative, collected, number, base] [learning, strategy, click, loss, query, will, set] [evaluation, datasets, large, standard, pure] [related, user, one, can, sampled, also] [word, embedding, embeddings, positive, data, sample, based, given, section] [model, information, search, state] [dataset, visual, image, phrase, text, training, multimodal, million, neural, gold, rnn, trained, similar, figure, similarity, using, weight, softmax, semantically, proposed, layer, final, language, learned, hair, recurrent, use, previous, sentence, sharing, adopt, effective, relatedness, score, pinterest, shown, billion, work, gru, annotated, table, several, compared, learn, scale, evaluate, semantic, coco]
SDP Relaxation with Randomized Rounding for Energy Disaggregation
Kiarash Shaloudegi, András György, Csaba Szepesvari, Wilsun Xu


[number, integer, programming, total, find, normalized] [problem, algorithm, set, randomized, relaxation, convex, function, best, binary, minimize, learning] [method, energy, inference, approximate, quadratic, large, consumption, optimization, fhmms, objective, variational, end, efficient, factorial, solve, fhmm] [power, kolter, can, appliance, disaggregation, solution, admm, sdp, error, rounding, jaakkola, load, also, zgs, semidefinite, matrix, sparse, need, ieee, signal, smart, vector, min, home, monitoring, via, running, arg, synthetic, park, following] [data, denotes, based, given, section, length, hmm, point, additive] [time, model, subject, state, usage, change, information, new, form, individual] [use, using, used, performance, neural, work, better]
Orthogonal Random Features
Felix X. Yu, Ananda Theertha Suresh, Krzysztof M. Choromanski, Daniel N. Holtmann-Rice, Sanjiv Kumar


[structure, number] [ratio, show, fast, let, lower, provide, function, since, almost, lemma, set, theorem, cost, achieve, close] [variance, approximation, mse, method, large, approximate, log, var, unbiased, distributed] [orthogonal, can, matrix, linear, also, first, error, following, note, computation, diagonal, sampled, comparison, small, orthogonality, order, hence] [random, kernel, orf, sorf, rff, gaussian, transformation, fourier, circulant, based, fastfood, bias, provides, theoretical, nonlinear, mean, section, distance, space, empirical, distribution, estimation, empirically, korf, given, krff, estimator, fixed] [time] [structured, figure, feature, use, proposed, memory, classification, mnist, using, used, different, similar, compared, cifar, table]
Unsupervised Learning of Spoken Language with Visual Context
David Harwath, Antonio Torralba, James Glass


[branch, many, association, discovery] [learning, set, max, function] [machine, timit, processing] [computational, also, analysis, highly, can, one, approximately, error, dot, product, dimension, first, vector] [embedding, word, data, mean, embeddings, well, given] [model, search, across, information, time, take, along, form, within, modeling, framework] [image, audio, caption, network, speech, spoken, similarity, used, training, using, neural, language, recognition, visual, text, annotation, score, use, acoustic, spectrogram, learned, work, different, google, figure, object, ground, truth, layer, learn, perform, multimodal, deep, semantic, able, dataset, train, unsupervised, table, shown, vgg, similar, convolutional, recognize, recent]
Pruning Random Forests for Prediction on a Budget
Feng Nan, Joseph Wang, Venkatesh Saligrama


[pruning, node, tree, rune, udget, number, ensemble, reedy, average, integer, pruned, leaf, constraint, path, ccp, program, iser, account, tradeoff, acquired, subset, prune, total, forest, majority, rule, indicates, impurity] [cost, example, set, learning, algorithm, budget, problem, optimal, expected, obtain, conference, consider, focus, label] [large, datasets, solve, dual, optimization, competing, due, constrained, method] [can, error, one, first, solving, also, high, matrix, low] [test, based, random, associated, corresponding, entropy, given] [acquisition, decision, usage, resource, time] [feature, used, accuracy, prediction, use, using, approach, classification, training, figure, propose, network, shown, international, outperforms, work]
Stochastic Online AUC Maximization
Yiming Ying, Longyin Wen, Siwei Lyu


[probability, pairwise] [online, learning, solam, algorithm, sup, problem, maximization, function, loss, spp, let, theorem, convex, lemma, complexity, max, optimal, rate, roc, equivalent, surrogate, proof, defined, hinge, minimization, set, inst, assume, drawn, achieves] [stochastic, convergence, objective, gradient, optimization, datasets, method, saddle, update, step, descent, iteration, feat, approximation] [auc, can, min, formulation, following, main, linear, running, one, denote, solution, existing, need, analysis] [space, data, given, point, estimator, positive, true, sample, two, covariance, curve] [time, area, information, ability] [training, used, use, table, performance, proposed, classification, using, work, store, previous, several]
Adaptive Averaging in Accelerated Descent Dynamics
Walid Krichene, Alexandre Bayen, Peter L. Bartlett


[variable, theory, thus, ezi, many, average, weighted, unique] [function, convex, adaptive, rate, theorem, algorithm, set, give, feasible, defined, whenever, consider, will, case, show, studied, since, example, lipschitz, problem, adaptively, learning, corollary, study, guarantee, existence] [accelerated, averaging, lyapunov, ode, convergence, energy, mirror, primal, descent, restarting, method, gradient, dual, replicator, strongly, quadratic, discretization, optimization, supplementary, derivative, restart, uniqueness] [can, solution, condition, second, following, note, suppose, one, also] [given, discrete, section, positive, equation, mathematical, two] [heuristic, trajectory, simple, speed, taking, time, corresponds, significant, form] [using, original, propose, weight, proposed, figure, used, different]
Preference Completion from Partial Rankings
Suriya Gunasekar, Oluwasanmi O. Koyejo, Joydeep Ghosh


[total, partial, monotonic, ordering, subset, entity, definition, dag, directed, often, number] [algorithm, let, set, learning, arbitrary, convex, case, complexity, monotone, problem, margin, index, may] [standard, log, proximal, operator, efficient, parameter, gradient] [matrix, preference, completion, ranking, rank, can, low, collaborative, observed, isotonic, norm, following, listwise, vector, dbl, nuclear, order, onto, error, projection, recommender, letor, regression, retargeted, exact, singular, retargeting] [estimator, estimate, data, numerical, estimation, denotes, underlying, generalization] [within, brain, cognitive, value, simple, information, voxels, representing, evaluated] [using, proposed, performance, jointly, task, used, score, dataset, table, propose, affinity, approach, single]
Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation
Tejas D. Kulkarni, Karthik Narasimhan, Ardavan Saeedi, Josh Tenenbaum


[knowledge, number] [learning, function, set, conference, chooses, algorithm, rate, consider, expected] [key, stochastic, end, processing, efficient] [can, order, also, success, first, sparse] [space, critic, provides] [goal, reward, agent, reinforcement, controller, state, intrinsic, exploration, policy, hierarchical, value, temporal, model, extrinsic, learns, framework, time, game, delayed, action, intrinsically, information, decision, motivated, process, terminal, future, receives, atari, environment, autonomous, artificial, door, motivation, new, reaching, current, episode, internal, transition, left] [deep, figure, learn, using, neural, approach, different, training, use, international, representation, preprint, arxiv, proposed, architecture]
Multistage Campaigning in Social Networks
Mehrdad Farajtabar, Xiaojing Ye, Sahar Harati, Le Song, Hongyuan Zha


[social, intervention, number, average, programming, find, relation, influence, level, total] [function, optimal, problem, algorithm, maximization, online, appendix, cost, rate, let, constant, best, budget, will] [objective, stage, optimization, due, size] [can, user, one, real, minimum, first, overall, linear, synthetic, still, suppose, following] [point, given, data, random] [exposure, intensity, time, process, campaigning, control, cll, opl, rnd, exogenous, dynamic, activity, event, hawkes, policy, campaign, temporal, form, shaping, history, information, state, future, framework, desired, modeling, wei, world, next, grd, wfl, prk, target, steer, current] [network, performance, prediction, accuracy, different, used, previous, outperforms]
Double Thompson Sampling for Dueling Bandits
Huasen Wu, Xin Liu


[number, normalized, thus, according, present, probability] [dueling, regret, arm, copeland, algorithm, condorcet, bound, learning, bandit, thompson, conference, general, lower, lemma, pij, achieves, pji, show, refine, rucb, may, bij, ccb, mab, will, winner, appendix, bounded, confidence, optimal, expected, suboptimal, rcs, achieve, study, let, consider] [log, sampling, posterior, machine] [can, first, preference, one, also, analysis, second, comparison, much, significantly, following, existing, proposition, user] [two, theoretical, practical, comparing, based, reduces, robust, selected] [information] [candidate, using, traditional, international, compared, propose, double, shown, compare, back, work, pair, score, similar]
Even Faster SVD Decomposition Yet Without Agonizing Pain
Zeyuan Allen-Zhu, Yuanzhi Li


[block, find, rayleigh] [algorithm, theorem, obtain, let, best, even, corollary, dependence, depends, since, guarantee, provide, call, make] [stochastic, accelerated, method, faster, convergence, approximation, sampling, full, lanczos, compute, respect, due, approximate, apply] [running, matrix, krylov, can, singular, one, lazysvd, norm, musco, spectral, paper, also, first, svd, frobenius, satisfying, gap, yes, knd, relative, nnz, denote, column, rank, appxpca, symmetric, following, main, power, zeyuan, orthonormal, alternating, stated] [needed, result, given, type, two] [time, found, state, framework, value] [using, table, use, arxiv, used, four, output, performance, open, multiplicative, question]
Estimating the class prior and posterior from noisy positives and unlabeled data
Shantanu J. Jain, Martha White, Predrag Radivojac


[negative, probability, added, theory, labeling, identifiable] [class, algorithm, learning, set, let, label, setting, general, defined, function, lemma, theorem, binary, follows, will] [posterior, size] [noise, can, also, noisy, one, first, related, component] [data, positive, estimation, sample, mixture, true, proportion, nonparametric, parametric, univariate, distribution, practical, identifiability, estimating, mixing, estimate, density, transformation, random, alphamax, space, gaussian, msgmm, dimensionality, section, robust, statistical, equation, two, maximum] [prior, form, model] [unlabeled, labeled, classification, using, used, approach, input, use, classifier, learn, performance, different, explicitly, pair, generate]
Blind Regression: Nonparametric Regression for Latent Variable Models via Collaborative Filtering
Dogyoon Song, Christina E. Lee, Yihua Li, Devavrat Shah


[probability, according, variable, whose] [algorithm, function, provide, bounded, lipschitz, bound, set, lemma, follows, unknown, exists, theorem, assume] [latent, variance, method, variant, parameter, due, around, approximation] [collaborative, matrix, error, filtering, row, can, observed, regression, rating, analysis, local, user, movie, completion, also, column, factorization, dimension, order, noise, netflix, fraction, movielens, first, recommendation] [sample, kernel, neighbor, empirical, classical, mean, estimate, nearest, nonparametric, given, based, true, metric, difference, two, taylor, squared, exp, gaussian, commonly] [framework, information, long, model] [using, similar, similarity, use, predict, input, used, work]
Beyond Exchangeability: The Chinese Voting Process
Moontae Lee, Seok Hyun Jin, David Mimno


[quality, voting, cvp, negative, helpfulness, vote, display, chinese, number, amazon, trendiness, average, social, crp, positional, polarity, review, probability, many, community, popularity, presentational, stackexchange, conformity, urn, stackoverflow, sentiment, opinion] [conference, function, ratio, since, online, every, learning, show, even, study] [log, parameter, due] [can, rank, user, first, phase, item, also, qij, one, existing, order, relative, note, restaurant] [two, based, positive, selection, given, length, estimated, bias] [response, time, model, process, intrinsic, new, information, whereas, behavioral, predictive, early, trajectory] [different, table, figure, better, using, single, instead, previous]
Stochastic Variance Reduction Methods for Saddle-Point Problems
Balamurugan Palaniappan, Francis Bach


[number, thus, probability, split, proportional, sum, many] [convex, algorithm, consider, may, monotone, appendix, complexity, problem, function, defined, get, always, assume, minimization, show, loss, learning, rate, fast, constant, theorem, every] [stochastic, operator, saga, proximal, svrg, convergence, accelerated, batch, bilinear, gradient, method, variance, separable, optimization, machine, sampling, factored, efficient, siam, reduction, iterate, iteration, variational, saddle, acceleration] [need, following, can, leading, matrix, vector, note, analysis, see, existing, extension, regularizer, linear, solution] [section, journal, two, associated, given] [individual, simple, value] [using, use, several, table, shown]
Towards Unifying Hamiltonian Monte Carlo and Slice Sampling
Yizhe Zhang, Xiangyu Wang, Changyou Chen, Ricardo Henao, Kai Fan, Lawrence Carin


[monomial, variable, total, becomes] [function, drawn, show, uniformly, algorithm, may, adaptive, consider, case] [sampling, monte, momentum, standard, carlo, parameter, large, energy, method, convergence, initial, auxiliary] [can, one, solving, also, generalized, analysis, via, symmetric, first] [slice, hamiltonian, hmc, distribution, theoretical, numerical, sampler, analytic, sample, point, conditional, density, mixing, transformation, space, provided, practical, gamma, two, section, legendre, exponential, family, gaussian, random, acceptance, chain, journal, described, equation, resampling, uniform, independent, given] [target, system, kinetic, time, trajectory, corresponds, integration, value, along, autocorrelation, form, new, defines] [performance, using, original, figure, shown, approach, table]
Parameter Learning for Log-supermodular Distributions
Tatiana Shpakova, Francis Bach


[probabilistic, base, probability, thus, gumbel, review] [function, submodular, bound, consider, learning, convex, binary, max, may, show, lower, minimization, polytope, upper, std, set, since, known, equivalent, algorithm, subgradient, modular, problem] [logistic, log, parameter, alogistic, stochastic, likelihood, approximation, optimization, inference, approximate, expectation, efficient, gradient, variational] [can, min, noise, noisy, one, missing, also, note, following, linear] [maximum, section, two, based, random, conditional, given, data, journal, equal, distribution, sample, independent, well, discrete, estimation] [model] [image, use, using, used, learn, figure, perform, better, supervised, table, unsupervised, several, approach]
The Parallel Knowledge Gradient Method for Batch Bayesian Optimization
Jian Wu, Peter Frazier


[knowledge, number, report] [function, algorithm, learning, set, regret, ucb, expected, deviation, show, improvement, maximizing, choose, will] [parallel, batch, optimization, gradient, machine, bayesian, method, size, immediate, qkg, posterior, standard, logistic, sampling, hyperparameters, especially, compute, efficient, develop, proposes, evaluation, processing, initial] [can, one, synthetic, noisy, min, regression, error, global, also, solution] [test, gaussian, section, mean, point, distribution, practical, two, sample, independent, finite, provides, testing, journal] [acquisition, tuning, process, optimize, next, prior, information] [evaluate, scale, domain, neural, using, better, figure, several, training, previous, performance, different]
Stochastic Optimization for Large-scale Optimal Transport
Aude Genevay, Marco Cuturi, Gabriel Peyré, Francis Bach


[probability, thus] [problem, optimal, algorithm, defined, set, function, maximization, known, arbitrary, consider, since, max, show, define, rate] [stochastic, dual, gradient, convergence, sgd, optimization, sag, processing, sinkhorn, compute, solve, iterates, incremental, method, approximation, proxy, supplementary, faster, expectation, large, iteration, primal] [can, solution, regularized, computational, note, plot, linear, solving, regularization, one, norm, denote, proposition, vector] [discrete, transport, kernel, two, section, finite, metric, sample, distance, wasserstein, word, density, empirical, converge, random, space] [continuous, averaged, information, corresponds, another] [using, figure, different, compare, propose, used, neural, use, shown, approach, three]
Learning Kernels with Random Features
Aman Sinha, John C. Duchi


[consistency, base, probability] [problem, learning, randomized, provide, consider, set, let, risk, convex, conference, misclassification, concentration, define, function, lemma, show, best] [optimization, method, machine, respect, iid, sampling, efficient, solve, employ, standard, requires, approximate, large, processing, speedup] [error, can, linear, regression, computational, solution, matrix, one, following] [kernel, random, generalization, data, space, procedure, empirical, joint, gaussian, distribution, estimator, section, rahimi, reuters, test, selection, well, denotes, given, sample, ridge] [optimized, time, model, benchmark, simple, information] [feature, performance, approach, training, use, alignment, figure, using, supervised, learn, original, dataset, neural, international, input]
CliqueCNN: Deep Unsupervised Exemplar Learning
Miguel A. Bautista, Artsiom Sanakoyeu, Ekaterina Tikhoncheva, Bjorn Ommer


[resulting, number, clique, negative, transitivity, obtained, average, many, thus] [learning, problem, set, now, since, show, query, label, wang, cost] [batch, large, method, optimization, sgd, compute, due, initial] [can, mutually, one, analysis, also, matrix, unreliable] [nearest, positive, data, estimation, sample, based] [model, information] [exemplar, training, similarity, approach, cnn, using, pose, different, similar, unsupervised, visual, object, compact, single, performance, proposed, image, posture, olympic, deep, representation, supervised, neural, classification, learned, used, pascal, feature, cnns, use, evaluate, hog, figure, train, compare, part, alexnet, compared, trained, dataset, voc, convolutional]
Adaptive optimal training of animal behavior
Ji Hyun Bak, Jung Choi, Ilana Witten, Athena Akrami, Jonathan W. Pillow


[rule, theory] [learning, optimal, set, expected, may, algorithm, will, adaptive, function, dependence, rate, learner, defined, toward, let] [log, method, full, gradient, likelihood, posterior, parameter, supplementary, hyperparameters, suggests] [can, vector, order, first, success, matrix] [estimate, true, space, fixed, two, given, based, random, bias, estimating, discrimination] [model, stimulus, behavior, history, trial, animal, rat, evidence, reward, prior, choice, desired, simulated, internal, effect, alignmax, policy, change, time, current, variability, reinforcement, psychometric, difficult, hyperparameter, drive, experimental, value, series, governing, correct, behavioral, infer, new, early] [training, weight, using, figure, shown, used, use, task, neural, accurately, map]
Improved Dropout for Shallow and Deep Learning
Zhe Li, Boqing Gong, Tianbao Yang


[number, definition, connected, evolving, covariate, according, present] [learning, risk, bound, smaller, let, theorem, achieves, dependent, set, upper, bernoulli, minimize, minimization, make, obtain, defined] [sampling, standard, batch, stochastic, optimization, convergence, normalization, faster, gradient, processing, logistic, usually] [can, also, error, noise, following, analysis, proposition, denote, second, order, note, one, first, vector] [data, multinomial, distribution, testing, given, theoretical, based, uniform, empirical] [internal, information, issue, experimental] [dropout, deep, evolutional, neural, training, using, different, shallow, proposed, feature, use, layer, performance, similar, network, iters, figure, propose, three, used, convolutional, compare, geoffrey, four, alex]
Using Fast Weights to Attend to the Recent Past
Jimmy Ba, Geoffrey E. Hinton, Volodymyr Mnih, Joel Z. Leibo, Catalin Ionescu


[rule, expression, many] [fast, learning, will, set, rate, conference] [processing, capacity, machine, normalization, term] [can, vector, one, much, product, first, need, also, computational, retrieval] [two] [model, information, state, slow, current, activity, transition, synaptic, past, time, decay, long, agent, scalar, allows, game, brain, new, process] [hidden, memory, neural, associative, attention, different, visual, recurrent, using, input, lstm, used, use, store, recent, layer, network, sequence, glimpse, task, rnn, weight, stored, learn, figure, table, storage, single, classification, temporary, convnet, performance, facial, previous, without, image, mnist, object, integrate, training, presented, shown, paddle, recognition]
Learning Parametric Sparse Models for Image Super-Resolution
Yongbo Li, Weisheng Dong, Xuemei Xie, Guangming Shi, Xin Li, Donglai Xu


[cluster, average, obtained, denoted] [learning, function, set, algorithm, conference, problem] [method, argmin, factor, solved, processing, conventional, large] [sparse, can, via, ieee, desirable, dictionary, solving, pca, recovered, also, first, matrix, sparsity, denote, regression, reconstructed, recover, anchored] [parametric, gaussian, estimated, denotes, estimate, basis, based, test, mean, selection] [prior, scaling, coding, followed, experimental] [image, learned, similar, proposed, mapping, patch, learn, used, bicubic, using, feature, ncsr, generated, computer, training, srcnn, input, performance, propose, natural, shown, psnr, international, original, extracted, downsampling, neural, novel, scsr, degradation, table, blur]
CNNpack: Packing Convolutional Neural Networks in the Frequency Domain
Yunhe Wang, Chang Xu, Shan You, Dacheng Tao, Chao Xu


[cluster, number, larger, thus, coefficient, obtained, pruning] [compression, ratio, since, scheme, set, algorithm, will, learning, smaller, complexity] [large, method, size, although] [frequency, can, compressed, computational, matrix, also, small, sparse, note] [data, two] [operation, model] [convolutional, dct, proposed, domain, neural, filter, accuracy, deep, layer, cnns, approach, network, compressing, feature, alexnet, used, original, similar, image, using, huffman, cnn, mobile, memory, residual, cnnpack, convolution, relatively, performance, figure, impact, table, storage, arxiv, spatial, preprint, cosine, use, rdi, effective, net, weight, storing, object, applying]
Poisson-Gamma dynamical systems
Aaron Schein, Hanna Wallach, Mingyuan Zhou


[structure, negative, definition, report, subset, many] [set, conference, obtain, define, algorithm, provide] [latent, inference, sampling, bayesian, factor, auxiliary, parameter, five, draw, pass, expressive] [can, component, matrix, also, via, observed, error, one, linear, following, note, analysis] [data, distribution, gamma, two, conditional, well, nonparametric, gaussian, introduce, given] [pgds, count, model, transition, poisson, time, lds, information, therefore, dynamical, binomial, burstiness, sequentially, inferred, marginalize, process, gdelt, new, backward, alternative, prior, lkk, system, sotu, linked, marginalizing] [used, figure, top, using, feature, three, international, performance, neural]
A Unified Approach for Learning the Parameters of Sum-Product Networks
Han Zhao, Pascal Poupart, Geoffrey J. Gordon


[spn, cccp, spns, induced, wij, sum, structure, monomial, complete, node, learnspn, decomposable, sma, number, tree, pgd, signomial, unique, fvj, program, fact, edge, often] [learning, function, convex, show, set, problem, algorithm, concave, will, let, since, equivalent, obtain, surrogate] [objective, log, optimization, convergence, parameter, update, bayesian, gradient, sequential, computed, although] [can, also, product, unified, vector, note, first, despite] [two, mixture, polynomial, data, based, univariate, random, difference, mle, procedure, distribution, maximum, expressed] [model, form, corresponds] [network, use, multiplicative, four, using, input, different, better, used]
Kernel Bayesian Inference with Posterior Regularization
Yang Song, Jun Zhu, Yong Ren


[probability, rule, consistency, whose, variable, relation] [theorem, let, learning, problem, set, assume, show, will, since, function] [posterior, bayesian, inference, likelihood, expectation, machine] [regularization, can, following, thresholding, regression, linear, note, arg, regularized, filtering, first, formulation, via, product, also] [kernel, embeddings, rkhs, distribution, embedding, data, space, estimator, kregbayes, sample, cxx, arthur, cxy, hilbert, random, conditional, covariance, theoretical, reproducing, estimate, joint, based, given, optimizational, nonlinear, journal, consistent, two, pkbr, squared, finite] [framework, new, kalman, observation, markov] [training, used, propose, use, hidden, feature, filter, different, camera, compared, alex, international, position, china]
Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity
Amit Daniely, Roy Frostig, Yoram Singer


[node, connected, normalized, obtained, denoted, number, whose, probability, theory, incoming, induced, often] [learning, let, function, loss, show, example, every, bounded, set, theorem, assume, conference, convex, complexity] [processing, dual, machine, supplementary, log, size] [can, computation, initialization, norm, denote, note, also, analysis, ieee, following] [kernel, skeleton, random, space, corresponding, empirical, hilbert, well, polynomial, distribution, family, gaussian, theoretical, refer, reproducing, sample] [information, simple, realization, next, extend] [neural, activation, network, output, input, convolutional, layer, deep, representation, fully, used, single, training, relu, use, figure, work, supervised, sequence]
Regret Bounds for Non-decomposable Metrics with Missing Labels
Nagarajan Natarajan, Prateek Jain


[] [] [] [] [] [] []
Accelerating Stochastic Composition Optimization
Mengdi Wang, Ji Liu, Ethan Fang


[lambda] [problem, algorithm, composition, rate, function, convex, learning, optimal, case, minimization, wang, special, scgd, consider, achieves, best, complexity, general, known, theorem, show, example, choose, let, expected, necessarily, fast, online] [stochastic, convergence, gradient, objective, strongly, proximal, optimization, expectation, experiment, accelerated, inner, method, siam, machine] [linear, can, first, solution, denote, one, analysis, note, see, min, assumption, solving, error, main, regularization] [two, random, result, compositional, application, empirical, journal, denotes, given, theoretical, equation, important, section, sample] [state, reinforcement, new, bellman, policy, averaged, value, transition, reward] [proposed, generate, preprint, arxiv, using, figure, use]
Proximal Deep Structured Models
Shenlong Wang, Sanja Fidler, Raquel Urtasun


[unary, configuration, subset] [learning, function, loss, set, algorithm, problem, show, convex, considered, special, rate, exist, minimize] [proximal, inference, operator, energy, method, gradient, compute, dual, primal, pass, iteration, refinement, flownet, step, reader, size, mrfs] [can, first, note, min, arg, second, sparse] [random, family, type, refer, data, shrinkage, gaussian, given, discrete] [model, continuous, complex, forward] [deep, structured, image, depth, output, used, optical, neural, use, flow, performance, recurrent, using, convolution, approach, proposed, shown, dataset, denoising, figure, convolutional, training, learn, different, train, activation, natural, network, whole, recent, table]
Supervised Learning with Tensor Networks
Edwin Stoudenmire, David J. Schwab


[site, number, diagram, structure, thus, represent] [learning, algorithm, set, function, label, index, will, cost, optimal, consider, adaptively, implies] [large, gradient, approximation, machine, step, optimization, additional] [tensor, can, bond, matrix, one, product, vector, local, mapped, decomposition, also, svd, renormalization, error, dimension, first, small, order, singular, good, interesting, group, orthogonal] [data, test, space, basis, density, quantum, type, dimensional, two, kernel] [form, decision, model, next, optimizing, right, value, optimize, new] [feature, input, training, map, network, weight, using, use, shown, figure, classification, neural, mnist, similar, approach, different, image, used, illustrated, grayscale, pixel]
Total Variation Classes Beyond 1d: Minimax Rates, and the Limitations of Linear Smoothers
Veeranjaneyulu Sadhanala, Yu-Xiang Wang, Ryan J. Tibshirani


[total, theory, graph, sharp, many, often] [minimax, smoothing, rate, optimal, bound, function, lower, risk, theorem, will, class, slope, defined, bounded, learning, upper, even, known, constant, max, study, smaller, eigenmaps, now, conference, trivial, problem, lemma, setting] [log, mse, parameter, siam] [linear, also, can, see, interesting, gap, first, wavelet, via, signal, remark, though, denote] [laplacian, sobolev, estimator, mean, canonical, estimation, discrete, journal, section, nonlinear, nonparametric, given, result, annals, provides, trend, estimating, two, inscribed, ryan, space] [ball, scaling, grid] [denoising, variation, like, figure, different, image, international, using, shown, denoiser, neural]
Communication-Optimal Distributed Clustering
Jiecao Chen, He Sun, David Woodruff, Qin Zhang


[clustering, graph, blackboard, message, passing, sparsifier, coordinator, edge, total, site, among, number, notice, centralized, david, present, partition, constructing, correctly, sends, normalized, sparsifiers, qin, undirected, probability, cluster, vertex] [algorithm, cost, theorem, let, constant, show, set, lower, every, will, since, study, define, optimal, obtain, defined, assume, bound, function, lemma, setting] [distributed, sampling, computing] [spectral, can, matrix, one, also, following, diagonal, leverage] [two, chain, data, based, constructed, laplacian, geometric, length, space, construct] [model, communication, value, broadcast, information, write] [using, different, input, used, figure, weight, similar]
Exponential Family Embeddings
Maja Rudolph, Francisco Ruiz, Stephan Mandt, David Blei


[negative, contains, find, structure] [context, function, set, example, study, will, define, learning] [objective, stochastic, latent, sampling, gradient, factor, parameter, fit, inner, term, size] [can, one, link, also, movie, nonnegative, matrix, analysis, vector, linear, product, item, see, pca] [data, embedding, word, embeddings, conditional, exponential, point, shopping, two, family, gaussian, test, distribution, basket, mean, market, given, section, supplement, equation, surrounding, cbow, purchased, purchase, based, hpf] [poisson, model, activity, neuron, time, across, count, form, information] [neural, use, similar, different, table, language, natural, using, three, similarity, used]
Data Programming: Creating Large Training Sets, Quickly
Alexander J. Ratner, Christopher M. De Sa, Sen Wu, Daniel Selsam, Christopher Ré


[labeling, programming, dependency, knowledge, relation, extraction, describe, distant, crowdsourcing, base, many, number, conflict, filling] [learning, set, function, label, provide, loss, may, show, expected, will, conference, problem, risk, example, achieve, general, class, improvement] [large, machine, parameter, gradient, stochastic, processing] [can, also, user, one, first, noisy, min, error] [data, given, independent, two, distribution, section, point, empirical] [model, information, scaling, framework, new] [training, using, labeled, generative, approach, used, score, supervision, unlabeled, use, supervised, lstm, feature, neural, without, paradigm, input, dataset, accuracy, text, discriminative]
Regret of Queueing Bandits
Subhashini Krishnasamy, Rajat Sen, Ramesh Johari, Sanjay Shakkottai


[number, probability, cycle, thus] [regret, queueing, bound, queue, algorithm, bandit, optimal, late, service, lower, problem, regenerative, theorem, cumulative, show, arm, loaded, heavily, mab, rate, scheduling, corollary, thompson, learning, let, consider, obtain, since, upper, job, stabilize, expected, depends, instance, schedule, allocation] [log, stage, server, stochastic, standard, sampling, beginning, university] [can, analysis, one, first, phase] [given, length, two, asymptotic, regime, mean, corresponding, independent, distribution, difference] [time, early, system, exploration, policy, behavior, transition, reward, scaling, arrival, past, across] [performance, structured, different, use, like, figure, single]
Improving PAC Exploration Using the Median Of Means
Jason Pazis, Ronald E. Parr, Jonathan P. How


[number, probability, possible, average, many, definition, fact] [pac, algorithm, function, will, complexity, learning, best, qmax, bounded, set, theorem, least, every, even, let, prove, cost, bound, conference, defined, inequality, known, lower, diameter, optimal, exist, max, regret, optimistic, show] [variance, operator, factor, end, expectation, approximation] [can, first, error, note, analysis, one, high, following] [sample, fixed, finite, space, estimate, discrete, mean] [exploration, policy, state, bellman, value, reward, median, action, mdps, reinforcement, transition, rather, take, decision, unbounded, mdp, concurrent, simple, pazis, markov, continuous, next, time, range, delayed] [work, using, per, available, used]
Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes
Jack Rae, Jonathan J. Hunt, Ivo Danihelka, Timothy Harley, Andrew W. Senior, Gregory Wayne, Alex Graves, Tim Lillicrap


[number, external, level] [learning, set, differentiable, since, access, scheme, defined, best, cost, constant, may] [supplementary, step, machine, large, efficient, overhead, recently, processing] [sparse, can, one, via, linear, also] [space, length, nearest, word, neighbor, test, comparable] [time, sam, write, read, model, ntm, ann, backward, wtr, operation, forward, controller, usage, curriculum, turing, copy, information, accessed, addressing] [memory, neural, dense, used, training, sequence, use, able, learn, figure, associative, using, network, priority, previous, character, perform, preprint, approach, arxiv, task, omniglot, train]
PAC-Bayesian Theory Meets Bayesian Inference
Pascal Germain, Francis Bach, Alexandre Lacoste, Simon Lacoste-Julien


[negative, probability, theory, obtained, frequentist, majority, valid, number] [loss, bound, function, theorem, learning, corollary, consider, risk, study, obtain, set, least, optimal, bounded, john, equivalent, may, minimizing, show, general, upper, defined, appendix, provide, tighter] [bayesian, marginal, posterior, likelihood, factor, variance, parameter, machine, log, rely, compute] [regression, can, one, linear, note, assumption, analysis, real, first] [equation, generalization, given, distribution, section, gibbs, empirical, selection, gaussian, exp, sample, squared, data, alexandre, result, done, two, yevgeny, bayes] [model, prior, unbounded, choice, new, extend] [using, used, figure, classification, pascal, training, input]
Tree-Structured Reinforcement Learning for Sequential Object Localization
Zequn Jie, Xiaodan Liang, Jiashi Feng, Xiaojie Jin, Wen Lu, Shuicheng Yan


[detection, number, tree, path, level, threshold, find, average, recursively] [set, optimal, fast, best, obtain, scheme, learning, uncovered, function, active] [proposal, large, faster, sequential] [one, local, localization, can, small, also, global, group] [two, testing, based, well, searching] [search, action, agent, current, scaling, reward, state, starting, reinforcement, model, next, taking, new, taken, sequentially, value, decision] [object, window, recall, whole, proposed, multiple, image, iou, single, voc, feature, translation, different, deep, table, pascal, rpn, training, approach, using, layer, map, visual, better, figure, preprint, arxiv, generated, attended, instead]
On the Recursive Teaching Dimension of VC Classes
Xi Chen, Yu Cheng, Bo Tang


[recursive, number, sat, definition, boolean, string, present, subset, among, formula, theory, introduced] [concept, class, set, bound, tsmin, learning, every, upper, smallest, lemma, nonempty, proof, prove, lower, rtd, complexity, known, let, contradiction, vcd, cbi, shattered, kuhlmann, least, compression, learner, obtain, since, ratio, assume, conference, example, satisfiability, show, depends, follows, binary, cby, moran, always, appear, case, induction] [size, log, university, factor] [dimension, one, can, also, first, projection, denote, small, see, following, still, computational, order] [given, sample, two, family, needed] [teaching, must, teacher, desired] [figure, domain, use, computer, improve, learn, open]
Adaptive Smoothed Online Multi-Task Learning
Keerthiram Murugesan, Hanxiao Liu, Jaime Carbonell, Yiming Yang


[rule, number, knowledge, sentiment, detection, average] [learning, online, algorithm, spam, loss, hypothesis, may, set, regret, will, learner, let, argminwk, problem, show, consider, since, theorem, corollary, adaptive, setting, omtrl] [batch, optimization, updating, end, machine, university, datasets, update, large] [matrix, can, related, formulation, multitask, also, existing, paper, addition] [data, two, given, well, independent, section] [model, information, time, learns, benchmark] [task, training, relationship, pkj, proposed, learn, use, shared, similar, multiple, classification, performance, three, compared, better, different, domain, per, landmine, several, using, single, learned, consists, available, work, dataset, shamo]
Conditional Generative Moment-Matching Networks
Yong Ren, Jun Zhu, Jialian Li, Yucen Luo


[present, knowledge, number] [learning, function, set, appendix, class, defined, element, theorem, competitive] [bayesian, objective, stochastic, gradient, draw, size, minibatch, operator] [can, also, dimension, first, one, error, gram, interesting, via] [conditional, cgmmn, mean, kernel, distribution, embedding, data, space, sample, maximum, cmmd, discrepancy, two, estimate, given, difference, test, mmd, dark, dxy, distill, hilbert, rkhs] [model, predictive, simple, dgm, whether, range] [generative, network, dataset, training, generated, deep, hidden, using, input, learn, mnist, neural, performance, including, mlp, generating, use, image, prediction, architecture, generate, feature, various, convolutional, discriminative, layer, used, shown, unit, figure, table, cnn]
DeepMath - Deep Sequence Models for Premise Selection
Geoffrey Irving, Christian Szegedy, Alexander A. Alemi, Niklas Een, Francois Chollet, Josef Urban


[conjecture, many, number, definition, average, logic, negative] [theorem, set, proof, learning, best, defined, max, prove, since] [large, stage, evaluation] [can, rank, order, proved, first, also, relative] [selection, premise, mizar, automated, atp, mathematical, axiom, embeddings, proving, formal, itp, embedding, two, test, length, useful, based, given, corpus, formalization, formalized, pavail, word] [model, time, experimental, simple, statement, baseline] [neural, network, using, training, sequence, reasoning, figure, use, volume, fully, used, trained, convolutional, deep, recurrent, without, task, preprint, arxiv, architecture, previous, google, language, approach, different, produce]
Interpretable Nonlinear Dynamic Modeling of Neural Trajectories
Yuan Zhao, Il Memming Park


[stable] [function, set, constant, august] [latent, around, initial, step, black, method, inference, standard, reduction, autoregressive] [computation, can, one, linear, phase, note, computational, coherence, also, global] [nonlinear, fixed, true, basis, ring, point, two, test, journal, chaotic, gaussian, locally, random, mean, finite, space, corresponding, theoretical] [model, time, dynamical, trajectory, direction, velocity, series, attractor, continuous, slow, system, simulated, within, head, stimulus, interpretable, blue, decision, driven, dynamic, current, ghost, neuroscience, november] [neural, prediction, training, input, figure, train, field, use, using, proposed, trained, network, used, recurrent, perceptual, red]
Learning Deep Embeddings with Histogram Loss
Evgeniya Ustinova, Victor Lempitsky


[negative, histogram, number, pairwise, probability] [loss, learning, set, conference, online, function, margin, rate, smaller, class, best, uniformly] [batch, datasets, size, sampling, large, computed, parameter] [can, ieee, also, product, mij, sampled] [positive, two, embedding, embeddings, metric, distance, sample, random, corresponding, test, estimate] [binomial, new] [deep, used, similarity, triplet, computer, use, training, figure, deviance, pair, vision, using, image, pattern, dataset, neural, person, classification, different, recognition, network, international, face, sij, last, randomly, contrastive, several, architecture, shown]
Coordinate-wise Power Method
Qi Lei, Kai Zhong, Inderjit S. Dhillon


[number, rule, social] [algorithm, show, set, greedy, function, will, learning, appendix, selecting, let, problem, strategy, even] [method, coordinate, convergence, lanczos, iterate, descent, faster, iteration, processing, optimization, select, update, machine, computing, requires, converges, large, objective] [power, matrix, cpm, sgcd, dominant, can, tan, sparse, symmetric, largest, one, also, vanilla, real, vector, see, arg, eigenvector, eigenvalue, vrpca, linear, much, global, loading, following, comparison, good, analysis] [data, selection, section, important, positive] [time, therefore, next, value] [figure, performance, part, propose, dataset, mechanism, using, better, dense, similar, neural, applied, compared, use, shown]
PerforatedCNNs: Acceleration through Elimination of Redundant Convolutions
Mikhail Figurnov, Aizhan Ibraimova, Dmitry P. Vetrov, Pushmeet Kohli


[number, structure, modern, obtained] [set, cost, since, may, best, achieve, consider, function, choosing] [size, acceleration, method, speedup, increase, evaluation, reduce, decrease, reduction] [error, computational, can, matrix, tensor, see, high, first, support] [redundancy, two, data, kernel, uniform] [grid, exploit, time, value, required] [perforation, convolutional, network, layer, mask, impact, spatial, output, using, use, cpu, alexnet, neural, cnn, pooling, perforated, training, approach, table, deep, gpu, fractional, memory, several, input, perforatedcnns, proposed, used, evaluate, nin, figure, cnns, propose, combined, perform, architecture, store, calculated, lebedev, original, different]
Faster Projection-free Convex Optimization over the Spectrahedron
Dan Garber


[probability, strong, dan] [algorithm, convex, let, problem, learning, follows, function, since, conference, set, rate, minimizing, theorem, elad, bound, optimal, setting, lemma] [method, gradient, convergence, optimization, machine, standard, faster, kvt, iteration, processing, variant, update, full, requires, strongly] [matrix, eigenvector, min, decomposition, denote, can, also, leading, linear, solving, semidefinite, computation, following, error, completion, rank, eigenvalue, analysis, one, much, norm, main, regularized, certain, nuclear, ovie, note, first, convexity, vector, largest, garber, spectrahedron] [conditional, given, distribution, important] [new, choice, information, current] [unit, international, single, neural, task, several]
An algorithm for L1 nearest neighbor search via monotonic embedding
Xinan Wang, Sanjoy Dasgupta


[tree, average, number, possible] [query, set, hashing, algorithm, lsh, case, show, scheme, general, function, will, cost, choose, consider, rate, hash] [approximate, log, end, method, reduction] [projection, can, explicit, dimension, also, high, linear, one, rank, satisfying, practice, mapped] [data, random, nearest, embedding, neighbor, distance, metric, gaussian, two, project, space, point, erp, multivariate, corel, joint, cauchy, based, distribution, embedded, andoni, journal, embeddings, scan, given] [search, time, choice, implement, accessed, along, save] [using, use, used, table, performance, volume, generate, available, original, per, multiple]
Learning Tree Structured Potential Games
Vikas Garg, Tommi Jaakkola


[tree, potential, structure, configuration, node, nij, assignment, many, edge, vote, agree, possible, graphical, supreme] [set, algorithm, learning, setting, strategy, optimal, problem, may, since, will, max, context, function, uniformly, chosen, theorem, least] [dual, method, iteration, compute, optimization, approximate, stochastic, solve] [local, global, decomposition, can, real, one, recover, synthetic, phase, following, first, note, sampled] [data, underlying, two, locally, associated, random, procedure] [game, player, across, value, strategic, form, correct, court, model, typically] [structured, prediction, using, learn, figure, training, without, dataset, different, language]
Natural-Parameter Networks: A Class of Probabilistic Neural Networks
Hao Wang, Xingjian Shi, Dit-Yan Yeung


[probabilistic, number, according] [learning, function, since, set, algorithm, show, achieve, assume, rate, class, defined, may] [variance, bayesian, compute, supplementary, datasets, inference, efficient, proxy, detailed] [can, error, link, also, note, linear, rank, first, computation, one, matrix, following, vector, support] [npn, gaussian, gamma, distribution, nonlinear, mean, data, transformation, test, equation, exponential, two, bdk, section, corresponding, estimated, random, denotes] [poisson, another, pgm, uncertainty, form, model] [natural, different, use, used, neural, deep, activation, training, dropout, performance, table, output, like, network, learned, using, figure, higher, representation, consists, input, learn]
Using Social Dynamics to Make Individual Predictions: Variational Inference with a Stochastic Kinetic Model
Zhen Xu, Wen Dong, Sargur N. Srihari


[social, number, probability, grows, infection, infected, transmission, collected, represents] [rate, algorithm, set, complexity, make, expected, problem] [inference, stochastic, variational, sampling, approximate, bayesian, large, linearly, log, likelihood, efficient, latent, step, tractable] [can, filtering, order, coupled, one, computation] [data, given, hmm, gibbs, introduce, positive, section, people, discrete, kernel, two] [individual, time, model, state, infectious, kinetic, event, transition, markov, dynamic, viskm, epidemic, evolution, particle, temporal, susceptible, form, collective, current, demonstrate, dartmouth, disease, sensor, modeling, backward] [hidden, three, performance, figure, using, predict, sequence, use, network, different, field, entire, shown]
Computing and maximizing influence in linear threshold and triggering models
Justin T. Khim, Varun Jog, Po-Ling Loh


[influence, threshold, edge, infected, graph, vertex, cascade, probability, infection, node, social, lbtrig, kempe, directed, subset, number, monotonic, preferential, live, adjacency, weighted, becomes, acm, runtime] [lower, bound, set, upper, may, theorem, greedy, maximization, algorithm, function, maximizing, general, submodular, conference, case, since, inequality, class, bji, bij, let, tighter, define, problem] [step, size] [linear, following, also, note, via, matrix, proposition, certain, denote, gap, paper, suppose] [independent, section, random, theoretical, data, two, discrete, length, maximum, selected, mathematical] [triggering, model, time, simulation, information, write] [network, international, weight, computer, applied, figure]
Anchor-Free Correlated Topic Modeling: Identifiability and Algorithm
Kejun Huang, Xiao Fu, Nikolaos D. Sidiropoulos


[criterion, identification, mining, identifiable, thus, obtained, runtime, clustering] [algorithm, problem, since, may, considered, show, theorem, least] [optimization, size, separable, usually] [matrix, can, assumption, anchorfree, sufficiently, scattered, also, xray, nonnegative, anchor, linear, factorization, det, snpa, much, one, mined, solution, column, first, coherence, condition, cone, noise, fastachor, cec, deflation, fastanchor, still, analysis, projection, via, see] [topic, data, identifiability, document, corpus, based, spa, two, vocabulary, word, lewinsky, reuters, procedure] [model, modeling, correlation, subject] [proposed, using, compared, use, text, work, table, propose, used, better, approach]
Robust k-means: a Theoretical Revisit
Alexandros Georgogiannis


[cluster, consistency, clustering, definition, influence, clear, become, valid, called, subset, number] [set, function, theorem, let, show, least, consider, rate, class, general, bounded, optimal, case, problem, defined, corollary, define, convex, every, lemma] [proximal, quadratic, unbiased, distributed, datasets, convergence] [one, error, following, min, analysis, also, can, outlier, proposition, vector, order, lie, closest] [robust, trimmed, breakdown, two, moreau, point, robustness, envelope, center, estimation, journal, positive, given, section, associated, equal, radius, modification, statistical, sample, distance, universal, based, biased, remain, mean] [form, value] [map, dataset, figure, top, different, like]
The Power of Adaptivity in Identifying Statistical Alternatives
Kevin G. Jamieson, Daniel Haas, Benjamin Recht


[probability, number, knowledge, detection, total, bag, many, identifying, identification, presence, identify, repeat] [coin, problem, algorithm, lower, heavy, strategy, known, theorem, least, complexity, adaptive, bound, just, arm, reservoir, consider, constant, unknown, bandit, let, expected, show, difficulty, observe, max, upper, fix, hypothesis, corollary, general, whenever, bounded, absolute, armed, case, optimal, set, since, learning, may, will] [log, requires, draw, sampling] [one, can, detecting, note, first, anomaly, following] [distribution, biased, sample, mixture, two, procedure, mean, given, section, infinite, versus, random, fixed, test, exponential] [correct, prior, whether, information, player] [work, proposed, single, propose, using]
Combinatorial Multi-Armed Bandit with General Reward Functions
Wei Chen, Wei Hu, Fu Li, Jian Li, Yu Liu, Pinyan Lu


[base] [algorithm, arm, expected, regret, super, problem, general, utility, function, bound, combinatorial, cmab, online, known, set, learning, upper, outcome, bandit, confidence, case, offline, may, cumulative, round, supported, maximizing, provide, let, depends, assume, optimal, consider, obtain, constant, sdcb, mab, show, oracle, cdf, just, theorem, maximization, ptas, chooses] [stochastic, parameter, optimization, supplementary, stochastically, approximation, end] [can, assumption, also, first, one, linear, note, support, high, dominant, vector] [distribution, random, finite, section, nonlinear, given] [reward, framework, player, time, wei, value, typically] [used, use, work, input, previous]
An equivalence between high dimensional Bayes optimal inference and M-estimation
Madhu Advani, Surya Ganguli


[thus, find, message, equivalence, passing] [optimal, loss, smoothing, smoothed, opt, equivalent, special, case, problem, function, convex, setting, unknown, dependent] [inference, step, log, descent, bayesian, gradient, proximal, variance, iid, approximate, likelihood, optimization, posterior] [signal, measurement, mamp, mmse, bamp, regularizer, noise, gamp, via, can, sparse, high, lopt, linear, related, also, amp, compressed, component, qht, rigorous, error, note, one, lasso, see] [gaussian, moreau, distribution, given, fixed, correction, mean, section, dimensional, additive, statistical, derivation, random, data, estimation, envelope] [time, choice, information, found, determine, evolution] [map, performance, using, dense, input, output, used, neural]
Breaking the Bandwidth Barrier: Geometrical Adaptive Entropy Estimation
Weihao Gao, Sewoong Oh, Pramod Viswanath


[degree, science, consistency, basic] [show, theorem, defined, dependent, case, special, proof, maximization, adaptive, depends, general, known, unknown] [log, likelihood, standard, step, large, key] [local, can, order, one, main, small, following, also] [estimator, bandwidth, entropy, density, mutual, estimation, bias, independent, sample, asymptotic, nearest, neighbor, kernel, fixed, estimate, based, data, underlying, distribution, random, estimating, section, gaussian, numerical, theoretical, resubstitution, nonparametric, empirical, distance, boundary, polynomial, family, uniform, mathematical, result, joint, kozachenko] [information, choice, form, closed, correlation, state, simple] [proposed, use, using, work, recent, figure, approach, outperforms, used, art, unit]
Normalized Spectral Map Synchronization
Yanyao Shen, Qixing Huang, Nati Srebro, Sujay Sanghavi


[graph, probability, normalized, connected] [algorithm, problem, show, complexity, consider, lemma, proof, bound, conference, yet, pij, theorem, convex] [optimization, step, method, initial, processing, irregular] [normspecsync, matrix, xij, noise, sdp, can, specsync, synchronization, eigenvectors, cmu, spectral, leading, recovery, gsparse, diffsync, generalized, first, analysis, much, see, synthetic, second, noisy, sparse, min, ujt, stability, also, ransac, following, ieee, solving, real, exact] [permutation, described, density, distance, two, data, underlying, section, theoretical, random] [observation, model, varying, time] [input, map, object, figure, computer, similar, using, consists, proposed, quantitative, performance, evaluate, pattern, generate, vision]
Fast and Flexible Monotonic Functions with Ensembles of Lattices
Mahdi Milani Fard, Kevin Canini, Andrew Cotter, Jan Pfeifer, Maya Gupta


[monotonic, lattice, ensemble, base, calibrators, monotonicity, calibration, rtl, number, subset, torsion, larger] [set, learning, function, provide, problem, algorithm, inequality, defined, selecting, consider] [machine, evaluation, size, experiment, respect, step, constrained, datasets, optimization, large, key, method, extreme] [can, linear, one, regression, see, first, error, regularizer, small] [random, test, two, joint, testing, journal, useful, mean, selection] [model, time, optimized] [feature, training, table, accuracy, used, dataset, separate, shared, different, neural, validation, train, trained, using, better, randomly, figure, single, learn, jointly, classification, compared, use, similar, per, propose]
CRF-CNN: Modeling Structured Information in Human Pose Estimation
Xiao Chu, Wanli Ouyang, Hongsheng Li, Xiaogang Wang


[passing, message, tree, structure, among, flooding, loopy, graphical, probabilistic, node, pairwise, kong, chen, hong, obtained, graph] [learning, algorithm, scheme, receive, obtain, set] [latent, efficient, implementation, term, end, factor] [can, also, order, group, denote] [joint, two, estimation, denotes, estimated] [model, human, information, modeling, location, connecting] [body, pose, feature, structured, cnn, use, neural, used, deep, using, network, crf, training, output, convolutional, part, dataset, relationship, shown, lsp, table, hidden, spatial, proposed, compared, performance, image, vgg, map, preprint, different, implemented, arxiv, score, person, layer, without]
Online Pricing with Strategic and Patient Buyers
Michal Feldman, Tomer Koren, Roi Livni, Yishay Mansour, Aviv Zohar


[obtained, probability, number, half, acm] [price, regret, algorithm, seller, buyer, revenue, bound, patience, lower, pricing, switching, consider, cost, set, posted, every, regrett, theorem, proof, assume, online, valuation, case, observe, may, best, bandit, loss, optimal, let, obtain, upper, show, expected, buy, setting, problem, constant, implies, mab, even, define, since, day, exists, might, choose] [reduction, size, step, log, full, stochastic, epoch, iteration, respect, university] [can, note, one, main, item, order, following] [given, fixed, purchase, section, two, result] [time, feedback, strategic, model, value, information, within, receives, continuous] [sequence, window, single, work, different, mechanism, use, neural]
GAP Safe Screening Rules for Sparse-Group Lasso
Eugene Ndiaye, Olivier Fercoq, Alexandre Gramfort, Joseph Salmon


[obtained, level, introduced, rule, whose] [feasible, algorithm, set, defined, max, active, optimal, appendix, duality, convex, case, learning, since, strategy, else] [dual, screening, efficient, primal, term, compute, coordinate, convergence, sequential, descent, reduction, optimization] [lasso, norm, gap, group, can, qwg, one, sparsity, solution, computation, sparse, following, proposition, vector, regression, pxq, thanks, xgj, note, high, see, need, regularization, arg, climate, critical, first, computational, matrix] [point, data, given, section, sphere, two, corresponding, radius, equation] [safe, time, ball, value, new, dynamic] [figure, feature, using, natural, sequence, region, propose, proposed, unit]
Catching heuristics are optimal control policies
Boris Belousov, Gerhard Neumann, Constantin A. Rothkopf, Jan R. Peters


[theory, resulting, probability] [optimal, will, function, constant, cost, problem, show, ratio, fast, maximal] [fit, acceleration, away, optimization, run] [can, running, noise, tan, linear, computational, sufficiently, observed] [point, based, journal, described, tangent, two, distance, gaussian] [ball, agent, time, angle, control, catching, model, uncertainty, human, belief, interception, state, trajectory, reactive, system, policy, behavior, gaze, internal, tracking, predictive, future, bearing, reaction, backwards, towards, baseball, turn, intercept, catch, explain, direction, heuristic, catcher, planning, elevation, cancellation, motor, continuous, optic, outfielder, modeling, simulation, long, action, observation, force, speed, simple] [figure, using, optical, task, final, different, visual, prediction]
Hierarchical Question-Image Co-Attention for Visual Question Answering
Jiasen Lu, Jianwei Yang, Dhruv Batra, Devi Parikh


[level, many, hierarchy, ablation, number, recursively] [set, problem] [parallel, method, full, machine] [alternating, can, first, vector, also, see, following] [word, two, based, embedding, given, test, space] [model, answer, hierarchical, light, stop, encoding, information, research, focused] [question, image, attention, visual, vqa, phrase, lstm, neural, propose, color, use, three, proposed, dataset, answering, feature, different, mechanism, language, oursa, used, table, snow, representation, top, using, layer, novel, pooling, convolution, preprint, arxiv, better, affinity, spatial, attend, jointly, dhruv, similar, final, devi, deep, four, attended, figure, recent, holding, attends, train, bird]
Local Similarity-Aware Deep Feature Embedding
Chen Huang, Chen Change Loy, Xiaoou Tang


[negative, mining, pairwise, many, thus, lifted] [learning, loss, margin, class, set, function, hinge, max, absolute, online, adaptive] [large, method, standard, select] [hard, can, local, retrieval, global, also, one, small] [embedding, metric, euclidean, distance, sample, positive, two, space, distribution, data, mean, test, mahalanobis] [new, heterogeneous, towards] [feature, similarity, deep, pddm, figure, image, score, using, learned, transfer, quadruplet, cnn, use, used, imagenet, proposed, pair, triplet, training, unit, structured, visual, contrastive, vision, intraclass, classification, shown, trained, learn, position, jointly, network, table, propose, effective, performance]
Provable Efficient Online Matrix Completion via Non-convex Stochastic Gradient Descent
Chi Jin, Sham M. Kakade, Praneeth Netrapalli


[runtime, neighborhood, number, probability] [algorithm, online, lemma, theorem, will, let, offline, case, problem, show, complexity, learning, general, function, rate, proof, always, set, provide, fast, max, convex, consider, make, since, every, observe] [log, gradient, sgd, saddle, convergence, descent, stochastic, initial, update, step, efficient, away, warm, however] [matrix, rank, completion, can, local, psd, low, denote, also, svd, linear, phase, stay, symmetric, first, one, main, row, benjamin, good, via, see, order, def, analysis, computational] [sample, section, estimate, given, result] [time, start, framework] [preprint, arxiv, region, work, use, using, several]
The Generalized Reparameterization Gradient
Francisco R. Ruiz, Michalis Titsias, David Blei


[probabilistic, variable, report] [function, algorithm, set, obtain, learning, conference, class, show, general] [variational, gradient, reparameterization, advi, bbvi, latent, stochastic, gcorr, depend, apply, variance, standard, elbo, respect, inference, log, monte, grep, carlo, machine, method, beta, posterior, processing, logit, compute, znk, technique, optimization, expectation, nonconjugate, auxiliary, olivetti, kingma, fit] [can, generalized, also, one, xnd] [distribution, gamma, transformation, gaussian, estimate, family, two, random, test, sample, exponential, density, standardization] [model, transformed, information, time, form, control, artificial] [use, neural, score, omniglot, shape, using, used, mnist, figure, international, approach, different, deep, single, consists, dataset]
Split LBI: An Iterative Regularization Path with Structural Sparsity
Chendi Huang, Xinwei Sun, Jiechao Xiong, Yuan Yao


[split, consistency, path, partial, structural, theory, enough, strong, panel, called, stanley, graphical] [let, algorithm, theorem, equivalent, will, define, consider, set, now, constant, smallest, example, defined, minimax] [standard, log, parameter, large, supplementary, iteration] [assumption, lbi, can, lasso, regularization, generalized, sparsity, condition, sign, iterative, order, genlasso, matrix, following, sparse, fiba, error, irrepresentable, ranking, auc, linear, computational, remark, weaker, singular, nonzero, recovery, comparison, one, see, support, also, admm, note, subspace] [selection, journal, fused, based, estimator, bregman, family] [model, right, left, value, information, basketball, design, simple] [figure, image, shown, top, better, denoising, compare]
Semiparametric Differential Graph Models
Pan Xu, Quanquan Gu


[graph, graphical, gene, cancer, expression, number, present, dependency, genetic] [precision, rate, oracle, setting, theorem, show, defined, will, property, concave, since, minimax] [nonconvex, latent, log, faster, large, parameter, likelihood, convergence] [penalty, matrix, can, assumption, group, also, nonzero, analysis, following, order, magnitude, norm, note, lasso, support, condition, synthetic, sparse] [differential, estimator, estimation, two, transelliptical, dpm, gaussian, semiparametric, distribution, sepglasso, mcp, difference, data, estimating, section, true, statistical, comp, estimate, based, random, tau, elliptical, covariance, ovarian, journal, molecular, clinical, mild] [model, research, correlation, world] [network, proposed, different, figure, propose, preprint, arxiv]
Joint quantile regression in vector-valued RKHSs
Maxime Sangnier, Olivier Fercoq, Florence d'Alché-Buc


[thus, called, probability, decomposable] [learning, loss, let, problem, function, algorithm, set, since, property, risk, theorem, appendix, conference, now, convex] [coordinate, machine, datasets, descent, standard, optimization, dual, efficient, method] [regression, can, linear, one, also, order, matrix, vector, comparison, hard, following, remark] [quantile, kernel, joint, conditional, crossing, estimation, quantiles, pinball, two, curve, random, journal, rkhs, space, empirical, theoretical, based, data, estimator, sample, independent, estimating, given, mean, numerical, true, associated, provided, estimated] [intercept, framework, along] [proposed, output, training, approach, multiple, feature, figure, several, different, use, table, using, used, novel, work, weight]
Assortment Optimization Under the Mallows model
Antoine Desir, Vineet Goyal, Srikanth Jagabathula, Danny Segev


[probability, subset, number, expression, sum, program, pairwise] [set, general, problem, algorithm, theorem, chosen, optimal, outside, let, show, learning, price, revenue, consider, class, known, choosing, offered, always, studied, highest, ranked, assume, expected] [optimization, computing, parameter, compute, machine, large, key, efficiently, solve, logit, sampling] [product, can, ranking, running, preference, demand, order, formulation, also, solution, following, first, note, existing, customer, rank] [mixture, distribution, given, two, exponential, random, commonly, equal, fixed, permutation] [choice, model, offer, assortment, mip, time, management, option, location, universe, substitution, determining, preferred, inserted] [position, using, work, scale, approach]
Dynamic Network Surgery for Efficient DNNs
Yiwen Guo, Anbang Yao, Yurong Chen


[pruning, pruned, number, connected, thus] [compression, learning, will, function, rate, loss, make, problem, since, set, achieves, even, learnable, algorithm, binary, may] [method, parameter, efficient, importance, reduce, update, updating, gradient] [can, error, also, matrix, compressed, order, following, much, sparse, one] [reference, two, section, intel, test, corresponding] [model, dynamic, process, current, making, policy] [network, neural, convolutional, deep, figure, training, surgery, prediction, han, connection, accuracy, table, splicing, alexnet, better, dnn, compress, fully, shown, hidden, propose, classification, layer, without, proposed, shall, use, consists, trained, improve, iter, china, retraining]
Stochastic Gradient MCMC with Stale Gradients
Changyou Chen, Nan Ding, Chunyuan Li, Yizhe Zhang, Lawrence Carin


[number, worker, theory, average, indicates, level] [loss, defined, optimal, theorem, learning, bound, setting, algorithm, function, constant, appendix, bounded] [stochastic, distributed, variance, gradient, mse, stale, speedup, convergence, bayesian, standard, staleness, decrease, parameter, parallel, iteration, asynchronous, mcmc, due, langevin, faster, log, monte, server, optimization, maxl, size, computed, discussed, verify, step, posterior] [linear, can, running, technical, order, one] [bias, section, independent, integrator, test, based, increasing, estimation, data, chain, two, described, sample] [time, system, simple, model, typically] [deep, using, neural, architecture, figure, used, multiple, use, instead]
High resolution neural connectivity from incomplete tracing data using nonnegative spline regression
Kameron D. Harris, Stefan Mihalas, Eric Shea-Brown


[number, find, present, site] [problem, since, algorithm, version, will, smoothing, unknown, function, assume] [full, method, fit, operator, inference, term, optimization] [connectivity, rank, matrix, injection, low, regional, mouse, allen, can, ninj, nonnegative, tracing, regression, also, projection, institute, completion, relative, solution, mesoscale, spatially, error, wfull, wtrue, throughout, local, rny, linear, spline, product, mserel, signal] [data, result, space, journal, test, estimator, laplacian, well] [brain, within, voxels, model, cortical, cortex, target, cell, system, varying] [visual, voxel, using, source, neural, use, similar, without, image, training, applied, available, field, spatial, region, map, higher, output]
Boosting with Abstention
Corinna Cortes, Giulia DeSalvo, Mehryar Mohri


[base, ensemble, present, threshold, rule] [abstention, algorithm, loss, learning, rejection, convex, function, lsb, problem, general, appendix, cost, surrogate, defined, lmb, set, expected, dhl, let, scenario, since, max, theorem, define, optimal, assume, best, finding, tsb, consider, hold, stump, price, binary, show] [boosting, standard, optimization, descent, log, step, coordinate, size, argmin] [can, following, solution, error, via, analysis, projected, first, note] [based, predictor, given, two, section, family, data, sample, well, distribution, introduce, bayes, test] [model, direction, new, option, search, decision, value, along] [classification, figure, pair, reject, using, classifier, learned, several, consists, natural]
Gaussian Process Bandit Optimisation with Multi-fidelity Evaluations
Kirthevasan Kandasamy, Gautam Dasarathy, Junier B. Oliva, Jeff Schneider, Barnabas Poczos


[number, querying, many, queried] [fidelity, regret, query, appendix, set, bandit, will, problem, strategy, lower, algorithm, let, bound, upper, function, capital, learning, confidence, best, maximise, smaller, consider, ucb, oracle] [expensive, optimization, bayesian, likelihood, optimisation, method, cheap, posterior, size, due, standard, large, implementation] [can, also, first, high, analysis, second, small, one, optimum, via, real, synthetic, error, global] [gaussian, theoretical, kernel, maximum, two, bad, section] [time, goal, process, simple, tuning, information, next, payoff, demonstrate] [single, using, used, use, training, different, figure, previous, computer, region, work]
Testing for Differences in Gaussian Graphical Models: Applications to Brain Connectivity
Eugene Belilovsky, Gaël Varoquaux, Matthew B. Blaschko


[graphical, edge, many, often] [confidence, obtain, learning, consider, known, now, show, precision, will, setting, hypothesis, problem, set, case, general] [parameter, end, unbiased, method] [lasso, can, sparsity, sparse, matrix, regression, connectivity, one, power, analysis, projected, group, support, second] [debiased, fused, difference, functional, gaussian, two, data, estimate, estimator, statistical, test, permutation, ridge, parametric, estimation, bias, testing, distribution, selection, covariance, significance, autism, given, section, joint, sample, comparing, ggm, well] [brain, fmri, control, inverse, across, van] [use, using, used, figure, work, network, approach, different, dataset, recent]
An ensemble diversity approach to supervised binary hashing
Miguel A. Carreira-Perpinan, Ramin Raziperchikolaei


[number, ensemble, subset, bit, disjoint, neighborhood] [hash, binary, hashing, precision, function, diversity, ilht, learning, set, kshcut, ksh, ynm, case, may, loss, bre, bagging, algorithm, fast, lsh, since, show, will] [objective, optimization, faster, large, approximate, quadratic] [can, linear, also, one, much, alternating, matrix, orthogonality, vector, coupled, retrieval, solution] [random, laplacian, test, hamming, data, well, independent, kernel, two, nearest] [information, optimize, far, decision, optimizing] [training, using, different, use, supervised, neural, output, single, train, classifier, image, better, svms, work, cifar, recall, approach]
Learning shape correspondence with anisotropic convolutional neural networks
Davide Boscaini, Jonathan Masci, Emanuele Rodolà, Michael Bronstein


[partial, diffusion, triangle, many] [learning, defined, function, pointwise, follows, minimizing] [method, challenging, operator, processing, supplementary] [spectral, yes, local, can, error, matrix, one, denote] [geodesic, point, construction, geometric, geometry, reference, euclidean, radius, done, lscnn, two, well, random, tangent, functional, faust, refer, riemannian] [intrinsic, soft, unlike, model, template] [correspondence, shape, anisotropic, acnn, convolutional, neural, used, heat, different, applied, using, figure, deep, cnn, computer, training, gcnn, network, approach, filter, compared, output, surface, cnns, performance, representation, proposed, recent, deformable, outperforms, add, mesh, input, matching, work, previous, patch, several]
Brains on Beats
Umut Güçlü, Jordy Thielen, Michael Hanke, Marcel A. J. van Gerven


[number, assignment, assigned] [show, highest, sensitive, revealed, set] [music, posterior, gradient, analyzed, processing] [analysis, first, tag, musical, matrix, significantly, correlated, spectrum] [mean, corresponding, functional, test, random, data, two] [model, stg, representational, rdms, voxels, sensory, brain, auditory, control, rdm, target, human, temporal, fmri, stimulus, anterior, along, information, gyrus, subject, increasingly, followed, found, representing, time, value, ventral, whereas, significant, fdr, across] [dnn, neural, layer, performance, used, visual, candidate, similarity, deep, figure, region, dataset, shown, different, entire, convolutional, prediction, compared, deeper, training, predict, architecture, audio, dnns, dissimilarity]
Dynamic matrix recovery from incomplete observations under an exact low-rank constraint
Liangbei Xu, Mark Davenport


[number, weighted, constraint, definition, program, nmin] [case, set, will, complexity, assume, bound, theorem, minimization, problem, optimal, consider, let, since, context, provide, convex] [operator, convergence, factor, parameter] [matrix, recovery, can, lowems, error, one, completion, measurement, sensing, noise, also, alternating, perturbation, recover, rank, note, following, min, analysis, norm, denote, recommendation, linear, arg, nuclear, rmse, ieee, via, synthetic, exact, incoherence, suppose, noisy] [sample, two, estimator, reduces, given, section, result, gaussian, fixed, theoretical] [baseline, dynamic, time, information, required, simple, model] [approach, using, use, compared, figure, proposed, accuracy, validation, similar, recent, relatively]
Graphical Time Warping for Joint Alignment of Multiple Curves
Yizhi Wang, David J. Miller, Kira Poskanzer, Yue Wang, Lin Tian, Guoqiang Yu


[warping, graph, dtw, path, gtw, cut, directed, edge, labeling, valid, reverse, definition, ggtw, neighboring, structure, astrocyte, dependency, find, node, propagation, gstruct, connected, induced, graphical, lmf, align, programming] [set, max, algorithm, problem, function, cost, lemma, may, conference, defined, since, example, case] [primal, dual, capacity, parameter, solved, hyperparameters] [can, one, also, imaging, noise, global, analysis, solution, calcium, comparison] [data, two, given, corresponding, curve, distance, reference, joint, equation, estimated] [time, series, dynamic, another, area, transformed, form] [flow, pair, multiple, alignment, single, different, pattern, shown, using, jointly, international, computer, approach, applied, used, similar]
Deep Learning Models of the Retinal Response to Natural Scenes
Lane McIntosh, Niru Maheswaranathan, Aran Nayebi, Surya Ganguli, Stephen Baccus


[circuit, precise, david] [function, general] [variance, glm] [noise, second, first, can, computational, accurate, linear, also, generalized, one] [mean, data, nonlinear, distribution, particular, journal] [retinal, model, response, stimulus, spike, white, sensory, spatiotemporal, ganglion, time, firing, variability, temporal, count, found, encoding, activity, markus, poisson, spiking, retina, early, cell, stephen, internal, nature, recorded] [natural, neural, convolutional, cnns, cnn, layer, figure, trained, network, capture, spatial, performance, better, used, learned, scene, using, filter, understanding, recurrent, generalize, adaptation, deep, contrast, understand, predicting, training, compared, shown, similar, different, generate, resolution, including, feedforward, architecture, work]
Review Networks for Caption Generation
Zhilin Yang, Ye Yuan, Yuexin Wu, William W. Cohen, Ruslan R. Salakhutdinov


[negative, probability, added, theory, labeling, identifiable] [class, algorithm, learning, set, let, label, setting, general, defined, function, lemma, theorem, binary, follows, will] [posterior, size] [noise, can, also, noisy, one, first, related, component] [data, positive, estimation, sample, mixture, true, proportion, nonparametric, parametric, univariate, distribution, practical, identifiability, estimating, mixing, estimate, density, transformation, random, alphamax, space, gaussian, msgmm, dimensionality, section, robust, statistical, equation, two, maximum] [prior, form, model] [unlabeled, labeled, classification, using, used, approach, input, use, classifier, learn, performance, different, explicitly, pair, generate]
Active Learning from Imperfect Labelers
Songbai Yan, Kamalika Chaudhuri, Tara Javidi


[number, probability, membership, querying, threshold, find] [algorithm, learning, active, abstention, query, labeler, label, complexity, rate, theorem, lower, function, hypothesis, return, least, conference, setting, close, class, learner, optimal, case, let, instance, interval, heck, will, near, constant, checksignificant, upper, kamalika, pac, agnostic, binary, abstain, adaptive, consider, satisfies, fragment, problem, since] [end, log, smooth, variance] [condition, can, noise, noisy, one, also, small, satisfying, significantly, analysis] [boundary, space, statistically, closer, random, two, based, given, statistical, distance, universal] [decision, information, response, model, search, significant] [ground, output, work, truth, classifier, figure, use, used]
Human Decision-Making under Limited Time
Pedro A. Ortega, Alan A. Stocker


[theory, resulting, average, made] [utility, expected, function, set, optimal, regret, bounded, version, setting] [posterior, log, experiment, fit, parameter, university, processing, normalizing, solve] [decomposition, one, can, solution, first, relative, also, percentage, following] [test, distribution, two, empirical, given, mutual, well, exp, maximum, derived, gibbs, determined, based, increasing, data] [choice, inverse, model, prior, information, stimulus, resource, temperature, time, subjective, human, subject, rational, puzzle, trial, decision, correct, change, value, economic, inferred, within, duration] [using, performance, training, used, figure, shown, neural, learned, conducted, calculated, different, untrained, task]
Unsupervised Domain Adaptation with Residual Transfer Networks
Mingsheng Long, Han Zhu, Jianmin Wang, Michael I. Jordan


[dan, made, block, adding] [learning, function, adaptive, show, will, make, best] [standard, method, parameter, approximate, key] [can, tensor, penalty, perturbation, paper, assumption, one, hence] [data, mmd, entropy, kernel, bridge, discrepancy, well, based] [target, new, state, across, directly, adapting, model, module, abstract, framework, prior] [domain, deep, classifier, adaptation, residual, source, transfer, feature, rtn, labeled, learn, multiple, transferable, layer, different, unsupervised, using, neural, training, network, art, figure, convolutional, accuracy, approach, shown, enable, perform, learned, alexnet, work, previous, dataset, better, unlabeled, jointly, recent, effective, applied, input, propose, proposed, trained]
Finite-Dimensional BFRY Priors and Variational Bayesian Inference for Power Law Models
Juho Lee, Lancelot F. James, Seungjin Choi


[average, stable, normalized, many, pyp, variable, law, crp, number, obtained] [algorithm, let, class, get, learning, proof, since, bounded] [variational, beta, bayesian, easily, datasets, inference, log, posterior, computed, develop, likelihood, efficient, machine, tractable, supplementary, latent] [can, also, one, synthetic, explicit, first] [random, bfry, measure, mixture, bayes, based, collapsed, gamma, density, dirichlet, crm, laplace, converge, finite, ggp, journal, statistical, fnspm, functional, data, buffet, fdpm, construct, indian, gibbs, ideal, test, sbp, independent, infinitely, kingman, six, divisible, corresponding, marginalized, dimensional] [process, written, simple, model] [using, generated, figure, compared, performance, table, relationship]
Optimistic Gittins Indices
Eli Gutin, Vivek Farias


[probability, number, frequentist] [gittins, index, regret, arm, problem, discount, ogi, bandit, optimistic, thompson, optimal, algorithm, will, horizon, bound, ucb, proof, bernoulli, let, consider, show, lower, expected, lemma, mab, theorem, constant, class, set, learning, agrawal, stopping, even] [bayesian, factor, sampling, approximation, parameter, experiment, large, beta, log, posterior, solves] [denote, one, computational, analysis, sufficiently, can, paper, also, following] [bayes, gaussian, mean, infinite, result, equation, two, given, random, finite, distribution] [time, policy, prior, reward, information, state, discounted, value, simplest, tuning, therefore] [use, performance, using, proposed, recent, applied]
Cooperative Inverse Reinforcement Learning
Dylan Hadfield-Menell, Stuart J. Russell, Pieter Abbeel, Anca Dragan


[sum, induced] [optimal, learning, best, problem, function, maximizes, regret, theorem, let, consider, algorithm, show, set, make, general, will, expected, may, complexity, case] [compute, computing, approximate, initial] [can, observed, assumption, phase, solution, reduced, one, first, solving] [distribution, section, two, joint, given, maximum, common, difference, mean] [reward, policy, robot, cirl, game, human, state, irl, inverse, reinforcement, behavior, value, response, apprenticeship, demonstration, cooperative, model, action, deployment, maximize, world, trajectory, expert, pomdp, information, dbe, belief, design, expects, prior, learns, infers, arises] [pair, use, feature, alignment, better, figure, ground, truth]
Path-Normalized Optimization of Recurrent Neural Networks with ReLU Activations
Behnam Neyshabur, Yuhuai Wu, Ruslan R. Salakhutdinov, Nati Srebro


[path, number, adding, node, proceeding] [learning, function, problem, conference, show, consider, feasible, define, even] [optimization, term, gradient, parameter, step, size, respect, plain, standard, descent, update, sgd, calculating, large, calculate] [can, also, second, invariant, first, error, initialization, addition] [length, geometry, two, section, test, transformation] [time, modeling, therefore, long, internal, win] [neural, rnn, recurrent, shared, rnns, input, feedforward, output, using, different, network, hidden, relu, sequence, training, activation, table, wrec, international, language, use, better, performance, including, irnn, calculated, deep, train, wout, implemented]
Bootstrap Model Aggregation for Distributed Statistical Learning
Jun Han, Qiang Liu


[number, average, total, probabilistic, weighted] [learning, since, set, fix, max, show, problem, rate] [bootstrap, method, mse, distributed, log, averaging, size, variance, ppca, parameter, applicable, liu, large, efficient, reduction, ihler, due, machine, latent, likelihood] [local, linear, can, dimension, global, arg, first, see, need, also, assumption, note, exact, real, analysis] [estimator, data, mixture, practical, based, asymptotic, theoretical, true, sample, mle, gmm, empirical, estimated, section, statistical, consistent, estimating, given, positive] [model, control, information, simple, communication, world, varying] [different, using, figure, shown, use, dataset, better, learn, combined, original, approach]
Proximal Stochastic Methods for Nonsmooth Nonconvex Finite-Sum Optimization
Sashank J. Reddi, Suvrit Sra, Barnabas Poczos, Alexander J. Smola


[number, average] [algorithm, complexity, convex, constant, theorem, obtain, problem, corollary, function, optimal, let, known, set, learning, assume, general, show, cost, since, alexander, best, studied] [rox, nonconvex, convergence, stochastic, vrg, proximal, ifo, gradient, aga, nonsmooth, minibatch, size, step, method, variance, optimization, minibatches, machine, iteration, smooth, variant, suvrit, descent, end, sashank, faster, svrg, due, strongly, converges, siam, reduction, incremental, rfi, key, full, inner, saga] [following, can, solution, also, analysis, linear, one, first, note, much, global] [point, theoretical, result, journal, stationary, important] [typically] [using, use, performance, used, similar, output, figure, work, multiple]
Tagger: Deep Unsupervised Perceptual Grouping
Klaus Greff, Antti Rasmus, Mathias Berglund, Tele Hao, Harri Valpola, Juergen Schmidhuber


[connected, many, number] [learning, even, example, class, cost, fast, make, conference] [inference, posterior, iteration, amortized, end, method, efficient] [can, group, iterative, tag, also, error, one, first] [two, based, parametric, result, given] [model, system, framework, baseline, internal, information, complex] [grouping, neural, tagger, network, using, unsupervised, fully, use, dataset, ladder, classification, trained, denoising, figure, table, training, segmentation, input, different, object, textured, perceptual, ami, preprint, zki, score, similar, attention, convolutional, help, train, arxiv, used, digit, learn, mapping, evaluate, mnist, supervised, generative, task, international, rnn, iter, texture, layer, without]
Learning Treewidth-Bounded Bayesian Networks with Thousands of Variables
Mauro Scanagatta, Giorgio Corani, Cassio P. de Campos, Marco Zaffalon


[treewidth, number, graph, dag, parent, thus, structural, variable, moral, subgraph, node, twilp, nie, tree, reverse, structure, whose, topological, consistently, report, bic, edge, gobnilp, allow, campos, find, obtained, undirected, actual] [learning, set, bounded, consider, algorithm, problem, yet, conference, cost, every, let, since, function, will, best] [bayesian, large, method, run, initial, size] [order, can, one, first, exact, also] [data, space, given, two] [inverted, state, time, information, search, artificial, exploration] [score, approach, table, network, different, higher, randomly, figure, adopt, propose, using, novel, proposed]
Neurally-Guided Procedural Models: Amortized Inference for Procedural Graphics Programs using Neural Networks
Daniel Ritchie, Anna Thomas, Pat Hanrahan, Noah Goodman


[program, many, partial, number, probabilistic, constraint, possible] [example, function, learning, set] [likelihood, inference, variational, monte, method, importance, faster, sequential, university, efficient, amortized, size, carlo, constrained] [can, also, local, via, computation, require, one] [random, chain, well, distribution, section, mixture, given, reference] [procedural, model, target, smc, unguided, choice, time, guided, current, state, took, match, design, forward, pgm, siggraph, generates, continuous, control, modeling] [image, figure, neural, using, output, training, generate, similarity, network, train, shape, generated, performance, capture, used, use, work, generative, accumulative, matching, different, visual, mask, recent]
Graphons, mergeons, and so on!
Justin Eldridge, Mikhail Belkin, Yusu Wang


[graphon, graph, cluster, clustering, level, merge, tree, probability, node, edge, height, consistency, say, connected, mergeon, equivalence, graphons, claim, theory, structure, possible, collection, identify, represents, piecewise, present, neighborhood, relabeling] [set, let, will, algorithm, every, appendix, consider, now, class, notion, general, setting, may, function, almost, said, equivalent, show] [method, stochastic] [sampled, also, can, recovery, one, suppose, matrix, see, following, zero] [random, consistent, measure, two, estimator, density, distribution, distortion, given, particular, preserving, sense, measurable, statistical, distance, result] [model, hierarchical, must, rather] [pair, single, figure, recent, shown, using, approach]
Geometric Dirichlet Means Algorithm for topic inference
Mikhail Yurochkin, XuanLong Nguyen


[number, clustering, obtained, weighted, cluster, probabilistic, probability] [algorithm, polytope, convex, problem, function, will, loss, conference, let, may, learning, since, show, fast] [inference, sampling, latent, simplex, likelihood, method, objective, variational, extreme, processing] [can, extension, min, matrix, via, also, subspace, following] [topic, geometric, recoverkl, dirichlet, gdm, document, tgdm, gibbs, length, distribution, given, geometry, vem, based, increasing, lda, word, pmi, distance, two, data, section, statistical, point, nonparametric, estimation, true, vocabulary] [modeling, varying, information, model] [perplexity, performance, approach, viewed, neural, international, connection]
Robust Spectral Detection of Global Structures in the Data by Learning a Regularization
Pan Zhang


[bethe, community, clustering, localized, pairwise, adjacency, block, detection, thus, degree, graph, average, detectability, informative, structure, many, obtained, topology, panel] [problem, learning, algorithm, general, example, will, set] [stochastic, large, method, inference, however, usually] [matrix, spectral, eigenvectors, can, sparse, regularization, hessian, see, localization, rank, singular, eigenvector, completion, first, global, planted, also, eigenvalue, largest, noise, local, vector, participation, one, trimming, noisy, perturbation] [data, random, given, well, way, based, theoretical, associated] [model, information, transition, left, inverse, right] [using, work, different, figure, three, use, generated, several, approach, proposed, network, similarity]
Generative Shape Models: Joint Text Recognition and Segmentation with Very Little Training Data
Xinghua Lou, Ken Kansky, Wolfgang Lehrach, CC Laan, Bhaskara Marthi, D. Phoenix, Dileep George


[detection, graph, edge, strong, many, variable, false] [conference, learning, considered, every] [factor, inference, due, machine] [can, real, high, ieee, one, analysis, first, local, also, vector, accurate, global] [word, two, random, given, true, robust, selection, test, data, document] [model, reading, forward, process, world, system, unlike, hypothetical] [text, parsing, character, shape, training, using, scene, recognition, letter, segmentation, generative, icdar, image, candidate, font, output, discriminative, computer, used, structured, invariance, lateral, svt, approach, trained, vision, single, figure, pattern, feature, photoocr, clean, international, dataset, train, representation, perform, score, language, cnn]
Matrix Completion has No Spurious Local Minimum
Rong Ge, Jason D. Lee, Tengyu Ma


[claim, probability, many, partial, find] [proof, function, satisfies, will, lemma, theorem, case, let, satisfy, convex, problem, known, show, close, prove, implies, learning, version, setting] [objective, gradient, descent, optimization, showed, step, sampling, nonconvex, machine] [matrix, local, optimality, order, can, first, second, completion, condition, minimum, also, global, necessary, hessian, rank, assumption, spurious, solution, norm, main, incoherent, observed, regularizer, via, low, linear, suppose, initialization, alternating] [point, section, converge, two, tij, random, equation, statistical, positive, result] [observation, must, therefore, information, time, form, starting] [use, work, arxiv, different, preprint, using, similar]
Solving Random Systems of Quadratic Equations via Truncated Generalized Gradient Flow
Gang Wang, Georgios Giannakis


[number, many, normalized, thus] [rate, set, constant, function, algorithm, convex, since, will, even, complexity, cost, consider, problem] [gradient, quadratic, nonconvex, method, initial, large, stage] [spectral, tggf, ati, truncated, initialization, generalized, success, real, relative, truncation, phase, can, twf, solving, one, signal, retrieval, error, solution, matrix, vector, global, order, noiseless, recovery, also, computational, via, eigenvalue, sufficiently, ieee, exact, sign, formulation, linear] [gaussian, random, empirical, data, numerical, estimate, given, two, corresponding, sample, squared, stationary] [complex, model, simple, system, form, time, varying, upon, search, worth, superior] [performance, using, novel, flow, figure]
Hierarchical Object Representation for Open-Ended Object Category Learning and Recognition
Seyed Hamidreza Kasaei, Ana Maria Tomé, Luís Seabra Lopes


[number, obtained, assigned] [learning, set, conference, since, known, defined] [latent, batch, incremental, evaluation] [can, view, dictionary, local, first, one, observed, ieee, also] [topic, lda, word, based, distribution, data, dirichlet, well, two, given, gibbs, selected, point, described, bow, provides] [system, new, model, hierarchical, robot, demonstration, must, information, process, simulated, increasingly] [object, category, recognition, used, using, approach, representation, visual, proposed, layer, accuracy, performance, learned, training, table, feature, learn, presented, dataset, shape, scene, candidate, mug, international, available, memory, different, per, classification, semantic, figure, recognize, three, neural, unsupervised]
Select-and-Sample for Spike-and-Slab Sparse Coding
Abdul-Saboor Sheikh, Jörg Lücke


[probabilistic, number, among, variable] [learning, study, set, algorithm, may, setting, general] [latent, posterior, sampling, large, inference, standard, variational, latents, factored, approximation, efficient, apply, scalability, approximate, expectation, scalable, preselection] [sparse, can, observed, truncated, also, first, high, subspace, following, proposition, zero, noise] [data, given, dimensional, gaussian, gibbs, distribution, dimensionality, selection, application, based, sampler, result] [coding, model, complex, previously, within, typically, continuous, encoding, sensory] [using, generative, image, used, neural, use, applied, multiple, hidden, shown, different, scale, reported, approach, work, per, visual, accuracy, including]
Adaptive Newton Method for Empirical Risk Minimization to Statistical Accuracy
Aryan Mokhtari, Hadi Daneshmand, Aurelien Lucchi, Thomas Hofmann, Alejandro Ribeiro


[variable, number, neighborhood, average] [risk, set, problem, loss, algorithm, erm, consider, convex, optimal, satisfied, theorem, constant, argument, observe, since, implies, learning, upper, conference, bound, minimization] [newton, size, method, ada, stochastic, convergence, gradient, quadratic, factor, step, cvn, increase, iteration, large, line, descent, backtracking, saga, rrn, requires, initial, stepsize, solves, processing, argmin, update, strongly, compute] [condition, order, can, proposition, solution, regularized, assumption, first, one, see, second, phase, also, regularization, following, local] [statistical, sample, empirical, section, result, journal, increasing, associated] [within, information, search] [accuracy, training, use, dataset, neural, performance, single]
Constraints Based Convex Belief Propagation
Yaniv Tenzer, Alex Schwing, Kevin Gimpel, Tamir Hazan


[consistency, cbcbp, constraint, cbp, potential, xtp, message, passing, average, xsp, variable, program, lagrange, pairwise, propagation, enforce, include, many, slot, number] [algorithm, set, function, problem, since, define, may, consider, convex, feasible, learning, defined, lemma, show, conference] [dual, standard, inference, update, log, machine, factor, large, faster, respect] [computational, following, can, one, also, first, order, running, via] [given, word, based, two, exp, derivation, associated, random] [model, belief, value, must, time] [source, phrase, using, translation, use, region, accuracy, table, language, used, work, computer, feature, segmentation, neural]
Joint M-Best-Diverse Labelings as a Parametric Submodular Minimization
Alexander Kirillov, Alexander Shekhovtsov, Carsten Rother, Bogdan Savchynskyy


[diverse, labelings, runtime, divmbest, quality, labeling, number, pairwise, called, master, unary, solver, graphical, grows, variable, probable, definition] [diversity, submodular, problem, minimization, algorithm, will, binary, case, set, theorem, best, known, let, concave, give, minimizing, function, finding, consider, implies, since, defined] [method, energy, efficient, sequential, faster, optimization, approximate, respect, computing, inference] [can, also, solution, invariant, min, one, exact, interactive] [parametric, measure, joint, two, hamming, distance, permutation, difference, given] [time, dynamic] [use, proposed, used, work, segmentation, single, using, different, multiple, structured, pascal, approach, shown]
Tracking the Best Expert in Non-stationary Stochastic Environments
Chen-Yu Wei, Yi-Te Hong, Chi-Jen Lu


[total, number, dependency, according, theory] [regret, algorithm, loss, bound, bandit, arm, let, theorem, learning, expected, will, lower, show, switching, upper, interval, consider, constant, drifting, set, offline, achieve, rate, every, may, appendix, make, problem, online, provide, optimal, achieves, conference, prove, best, even, choose, existence, achievable, case, almost, smaller, idea, chosen, implies] [step, stochastic, variance, supplementary, large] [can, one, vector, order, small, following, denote, note, still, related, good, need] [distribution, given, two, mean, measure, well, way, estimate] [time, information, switch, dynamic, new, action, expert] [using, use, previous, different, like, whole, better]
Visual Question Answering with Question Representation Update (QRU)
Ruiyu Li, Jiaya Jia


[interaction, possible, weighted, resulting] [since, set, may, query, focus, max, learning] [update, updating] [one, also, global, related, following] [two, based, common, given, mean, type, space, retrieved, data] [model, next, answer, information, ability, correct, learns, modeling] [image, question, reasoning, attention, neural, vqa, object, network, layer, visual, representation, answering, preprint, arxiv, language, region, pooling, using, figure, table, use, natural, color, proposed, convolutional, qil, candidate, original, several, memory, dataset, updated, gru, spatial, feature, top, recurrent, relevant, understanding, multiple, generate, single, performance, qru, classification, input, deep, cnn, used, mechanism, content]
Learning a Metric Embedding for Face Recognition using the Multibatch Method
Oren Tadmor, Tal Rosenwein, Shai Shalev-Shwartz, Yonatan Wexler, Amnon Shashua


[number, expression, according, proportional, partition, variable, sum] [let, every, lemma, case, obtain, consider, fix, proof, observe, set, bound, appear, theorem, show, follows, analyze, since, inequality] [variance, rewrite, sampling, expectation] [row, can, first, denote, matrix, diagonal, vector] [estimate, quantity, random, two, derivation] [value, take, allowing, observing, turn] [used, without, use, entire, like]
Ladder Variational Autoencoders
Casper Kaae Sønderby, Tapani Raiko, Lars Maaløe, Søren Kaae Sønderby, Ole Winther


[weighted, number, purely, recursively, obtained] [learning, show, lower, bound, best, study, tighter, set] [inference, latent, variational, lvae, stochastic, vae, importance, approximate, vaes, likelihood, deterministic, distributed, log, term, five, posterior, additional, parameterization, recently, processing, normalizing] [see, can, one, highly, regularization, also] [gaussian, distribution, test, data, provides, true, two] [model, information, new, observation, prior, process, hierarchical] [generative, using, training, figure, performance, layer, deep, arxiv, trained, preprint, used, neural, representation, different, train, learned, compared, better, omniglot, fully, seen, norb, higher, ladder, consisting, table, unit, qualitatively, improve, mnist, autoencoder]
Dynamic Filter Networks
Xu Jia, Bert De Brabandere, Tinne Tuytelaars, Luc Van Gool


[connected, number] [learning, set, idea, loss, show, general, function] [apply, method, applies] [filtering, can, local, also, view, one, note] [transformation, section, given, specific, two, address] [dynamic, model, module, dynamically, another, new, baseline, next, future] [network, filter, input, layer, video, convolutional, image, generated, prediction, spatial, figure, use, deep, stereo, conditioned, learned, feature, single, using, applied, similar, moving, generate, flow, like, work, proposed, residual, highway, learn, different, task, output, predict, propose, previous, convolution, transformer, neural, training, position, sequence, architecture, instead, predicting, lstm, several, driving, used, shown, optical, frame, map, unsupervised, iminds, steerable]
Refined Lower Bounds for Adversarial Bandits
Sébastien Gerchinovitz, Tor Lattimore


[probability] [regret, bound, lower, loss, algorithm, bandit, proof, theorem, exists, lemma, let, eqj, round, every, prove, optimal, upper, show, arm, follows, learning, inequality, corollary, learner, defined, randomness, conference, minimax, case, randomised, randomisation, hazan, bounded, chosen, get, since, max, improvement, cumulative, sup, best, smaller, always] [log, expectation, respect, stochastic, quadratic, depend, variance, parameter, supplementary, key, divergence] [can, first, min, also, second, one, high, assumption, denote, existing, order, note] [distribution, two, section, result, difference, random, noting] [internal, range, action, value, therefore, taken, new, must] [sequence, adversarial, variation]
A Non-parametric Learning Method for Confidently Estimating Patient's Clinical State and Dynamics
William Hoiles, Mihaela Van Der Schaar


[number, quality, unique, intervention, contains, level, possible, cancer] [algorithm, learning, hypothesis, bound, set, bernstein, function] [stochastic, bayesian, inference, parameter, increase, method, processing] [can, note, necessary, sufficient] [clinical, given, test, patient, segment, associated, estimate, construct, data, mean, covariance, multivariate, vital, therapeutic, estimated, based, clinician, distribution, estimation, hmm, statistical, estimating, provided, utilize, infinite, gaussian, maximum, bias, provides, admission, rothman, physiological, icu, sample, independent, ppv, section, constructed, dirichlet, empirical] [state, dynamic, model, new, prior, equality, medical, time, information, transition] [using, used, dataset, segmentation, several, accuracy, neural, evaluate, detected, compared]
Low-Rank Regression with Tensor Responses
Guillaume Rabusseau, Hachem Kadri


[structure, constraint] [problem, algorithm, function, least, theorem, let, set, defined, learning, drawn, greedy, minimization, consider, show, loss, finding, convex, returned, equivalent, setting, bounded, will] [approximation, size, supplementary, efficient, method, term, material] [tensor, regression, rank, multilinear, holrr, matrix, can, linear, low, following, analysis, extension, regularization, one, product, tucker, lrr, error, decomposition, kernelized, solution, proposition, good, note, see, orthogonal] [data, kernel, multivariate, generalization, given, space, theoretical, nonlinear, based, computationally, ridge] [model, time, along] [output, approach, proposed, feature, input, using, task, training, top, use, different, image, propose]
The Multiple Quantile Graphical Model
Alnur Ali, J. Zico Kolter, Ryan J. Tibshirani


[graphical, neighborhood, variable, panel, dependency, structure] [learning, function, consider, problem, set, considered, may, example, assume, studied, algorithm, minimize, smoothing] [machine, optimization, sampling, method, university, likelihood, efficient] [regression, sparse, can, also, one, lasso, matrix, following, via, technical, sparsity, solving, recovery, admm] [conditional, mqgm, quantile, distribution, given, estimation, joint, basis, selection, journal, gaussian, data, additive, section, gibbs, covariance, pseudolikelihood, corresponding, flu, random, multivariate, conditionals, space, estimate, nonparametric, quantiles, underlying, annals, conditionally, independent, univariate, ring, fitted, spacejam, statistical, positive] [model, modeling, continuous, inverse, across] [using, multiple, figure, approach, region, bottom, available]
Stochastic Gradient Geodesic MCMC Methods
Chang Liu, Jun Zhu, Yang Song


[variable, simulate, constraint] [learning, appendix, define, known, conference, set, defined, consider, now] [stochastic, gradient, log, sampling, coordinate, bayesian, inference, monte, posterior, machine, carlo, mcmc, variational, energy, langevin, develop, draw, large, inner, processing, apply, due, scalable] [can, global, also, matrix, first, see, small, one, synthetic, local, denote] [sggmc, geodesic, distribution, embedded, riemann, space, manifold, gmc, empirical, spherical, hamiltonian, true, integrator, gsgnht, topic, estimate, sample, data, mixture, journal, two, test, embedding, expressed] [model, sam, target, information, process, system, von] [using, use, international, used, recipe, dataset, task, figure, neural, like, better]
Contextual semibandits via supervised learning oracles
Akshay Krishnamurthy, Alekh Agarwal, Miro Dudik


[valid, number, pmin, probability, find, reedy] [algorithm, regret, contextual, semibandit, vcee, learning, composite, bound, set, known, bandit, unknown, oracle, best, setting, agarwal, combinatorial, observe, expected, let, problem, may, klt, achieve, semibandits, achieves, learner, theorem, play, consider, round, context, ucb, least, exploitation, optimal, dependence, make, since, show, appendix] [log, efficient, parameter, variance, unbiased] [can, linear, vector, one, first, regression, also, min, analysis, existing] [empirical, uniform, distribution, space] [reward, policy, simple, action, feedback, exploration, new, information] [use, supervised, using, weight, structured, compare, implemented, better, work, performance, generalizes, feature]
Learning to learn by gradient descent by gradient descent
Marcin Andrychowicz, Misha Denil, Sergio Gómez, Matthew W. Hoffman, David Pfau, Tom Schaul, Nando de Freitas


[base, number, many] [learning, function, problem, will, conference, algorithm, show, setting, appendix] [optimizers, optimization, gradient, update, objective, machine, standard, method, step, additional] [can, one, also, see, small, avoid, plot] [test, generalization, two, well] [behavior, simple, design, allows, optimize, model, optimizing, baseline, artificial] [optimizer, neural, lstm, different, using, trained, network, performance, figure, training, work, style, hidden, learned, optimizee, used, use, content, learn, image, international, train, shown, applying, instead, approach, architecture, final, able, similar, convolutional, resolution, mlp, recurrent, coordinatewise, transfer]
Nested Mini-Batch K-Means
James Newling, François Fleuret


[assignment, thus, cluster, number, triangle, assigned, obtained, rapidly] [algorithm, bound, consider, scheme, may, conference, let, problem, inequality, learning, idea] [premature, nmbatch, energy, end, mbatch, nested, size, iteration, sculley, update, line, standard, machine, large, infmnist, initialisation, yinyang, full, discussed, extreme, accelerate, approximate] [one, first, can, exact, sparse, relative, note, second, minimum, running] [centroid, distance, data, based, redundancy, sample, random, lloyd, two, elkan, mean, difference, test] [new, time, balancing, within] [using, used, use, bounding, figure, training, approach, shown, validation, international, performed, dataset, presented, already]
Quantized Random Projections and Non-Linear Estimation of Cosine Similarity
Ping Li, Michael Mitzenmacher, Martin Slawski


[bit, number, expression, probability] [set, optimal, case, scheme, let, function, may, oracle, consider, lemma, theorem, even, depends, problem, learning, chosen, general] [variance, likelihood, full, inner, requires] [can, linear, one, fraction, high, relative, also, real, small, matrix] [data, mle, random, quantized, quantization, given, estimator, hamming, neighbor, nearest, retrieved, based, test, fisher, two, maximum, specific, result, estimation, statistical, empirical, farm, corresponding, estimate, quantizer, distance, denotes] [information, choice, form, search, required, inverse, cell] [approach, similarity, figure, unit, using, accuracy, performance, better, cosine]
Generating Images with Perceptual Similarity Metrics based on Deep Networks
Alexey Dosovitskiy, Thomas Brox


[connected] [loss, learning, function, class, show, best, set, even] [method, term, latent, supplementary, variational] [can, error, interesting, good, see, local, real, first] [space, two, euclidean, based, distribution, important, random, squared, metric, given] [model] [image, feature, alexnet, training, adversarial, network, generator, similarity, generative, shown, trained, convolutional, used, natural, discriminator, perceptual, deep, different, neural, conv, generate, using, architecture, figure, approach, proposed, use, inversion, work, similar, comparator, table, unsupervised, fully, eat, layer, generation, generated, uconv, reconstruction, train, encoder, generating, extracted, videonet, learn, dosovitskiy, inverting]
Automated scalable segmentation of neurons from multispectral images
Uygar Sümbül, Douglas Roossien, Dawen Cai, Fei Chen, Nicholas Barry, John P. Cunningham, Edward Boyden, Liam Paninski


[number, graph, basic, many, connected, obtained, expression, normalized, level] [problem, set, index, binary, show, algorithm] [size, due, method, university, processing, merging, large] [can, noise, projection, real, fluorescent, ieee, cij, one, analysis, much, imaging, denoised] [maximum, two, data, distance, based, density, slice, automated] [intensity, voxels, individual, simulated, neuronal, within, neuron, nature, variability, simple, anatomy, simulation, light, adjusted, occupied, along, form] [color, image, segmentation, supervoxels, brainbow, single, stack, different, voxel, neural, microscopy, supervoxel, approach, spatial, use, accuracy, figure, patch, nervous, background, using, used, similar, rand, foreground, topographic, various, dataset]
The Multiscale Laplacian Graph Kernel
Risi Kondor, Horace Pan


[graph, vertex, flg, kflg, definition, mlg, multiscale, base, level, structure, number, induced, subgraphs, karsten, subgraph, degree, topological, collection, purely] [let, defined, just, learning, set, define, conference, may, will, sensitive, now, assume, function, property] [compute, computing, machine, approximation, computed, size, requires, processing] [can, matrix, local, low, proposition, gram, also, first, small, one, eigenvalue, following, overall, orthonormal, similarly, rank, paper, spectral, eigenvectors, vector, invariant] [kernel, two, space, laplacian, corresponding, joint, given, basis, section, positive, random, risi, based, permutation, data] [information, new, literature] [feature, different, used, capture, international, multiple, single, neural]
Fast ε-free Inference of Simulation Models with Bayesian Conditional Density Estimation
George Papamakarios, Iain Murray


[number, probability, enough] [learning, algorithm, will, set, conference, rejection, make, lower] [mdn, abc, proposal, posterior, bayesian, approximate, inference, log, monte, parameter, approximation, carlo, efficient, learnt, likelihood, supplementary, mdns, conventional, processing, stochastic] [can, one, observed, need, exact, vector, order] [density, true, gaussian, sample, data, conditional, parametric, mixture, two, estimation, section, estimate, refer, procedure, given, population] [prior, model, time, left, process, typically] [used, neural, training, use, using, figure, effective, trained, work, learn, train, single, hidden, generative, approach, three, network, recognition, generated, propose, tanh, layer, calculated, per]
Reshaped Wirtinger Flow for Solving Quadratic System of Equations
Huishuai Zhang, Yingbin Liang


[number, chen] [loss, algorithm, function, complexity, problem, cost, case, set, proof, guarantee, expected, achieves, show, rate] [gradient, convergence, nonsmooth, nonconvex, step, faster, quadratic, log, method, converges, update, processing, although] [signal, phase, can, initialization, ati, via, retrieval, truncation, candes, computational, paper, also, global, wirtinger, real, solving, relative, noise, error, cdp, sign, comparison, magnitude, altminphase, recovery, much, spectral, reshaped, truncated, following, small] [sample, two, section, random, gaussian, based, given, geometric] [time, complex, poisson, direction, next, design, information] [figure, performance, neural, proposed, shown, flow, compare, applied]
Learning Infinite RBMs with Frank-Wolfe
Wei Ping, Qiang Liu, Alexander T. Ihler


[number, probability, average, sum, find] [algorithm, learning, convex, function, set, greedy, relaxation, general, best, since, max, stopping] [gradient, standard, likelihood, optimization, log, rbms, update, size, parameter, term, solve, inference, softplus, step, efficient, method, machine, boosting, latent, iteration, draw, initialize, marginal] [can, one, restricted, also, initialization, sparse, linear, following, proposition, error] [rbm, test, boltzmann, infinite, random, gibbs, distribution, provides, functional, exp, two, sampler, selected, bias, empirical, mixture] [model, directly, new] [hidden, using, training, fractional, neural, validation, use, unit, propose, figure, learned, used, mnist, deep, layer, better, part, weight]
Joint Line Segmentation and Transcription for End-to-End Handwritten Paragraph Recognition
Theodore Bluche


[weighted, vertical, complete] [may, conference, offline, workshop, set] [line, processing, full, standard, automatic, machine, step, method] [analysis, error, one, transcription, can, also, explicit] [document, section, word] [model, system, simple, implicit, state, followed, module, baseline, time, information, encoded] [recognition, neural, segmentation, text, handwriting, attention, collapse, handwritten, network, image, proposed, international, applied, layer, feature, recurrent, language, mdlstm, dpi, trained, preprint, arxiv, blstm, character, sequence, paragraph, presented, table, iam, decoder, architecture, softmax, whole, figure, validation, training, input, different, without, convolutional, transcribe]
Feature-distributed sparse regression: a screen-and-clean approach
Jiyan Yang, Michael W. Mahoney, Michael Saunders, Yuekai Sun


[number, possible, total, thus, enough] [algorithm, theorem, problem, cost, least, show, round, learning, absolute, set] [log, size, screening, distributed, machine, stage, convergence, reduce, deco, constrained, datasets, step, method, michael, ensures, iterates, fit, central] [sparse, error, sketch, lasso, regression, sent, matrix, sketching, linear, high, creena, sure, cleaning, martin, can, sparsity, concerned, order, contraction, main, condition, lean, cone, pilanci] [statistical, data, two, exp, estimator, independent, random, selected, selection, refer, theoretical] [communication, model] [approach, prediction, amount, figure, preprint, arxiv, similar, single, performance, network, using]
CMA-ES with Optimal Covariance Update and Storage Complexity
Oswin Krause, Dídac Rodríguez Arbonès, Christian Igel


[runtime, root, average, genetic] [algorithm, function, complexity, conference, learning, problem, every, krause, consider, annual] [update, cholesky, factor, evolutionary, triangular, optimization, objective, machine, rotation, standard, large, hansen, coordinate, rosenbrock, igel, discus, step, updating, term, denmark, ajj, cigar, diffpowers, efficient] [matrix, can, square, computation, one, decomposition, linear, also, sampled, error, note, global] [covariance, given, theoretical, sample, distribution, sphere, space, refer, normal] [time, evolution, benchmark, search, required, change, long, new, policy, value] [using, figure, adaptation, compared, approach, memory, original, propose, instead]
Unsupervised Learning of 3D Structure from Images
Danilo Jimenez Rezende, S. M. Ali Eslami, Shakir Mohamed, Peter Battaglia, Max Jaderberg, Nicolas Heess


[number, probabilistic, structure] [context, learning, show, appendix, consider, class, function, best, bound, set, conference] [inference, latent, operator, variational] [can, projection, observed, first, via, also, one, missing, recover] [data, conditional, given, provided] [model, directly, form, abstract, complex, inferred, infer] [representation, figure, generative, volume, training, image, object, volumetric, mesh, using, capture, trained, arxiv, different, network, dataset, shapenet, neural, opengl, scene, deep, generation, camera, vision, seen, use, train, performance, computer, renderer, shown, multiple, hidden, jimenez, conditioned, rendered, unsupervised, used, work, learn, generate]
Budgeted stream-based active learning via adaptive submodular maximization
Kaito Fujii, Hisashi Kashima


[many, number, obtained, partial] [adaptive, submodular, active, secretary, learning, algorithm, submodularity, set, setting, problem, expected, utility, maximization, selecting, conference, class, function, budget, selects, let, optimal, general, constant, give, rate, adaptivestream, adaptively, instance, observe, monotone, maximizing, budgeted, consider, golovin, assume, bound, return] [select, marginal, stochastic, sampling, objective, machine, step] [item, can, one, error, also, existing, first, real, analysis, denote] [theoretical, based, given, random, two, data] [policy, uncertainty, gain, realization, information, pool, state, world, prior, new, another] [stream, several, proposed, entire, international, propose, used, performance, labeled, figure]
Exact Recovery of Hard Thresholding Pursuit
Xiaotong Yuan, Ping Li, Tong Zhang


[strong, number, level, larger, implied] [theorem, will, minimizer, arbitrary, learning, set, loss, conference, algorithm, problem, bound, function, bounded, appendix, guarantee, greedy, let] [logistic, parameter, log, large, free] [recovery, htp, support, condition, can, regression, sparse, analysis, exact, restricted, success, sparsity, relaxed, linear, also, recover, following, vector, error, hard, exactly, proposition, global, thresholding, certain, jain, high, arg, yuan, sparsistency, nonzero, min, recovering, sufficient, significantly, sensing] [estimation, result, data, provided, statistical, two, given, numerical, journal, well, finite] [model, target, information, pursuit, supporting] [able, figure, performance, output]
Robustness of classifiers: from adversarial to random noise
Alhussein Fawzi, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard


[relation] [case, show, chosen, let, defined, conference, even, study, define, learning, general, close, label, class, binary, arbitrary, theorem, provide, now] [large, machine, factor, optimization] [noise, subspace, dimension, can, small, high, perturbation, assumption, analysis, noted, following, sufficiently, note, global, linear, fraction, support, condition, also, first, norm, minimal] [robustness, random, curvature, boundary, result, robust, regime, theoretical, two, datapoint, data, quantity, affine, precisely, dimensional, vulnerable, radius, empirical, nonlinear, behaves, estimated, given, provided] [decision, therefore, ball] [adversarial, classification, different, neural, deep, classifier, image, shown, figure, international, classified, arxiv, preprint]
Designing smoothing functions for improved worst-case competitive ratio in online optimization
Reza Eghbali, Maryam Fazel


[variable, piecewise] [competitive, ratio, algorithm, function, online, optimal, smoothing, problem, convex, bound, adwords, concave, since, nesterov, set, let, consider, umax, lemma, show, case, satisfies, lower, pseq, theorem, dseq, learning, derive, worst, monotone, satisfy, assume, regret, finding, duality, provide, defined, differentiable, equivalent, att, studied, covering, dsim, cost, orthant] [dual, optimization, separable, objective, primal, update, sequential, respect, conjugate, method] [can, assumption, note, linear, solution, cone, simultaneous, regularization, condition, solving, one, also, sufficient, optimality, order] [given, two, section, positive, provides, based, derived, point] [maximize, continuous, simplify, subject] [figure, preprint, arxiv, achieved, using]
Deep Learning for Predicting Human Strategic Behavior
Jason S. Hartford, James R. Wright, Kevin Leyton-Brown


[weighted, sum, number, theory, level] [utility, element, learning, depends, set, function, expected, every, max, best] [parameter, respect] [row, can, column, matrix, iterative, assumption, also, first] [distribution, corresponding, two, normal, section, given, test] [response, action, model, game, player, strategic, behavioral, scalar, human, form, behavior, respond, cognitive, allows, internal, experimental, optimize, payoff, modeling, sharpness, blue, quantal, arl, arj] [hidden, output, layer, input, using, architecture, network, pooling, performance, deep, feature, figure, final, training, unit, used, use, reasoning, representation, predicting, red, softmax, applying, neural, invariance, able, previous, original, learn]
Clustering with Bregman Divergences: an Asymptotic Analysis
Chaoyue Liu, Mikhail Belkin


[clustering, number, coefficient, probability] [set, function, loss, algorithm, optimal, will, case, convex, lemma, general, defined, setting, define, example, theorem, constant, show] [divergence, size, respect, converges, method, convergence, apply, large] [error, analysis, support, order, min, also, first, following, suppose, matrix, can, hessian, intuition, related, paper, power, hard, note, one] [bregman, quantization, distribution, euclidean, asymptotic, density, measure, squared, data, corresponding, point, theoretical, distance, limit, based, discrete, mahalanobis, finite, cube, limiting, well, centroid, graf, reduces] [continuous, cell, form, information, experimental] [figure, used, volume, shown, dataset]
Blind Attacks on Machine Learners
Alex Beatson, Zhaoran Wang, Han Liu


[thus, knowledge, either] [minimax, attacker, learner, risk, lower, bound, setting, blind, consider, set, informed, proof, problem, cam, learning, bounded, theorem, privacy, may, fano, corollary, upper, chooses, loss, drawn, service, arbitrary, assume, rate, optimal, provide] [dkl, parameter, standard, log, convergence, machine, divergence, size, university] [attack, malicious, injection, analysis, regression, linear, can, denote, observed] [data, distribution, uniform, section, estimation, true, family, density, estimator, estimating, given, provides, sample, differential, mean, robust, mutual, two, fixed, random, considering] [information, security, form, interest, value, framework, choice, ability] [training, work, effective, presented, network, used, dataset, aware]
Spatiotemporal Residual Networks for Video Action Recognition
Christoph Feichtenhofer, Axel Pinz, Richard Wildes


[average] [learning, loss, best, show] [size, batch, large, method, normalization] [can, also, first] [two, dimensionality] [temporal, spatiotemporal, action, model, time, across, human] [residual, convolutional, motion, appearance, layer, spatial, stream, recognition, network, training, relu, video, approach, architecture, input, use, image, table, performance, convnet, using, deep, convnets, field, resnet, idt, stride, flow, several, feature, pooling, used, work, fully, receptive, visual, resnets, three, trained, learn, optical, output, accuracy, stacking, classification, shown, connection, neural, mapping, initialized]
Minimizing Quadratic Functions in Constant Time
Kohei Hayashi, Yuichi Yoshida


[number, graph, probability, cut, independently, partition, whose, obtained, aij] [let, function, problem, lemma, every, set, define, algorithm, constant, optimal, uniformly, exists, theorem, least, bijection, pearson, inf, property, obtain, show, dikernels, minimization, assume, defined, prove, satisfies, want, analyze] [method, approximation, quadratic, optimization, machine, log, sampling, siam, divergence, requires] [can, dikernel, wpt, matrix, following, sampled, note, min, solution, vector, equipartition, error, real, one, norm, need, linear, small, via] [journal, kernel, measure, independent, denotes, distribution, random, measurable, two, data, numerical] [time, value] [sequence, dense, proposed, different, using, used]
Tensor Switching Networks
Chuan-Yung Tsai, Andrew M. Saxe, David Cox


[thus] [learning, equivalent, switching, even, may, function, since, scheme, consider, problem, rate, provide, define] [standard, gradient, expressive] [linear, can, analysis, also, tensor, error, still, vector, one, contraction, first, hence, much, via, following] [kernel, random, asymptotic, ridge, expansion, denotes, based, data, two, given] [backpropagation, inverted, time, learns, decision, activity, scalar, experimental, across] [deep, network, hidden, representation, activation, input, relu, using, unit, expanded, layer, cnn, readout, neural, nonlinearities, depth, convolutional, figure, entire, mlp, vec, different, lrc, deeper, vanishing, rnl, training, back, used, learn, activate, better, mnist, conveys]
Barzilai-Borwein Step Size for Stochastic Gradient Descent
Conghui Tan, Shiqian Ma, Yu-Hong Dai, Yuqiu Qian


[kong] [algorithm, function, convex, smoothing, learning, assume, may, prove, sensitive, show, idea, chosen, problem, unknown] [step, size, gradient, sgd, stochastic, method, svrg, convergence, technique, variance, line, diminishing, initial, compute, descent, strongly, computed, end, processing, large, dashed, full, usually, optimization, barzilai, ima, parameter, machine, dai] [can, also, following, linear, one, solving, see, first, need, comparison, running, min] [numerical, two, given, journal, fixed, section, correspond, denotes] [information, choice, search, solid, research, option] [use, different, propose, used, figure, neural, better, performance, using]
Variance Reduction in Stochastic Gradient Langevin Dynamics
Kumar Avinava Dubey, Sashank J. Reddi, Sinead A. Williamson, Barnabas Poczos, Alexander J. Smola, Eric P. Xing


[number, average, subset] [algorithm, set, cost, theorem, let, learning, function, show, since, constant, assume] [stochastic, aga, gradient, variance, log, langevin, convergence, gld, posterior, vrg, pass, monte, sgld, bayesian, step, mse, reduction, datasets, due, carlo, approximate, approximation, size, optimization, reduce, large, update, method, machine, term, standard, faster, however, smooth, sampling] [can, computational, regression, also, analysis, high, following, note, noisy, noise, much, component] [data, test, true, given, equation, independent, theoretical, mixture, two, distribution, empirical, section, based] [] [using, performance, use, figure, used, memory, dataset, approach, reducing, proposed, shown, similar]
Unified Methods for Exploiting Piecewise Linear Structure in Convex Optimization
Tyler B. Johnson, Carlos Guestrin


[piecewise, many, structure, include, resulting, relation, number] [algorithm, set, theorem, problem, since, convex, define, consider, function, lemma, lower, learning, appendix, minimizer, optimal, general, choose, minimizing, conference, svm, may, bound, let, case, toward, choosing, loss] [screening, working, optimization, dca, objective, suboptimality, progress, exploiting, machine, xik, dual, constrained, minimizes, subproblem, convergence, applies, scalability, litz, select, ggl] [can, solving, linear, also, group, lasso, one, gap, sparse, principled, solution, suppose, existing, via, support] [test, result, important, point, simpler, theoretical] [time, prior, choice, upon, simple, safe] [using, international, training, feature, used]
Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering
Michaël Defferrard, Xavier Bresson, Pierre Vandergheynst


[graph, localized, regular, coarsening, clustering, chebyshev, number, thus, vertex, fake, node, introduced, quality, level, normalized] [learning, complexity, may, set, conference, fast, defined, algorithm, loss] [size, efficient, processing, however, requires] [spectral, can, signal, local, computational, formulation, one, linear, matrix, vector, order, filtering, analysis, support, spline, ieee, via] [data, two, fourier, classical, basis, polynomial, mathematical, word, euclidean] [model, operation, time, future] [pooling, proposed, convolutional, cnns, table, neural, cnn, spatial, filter, figure, architecture, classification, use, text, input, mnist, accuracy, learn, learned, applied, convolution, deep, extract, domain, performance, meaningful, training, using, layer]
Mistake Bounds for Binary Matrix Completion
Mark Herbster, Stephen Pasteris, Massimiliano Pontil


[number, graph, made, thus, science] [bound, algorithm, learning, margin, online, binary, theorem, complexity, upper, conference, every, problem, set, mistake, max, setting, bounded, may, will, case, regret, lemma, best, corollary, define, optimal, london, uij, let, annual, example, observe, unknown, label, appendix, exists, learner, loss, inf, assume, smaller, kjj, make] [machine, log, although] [matrix, row, norm, via, see, following, analysis, denote, completion, entry, hence, vector, assumption] [kernel, perceptron, given, quantity, underlying, section, consistent, finite] [trace, trial, nature] [predicting, using, sequence, prediction, use, task, international, different, per]
Optimizing affinity-based binary hashing using auxiliary coordinates
Ramin Raziperchikolaei, Miguel A. Carreira-Perpinan


[cut, number, many, resulting, bit] [hash, binary, function, loss, hashing, learning, algorithm, mac, problem, ynm, ksh, precision, maccut, macquad, since, quad, will, finding, graphcut, set, fast, minimizer, esplh] [optimization, objective, free, step, term, quadratic, method, optimizes, apply, large, although, involves, auxiliary] [can, linear, one, first, vector, still, much, alternating, also, following] [given, nonlinear, kernel, two, embedding, data, space, point, section] [optimizing, continuous, optimize, framework, form, simply, value, directly] [using, use, training, original, image, better, learn, supervised, approach, work, proposed, different, cifar]
Operator Variational Inference
Rajesh Ranganath, Dustin Tran, Jaan Altosaar, David Blei


[program, many, find] [function, class, algorithm, consider, satisfies, learning, will, conference, sup, set, known] [variational, operator, objective, inference, posterior, log, divergence, respect, approximating, latent, stochastic, approximate, expectation, tractable, gradient, black, optimization, machine, approximation, develop, bayesian, requires, unbiased, method, calculate, factor, opvi, tractability, logistic, university] [can, require, zero, analysis, second, linear, one, note] [data, family, equation, test, distribution, two, density, mixture, distance, gaussian, given, statistical, positive, construct] [model, box, new, optimizing, design, continuous, typically, value, rich] [use, using, score, generative, neural, international, arxiv, different, preprint, truth]
Dueling Bandits: Beyond Condorcet Winners to General Tournament Solutions
Siddartha Y. Ramamohan, Arun Rajkumar, Shivani Agarwal


[pairwise, anytime, definition, nij, cycle, probability] [regret, tournament, set, dueling, elect, roc, algorithm, pij, winner, arm, bandit, copeland, condorcet, uij, let, uncovered, bound, satisfies, appendix, will, general, ucbs, conference, winning, upper, maximal, define, defined, pjk, ucb, else, theorem, uji, selects, return, max, borda, cumulative, always, confidence, case] [select, stochastic, end] [preference, can, condition, matrix, also, solution, following, relative, one, see] [selection, procedure, based] [safe, target, feedback, individual, trial, design, exploration] [top, figure, used, work, international, natural, three, without, pair]
Learning brain regions via large-scale online structured sparse dictionary learning
Elvis Dohmatob, Arthur Mensch, Gael Varoquaux, Bertrand Thirion


[number, constraint, thus, represent, often] [online, problem, learning, algorithm, defined, set] [variance, parameter, update, method, large] [dictionary, sodl, sparse, explained, can, sparsity, regularization, canica, via, one, penalty, analysis, impose, component, bcd, small, linear, onto, see, vanilla, also, imaging, hcp, good, matrix, decomposition, neuroimage, tcanica, overall, varoquaux] [data, functional, laplacian, mean, statistical, test, sample, estimated, corresponding, ridge, major] [model, brain, across, fmri, variability, current, raw, behavioral, cognitive] [proposed, structured, spatial, like, different, compared, use, performance, propose, training, using, work]
Noise-Tolerant Life-Long Matrix Completion via Adaptive Sampling
Maria-Florina F. Balcan, Hongyang Zhang


[probability, represent, many, number] [algorithm, complexity, online, bound, theorem, bounded, lower, least, problem, learning, assume, setting, let, study, best, passive, drawn, uniformly, conference, constant, adaptive, upper, guarantee] [sampling, deterministic, size, machine, log, supplementary] [matrix, noise, subspace, column, completion, can, error, sparse, recovery, noisy, dictionary, one, exact, rank, small, linear, arriving, norm, ieee, exactly, vector, noiseless, recover, outlier, assumption, global, via, denote, dimension, corrupted] [sample, underlying, random, space, two, result, mixture, data, estimated, given, robust, estimate] [information, prior, another, goal] [figure, hidden, clean, add, layer, without]
The Product Cut
Thomas Laurent, James von Brecht, Xavier Bresson, Arthur Szlam


[cut, graph, normalized, partition, pcut, ncut, cluster, vertex, number, purity, thus, partitioning, connected, quality, nmfr, obtained, weighted] [algorithm, set, theorem, will, relaxation, convex, problem, bound, lower, rate, let, general, may, define, version, provide] [objective, energy, optimization, large, supplementary, variant, iterates] [product, linear, denote, matrix, can, following, perturbation, convexity, relies, exact, stability, solution, conductance, algorithmic, small, quite] [data, random, two, provides, given, denotes, classical, theoretical, point, mathematical] [model, continuous, maximize, therefore, simple, time, experimental, optimize] [use, using, table, figure, sequence]
Learning Supervised PageRank with Gradient-Based and Gradient-Free Optimization Methods
Lev Bogolubsky, Pavel Dvurechensky, Alexander Gasnikov, Gleb Gusev, Yurii Nesterov, Andrei M. Raigorodskii, Aleksey Tikhonov, Maksim Zhukovskii


[number, level, graph, obtained, page, walk, probability, gfn, introduced] [set, algorithm, oracle, problem, function, lemma, choose, loss, general, learning, assume, convex, consider, theorem, complexity, gbn, inequality, obtain, upper, let, proof, lower, considered] [method, optimization, inexact, calculate, gradient, convergence, apply, supplementary, gbp, step, solve, parameter, restart] [vector, ranking, pagerank, following, can, also, first, power, surfer, iterative, solving] [random, stationary, test, point, distribution, calculation] [value, web, framework, required, unlike, allows, markov] [accuracy, supervised, using, used, use, different, figure]
Theoretical Comparisons of Positive-Unlabeled Learning against Positive-Negative Learning
Gang Niu, Marthinus Christoffel du Plessis, Tomoya Sakai, Yao Ma, Masashi Sugiyama


[probability, either, many] [risk, learning, misclassification, rate, theorem, bound, loss, rpu, will, let, tighter, rnu, function, assume, drawn, minimizers, rademacher, problem, surrogate, set, case, class, upper, lemma, least, may, smaller, since, defined, label, proof, remaining, even, satisfies] [size, machine, sampling, marginal, unbiased] [error, can, order, one, also, denote, analysis] [data, estimation, based, theoretical, given, positive, random, estimator, fixed, empirical, kernel, two, statistical, limit, density, journal, ordinary] [experimental, benchmark, artificial] [figure, three, gpu, unlabeled, table, using, compare, supervised, rpn, training, improve, classifier]
A Simple Practical Accelerated Method for Finite Sums
Aaron Defazio


[theory, relation, strong, number] [function, algorithm, rate, convex, theorem, chosen, known, set, general, loss, define, now, notation] [step, proximal, gik, accelerated, method, gradient, zjk, epoch, operator, suboptimality, sdca, saga, inner, gjk, stochastic, size, convergence, processing, incremental, catalyst, dual, recently, descent, log, primal, however, datasets, tong, shai, term, iterate, supplementary, efficiently, curran, lyapunov] [can, also, product, linear, note, main, see, following, condition, convexity, much] [finite, two, point, given] [fig, information, form, whereas] [used, using, neural, use, instead, applied, able, approach]
Feature selection in functional data classification with recursive maxima hunting
José L. Torrecilla, Alberto Suárez


[variable, recursive, number, base, rule, average, complete, whose, introduced] [optimal, function, set, class, relevance, close, problem, max, depends, considered] [standard, reduction, method, end, size] [error, can, pca, one, local, analysis, also, second, first, linear, regression, vector, plot] [functional, rmh, selection, data, selected, hunting, distance, pls, brownian, bayes, dimensionality, berrendero, maximum, corresponding, two, tmax, redundancy, important, space, correction, journal, empirical, tinf, test, statistical, based, section, procedure] [correlation, information, process, goal, value] [classification, feature, used, different, figure, relevant, using, accuracy, performance, training, approach, similar]
Kernel Observers: Systems-Theoretic Modeling and Inference of Spatiotemporally Evolving Processes
Hassan A. Kingravi, Harshal R. Maske, Girish Chowdhary


[number] [let, set, problem, lower, learning, function, index, bound, show, general, property, provide, may, will] [sampling, inference, latent, machine, supplementary, approximate, large, auto] [can, matrix, error, measurement, proposition, linear, note, monitoring, sensing, condition, sufficient] [kernel, nonstationary, gaussian, given, observability, random, covariance, functional, space, data, differential, two, spatiotemporally, section, wind, rkhs] [model, spatiotemporal, evolution, observer, state, system, process, modeling, autonomous, temperature, rms, shaded, observation, cyclic, time, required, design, temporal, pclsk, sensor, form, heuristic, modeled] [approach, using, figure, map, input, spatial, original, training, work, feature, presented, domain, use, prediction]
Linear Contextual Bandits with Knapsacks
Shipra Agrawal, Nikhil Devanur


[probability, total, present, definition, relation] [algorithm, opt, contextual, online, bound, regret, arm, problem, every, confidence, bandit, learning, context, lemma, budget, round, let, case, optimal, set, special, optimistic, ellipsoid, corollary, get, define, unknown, outcome, lincbwk, played, proof, defined, setting, consider, theorem, general, since, may, lower, conference, bounded] [consumption, stochastic, parameter, log, optimization, depend, update] [linear, can, vector, first, assumption, matrix, column, following, arg, also, denote, one, high] [given, estimate, section, distribution] [time, reward, policy, value, resource, required, maximize, dynamic, along, choice] [using, use, static, used, weight, work, similar]
Achieving budget-optimality with adaptive schemes in crowdsourcing
Ashish Khetan, Sewoong Oh


[assignment, worker, probability, average, number, fundamental, crowdsourcing, assigned, total, level, threshold, quality, assign, requester, made, majority, introduced, enough] [algorithm, adaptive, budget, difficulty, rate, theorem, bound, set, scheme, lower, achieve, round, constant, minimax, get, assume, make, since, best, conference, provide, choose, let, case, achieves, bounded] [inference, end, supplementary, log, processing, large, standard] [error, can, one, arriving, fraction, denote, following, sufficient, first, also, note] [distribution, given, random, limit, mean, based, true, section] [model, gain, choice, difficult, information, next, scaling, take] [task, using, per, neural, approach, performance, multiple, figure, classify, classification]
Search Improves Label for Active Learning
Alina Beygelzimer, Daniel J. Hsu, John Langford, Chicheng Zhang


[disagreement, probability, number, many, rune, loop] [earch, abel, learning, hypothesis, active, version, arch, counterexample, algorithm, set, label, complexity, example, let, class, query, theorem, least, may, rate, cal, ersion, cost, assume, lemma, oracle, agnostic, setting, consider, proof, case, realizable, appendix, return, earchh, learner, hanneke, call, balcan, ccq, just, now] [log, end, nested, iteration, step, processing] [can, error, union, practice, contain] [space, section, positive, consistent, provides, powerful] [search, target, information, current, substantially, new] [labeled, region, unlabeled, using, use, natural, neural, daniel, sequence]
A Credit Assignment Compiler for Joint Prediction
Kai-Wei Chang, He He, Stephane Ross, Hal Daume III, John Langford


[assignment, probabilistic, programming, dependency, program, number, many, cache, variable] [learning, loss, function, algorithm, define, best, online, may, show, make, set, svm, complexity, essentially, defined] [run, machine, factor, hyperparameters, end, optimization] [can, one, also, much, tagging] [joint, space, two, perceptron, test, reference, based, underlying, statistical, point] [search, policy, credit, state, tdolr, compiler, time, rollin, action, rollout, system, parser, current, ctb, ner, speed, markov, effect, code, making, complex, simple] [prediction, training, structured, sequence, performance, figure, use, different, output, input, approach, predict, library, neural, crf, using, language, classifier]
A Bio-inspired Redundant Sensing Architecture
Anh Tuan Nguyen, Jian Xu, Zhi Yang


[number, level, many, total, larger, degree, thus, theory, calibration, probabilistic, precise] [set, precision, known, even, may, ratio, optimal, defined, binary] [increase, processing, secondary, large, due, implementation] [error, can, component, sensing, high, power, computational] [mismatch, distribution, data, quantization, shannon, redundant, reference, redundancy, conversion, adc, two, maximum, binocular, limit, journal, digital, quantizer, differential, integrated, assembly, result] [information, human, allows, design, biological, intrinsic, artificial, heuristic, primary, simulation, eye, geometrical, nonlinearity] [figure, resolution, visual, proposed, without, unit, vision, effective, approach, different, physical, architecture, similar, amount, million, generate, field, neural, unsupervised, using]
Spatio-Temporal Hilbert Maps for Continuous Occupancy Representation in Dynamic Environments
Ransalu Senanayake, Lionel Ott, Simon O'Callaghan, Fabio T. Ramos


[obtained, number, probability, probabilistic, collected, regular] [learning, query, problem, assume, conference] [method, discussed, due, processing] [can, regression, main, also, computational] [section, point, kernel, hilbert, data, space, embedding, based, given, gaussian, distribution, two, random, equation] [dynamic, occupancy, model, time, new, laser, future, uncertainty, shm, process, grid, past, continuous, robotics, information, location, sensor, nll, state, raw, environment, dgm, occupied, hinged, ahead, merely] [motion, using, map, spatial, figure, used, mapping, predict, static, feature, dataset, object, predicting, approach, moving, international, frame, table, field, neural, similar]
Deconvolving Feedback Loops in Recommender Systems
Ayan Sinha, David F. Gleich, Karthik Ramani


[number, probability, influence, identify, likely, possible, induced] [algorithm, show, set, consider, considered, make, now] [line, datasets, parameter, supplementary, method, compute, due, validate] [matrix, recommender, rtrue, rating, observed, robs, user, collaborative, item, deconvolved, filtering, deconvolving, singular, assumption, ttrue, recommendation, plot, netflix, recommended, synthetic, can, see, also, first, hyperbola, recover, jester, rrecom, progressively, preference] [true, data, metric, based, given, density, equation] [feedback, model, system, effect, time, state, information, value, varying, straight, future, implicit] [score, figure, similarity, use, using, approach, dataset, higher, without, able, per, final]
Bi-Objective Online Matching and Submodular Allocations
Hossein Esfandiari, Nitish Korula, Vahab Mirrokni


[probability, balance, assign, weighted, wij, present, edge, total] [online, algorithm, allocation, submodular, set, budgeted, problem, super, greedy, competitive, theorem, ratio, lemma, welfare, let, hardness, swm, show, vahab, proof, upper, virtual, optimal, bij, provide, allocate, consider, maximization, offline, budget, almost, maximizing, since, tight, function, grp, bound, expected, nitish, special, max, define, may, optj] [objective, optimization, stochastic, run, approximation, dual, marginal, factor] [item, second, one, first, optimum, can, following, fraction, need, analysis] [two, based, result, exponential, curve] [agent, value, goal, gain, resource, maximize, model, assuming] [matching, weight, figure, adversarial, using, previous]
Scaling Factorial Hidden Markov Models: Stochastic Variational Inference without Messages
Yin Cheng Ng, Pawel M. Chilinski, Ricardo Silva


[number, message, passing, dependency, david, structure] [algorithm, learning, budget, set, binary, show, class] [variational, inference, fhmm, stochastic, computing, posterior, latent, bivariate, copula, large, elbo, fhmms, scalability, subchains, computed, approximate, factorial, respect, smf, marginal, machine, gradient, size] [can, also, computational, observed] [data, gaussian, distribution, length, chain, emission, given, random, space, equation, based, test, family, two] [markov, long, state, model, simulated, time, allows, information, correlation] [proposed, recognition, hidden, neural, network, sequence, using, learned, approach, different, compared, scale, training, use, structured, figure, per, table, propose, preprint, arxiv, validation]
Probing the Compositionality of Intuitive Functions
Eric Schulz, Josh Tenenbaum, David K. Duvenaud, Maarten Speekenbrink, Samuel J. Gershman


[structure, number, many, mechanical, average, base] [function, learning, set, since, chosen, best] [extrapolation, experiment, standard, showed, interpolation, university] [spectral, can, sampled, error, first, significantly, linear, via, regression, real] [compositional, kernel, mean, mixture, people, lxp, pxr, lxr, inductive, asked, basis, data, gaussian, radial, intuitive, lin, proportion, functional, space, grammar, predictability, two, given, rbf, perceived, well, chain, turk, expressed, distance, recruited, periodic, simpler, distribution, accepted, assessed, exp, pxlxr] [human, model, cognitive, complex, continuous, world, received, choice, markov, information] [figure, different, approach, input, used, per, shown, generate, using, structured, prefer, last, thought, ground]
Learnable Visual Markers
Oleg Grinchuk, Vadim Lebedev, Victor Lempitsky


[bit, thus] [learning, loss, conference, might, show, make, defined] [capacity, transforms, augmented, machine, size, stochastic] [can, certain, also, one, sampled, matrix, recover, gram, related, suitable] [two, robust, random, geometric, based, affine, well, associated, corresponding] [information, design, process, system, code, encoding, adapt, maximize, environment] [network, marker, recognizer, synthesizer, texture, visual, deep, approach, using, convolutional, computer, recognition, neural, image, vision, rendering, trained, used, figure, color, input, layer, use, architecture, implemented, spatial, different, sequence, pattern, learned, renderer, single, international, style, training, natural, work, printing, fiducial, blur, output, generate, pretrained, performance]
Improved Regret Bounds for Oracle-Based Adversarial Contextual Bandits
Vasilis Syrgkanis, Haipeng Luo, Akshay Krishnamurthy, Robert E. Schapire


[probability, admissible, number, thus, variable, definition] [max, algorithm, regret, contextual, bound, relaxation, cost, let, will, now, observe, upper, learner, learning, set, strategy, oracle, bandit, setting, minimax, rakhlin, online, lemma, rademacher, drawn, robert, problem, sridharan, optimal, access, bounded, show, defined, best, chooses, since, proof, assume, cumulative, consider, theorem, maxy] [optimization, coordinate, efficient, argmin, unbiased, term, computed, end, compute, standard] [can, min, also, first, denote, vector, one, following] [random, distribution, equal, based, maximum, quantity, given, mass, equation, section] [value, policy, framework, information, action, time, goal] [adversarial, sequence, using, use, recent, work]
A posteriori error bounds for joint matrix decomposition problems
Nicolo Colombo, Nikos Vlassis


[obtained, strictly] [problem, bound, defined, upper, set, conference, theorem, inequality, since, algorithm, proof, lower, feasible, show, case, learning, let, implies, smallest, function] [optimization, machine, approximate, operator, parameter, triangular, siam, processing, objective, due] [matrix, low, plow, decomposition, can, schur, analysis, triangularizer, orthogonal, signal, noise, skew, perturbation, exact, tensor, simultaneous, error, posteriori, triangularization, one, diagonalization, eigenvalue, observed, via, following, triangularizers, first, closest, ieee, norm, nonsymmetric, diagonalizable, synthetic, hence, also, frobenius, side] [joint, journal, two, empirical, estimation, manifold, distance, canonical, uinit] [] [used, international, jointly, use, using, approach, figure, ground]
LightRNN: Memory and Computation-Efficient Recurrent Neural Networks
Xiang Li, Tao Qin, Jian Yang, Xiaolin Hu, Tieyan Liu


[probability, node, number, unique, total, represent, larger] [will, algorithm, allocation, set, problem, achieves, loss, since, complexity, allocate, cost, learning, best] [size, large, log, processing, factor, calculate, datasets, key] [row, can, column, vector, also, one, computational, still, need, technical, matrix, see] [word, embedding, vocabulary, two, associated, based, test, distribution, given] [model, next, modeling, time, benchmark, hierarchical, state, share, information] [lightrnn, language, training, neural, table, rnn, recurrent, figure, perplexity, preprint, arxiv, shared, network, use, memory, aclw, compared, several, using, position, used, gpu, dataset, hidden, billionw, input, proposed, reducing, matching, natural, softmax]
On Mixtures of Markov Chains
Rishi Gupta, Ravi Kumar, Sergei Vassilvitskii


[number, probability, definition, grows, block] [let, algorithm, since, set, show, consider, learning, problem, every, lemma, will, exists, now, setting, case, make, defined] [full, compute, step] [matrix, error, can, first, rank, diagonal, note, one, denote, see, decomposition, also, recovery, shuffle, condition, column, vector, app, recover, recovering, spectral, real, entry] [mixture, chain, given, section, two, distribution, length, underlying, together] [markov, state, transition, starting, form, jth, research] [use, using, different, figure, reconstruction, hidden, recall, work, performance, approach, generated, reconstructing]
Understanding Probabilistic Sparse Gaussian Process Approximations
Matthias Bauer, Mark van der Wilk, Carl Edward Rasmussen


[adding, number, many, obtained] [function, show, learning, will, complexity, bound, always, conference, lower] [inducing, fitc, vfe, full, objective, marginal, qff, likelihood, fit, term, variance, posterior, heteroscedastic, variational, optimisation, kff, optimised, log, approximation, additional, lengthscales, extra, inference, snelson, approximate, method, processing, away, intelligence, university, optimiser, machine] [can, noise, sparse, penalty, good, remark, reduced, solution, also, still, see] [data, gaussian, true, covariance, section, mean, practical, two, common, conditional] [behaviour, process, model, predictive, artificial, change, information, placed, trace, uncertainty] [input, training, using, neural, dataset, without, figure, top, investigate, like, fully, improves]
Understanding the Effective Receptive Field in Deep Convolutional Neural Networks
Wenjie Luo, Yujia Li, Raquel Urtasun, Richard Zemel


[many, number, larger] [case, get, consider, study, make, may, show, loss] [size, gradient, variance, standard, central, large] [can, analysis, also, note, one, initialization, still, much, see, following, relative, assumption, linear, first, signal] [random, gaussian, theoretical, kernel, distribution, center, nonlinear, well, section, uniform, empirical, fourier] [effect, information, change, within, therefore, binomial] [receptive, field, erf, deep, effective, convolution, convolutional, output, image, input, training, use, network, impact, neural, pixel, unit, layer, weight, relu, cnn, dilated, cnns, different, semantic, arxiv, preprint, like, used, activation, understanding, figure, residual, single, object, classification, using, shown, trained]
Backprop KF: Learning Discriminative Deterministic State Estimators
Tuomas Haarnoja, Anurag Ajay, Sergey Levine, Pieter Abbeel


[probabilistic, piecewise, graph, knowledge, number] [since, conference, set, function, difficulty] [deterministic, standard, latent, method, gradient, normalization, inference] [can, computation, error, also, filtering, require, need] [estimation, test, two, based, nonlinear, distribution, space, well, corresponding, estimate, conditional] [state, model, observation, kalman, simple, raw, tracking, complex, time, directly, typically, optimize, backpropagation, design, corresponds] [neural, network, training, discriminative, recurrent, filter, lstm, generative, trained, feedforward, approach, using, bkf, figure, task, use, convolutional, image, visual, sequence, kitti, domain, international, dst, predict, used, train, camera, hidden, performance, entire, computer, shown, preprint, including, vision]
Recovery Guarantee of Non-negative Matrix Factorization via Alternating Updates
Yuanzhi Li, Yingyu Liang, Andrej Risteski


[potential, larger, level, negative, present, thus] [algorithm, case, will, theorem, even, general, since, learning, problem, function, show, proof, bound, consider, constant, max, lemma, guarantee, assume, let, upper, induction, assumed, heavy, focus, unknown] [large, unbiased, update, requires, provably, step, updating] [matrix, noise, can, decoding, small, also, factorization, order, assumption, one, much, interesting, note, nmf, still, related, signal, provable, recover, practice, sparse, popular, fraction, analysis, norm, see, alternating, denote] [data, topic, two, mild, theoretical, positive, robust, section] [model, potentially, simplified] [feature, adversarial, relu, use, work, different, intermediate, generative, used, similar]
Ancestral Causal Inference
Sara Magliacane, Tom Claassen, Joris M. Mooij


[causal, ancestral, aci, independence, mek, fci, hej, pkc, erk, jnk, plcg, pka, raf, cfci, akt, discovery, possible, weighted, anytime, number, acyclic, directed, probability, scoring, frequentist, reliability, relation, represents] [confidence, show, obtain, will, set, function, loss, precision] [standard, method, supplementary, latent, bayesian, optimization, inference, log] [can, order, also, one, synthetic] [test, data, observational, conditional, given, statistical, two, distribution, interventional] [bootstrapped, execution, direct, experimental, encoding, time] [using, use, figure, input, score, several, different, like, used, weight, propose, recall, approach]
SEBOOST - Boosting Stochastic Learning Using Subspace Optimization Techniques
Elad Richardson, Rom Herskovitz, Boris Ginsburg, Michael Zibulevsky


[number, added, many, balance] [algorithm, learning, set, problem, function, will, rate] [optimization, stochastic, method, sgd, seboost, step, descent, boosting, secondary, experiment, nag, gradient, size, sesop, adagrad, large, momentum, michael, although, sequential, usually, mse, processing] [subspace, can, one, error, anchor, overall, note, following, small, vanilla, regression] [test, based, section, two, point, spanned] [baseline, direction, current, process, effect, taking, time, change, significant, information] [different, figure, mnist, autoencoder, previous, original, deep, applied, used, neural, last, train, training, using, achieved, applying, classification, composed, better]
Learning Bound for Parameter Transfer Learning
Wataru Kumagai


[probability, definition, knowledge, often, number] [learning, bound, theorem, learnability, algorithm, show, set, satisfies, setting, consider, inequality, let, since, assume, conference, assumed, hypothesis, case, provide, gray, arbitrary, label, zhao, margin, expected, permissible, bounded, kind, exists, known] [parameter, machine, although, processing, large] [sparse, dictionary, stability, following, local, assumption, can, first, note, denote, matrix, perturbation, also, suppose, norm, regularization, paper, noise] [section, parametric, sample, based, space, theoretical, data, radius, estimator, exp, refer, useful] [target, coding, model, corresponds] [transfer, source, region, feature, representation, mapping, unlabeled, neural, used, performance, domain, effective, international, approach, using, applying, unsupervised, task, yang]
Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling
Jiajun Wu, Chengkai Zhang, Tianfan Xue, Bill Freeman, Josh Tenenbaum


[probabilistic, structure] [learning, show, loss, observe] [latent] [can, vector, also, following] [space, distribution, two, based, sample, test, data] [model, modeling, framework, demonstrate] [object, generative, training, adversarial, image, figure, shape, discriminator, representation, generator, generated, learned, use, network, single, convolutional, deep, without, volumetric, reconstruction, classification, ikea, used, table, sharma, girdhar, synthesis, synthesize, generate, voxel, discriminative, consists, using, unsupervised, neural, input, previous, thomas, evaluate, performance, semantic, radford, preprint, recent, synthesized, visualize, novel, supervised, generating, proposed, arxiv, trained, able, different, dataset, supervision, higher, last]
An urn model for majority voting in classification ensembles
Victor Soto, Alberto Suárez, Gonzalo Martinez-Muñoz


[ensemble, number, siba, disagreement, voting, complete, queried, urn, probability, average, pruning, lookup, possible, partial, vote, knowledge, majority, halted, made, querying, sonar, hypergeometric] [class, learning, expected, conference, instance, set, confidence, equivalent, label, remaining, case, problem, rate, considered, upper, algorithm, will] [hyper, machine, compute, size, method, faster, full] [can, one, error, analysis, proposition, following] [distribution, uniform, given, estimate, random, statistical, specified, closer, test, estimated, based, fixed, data] [prior, process, decision, dynamic, individual, assuming, new] [using, used, table, prediction, classification, training, different, international, color, proposed, final, use, classifier]
Large Margin Discriminant Dimensionality Reduction in Prediction Space
Mohammad Saberian, Jose Costa Pereira, Nuno Vasconcelos, Can Xu


[number, combination, possible] [learning, multiclass, algorithm, set, svm, margin, duality, class, risk, loss, example, defined, hashing, max, optimal, binary] [boosting, codewords, discriminant, codeword, mcboost, large, reduction, method, iteration, update, gradient, optimization] [can, linear, dimension, note, arg, sign, ieee, pca, error] [predictor, dimensionality, data, space, embedding, kernel, based, two, given, dimensional, basis, wzi, preserving, procedure] [decision, current, learns, traffic, new] [ladder, using, learn, mapping, use, figure, prediction, classifier, classification, learned, neural, propose, deep, feature, implemented, intermediate, proposed, compared, image, multiple, different, performance, scene, weak, training, dataset, jointly, table, approach, computer, used, sij]
Deep Alternative Neural Network: Exploring Contexts as Early as Possible for Action Recognition
Jinzhuo Wang, Wenmin Wang, Xiongtao Chen, Ronggang Wang, Wen Gao


[number, connected, larger, often, split, possible, contains] [adaptive, arbitrary, context, class, learning, competitive] [size, method, preserve, standard, clipping] [can, local, first, one, also, comparison] [length, data, kernel, sample, two] [temporal, action, alternative, early, time, current, determine, model, human, followed] [layer, recurrent, input, dann, volumetric, video, convolutional, deep, pooling, network, neural, table, flow, figure, pyramid, optical, performance, recognition, feature, spatial, visual, impact, fully, training, using, semantic, architecture, use, unit, similar, preprint, arxiv, used, motion, clip, output, accuracy, different, previous, investigate, applied, three, fusion]
Latent Attention For If-Then Program Synthesis
Chang Liu, Xinyun Chen, Eui Chul Shin, Mingcheng Chen, Dawn Song


[program, either, sum] [function, learning, set, active, observe, will, achieve, problem, best, show, consider, example] [latent, standard, size, supplementary] [can, also, first, one, existing, dictionary] [embedding, two, token, refer, test, data] [action, model, new, determine, code, take, simple, prior] [attention, trigger, training, using, figure, language, natural, accuracy, neural, prediction, channel, output, softmax, semantic, description, input, better, work, network, task, predicting, approach, instagram, parsing, weight, use, three, ensembling, rebalanced, sequence, trained, dataset, synthesis, bdlstm, different, preprint, train, arxiv, performance, recipe, improve, presented, previous, similar]
End-to-End Goal-Driven Web Navigation
Rodrigo Nogueira, Kyunghyun Cho


[node, number, graph, page, evaluating, allowed, many, find, probability, thus, edge, average] [query, set, learning, example, make] [step, challenging, away] [one, first, vector, wikipedia, can, also] [based, test, two, given, maximum, selected, word] [agent, web, target, navigation, neuagent, wikinav, starting, search, information, world, action, state, current, artificial, human, focused, benchmark, website, webnav, model, outgoing, kentuchy, stop, time, next, software, making, frec, english, navigates, neuagents] [task, proposed, natural, content, neural, language, table, use, whole, description, training, trained, hidden, pretrained, using, evaluate, consists, representation, derby, used, understanding, beam]
A Non-convex One-Pass Framework for Generalized Factorization Machine and Rank-One Matrix Sensing
Ming Lin, Jieping Ye


[probability, recursive, called, chen, whose] [lemma, theorem, algorithm, will, learning, convex, since, define, rate, bound, constant, complexity, bounded, show, hold, least, proof, version] [operator, efficient, step, convergence, standard, sampling, machine, key, method, requires] [matrix, sensing, gfm, rank, order, symmetric, recovery, rip, can, alternating, suppose, via, condition, low, norm, factorization, provable, noise, noisy, generalized, perturbation, following, denote, first, linear, one, need, second, completion, main, singular, prateek, sampled, error, high, cai] [estimation, provided, independent, theoretical, random, construct, gaussian, section, fixed, construction, based] [framework, target, therefore, trace, value, information] [sequence, training, several, proposed, using, feature]
Regularized Nonlinear Acceleration
Damien Scieur, Alexandre d'Aspremont, Francis Bach


[thus, number, degree, chebyshev] [problem, algorithm, max, convex, bound, will, let, now, optimal, function, assume, case, since, rate, oracle, defined, scheme] [gradient, method, acceleration, optimization, convergence, iterates, ampe, extrapolation, rmpe, parameter, conjugate, computed, solve, siam, accelerated, approximate, smooth, respect, compute, strongly, solves] [can, solution, linear, matrix, minimal, min, regularized, vector, condition, following, proposition, small, order, iterative, also, solving, norm, need, eigenvalue, note, error, regularization, perturbation, explicit, optimum, formulation, generic] [polynomial, nonlinear, numerical, estimate, point, equal, section, journal, fixed, equation, given, classical, mathematical, result] [information, control] [using, use, figure, without, similar, produced, performance]
Optimal Learning for Multi-pass Stochastic Gradient Methods
Junhong Lin, Lorenzo Rosasco


[number, probability, larger, total] [learning, optimal, inf, corollary, let, show, consider, least, setting, since, function, proof, derive, algorithm, theorem, rate, case, set, defined, stopping, latter, unknown, studied] [sgm, stochastic, gradient, batch, pass, convergence, capacity, machine, processing, size, variance, parameter, term, large, lead, distributed, optimization] [error, can, one, assumption, computational, regularization, also, see, main, regression, first, following, order, note, related] [sample, bias, fixed, considering, independent, given, data, measure, generalization, two, space, empirical, result, well, random, finite] [simple, information, choice, effect] [using, neural, three, multiple, sequence, different, figure]
Learning Structured Sparsity in Deep Neural Networks
Wei Wen, Chunpeng Wu, Yandan Wang, Yiran Chen, Hai Li


[structure, number, average] [learning, achieve, show, achieves, even, conference] [speedup, method, reduce, efficient, approximation, university] [sparsity, can, error, regularization, group, computation, lasso, also, rank, low, matrix, zero, sparse, order, note, first] [] [neuron, model, baseline, within] [ssl, convolutional, figure, dnn, structured, deep, layer, filter, accuracy, lenet, alexnet, neural, depth, table, weight, cpu, learn, gpu, convnet, network, preprint, arxiv, without, flop, input, dnns, regularize, shape, learned, classification, feature, different, resnet, proposed, compared, higher, use, using, xeon, achieved, original, trained, work, connection, zeroed, residual]
Estimating the Size of a Large Network and its Communities from a Random Sample
Lin Chen, Amin Karbasi, Forrest W. Crawford


[number, ulse, nsum, vertex, total, probability, edge, graph, pendant, social, larger, degree, thus, blockmodel, becomes, vary, community, block, denoted, subgraph, induced, men] [let, observe, algorithm, set, pij, smaller, study, theorem, deviation, assume, general, problem] [size, posterior, sampling, likelihood, stochastic, parameter, proposal, large, variance, compute, method] [relative, error, also, sampled, can] [sample, estimation, random, given, distribution, population, estimating, estimate, type, people, mean, estimator, joint, data, bias, two] [effect, model, must, value, prior] [network, performance, propose, randomly, different, better, using, shown]
Asynchronous Parallel Greedy Coordinate Descent
Yang You, Xiangru Lian, Ji Liu, Hsiang-Fu Yu, Inderjit S. Dhillon, James Demmel, Cho-Jui Hsieh


[variable, number, often, rule, block, solver, partition] [algorithm, svm, greedy, will, learning, rate, best, set, function, bound, since, define, assume, problem, selecting] [coordinate, descent, asynchronous, convergence, parallel, stochastic, step, gradient, update, machine, gcd, faster, speedup, objective, covtype, dual, method, optimization, webspam, lres, select, conduct, iteration, libsvm, distributed, asynchronously, rik, large, parallelizing] [can, following, linear, solving, note, matrix, first, also, vector, much, one, projected, analysis, error, global, column] [kernel, section, theoretical] [time, scaling, cyclic] [memory, using, used, training, use, figure, approach, proposed, randomly, work, multiple, scale, implemented, shared, stored]
Strategic Attentive Writer for Learning Macro-Actions
Alexander Vezhnevets, Volodymyr Mnih, Simon Osindero, Alex Graves, Oriol Vinyals, John Agapiou, Koray Kavukcuoglu


[possible, notice] [learning, defined, general, every, function, set] [update, step, experiment, stochastic] [can, one, first, matrix, also] [section, two, gaussian, random, distribution, useful, given] [straw, action, plan, reinforcement, time, temporal, agent, strawe, state, maze, planning, next, goal, learns, atari, commitment, exploration, model, value, attentive, policy, environment, reward, module, observation, demonstrate, pacman, temporally, thereby, frostbite, followed, corresponds] [network, neural, figure, attention, lstm, using, deep, use, prediction, training, feature, sequence, architecture, representation, trained, score, patch, used, learned, character, convolutional, recurrent, structured, text, proposed, layer, without, learn, different, generate, preprint]
Dynamic Mode Decomposition with Reproducing Kernels for Koopman Spectral Analysis
Yoshinobu Kawahara


[obtained, number, describe] [set, algorithm, let, consider, learning, known, since, defined, define, considered] [operator, method, approximation, although, calculate, machine] [can, decomposition, matrix, analysis, subspace, spectral, linear, orthogonal, principal, eigenvalue, krylov, one, onto, note, denote, via, vector, gram] [koopman, nonlinear, dmd, kernel, data, eigenfunctions, given, empirical, corresponding, based, journal, procedure, estimated, section, spanned, fluid, reproducing, eigendecomposition, pod, ritz, true, space, basis, distance, toy, principle, observables, two, finite, embedding, calculation] [dynamical, system, mode, dynamic, change, value] [using, feature, proposed, sequence, perform, several, applied, neural, extracted, presented, used, map, prediction, approach]
Computational and Statistical Tradeoffs in Learning to Rank
Ashish Khetan, Sewoong Oh


[pairwise, ordered, subset, graph, number, ordering, among, grb, topology, resulting, partition, lrb, partial, prb, poset, paired, edge, sum] [bound, lower, oracle, set, dependence, provide, complexity, learning, concave, max, upper, utility, let, problem, fix, theorem, choose, general, since, show, chosen, define] [size, log, optimization, large, processing] [computational, order, error, can, user, ranking, inconsistent, generalized, rank, following, one, spectral, analysis, denote, item, first, ordinal, also, remark] [sample, data, estimator, provides, consistent, statistical, mle, canonical, estimate, finite, random, provided] [model, choice, information, simple, significant, preferred] [figure, effective, extracted, proposed, use]
Flexible Models for Microclustering with Application to Entity Resolution
Brenda Betancourt, Giacomo Zanella, Jeffrey W. Miller, Hanna Wallach, Abbas Zaidi, Beka Steorts


[cluster, nbd, nbnb, entity, number, pyp, microclustering, clustering, partition, probability, negative, record, exchangeable, false, grow, create, singleton, linkage, italy, date, science, obtained, categorical, implicitly] [set, property, class, appendix, algorithm, will, define, assume] [size, posterior, sampling, bayesian, inference, large, university, negbin, variant] [can, one, analysis, require, via, noisy, assumption] [data, mixture, distribution, random, equation, two, statistical, maximum, mean, finite, dirichlet, percentile, fixed, conditional, infinitely, section, true, survey, empirical, journal, address, yield] [model, prior, process, exhibit, value, belief] [resolution, used, using, four, sequence, approach, generated, three, figure, perform, compare]
Object based Scene Representations using Fisher Scores of Local Subspace Projections
Mandar D. Dixit, Nuno Vasconcelos


[combination, obtained] [class, problem, conference, show, function, best, set] [log, factor, variance, respect, large, due, gradient] [vector, can, sparse, local, second, order, linear, ieee, comparison, much] [fisher, gmm, covariance, mixture, based, gaussian, mean, dimensional, two] [model, information, observation, modeling, coding, state] [scene, cnn, object, image, classification, transfer, cnns, recognition, mfa, computer, shown, deep, performance, using, alexnet, vision, table, representation, score, mit, vgg, sun, indoor, used, feature, trained, better, dataset, task, hidden, neural, holistic, accuracy, use, similar, pooling, pattern, patch, training, outperforms, extracted]
Balancing Suspense and Surprise: Timely Decision Making with Endogenous Information Acquisition
Ahmed Ibrahim, Mihaela Van Der Schaar


[path, probability, partition, structure, tuple] [optimal, stopping, every, theorem, since, risk, cost, function, will, show, whenever, set, observe, interval, instance, context, observes, assume] [processing, posterior, bayesian, sequential, stochastic] [can, following, hence, order, observed, sensing, denote] [survival, sample, two, random, given, estimate, space, based, journal, type] [time, information, policy, process, belief, adverse, decision, event, realization, new, series, model, rendezvous, next, gain, current, predictable, action, gathering, decides, continuation, unlike, deadline, sensory, stop, whether, occurrence, whereas, surprise, care, acquiring, characterize, exemplary, timely, observing] [neural, figure, generated, captured]
Dialog-based Language Learning
Jason E. Weston


[thus, negative, partial, external] [learning, learner, set, setting, will, case, consider] [machine, datasets] [can, also, one, signal, see, first] [positive, given, described, fixed, test, provided, useful, well] [feedback, answer, teacher, forward, imitation, model, dialog, supplied, supporting, asking, goal, correct, reinforcement, expert, babi, information, reward, response, another, rbi, policy, imitating, student, read, movieqa, state] [task, supervision, language, use, memory, prediction, using, learn, natural, dataset, question, training, output, neural, input, different, network, supervised, predict, used, work, help, text, table, preprint, answering, arxiv, incorrect, relevant, accuracy]
A Bandit Framework for Strategic Regression
Yang Liu, Yiling Chen


[effort, worker, level, payment, exerting, partial, quality, number, collected, exert, sum, acm, exertion, scoring, rule, bne] [will, online, privacy, bandit, learner, index, set, learning, idea, consider, every, function, assume, case, bounded, show, least, competitive, best, theorem, conference, setting, incentivize] [log, update, term, convergence, variance, step] [can, regression, linear, noise, also, one, suppose, following, order, denote, much, solution, note, need, first] [data, two, selection, selected, estimator, bias, ridge, well, sample, differential] [time, model, acquisition, strategic, framework, design, information, target, future, change] [different, using, work, training, task, instead, trained, add, top, performance, propose, mechanism]
Doubly Convolutional Neural Networks
Shuangfei Zhai, Yu Cheng, Zhongfei (Mark) Zhang, Weining Lu


[number, dcnns, consistently, larger, denoted] [set, learning, show, define, conference] [size, standard, parameter, datasets, efficient, processing] [also, can, first, one] [two, data, maximum, section, together] [model, along, correlation, corresponds, information] [convolutional, layer, convolution, dcnn, doubly, neural, deep, filter, double, pooling, cnn, output, meta, image, arxiv, preprint, spatial, table, effective, network, performance, shown, maxout, three, shape, architecture, used, figure, without, different, learned, several, training, compared, translation, vggnet, maxoutcnn, cnns, augmentation, translated, computer, classification, input, alexnet, memory, dataset, improves, sharing, channel]
Deep Exploration via Bootstrapped DQN
Ian Osband, Charles Blundell, Alexander Pritzel, Benjamin Van Roy


[present, many, find, ensemble, number, ian] [learning, algorithm, may, function, optimal, appendix, even, thompson, cumulative, improved, consider, every, strategy, randomized] [efficient, bootstrap, approximate, large, posterior, faster, sampling, maintain] [can, via, require, also, small, benjamin, one, computational, order, initialization] [sample, data, generalization, distribution, random, computationally] [bootstrapped, dqn, exploration, value, agent, reinforcement, state, uncertainty, action, time, across, policy, head, unlike, atari, reward, upon, target, van, dithering, human, planning, extended, complex, temporally, simple] [deep, network, neural, figure, performance, several, approach, single, arxiv, work, preprint, shared, trained, learn, similar, effective]
Relevant sparse codes with variational information bottleneck
Matthew Chalk, Olivier Marre, Gasper Tkacik


[obtained, variable, vertical, relation, subset, thus, find] [algorithm, bound, function, lower, maximizing, just, consider, set] [kib, log, variational, bottleneck, processing, respect, objective, additional, occlusion, variance, standard, latent, likelihood, derivative] [sparse, can, decoding, one, linear, noise, side, iterative, occluded, also, small, zero] [gaussian, kernel, distribution, data, fixed, expansion, way, selection, two, test, closely] [information, encoding, response, coding, model, bar, encoded, unlike, sensory, behaviour, rni] [figure, input, image, neural, used, presented, using, training, approach, relevant, shape, natural, use, original, performance, representation, learned, handwritten]
Optimal Black-Box Reductions Between Optimization Objectives
Zeyuan Allen-Zhu, Elad Hazan


[community, probability, detection, graph, threshold, vertex, number, conjecture, nonbacktracking, abp, propagation, block, definition, adjacent, average, achieving, enough, degree, likely, assign, expect, sum, achievability, detect, cycle, presence, linearized, subtract] [algorithm, let, will, set, general, prove, proof, drawn, define, version, return, exists, least] [stochastic, large, solves, step, compute, initial] [one, matrix, also, approximately, can, spectral, symmetric, eigenvector, order, following, vector, eigenvalue, sparse, high, dominant, first] [length, two, random, equal, distribution, described, bias, positive, way, based] [model, belief, value, simply, snr, new] [different, instead, approach, part, use, multiple]
A Constant-Factor Bi-Criteria Approximation Guarantee for k-means++
Dennis Wei


[cluster, potential, clustering, base, resulting, adding, number, probability, report] [lemma, algorithm, optimal, theorem, bound, case, proof, constant, uncovered, set, since, corollary, implies, guarantee, ratio, known, let, bounded, selecting, satisfy, expected, june, problem, contribution, march, cost, function, assume] [approximation, sampling, factor, expectation, step, respect, term, supplementary, size] [also, first, can, one, technical, running, local, paper, sufficient, denote, main, minimum, existing, following, min] [given, section, center, two, inductive, result, data, euclidean, provided, sense, metric] [covered, search, taking, new, upon, therefore] [using, shown, used, work, randomly, improves]
Data driven estimation of Laplace-Beltrami operator
Frederic Chazal, Ilaria Giulini, Bertrand Michel


[graph, probability, calibration, laplacians, introduced, theory, according, many] [function, theorem, defined, let, consider, max, oracle, inequality, problem, learning, risk, bound, adaptive, lemma, selecting, known, assume, least, instance, constant, pointwise] [method, variance, operator, convergence, smooth, term, machine, standard, large, approximation] [can, following, paper, spectral, first, proposition, see, analysis, sampled] [laplacian, given, bandwidth, section, data, estimator, bias, exp, procedure, estimation, riemannian, manifold, sphere, mikhail, selected, family, mathematical, partha, kernel, quantity, belkin, finite, sample, gaussian, two, metric, jump, density, geometric] [model, driven] [approach, proposed, various, pascal, figure, previous]
Causal Bandits: Learning Good Interventions via Causal Inference
Finnian Lattimore, Tor Lattimore, Mark D. Reid


[causal, graph, variable, intervention, number, weighted, identification, knowledge, thus, possible] [bandit, algorithm, regret, problem, optimal, case, general, learning, set, arm, may, will, depends, focus, best, learner, consider, cumulative, show, theorem, selecting, known, assume, class, unknown, observe, max, contextual] [parallel, sampling, extra, importance, sequential, stochastic, standard, large, variance, log] [can, also, observed, low, one, side, via, truncated, analysis] [observational, distribution, estimate, fixed, given, estimator, conditional] [reward, simple, model, action, value, feedback, information, choice, simultaneously, framework, experimental, design, future] [used, use, figure, improve, work, applying]
Structured Prediction Theory Based on Factor Graph Complexity
Corinna Cortes, Vitaly Kuznetsov, Mehryar Mohri, Scott Yang


[graph, present, number, theory, possible, many, dependency] [loss, learning, general, complexity, margin, hypothesis, upper, will, set, function, may, bound, rademacher, voted, theorem, defined, surrogate, lemma, arbitrary, max, convex, special, proof, bounded, show, appendix, vcrf, example, assume, case, consider, hold, known] [factor, log, size, standard, term, processing, machine] [can, also, following, analysis, explicit, linear, sparsity, first] [section, empirical, based, random, generalization, family, conditional, result, space, given, principle, sample, theoretical, additive] [new, complex, markov, along] [structured, prediction, used, using, output, use, field, natural, language, several]
An Architecture for Deep, Hierarchical Generative Models
Philip Bachman


[merge, indicates, structure, many] [learning, conference, bound, show, provide, set] [latent, inference, stochastic, machine, log, variational, deterministic, processing, sequential, stage, posterior, lsun, sampling] [can, one, also, first, see, local, second] [conditional, distribution, data, gaussian, mixture, described, section, two] [model, module, state, information, hierarchical, prior, measured, current, ability, trainable] [network, generative, using, performance, matnets, used, use, matnet, international, deep, architecture, neural, three, image, output, cifar, depth, training, modelling, figure, trained, final, meta, quantitative, omniglot, residual, convolutional, updated, generated, ladder, back, input, work, generation, reconstruction]
Fast recovery from a union of subspaces
Chinmay Hegde, Piotr Indyk, Ludwig Schmidt


[number, block, structure] [complexity, algorithm, let, theorem, problem, constant, appendix, general, show, set, guarantee, arbitrary, achieve, case, give, achieves] [approximate, approximation, gradient, faster, size, descent, apply, large, log, requires] [matrix, recovery, projection, subspace, linear, can, also, exact, tail, running, svd, singular, vector, order, following, svp, note, krylov, small, svds, sparse, sufficiently, rip, compressive, algorithmic, sparsity, recovering, hence, orthogonal, union, projected] [sample, two, section, important, corresponding, statistical, result, empirical] [time, head, framework, value, model, design, prior, new] [use, structured, using, work, approach, several, better, already, similar]
Synthesizing the preferred inputs for neurons in neural networks via deep generator networks
Anh Nguyen, Alexey Dosovitskiy, Jason Yosinski, Thomas Brox, Jeff Clune


[quality, many, thus] [learning, set, modified, best, show, maximization] [method, invert] [can, one, highly, global, synthetic, real, also] [space, two, well] [neuron, prior, preferred, code, model, information, human, whether, target, found] [image, trained, dnn, network, different, deep, learned, activation, neural, feature, generator, imagenet, layer, caffenet, architecture, convolutional, hidden, visualizing, natural, training, visualize, generative, dataset, shown, generalizes, recognition, output, encoder, synthesizing, input, dnns, activates, figure, classify, vision, train, visualized, visualization, previous, dgn, produce, computer, pattern, activate, generalize, alexnet, without, qualitatively, understanding, improve, produced, mit, synthesized]
Local Minimax Complexity of Stochastic Convex Optimization
Sabyasachi Chatterjee, John C. Duchi, John Lafferty, Yuancheng Zhu


[department, number, flat, david, larger] [minimax, convex, complexity, algorithm, function, modulus, inf, risk, continuity, binary, optimal, oracle, sup, set, consider, class, show, define, lower, let, superefficiency, difficulty, subgradient, constant, interval, will, now, rate, case, adaptive, general, problem, defined, achieves, example, satisfies, lipschitz, query, proof, logarithmic, setting] [stochastic, optimization, gradient, descent, log, university, stepsize, derivative] [local, error, analysis, can, computational, minimum, suppose, also, noise, note, following, min, main, arg] [given, result, statistical, point, estimation, section, analogue, polynomial, random] [search, information, benchmark, current, optimizing, alternative, form] [traditional, work, figure]
Measuring Neural Net Robustness with Constraints
Osbert Bastani, Yani Ioannou, Leonidas Lampropoulos, Dimitrios Vytiniotis, Aditya Nori, Antonio Criminisi


[find, constraint, according, signed, number, describe] [algorithm, set, show, example, since, convex, label, may, learning, pointwise, feasible, function, finding, consider, notion, problem, conference] [optimization, gradient, approximate, compute, computing, parameter, improving] [can, linear, frequency, also, def, perturbation, solving, one, high] [robustness, robust, test, given, distance, two, based, nearest] [baseline, substantially, found, search] [adversarial, neural, net, using, figure, training, use, input, used, work, accuracy, severity, deep, original, network, lenet, mnist, approach, improve, nin, arxiv, relu, layer, generated, region, preprint, compared, trained, image, labeled, fails, evaluate]
Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks
Tianfan Xue, Jiajun Wu, Katherine Bouman, Bill Freeman


[possible, probabilistic, often] [algorithm, show, problem, set, learning, function] [variational, latent, method, variance, deterministic, michael] [also, can, observed, real, first] [distribution, conditional, difference, sample, mean, test, given, two, kernel, toy, reference] [model, future, learns, movement, next, simple] [motion, image, network, figure, input, frame, convolutional, using, video, training, cross, visual, different, prediction, generative, feature, representation, encoder, single, proposed, layer, deep, flow, learned, without, dataset, synthesize, map, synthesis, field, unsupervised, propose, neural, shown, learn, multiple, decoder, autoencoder, scale, rgb, predict, use, novel, work, able, optical, recognition]
Efficient and Robust Spiking Neural Circuit for Navigation Inspired by Echolocating Bats
Bipin Rajendran, Pulkit Tandon, Yash H. Malviya


[detection, average, obtained] [rate, even, function, constant, will, chosen] [sampling, processing, increase, aim] [noise, signal, can, error, frequency, success, also, observed] [two, uniform, based, additive, difference, inspired, section, fixed] [spike, azimuth, model, snn, poisson, rms, bat, spiking, head, intensity, angle, encoding, target, time, left, neuron, tracking, pid, information, particle, ear, right, response, lso, system, echolocation, avcn, varying, turn, navigation, khz, dnll, received, biological, superior, effect, arrival, interaural, echo, aor, receiver, synaptic, within, angular] [network, input, source, sound, figure, performance, neural, higher, output, layer, detected, using, compared, use, different]
Short-Dot: Computing Large Linear Transforms Distributedly Using Coded Short Dot Products
Sanghamitra Dutta, Viveck Cadambe, Pulkit Grover


[block, number, node, combination, subset] [strategy, set, expected, now, let, case, provide, complexity, exists, problem, learning, since] [parallel, computing, compute, transforms, processing, distributed, computed, parameter, faster, requires, size, reduce, university] [matrix, computation, can, row, linear, dot, processor, sparsity, uncoded, column, vector, one, repetition, also, short, analysis, sufficient, wait, indexed, bcols, square] [length, given, maximum, data, redundancy, exponential, mean] [time, required, experimental, encoding, divide, choice] [using, generate, fusion, different, figure, table, compare, shown, pattern, single, task, entire, instead]
VIME: Variational Information Maximizing Exploration
Rein Houthooft, Xi Chen, Yan Duan, John Schulman, Filip De Turck, Pieter Abbeel


[possible, according, adding] [learning, compression, lower, bound, make, improvement, maximizing, strategy, parametrized, show, assume, supported, defined] [variational, dkl, divergence, posterior, bayesian, gradient, method, bnn, approximation, approximate, inference, term, compute, sampling, log, berkeley, performs] [can, following, noise, sparse, also] [distribution, gaussian, section, practical, given, random, polynomial] [exploration, information, vime, reinforcement, continuous, reward, state, model, agent, control, intrinsic, gain, curiosity, policy, action, environment, value, simple, allows, trpo, optimizing, surprise, heuristic, history, mountaincar, research, behavior] [using, performance, figure, neural, use, deep, fully, proposed, without, better, different, prediction, task, propose]
Neural Universal Discrete Denoiser
Taesup Moon, Seonwoo Min, Byunghan Lee, Sungroh Yoon


[probability, rule, obtained, average] [function, binary, context, algorithm, will, loss, learning, since, define, set, show, concentration, lemma, always, setting, obtain, obtaining, cost] [size, objective, large] [can, noisy, note, error, arg, also, order, real, significantly, matrix, see, ieee, vector, linear, much, following, denote] [data, discrete, universal, estimated, true, given, estimate, underlying, result, based] [model, location, framework, future, information] [dude, neural, denoising, sequence, image, ber, clean, denoiser, window, nanopore, dna, used, performance, training, symbol, figure, using, deep, supervised, source, network, sliding, dnn, shown, lnew, different, generated, use, several, minion, channel, single, grayscale, mentioned]
Learning from Rational Behavior: Predicting Solutions to Unknown Linear Programs
Shahin Jabbari, Ryan M. Rogers, Aaron Roth, Steven Z. Wu


[number, constraint, intersection, subset, vertex, contains] [learnedge, problem, unknown, learning, bound, set, known, mistake, feasible, algorithm, day, polytope, learner, show, ellipsoid, lemma, will, theorem, changing, study, learnhull, optimal, make, function, revealed, precision, questionable, conference, convex, interval, infeasible, learnellipsoid, let, consider, give, may, bounded, upper, defined, pac, whenever, always, arbitrary, know] [objective, update, additional, optimization] [can, assumption, solution, linear, observed, also, following, denote, dimension] [given, point, finite, section, specified, fixed, polynomial] [information, written, time, goal, new, must] [prediction, predict, region, predicting, sequence, natural, learn]
Hypothesis Testing in Unsupervised Domain Adaptation with Applications in Alzheimer's Disease
Hao Zhou, Vamsi K. Ithapu, Sathya Narayanan Ravi, Vikas Singh, Grace Wahba, Sterling C. Johnson


[consistency] [will, hypothesis, may, observe, class, whenever, set, problem, show, assume, convex, provide, algorithm, even, theorem, known, consider, learning, since, function] [objective, log, size, optimization, due] [can, minimal, power, linear, min, one, first, error, related, relative, need, ieee, technical, analysis] [mmd, data, sample, estimation, testing, statistical, test, kernel, based, transformation, procedure, section, statistic, estimate, csf, distance, two, null, discrepancy, given, correspond, mean, exp, well, space, biomarkers, corresponding, normal] [model, target, transformed, simply, directly, disease] [domain, source, adaptation, feature, different, using, used, unsupervised, performance, use, multiple, work, recall]
Unsupervised Risk Estimation Using Only Conditional Independence Structure
Jacob Steinhardt, Percy S. Liang


[thus, variable, likely, seed, structure, independence] [risk, learning, loss, theorem, algorithm, even, class, show, will, obtain, make, problem, hinge, complexity, case, close] [log, method, machine, compute, large, full, term] [can, assumption, error, tensor, suppose, also, def, matrix, need, see, recover, hence, small, decomposition] [estimate, conditional, estimation, test, given, data, estimating, section, independent, sample, random, distribution, true, estimated, family, result, two, journal, needed] [model, shift, framework] [unsupervised, unlabeled, using, domain, training, figure, use, without, perform, approach, labeled, adaptation, shown, different, prediction, train, structured, conditioned, single, classification, used, work]
Crowdsourced Clustering: Querying Edges vs Triangles
Ramya Korlakai Vinayak, Babak Hassibi


[triangle, edge, clustering, block, adjacency, probability, graph, cluster, filled, number, obtained, inside, querying, crowdsourcing, reliable, confusion, possible, denoted, belong, program, hit, crowd, total] [query, consider, cost, conference, since, learning, convex, label, will, provide, make, budget, give, revealed, let] [processing, size] [matrix, spectral, can, via, note, following, comparison, error, partially, real, significantly, relative, recovery, reveal, one, condition, hence, also] [random, true, two, data, independent, fixed, based, conditional, density, empirical] [model, information, observing, observation, answer, time] [figure, use, different, neural, table, performance, using, compare, dataset, compared, work]
A Multi-Batch L-BFGS Method for Machine Learning
Albert S. Berahas, Jorge Nocedal, Martin Takac


[node, number] [algorithm, learning, function, set, every, show, problem, convex, let, strategy, case] [gradient, method, batch, compute, convergence, stochastic, step, webspam, sgd, descent, iteration, bfgs, distributed, updating, gksk, optimization, implementation, computing, objective, large, approximation, size, machine, nonconvex, employ, variant, mpi, evaluation] [can, one, hessian, computational, small, also, analysis, computation, paper, fraction] [robust, sample, data, length, based, numerical, illustrates, given, correction, journal, two] [time, communication, scaling, direction, new] [using, different, figure, approach, training, overlap, used, proposed, use, similar, generated, without, consecutive, performance]
Fast and accurate spike sorting of high-channel count probes with KiloSort
Marius Pachitariu, Nicholas A. Steinmetz, Shabnam N. Kadir, Matteo Carandini, Kenneth D. Harris


[number, average, cluster, threshold, thus, sorted, total, identified, false, overlapping, find, obtained] [best, algorithm, cost, function, set, show] [optimization, large, variance, parallel, standard] [can, matrix, noise, local, running, filtering, also, small, svd, amplitude] [data, two, mean, positive, based, covariance, common, automated] [spike, template, kilosort, sorting, klustakwik, model, temporal, waveform, raw, recording, across, required, pursuit, electrode, time, recorded, typically, electrical, found, prototypical, voltage, miss, decision, new, spatiotemporal, framework, manual, allows, location] [matching, channel, spatial, score, previous, using, single, ground, several, different, reconstruction, figure, feature, three, used]
Linear-Memory and Decomposition-Invariant Linearly Convergent Conditional Gradient Algorithm for Structured Polytopes
Dan Garber, Ofer Meshi


[number, pairwise, vertex, definition] [algorithm, convex, rate, let, problem, set, since, follows, case, optimal, conference, function, setting, depends, polytope, feasible, lemma, theorem, worst, learning, consider] [convergence, gradient, method, optimization, iteration, machine, pcg, variant, standard, dicg, polytopes, iterate, converges, iterates, acg, away, showed, faster, linearly] [linear, decomposition, also, can, denote, garber, certain, analysis, require, solution, gap, dimension, note, following, arg, first, explicit, running, one, suppose] [conditional, section, two, given, point, specific, important] [new, option, current, observation, time, beck] [memory, previous, using, international, structured, figure, performance, several, explicitly, use]
Mutual information for symmetric rank-one matrix estimation: A proof of the replica formula
Jean Barbier, Mohamad Dia, Nicolas Macris, Florent Krzakala, Thibault Lesieur, Lenka Zdeborová


[formula, possible, community, find, detection, potential, theory, threshold, clique] [proof, theorem, show, problem, function, algorithm, corollary, let, case, will, since, define, inequality, now, optimal, prove, consider] [large] [irs, amp, can, one, coupled, matrix, egood, phase, also, symmetric, mmse, gap, noise, minimum, sparse, algorithmic, global, ieee, first, small, ebad, note, computational, order, spectral, signal, interesting, wigner, spiked, analysis, side, vmmsen, local, second, explicit, replica, mmmsen] [point, statistical, fixed, stationary, theoretic, analytic, asymptotic, equal, mutual, corresponding, estimation, result, limit, gaussian, two] [information, model, system, range, transition, must] [using, region, three, hand, figure]
Variational Information Maximization for Feature Selection
Shuyang Gao, Greg Ver Steeg, Aram Galstyan


[pairwise, number, graphical, subset, average, according, becomes, tree, scoring, independence] [lower, bound, learning, general, will, class, max, complexity, maximizing, maximization, greedy, algorithm, function, may] [variational, vmi, naive, method, machine, step, supplementary, mrmr, iblb, ilb, tractable, hln, term, rely, speccmi, xft, datasets, large, due] [can, assumption, also, existing, following, error, arg, decomposition, first, one, see, ieee] [selection, mutual, data, two, selected, based, estimating, bayes, conditional, distribution, journal, denotes] [information, model, forward, framework, time, demonstrate] [feature, using, shown, table, use, neural, proposed, filter, previous, figure, approach, three]
Observational-Interventional Priors for Dose-Response Learning
Ricardo Silva


[causal, according, level, many, variable, confounding, intervention, sometimes] [outcome, learning, function, might, will, set, study, provide, problem] [posterior, method, bayesian, inference, variance, marginal, hyperparameters, supplementary, sampling] [can, one, analysis, also, synthetic] [observational, data, treatment, gaussian, interventional, distribution, given, mean, covariance, sample, curve, section, provides, competitor, covariates, estimating, true, estimate, confounders, sensitivity, nonparametric, common, two, distortion, affine, journal, difference, practical, andom, done, regime, dobs] [model, prior, process, uncertainty, information, simulated, adjustment, controlled, range, design] [figure, different, generated, computer, using, generate, learned, learn, per, used, use, shown, combined]
Optimal Sparse Linear Encoders and Sparse PCA
Malik Magdon-Ismail, Christos Boutsidis


[number, whose] [algorithm, loss, theorem, optimal, bound, lemma, learning, give, set, minimizing, lower, let, every, maximizing, guarantee, problem, achieves, show, define, now, obtain, prove, since] [variance, batch, approximation, compute, log, machine, parameter, method, term, step] [sparse, linear, sparsity, matrix, iterative, can, pca, symmetric, principal, good, explained, component, error, first, singular, rank, column, encoders, one, eout, running, tpower, eir, also, analysis, much, generalized, power, second, minimum, relative] [data, construct, given, journal, theoretical, selection] [information, time] [encoder, using, use, used, reconstruction, neural, better, feature, residual, traditional, original]
Supervised Word Mover's Distance
Gao Huang, Chuan Guo, Matt J. Kusner, Yu Sun, Fei Sha, Kilian Q. Weinberger


[number, neighborhood, sentiment, unique, introduced] [learning, may, classic, algorithm, fast, optimal, set, problem, complexity, loss, lsi] [gradient, datasets, large, batch, compute, latent, efficient, approximation, dual, stochastic, university, optimization] [can, linear, error, matrix, computation, also, vector, initialization, generalized, first, relaxed] [word, distance, document, metric, wmd, embedding, transport, nca, bow, tij, wcd, section, neighbor, euclidean, described, two, news, nearest, knn, given, bbcsport, wasserstein, lda, space, twitter] [time, information, learns] [training, supervised, learn, use, text, work, using, representation, propose, similar, unsupervised, classification, recipe, table, neural, used, different, dataset, learned, proposed, labeled, semantic, shown]
Finding significant combinations of features in the presence of categorical covariates
Laetitia Papaxanthos, Felipe Llinares-Lopez, Dean Bodenham, Karsten Borgwardt


[cmh, tar, facs, criterion, testability, association, number, subset, categorical, pruning, mining, itemset, attainable, covariate, combination, false, genetic, runtime, confounding, many, prunable, enumeration, threshold, resulting, contingency, testable, fwer, corrected] [algorithm, will, function, set, class, problem, binary, lemma, lower, let, combinatorial, precision, hypothesis, define] [line, method, datasets, efficient, log, large, apply, key] [can, one, minimum, condition, computational, power, first, main] [test, section, statistical, two, significance, based, covariates, testing, correction, conditional, data, associated, envelope, given] [significant, correct, potentially, search, allows, ability, value, current] [feature, discriminative, using, approach, novel, work, multiple, table, figure]
The Forget-me-not Process
Kieran Milan, Joel Veness, James Kirkpatrick, Michael Bowling, Anna Koop, Demis Hassabis


[base, fmn, piecewise, partition, ptw, probabilistic, tree, number, probability, bag, repeating, made, weighting, level, mboc, many, fmnd, completed, obtained] [set, now, binary, will, learning, online, regret, complexity, algorithm, loss, class, defined, bound, best, define, conference, upper, bounded, notation, provide] [log, bayesian, averaging, performing, method] [can, first, denote, also, one, following, ieee, note] [data, given, segment, stationary, equation, provided, distribution, introduce, space, specific] [model, temporal, time, process, atari, information, game, within] [sequence, generating, task, using, used, figure, performance, source, prediction, previous, multiple, depth, perform, similar, generated, domain, digit, trained]
Average-case hardness of RIP certification
Tengyao Wang, Quentin Berthet, Yaniv Plan


[graph, clique, definition, probability, number, detection, subgraph] [property, problem, hardness, lower, algorithm, known, satisfy, show, randomized, even, uniformly, define, will, complexity, satisfies, theorem, set, bound] [parameter, efficient, large, size] [certifier, rip, restricted, planted, matrix, isometry, computational, sparse, can, proposition, compressed, assumption, high, following, recovery, also, certification, signal, order, hard, ieee, note, one, sensing, linear, certifiers, sparsity, incoherence, submatrix, great, detecting] [random, distribution, regime, statistical, polynomial, asymptotic, result, based, given, computationally, testing, two, construction, section, specific, mathematical, sample, well] [design, time, write, whether] [dense, sequence, used, using, use, work]
Interaction Screening: Efficient and Sample-Optimal Learning of Ising Models
Marc Vuffray, Sidhant Misra, Andrey Lokhov, Michael Chertkov


[ising, number, coupling, structure, probability, degree, node, graph, interaction, iso, rise, nmin, spin, xul, greater, graphical, los, national, reconstructs, according, strong] [theorem, lemma, algorithm, learning, function, case, set, least, bound, every, prove, consider, max, bounded, even, complexity, proof, logarithmic, convex] [objective, screening, respect, parameter, gradient, around] [condition, following, error, paper, regularized, can, zero, minimal, main, restricted, second, observed, also, order, perfect, ieee, one, penalty, vector] [maximum, estimator, exp, random, empirical, statistical, based, exponential, uniform, test, two] [model, intensity, information, value, observation, prior, choice] [reconstruction, using, neural, performance]
Interpretable Distribution Features with Maximum Testing Power
Wittawat Jitkrittum, Zoltán Szabó, Kacper P. Chwialkowski, Arthur Gretton


[probability, negative] [set, problem, function, bound, let, consider, lower, learning, hypothesis, class, since, bounded, theorem, may, uniformly] [parameter, optimization, large, size, full] [power, can, high, one, frequency, linear, error, see, following] [test, two, gaussian, scf, statistic, kernel, difference, sample, distribution, empirical, null, testing, positive, multivariate, mean, data, analytic, statistical, chwialkowski, given, witness, maximum, distinguishing, distance, toy, specified, gretton, dtr, gmd, journal, estimate, nonparametric, mmd] [maximize, characteristic, interpretable, model] [use, using, randomly, performance, learned, shown, table, perform, used, figure, facial, discriminative, spatial]
Efficient state-space modularization for planning: theory, behavioral and neural signatures
Daniel McNamee, Daniel M. Wolpert, Mate Lengyel


[theory, degree, number, resulting, thus, level] [optimal, compression, show, will, learning, algorithm, may, complexity, modular, define, function] [efficient, factor, processing, supplementary] [can, local, global, also, matrix, via, note, require] [length, data, entropy, based, given, empirical, corresponding, specific] [planning, state, modularization, module, modularized, start, centrality, soho, trajectory, activity, within, goal, information, across, lever, route, human, behavioral, entropic, hierarchical, markov, policy, simulated, transition, time, representational, firing, temporal, framework, artificial, process, navigation, search, environment, right, decision, action, stop, found] [neural, description, task, used, representation, use, using, compared, different, higher, approach]
Exploiting the Structure: Stochastic Gradient Methods Using Raw Clusters
Zeyuan Allen-Zhu, Yang Yuan, Karthik Sridharan


[clustering, cluster, average, definition, probability, proportional] [algorithm, loss, convex, every, best, smaller, since, known, erm, consider, focus, learning, appendix, choose] [gradient, stochastic, clusteracdm, acdm, svrg, dual, coordinate, accelerated, method, clustersvrg, faster, haar, objective, descent, saga, covtype, machine, iteration, tong, step, approximate, large, due, proximal, apcg, auxiliary, datasets, epoch, ahclt] [can, one, running, vector, def, also, following, regularizer, much, see, regularized, comparison, optimum, first, small] [data, ridge, transformation, estimator, two, given, random, length, space] [time, raw, new, option, information] [using, use, training, dataset, feature, without, performance, improve, better, different, figure]
Global Analysis of Expectation Maximization for Mixtures of Two Gaussians
Ji Xu, Daniel J. Hsu, Arian Maleki


[many, either, according] [theorem, learning, may, algorithm, annual, expected, let, show, assume, corollary, lemma, maximization, known, provide, general, proof, conference] [convergence, iterates, parameter, likelihood, initial, log, converges, large, size, expectation, university, latent, royal, iteration] [global, following, denote, analysis, local, also, symposium, can, ieee, suppose, paper, certain, one, still, sufficiently, spectral] [population, mixture, stationary, sample, two, gaussian, point, statistical, fixed, distribution, result, dasgupta, given, balakrishnan, estimate, data, maximum, specific, well, mle, finite, true, chaudhuri, separation, theoretical, journal] [model, characterize, information, starting] [sequence, computer, arxiv, fully, preprint, using, work, performance]
Dual Space Gradient Descent for Online Learning
Trung Le, Tu Nguyen, Vu Nguyen, Dinh Phung


[number] [budget, online, learning, loss, function, mistake, budgeted, algorithm, strategy, rate, set, theorem, hinge, problem, best, achieve, assume] [dualsgd, size, maintenance, gradient, fogd, merging, descent, datasets, fth, dual, removed, hyperparameters, convergence, approximate, machine, provision, nogd, conduct, key, extensive] [can, computational, regression, vector, analysis, support, dimension, running, projection, following, xij, linear, high] [random, space, kernel, data, two, procedure, section, perceptron, address, denotes] [time, model, information, execution, effect, decision, whilst] [proposed, feature, classification, approach, performance, using, use, table, input, part, shown, work, different, dataset]
On Graph Reconstruction via Empirical Risk Minimization: Fast Learning Rates and Scalability
Guillaume Papa, Aurélien Bellet, Stephan Clémençon


[graph, biau, bleakley, rule, probability, incomplete, average, preferential, theory, established, node, possible, thus, obtained, involving, edge, sum] [risk, learning, theorem, rate, problem, fast, set, minimization, class, complexity, consider, excess, minimizers, constant, may, inf, bound, minimizer, always] [sampling, supplementary, large, variance, material, var, additional] [can, order, analysis, one, error, computational, note, symmetric, also, related, remark, decomposition, following, stated] [empirical, statistical, random, distribution, based, given, section, distance, finite, conditional, numerical, estimate, independent, universal, bias, result, two] [form] [reconstruction, training, used, performance, prediction, representation, referred, approach, pair, similar]
Improving Variational Autoencoders with Inverse Autoregressive Flow
Diederik P. Kingma, Tim Salimans, Rafal Jozefowicz, Xi Chen, Ilya Sutskever, Max Welling


[den, variable, made, number] [learning, lower, context, will, conference, general, best, bound, since] [autoregressive, variational, inference, iaf, latent, posterior, log, flexible, method, vae, oord, stochastic, normalizing, masked, standard, approximate, parameterized, step, jacobian, improving, dkl, compute, elu, marginal, auxiliary, likelihood, sampling] [can, vector, still, see, diagonal, much, one, also, order] [transformation, gaussian, powerful, distribution, density, type, data, sample, computationally, nonlinear, mean] [inverse, model, simple, van, information, long] [flow, arxiv, preprint, deep, generative, neural, using, use, resnet, used, convolutional, pixelcnn, previous, figure, layer, natural, mnist, residual, single, international, autoencoders]
Towards Conceptual Compression
Karol Gregor, Frederic Besse, Danilo Jimenez Rezende, Ivo Danihelka, Daan Wierstra


[level, number, variable, quality, structure, introduced, den, present] [compression, cost, learning, algorithm, will, might, show, conference, lower] [latent, draw, variational, variance, posterior, approximate] [can, one, first, high, also, low, global, see, good] [distribution, given, two, data] [information, model, prior, time, coding, abstract] [input, image, network, convolutional, figure, different, generative, layer, deep, pixel, amount, shown, higher, neural, per, lossy, compress, using, used, imagenet, omniglot, store, use, storing, conceptual, generated, recurrent, scale, generate, lzt, rnn, international, deepmind, representation, encode, stored, trained, original, google, produce, architecture]
Integrated perception with recurrent multi-task neural networks
Hakan Bilen, Andrea Vedaldi


[detection, number, possible] [learning, function, label, consider, class, max] [method, update, efficient] [can, one, first, det, also, iterative, order] [ordinary, data, integrated, corresponding, test, common, two, based, given, integrator, independent, well, space] [model, information, individual, box] [object, part, task, image, representation, multinet, shared, enc, different, neural, recurrent, classification, network, convolutional, used, output, architecture, multiple, using, feature, sharing, decoder, three, prediction, several, computer, initialized, improve, deep, train, propose, instead, encoder, training, input, back, spatial, layer, rcls, ground, pascal, img, last, perceptual, visual, vision, use]
Coin Betting and Parameter-Free Online Learning
Francesco Orabona, David Pal


[potential, number, definition, total] [algorithm, betting, olo, coin, regret, learning, let, wealth, online, optimal, bound, lea, strategy, will, wealtht, adaptive, convex, defined, endowment, satisfies, theorem, show, appendix, set, guarantee, loss, function, regrett, bet, kelly, rewardt, problem, every, excellent, prove, round, upper, best, absolute, rate, gambler, advice, lemma, just, define, proof, lower, assume, case] [initial, processing, optimization] [can, also, see, fraction, vector, linear, good, analysis, following] [hilbert, space, based, section, estimator, normal, two, construction] [reward, information, expert, new, design, time, prior] [sequence, used, neural, prediction, table, use, previous]
Near-Optimal Smoothing of Structured Conditional Probability Matrices
Moein Falahatgar, Mesrob I. Ohannessian, Alon Orlitsky


[probability, thus, number, many, probabilistic, fact] [smoothing, algorithm, risk, bound, since, theorem, learning, problem, context, pij, may, uniformly, give, even, complexity, optimal, minimax, lemma] [log, latent, naive, size, stochastic] [matrix, can, cij, one, rank, first, sampled, also, note, min, following, arg, bigram, penalized, qij, analysis, global, related, factorization, moothed, row, order, main] [empirical, data, conditional, section, estimator, sample, word, uniform, given, test, point, stationary, result] [model, framework, choice, simple, implicit] [language, performance, using, figure, proposed, work, preprint, arxiv, natural, structured, neural, instead, like, used, original]
Consistent Estimation of Functions of Data Missing Non-Monotonically and Not at Random
Ilya Shpitser


[missingness, graph, law, identification, ipw, independence, status, undirected, graphical, mnar, identified, variable, complete, causal, edge, pag, directed, represented, allow, fairly, lcpl] [every, lemma, set, function, may, known, confidence, implies, define, case, class, since, example, follows, general, problem, give, dependence] [full, parameter, inference, size, log] [missing, observed, analysis, following, can, factorization, also] [data, conditional, chain, random, estimator, consistent, section, distribution, underlying, true, estimation, sample, independent, functional, given, statistical, practical, joint, discrete, particular] [model, simple, form, complex, markov, inverse, mar] [using, approach, shown, used, work]
Algorithms and matching lower bounds for approximately-convex optimization
Andrej Risteski, Yuanzhi Li


[probability, number, definition] [convex, function, algorithm, will, lower, diameter, bound, show, problem, minx, set, exists, every, consider, theorem, proof, max, since, lip, poly, lipschitz, define, upper, optimal, let, arbitrary, query, defined, case, even, give, want, lemma, idea, bandit, prove, outside, notion, obtain, implies, hope] [log, gradient, optimization, sampling, standard, factor] [can, one, following, high, need, small, approximately, also, algorithmic, order, min, linear, real, still] [point, polynomial, radius, theoretic, section, two, construction, result, positive, random, sample, construct, given] [ball, time, information, value, simple, extend, optimizing] [body, unit, use, natural, work, neural]
A Pseudo-Bayesian Algorithm for Robust PCA
Tae-Hyun Oh, Yasuyuki Matsushita, In Kweon, David Wipf


[represents, basic, number, knowledge, represent, either, probabilistic] [ratio, algorithm, will, function, convex, problem, may, set, bound, cost, even, least, assume, case, minimization, equivalent] [bayesian, log, objective, optimization, method, although, standard, iid] [can, rank, matrix, outlier, rpca, ieee, via, sparse, pcp, support, recovery, principal, also, relative, quite, global, subspace, penalty, success, zgt, solving, solution, singular, norm, exact, proposition, first, existing, phase, nuclear, note, analysis] [robust, empirical, described, data, section, symmetry, given, true, estimation, practical] [prior, model, form, transition, new, design] [using, pattern, original, segmentation, performance, applied, used, motion, adopt, figure]
Verification Based Solution for Structured MAB Problems
Zohar S. Karnin


[probability, identification, number, graphical, structure, many] [arm, best, algorithm, problem, query, bandit, complexity, set, dueling, expected, mab, verification, optimal, least, learning, hexplore, hverify, confidence, conference, unimodal, advice, condorcet, general, regret, exist, obtain, round, provide, findbestarm, setting, verifybestarm, focus, known, since, function, let, yet, elaborate] [machine, log, stochastic, parameter, additional, pure] [min, can, one, linear, vector, following, sufficiently, high, first, indeed, assumption] [given, independent, section, associated, corresponding, two, based, random, result, provides] [exploration, reward, value, framework, failure, rather, information, extended] [candidate, using, identity, pair, international, several, different, output, consists, input]
Bayesian Optimization with Robust Bayesian Neural Networks
Jost Tobias Springenberg, Aaron Klein, Stefan Falkner, Frank Hutter


[number, many, thus, probabilistic, obtained] [function, learning, set, algorithm, will] [bayesian, optimization, gradient, stochastic, bohamiann, monte, sghmc, dngo, ddpg, hyperparameters, method, carlo, mcmc, large, parallel, machine, supplementary, additional, evaluation, standard, posterior, scalable, scalability, parameter, fit] [can, matrix, via, noise, following, related, regression, good] [hamiltonian, based, estimate, well, equation, mean, robust, hmc, given, section, sample, robustness, distribution, data, random, embedding, two] [model, hyperparameter, uncertainty, found, required, reinforcement, optimized, acquisition, reward] [neural, deep, used, using, performance, network, scale, use, different, figure, residual, performed, proposed, dataset, adaptation]
The Multi-fidelity Multi-armed Bandit
Kirthevasan Kandasamy, Gautam Dasarathy, Barnabas Poczos, Jeff Schneider


[number, many, mth, probability] [fidelity, arm, will, bound, regret, lower, ucb, bandit, algorithm, strategy, set, consider, play, upper, confidence, setting, since, cost, optimal, might, learning, highest, online, problem, capital, expected, playing, may, played, theorem, cheaper, advertising, let, suboptimal, eliminate, bernoulli, lemma, smaller, ignore, notion, obtain, displaying, analyse] [due, expensive, step, approximate] [can, following, high, first, one, small, assumption, denote, much, low, note, analysis, also] [mean, given, two] [time, reward, design, goal, information, within, choice] [using, work, performance, higher, use, better]
Mapping Estimation for Discrete Optimal Transport
Michaël Perrot, Nicolas Courty, Rémi Flamary, Amaury Habrard


[coupling, probability, probabilistic, quality] [problem, learning, optimal, set, function, convex, consider, example, algorithm, let, case, define, choose, relaxation, cost, theorem, obtain] [method, gradient, optimization, supplementary, approximation, solve, term, technique, respect] [can, one, linear, regularization, regularized, first, formulation, solution, good, matrix, paper, min, following, arg, also, computation] [transport, transformation, barycentric, two, theoretical, section, kernel, empirical, based, space, given, corresponding, wasserstein, discrete, true, way, fixed] [target, new, corresponds, framework] [source, image, domain, using, map, mapping, adaptation, propose, learned, approach, jointly, used, use, learn, several, proposed, different, seen, training, color, dataset, original]
Dual Decomposed Learning with Factorwise Oracle for Structural SVM of Large Output Domain
Ian En-Hsu Yen, Xiangru Huang, Kai Zhong, Ruohan Zhang, Pradeep K. Ravikumar, Inderjit S. Dhillon


[structural, number, grows, labeling, many, find, graph, according] [algorithm, active, maximization, greedy, loss, learning, problem, complexity, set, oracle, max, since, constant, show, qmax, upper, binary, case] [factor, large, mjf, objective, method, dual, convergence, bcfw, step, size, factorwise, fmo, iteration, inference, gdmm, ssg, machine, approximate, requires, approximation, chineseocr, faster, augmented, descent, primal, sampling, efficient, processing, optimization] [can, one, via, linear, also, min, note, much, error, computational, arg, multiplier, solving, sublinear, support, analysis] [two, given, test, type] [time, direction, information] [structured, output, domain, training, prediction, figure, map, svms, approach, dataset]
Bayesian Optimization for Probabilistic Programs
Tom Rainforth, Tuan-Anh Le, Jan-Willem van de Meent, Michael A. Osborne, Frank Wood


[program, number, probabilistic, contains] [function, problem, query, unknown, surrogate, consider, observe, show, setting] [optimization, method, likelihood, inference, line, importance, sampling, additional] [error, first, comparison, optimum, note] [mean, maximum, corresponding, estimate, based, section, provides, transformation, distribution] [bopp, acquisition, demonstrate, target, potentially, optimizing, significant, implicit, take, adapting, spearmint, simply, prior, smac, automatically, tpe, unbounded, scaling, must, new, optimize, evaluated, assigns, alternative, transformed, metropolis, directly, hastings, therefore, giving, form, identical, prominent, statement, continuous, corresponds, along, previously, solid, subject, next, annealed] [using, figure, used, shown, approach, region, optimizer, top, fully]
A Non-generative Framework and Convex Relaxations for Unsupervised Learning
Elad Hazan, Tengyu Ma


[improper, definition, theory, probability, average] [learning, hypothesis, class, convex, set, algorithm, theorem, give, complexity, loss, let, rademacher, consider, close, compression, function, define, general, since, will, show, problem, relaxation, proof, constant, unknown, pac, exists, appendix, known, best] [efficient, respect, sampling, optimization] [can, spectral, error, dictionary, pca, decoding, linear, vmax, much, norm, min, following, main, arg, sparse, group, algebraic, matrix, though, component, vector, one, analysis, suppose, condition] [data, length, based, given, euclidean, distribution, kernel, bias, polynomial, section, random, space, generalization] [encoding, framework, information, new, allows] [unsupervised, reconstruction, generative, approach, without, learn, previous, using, neural, learned, domain]
Value Iteration Networks
Aviv Tamar, Sergey Levine, Pieter Abbeel, Yi Wu, Garrett Thomas


[graph] [learning, optimal, function, algorithm, show, rate, general, defined, may, loss, differentiable] [standard, iteration, additional, gradient, solve] [can, computation, also, success, note, vector] [based, section, random, test, true, particular, important] [policy, planning, vin, reactive, value, state, reward, goal, module, vins, continuous, plan, observation, mdp, reinforcement, model, action, obstacle, control, search, form, design, information, new, elevation, decision, within, trajectory] [network, image, trained, learn, using, domain, neural, training, cnn, layer, better, different, deep, task, figure, map, approach, similar, convolution, attention, mapping, performance, fully, convolutional, several, prediction, generalize, supervised]
Graph Clustering: Block-models and model free results
Yali Wan, Marina Meila


[clustering, pfm, graph, sbm, cluster, weighted, stable, node, indicator, community, political, block, ckk, detection, compatible, marina, lfr, edge] [theorem, let, will, bound, obtain, satisfy, set, since, whenever, proof, assume, show, algorithm, case, close, prove] [stochastic, sampling, large, processing] [matrix, assumption, spectral, can, also, proposition, recovery, perturbation, stability, error, existing, note, main, maxk, much, paper, frobenius, small] [data, result, well, construct, two, laplacian, distance, goodness, measure, based, given] [model, framework, information] [used, work, using, volume, neural, network]
Efficient Globally Convergent Stochastic Optimization for Canonical Correlation Analysis
Weiran Wang, Jialei Wang, Dan Garber, Nati Srebro


[number, loop] [algorithm, least, complexity, obtain, appendix, theorem, show, set, problem, max, proof, dependence] [stochastic, log, svrg, appgrad, convergence, normalization, ccalin, gradient, optimization, cca, approximate, iterates, suboptimality, preconditioning, rdx, convergent, sgd, solve, step, batch, asvrg, objective, converges, variance, solved] [min, singular, phase, can, analysis, solution, condition, power, matrix, alternating, regularization, following, gap, linear, rdy, note, initialization, globally, solving, high] [canonical, two] [time, correlation, value, form] [using, use, training, work]
Deep Learning without Poor Local Minima
Kenji Kawaguchi


[conjecture, negative, number, theory, expression] [loss, theorem, function, learning, corollary, case, lemma, proof, let, every, assume, prove, set, conference, consider, show, general, problem, even, now, obtain] [saddle, optimization] [local, linear, can, global, hessian, critical, minimum, necessary, following, condition, matrix, semidefinite, choromanska, assumption, rdy, baldi, arbitrarily, unrealistic, conclude, entry, also, proposition, sketch, seven, note, one, noted, perturbation, first] [nonlinear, point, theoretical, section, corresponding, positive, practical, random] [model, information, write, value, statement, poor] [deep, neural, open, layer, previous, output, training, without, use, used, work, deeper, shallow]
Privacy Odometers and Filters: Pay-as-you-Go Composition
Ryan M. Rogers, Salil Vadhan, Aaron Roth, Jonathan Ullman


[valid, probability, definition, neighboring, basic, number, analyst] [privacy, composition, theorem, algorithm, loss, bound, adaptive, odometer, round, will, private, chosen, setting, give, randomized, differentially, define, show, adaptively, adversary, case, every, may, even, let, function, realized, advanced, prove, halt, satisfies, now, martingale, set, stopping, worse, upper, coin] [parameter, log, run, select, datasets] [can, note, following, also, one, high, view, global, first, computation, see] [differential, data, two, fixed, random, statistical, asymptotic, result, useful] [choice, time, must, next, response, future, allows] [filter, output, previous, sequence, use, used, without, similar, input]
On Regularizing Rademacher Observation Losses
Richard Nock


[equivalence, number, definition, iff, fact] [oost, rado, loss, learning, rados, equivalent, learner, example, algorithm, theorem, slope, let, proportionate, set, max, may, show, now, rademacher, regularizing, fix, since, even, maxj, obtain, minimization, defined, provide, nock, best] [boosting, step, efficient, logistic, fit, log, large, key, size, depend, end, boost, optimization, datasets, update] [regularized, can, regularization, linear, one, suppose, sparsity, sufficient, via, following, also, first, significantly, order, regularizer, accurate, popular, good] [two, exponential, exp, data, given, corresponding] [take] [weak, classifier, using, table, feature, supervised, domain, shown, different, training]
Reward Augmented Maximum Likelihood for Neural Structured Prediction
Mohammad Norouzi, Samy Bengio, Zhifeng Chen, Navdeep Jaitly, Mike Schuster, Yonghui Wu, Dale Schuurmans


[edit, according, negative, probability, number, average, probabilistic] [learning, expected, function, consider, best, make, set] [sampling, likelihood, machine, dkl, gradient, optimization, inference, augmented, divergence, large, objective, log, variance, standard, approximate] [can, one, regularized, also, first, note] [maximum, given, conditional, distance, distribution, difference, two, test, sample, entropy, length] [reward, model, policy, exponentiated, rml, reinforcement, lrl, lrml, payoff, baseline, temperature, optimizing, framework, alternative, optimize, form] [output, training, neural, task, use, sequence, using, prediction, work, translation, bleu, structured, speech, different, approach, ground, truth, recognition, score, recurrent, table, several, deep, achieved, network, improves, supervision, augmentation, category]
The Sound of APALM Clapping: Faster Nonsmooth Nonconvex Optimization with Stochastic Asynchronous PALM
Damek Davis, Brent Edmunds, Madeleine Udell


[block, number, wjk, cache, linearized, many] [algorithm, problem, convex, function, let, theorem, will, class, provide, rate, expected, might, minimize, known, assume] [asynchronous, stochastic, sapalm, convergence, coordinate, gradient, nonconvex, parallel, optimization, method, nonsmooth, synchronous, proximal, iteration, distributed, large, speedup, firm, compute, counter, palm, size, xkj, lyapunov, objective, converges, iterates, update, stepsize, processing] [can, noise, assumption, linear, following, first, pca, processor, one, matrix, denote, low, sparse, global, see, alternating] [two] [time, read, model, delayed, write, information] [arxiv, preprint, using, use, sequence, neural, work, last, different]
Fast Mixing Markov Chains for Strongly Rayleigh Measures, DPPs, and Constrained Sampling
Chengtao Li, Suvrit Sra, Stefanie Jegelka


[probability, path, rayleigh, partition, constraint, graph, exchange, probabilistic, total, shortest, ising, include, many] [set, fast, matroid, algorithm, bound, consider, since, show, let, may, general, appendix, theorem, special, observe, max, combinatorial, will, uniformly, learning, dependence, bounded, cardinality] [sampling, strongly, inference, convergence, end, log, determinantal, size, factor, efficient, dpp, constrained, machine] [can, one, first, also, certain, hence] [mixing, chain, uniform, point, distribution, psrf, sampler, sample, discrete, result, mix, two, random, journal, fixed, theoretical, construct, empirically, length] [time, markov, transition, simple] [flow, use, iter, ground, using, shown]
A scaled Bregman theorem with applications
Richard Nock, Aditya Menon, Cheng Soon Ong


[clustering, fact, flat, potential] [theorem, convex, lemma, ratio, learning, may, problem, since, online, appendix, loss, adaptive, bound, multiclass, function, binary, let, case, general, improvement, beat, now, differentiable] [divergence, log, reduction, approximation, step, dual, mirror, university, optimisation, update] [can, one, norm, lifting, vector, related, signal, suitable, ieee, matrix, via] [bregman, density, estimation, scaled, skm, manifold, sphere, two, drec, curved, given, hyperboloid, seeding, estimate, euclidean, exponential, geodesic, normalisation, distance, spherical, section] [new, significant] [table, perspective, map, using, three, use, figure, used, different]
InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets
Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, Pieter Abbeel


[categorical, variable, represent, present, regular] [learning, even, show, lower, bound, will, algorithm, maximizing, minimax, set] [latent, variational, posterior, method, auxiliary, easy] [can, one, noise, also, though] [mutual, distribution, data, two, random, discrete, true] [information, code, continuous, model, interpretable, learns, azimuth, research, prior, varying, simple, goal] [infogan, learn, generative, representation, disentangled, adversarial, generator, unsupervised, variation, face, digit, using, figure, use, network, gan, learned, neural, deep, able, preprint, pose, lighting, arxiv, different, generated, mnist, supervised, training, disentangle, dataset, explicitly, semantic, manipulating, work, identity, evaluate, visual, propose]
High Dimensional Structured Superposition Models
Qilong Gu, Arindam Banerjee


[sum, probability, number, structural, structure, interaction, present] [bound, let, theorem, general, lower, will, lemma, show, set, give, upper, complexity, satisfied, implies, appendix, inf, choose, constant, proof, convex, depends, consider, hold, defined, case, provide] [parameter, log, deterministic] [condition, error, can, matrix, component, following, superposition, width, high, norm, sparse, vector, recovery, analysis, coherence, noise, componentwise, also, one, cone, maxi, eigenvalue, dimension, suitable, characterization, restricted, linear, denote] [random, gaussian, given, estimation, section, sample, estimator, geometric, independent, two, based, result, statistical, geometry] [design, information, form, start, observation] [different, using, figure, structured, use]
Clustering Signed Networks with the Geometric Mean of Laplacians
Pedro Mercado, Francesco Tudisco, Matthias Hein


[clustering, signed, laplacians, negative, graph, lgm, lbn, lsn, pin, arithmetic, normalized, sym, block, ordering, presence, number, qsym, present, cluster, lam, informative, lsr, definition, obtained] [smallest, let, algorithm, best, show, consider, observe, expected, defined] [method, stochastic, expectation, large, compute, technique, size] [matrix, eigenvectors, can, spectral, pout, solution, small, analysis, computation, linear, error, existing, one, krylov, condition, related, first, eigenvalue, subspace, eigenvector, sparse, fraction, note] [mean, positive, geometric, laplacian, section, corresponding, two, based, definite, given] [model, whereas, extended, next] [use, proposed, network, figure, ground, table, truth, different]
Confusions over Time: An Interpretable Bayesian Model to Characterize Trends in Decision Making
Himabindu Lakkaraju, Jure Leskovec


[cluster, confusion, cot, maker, asthma, indicator, purity, insurance, bail, diagnostic, made, evaluating, mth, eth, present, obtained, possible] [label, learning, set, element, will, provide] [latent, bayesian, inference, sampling] [item, can, subspace, also, sampled, vector, matrix, one, following, row, error] [true, prototype, corresponding, denotes, distribution, associated, important, data, dirichlet, metric, discrete, provides, gender, estimated, multinomial] [decision, model, time, making, value, prior, interpretable, modeling, framework, individual, process, collective, temporal, inverse, human, research] [feature, predicting, using, figure, performance, table, single, evaluate, generative, use]
Learning Additive Exponential Family Graphical Models via \ell_{2,1}-norm Regularized M-Estimation
Xiaotong Yuan, Ping Li, Tong Zhang, Qingshan Liu, Guangcan Liu


[graphical, pairwise, structure, graph, among, probability, node] [learning, consider, assume, unknown, price, set, theorem, function, will] [sampling, parameter, convergence, approximation, size, likelihood, stock, approximate, method, gradient, log] [following, regularized, sufficient, assumption, can, high, recovery, error, analysis, sparse, order, via, computational, condition, restricted] [adefgm, estimator, estimation, joint, mle, exponential, data, nonparanormal, additive, random, exp, statistical, distribution, family, estimate, conditional, two, ugms, basis, ggm, dimensional, underlying, associated, sample, true, journal, gaussian, fst, result, annals, given, maximum] [model, information, complex] [proposed, using, learn, propose, performance]
Phased Exploration with Greedy Exploitation in Stochastic Combinatorial Partial Monitoring Games
Sougata Chaudhuri, Ambuj Tewari


[unique, partial, number, according] [algorithm, regret, learner, optimal, bound, set, gcb, online, problem, function, expected, pege, dependent, will, adversary, ranked, combinatorial, existence, exponentially, relevance, constant, defined, might, bandit, exploitation, since, oracle, even, theorem, achieve, dependence, let, get, confidence, conference, classic, strategy] [log, size, end, large, full, stochastic, requires] [can, assumption, global, gap, cpm, ranking, linear, second, one, paper, first, note, monitoring, user, vector, also, small] [distribution, observable, space, finite, estimation, independent, estimate, transformation, fixed, infinite, two, practical] [action, feedback, reward, exploration, model, framework, game, time, continuous, move, information, within, rmax] [top, work]
Achieving the KS threshold in the general stochastic block model with linearized acyclic belief propagation
Emmanuel Abbe, Colin Sandon


[community, probability, detection, graph, threshold, vertex, number, conjecture, nonbacktracking, abp, propagation, block, definition, adjacent, average, achieving, enough, degree, likely, assign, expect, sum, achievability, detect, cycle, presence, linearized, subtract] [algorithm, let, will, set, general, prove, proof, drawn, define, version, return, exists, least] [stochastic, large, solves, step, compute, initial] [one, matrix, also, approximately, can, spectral, symmetric, eigenvector, order, following, vector, eigenvalue, sparse, high, dominant, first] [length, two, random, equal, distribution, described, bias, positive, way, based] [model, belief, value, simply, snr, new] [different, instead, approach, part, use, multiple]
Simple and Efficient Weighted Minwise Hashing
Anshumali Shrivastava


[number, weighted, average, many, consistency] [scheme, hashing, algorithm, hash, fast, will, theorem, return, binary, known, show, idea, constant, define, cost] [minwise, sampling, unweighted, requires, ioffe, end, faster, datasets, method, around, compute, unbiased, log, size, computing, green, oxford, approximation, costly] [can, vector, real, exact, one, existing, computation, sparsity, also, error, component, first, note, need, running] [random, data, given, independent, procedure, two, permutation, based, biased, associated, consistent, equation] [time, required, value, simple, range] [using, similarity, accuracy, proposed, generate, used, different, input, work, figure, vision, table, red, effective, computer, per, dataset]
Learning Sparse Gaussian Graphical Models with Overlapping Blocks
Seyed Mohammad Javad Hosseini, Su-In Lee


[grab, graphical, block, gene, overlapping, number, constraint, cancer, structure, clustering, expression, variable, average, assignment, connected, aml, ith, assigned, many, belong, present, called, densely] [learning, algorithm, problem, show, set, convex, known, consider, lemma, element] [method, parameter, log, optimization, standard, size, dual, coordinate, descent] [matrix, can, regularization, lasso, following, sparse, solution, first, one, sparsity, synthetic, group, det, significantly, existing, need, diagonal] [given, data, estimate, gaussian, based, covariance, two, associated, random, denotes, estimation, journal] [prior, maximize, subject, fig, model] [network, use, learn, novel, overlap, used, jointly, similar, learned, work, figure, multiple]
FPNN: Field Probing Neural Networks for 3D Data
Yangyan Li, Soeren Pirk, Hao Su, Charles R. Qi, Leonidas J. Guibas


[connected, represented, informative, thus, number, many] [learning, since, set, complexity, even] [batch, size, designed, michael, efficient] [can, note, one, also, computation, computational] [data, distance, point, two, gaussian, normal, random, important, associated, testing, well, space] [range, long, grid, directly, sensor, occupancy, advantage, information] [probing, field, input, shape, layer, fpnn, performance, object, convolutional, table, neural, fully, filter, figure, fpnns, resolution, deep, classification, cnns, used, output, task, feature, different, higher, representation, network, training, transform, understanding, dotproduct, approach, dataset, image, evaluate, voxel, trained, volumetric, recognition, cnn]
Homotopy Smoothing for Non-Smooth Problems with Lower Complexity than O(1/\epsilon)
Yi Xu, Yan Yan, Qihang Lin, Tianbao Yang


[number, present, developed, strong] [smoothing, function, algorithm, bound, convex, complexity, homotopy, set, problem, let, optimal, considered, max, smoothed, constant, property, loss, lower, minimization, finding, will, learning, show, theorem, inequality, rate, make, logb, implies] [iteration, convergence, optimization, parameter, proximal, gradient, method, smooth, objective, siam, faster, accelerated, dual, term, large, strongly, machine, approximation, university] [error, can, local, solving, linear, solution, min, denote, condition, cone, assumption, also, sparse, norm, one, analysis, see, following, running, accurate, first] [result, family, given, point, two, smoothness] [sharpness, value] [proposed, work, different, relatively, yang, style]
DISCO Nets: DISsimilarity COefficients Networks
Diane Bouchacourt, Pawan K. Mudigonda, Sebastian Nowozin


[probabilistic, scoring, coefficient, rule, strictly, represent] [loss, function, proper, example, best, learning, pointwise, diversity, consider, defined] [objective, supplementary, evaluation, posterior] [noise, can, require, one, order, first] [distribution, given, data, two, true, based, estimate, test, equation, sample, kernel, euclidean, random, conditional, mean, section] [model, uncertainty, value] [disco, pose, depth, training, hand, image, input, use, adversarial, convolutional, generative, output, dissimilarity, figure, deep, prediction, cgan, candidate, different, neural, using, gan, table, single, architecture, network, dataset, train, oberweger, fed, gneiting, referred, several, mirza, presented, used, similar, employed, dense, task]
Protein contact prediction from amino acid co-evolution using convolutional networks for graph-valued images
Vladimir Golkov, Marcin J. Skwark, Antonij Golkov, Alexey Dosovitskiy, Thomas Brox, Jens Meiler, Daniel Cremers


[structure, coupling, many, number, present] [function, learning, set, problem, conference, considered] [contact, protein, amino, method, plmconv, acid, metapsicov, evolutionary, plmdca, homologs, size, residue, inference, pressure, additional, homologous, tertiary, potts] [can, one, order, also, accurate, computational, matrix, regularization, related] [data, based, positive, way, length] [value, information, predictive, new, potentially] [prediction, sequence, network, convolutional, input, neural, different, map, position, multiple, using, window, used, predicted, performance, training, predict, feature, proposed, layer, use, able, deep, higher, alignment, figure, table, propose]
Fast Distributed Submodular Cover: Public-Private Data Summarization
Baharan Mirzasoleiman, Morteza Zadimoghaddam, Amin Karbasi


[number, find, summary, many, centralized, subset, social, threshold, spark] [set, fast, submodular, algorithm, greedy, private, problem, cover, summarization, personalized, utility, function, smaller, fastcover, public, covering, least, dominating, friendster, massive, define, discover, coverage, mapreduce, even, smallest, finding, return, round, instance] [size, distributed, machine, marginal, large, central, run, factor, parameter, objective, gps] [solution, can, user, recommendation, one, movie, also, high, running, note, good, small, recommender] [data, provides, maximum] [value, location, information, time, desired, truly, gain] [performance, dataset, used, single, use, using, network, including, consists, similarity]
Tractable Operations for Arithmetic Circuits of Probabilistic Models
Yujia Shen, Arthur Choi, Adnan Darwiche


[psdd, probabilistic, circuit, arithmetic, psdds, vtree, variable, compilation, graphical, multiplication, polytime, instantiation, compiling, sdd, definition, decomposable, multiply, probability, sentential, summing, represent, iff, compile, node, darwiche, choi, called, boolean, obtained, compiled, whose, number, diagram] [learning, algorithm, theorem, show, every, class, will, known, consider] [factor, inference, size, tractable, bayesian, deterministic] [can, also, one, see, product, following, support, first, algebraic] [two, distribution, based, given, section] [decision, time, operation, markov, model, belief, tabular] [figure, using, input, used, reasoning, representation, network, including, available, proposed]
Variational Autoencoder for Deep Learning of Images, Labels and Captions
Yunchen Pu, Zhe Gan, Ricardo Henao, Xin Yuan, Chunyuan Li, Andrew Stevens, Lawrence Carin


[number, developed] [learning, set, svm, element, lower, consider] [variational, latent, inference, bayesian, method, caltech, stochastic, posterior, sampling, datasets, vae, parameter, approximate] [error, also, vector, can, tensor, first] [test, distribution, joint, based, associated, proportion, provided, data, gibbs, section, two] [model, time, new, code] [image, cnn, labeled, training, generative, deep, used, caption, imagenet, neural, classification, using, pooling, dgdn, table, layer, encoder, use, recognition, decoder, trained, network, convolutional, unsupervised, validation, captioning, deconvolutional, unpooling, googlenet, performance, mcem, activation, novel, better, accuracy, figure, single, employed, feature, unlabeled, available, map]
Convergence guarantees for kernel-based quadrature rules in misspecified settings
Motonobu Kanagawa, Bharath K. Sriperumbudur, Kenji Fukumizu


[weighted, degree, lattice, knowledge, probability, describe] [rate, let, case, theorem, worst, setting, optimal, general, achieve, assumed, even, consider, defined, function, satisfies, known, adaptive, assume, provide, show, will] [convergence, smooth, carlo, deterministic, bayesian, monte] [assumption, can, error, order, one, also, see, certain, power, note, small, following, product, analysis] [kernel, integrand, space, korobov, section, quadrature, rkhs, distribution, wkor, smoothness, numerical, sobolev, random, uniform, given, qmc, bach, reproducing, result, rkhss, integral, construction, belongs, theoretical, mpn, denotes, statistical, important, hilbert, integrands] [integration, form, misspecified, therefore, state] [use]
Structure-Blind Signal Recovery
Dmitry Ostrovsky, Zaid Harchaoui, Anatoli Juditsky, Arkadi S. Nemirovski


[structure, sum, present, constraint] [oracle, let, problem, set, theorem, consider, risk, appendix, price, may, exists, convex, bound, unknown, least, known, assume, adaptive, implies, close, function, pointwise, interval, minimax, satisfies] [constrained, optimization, computed, parameter, operator] [signal, can, recovery, linear, one, assumption, lasso, small, penalized, subspace, norm, dimension, noise, recover, see, recovering, harmonic, note, error, still, suppose, high, solution] [estimator, given, fourier, equation, polynomial, estimation, nonparametric, gaussian, difference, random, length, numerical, corresponding, family, theoretical, specified] [left, simple, right] [filter, denoising, use, using, image, proposed, adaptation, approach, transform, hidden]
Collaborative Recurrent Autoencoder: Recommend while Learning to Fill in the Blanks
Hao Wang, Xingjian Shi, Dit-Yan Yeung


[ctr, probabilistic, average] [learning, will, set, scheme, since, provide, show] [bayesian, rrn, datasets, supplementary, hyperparameters, boost, hybrid] [can, collaborative, recommendation, user, note, matrix, vector, also, recommender, one, rating, first, filtering, see] [section, two, word, distribution, equation, based, robust, topic, denotes] [model, information, modeling, implicit, directly] [crae, recurrent, use, denoising, deep, sequence, generation, cdl, used, different, wildcard, autoencoder, content, citeulike, pooling, input, ikw, training, using, representation, like, figure, performance, neural, svdfeature, table, rnn, score, deepmusic, bleu, generative, outperform, propose, jointly, able, recall, network, output, shown, gate, task]
Learning the Number of Neurons in Deep Networks
Jose M. Alvarez, Mathieu Salzmann


[number, thus, total, obtained, removing, influence] [learning, set, loss, general, algorithm, remaining] [initial, reduction, size, parameter, method, constructive, batch, gradient, proximal, operator, large, additional] [group, can, also, regularizer, note, sparsity, first, relative, gap, percentage, one, linear, sparse] [selection, test, two, generalization, reduces] [model, automatically, individual, effectively, saving, determine, behavior] [deep, network, approach, layer, using, memory, original, accuracy, convolutional, training, neural, used, bnetc, consists, architecture, three, compact, per, imagenet, figure, different, param, table, effective, trained, recognition, icdar, last, overcomplete, single, similar, shallow]
Online ICA: Understanding Global Dynamics of Nonconvex Optimization via Diffusion Processes
Chris Junchi Li, Zhaoran Wang, Han Liu


[diffusion, theory, number, negative, neighborhood] [algorithm, online, let, learning, consider, case, theorem, problem, implies, weakly, upper] [sgd, stochastic, nonconvex, convergence, optimization, gradient, ode, approximation, converges, log, processing, method, iteration, descent, initial, around, objective, iterates, appropriate, especially, sde] [local, tensor, phase, analysis, solution, unstable, global, can, following, via, sparse, matrix, desirable, first, component, conclude, second, wkk, solving, escaping, alternating, proposition, noise, escape] [statistical, exp, differential, based, stationary, section, equation, independent, random] [markov, information, process, time, within, characterize, towards, system, value] [arxiv, preprint, neural, weak, different, using, three, figure, spatial, understanding]
Coupled Generative Adversarial Networks
Ming-Yu Liu, Oncel Tuzel


[number, constraint, contains, edge, attribute] [learning, drawn, let, function, set, problem] [marginal, showed] [can, also, one, note, via, first] [joint, distribution, corresponding, two, conditional, given, transformation, usps] [model, share, decode, framework] [cogan, image, generative, different, discriminative, training, used, domain, deep, depth, color, figure, without, generation, adversarial, pair, face, dataset, generated, network, performance, input, learn, digit, neural, task, using, trained, achieved, gan, unsupervised, several, last, mnist, correspondence, pixel, applied, work, convolutional, sharing, adaptation, semantics, including, learned, randomly, agreement, rgbd, generate, gans]
Incremental Variational Sparse Gaussian Process Regression
Ching-An Cheng, Byron Boots


[probability, number, structure] [learning, function, max, problem, online, set, algorithm, general, conference, equivalent] [variational, inducing, stochastic, gpr, posterior, inference, gradient, ascent, approximate, log, hyperparameters, sarcos, mirror, update, approximation, incremental, full, ivsgprada, ivsgpr, batch, machine, subproblem, solved, solve, parametrization, datasets, large, divergence, processing, size, objective, performing, solves, step, dual, vsgprsvi] [sparse, can, regression, subspace, error, first] [gaussian, kernel, covariance, denotes, space, rkhs, basis, distribution, fixed, data, manifold, mean, finite, journal] [process, information, prior, therefore, change] [training, natural, approach, neural, used, several, using, use, propose, international, representation, better, dataset]
Hierarchical Clustering via Spreading Metrics
Aurko Roy, Sebastian Pokutta


[clustering, xtij, ultrametric, graph, definition, constraint, ultrametrics, tree, linkage, spreading, induced, partitioning, hierarchy, introduced, cut, flat, vertex, equivalence, subset, thus, iff, joseph, possible] [cost, function, algorithm, let, lemma, set, every, feasible, combinatorial, theorem, relaxation, since, optimal, problem, studied, round, annual, will, arbitrary, defined, study, satisfies] [approximation, log, supplementary, size, factor, approximate, method] [following, solution, also, can, note, characterization, error, rounding, symposium, xij, one, denote, condition] [data, polynomial, given, corresponding, based, metric] [hierarchical, ball, time] [similarity, using, use, pair, work, approach, used, ground, several, sequence, natural]
Finite Sample Prediction and Recovery Bounds for Ordinal Embedding
Lalit Jain, Kevin G. Jamieson, Rob Nowak


[constraint, probability, pgd, possible, fact] [let, theorem, function, since, set, loss, define, consider, known, learning, show, will, sup, convex, problem, conference, defined, least, assume, case, observe] [log, logistic, gradient, optimization, operator, machine] [matrix, hlt, can, ordinal, norm, nuclear, gram, linear, dik, denote, link, paper, rank, kxi, see, noisy, error, centered, dij, solution, recover, following, orthogonal, magnitude, solving, recovery, snh, note, subspace, symmetric, item] [distance, embedding, given, euclidean, space, corresponding, kernel, data, two, random, section, result, multidimensional, based] [new, observation, information] [prediction, using, like, work, triplet, international]
Structured Matrix Recovery via the Generalized Dantzig Selector
Sheng Chen, Arindam Banerjee


[probability, present] [theorem, bound, general, constant, sup, assume, set, let, satisfies, define, defined, convex, upper, essentially, proof, lemma, will, absolute, implies, class, bounded] [stochastic, deterministic, dual] [norm, matrix, recovery, can, invariant, unitarily, measurement, restricted, width, analysis, error, following, spectral, compatibility, generic, one, chaining, completion, via, condition, noise, sparse, centered, symmetric, rest, seminorm, owl, regularized, high, denote, generalized, vector, dantzig, certain] [gaussian, geometric, random, given, exp, estimation, section, estimator, two, metric, associated, based, gauge, functional] [trace, model] [structured, using, used, unit, work, bounding, use, shown, different]
Online Bayesian Moment Matching for Topic Modeling with Unknown Number of Topics
Wei-Shou Hsu, Pascal Poupart


[number, basic, according, david, discovered, actual, resulting, contains, total] [online, learning, algorithm, set, conference, will, since, exponentially] [bayesian, posterior, likelihood, variational, inference, latent, log, compute, sampling, approximate, update, approximating, expectation] [can, first, also, spectral, exact, denote, global] [dirichlet, distribution, ddm, moment, tdm, test, data, topic, two, hdp, estimate, degenerate, alpha, uniform, well, lda, ohdp, corpus, hdps, fixed, nonparametric, word] [model, prior, hierarchical, found, modeling, new, directly, process, whereas, experimental, inferred] [used, matching, figure, using, different, text, use, dataset, propose, proposed, similar, computer, shown, tested, able]
Efficient Second Order Online Learning by Sketching
Haipeng Luo, Alekh Agarwal, Nicolò Cesa-Bianchi, John Langford


[dag, number, represents] [algorithm, online, regret, set, learning, appendix, bound, setting, since, loss, return, bounded, even, show, example, theorem, rate, round, constant, lower, study] [update, rad, newton, stochastic, datasets, gradient, method, step, compute, quadratic, efficient, size, siam, stepsize, develop, adagrad, deterministic, due, convergence] [matrix, sketch, order, sparse, sketching, can, diagonal, first, linear, second, invariant, error, running, sketched, one, def, projection, condition, synthetic, son, also, require, computational, small, vector, row, assumption, still, note, computation] [random, journal, two, data, fixed] [time, frequent] [using, similar, performance, work, three]
Adaptive Skills Adaptive Partitions (ASAP)
Daniel J. Mankowitz, Timothy A. Mann, Shie Mannor


[probability, number, definition] [learning, set, define, defined, adaptive, function, optimal, expected, now, may, algorithm, will, conference, derive, return, general, lifelong] [gradient, solve, objective, update, supplementary, parameterized, convergence] [can, vector, generalized, also, need, solution, matrix, necessary] [two, given, space, well, distribution, polynomial, refer, together] [skill, asap, policy, hyperplane, state, sps, framework, location, reward, action, mdp, automatically, trajectory, reinforcement, learns, misspecified, robocup, executing, striker, agent, hyperplanes, current, defender, ball, temporally, goal, reused, continuous, therefore, move, timothy, misspecification, flipped] [figure, learned, using, feature, domain, learn, different, multiple, task, shown, able, performance, single, seen, building, used]
Combining Fully Convolutional and Recurrent Neural Networks for 3D Biomedical Image Segmentation
Jianxu Chen, Lin Yang, Yizhe Zhang, Mark Alber, Danny Z. Chen


[structure, hierarchy, level] [may, learning, known, will, context, contextual, strategy] [size, method, university, evaluation] [can, one, component, generalized, much, also, first] [two, slice, result, section, based, six] [information, framework, new, neuron, exploit, along, neuronal] [segmentation, image, deep, fcn, different, biomedical, rnn, feature, convolutional, neural, using, layer, recurrent, training, input, fungus, lstm, performance, combining, map, network, four, scale, preprint, submodule, arxiv, sequence, convolution, architecture, used, clstm, output, approach, extracted, applied, propose, region, work, fully, original, dataset, voxel, use, figure, dame]
Combining Adversarial Guarantees and Stochastic Fast Rates in Online Learning
Wouter M. Koolen, Peter Grünwald, Tim van Erven


[probability, many, theory] [regret, loss, bernstein, learning, online, theorem, bound, hedge, will, case, squint, setting, fast, excess, expected, show, convex, let, consider, erven, implies, best, koolen, rate, oco, set, risk, inequality, may, function, lemma, hinge, example, appendix, audibert, round, bounded, learner, now, metagrad, adaptive, make, constant, equivalent, vtf, absolute, get] [stochastic, machine, optimization, expectation] [condition, can, order, also, high, main, linear, one, following, assumption, see] [distribution, section, statistical, family, result, point, two, data, infinite] [expert, van, automatically, allows] [sequence, use, compared, classification, using, prediction, used]
beta-risk: a New Surrogate Risk for Learning from Weakly Labeled Data
Valentina Zantedeschi, Rémi Emonet, Marc Sebban


[number, introduced] [learning, weakly, surrogate, label, risk, loss, algorithm, svm, instance, problem, function, set, wellsvm, show, will, convex, setting, class, defined, hypothesis, derive, optimal, depends, margin, context, permissible, minimizing, ell] [machine, standard] [can, one, matrix, iterative, formulation, linear, generic, first, vector, noisy, order, noise, min, initialization] [empirical, data, section, two, proportion, provided, given, positive, based, classical, corresponding, mean, gaussian, address] [new, information, deal, state] [labeled, different, using, supervised, training, unlabeled, used, learn, supervision, propose, fully, approach, figure, neural, weak]
Statistical Inference for Cluster Trees
Jisu Kim, Yen-Chi Chen, Sivaraman Balakrishnan, Alessandro Rinaldo, Larry Wasserman


[cluster, tree, merge, definition, pruning, tph, partial, dmm, clustering, valid, pruned, connected, gvhd, suitability, eldridge, find, height, topology, level, department, rule, contains] [confidence, set, appendix, consider, lemma, function, unknown, defined, will, focus, implies, exists, sup, define] [inference, bootstrap, convergence, university] [can, paper, order, see, first, also, mouse, computational, usa, denote] [density, metric, distortion, statistical, two, data, sample, empirical, construct, bandwidth, estimation, well, ring, kernel, positive, insignificant, estimated, functional, statistically, length, true, interleaving, mickey, estimate, distance, way, based, important, carnegie] [control, solid, information] [figure, several, use, three, propose, work, top]
Kronecker Determinantal Point Processes
Zelda E. Mariet, Suvrit Sra


[probability, partial, structure, obtained, monotonic, average] [learning, algorithm, set, cost, fast, problem, may, obtain, provide, show, will] [icard, dpp, kronecker, sampling, ron, determinantal, stochastic, large, dpps, efficient, iteration, updating, due, size, increase, faster, log, update, full, iterates, showed, gradient, approximate, key, mcmc, machine, initial] [can, matrix, also, product, one, exact, synthetic, real] [kernel, point, positive, definite, data, two, important, application, given] [time, model, pps, future] [training, using, dataset, ground, table, neural, evaluate, performance, shown, used, memory]
Learning feed-forward one-shot learners
Luca Bertinetto, João F. Henriques, Jack Valmadre, Philip Torr, Andrea Vedaldi


[number, possible, contains] [learning, function, problem, consider, best, class, case] [parameter, large, size, method, evaluation, objective, university, usually, oxford] [can, one, second, order, also, factorized, first, linear, error, solving, small] [given, two, space, embedding, distance] [tracking, model, dynamic, new, simple, baseline] [siamese, learnet, using, exemplar, deep, single, object, convolutional, layer, different, network, discriminative, neural, predicted, architecture, filter, image, output, learn, prediction, visual, character, training, predict, recognition, generative, approach, input, use, shared, trained, work, predicts, stream, performance, applied, frame, classification, figure]
A Probabilistic Framework for Deep Learning
Ankit B. Patel, Minh Tan Nguyen, Richard Baraniuk


[probabilistic, number, many, passing, configuration, path] [algorithm, learning, class, max, will, appendix, set, since, focus] [inference, latent, factor, processing, machine, key] [can, also, via, linear, principled, order, formulation, sparse, one] [mixture, test, corresponding] [model, information, new, argmax, framework, hierarchical, template, activity] [drmm, deep, nuisance, training, rendering, generative, dcns, neural, layer, dcn, preprint, arxiv, image, rmm, object, variation, task, work, different, labeled, figure, unsupervised, discriminative, performance, several, using, rendered, multiple, classification, table, convolutional, recognition, rfm, supervised, shallow, reconstruction, network, approach, pixel, relu]
Linear Feature Encoding for Reinforcement Learning
Zhao Song, Ronald E. Parr, Xuejun Liao, Lawrence Carin


[number, represented, theory, block, indicator, probability, represents] [function, algorithm, set, learning, problem, theorem, optimal, expected, since, may, provide, example] [approximation, approximate, method, additional] [linear, error, can, good, one, sufficient, following, also, sampled, projection, see, supplemental, matrix] [random, selection, two, fixed, point, based, given, space, result, dimensional, needed] [value, raw, policy, state, reinforcement, next, bellman, action, encoding, model, reward, encoded, pendulum, parr, mdp, angle, controlled, taking, inverted, extend, framework] [feature, encoder, using, training, deep, representation, approach, use, work, used, performance, decoder, figure, previous, prediction, recent, neural, single, learn]
Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks
Tim Salimans, Diederik P. Kingma


[average] [max] [] [global, dimension, noise, whitening] [gaussian, type] [pool, raw] [relu, leaky, conv, network, dropout, neural, input, layer, architecture, table, softmax, output, rgb]
Dual Learning for Machine Translation
Di He, Yingce Xia, Tao Qin, Liwei Wang, Nenghai Yu, Tie-Yan Liu, Wei-Ying Ma


[message, association, according, quality] [learning, will, set, algorithm, may, conference] [machine, dual, parallel, gradient, log, beginning, warm, method, processing] [can, first, second, computational, following, one] [two, data, based, section, statistical, versus, france] [model, reward, game, agent, reinforcement, search, feedback, pseudo, communication, policy, baseline] [translation, sentence, language, monolingual, bilingual, neural, using, nmt, used, training, trained, improve, natural, source, mechanism, outperforms, table, translate, beam, deux, original, bleu, performance, train, smid, recurrent, use, learn, middle, generate, back, proposed, work, generated, aligned]
Exponential expressivity in deep neural networks through transient chaos
Ben Poole, Subhaneil Lahiri, Maithreyi Raghu, Jascha Sohl-Dickstein, Surya Ganguli


[propagation, thus, number, grows, theory] [exponentially, function, consider, complexity, notion] [large, compute, processing] [can, vector, linear, one, principal, analysis, iterative, phase, high, highly, first, small] [length, curvature, fixed, point, random, curve, dimensional, manifold, chaotic, geometry, space, circle, tangent, boundary, metric, theoretical, exponential, euclidean, expressivity, gaussian, riemannian, distribution, mean, radius, curved, two, growth, gauss, chaos, nonlinear, hli, sigmoidal, quantitatively, measure] [extrinsic, across, information, decision, activity, neuron, simple, correlation, focused, evolution] [deep, neural, layer, input, map, network, depth, shallow, hidden, feedforward, understand, unit, figure, weight, preprint, nonlinearities]
Convex Two-Layer Modeling with Latent Structure
Vignesh Ganapathiraman, Xinhua Zhang, Yaoliang Yu, Junfeng Wen


[negative, graph, structure, total, number] [convex, learning, max, set, problem, optimal, function, relaxation, will, assume, let, since, algorithm, hull, setting] [latent, optimization, inference, extreme, objective, quadratic, term, gradient, method, convy, operator, machine, recently] [can, min, linear, first, via, rank, error, cvx, mrr, one, occluded, local, vector, arg, also, small, inpainting, support, hard] [conditional, given, random, two, test, word, based] [model, framework, therefore, value] [structured, using, output, discriminative, letter, deep, training, matching, polar, layer, image, unsupervised, map, achieved, representation, weight, patch, performance, prediction, proposed, used, transliteration, higher]
Reconstructing Parameters of Spreading Models from Partial Observations
Andrey Lokhov


[dmp, psi, node, rec, number, spreading, probability, transmission, partial, diffusion, fact, hts, infection, directed, cascade, involving, tree, fdmp, propagation, infected] [case, problem, learning, algorithm, conference, complexity, even, set, general, defined, will, since] [likelihood, initial, marginal, optimization, large, step, free, energy, full, inference] [can, observed, missing, recovery, one, exact, assumption, small, accurate] [data, given, equation, section, corresponding, well, equal, based, fixed, important] [time, information, dynamic, model, state, complex, observation] [network, reconstruction, using, activation, hidden, figure, international, used, original, approach, use, part]
Minimizing Regret on Reflexive Banach Spaces and Nash Equilibria in Continuous Zero-Sum Games
Maximilian Balandat, Walid Krichene, Claire Tomlin, Alexandre Bayen


[probability, definition, sum, strong, subset] [regret, set, convex, function, let, online, general, learning, bound, uniformly, theorem, play, nash, strategy, banach, consider, essentially, modulus, will, dom, problem, reflexive, algorithm, lower, example, case, obtain, upper, minimizing, show, proper, assume, repeated, duality, corollary, make, defined, class, continuity] [dual, averaging, strongly, log, semicontinuous, method, additional, supplementary] [assumption, suppose, regularizer, can, denote, sublinear, analysis, also, explicit, following, convexity] [space, metric, section, finite, measure, empirical, given] [player, reward, continuous, payoff, game, action, decision, corresponds, choice] [sequence, compact]
Agnostic Estimation for Misspecified Phase Retrieval Models
Matey Neykov, Zhaoran Wang, Han Liu


[variable, program, theory, knowledge] [algorithm, convex, least, class, function, optimal, case, constant, let, even, satisfies, considered, obtain, since, index, will, exists, assume, inequality, problem, implies, learning, lemma] [log, step, size, full, parameter, requires, refined, solve, standard, quadratic] [mpr, phase, can, retrieval, condition, vector, following, second, sparse, also, first, one, init, link, noisy, proposition, twf, tpm, principal, solving, high, regression, dimension, regularized, linear, semidefinite, sims, analysis, via, sufficiently, addition, satisfying] [estimate, sample, estimation, procedure, two, data, random, based, journal, given, section, gaussian, result, statistical] [model, direction, suggested, information] [proposed, approach, figure, using, single, propose, applied]
Yggdrasil: An Optimized System for Training Deep Decision Trees at Scale
Firas Abuzaid, Joseph K. Bradley, Feynman T. Liang, Andrew Feng, Lee Yang, Matei Zaharia, Ameet S. Talwalkar


[tree, ggdrasil, partitioning, split, worker, vertical, number, node, lanet, master, horizontal, xgb, sorted, uncompressed, yggdrasil, bitvector, spark, bitvectors, runtime, yahoo, assigned, ttruth, cache, leaf, total] [learning, cost, optimal, algorithm, set, oost, compression, best] [distributed, iteration, large, compute, requires, faster, machine, implementation, discretization, sequential] [can, sparse, one, via, order, arg, sufficient, compressed, linear, need] [data, two, well, random, section, scan] [communication, decision, time, encoding, continuous, optimized, across, rather, new] [training, feature, depth, memory, using, figure, single, mnist, candidate, without, better, deep, dataset, impact, performance, perform, several]
RETAIN: An Interpretable Predictive Model for Healthcare using Reverse Time Attention Mechanism
Edward Choi, Mohammad Taha Bahadori, Jimeng Sun, Joshua Kulas, Andy Schuetz, Walter Stewart


[reverse, number, represented] [learning, contribution, risk, will, case, index, context, appendix, set, label] [step, size, logistic, machine] [can, vector, one, order] [clinical, patient, two, given, embedding, interpretability, test, data, comparable, electronic, specific, word, dpm] [time, visit, retain, model, predictive, ehr, medical, information, health, heart, interpretable, esl, modeling, disease, medication, temporal, healthcare, failure, interpretation, wemb, skin, making] [attention, use, rnn, using, input, prediction, sequence, neural, accuracy, figure, generate, recurrent, hidden, generating, predict, layer, mechanism, single, training, preprint, arxiv, rnns, traditional, softmax, used, table, mlp, predicting, generation, different]
Mixed vine copulas as joint models of spike counts and local field potentials
Arno Onken, Stefano Panzeri


[number, probability, tree, thus, obtained, normalized] [margin, function, algorithm, cumulative, will, complexity] [mixed, copula, likelihood, vine, method, lfps, apply, clayton, sampling, parameter, quadratic, full, inference, standard, efficient, fit, derivative] [can, one, denote, analysis, second, condition, need] [discrete, independent, multivariate, entropy, statistical, distribution, data, density, estimate, mutual, based, section, two, joint, gaussian, statistic, construct, estimation, corresponding, journal, sample, denotes, estimated, population] [model, continuous, information, spike, framework, activity, simulated, stimulus, concurrent, brain] [neural, network, different, fully, pair, used, using, field, approach, input]
Optimal Cluster Recovery in the Labeled Stochastic Block Model
Se-Young Yun, Alexandre Proutiere


[misclassified, number, sbm, probability, lsbm, cluster, community, block, detection, clustering, present, average, edge, graph, partition, obtained, independently, possible, degree, social] [algorithm, asymptotically, label, set, theorem, let, consider, assume, binary, optimal, general, exists, since, proof, problem, max, provide, expected, show, prove] [stochastic, log, supplementary] [can, recovery, condition, spectral, high, exact, observed, first, denote, item, necessary, matrix, also, minimal, accurate, column, sparse, sufficient, singular, one, symmetric] [random, two, corresponding, proportion, true, positive, reference, section, estimate, given] [model, establish, observation] [using, part, hidden, performance, pair, labeled, work, generated, consists, without]
Deep Learning Games
Dale Schuurmans, Martin A. Zinkevich


[constraint, vertex, number, theory] [learning, set, regret, nash, protagonist, antagonist, convex, problem, utility, algorithm, appendix, loss, since, function, equilibrium, online, chooses, will, dvt, defined, strategy, best, zannis, conference, zanni, define, consider, playing, appears, finding, even, differentiable, considered, competitive, folded] [gradient, optimization, stochastic, reduction, constrained, parameter, machine, standard, descent] [can, one, also, global, first, interesting, kkt, critical] [given, joint, point, two, data, affine] [action, game, expert, allows, model, simple, information, choice, alternative] [training, deep, supervised, neural, network, output, using, input, figure, activation, investigate, used, matching, conducted, international, feedforward]
Synthesis of MCMC and Belief Propagation
Sung-Soo Ahn, Michael Chertkov, Jinwoo Shin


[loop, zloop, annealing, number, cycle, probability, worm, ising, partition, graph, bethe, describe, external, graphical, wmin, interaction, pairwise, propagation, science, called, strength, computable, adding] [algorithm, set, theorem, scheme, general, function, binary, case, consider, known, since, strategy, problem] [mcmc, approximation, log, full, inference, approximating, computing, experiment, efficient, sampling, end] [generalized, following, can, first, one, error, via, also, running, popular] [section, journal, estimating, provides, statistical, estimate, sample, given, random, basis, described, distribution, sampler, polynomial] [series, design, model, simulated, allows, belief, information, experimental] [using, generate, proposed, planar, use, approach, propose]
Disentangling factors of variation in deep representation using adversarial training
Michael F. Mathieu, Junbo Jake Zhao, Aditya Ramesh, Pablo Sprechmann, Yann LeCun


[obtained, variable, belonging, number] [learning, class, set, label, show, observe, access, loss] [approximate, posterior, interpolation, latent, evaluation, likelihood, variational, dec, vae, supplementary, log] [can, one, component, solving, still] [specified, given, conditional, data, two, test, described, procedure, corresponding] [model, information, goal] [unspecified, generative, training, variation, using, deep, able, different, figure, used, adversarial, neural, trained, image, norb, use, learned, disentangle, network, dataset, generated, arxiv, representation, swapping, preprint, unseen, approach, proposed, learn, generate, hidden, object, disentanglement, sprite, visualization, work, yann, supervision, gan, separate, style, without, shown, identity]
Adaptive Neural Compilation
Rudy R. Bunel, Alban Desmaison, Pawan K. Mudigonda, Pushmeet Kohli, Philip Torr


[program, probability, number, present, possible, array, find, compilation] [will, learning, algorithm, set, problem, differentiable, make, element, loss, show, example, version, learnable] [optimisation, machine, efficient, gradient, solve, initial, descent, step, supplementary] [can, first, one, also, generic, matrix, success, good, need] [given, two, data, associated, based, section, biased, type, way] [value, model, controller, instruction, compiler, simple, register, linked, irt, written, required, tape, code, stop, jez, read, execution, design, target, head, correct, found] [neural, use, output, input, memory, figure, learned, using, perform, representation, able, different, work, original, computer, presented, without, learn]
A state-space model of cross-region dynamic connectivity in MEG/EEG
Ying Yang, Elissa Aminoff, Michael Tarr, Robert E. Kass


[describe, indicates, among] [dependence, set, may, assume] [log, step, method, standard, distributed, processing] [can, linear, connectivity, also, one, error, noise, leading, localization, relative, norm, denote, real, matrix, first, solution, diagonal] [given, estimate, mean, data, two, space, independent, point, estimation, covariance, gaussian, estimating] [model, activity, time, roi, dynamic, mne, sensor, meg, evc, across, within, ppa, information, state, along, lagged, trial, cortical, right, brain, feedback, current, simulated, effect, cortex, interest] [source, using, figure, use, visual, neural, applied, flow, scene, previous, different, multiple]
Fast and Provably Good Seedings for k-Means
Olivier Bachem, Mario Lucic, Hamed Hassani, Andreas Krause


[quality, clustering, probability, total, cluster, number, science] [algorithm, set, complexity, expected, fast, even, guarantee, theorem, let, consider, provide, uniformly, bound, optimal, competitive, since] [free, proposal, step, log, full, requires, provably, kdd, key, approximation, initial, sampling, rna] [solution, error, computational, can, preprocessing, first, sampled, one, main, good, denote, also] [data, chain, ssumption, seeding, distance, length, distribution, random, theoretical, quantization, two, bachem, center, sample, afk, section, based] [markov, web] [using, table, propose, compared, without, different, outperforms, used]
Single-Image Depth Perception in the Wild
Weifeng Chen, Zhao Fu, Dawei Yang, Jia Deng


[agree, many, possible] [algorithm, will, query, loss, set, yet, learning, function] [method, datasets, large, full] [relative, ordinal, can, error, one, symmetric, still, existing, sampled, first, significantly] [metric, point, two, random, closer, data, estimation, sample, estimate, test, estimating, well] [new, unconstrained, human, system, intrinsic] [depth, network, image, using, zoran, dataset, nyu, single, training, trained, deep, figure, wild, eigen, train, consists, work, per, predicted, use, used, perception, superpixel, learn, performance, segmentation, approach, input, pair, convolutional, kinect, prediction, semantic, recent, preprint, arxiv, neural, outperforms, predicts]
Bayesian optimization for automated model selection
Gustavo Malkomes, Charles Schaff, Roman Garnett


[base, number, probabilistic, potential] [function, set, will, learning, conference, consider, define, active, expected, case, problem, focus, defined, best, arbitrary] [bayesian, machine, log, hyperparameters, optimization, compute, method, select, datasets, latent, processing, appropriate, approximation, iteration, especially] [can, also, via, note, regression, observed, one] [kernel, given, space, gaussian, two, distance, mean, covariance, infinite, hellinger, data, described, selection, distribution, automated, construct, squared, associated, random, grammar, fixed, well] [model, evidence, prior, search, process, hyperparameter, explain, information, complex, acquisition, simple, observation, modeling, value, framework] [dataset, training, used, neural, use, using, work, per, candidate, perform, similarity, approach, proposed, novel]
Learning Influence Functions from Incomplete Observations
Xinran He, Ke Xu, David Kempe, Yan Liu


[influence, diffusion, incomplete, node, dic, retention, seed, dlt, edge, social, number, probability, cascade, cic, complete, graph, activated, improper, mae, notice, wuv, independently] [learning, function, set, pac, algorithm, learnability, let, problem, loss, rate, active, proper, theorem, class, will, focus, even, appendix, provide, learnable, consider] [parameter, efficient, likelihood, log] [can, missing, observed, also, error, following, one, synthetic, linear] [distribution, data, estimation, independent, random, true, result, sample, two] [model, information, goal, process, time, observation] [activation, use, using, learned, learn, network, dataset, figure, training, approach, three]
MoCap-guided Data Augmentation for 3D Pose Estimation in the Wild
Gregory Rogez, Cordelia Schmid


[number, normalized, average, probability, resulting] [learning, algorithm, show, set] [large, method, constrained, compute, blending, requires, size] [synthetic, error, real, can, also, cmu, existing, first] [estimation, data, joint, corresponding, two, given, considering, associated, distance] [human, model, new, information] [pose, training, image, using, different, dataset, trained, approach, synthesis, used, lsp, classifier, generate, annotated, cnn, use, mocap, deep, rendering, pixel, body, convolutional, motion, capture, train, similar, classification, engine, table, performance, single, compare, better, camera, figure, cnns, neural, source, captured, outperforms, recent, work, evaluate, map, aligned, text, mosaic]
Local Maxima in the Likelihood of Gaussian Mixture Models: Structural Results and Algorithmic Consequences
Chi Jin, Yuchen Zhang, Sivaraman Balakrishnan, Martin J. Wainwright, Michael I. Jordan


[probability, neighborhood, number, many] [algorithm, function, theorem, learning, least, will, problem, existence, scheme, constant, even, annual, show, conference, setting, consider, case, uniformly, optimal, provide, general, focus] [likelihood, gradient, convergence, saddle, converges, method] [local, initialization, can, global, one, critical, first, order, main, vector, arbitrarily, component, also, small] [mixture, gaussian, population, random, bad, true, two, maximum, result, point, mean, spherical, finite, gaussians, sample, converge, annals, section, statistical, srebro, distribution, gmms, favorable, estimation, practical, cgap, gmm, given, data, density] [model, form, search] [initialized, open, recent, work]
Correlated-PCA: Principal Components' Analysis when Data and Noise are Correlated
Namrata Vaswani, Han Guo


[number, enough, possible, block] [set, bound, let, problem, assume, algorithm, will, define, complexity, theorem, upper, example, least, lower, smaller, index, study, consider, since, hold, get, proof, interval] [log, large, compute, supplementary] [assumption, matrix, pca, one, can, small, also, first, evd, itt, subspace, noise, following, eigenvectors, principal, condition, denote, eigenvalue, pcp, solution, need, satisfying, sparse, error, much, singular, observed, solving, incoherence, mutually, completion, zero, nonzero] [data, given, sample, robust, result, covariance, true, mean, refer, generalization] [time] [video, using, use, top, used, sequence, moving, neural]
Sample Complexity of Automated Mechanism Design
Maria-Florina F. Balcan, Tuomas Sandholm, Ellen Vitercik


[hierarchy, payment, split, describe, theory, greater] [revenue, complexity, auction, reserve, bidder, function, optimal, class, combinatorial, set, valuation, allocation, bundle, show, learning, conference, prove, vcg, ama, theorem, proof, bound, consider, tuomas, grand, bundling, rademacher, every, let, may, price, annual, designer, study, mbarps, reva, mbarp, upper, defined, interval, general, derive, expected, analyze, morgenstern, problem] [log, deterministic, convergence, machine, mixed, supplementary, optimization] [can, item, one, much, following, vector, symposium] [sample, automated, two, section, uniform, distribution, space, fixed] [design, determine, simple, must] [mechanism, used, work]
Multimodal Residual Learning for Visual QA
Jin-Hwa Kim, Sang-Woo Lee, Donghyun Kwak, Min-Oh Heo, Jeonghee Kim, Jung-Woo Ha, Byoung-Tak Zhang


[number, block, multiplication] [learning, conference, since, may, function, idea] [method, extra, machine, processing] [linear, vector, explicit, first, can, overall, dimension] [joint, based, given, embedding, section] [model, information, effect, alternative, effectively] [visual, residual, attention, question, deep, tanh, neural, using, feature, mapping, multimodal, image, preprint, arxiv, table, used, figure, identity, shortcut, mrn, output, pretrained, attentional, vqa, input, use, learn, dataset, international, representation, language, visualization, shown, various, spatial, multiple, vision, accuracy, novel, recent, propose, answering, rnn, mechanism, performance]
Gaussian Processes for Survival Analysis
Tamara Fernandez, Nicolas Rivera, Yee Whye Teh


[covariate, probability, number, proportional, knowledge, variable] [function, set, algorithm, observe, scheme, interval, choose, consider, since, just, let] [inference, bayesian, method, parameter, approximation, sampling, line, tractable, processing] [can, first, following, proposition, analysis, one, need, second, noisy, overall, denote, synthetic] [survival, gaussian, data, hazard, random, covariates, sample, given, kernel, nonparametric, point, distribution, treatment, cox, censoring, weibull, rejected, centred, exponential, independent, type, accepted, jump, way, length, estimator, parametric, associated, gamma, patient, annals, two] [model, process, time, prior, poisson, baseline, intensity, right, expert] [score, use, figure, using, scale, neural, perform, used]
Image Restoration Using Very Deep Convolutional Encoder-Decoder Networks with Symmetric Skip Connections
Xiaojiao Mao, Chunhua Shen, Yu-Bin Yang


[average, block, contains] [learning, achieve, show, observe, achieves, obtain, best, even] [size, evaluation, gradient, large, method] [can, ieee, recover, noise, much, recovering, existing, second, sparse, one, corrupted, symmetric, also, first, handle] [two, based, testing] [model, framework, passed] [image, network, convolutional, skip, denoising, deep, deconvolution, deconvolutional, using, training, performance, psnr, different, better, ssim, table, neural, single, restoration, shown, convolution, feature, proposed, fully, input, layer, use, figure, propose, clean, deeper, used, multiple, residual, output, cscn, slightly, without, trained, dnn, train, work]
Long-term Causal Effects via Behavioral Game Theory
Panagiotis Toulis, David C. Parkes


[causal, assignment, assigned, thus, definition, possible, according, theory, structural] [algorithm, set, every, assume, let, defined, revenue, strategy, reserve, expected, price, known, auction, play] [experiment, step, inference, objective, latent, method, university] [assumption, can, observed, also, denote, vector, one, matrix] [population, two, data, treatment, denotes, estimate, sample, given, estimation, conditional, space, journal, statistical, estimated, random] [policy, agent, model, behavioral, game, effect, time, behavior, experimental, new, temporal, across, action, payoff, economic, multiagent, information, economy, value, baseline, assuming, framework, therefore, evolve, american, infer, rapoport, dynamical] [figure, approach, different, used, evaluate]
Hardness of Online Sleeping Combinatorial Optimization Problems
Satyen Kale, Chansoo Lee, David Pal


[graph, path, called, theory] [set, online, sleeping, regret, problem, combinatorial, learning, algorithm, nline, loss, instance, round, hardness, learner, awake, adversary, agnostic, dnf, algosco, hortest, since, now, pac, every, let, chooses, disjunction, least, define, property, inimum, show, satisfies, prove, algdisj, bipartite, index, richness, maximal, heaviness, label, implies, known, upper, lower, kanade, consider, sla] [optimization, efficient, size, reduction, stochastic, parameter] [hard, ranking, exactly, note, running, one] [given, two, result, fixed, computationally] [action, time, decision, policy] [ground, figure, shown, open, input, available, matching, labeled]
A forward model at Purkinje cell synapses facilitates cerebellar anticipatory control
Ivan Herreros, Xerxes Arsiwalla, Paul Verschure


[rule, contains, level, interaction, actual, basic, transmission] [learning, adaptive, optimal, will, even, scheme, algorithm] [gradient, smooth, descent, update, parallel] [error, can, signal, linear, note, one, matrix, computational, also, min] [reference, basis, given, conditioning, based, transport, difference, provides] [control, cerebellar, eligibility, feedback, model, anticipatory, trace, system, time, cfpc, motor, cerebellum, reactive, controller, purkinje, trial, forward, current, plant, response, pursuit, cell, module, target, counterfactual, delay, fiber, impulse, synaptic, predictive, anatomical, within, receives, indicate, simulation, behavior, eye] [output, architecture, use, layer, performance, generate, using, input, filter, weight, neural, task, figure, position]
Infinite Hidden Semi-Markov Modulated Interaction Point Process
Matt Zhang, Peng Lin, Ting Guo, Yang Wang, Fang Chen


[probability, ancestor, interaction, represent, number, variable, developed] [set, function, defined, drawn, consider] [latent, inference, energy, sampling, bayesian, stochastic, step, consumption] [can, via, following, first, also, synthetic, one] [point, hmm, emission, kernel, conditional, infinite, distribution, data, measure, hdp, sample, resampling, space, associated, sampler, distance, hamming, dirichlet, two, given, random, section] [state, process, event, triggering, model, arrival, intensity, hawkes, transition, time, temporal, particle, correlation, failure, poisson, new, markov, emitted, smc, observation, trading, prior, duration, information, pipe] [proposed, hidden, used, use, sequence, background, part, performance, compared]
Improved Techniques for Training GANs
Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, Xi Chen


[quality, find, number, thus, ian, becomes] [learning, loss, nash, may, cost, virtual, function, equilibrium, distinguish, class, label] [minibatch, batch, normalization, gradient, objective, descent, standard, large] [can, real, one] [data, point, metric, well, discrimination, described, sample, distribution, given, test] [model, human, new, player] [training, generative, using, discriminator, generator, generated, arxiv, adversarial, preprint, inception, deep, gan, pmodel, gans, use, feature, classifier, neural, score, matching, figure, layer, image, output, network, generate, approach, table, proposed, work, dataset, lunsupervised, single, trained, learn, able, labeled, train, unsupervised, collapse, supervised, used, shown]
Showing versus doing: Teaching by demonstration
Mark K. Ho, Michael Littman, James MacGlashan, Fiery Cushman, Joseph L. Austerweil


[probability, number, possible, department] [learning, show, function, even, concept, optimal, will, best, example, learner, algorithm, conference, choose, set, chooses] [experiment, bayesian, standard, university, parameter] [can, one, signal, also, condition, row, first] [people, two, given, well, true, distribution, maximum] [reward, model, teaching, showing, pedagogical, goal, irl, safe, behavior, reinforcement, tile, teacher, inverse, agent, observer, pdoing, state, expert, human, teach, trajectory, simply, colored, policy, action, won, value, pobserving, planning, demonstration, told, location, infer, grid, worth] [better, shown, figure, task, learned, international, three, learn, calculated]
A Communication-Efficient Parallel Algorithm for Decision Tree
Qi Meng, Guolin Ke, Taifeng Wang, Wei Chen, Qiwei Ye, Zhi-Ming Ma, Tieyan Liu


[number, tree, attribute, voting, according, split, histogram, thus, probability, ctr, node, ltr, find, parallelize, larger, total, bestsplit, leftsum, gbdt, informativeness, becomes] [algorithm, best, will, since, cost, lower, theorem, learning, bound, achieve, choose, even, smaller] [parallel, machine, size, select, large, sequential, big, boosting] [local, can, global, one, globally, also, order, need, comparison, small, analysis, see, following, rank, good, note, much, high] [data, point, theoretical, based, quantized, two] [decision, communication, information, gain, new, process, communicate, speed] [training, accuracy, used, different, table, classification, using, volume, figure, proposed, similar, conducted]
Unsupervised Learning from Noisy Networks with Applications to Hi-C Data
Bo Wang, Junjie Zhu, Armin Pourshafeie, Oana Ursu, Serafim Batzoglou, Anshul Kundaje


[number, community, genomic, genome, interaction, detection, clustering, node, structure, denoise, chromosome, chromatin, ctcf, incorporate, subset] [problem, may, minimize, set, function, algorithm, confidence, general, learning, appendix, will, cost] [optimization, method, respect, objective, solve, protein, due] [can, matrix, noisy, noise, denoised, order, one, also, link, local, small, addition, confident, missing, observed, regulatory, high, alternating, highly, detecting, histone] [data, important] [framework, information, biological, baseline, subject, determine, value, human] [network, using, used, resolution, different, sij, figure, use, map, denoising, approach, multiple, ground, performance, weight, stanford, capture, unsupervised, single, binding]
Identification and Overidentification of Linear Structural Equation Models
Bryant Chen


[identification, identified, identify, causal, graph, structural, identifiable, directed, pearl, node, admissible, recursive, definition, coefficient, earl, criterion, bidirected, independence, removing, constraint, tian, allowed, edge, descendant, developed, chen, ian, discovery, dormant, path, shpitser, recursively, testable, overidentifying, identifying, resulting, allow, brito, discovered, many, connected, verma, ezy] [set, will, may, algorithm, conference, theorem, show, let, since, satisfies, equivalent, function, exists, obtain] [method, additional] [linear, can, decomposition, error, also, matrix, first] [equation, two, given, covariance, distribution] [model, artificial, uncertainty, head, direct] [using, figure, able, applied, use]
Equality of Opportunity in Supervised Learning
Moritz Hardt, Eric Price, Nati Srebro


[threshold, false, attribute, definition, parity, possible, intersection, horizontal, thus, getting] [optimal, roc, notion, loss, will, might, binary, function, fairness, rate, learning, outcome, always, max, convex, consider, since, satisfies, case, depends, satisfy] [oblivious, requires, big] [can, group, also, fraction, linear, profit, accurate, one, require, def] [predictor, equalized, odds, protected, equal, opportunity, positive, derived, fico, demographic, true, two, data, bayes, curve, joint, loan, distribution, conditional, point, well, people, based, deriving, result, construct, given, random, specified] [target, white, within, goal, information] [score, different, figure, training, supervised, using, used, prediction, use, better, single]
Combinatorial semi-bandit with known covariance
Rémy Degenne, Vianney Perchet


[possible, number, structure, coefficient, find] [regret, bound, algorithm, let, case, problem, lemma, combinatorial, bounded, bandit, subgaussian, general, lower, setting, will, arm, upper, confidence, appendix, ivt, show, theorem, combes, proof, smaller, consider, least, get, pulled, choosing, set, online, unknown, learning, kveton] [log, stage, term, supplementary, parameter, stochastic, processing, variance, due] [matrix, linear, can, one, analysis, suppose, regression, also, diagonal, maxi, union, following] [independent, covariance, positive, maximum, equal, given, two, finite, mean, estimator] [action, information, event, reward, design, exploration, goal, form] [using, used, use, neural, different]
Iterative Refinement of the Approximate Posterior for Directed Belief Networks
Devon Hjelm, Ruslan R. Salakhutdinov, Kyunghyun Cho, Nebojsa Jojic, Vince Calhoun, Junyoung Chung


[directed, graphical, many, average, number] [learning, adaptive, algorithm, conference, general] [refinement, posterior, approximate, variational, importance, air, inference, monte, sampling, carlo, gradient, variance, lowerbound, stochastic, log, irvi, refined, step, latent, likelihood, additional, initial, sequential, sbn, reweighted, rws, university, sigmoid, machine, autoregressive, darn, unbiased] [can, iterative, also] [estimate, test, sample, discrete, well, conditional, true, density, provided, procedure] [model, belief, demonstrate] [recognition, network, generative, using, trained, training, used, neural, better, deep, effective, arxiv, preprint, approach, use, available, improve, mnist, train, international, final, layer, work, improves, figure]
Unifying Count-Based Exploration and Intrinsic Motivation
Marc Bellemare, Sriram Srinivasan, Georg Ostrovski, Tom Schaul, David Saxton, Remi Munos


[probability, number, proportional, den] [learning, may, conference, consider, problem, theorem, notion, general, will, ratio, now, define, maximizing, since, defined, provide, optimistic] [progress, posterior, approximation, machine] [also, assumption, related, can, one, zero, error, note] [density, empirical, distribution, theoretical, result, derived, particular, two, lim] [model, exploration, information, intrinsic, motivation, gain, atari, reinforcement, state, agent, bonus, salient, visit, event, across, policy, environment, uncertainty, must, tabular, artificial, arcade, game, reward, transition, van, recoding, motivated] [prediction, sequence, used, score, use, figure, neural, approach, international, deep, training, without, performance]
Bayesian Optimization with a Finite Budget: An Approximate Dynamic Programming Approach
Remi Lam, Karen Willcox, David H. Wolpert


[programming, number, configuration, possible] [function, algorithm, optimal, budget, set, utility, problem, greedy, defined, expected, best, define, worst, strategy, discount, improvement, consider, maximizes, conference] [optimization, objective, bayesian, iteration, posterior, approximate, nested, initial, fmin, evaluation, lookahead, expensive, variance, stage, rolling, computed, characterized, machine, approximation, respect, auxiliary] [can, global, gap, note, following, solving, also, formulation, one, existing] [mean, finite, given, gaussian, space, journal, statistical, distribution] [rollout, design, reward, dynamic, state, control, value, heuristic, median, system, policy, next, simulated, information, model, new] [using, use, training, used, performance, propose, several, evaluate, proposed, achieved, table, approach]
Linear Relaxations for Finding Diverse Elements in Metric Spaces
Aditya Bhaskara, Mehrdad Ghadiri, Vahab Mirrokni, Ola Svensson


[subset, number, quality, many, incorporate, clique, obtained, total] [diversity, set, algorithm, let, matroid, will, consider, xir, maximization, randomized, show, chosen, now, greedy, problem, constant, implies, lemma, cardinality, known, bound, every, study, theorem, relevance, general, since] [approximation, objective, factor, size, step, large, additional, picked] [can, rounding, also, one, solution, linear, note, ieee, local, suppose, denote, planted, much, main] [metric, two, section, well, distance, data, space, result, selection] [search, value, maximize, goal, new, web] [figure, used, better, different, dataset, feature, output, computer, pair]
f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization
Sebastian Nowozin, Botond Cseke, Ryota Tomioka


[reverse, probabilistic] [function, learning, class, pearson, provide, algorithm, show, bound, general, set, lower, convex, case, minimization, defined, special] [variational, divergence, objective, log, method, saddle, approximate, gradient, supplementary, conjugate, likelihood] [can, one, also, linear, see, note] [sample, distribution, estimation, two, true, given, density, squared, kde, hellinger, kernel, mixture, point, random, data, estimate, nguyen, difference, test] [model, information, framework] [generative, neural, gan, training, using, table, use, generator, mnist, output, train, used, network, work, activation, trained, different, figure, proposed, deep, approach, input, natural, final, representation, learned, goodfellow, three, layer]
Avoiding Imposters and Delinquents: Adversarial Crowdsourcing and Peer Prediction
Jacob Steinhardt, Gregory Valiant, Moses Charikar


[reliable, constraint, probability, block, number, crowdsourcing, detection, influence, community, quality, graph, according, actual, worker, among, possible] [algorithm, let, set, will, obtain, lemma, setting, online, even, general, may, show, assume, concentration, problem, remaining, least, proof, rate, consider, conference] [stochastic, key, large, involves] [raters, matrix, can, item, rating, nuclear, norm, noisy, rater, peer, mij, also, row, recover, vector, assumption, note, suppose, denote, semirandom, proposition, fraction, one, via, following, sparse, good, much, technical, arbitrarily, necessary, recovering, main] [section, given, selected, random, true, robust, result, independent] [model, information, goal] [using, work, randomly, output, amount, evaluate]
Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm
Qiang Liu, Dilin Wang


[find] [set, algorithm, function, learning, general, since, let, consider, case, problem] [variational, gradient, inference, descent, stein, bayesian, log, method, monte, transforms, carlo, smooth, size, iteration, stochastic, divergence, posterior, expectation, sgld, parallel, sequential, optimization, large, initial, jacobian, requires, step, approximate, derivative, full, calculate] [can, also, matrix, perturbation, zero, one, see, small, via, need, kernelized] [distribution, kernel, mean, density, gaussian, random, functional, data, test, rbf, pbp, reduces, empirical, theoretical, procedure, discrepancy, independent] [particle, target, form, simple, new, except, purpose, take, prior, inverse] [using, use, identity, map, figure, neural, different, transform, preprint, arxiv, used, single]
Batched Gaussian Process Bandit Optimization via Determinantal Point Processes
Tarun Kathuria, Amit Deshpande, Pushmeet Kohli


[probability, subset] [regret, function, algorithm, est, ucb, choose, maximization, set, chosen, bound, bandit, theorem, consider, problem, will, appendix, proof, defined, improvement, cumulative, show, just, confidence, greedy, case, upper, expected, choosing, let] [batch, optimization, sampling, dpp, bucb, size, bayesian, posterior, determinantal, iteration, fastxml, large, method, batched, machine, compute, extreme, due, experiment] [via, can, one, also, first, arg, much, popular] [kernel, point, based, gaussian, maximum, entropy, two, provided, mean, selected, given, selection] [process, information, search, value, robot, next] [better, performance, using, presented, work, dataset, perform]
The Power of Optimization from Samples
Eric Balkanski, Aviad Rubinstein, Yaron Singer


[number, many, obtained, influence, constraint, subset] [submodular, set, algorithm, function, optimal, obtain, monotone, contribution, lemma, learning, show, loss, since, element, best, bound, let, assume, greedy, expected, consider, cardinality, property, case, learnable, defined, polynomially, exists, concentration, impossibility, known, return, theorem, least, inequality, proof, tight, drawn, feasible, bounded] [log, approximation, optimization, marginal, size] [good, first, can, solution, analysis, second, main, also, high, small] [curvature, random, bad, distribution, given, two, result, construct, data] [value, model, poor, information, goal, optimize, decision] [learned, figure, using, work, better, consists, learn, used]
Single Pass PCA of Matrix Products
Shanshan Wu, Srinadh Bhojanapalli, Sujay Sanghavi, Alexandros G. Dimakis


[number, spark] [algorithm, optimal, show, set, let, appendix, theorem, now, smaller, case, streaming, may, complexity, ratio, observe, online, annual] [approximation, compute, size, step, pass, sampling, large, log, distributed, datasets, machine, standard] [matrix, spectral, can, norm, rank, low, error, product, sketch, lela, column, one, sketching, also, sampled, dot, rescaled, first, qij, synthetic, cone, computes, pca, kai, note, kbj, entry, vector, svd, symposium, analysis] [two, given, embedding, sample, random, data, important] [information, angle, desired] [figure, better, using, compared, perform, single, three, use, dataset, performance, preprint, arxiv, outperforms, compare]
Efficient Nonparametric Smoothness Estimation
Shashank Singh, Simon S. Du, Barnabas Poczos


[probability, number] [since, theorem, let, general, may, function, bound, case, class, minimax, smoothing, learning] [inner, parameter, processing, efficient, convergence, variance, large, divergence] [can, error, norm, computational, denote, related, one, also, order, suppose] [sobolev, distance, estimator, density, test, estimation, nonparametric, estimated, asymptotic, estimating, section, statistical, squared, true, kernel, functionals, estimate, smoothness, functional, mean, theoretical, data, fourier, computationally, multivariate, bias, testing, journal, denotes, based, distribution, space, sample, gaussians, statistically, suggest, uniform] [information, time] [work, figure, neural, use, used, different, using]
Sequential Neural Models with Stochastic Layers
Marco Fraccaro, Søren Kaae Sønderby, Ulrich Paquet, Ole Winther


[average, obtained, structure, report] [learning, dependence, set, will, depends, show] [inference, stochastic, variational, srnn, posterior, approximation, deterministic, latent, log, blizzard, music, timit, parameterization, sequential, ssm, approximate, polyphonic, depend, parameterized, term, elbo, monte, compute, likelihood, resq, vrnn, storn, intractable] [can, also, one, see] [distribution, data, test, mean, nonlinear, gaussian, space, section, true, two, done, given] [model, state, time, prior, modeling, information, therefore, predictive, raw, temporal, uncertainty, future] [network, neural, using, recurrent, figure, rnn, sequence, generative, used, speech, hidden, use, improve, table, rnns, shown, residual, deep, output]
Stochastic Structured Prediction under Bandit Feedback
Artem Sokolov, Julia Kreutzer, Stefan Riezler, Christopher Lo


[pairwise, criterion, structure, present, partial, number, obtained] [learning, algorithm, loss, bandit, constant, function, expected, complexity, convex, optimal, minimization, set, best, smallest, lipschitz, smaller, since, sfo, will] [stochastic, convergence, gradient, machine, optimization, objective, standard, development, evaluation, respect, variance, chunking, iteration, full, update, large, analyzed] [preference, can, norm, analysis, following, interactive, sampled, note] [data, squared, given, asymptotic, numerical, practical, based, distribution, result, exponential, estimate] [feedback, information, model, speed, form, motivated] [task, structured, output, prediction, translation, feature, use, performance, training, approach, input, predicted, table, using, presented, sequence, gold, seen, three]
Bayesian Intermittent Demand Forecasting for Large Inventories
Matthias W. Seeger, David Salinas, Valentin Flunkert


[forecasting, negative, larger] [risk, function, learning, smoothing, defined, algorithm, day, even] [likelihood, latent, large, intermittent, log, stock, inference, negbin, approximation, bayesian, term, approximate, forecast, ets, stage, optimization, inner, bursty, quadratic, expectation, machine, method, run, logistic, compute, posterior, implementation] [demand, linear, generalized, item, related, short, can, running, one] [gaussian, section, data, exponential, well, automated, given, laplace, two, space] [time, state, kalman, series, range, medium, poisson, model, mode, prior] [use, work, transfer, figure, used, prediction, novel, dataset, several, outperforms, training, approach]
Automatic Neuron Detection in Calcium Imaging Data Using Convolutional Networks
Noah Apthorpe, Alexander Riordan, Robert Aguilar, Jan Homann, Yi Gu, David Tank, H. Sebastian Seung


[false, david, detection, many, detect, number] [may, learning, active, precision, set, algorithm] [datasets, processing] [calcium, imaging, also, noise, one, analysis, require] [basis, test, automated, true, selection, two] [human, roi, time, series, temporal, cell, neuron, cortex, took, activity, brain] [network, image, convnet, output, ground, dataset, convolutional, pixel, truth, mec, accuracy, figure, conv, neural, training, input, using, use, trained, score, used, original, per, recall, transiently, supervised, annotation, convnets, video, stack, shown, visual, predicted, single, inactive, computer, col, traditional]
Minimax Optimal Alternating Minimization for Kernel Nonparametric Tensor Learning
Taiji Suzuki, Heishiro Kanagawa, Hayato Kobayashi, Nobuyuki Shimizu, Yukihiro Tagami


[number, normalized] [function, algorithm, minimization, learning, optimal, minimax, problem, convex, conference, achieves, set, complexity, theorem, bounded, let, rate, bound] [method, convergence, optimization, processing, bayesian, initial, parameter, term, factor] [tensor, amp, alternating, linear, analysis, assumption, can, regression, solution, multitask, one, error, also, regularization, component, following, consumer, completion, computational, restaurant, rank, good, suppose, matrix, decomposition, sufficiently, optimality] [true, estimator, kernel, nonparametric, data, statistical, gaussian, procedure, given, theoretical, estimation, nonlinear, based, bayes, space, rkhs, generalization, sample] [model, information, process] [different, shown, task, several, updated, neural, performance, using, input, international, similarity, used]
“Congruent” and “Opposite” Neurons: Sisters for Multisensory Integration and Segregation
Wen-Hao Zhang, He Wang, K. Y. Michael Wong, Si Wu


[probabilistic, connected, whose, present, number, becomes, strength] [concentration, optimal, since, rate] [processing, posterior, bayesian, decentralized, large] [can, vector, via, first, computational, one, also] [two, distribution, given, mean, gaussian, journal, exp, length] [opposite, congruent, cue, multisensory, integration, information, module, von, brain, neuron, disparity, model, segregation, vestibular, heading, neuroscience, reciprocal, mstd, direction, preferred, neuronal, sensory, stimulus, measured, geometrical, tuning, direct, experimental, firing, nature, indirect, prior, whether, interpretation, vip, whereas, reciprocally, form, implement, offset, circular, integrates] [visual, neural, connection, different, network, feedforward, similar, input, predicted, single, multiple, achieved]
Direct Feedback Alignment Provides Learning in Deep Neural Networks
Arild Nøkland


[path, connected, disconnected] [learning, will, let, provide, minimize, loss, even, show, theorem] [update, gradient, method, machine, initial, derivative, steepest] [error, can, signal, zero, first, also, symmetric, matrix, see, order, one, initialization, asymmetric] [random, fixed, test, principle, data] [feedback, forward, direction, direct, biologically, target, neuron, indicate, descending, experimental, directly, learns] [hidden, layer, training, dfa, tanh, network, neural, deep, used, output, weight, able, trained, different, alignment, mnist, input, figure, calculated, plausible, table, activation, performed, convolutional, dropout, work, using]
End-to-End Kernel Learning with Supervised Convolutional Kernel Networks
Julien Mairal


[present, obtained, introduced, number] [learning, set, function, may, loss, consider, algorithm, obtain, scheme, defined, idea, since, appendix, rate, make, every] [gradient, respect, optimization, large, size, stochastic, method, parameter] [linear, also, matrix, one, can, local, subspace, error, projection, indeed, onto] [kernel, data, given, two, rkhs, classical, space, test, positive, definite, gaussian] [model, optimizing, multilayer, hierarchical] [image, convolutional, network, neural, deep, training, map, using, use, pooling, layer, approach, without, classification, supervised, svhn, perform, previous, unsupervised, representation, natural, pixel, achieved, prediction, spatial, single, several, consists]
On Multiplicative Integration with Recurrent Neural Networks
Yuhuai Wu, Saizheng Zhang, Ying Zhang, Yoshua Bengio, Ruslan R. Salakhutdinov


[number, block, many, level] [learning, general, achieves, show, almost, best] [gradient, term, size, large, extra, optimization] [can, order, also, second, one, first, formulation, following, product, vanilla, computational, matrix, note] [additive, test, two] [integration, model, information, time, simple, baseline, evaluated, state, effect, design] [multiplicative, rnn, building, neural, recurrent, compared, different, preprint, arxiv, using, hidden, performance, validation, bottom, table, use, training, bpc, character, speech, language, wxk, better, lstm, referred, rnns, including, lstms, initialized, final, task, top, outperforms, reported, gating, yoshua, figure, several, perform, without]
Selective inference for group-sparse linear models
Fan Yang, Rina Foygel Barber, Prateek Jain, John Lafferty


[present] [set, interval, confidence, rate, hypothesis, case, lemma, theorem, let, give, since, coverage, setting, might, show] [inference, method, compute, size, develop, quadratic] [group, can, regression, truncated, linear, lasso, hard, also, projection, thresholding, iterative, sparse, subspace, see, main, lee, signal, first, vector, via] [selection, dirl, selected, given, fixed, testing, null, test, stepwise, distribution, selective, true, loftus, taylor, data, conditioning, result, procedure, section, normal, injury, density, percentile, two, death, physically, income, jonathan, interested, statistical, grouped] [model, forward, event, whether, response, information, write, design] [used, like, work, perform, using, region, without]
Quantum Perceptron Models
Ashish Kapoor, Nathan Wiebe, Krysta Svore


[probability, number, find, thus, misclassified, possible, either] [version, algorithm, theorem, learning, set, complexity, provide, margin, query, assume, bound, online, will, define, let, since, problem, now, mistake, may, finding, access, consider, case, make, follows, known, lemma, element, oracle, exists, proof] [machine, unitary] [vector, can, also, one, need, amplitude, following, first, computational, computation] [quantum, perceptron, space, classical, data, given, two, distribution, sample, marked, corresponding, statistical, point, separated, amplification, address, needed] [search, state, model, therefore, hyperplane, hyperplanes, new, scaling] [training, using, unit, feature, use, used, computer, traditional, figure, instead, preprint, arxiv]
Riemannian SVRG: Fast Stochastic Optimization on Riemannian Manifolds
Hongyi Zhang, Sashank J. Reddi, Suvrit Sra


[number] [theorem, convex, algorithm, complexity, problem, dependence, function, learning, corollary, analyze, fast, set, show, bounded, rate, defined, proof] [optimization, gradient, stochastic, svrg, convergence, geodesically, variance, nonconvex, computing, dominated, strongly, reduction, ifo, sectional, method, parallel, expx, rsvrg, siam, batch, key, inner, machine] [analysis, matrix, linear, global, vector, also, can, reduced, note, following, first, leading, solving, one, eigenvector, psd, symmetric, solution] [riemannian, manifold, exponential, curvature, geodesic, space, definite, nonlinear, transport, positive, two, euclidean, geometric, mean, tangent, journal, metric] [required, new, option] [map, using, accuracy, neural, use, shown, work, different, similar]
Fairness in Learning: Classic and Contextual Bandits
Matthew Joseph, Michael Kearns, Jamie H. Morgenstern, Aaron Roth


[probability, definition, number, many, attainable] [algorithm, fair, arm, regret, learning, bandit, kwik, contextual, fairness, bound, confidence, round, problem, set, active, will, classic, class, lemma, show, kwikt, optimal, prove, theorem, unknown, expected, least, lower, learner, setting, case, choose, context, function, interval, known, dependence, unfair, every, upper, study, now, let, play, implies, rjt, uti, general, constant, sti, give, may] [stochastic, machine, michael, initialize] [can, linear, technical, also, one, high, arg] [section, construct, distribution, polynomial, random, important] [reward, must, history, payoff, time, feedback] [without, use, prediction, sequence]
Swapout: Learning an ensemble of deep architectures
Saurabh Singh, Derek Hoiem, David Forsyth


[number, block, average, represent, ensemble] [set, schedule, may, show, bernoulli, learning, general] [stochastic, inference, method, deterministic, batch, gradient, standard, performs, normalization, replace] [can, width, error, one, note, regularization] [random, corresponding, comparable, increasing, well, mean, equation, two] [model, form, experimental, early, others] [swapout, network, dropout, resnet, layer, residual, training, deep, use, trained, performance, depth, unit, table, using, convolutional, different, randomly, neural, wider, used, work, skip, better, output, outperforms, several, improves, similar, pooling, dropping, input, improve, outperform, relatively, recent]
A primal-dual method for conic constrained distributed optimization problems
Necdet Serhat Aybat, Erfan Yazdandoost Hamedani


[topology, graph, node, among, definition] [let, convex, algorithm, set, defined, rate, consider, theorem, dom, function, problem, define, optimal, conference, will, private, assume, svm, lemma] [distributed, consensus, optimization, convergence, xki, argmin, decentralized, pda, primal, constrained, iteration, computing, converges, saddle, computed, university, central, rmi, step, rki, angelia, iterate, machine] [can, following, local, ieee, suppose, denote, vector, also, connectivity, linear, min, assumption, solving, solution, note, one, signal, conic, global, matrix, computational] [given, point, denotes, section, journal, corresponding] [communication, dynamic, decision, information] [using, network, proposed, static, generated, sequence]
Minimax Estimation of Maximum Mean Discrepancy with Radial Kernels
Ilya O. Tolstikhin, Bharath K. Sriperumbudur, Bernhard Schölkopf


[probability, present, theory, independence, fact] [minimax, theorem, lower, rate, bound, proof, set, get, optimal, dependence, let, learning, will, sup, conference, bounded, known, inequality, constant, inf, case, defined, class, exist, problem] [method, machine, distributed] [can, following, condition, note, optimality, need, main, second, first, order] [kernel, mmdk, estimation, based, gaussian, mmd, two, empirical, mean, result, radial, space, section, estimator, independent, universal, distribution, borel, shi, embedding, random, finite, measure, test, reproducing, dimensional, hilbert, sample, positive, journal, nonparametric, distance] [] [using, work, neural, used]
What Makes Objects Similar: A Unified Multi-Metric Learning Approach
Han-Jia Ye, De-Chuan Zhan, Xue-Min Si, Yuan Jiang, Zhi-Hua Zhou


[base, linkage, social, number, lml] [learning, set, instance, loss, function, convex, get, clearly, general, property, will] [latent, gradient, operator, datasets, supplementary, usually, proximal] [can, one, local, regularizer, global, unified, locality, related, ambiguous, overall, also, lovs] [metric, distance, based, data, specific, type, word, common, two] [framework, others, form] [different, semantic, similarity, multiple, mnn, similar, used, feature, single, learned, pair, semantics, lrgs, spatial, triplet, classification, discovering, table, dissimilar, performance, physical, various, pattern, hidden, figure, reflects, using, weak, approach, visualization]
A Bayesian method for reducing bias in neural representational similarity analysis
Mingbo Cai, Nicolas W. Schuck, Jonathan W. Pillow, Yael Niv


[structure, average, obtained] [assume, ratio, exists] [standard, bayesian, method, experiment, likelihood, variance, averaging] [matrix, can, noise, one, recovered, condition, also, analysis, imaging, low, diagonal, noisy, much] [covariance, bias, data, estimated, true, estimation, estimate, two, gaussian, corresponding, point, based, space] [rsa, fmri, representational, activity, fig, voxels, simulated, correlation, snr, design, directly, model, time, encoding, response, across, cognitive, temporal, princeton, cortex, brain, human, individual, measured, experimental, neuroscience, sensory, roi, autocorrelation, monkey, process] [similarity, neural, different, pattern, similar, visual, structured, task, using, used, map, spatial, higher, figure, learned, representation, applied, shared]
Learning in Games: Robustness of Fast Convergence
Dylan J. Foster, Zhiyuan Li, Thodoris Lykouris, Karthik Sridharan, Eva Tardos


[number, probability, strong, social, theory, many, include] [regret, property, learning, cost, algorithm, hedge, price, fast, anarchy, bound, optimal, consider, satisfy, show, bandit, rate, realized, achieve, expected, even, shifting, setting, costi, satisfies, appendix, loss, utility, round, class, welfare, conference, online, cti, optimistic, will, repeated, turnover, function, best, close, minimization, rvu, implies, annual] [approximate, convergence, smooth, approximation, factor, expectation, full, requires, efficient, log] [low, can, small, proposition, satisfying, high, analysis, also] [population, section, converge, efficiency, result, well, smoothness] [game, player, action, information, feedback, dynamic, time, simple, new] [using, use, sequence, improve]
Optimal spectral transportation with application to music transcription
Rémi Flamary, Cédric Févotte, Nicolas Courty, Valentin Emiya


[fundamental, sum, number] [problem, cost, will, set, conference, optimal, every, may, defined, learning] [music, divergence, optimisation, method, fit, respect, energy] [spectral, matrix, frequency, harmonic, note, can, dictionary, transportation, ost, unmixing, oste, dirac, plca, transcription, cij, one, vector, small, musical, min, entry, transported, also, real, ostg, piano, local, first, explained, order, amplitude, pitch, ieee, decomposition] [section, data, measure, sample, given, particular, tij, based, distribution, metric, estimate, two] [template, time, new, form, placed, target, simple, model] [using, proposed, performance, consists, approach, international, frame, three, compared, used, without, audio, recognition]
Efficient Neural Codes under Metabolic Constraints
Zhuo Wang, Xue-Xin Wei, Alan A. Stocker, Daniel D. Lee


[constraint, monotonic, theory, david, department] [optimal, cost, function, case, problem, max, general, scheme, provide, active, rate, constant] [efficient, energy, university, due, detailed, processing, large] [noise, can, solution, magnitude, low, one, see, observed, linear] [mutual, curve, gaussian, distribution, two, fisher, uniform, population, mean, difference, limit, redundancy] [metabolic, response, coding, information, tuning, model, stimulus, neuron, code, ktotal, poisson, rmax, prior, range, dash, analytical, time, sensory, framework, limited, alan, simple, take, nature, assuming, advantage, substantial, early, focused, current, kthre] [neural, input, visual, single, different, pair, previous, natural, similar, using, red, available]
Multi-view Anomaly Detection via Robust Probabilistic Latent Variable Models
Tomoharu Iwata, Makoto Yamada


[detection, number, probabilistic, find, indicates, acm, represents, variable, obtained, probability, assignment] [instance, rate, since, precision, conference, private, set, every, assume] [latent, method, parameter, inference, likelihood, select, draw] [anomaly, auc, vector, can, view, projection, inconsistent, low, xnd, pcca, analysis, nth, anomalous, inconsistency, movie, ocsvm, matrix, noise, one, ieee, snd] [data, dimensionality, given, based, gaussian, two, mixture, estimated, joint, dirichlet, robust, space, canonical] [model, across, information, observation, process, assumes, infer, integrating, value, correlation] [proposed, different, using, used, generated, multiple, single, use, figure, performance, calculated, international, score, shared]
Eliciting Categorical Data for Optimal Aggregation
Chien-Ju Ho, Rafael Frongillo, Yiling Chen


[interface, aggregation, partition, report, probability, number, scoring, elicit, eliciting, categorical, threshold, often, truthful, ppd, aggregating, elicitation, correctly, aggregate, knowledge, voting, payment, strictly] [optimal, conference, binary, proper, consider, problem, obtain, choose, setting, focus, learning, lemma, will, set, get, assume, drawn] [posterior, bayesian, method, full] [can, principal, also, one, observed, noisy, global, error] [data, sample, distribution, given, two, space, robust, theoretical] [information, agent, belief, model, design, maximize, prior, whether, answer, simple, goal, heuristic, cell, framework, count, bonus] [ground, prediction, figure, truth, question, work, single, task, predicting, using, use]
The non-convex Burer-Monteiro approach works on smooth semidefinite programs
Nicolas Boumal, Vlad Voroninski, Afonso Bandeira


[theory, constraint, number] [theorem, almost, cost, optimal, set, convex, problem, satisfy, bound, function, may, hold, general, show, bounded, feasible] [optimization, smooth, approximate, extreme, convergence, solve, university, size, method, quadratic, siam, step] [local, global, optimality, critical, rank, matrix, can, min, one, necessary, also, optimum, via, synchronization, globally, linear, burer, spurious, main, phase, sdp, hessf, note, rtr, computational, see, hessian] [point, riemannian, space, positive, journal, mathematical, two, result, tangent, manifold, robust, nonlinear, mathematics, important] [search, form] [compact, arxiv, preprint, using]
Approximate maximum entropy principles via Goemans-Williamson with applications to provable variational methods
Andrej Risteski, Yuanzhi Li


[ising, partition, pairwise, ferromagnetic, graph, degree, many, fact, programming, science] [will, theorem, relaxation, function, case, constant, prove, convex, get, since, consider, provide, general, known, bound, lemma, implies, define, show, hardness, problem, upper, wish, satisfies, annual, max, learning] [variational, factor, approximation, log, optimization, due, approximate, calculating, machine, efficiently, term, standard] [can, rounding, one, following, provable, matrix, note, first, paper, main, symposium, computational] [entropy, distribution, maximum, exponential, principle, mean, given, based, section, comparable, exp, provides, family, journal, covariance, functionals] [information, value, model, time, optimizing] [multiplicative, work, produce, using, previous]
Deep Neural Networks with Inexact Matching for Person Re-Identification
Arulkumar Subramaniam, Moitreya Chatterjee, Anurag Mittal


[normalized, connected, partial, false, number] [learning, conference, set, every] [inexact, large, challenging] [ieee, also, first, one, small, existing, second] [two, metric, fused, test, given, space, corresponding] [model, search, correlation, grid, across] [person, matching, image, training, deep, feature, ahmed, relu, computer, layer, dataset, wider, conv, different, pattern, neural, performance, vision, input, convolution, maxpool, similarity, multiple, international, qmul, representation, architecture, output, network, fully, using, use, table, propose, viewpoint, proposed, work, learn, gallery, used, similar, labeled, pair, illumination]
A Powerful Generative Model Using Random Weights for the Deep Image Representation
Kun He, Yan Wang, John Hopcroft


[chinese, create, quality, purely, science, weighting] [loss, may, even, lower] [gradient, select, method, argmin, university] [high, first, row, following, success, noise] [random, reference, based, well] [new, prior, model, white] [image, deep, using, vgg, style, network, ranvgg, texture, trained, neural, representation, visualization, convolutional, content, weight, untrained, feature, layer, figure, gatys, pretrained, natural, use, work, original, artistic, reconstruction, inversion, several, different, training, input, architecture, higher, pooling, slightly, compare, perceptual, cnn, similar, better, transfer, understanding, synthesize, inverting, reconstruct, synthesized, impact, understand, generative, activation, alexnet, three, compared, synthesis, mahendran, starry]
Optimistic Bandit Convex Optimization
Scott Yang, Mehryar Mohri


[average, structural] [loss, regret, convex, let, algorithm, bound, fbt, will, bandit, assume, lipschitz, lemma, smoothed, gbt, ftrl, barrier, function, ptimistic, set, regt, bounded, play, since, bco, always, learner, theorem, online, dekel, admits, kxt, cost, optimistic, round, surrogate, saha, best, improved, arbitrary, idea, rft, flaxman, feasible, get, admit, known, learning, subgradient] [gradient, optimization, key, stochastic, variance, around, showed, factor] [can, following, denote, one, also, analysis, dimension] [point, estimate, computationally, result, polynomial, given] [time, action, averaged, information, predictable, upon, new, closed] [use, sequence, work, predicting, original, using, prediction]
A Probabilistic Programming Approach To Probabilistic Data Analysis
Feras Saad, Vikash K. Mansinghka


[probabilistic, cgpms, cgpm, programming, logpdf, simulate, nan, member, bayesdb, create, zrk, satellite, composable, directed, cluster, orbital, metamodeling, variable, purely, override, crosscat, program, categorical, launch, law, panel, platform, subset, acyclic, ample] [return, query, learning, assume, composite, may, algorithm, define, function, set, general] [latent, importance, bayesian, log, machine, inference, likelihood, sampling, latents] [analysis, can, computational, paper, missing] [data, statistical, population, joint, section, sample, random, density, distribution, numerical, type, well, given, multivariate] [model, trace, baseline, infer, new, evidence, code] [using, generative, input, output, figure, table, discriminative, language, network, weight, used, generate]
Learning under uncertainty: a comparison between R-W and Bayesian approach
He Huang, Martin Paulus


[stable, probability, thus, likely, contingency, influence] [learning, rate, optimal, will, function, lower, chosen, expected, always] [bayesian, parameter] [low, optimality, high, can, condition, also, frequency, significantly, linear] [estimation, based, estimated, positive, consistent, two, data, maximum] [model, volatility, belief, decision, reward, dbm, expert, stationarity, environmental, choice, inverted, behavioral, lose, poor, change, wsls, option, simulation, fig, shift, simulated, behavior, value, correlation, search, examine, explaining, differ, rewarded, across, target, environment, significant, information, positively, current] [three, different, shown, using, figure, shape, use, accuracy, visual, approach, higher, relationship, task, used]
How Deep is the Feature Analysis underlying Rapid Visual Categorization?
Sven Eberhardt, Jonah G. Cader, Thomas Serre


[number, correctly, possible, total] [complexity, may, confidence, set, best, fast, considered, function] [processing, experiment, computed] [also, analysis, computational] [data, underlying, well, maximum, corresponding, boundary] [human, model, correlation, decision, animal, individual, time, response, behavioral, issn, longer, found, experimental, answer, presentation, correct, hierarchical, timing, stimulus, psychophysics, whether, past] [visual, accuracy, deep, categorization, rapid, depth, shown, figure, image, intermediate, used, recognition, object, using, layer, higher, neural, convolutional, deeper, different, recent, vision, feature, natural, imagenet, fixation, performance, computer, classification, increased, cross, top, work, paradigm]
Optimal Architectures in a Solvable Model of Deep Networks
Jonathan Kadmon, Haim Sompolinsky


[number, rule, recursion, wide, level, cluster, total, average] [optimal, learning, will, function, assume, show, may, contribution, binary, set, consider, simplicity] [processing, initial, size, variance, due, large, central, convergence] [error, noise, noisy, noiseless, one, sparsity, load, can, lopt, linear, also, matrix, signal] [fixed, mean, two, limit, finite, random, infinitely, infinite] [model, neuron, sensory, information, simple, activity, system, behavior, state, time] [layer, readout, deep, input, network, neural, single, overlap, different, performance, intermediate, perform, architecture, figure, field, classification, using, feedforward, use, understanding, depth, andrew, training, pattern, seen]
Coresets for Scalable Bayesian Logistic Regression
Jonathan Huggins, Trevor Campbell, Tamara Broderick


[subset, number, clustering, thus, obtained, expect, probability, many] [algorithm, bound, let, upper, lemma, streaming, setting, theorem, may, show, conference, expected, since, constant, lower, provide, smaller, idea] [bayesian, inference, logistic, posterior, mcmc, variational, parameter, size, large, ebspam, scalable, approximation, distributed, likelihood, datasets, full, standard, sampling, monte, machine, processing, langevin, develop] [small, regression, can, synthetic, also, proposition, computational, one, via] [data, coreset, sensitivity, coresets, random, subsampling, mean, construction, given, inary, maximum, bayes, indep, section, important] [time, prior, choice, artificial, model, demonstrate, information] [used, dataset, using, use, performance, approach]
Sampling for Bayesian Program Learning
Kevin Ellis, Armando Solar-Lezama, Josh Tenenbaum


[program, ample, rogram, edit, list, sat, probability, number, solver, constraint, manipulation, counting, string, tilt, programming, often, many, structure, either, recursive, editing, parity, assignment, thus, probabilistic, ashish, held] [learning, set, problem, algorithm, uniformly, learner, upper, satisfy, will, least, lower] [sampling, posterior, approximate, log, processing, bayesian] [can, sketch, also, small, one, see, much, arbitrarily] [sample, random, distribution, space, length, consistent, data] [model, new, correct, search, time, decision, take, framework, prior, goal, write] [figure, text, approach, work, synthesis, like, using, use, neural, learn, description, used, different, synthesizing, training]
Bayesian optimization under mixed constraints with a slack-variable augmented Lagrangian
Victor Picheny, Robert B. Gramacy, Stefan Wild, Sebastien Le Digabel


[constraint, valid, wmin, panel, many, involving, variable] [inequality, problem, optimal, function, show, improvement, may, best, set, expected, let, surrogate, observe, unknown, known, version] [optimization, slack, constrained, mixed, objective, blackbox, bayesian, method, albo, supplementary, progress, augmented, involves, lagrangian, implementation, step, efi, fmin, initial, comparators, ycj, nlopt, monte, standard] [can, global, via, one, local, plot, formulation, first, solution, see, min, solving] [random, distribution, two, statistical, gaussian, mean, section, provided, proportion, mathematical, numerical, empirical] [equality, search, new, predictive, closed, literature, value, unconstrained, design, acquisition, modeling] [original, figure, several, work, input, shown]
Composing graphical models with neural networks for structured representations and fast inference
Matthew Johnson, David K. Duvenaud, Alex Wiltschko, Ryan P. Adams, Sandeep R. Datta


[graphical, structure, variable, probabilistic, represent] [learning, general, consider, algorithm, bound, provide, max, function] [variational, latent, inference, svae, conjugate, gradient, objective, log, iid, compute, lsvae, respect, efficient, fit, factor, stochastic, conjugacy, optimization, flexible, approximate] [can, linear, also, see, local, mouse] [family, exponential, data, discrete, gaussian, mean, nonlinear, section, gmm, two, manifold, density, estimate] [model, observation, modeling, dynamical, lds, state, prior, rich, framework, system, continuous, behavior] [natural, neural, structured, recognition, field, image, video, using, used, use, network, figure, generative, speech, depth, autoencoders, approach, deep, learn, preprint, arxiv]
Safe Policy Improvement by Minimizing Robust Baseline Regret
Mohammad Ghavamzadeh, Marek Petrik, Yinlam Chow


[probability, number, represents, represent] [problem, algorithm, improvement, optimal, regret, return, may, set, theorem, function, show, appendix, example, loss, uncertain, analyze, max, randomized, even, bound, will, guarantee, conference, improved, consider, minimizing] [optimization, approximate, deterministic, standard, energy, due, batch, computing] [error, can, solution, solving, min, following, arg, formulation, main, guaranteed, significantly, note, one, also, good, high] [robust, section, given, true, data, based, space] [policy, baseline, model, uncertainty, transition, state, mdp, safe, markov, decision, reward, action, arbitrage, simple, conservative, directly] [performance, approach, use, using, better, figure, learned, propose, used]
Examples are not enough, learn to criticize! Criticism for Interpretability
Been Kim, Oluwasanmi O. Koyejo, Rajiv Khanna


[number, subset, represent, present] [function, submodular, set, let, greedy, consider, known, may, algorithm, study, theorem, learning, corollary, case, submodularity, cost, max, selecting, problem, class, show, example, selects] [bayesian, optimization, respect, large] [condition, can, also, diagonal, matrix, local, linear, following, denote] [criticism, kernel, prototype, data, given, selection, interpretability, nearest, proto, two, measure, selected, well, witness, important, positive, sample, useful, mmd, maximum, discrepancy, space, together] [model, human, complex, raw, framework, subject, interpretable] [using, performance, dataset, image, compared, approach, improve, use, work, similarity, applied, used, shown]
A Locally Adaptive Normal Distribution
Georgios Arvanitidis, Lars K. Hansen, Søren Hauberg


[cluster, structure, number, probabilistic, find, clustering, denoted, definition, negative] [consider, learning, adaptive, algorithm, least, defined, function, define, problem, near, constant, changing, conference, known, since, scheme] [likelihood, standard, compute, gradient, normalization, monte, carlo, machine] [can, linear, local, matrix, synthetic, first, principal, low, analysis] [land, riemannian, distribution, data, manifold, mean, normal, metric, mixture, distance, covariance, maximum, sleep, locally, gaussian, space, nonlinear, density, point, tangent, geodesic, measure, logarithm, gmm, eeg, corresponding, exp, euclidean, entropy, given, logx, estimate, kernel, estimation] [model, intrinsic, subject] [using, use, learned, figure, smoothly, generative, work, better, neural]
One-vs-Each Approximation to Softmax for Scalable Estimation of Probabilities
Michalis Titsias (RC AUEB)


[probability, obtained, categorical, sum, notice, number, pairwise, probabilistic] [bound, lower, remaining, since, function, set, learning, consider, multiclass, class, cost, conference, maximizing, label, wish, will] [log, stochastic, likelihood, ove, approximate, large, efk, variational, method, parameter, sgd, gradient, optimization, bouchard, approximation, maximized, scalable, efm, size, softmaxk, subsample, bibtex, full] [exact, can, also, one, following, small, error, product] [data, estimation, based, maximum, estimated, associated, subsampling, section, test, denotes] [value, form, model, evolution, maximize, simple] [softmax, training, classification, figure, neural, score, scale, single, language, using, several, input, shown, dataset]
Sub-sampled Newton Methods with Non-uniform Sampling
Peng Xu, Jiyan Yang, Farbod Roosta-Khorasani, Christopher Ré, Michael W. Mahoney


[block, partial, number, solver, among] [algorithm, complexity, problem, learning, convex, defined, lemma, randomized, achieve, dependence, scheme, assume, consider, show, least] [sampling, newton, convergence, approximation, machine, size, iteration, method, standard, logistic, approximate, optimization, log, augmented, michael] [leverage, matrix, can, ssn, condition, hessian, local, norm, linear, solution, see, following, error, order, one, tsolve, running, plevss, rnormss, exact, tconst, second] [uniform, section, given, well, two, ridge, distribution, construct, independent, based, locally] [time, form, information] [used, table, using, different, score, work, figure, preprint, arxiv, better, use, shown, per, performance]
Guided Policy Search via Approximate Mirror Descent
William H. Montgomery, Sergey Levine


[constraint, manipulation, obtained, either] [learning, cost, algorithm, bound, since, show, conference, case, appendix, surrogate, unknown, convex, provide] [step, method, size, mirror, descent, dkl, optimization, gradient, convergence, initial, requires, dual, sampling, approximate, variant, iteration, end] [local, can, global, also, linear, small, projection, analysis, arg, following] [nonlinear, section, based] [policy, search, guided, prior, mdgps, peg, trajectory, state, epi, badmm, new, typically, simple, directly, form, adjustment, corresponds, reinforcement, optimize, lqr, complex, robotic, control, hole] [using, supervised, use, neural, work, deep, performance, task, train, used, shown, different, international, better, per]
Diffusion-Convolutional Neural Networks
James Atwood, Don Towsley


[graph, node, dcnns, diffusion, cora, pubmed, relational, probabilistic, present, structure, breadth, mutag, graphical, definition, wjk, represented, ked, citation, either] [learning, set, label, provide, defined, function] [standard] [can, matrix, also, one, tensor, via, power, paper] [kernel, data, exponential, section, well, conditional, proportion, test, mean, curve, basis] [model, search, information, baseline, offer] [classification, accuracy, dcnn, performance, representation, neural, several, training, input, work, figure, prediction, validation, network, learned, used, convolutional, table, effective, including, consists, using, applied, deep, labeled, dataset, implemented, investigate, predict]
Solving Marginal MAP Problems with NP Oracles and Parity Constraints
Yexiang Xue, Zhiyuan Li, Stefano Ermon, Carla P. Gomes, Bart Selman


[number, probability, saa, lbp, sum, represent, counting, probabilistic, strength, parity, coupling, xor, weighted, carla, pairwise, cascade, replicates, solver, find, node] [algorithm, problem, set, max, function, will, conference, theorem, return, bound, let, case, show, learning, upper, binary, lemma, optimal, since, alexander] [marginal, mixed, optimization, log, inference, approximation, intractable, machine, approximate, solve] [can, one, suppose, solution, first, also, exact, solving] [two, independent, sample, fixed, provides, result, well, random, selected] [mmap, found, median, information, time, value, belief, artificial] [map, figure, different, network, randomly, approach, use, performance, neural, original, proposed, international]
Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings
Tolga Bolukbasi, Kai-Wei Chang, James Y. Zou, Venkatesh Saligrama, Adam T. Kalai


[fact, many, crowd, number, seed, present, identify, find, quality, normalized] [set, learning, algorithm, will, appendix, show, since, consider, define, might, may] [machine, parameter, step, standard] [also, can, one, vector, subspace, onto, projection, first, much, significantly] [gender, word, embedding, bias, embeddings, two, neutral, analogy, specific, grandfather, closer, given, debiasing, grandmother, female, well, woman, male, news, useful, section, preserving, described] [direction, indirect, implicit, whether, direct, across, exhibit] [figure, pair, use, used, language, original, different, unit, similar, generated, evaluate, man, computer, using, trained, shown, ten, classifier, natural, reducing]
The Limits of Learning with Missing Data
Brian Bullins, Elad Hazan, Tomer Koren


[number, attribute, probability, enough, attainable] [let, loss, algorithm, lower, learning, theorem, observe, absolute, set, learner, regressor, proof, show, prove, general, setting, may, example, precision, bound, exists, since, function, subgradient, conference, hinge, lemma, hazan, expected, lao, convex, consider, arbitrary, follows, koren, achieve, choose, always, complexity] [machine, supplementary, efficient] [regression, can, missing, linear, one, main, first, see, certain, arbitrarily, error, low, whereby, min, hard, gap, vector, necessary] [distribution, squared, two, data, given, result, sample, ridge, limit] [limited, information, determine, observation, framework] [classification, training, work, output, shown, similar, international, used, per]
Leveraging Sparsity for Efficient Submodular Data Summarization
Erik Lindgren, Shanshan Wu, Alexandros G. Dimakis


[graph, runtime, number, threshold, subset, probability, representative] [set, greedy, algorithm, problem, function, submodular, sparsified, optimal, element, benefit, facility, sparsification, since, lsh, consider, quickly, personalized, covering, get, will, sensitive, instance, constant, learning, summarization, show, chosen, guarantee, theorem, lower, fast, smaller, lazy] [approximation, size, log, stochastic, method, large, approximate, optimization, factor, datasets, parameter] [matrix, can, one, solution, see, sparsity, pagerank, much, exact, also, small, largest, dot, product, locality, analysis, following, movielens] [neighbor, nearest, data, random, sample] [value, location, time, take] [using, use, similarity, figure, entire, work, used, feature, top, better]
A Comprehensive Linear Speedup Analysis for Asynchronous Stochastic Parallel Optimization from Zeroth-Order to First-Order
Xiangru Lian, Huan Zhang, Cho-Jui Hsieh, Yijun Huang, Ji Liu


[node, child, number] [algorithm, rate, set, function, convex, learning, bound, upper, bounded, constant, may, let, best, theorem, show, special, define, agarwal, since] [asynchronous, stochastic, speedup, gradient, parallel, convergence, sgd, central, scd, descent, asgd, parameter, nonconvex, optimization, zeroth, coordinate, aszd, ascd, method, variance, music, machine, objective, distributed, large, iteration, blending, comprehensive, requirement, black] [linear, analysis, can, following, first, order, generic, running, existing, proved, note, one, component, also, much, matrix, ensure] [data, test, result, two, consistent, provides, random, estimate] [model, value, read, time] [different, using, neural, network, table, single, novel, use, computer, deep, including]
On Robustness of Kernel Clustering
Bowei Yan, Purnamrita Sarkar


[clustering, number, cluster, consistency, present, misclassified] [algorithm, theorem, show, lemma, will, upper, relaxation, proof, assume, bound, let, constant, weakly, rate, max, problem, now, learning, appendix, get, bounded, arbitrary, function] [log, machine, strongly] [matrix, sdp, analysis, one, eigenvectors, also, spectral, can, main, semidefinite, error, first, zero, norm, high, outlier, misclustered, eigenvalue, largest, fraction, singular, following, svd, blockwise] [kernel, two, data, consistent, based, robust, robustness, gaussian, section, separation, distance, population, inliers, result, random, corresponding, inlier] [model, value] [used, top, different, accuracy, performance, shown, use, proposed, without]
Binarized Neural Networks
Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, Yoshua Bengio


[number, arithmetic, many, possible, either, multiplication] [binary, learning, algorithm, precision, function] [full, gradient, size, faster, reduce, stochastic, energy, method, computing, parameter, processing, batch, consumption, bnn, run, lead, normalization, machine] [can, first, also, matrix, noise, power, running, high] [kernel, two, section, test, point, based, estimator, quantization, dedicated] [forward, research, time, specifies] [neural, using, training, deep, binarized, memory, gpu, bnns, binarization, accuracy, arxiv, use, weight, work, performance, preprint, network, previous, imagenet, achieved, layer, without, hardware, updated, mnist, impact, mlp, svhn, trained, classification, drastically, last]
A Probabilistic Model of Social Decision Making based on Reward Maximization
Koosha Khalvati, Seongmin A. Park, Jean-Claude Dreher, Rajesh P. Rao


[social, number, probability, average, probabilistic, larger, among, total] [round, function, expected, public, optimal, set, playing, contribution, outcome, make, maximizing, best] [beta, initial, fit] [group, error, one, can, also, first, partially, much] [based, two, distribution, data, space, given, observable] [model, reward, player, belief, state, decision, pomdp, game, action, prior, descriptive, making, value, mdp, individual, pgg, behavior, cooperation, brain, subject, human, cooperativeness, trial, framework, dilemma, markov, found, others, modeled, cognitive, next, transition, current, behavioral, modeling, fmri, activity, left] [neural, using, previous, different, predicted, shown, figure, computer, use]
The Robustness of Estimator Composition
Pingfan Tang, Jeff M. Phillips


[definition, many, ith, either, greater, number] [set, composite, define, composition, theorem, make, since, may, will, even, defined, obtain, ranked, get, know, consider, implies, give, example] [gradient, machine] [can, analysis, suppose, need, first, order, remark, main, real, much, anomalous, following, condition, onto] [breakdown, estimator, point, data, pflat, robust, given, robustness, percentile, asymptotic, result, two, finite, common, formal, sample, wingspread, section, company, important, space, changed] [median, new, value, target, change, move, individual, another, process, raw] [use, person, without, single, several, table, different, understanding]
Higher-Order Factorization Machines
Mathieu Blondel, Akinori Fujino, Naonori Ueda, Masakazu Ishihata


[degree, number, recursion, negative, evaluating, distinct] [algorithm, learning, cost, let, defined, since, function, minimizing, even, set, conference, define] [hofms, anova, computing, gradient, compute, hofm, machine, efficient, objective, adagrad, coordinate, although, descent, stochastic, reduce, end] [can, factorization, also, main, order, link, need, denote, similarly, regression, sparse] [kernel, polynomial, section, data, two, positive, well] [model, new, therefore, time, advantage, simply, directly] [training, using, feature, use, table, prediction, learn, approach, per, network, propose, instead, shared, proposed, compared, different, dataset, neural]
Wasserstein Training of Restricted Boltzmann Machines
Grégoire Montavon, Klaus-Robert Müller, Marco Cuturi


[probability, possible, obtained] [set, binary, learning, optimal, expected, consider, defined, smoothing, conference, function, best, let] [gradient, parameter, standard, dual, machine, size, large, variance, respect, term, divergence, approximation] [can, completion, one, restricted, observed, also, small, low, high, first, vector, noise, error] [wasserstein, distance, data, rbm, boltzmann, distribution, sample, metric, two, hamming, empirical, given, practical, equation, kde, bias, true, pcd, euclidean, explanatory, kantorovich, point] [model, state, simple, effect] [figure, training, generated, denoising, learned, image, neural, using, shown, used, original, performance, work, different]
Stochastic Multiple Choice Learning for Training Diverse Deep Ensembles
Stefan Lee, Senthil Purushwalkam Shiva Prakash, Michael Cogswell, Viresh Ranjan, David Crandall, Dhruv Batra


[smcl, ensemble, diverse, mcl, member, often, standing, riding, perched, many, present, specialization, tree] [oracle, loss, learning, conference, example, set, class, algorithm, provide, show, expected] [stochastic, size, gradient, standard, method, descent] [can, error, existing, ieee, one, group, also] [based, classical, people] [model, choice, research, search, producing, information] [multiple, training, deep, iou, figure, trained, image, produce, network, vision, top, task, single, neural, bird, computer, segmentation, train, performance, approach, using, beam, sitting, work, dey, different, classification, net, wave, convolutional, man, predicted, structured, output, accuracy, semantic, international, cnn, shown, captioning, dataset]
Fast Active Set Methods for Online Spike Inference from Calcium Imaging
Johannes Friedrich, Liam Paninski


[constraint, obtained, gurobi, possible, homogeneous] [algorithm, set, active, online, fast, problem, convex, function, will] [method, inference, optimization, step, run, parameter, supplementary, updating, update, warm, constrained] [calcium, imaging, fluorescence, oasis, can, one, solution, noise, nonnegative, also, order, sparsity, need, interior, signal, nat, formulation, running, violation, simultaneous, magnitude, autocovariance, sparse, high, isotonic] [data, estimate, equation, true, point, well, based] [time, spike, pool, activity, value, process, forward, model, whereas, move, simulated, research, optimizing, trace, spiking, series, track, new] [neural, figure, deconvolution, updated, compared, train, using, back, approach, used]
General Tensor Spectral Co-clustering for Higher-Order Data
Tao Wu, Austin R. Benson, David F. Gleich


[clustering, graph, probability, walk, cluster, number, partition, partitioning, find, represents, cut] [algorithm, will, set, case, chosen, call, general, let, may, show, appendix] [method, stochastic, standard, large, compute, datasets] [tensor, spectral, conductance, can, matrix, group, vector, following, eigenvector, square, one, gtsc, fiedler, surfer, sweep, rectangular, tsc, also, first, planted, sparse, spacey, multilinear, nonnegative, flight, solution] [random, data, chain, two, distribution, biased, stationary, section, based, procedure] [markov, transition, state, process, new, stop, form, information, english, model, time] [use, using, generalize, work, figure, several]
Maximal Sparsity with Deep Networks?
Bo Xin, Yizhou Wang, Wen Gao, David Wipf, Baoyuan Wang


[many, structure, number, likely, probability, allow, degree] [will, may, learning, problem, optimal, consider, constant, function, algorithm, drawn, even, least, loss, set, uniformly, general] [iid, operator] [can, sparse, iht, support, dictionary, recovery, existing, iterative, maximally, linear, matrix, via, ieee, following, photometric, rip, nonzero, one, much, also, computational, signal, success, quite, whereby, coherent] [estimation, given, practical, fixed, data, distribution, section, described, true] [correlation, model, information, across] [network, training, deep, using, use, surface, layer, original, different, dnn, classification, figure, performance, lighting, lstm, unit, viewed, arxiv, residual, neural, preprint]
Stochastic Three-Composite Convex Minimization
Alp Yurtsever, Bang Cong Vu, Volkan Cevher


[strong, present, average, variable, basic, denoted, probability, indicator] [algorithm, learning, convex, rate, problem, function, minimization, lipschitz, set, monotone, theorem, composite, minimize, almost, hold, best, assume, consider, uniformly, constant, define, optimal] [stochastic, method, gradient, splitting, optimization, convergence, proximal, deterministic, operator, smooth, processing, parameter, compute, standard, portfolio, expectation, strongly, unbiased, end, machine, markowitz] [can, remark, restricted, convexity, vector, see, solving, note, suppose, projection, solution, denote, following, one] [random, given, section, point, numerical, positive, empirical, squared, two] [template, information, framework, continuous, evidence, subject] [sequence, use, using, figure, neural]
A Sparse Interactive Model for Matrix Completion with Side Information
Jin Lu, Guannan Liang, Jiangwen Sun, Jinbo Bi


[interaction, unique] [algorithm, rate, problem, complexity, adaptive, theorem, max, let, conference, may, convex] [method, full, log, sampling, convergence, latent, machine, optimization, parameter, standard, datasets, solve] [matrix, side, can, rank, recovery, completion, low, row, observed, missing, interactive, ladmm, sparse, rmse, condition, exact, linear, synthetic, analysis, imc, singular, corrupted, solution, norm, column, ieee, sufficient, nuclear, dirtyimc, high, require, first, min] [two, sample, data, theoretical, given, inductive, space, empirical, journal, true] [model, information, new] [feature, use, approach, randomly, figure, performance, proposed, three, compared, used, recent, without, different, generated, international]
MetaGrad: Multiple Learning Rates in Online Learning
Tim van Erven, Wouter M. Koolen


[master, theory] [learning, regret, convex, online, metagrad, algorithm, loss, theorem, bound, version, rate, surrogate, adaptive, fast, may, slave, rft, appendix, vtu, logarithmic, consider, general, function, offline, annual, will, let, depends, implies, bounded, get, satisfy, easier, kgt, hinge, bernstein, guarantee, koolen] [full, machine, stochastic, gradient, method, step, parameter, optimization, strongly, kuk] [can, diagonal, also, condition, analysis, running, related, one, main, first, optimum, projection, matrix] [section, data, point, exponential, two, based, fixed, covariance, corresponding] [grid, van, time, therefore, prior, simultaneously, expert, direction] [prediction, using, like, domain, work]
Attend, Infer, Repeat: Fast Scene Understanding with Generative Models
S. M. Ali Eslami, Nicolas Heess, Theophane Weber, Yuval Tassa, David Szepesvari, Koray Kavukcuoglu, Geoffrey E. Hinton


[number, variable, probabilistic, structure] [learning, show, will, consider, every, appear] [inference, air, latent, gradient, draw, likelihood, variational, posterior, dair, however, amortized, log] [can, one, also, via, first, note] [data, two, specified, well, given, random, distribution] [model, framework, learns, form, interpretable, therefore, inferred, demonstrate, continuous, infer, inverse, policy, count, prior] [network, scene, generative, using, object, neural, image, training, zpres, deep, recurrent, use, learn, used, different, structured, figure, pose, trained, attention, multiple, produce, task, alex, single, learned, approach, accuracy, supervised, unsupervised, koray, perform, geoffrey, identity]
Dimension-Free Iteration Complexity of Finite Sum Optimization Problems
Yossi Arjevani, Ohad Shamir


[number, degree, whose, definition, many, sum] [lower, bound, oracle, complexity, problem, convex, algorithm, function, theorem, rate, class, learning, bounded, derive, setting, set, tight, may, scheme, attained, proof, minimization] [optimization, iteration, stochastic, oblivious, convergence, iterates, gradient, machine, sdca, accelerated, dual, coordinate, proximal, cli, depend, deterministic, method, smooth, fsm, descent, arjevani, parameter, sampling, rlm, equipped, processing, approximation, ohad] [can, linear, following, one, min, computational, first, also] [given, section, finite, point, well, based, multivariate] [form, information, new, framework, current] [approach, using, neural, without, different, preprint, arxiv]
An Efficient Streaming Algorithm for the Submodular Cover Problem
Ashkan Norouzi-Fard, Abbas Bazzi, Ilija Bogunovic, Marwa El Halabi, Ya-Ping Hsieh, Volkan Cevher


[number, graph, vertex, acm, subset, find, possible, representative] [algorithm, set, streaming, cover, submodular, greedy, offline, problem, utility, oracle, theorem, active, dominating, smallest, optimal, consider, lazy, function, studied, cost, conference, massive, achieve, least, ratio, lower, bicriteria, tight, may, defined, element, best] [size, approximation, large, pass, requires, efficient, log, factor, distributed, machine, select, optimization, run, designed] [can, solution, ssc, fraction, small, one, first, phase, also, note, certain] [data, given, selection, random, section, corresponding, kernel, two, selected] [goal] [memory, single, performance, international, using, dataset, stream, figure]
Matching Networks for One Shot Learning
Oriol Vinyals, Charles Blundell, Tim Lillicrap, Koray Kavukcuoglu, Daan Wierstra


[many] [set, learning, class, function, defined, label, example, make] [batch, large, full, ets] [support, can, also, one, much, following, related, note] [data, test, given, section, metric, based, embedding, nearest, two, distribution, word, well] [model, new, baseline, form, modeling, simple] [matching, training, neural, task, cosine, imagenet, deep, trained, language, used, work, fine, attention, network, omniglot, inception, classifier, memory, classification, sequence, use, image, arxiv, preprint, convolutional, table, performance, using, lassifier, atching, sentence, recent, lstm, like, setup, single, google, softmax, siamese, accuracy, classify, vision]
Pairwise Choice Markov Chains
Stephen Ragain, Johan Ugander


[probability, pairwise, thus, partition, theory, definition, number, transitivity] [set, let, rate, show, implies, property, regularity, case, special, known, choosing, chosen, least] [stochastic, logit, datasets, nested, supplementary, inference, likelihood] [matrix, one, can, qij, also, proposition, error, see] [uniform, axiom, discrete, chain, data, empirical, distribution, expansion, stationary, independent, selection, multinomial, random, two, well, maximum, strict, journal, given] [model, choice, pcmc, mnl, markov, contractible, exhibit, qji, mmnl, transition, sfwork, btl, time, elimination, defines, rum, ttest, closed, communicating, yellott, sfshop] [work, learned, figure, recent, using, train, different]
Safe Exploration in Finite Markov Decision Processes with Gaussian Processes
Matteo Turchetta, Felix Berkenkamp, Andreas Krause


[probability, constraint, path, say] [lemma, set, algorithm, assume, know, function, least, consider, learning, unknown, may, since, confidence, return, every, theorem, prove, known, problem, implies, define, induction, exists, let, depends, proof, follows, achieve, conference, stuck, smallest] [step, optimization] [can, order, one, high, also, see] [gaussian, given, finite, distance, two] [safe, safety, ret, state, exploration, mdp, unsafe, safely, model, rreach, explore, agent, action, exploring, reachability, visiting, transition, reinforcement, reachable, therefore, goal, starting, rnret, considers, rsafe, rover, priori, within, information, system, control, rret, afe] [without, use, used, feature, using, able]
Combining Low-Density Separators with CNNs
Yu-Xiong Wang, Martial Hebert


[number, block, thus] [learning, set, problem] [large, standard, optimization, additional, sampling] [can, leading, also, first, linear, still, generic] [space, data, specific, two, sample] [lds, target, action, new, across, decision] [layer, activation, unsupervised, unlabeled, using, cnn, cnns, top, figure, training, use, recognition, novel, feature, learn, performance, scene, deep, labeled, different, image, separator, train, approach, improve, used, transferability, visual, neural, generate, category, convolutional, per, imagenet, supervised, discriminative, flickr, unit, dataset, classification, learned, shown, work, original]
Threshold Learning for Optimal Decision Making
Nathan F. Lepora


[threshold, rule, thus, find, many, distinct, probability, average, greater, number] [learning, function, optimal, cost, problem, ratio, considered, algorithm, rate, consider] [optimization, bayesian, method, log, standard, stochastic, variance, gradient, sequential, converges, sampling, challenging] [can, one, error, sampled, linear, comparison] [two, mean, gaussian, type, equal, data, derived, bayes, journal] [decision, reward, reinforce, model, time, making, acquisition, reinforcement, policy, exhaustive, trial, process, evidence, choice, optimize, sprt, animal, basal, experimental, psychological, sensory] [neural, figure, single, performance, used, use, approach, accuracy, forced, learn, perceptual, unit, multiple]
Phased LSTM: Accelerating Recurrent Network Training for Long or Event-based Sequences
Daniel Neil, Michael Pfeiffer, Shih-Chii Liu


[regular, number] [learning, every, rate, even, drawn, conference, since] [sampling, standard, update, ron, asynchronous, processing, gradient, epoch, faster] [can, sampled, first, note, frequency, phase, one, high] [data, two, important] [time, state, cell, model, period, long, new, longer, event, sensor, neuron, timing, controlled] [lstm, phased, input, gate, recurrent, network, neural, video, output, accuracy, rnn, training, using, memory, audio, layer, different, three, task, sine, presented, use, plstm, open, figure, preprint, arxiv, rnns, trained, previous, rhythmic, used, speech, deep, mfccs, stream, vision, digit, shown, oscillation, work]
Maximizing Influence in an Ising Network: A Mean-Field Optimal Solution
Christopher Lynn, Daniel D. Lee


[external, stable, ising, influence, interaction, opinion, social, focusing, unique, structure, magnetization, block, jij, number, find, rise, viral, strength, represents, theory, marketing, susceptibility, spread, node, among, present, resulting, ferromagnetic] [optimal, theorem, budget, algorithm, exists, maximization, consider, maximizing, problem, concave, set, existence, since, general, max, even] [smooth, gradient, ascent, efficiently, apply, calculate, machine] [one, can, high, low, phase, solution, exact, critical, plot, sufficient, also, arg] [given, positive, boltzmann, random, statistical, maximum, mean, common] [system, transition, model, individual] [field, network, figure, performance, shown]
Scalable Adaptive Stochastic Optimization Using Random Projections
Gabriel Krummenacher, Brian McWilliams, Yannic Kilcher, Joachim M. Buhmann, Nicolai Meinshausen


[dag, combination, root] [loss, algorithm, learning, regret, randomized, opt, setting, adaptive, fast, consider, rate, convex, dependence] [rad, adag, full, gradient, ada, variance, optimization, proximal, stochastic, approximate, iteration, faster, approximation, update, term, compute, reduction, processing, standard, step, variant, efficient, method, machine] [matrix, can, order, projection, low, diagonal, computational, second, first, svd, rank, high, square, following, also] [random, data, test, computationally, section, particular, practical] [information, inverse, range] [neural, training, using, similar, convolutional, performance, used, use, deep, preprint, arxiv, propose, effective, figure, dense, several, work, different]
Probabilistic Linear Multistep Methods
Onur Teymur, Kostas Zygalakis, Ben Calderhead


[probabilistic, number, present] [will, function, now, appendix, give, problem, consider, make, constant, define, proof, case, since, deviation] [method, deterministic, convergence, bayesian, ode, additional, initial, standard, posterior, university] [error, order, can, truncation, solution, local, plot, linear, also, computational, proposition, first, following, formulation, analysis] [given, numerical, gaussian, integrator, basis, multistep, differential, mean, distribution, polynomial, conditional, chua, difference, family, derivation, particular, way, covariance, statistical, csab, mathematics, equal, section, corresponding, estimate] [process, value, required, model, prior, uncertainty, new, therefore] [used, using, use, generate, figure, approach]
Spectral Learning of Dynamic Systems from Nonequilibrium Data
Hao Wu, Frank Noe


[potential, number, diffusion, denoted, obtained] [learning, algorithm, equilibrium, appendix, function, problem, theorem, general, will] [stochastic, operator, distributed, parameter, compute] [spectral, can, assumption, error, comparison, analysis, computational, related, order, singular] [ooms, data, estimation, binless, nonequilibrium, oom, length, observable, based, two, empirical, statistical, molecular, space, alanine, hmm, density, identically, metastable, discrete, lim, estimated, gaussian, independent, distribution, given, limit, finite, estimate, procedure] [observation, continuous, process, markov, dynamic, state, modeling, simulation, trajectory, time, predictive, starting] [proposed, feature, hidden, neural, without, used, shown, generated, figure]
Learning Deep Parsimonious Representations
Renjie Liao, Alex Schwing, Richard Zemel, Raquel Urtasun


[clustering, cluster, connected, number, chw, report, assigned] [learning, set, show, since, loss, may, case, online, every] [update, compute, gradient, method, objective, standard, end, size] [regularization, matrix, one, can, note, first, via, also, tensor, error, vector] [sample, based, center, test, data, generalization] [model, within, current, framework, baseline, new] [neural, deep, layer, representation, network, convolutional, different, use, fully, learned, parsimonious, table, using, spatial, applied, dataset, preprint, unsupervised, arxiv, training, feature, shown, investigate, classification, approach, trained, work, receptive, image, compare, input, architecture, unfolding, recent]
Learning User Perceived Clusters with Feature-Level Supervision
Ting-Yu Cheng, Guiguan Lin, Xinyang Gong, Kang-Jun Liu, Shan-Hung Wu


[] [] [] [] [] [] []
Conditional Image Generation with PixelCNN Decoders
Aaron van den Oord, Nal Kalchbrenner, Lasse Espeholt, Koray Kavukcuoglu, Oriol Vinyals, Alex Graves


[den, vertical, quality] [every, learning, class, conference, blind] [machine, processing, masked, latent, autoregressive, aaron] [can, also, one, row] [conditional, given, embeddings, two, distribution, conditioning, embedding] [model, information, new, van, current, modelled, another] [image, pixelcnn, neural, gated, pixel, figure, convolutional, deep, single, arxiv, preprint, different, using, used, generative, network, generate, performance, training, use, trained, able, conditioned, layer, shown, receptive, decoder, imagenet, field, deepmind, lstm, pixelrnn, natural, original, similar, google, pixelcnns, person, ivo, karol, table, international, instead, encoder, input, modelling, stack, generation, daan, improve, generated, visual]
On Valid Optimal Assignment Kernels and Applications to Graph Classification
Nils M. Kriege, Pierre-Louis Giscard, Richard Wilson


[assignment, strong, hierarchy, graph, vertex, valid, base, histogram, intersection, path, induces, induced, subtree, rooted, tree, edge, root, definition, obtained, unique, kmax, according, number, rise, called, colour, ultrametric] [optimal, set, every, let, function, defined, class, since, problem, theorem, consider, general, may, element, show, will, obtain, provide] [inner, computed, refinement] [can, following, note, one, see, computation, restricted, related, product, dirac, first] [kernel, data, two, derived, associated, positive, given, section] [time, new] [feature, convolution, classification, figure, used, similarity, approach, shown, weight, using, applied, proposed, similar, different, image, three]
Can Active Memory Replace Attention?
Łukasz Kaiser, Samy Bengio


[basic, number, markovian] [active, learning, will, focus, set, problem, might, every, since, provide, dependence] [machine, size, standard, processing, compute, step, pure] [can, one, good, tensor, first, also, matrix] [length, test, embedding] [model, extended, state, baseline, information, another, new, teacher, long, change] [neural, memory, attention, output, gpu, use, used, translation, image, figure, recurrent, using, input, bleu, network, mechanism, better, convolutional, training, previous, sfin, task, shape, lstm, different, sentence, deep, gru, decoder, residual, single, generative, language, part, cgrud, symbol, alex, ivo, table, perplexity, rnn, score, similar, generated, natural, cgru, convolution, whole, recent, source]
Consistent Kernel Mean Estimation for Functions of Random Variables
Carl-Johann Simon-Gabriel, Adam Scibior, Ilya O. Tolstikhin, Bernhard Schölkopf


[variable, consistency, probabilistic, probability, programming] [set, function, provide, theorem, let, general, may, show, bounded, bound, upper, defined, assume, rate, even, learning, proof, case] [convergence, monte, carlo, apply, approximate, size, converges, university, machine] [can, reduced, also, good, product, one, need, main, first] [random, kernel, estimator, section, mean, embedding, sample, joint, kme, two, distribution, embeddings, kxy, expansion, consistent, space, based, estimate, hilbert, finite, rkhs, data, journal, theoretical, positive, given, provides] [continuous, rather, information] [using, input, used, applying, representation, multiple, approach, different, use, work, figure]
Cooperative Graphical Models
Josip Djolonga, Stefanie Jegelka, Sebastian Tschiatschek, Andreas Krause


[pairwise, partition, graphical, edge, strength, strong, bethe, variable, graph, many, cut, linearized, resulting, disagreement] [problem, submodular, function, upper, lower, convex, bound, obtain, will, set, algorithm, even, polytope, example, minimization, general, learning, defined, binary] [inference, log, variational, trwbp, approximate, marginals, optimization, approximation, pmap, marginal, computing, method, linearization, compute, energy, solved, tractable, strongly] [can, one, vector, sdp, error, via, also, first, linear, solving, still] [entropy, random, family, estimate, result, two, discrete, exp, journal] [model, cooperative, new, optimize, belief] [figure, using, use, map, different, used, image]
Optimal Tagging with Markov Chain Optimization
Nir Rosenfeld, Amir Globerson


[probability, vertex, many, adding, subset, describe, walk, monotonicity, social] [set, problem, optimal, will, algorithm, greedy, may, maximizing, show, now, prove, every, general, choose, submodular, assume, conference, since, consider, function, choosing] [optimization, objective, efficient, method, computing] [can, qij, tagging, tag, item, one, absorbing, pagerank, first, user, browsing, linear, denote, interesting, solving, matrix, recommendation, also, link, via, collaborative, decomposition, hence, related] [chain, given, random, based, corresponding, true, two, distribution, transient] [transition, markov, reaching, new, state, information, reach, optimizing, system, search, model, maximize] [task, using, used, use, work, add, different]
Online Convex Optimization with Unconstrained Domains and Losses
Ashok Cutkosky, Kwabena A. Boahen


[average, number, knowledge] [lmax, rescaledexp, regret, algorithm, loss, online, convex, learning, maxt, setting, bound, achieve, theorem, oco, adadelta, max, ftrl, pistol, klmax, learner, subgradient, set, problem, conference, prove, lower, optimal, let, lemma, mmax, francesco, adversary, rtj, every, unknown, strategy, annual, element, kxt, adaptive] [optimization, adagrad, machine, stochastic, log, processing, gradient, update] [can, invariant, min, require, linear, first, order, analysis, following, vector, one, much, computational] [exp, adam, exponential, space, two, generalization] [hyperparameter, unconstrained, information, prior, value, showing, time] [neural, scale, using, without, figure, classification, use, arxiv, preprint]
Stochastic Gradient Richardson-Romberg Markov Chain Monte Carlo
Alain Durmus, Umut Simsekli, Eric Moulines, Roland Badeau, Gaël Richard


[obtained] [theorem, algorithm, rate, show, bound, set, function, observe, optimal, consider, proof, assume, defined, let] [sgld, sgrrld, stochastic, step, gradient, mse, langevin, extrapolation, convergence, size, variance, monte, posterior, run, bayesian, carlo, converges, supplementary, sgrrhmc, large, parallel, distributed, monitor] [order, can, matrix, first, factorization, also, observed, computation, following, one, synthetic] [bias, two, distribution, estimator, gaussian, chain, given, data, based, random, fixed, numerical, mean, equation, brownian, true] [choice, model, therefore, time, markov] [sequence, using, figure, use, proposed, different, performance, shown, accuracy, applied, similar, approach, novel, applying]
Launch and Iterate: Reducing Prediction Churn
Mahdi Milani Fard, Quentin Cormier, Kevin Canini, Maya Gupta


[resulting, probability, report] [set, learning, will, bound, ratio, expected, svm, problem, may, let, case, loss] [operator, mcmc, machine, initial, datasets, large, reduction, run] [can, see, one, regression, stability, note, also, sampled, suppose, first, regularization] [churn, chain, rcp, wlr, test, stabilization, random, pwin, fixed, diffs, two, acc, distribution, empirical, unnecessary, nomao, statistical, ridge, needed, statistically, diplopia, measure, testing, generalization] [model, markov, change, new, baseline, effect, future, towards] [training, classifier, accuracy, different, trained, figure, using, train, proposed, used, use, previous, table, without, dataset, compared, classification, feature, neural, candidate, shown]
Dense Associative Memory for Pattern Recognition
Dmitry Krotov, John J. Hopfield


[number, many, rule, configuration, thus] [function, case, learning, set, problem, will, contribution, even] [energy, update, large, standard, capacity, step] [one, can, small, computational, order, linear, error, power, sign, also] [test, two, equal, prototype, family, limit, polynomial, given, data, well, section] [model, state, corresponds, information, neuron, behavior, backpropagation, transition, simple] [memory, network, neural, associative, activation, rectified, hidden, pattern, used, classification, layer, deep, different, visible, higher, training, stored, recognition, dense, image, using, feature, use, similar, output, performance, feedforward, presented, compared, store, several, input, digit, better, various]
Distributed Flexible Nonlinear Tensor Factorization
Shandian Zhe, Kai Zhang, Pengyuan Wang, Kuang-chih Lee, Zenglin Xu, Yuan Qi, Zoubin Ghahramani


[number, subset, contains] [binary, set, algorithm, tight, learning, function, will, bound, online, obtain, may, optimal, complexity, lower] [latent, distributed, variational, inference, large, educe, gradient, inftuckerex, log, elbo, inducing, size, inftucker, gigatensor, develop, efficient, full, shuffling, subtensors, dintucker, optimization, standard, parallel, flexible, tgp, scalability] [tensor, can, factorization, tucker, nonzero, zero, following, small, decomposition, also, first, sparse, computational, xij, sampled, matrix] [data, nonlinear, gaussian, covariance, kernel, infinite, based, given, space, introduce, mean] [model, process, continuous, prior, modeling, mode, exploit] [use, using, used, fully, training, proposed, performance, perform, performed, figure]
DECOrrelated feature space partitioning for distributed sparse regression
Xiangyu Wang, David B. Dunson, Chenlei Leng


[number, subset, false, partitioning, runtime, partition, consistency, variable] [theorem, algorithm, will, rate, set, since, strategy, chosen] [deco, size, full, decorrelation, parallel, fitting, distributed, stage, log, partitioned, mse, datasets, large, due, convergence, machine, parameter, processing, depend, step] [lasso, can, error, via, one, regression, first, condition, sparse, matrix, dimension, also, linear, computational, electricity] [data, sample, space, selection, statistical, section, estimation, random, journal, two, ridge, independent] [model, framework, time, information, new, eye, form, correlation] [performance, feature, using, dataset, approach, different, table, neural, compare, three, used]
SURGE: Surface Regularized Geometry Estimation from a Single Image
Peng Wang, Xiaohui Shen, Bryan Russell, Scott Cohen, Brian Price, Alan L. Yuille


[edge, pairwise, inside, consistency, indicates, present] [confidence, since, show, learning, defined, upper, set, fast, improvement, loss] [inference, evaluation, term, implementation] [can, regularization, also, first, compatibility, comparison] [normal, joint, geometry, estimation, two, random, given] [information, within, whether, inferred] [planar, depth, surface, dcrf, network, image, plane, using, prediction, map, pixel, affinity, output, semantic, single, training, better, neural, use, dense, cnn, predicted, convolutional, approach, nyu, used, ground, truth, deep, eigen, four, figure, regularize, cnns, bilateral, segmentation, input, depicted, adopt, propose, back, entire, feature, layout, different, improves, evaluate, explicitly]
Can Peripheral Representations Improve Clutter Metrics on Complex Scenes?
Arturo Deza, Miguel Eckstein


[number, coefficient, detection, hit, regular, total, create, present, edge, influence] [will, rate, function, max, since, loss, every] [computed, away, experiment, size] [can, global, also, one, high] [journal, metric, difference, entropy, mean, measure, distance, section, point] [model, target, human, search, value, information, time, across, roi, retinal, observer, location, correlation, orientation, response] [clutter, feature, congestion, foveated, map, image, peripheral, score, fixation, deg, visual, eccentricity, pifc, architecture, pooling, forced, pyramid, used, scale, different, fovea, figure, final, crowding, input, proposed, region, subband, periphery, using, previous, original, ffc, pff, vision, simoncelli, representation, color, texture, freeman, dense]
Learning Transferrable Representations for Unsupervised Domain Adaptation
Ozan Sener, Hyun Oh Song, Ashutosh Saxena, Silvio Savarese


[consistency, rule, enforce, among] [learning, label, algorithm, problem, since, show, function, class, loss] [method, transductive, large, batch] [can, one, following, first, order, also, arg] [nearest, data, metric, neighbor, two, well, based, point, transformation, given, joint] [target, model, shift, state, framework, cyclic] [domain, source, unsupervised, adaptation, using, mnist, transduction, deep, feature, dataset, reject, classification, digit, use, learn, figure, object, discriminative, office, similarity, jointly, different, svhn, learned, proposed, network, image, fully, structured, task, table, neural, ethod, labelled, supervised, training, art, evaluate, convolutional, recognition, accuracy, webcam, input, without, transfer]
Linear dynamical neural population models through nonlinear embeddings
Yuanjun Gao, Evan W. Archer, Liam Paninski, John P. Cunningham


[variable, find, structure, distinct] [rate, function, class, algorithm, may, learning] [latent, inference, variational, approximate, posterior, log, reduction, likelihood, large, fitting] [linear, can, noise, analysis, also, much, recover] [data, nonlinear, population, true, sample, gaussian, distribution, dimensionality, result, mean, testing, fitted] [model, pflds, plds, time, predictive, dynamical, neuron, observation, spike, stimulus, aevb, poisson, gcflds, activity, lds, macaque, across, orientation, zrt, state, flds, system, lapem, simulation, count, grating, upon, firing, reaching, primary, simulated, recorded, xrti, kalman] [neural, use, generative, training, performance, approach, capture, using, preprint, arxiv, figure, compare, compared]
Learning to Communicate with Deep Multi-Agent Reinforcement Learning
Jakob Foerster, Yannis M. Assael, Nando de Freitas, Shimon Whiteson


[message, among, partial, many] [learning, differentiable, day, optimal, consider, since, binary, set, setting, chooses] [parameter, select, gradient, evaluation] [can, one, also, noise, order, first] [two, discrete, independent, space, important, address] [communication, agent, dial, action, rial, centralised, protocol, dqn, state, decentralised, policy, environment, reinforcement, interrogation, across, nocomm, receives, mat, essential, room, uat, switch, communicate, reward, another, share, must, time, cooperative, multiagent, execution] [deep, network, figure, neural, sharing, mnist, preprint, using, arxiv, learn, performance, language, recurrent, training, without, learned, approach, input, different, task, used]
Finite-Sample Analysis of Fixed-k Nearest Neighbor Density Functional Estimators
Shashank Singh, Barnabas Poczos


[probability, fact, strictly] [bound, lemma, rate, learning, conference, function, since, may, show, let, known, supported, constant, bounded, theorem, define, lower, class] [variance, machine, divergence, processing, log, convergence, additional, due] [can, ieee, analysis, error, also, require, one, suppose, assumption, via, support] [bias, estimator, density, estimation, entropy, based, functional, neighbor, mutual, functionals, estimating, estimate, sample, consistent, positive, shannon, correction, boundary, nonparametric, multivariate, asymptotic, given, increasing, random, mean, finite, fixing, alfred, nearest] [information, continuous, form, statement] [approach, neural, international, using, table, used, work, several]
Statistical Inference for Pairwise Graphical Models Using Score Matching
Ming Yu, Mladen Kolar, Varun Gupta


[graphical, scoring, rule, graph, edge, structure, number, valid, probabilistic] [let, confidence, asymptotically, set, coverage, learning, theorem, rate, case, constant, will, lemma, since, consider] [inference, log, parameter, large, step] [can, first, assumption, min, regularized, one, following, vector, via, linear, see, sparse, matrix, arg, high, condition, suitable] [exponential, estimator, gaussian, estimation, density, conditional, procedure, selection, family, distribution, data, normal, dimensional, consistent, given, statistical, empirical, based, asymptotic, estimated, construct, well, robust, random, sample, corresponding, signaling] [model, literature, form, next, event] [score, matching, using, work, use, table, used, arxiv, figure]
New Liftable Classes for First-Order Probabilistic Inference
Seyed Mehran Kazemi, Angelika Kimmig, Guy Van den Broeck, David Poole


[theory, lifted, recursion, wfomc, probabilistic, rule, clause, prvs, den, predicate, grounding, unary, relational, contains, srl, lvs, guy, weighted, number, david, logic, assigned, prv, represent, transitivity, dan, liftable, formula, recursively, birthday, called, branch, logical, compiling, clausal] [set, let, case, example, consider, every, learning, problem, binary, show, finding, class, lemma, may, assume] [inference, size, compute, calculating] [can, one, symmetric, also, first, analysis, exactly, suppose, decomposition, following] [two, population, exponential, true, polynomial, given, statistical, done, random] [van, time, model, markov, new, individual] [domain, using, ground, applying, unit, different, without]
Structured Sparse Regression via Greedy Hard Thresholding
Prateek Jain, Nikhil Rao, Inderjit S. Dhillon


[overlapping, number] [algorithm, greedy, let, set, theorem, convex, problem, show, even, least, obtain, provide, learning, function, lemma, appendix, setting, general, arbitrary, proof, conference, case, consider] [log, approximate, method, standard, processing, operator, solve, optimization] [group, iht, can, sparse, sparsity, regression, projection, hard, existing, following, thresholding, error, arg, min, also, condition, via, signal, vector, iterative, linear, recovery, gomp, note, one, suppose, require, restricted, ieee, exact, significantly, onto, analysis, high, sog, synthetic] [result, data, based, selection, given, well] [information, model, time] [using, similar, neural, international, overlap, used, structured, figure]
Fundamental Limits of Budget-Fidelity Trade-off in Label Crowdsourcing
Farshad Lahouti, Babak Hassibi


[crowdsourcing, worker, number, probability, kic, average, incidence, memoryless, labeling, crowdsourcer, fundamental, taskmaster, crowd, possible, clustering, valid, level, identify, indicates] [query, may, case, consider, theorem, rate, problem, set, function, follows, scheme, unknown, budget, algorithm, known, appendix, label, show] [large, inference, iid, optimization, processing] [error, can, item, one, following, min, suitable, overall, small, noisy, certain, analysis] [two, distortion, theoretic, given, discrete, described, section, joint, remain] [skill, coding, code, information, model, design, feedback, within, optimized, purpose] [channel, source, performance, decoder, figure, presented, output, different, per, neural, dataset, input, used, work, task, using]
Safe and Efficient Off-Policy Reinforcement Learning
Remi Munos, Tom Stepleton, Anna Harutyunyan, Marc Bellemare


[thus, cut, coefficient, sum, possible, definition, notice] [learning, greedy, algorithm, consider, theorem, will, defined, since, case, online, function, arbitrary, proof, assume, conference, depends, best, may, make, lemma, deduce, general, close, show] [operator, convergence, variance, evaluation, full, machine, importance, sampling, efficient, around, converges] [assumption, can, contraction, product, first, low, need, min] [fixed, point, sample, result, infinite, estimate, given, finite, provided] [policy, control, target, behaviour, trace, increasingly, value, reinforcement, txe, bellman, action, choice, write, glie, safe, time, temporal, replay] [sequence, using, score, use, mapping, international, single, deepmind]
Unsupervised Feature Extraction by Time-Contrastive Learning and Nonlinear ICA
Aapo Hyvarinen, Hiroshi Morioka


[number, monotonic, structure, obtained] [learning, function, case, since, will, blind, general, corollary, constant, show, theorem, proof, assume, define] [method, machine, logistic] [linear, can, also, one, mlr, component, note, analysis, see, vector] [nonlinear, data, ica, tcl, extractor, independent, segment, given, based, nonstationary, nonstationarity, modulated, principle, point, identifiability, equal, well, mixing, mean, estimation, transformation, distribution, chance] [model, time, temporal, must, new, meg, series] [feature, used, source, using, neural, unsupervised, generative, different, deep, classification, figure, use, layer, hidden, spatial, training, performance, trained, learn, similar, network, higher]
Sparse Support Recovery with Non-smooth Loss Functions
Kévin Degraux, Gabriel Peyré, Jalal Fadili, Laurent Jacques


[probability, identifiable, stable, lagrange, constraint, sharp, thus, consistency] [case, lemma, loss, proof, theorem, let, problem, general, function, will, supported, assume, since, show, hold, define, now, set, instance, observe, convex, satisfied, provide, special] [dual, smooth, solve, large] [support, can, noise, solution, condition, stability, sparse, recovery, note, vector, sensing, one, compressed, analysis, main, first, certificate, restricted, also, small, order, norm, matrix, following, injectivity, subspace, see, ieee, sign, observed, min, require, sparsity, signal] [section, result, corresponding, data, theoretical, particular, journal, important, noting, associated, specific, correspond, equal] [extended, information, model] [different, using, figure, able]
Learning values across many orders of magnitude
Hado P. van Hasselt, Arthur Guez, Matteo Hessel, Volodymyr Mnih, David Silver


[normalized, many] [learning, function, algorithm, consider, may, conference, defined, adaptive, changing, loss, lower, now, depends] [normalization, update, clipping, sgd, step, unnormalized, gradient, size, large, appropriate, machine, straightforward, processing] [can, magnitude, much, proposition, first] [data, important, squared, true, mean] [dqn, reinforcement, target, clipped, change, reward, new, shift, atari, adapt, normalize, mnih, median, van, thereby, action, time, information, heuristic, unclipped, hasselt, policy] [neural, scale, double, deep, using, without, different, performance, output, input, network, international, single, tune, used, use, natural, figure, work, layer]
Scan Order in Gibbs Sampling: Models in Which it Matters and Bounds on How Much
Bryan D. He, Christopher M. De Sa, Ioannis Mitliagkas, Christopher Ré


[variable, probability, number, conjecture] [will, show, theorem, prove, bound, set, lazy, best, always] [sampling, factor, augmented, method, faster, distributed, due] [order, can, conductance, one, good, relative, onto, small, also, analysis, much, matrix] [scan, systematic, mixing, random, two, bridge, gibbs, island, chain, tmix, distribution, permutation, space, efficiency, mix, polynomial, true, mass, stationary, section, needed, sample, corresponding, statistical, asymptotic, discrete, slower, sampler, result, way, nsf] [model, state, time, markov, transition, effect, move, information, choice, must] [using, different, sequence, single, pyramid, used, figure, several, compared]
Stochastic Gradient Methods for Distributionally Robust Optimization with f-divergences
Hongseok Namkoong, John C. Duchi


[] [algorithm, convex, regret, problem, loss, bound, give, show, appendix, let, function, inf, risk, provide, minimization, learning, confidence, optimal, set, theorem, define, case, obtain, lemma, may, bandit, proof, now, conference, calibrated, differentiable, sup, online] [gradient, descent, stochastic, optimization, log, machine, objective, mirror, divergence, method, standard, efficient, dual, university, variance, duchi, step, convergence, update, strongly, distributionally, term, compute] [can, first, following, vector, formulation, solving, plot, solution, high, computational] [robust, empirical, section, procedure, journal, provides, sample, point, data, test, url] [time, identical, uncertainty, choice] [use, approach, using, performance, classification, figure]
Truncated Variance Reduction: A Unified Approach to Bayesian Optimization and Level-Set Estimation
Ilija Bogunovic, Jonathan Scarlett, Andreas Krause, Volkan Cevher


[according, potential, theory] [algorithm, set, confidence, function, cost, gchk, setting, provide, regret, since, truvar, best, maximizers, choosing, dependence, learning, lower, bound, choose, define, chosen, corollary, version, smaller, consider] [variance, optimization, posterior, bayesian, supplementary, log, reduction, performs, respect, heteroscedastic] [noise, can, truncated, also, one, via, unified, synthetic, significantly, high, following, lake, small] [point, theoretical, data, mean, estimation, result, gaussian, lse, given, two, based, kernel, selected] [time, within, search, target, found, value, process, choice, typically, directly, track] [use, figure, previous, domain, performance, instead, different, using, better, similar, three]
Scaled Least Squares Estimator for GLMs in Large-Scale Problems
Murat A. Erdogdu, Lee H. Dicker, Mohsen Bayati


[number, find] [algorithm, least, cost, let, theorem, may, assume, rate, problem, constant, bound, show, provide, general, defined] [glm, ols, convergence, glms, method, step, optimization, bfgs, lbfgs, gradient, agd, proportionality, term, logistic, university, batch, cubic, quadratic] [regression, linear, error, can, denote, lasso, proposition, computational, matrix, second, order, min, first, vector, generalized, via, approximately] [mle, estimator, section, random, test, estimating, normal, gaussian, distribution, scaled, mean, covariance, covariates, estimate, corresponding, two, based, data, maximum] [design, time, model, right] [performance, used, use, proposed, dataset, relationship, figure, using, several]
Mixed Linear Regression with Multiple Components
Kai Zhong, Prateek Jain, Inderjit S. Dhillon


[clustering, neighborhood] [algorithm, complexity, will, theorem, convex, let, show, max, function, appendix, set, optimal, constant, define, problem, case, guarantee, learning, proof, assume, minimization, provide] [method, objective, convergence, compute, gradient, mixed, requires, converges, standard, descent, initial, strongly, step] [linear, tensor, subspace, can, initialization, computational, min, mlr, regression, global, denote, also, power, small, recovery, local, solving, sampled, still, solution, dimension, exact, zit, formulation, certain, one, analysis, via, following] [sample, data, random, two, point, resampling, independent, locally, empirical] [model, time, required] [use, different, using, multiple, ground, generated, table, propose, daniel, truth, proposed, dataset]
Following the Leader and Fast Rates in Linear Prediction: Curved Constraint Sets and Other Regularities
Ruitong Huang, Tor Lattimore, András György, Csaba Szepesvari


[constraint, thus, theory] [regret, ftl, learning, convex, online, let, bound, loss, will, algorithm, differentiable, theorem, assume, show, set, case, since, function, lemma, even, bounded, satisfies, achieve, lower, logarithmic, setting, constant, known, inequality, achieves, minimax, get, proof, leader, conference, expected, prove, corollary, adaptive, fast, consider, example] [stochastic, away, machine, university, strongly, processing, optimization] [linear, can, proposition, also, one, principal, small, note, paper, following, support, denote, norm, first, much, enjoys, vector] [curvature, data, result, boundary, curve] [information, angle, next, whether, long] [prediction, used, unit, neural, compact, work, previous]
Variational Inference in Mixed Probabilistic Submodular Models
Josip Djolonga, Sebastian Tschiatschek, Andreas Krause


[probabilistic, called, partition, quality, represent, describe] [submodular, modular, set, function, problem, algorithm, upper, bound, will, consider, binary, facility, diversity, conference, learning, best, considered, let, andreas, general, property, arbitrary] [inference, approximate, variational, mixed, fldc, flic, flid, optimization, large, attractive, latent, repulsive, approximation, djolonga, machine, solve, log, inner, psms, josip, efficient] [can, product, one, item, recommendation, following, via, first, note] [distribution, given, corresponding, based, section, point, procedure] [model, location, form, optimize, optimizing] [use, used, performance, different, proposed, similar, task, using, accuracy, ground, better, dataset, approach, international]
A scalable end-to-end Gaussian process adapter for irregularly sampled time series classification
Steven Cheng-Xian Li, Benjamin M. Marlin


[number, block] [algorithm, show, expected, learning, loss, since, set, defined, label] [lanczos, approximation, adapter, posterior, gradient, method, compute, interpolation, inducing, uac, sampling, marginal, approximate, irregularly, computing, irregular, ski, log, efficiently, conjugate, approximating, due, computed, wilson, cubic, expectation] [can, sparse, matrix, error, also, sampled, computation, regression, exact, subspace, linear, krylov, one, symmetric, square] [kernel, gaussian, data, random, covariance, section, described, given, mean, space, sample, based] [time, series, framework, meg, process, uncertainty, model] [using, classification, use, classifier, training, train, different, feature, applied, shown, neural, including, representation, proposed, work, output, approach, structured]
Neurons Equipped with Intrinsic Plasticity Learn Stimulus Intensity Statistics
Travis Monk, Cristina Savin, Jörg Lücke


[circuit, variable, average, negative, number, represents, rule] [learning, class, function, set, optimal, let, algorithm, may, show, appendix] [parameter, posterior, standard, bayesian] [can, also, one, preprocessing, link] [data, wcd, given, distribution, robust, mixture, mean, comp] [model, plasticity, intensity, excitability, intrinsic, neuron, synaptic, stimulus, gain, evolution, spiking, artificial, learns, maximize, ppg, role, process, dull, change, wcgen, gsm, observation, cortical, activity, hebbian] [neural, network, input, hidden, figure, generative, used, classification, using, learn, different, unsupervised, dataset, mnist, training, digit, use, work, performance, generate, learned, without, novel, generated]
Domain Separation Networks
Konstantinos Bousmalis, George Trigeorgis, Nathan Silberman, Dilip Krishnan, Dumitru Erhan


[htc, number] [loss, private, learning, set, class, function, let, label] [datasets, gradient, respect] [synthetic, error, also, can, one, real, orthogonality, subspace] [mmd, mean, data, two, squared, kernel, separation] [target, model, traffic, soft] [domain, shared, source, adaptation, unsupervised, representation, use, dataset, classification, used, object, training, classifier, trained, using, pose, mnist, image, reconstruction, dann, svhn, similarity, work, similar, background, neural, task, encoder, applied, like, dsn, deep, ground, network, truth, input, hsc, layer, hidden, different, labeled, recognition, validation, predict, without]
Online and Differentially-Private Tensor Decomposition
Yining Wang, Anima Anandkumar


[definition, probability, variable, number, level] [algorithm, privacy, theorem, private, online, differentially, learning, improved, show, complexity, streaming, bound, exists, guarantee, let, consider, bounded, sup, even, appendix, arbitrary, least, set] [method, step, end, efficient, latent, linearly, requires, objective] [tensor, power, noise, decomposition, analysis, can, matrix, one, assumption, perturbation, dimension, symmetric, also, first, order, noisy, spectral, condition, recovery, popular, orthogonal, via, component, linear, error, suppose, high, initialization, remark, norm, recovers, guaranteed, whitening] [data, robust, random, sample, important, practical, gaussian, moment, procedure, differential, journal, population] [model, simple] [memory, input, using]
Architectural Complexity Measures of Recurrent Neural Networks
Saizheng Zhang, Yuhuai Wu, Tong Che, Zhouhan Lin, Roland Memisevic, Ruslan R. Salakhutdinov, Yoshua Bengio


[graph, number, node, directed, coefficient, definition, path, adding, edge, dependency, larger] [learning, set, consider, complexity, general, let, since, even, might] [term, size, sequential, step, large, extra, increase] [can, also, one, first, following, note, denote] [two, given, length, increasing, test, specific, nonlinear, measure, result] [time, model, connecting, long, cyclic, information, new, transition, baseline] [recurrent, skip, rnn, depth, neural, gun, feedforward, tanh, rnns, different, mnist, performance, architecture, figure, sequence, lstm, preprint, using, input, arxiv, hidden, table, shown, architectural, stacked, yoshua, previous, similar, multiple, unfolded, output, investigate]
An Online Sequence-to-Sequence Model Using Partial Conditioning
Navdeep Jaitly, Quoc V. Le, Oriol Vinyals, Ilya Sutskever, David Sussillo, Samy Bengio


[block, number, probability, thus, possible, partial] [best, learning, show, every, function, conference, context, let, since] [step, computed, log, end, size, compute, processing, timit, large] [can, one, vector, first, note] [two, equation, data, section] [model, time, state, next, target, current, information, across, emitted, brain] [transducer, neural, output, input, sequence, attention, using, encoder, speech, recurrent, used, lstm, last, layer, figure, alignment, network, recognition, produce, rnn, entire, produced, symbol, conditioned, three, work, deep, prediction, mechanism, impact, arxiv, preprint, trained, oriol, per, google, generated, softmax, seen, achieved, accuracy, task, table, dzmitry, yoshua]
Community Detection on Evolving Graphs
Stefano Leonardi, Aris Anagnostopoulos, Jakub Łącki, Silvio Lattanzi, Mohammad Mahdian


[cluster, node, clustering, evolving, probability, structure, block, graph, number, community, many, social, detection, cit] [algorithm, query, case, lemma, show, assume, may, every, theorem, constant, expected, even, now, bound, let, problem, give, strategy, study, exists, conference, optimal] [step, stochastic, log, size, large] [can, first, assumption, high, also, error, one, note, analysis, main, detecting, denote, second] [data, given, random, two, distribution, section, theoretical] [model, time, dynamic, evolution, new, information] [per, single, use, work, different, entire, using, input, used, network, probing, international, able]
Tight Complexity Bounds for Optimizing Composite Objectives
Blake E. Woodworth, Nati Srebro


[find, number, average, thus] [lower, algorithm, randomized, convex, oracle, prox, bound, function, access, complexity, consider, make, kxk, upper, appendix, defined, round, mlb, show, even, optimal, obtain, exists, theorem, since, problem, will, dependence, least, composite, provide, minimizing, query, define] [deterministic, log, gradient, optimization, accelerated, smooth, stochastic, strongly, distributed, term, svrg, large, iterates, machine, atyusha] [can, component, order, dimension, also, following, first, orthogonal, exact, much, sufficiently, need] [point, finite, construction, based] [must, optimizing, information, required] [using, previous, arxiv, preprint, similar, table, several, without, different, neural, presented, use]
Inference by Reparameterization in Neural Population Codes
Rajkumar Vasudeva Raju, Xaq Pitkow


[probabilistic, graphical, node, many, lbp, propagation, represent, pairwise, probability, loopy, distinct, tree, rule] [algorithm, general, since, set, will, even] [inference, marginal, distributed, quadratic, normalization, reparameterization, posterior, approximate, bayesian, processing, performs, update] [can, linear, one, local, sufficient, computation, noise] [nonlinear, population, distribution, two, joint, gaussian, true, important, multivariate, section, exp] [activity, information, belief, trp, model, ppcs, time, tst, neuroscience, brain, pseudomarginals, evidence, marginalization, nature, slow, beck, divisive, neuronal, sensory] [neural, network, different, natural, figure, using, performance, perform, use, encode, multiple, representation, work]
Data Poisoning Attacks on Factorization-Based Collaborative Filtering
Bo Li, Yining Wang, Aarti Singh, Yevgeniy Vorobeychik


[knowledge, complete, resulting] [attacker, minimization, algorithm, optimal, set, utility, function, problem, learning, uniformly, defined, since, availability, conference, consider] [gradient, optimization, compute, parameter, sgld, machine, posterior, computed, step, university] [malicious, matrix, attack, poisoning, can, collaborative, norm, nuclear, filtering, alternating, user, mij, one, solution, also, first, recommendation, formulation, avoid, rmse, completion, rating, kkt, rated, integrity, item, singular, min, approximately, popular, pga, via, projected, regularization] [data, normal, two, based, specific, random, section, robust] [system, model, research, prior, information, value] [using, use, different, figure, adversarial, original, similar, feature, effective]
k*-Nearest Neighbors: From Global to Local
Oren Anava, Kfir Levy


[number, weighted, thus, find, present, consistency] [optimal, algorithm, implies, bound, choosing, adaptive, best, theorem, assume, consider, may, latter, since, problem, cost, confidence] [standard, term, objective, yielding, datasets, log, method, requires, efficiently] [solution, first, note, following, order, running, condition, error, analysis, second, vector, regression, local, also, nonzero, noise] [data, given, kernel, nearest, neighbor, equation, distance, section, two, point, locally, metric, theoretical, well, based, estimation, regime, bandwidth] [value, decision, time, form, therefore, information, whereas, towards, new] [prediction, approach, weight, using, several, classification, different, per, used, dataset, three, text, table]
Learning Multiagent Communication with Backpropagation
Sainbayar Sukhbaatar, Arthur Szlam, Rob Fergus


[number, thus, block, graph, structure] [set, will, learning, get, now, call, function, show, expected, conference, let] [gradient, processing] [can, vector, also, one, linear, matrix, note, order, view] [two, discrete, given, well, independent, address, specific] [model, communication, agent, controller, time, form, state, take, reward, dynamically, baseline, reinforcement, multilayer, followed, control, nonlinearity, concatenation, communicate, another, simple, feed, rather, cooperating, extend, policy, abstract, information, within, scalar, simplify, commnet, sized, email, broadcast, module, range] [neural, input, network, used, output, single, architecture, layer, hidden, applied, consists, use, feedforward, viewed, using, final, different]
Combinatorial Energy Learning for Image Segmentation
Jeremy B. Maitin-Shepard, Viren Jain, Michal Januszewski, Peter Li, Pieter Abbeel


[kevin, merge, connected, pairwise, larger, structure] [learning, obtain, binary, combinatorial, may, defined, define, general] [energy, method, computed, large, step, size, initial, efficient, evaluation] [local, connectivity, can, error, also, existing, vector, sebastian] [based, two, corresponding, given, boundary, test, automated, true, well, data, type] [model, information, within, prior, box, neuronal, required, voxels, modeling, potentially, action] [segmentation, shape, image, descriptor, neural, celis, agglomeration, training, bounding, network, using, deep, used, position, use, microscopy, convolutional, reconstruction, voxel, accuracy, winfried, electron, shown, volume, single, score, moritz, classification, dataset, rand, several, approach, feature, scale, viren, object, computer, watershed, architecture]
Perspective Transformer Nets: Learning Single-View 3D Object Reconstruction without 3D Supervision
Xinchen Yan, Jimei Yang, Ersin Yumer, Yijie Guo, Honglak Lee


[introduced] [learning, loss, problem, set] [silhouette, latent] [projection, can, view, matrix, good, one, following, also] [test, transformation, two, corresponding, point, generalization, based, given, project, data] [model, agent, azimuth, demonstrate, experimental] [volume, shape, network, object, image, perspective, volumetric, training, representation, using, trained, figure, single, convolutional, input, shown, transformer, ground, category, reconstruction, used, without, neural, proposed, generated, table, truth, unseen, three, learn, camera, deep, viewpoint, encoder, chair, decoder, voxel, combined, performance, learned, conv, propose, better, multiple, train, arxiv, prediction, recurrent, preprint, computer, able, understanding, output]
PAC Reinforcement Learning with Rich Observations
Akshay Krishnamurthy, Alekh Agarwal, John Langford


[number, path, many, probability] [learning, algorithm, function, complexity, optimal, dependence, contextual, set, bound, since, may, lower, assume, class, least, problem, even, general, show, exponentially, pac, focus, bandit] [step, consensus, large, size, requires, deterministic, efficient] [can, assumption, one, also, require, regression, global, denote, first] [sample, polynomial, empirical, estimate, given, space, finite, true, theoretical, result, two] [state, reinforcement, policy, model, value, observation, elimination, exploration, reactive, dfs, pomdps, rich, new, search, decision, action, future, surviving, transition, lsvee, must, time, reward, current, required, planning, realizability, agent, goal] [use, using]
Multi-armed Bandits: Competing with Optimal Sequences
Zohar S. Karnin, Oren Anava


[total, block, bin, average, thus, identify, number, say, inside, denoted, knowledge, notice] [regret, algorithm, loss, arm, bound, optimal, setting, deviation, absolute, online, exploitation, problem, will, est, round, bandit, set, concentrated, adversary, consider, defined, observe, whenever, obtaining, bounded, best, example, assume, might, provide] [stochastic, strongly, standard, optimization] [can, phase, analysis, one, linear, first, observed, sublinear, hard, also, still, def] [mean, test, random, two, statistical, given, fixed, stationary] [action, exploration, environment, value, dynamic, feedback, whether, change, player, time, within, benchmark, determine, long] [sequence, variation, use, static, work, without, using, adversarial]
Incremental Boosting Convolutional Neural Network for Facial Action Unit Recognition
Shizhong Han, Zibo Meng, Ahmed-Shehab Khan, Yan Tong


[strong, number, expression, average] [learning, loss, function, active, set, improvement, will] [incremental, boosting, iteration, boosted, standard, due] [can, ieee, analysis] [selected, based, well, data, positive] [action, decision, limited, information, current, blue, individual, benchmark, target, new] [facial, cnn, proposed, layer, classifier, recognition, training, score, performance, weak, figure, database, calculated, feature, convolutional, cnns, employed, face, used, semaine, neural, spontaneous, input, traditional, activation, shown, red, different, unit, recognizing, novel, learned, disfa, classification, using, deep, trained, updated, illustrated, table, discriminative, previous, four, network]
SoundNet: Learning Sound Representations from Unlabeled Video
Yusuf Aytar, Carl Vondrick, Antonio Torralba


[knowledge] [learning, show, may, loss, since, even] [large, standard, five] [also, can, comparison, order, leverage, one, synchronization, existing] [data, suggest, mean, length] [teacher, model, rich, human] [sound, network, unlabeled, visual, convolutional, video, deep, vision, use, soundnet, scene, using, classification, natural, trained, acoustic, layer, training, performance, representation, transfer, dataset, figure, table, train, used, audio, accuracy, recognition, learn, object, without, labeled, hidden, better, discriminative, neural, approach, deeper, supervision, imagenet, vgg, transferring, andrew, feature, antonio, architecture, recognize, propose, depth, learned, preprint, perform, visualize, dcase, carl]
Improved Error Bounds for Tree Representations of Metric Spaces
Samir Chowdhury, Facundo Mémoli, Zane T. Smith


[tree, ultrametric, number, linkage, clustering, find, notice, claim, present] [bound, theorem, let, max, optimal, defined, cardinality, define, set, upper, covering, proof, depending, prove, obtain, dependence, bounded, will, appendix, function, provide, duality, consider, notion, problem, observe, since, show] [method, university] [can, dimension, stability, one, min, remark, also, denote, note, low, first] [metric, space, embedding, distortion, doubling, data, slhc, additive, finite, given, dxn, dln, two, gromov, hyperbolicity, uxn, ultrametricity, voronoi, result, journal, preserving, point, construction, numerical, dendrogram] [write, hierarchical, typically] [using, question, able, different, map, multiplicative]
Satisfying Real-world Goals with Dataset Constraints
Gabriel Goh, Andrew Cotter, Maya Gupta, Michael P. Friedlander


[constraint, negative, number, often] [problem, convex, will, algorithm, fairness, function, upper, ramp, rate, deployed, set, tight, svm, may, randomized, appendix, loss, every, zafar, hinge, thresholded, learning, let, minimize, svmoptimizer, egregious, feasible, bound, expected] [optimization, large, datasets, method, objective, dual, parameter] [can, one, linear, vector, error, also, support, small, certain] [churn, positive, section, testing, two, given, test, expressed] [model, optimizing, new, desired, optimize, unconstrained, found] [training, classifier, dataset, using, proposed, classification, recall, approach, unlabeled, propose, used, use, labeled, figure, table, different, prediction, candidate, accuracy]
High-Rank Matrix Completion and Clustering under Self-Expressive Models
Ehsan Elhamifar


[clustering, notice, number, incomplete, complete, subset, whose, graph] [algorithm, since, function, convex, problem, conference, set, learning, consider, show, finding, selecting] [machine, performs, optimization, due, method, objective, large, solve] [missing, completion, matrix, fraction, cij, subspace, can, error, lrmc, recover, corrupted, solution, lie, sparse, union, ssc, ieee, rank, synthetic, real, min, via, nonzero, cin, lifting, first, ambient] [data, point, given, journal, important] [framework, deal, information] [using, mfa, computer, figure, different, motion, international, propose, performance, similarity, better, use, representation, dataset, pattern]
Sorting out typicality with the inverse moment matrix SOS polynomial
Edouard Pauwels, Jean B. Lasserre


[degree, detection, number, sum, level, panel, potential, unique, average, collection] [set, theorem, let, will, problem, learning, satisfies, defined, function, lemma, appendix, optimal, provide] [suggests, machine, evaluation, optimization, method] [matrix, can, orthogonal, assumption, following, real, solution, computation, outlier, global, denote, order, also, one] [polynomial, moment, section, empirical, measure, data, given, mathematical, monomials, distinguished, christoffel, cloud, positive, outlyingness, intrusion, well, npd, estimation, based, sublevel, borel, rationale, density, result, family, equal] [inverse, value, simple, left] [shape, network, figure, use, using, similar, score, higher, different, connection, international]
Optimal Binary Classifier Aggregation for General Losses
Akshay Balsubramani, Yoav S. Freund


[ensemble, partial, potential, aggregation, structure, constraint, lagrange, false] [loss, function, binary, minimax, convex, general, learning, optimal, problem, label, theorem, set, example, max, defined, lemma, minimize, will, bound, algorithm, minimizer, class, earlier, may, randomized, appendix] [optimization, slack, dual, efficient, convergence, sigmoid, discussed, applies] [can, vector, also, min, assumption, linear, formulation, solution, convexity, error, paper] [test, data, true, given, predictor, well, result, uniform, increasing, particularly, corresponding, equation] [game, information, decision, convenient, typically, simply] [prediction, unlabeled, using, classifier, classification, used, work, different, including, predict, like, without, use, figure]
Completely random measures for modelling block-structured sparse networks
Tue Herlau, Mikkel N. Schmidt, Morten Mørup


[number, caron, fox, degree, obtained, block, edge, exchangeability, probability, exchangeable, vertex, aij, according, variable, expression, completely, many, assignment, crmsbm, notice, kallenberg, total, structure] [will, define, obtain, function, assume, consider, since, selecting] [sampling, latent, standard, supplementary, method, parameter, update, posterior, bayesian, iid, inference] [can, first, suppose, link, sparse, following, order] [random, measure, distribution, important, infinite, procedure, corresponding, crm, mass, statistical, associated, construction, based] [model, process, poisson, must, form, found, simple, prior] [network, using, figure, representation, use, generated, modelling, different, prediction, invariance, work, generate]
Globally Optimal Training of Generalized Polynomial Neural Networks with Nonlinear Spectral Methods
Antoine Gautier, Quynh N. Nguyen, Matthias Hein


[thus, unique, complete, number] [theorem, let, defined, function, every, lemma, problem, show, optimal, satisfies, constant, achieve, algorithm, proof, guarantee, loss, implies, max, learning, exists, rate, class, prove, consider, lipschitz] [method, objective, sgd, convergence, gradient, optimization, parameter, large, stochastic] [global, one, spectral, can, matrix, note, order, globally, main, optimality, small, see, optimum, certain, also, first, nonnegative, critical, paper, following, fuab, linear, tensor] [fixed, point, metric, nonlinear, radius, two, positive, data, section, space, test, given, uci] [model, reason, new] [neural, hidden, use, training, network, layer, deep, figure, performance, architecture, work]
Fast learning rates with heavy-tailed losses
Vu C. Dinh, Lam S. Ho, Binh Nguyen, Duy Nguyen


[probability, clustering, obtained, number] [learning, fast, lemma, function, hypothesis, rate, loss, class, risk, sup, bounded, satisfies, mendelson, exists, define, theorem, heavy, let, consider, exist, defined, least, prove, study, bound, optimal, obtain, derive, mehta, concentration, brownlees, depending, regularity, constant, implies] [convergence, standard, verify, log, machine, separable] [condition, can, assumption, denote, order, following, also, vector, arbitrarily, min] [empirical, envelope, distribution, finite, result, given, journal, random, positive, definite, robust, annals] [unbounded, information, failure, extend, new, van, assuming] [source, previous, using, enable, work, recall, neural]
Adaptive Concentration Inequalities for Sequential Decision Problems
Shengjia Zhao, Enze Zhou, Ashish Sabharwal, Stefano Ermon


[probability, number, walk, many, variable, often, threshold, definition, identification, law] [bound, algorithm, theorem, stopping, let, hoeffding, arm, adaptive, problem, set, best, asymptotically, concentration, inequality, optimal, bounded, will, achieve, show, hold, terminate, confidence, make, lil, exponentially, chosen, learning, positiveness, adaptively, constant, function, risk, crossed, unlikely, robert] [log, sequential, large, university, parameter] [can, also, zero, sparse, plot, one, analysis, following] [random, mean, boundary, empirical, based, sample, distribution, well, asymptotic, testing, lim, journal] [time, new, agent, decision, behavior, value, design] [figure, performance, similar, use, various, including, using]
Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning
Jean-Bastien Grill, Michal Valko, Remi Munos


[trailblazer, node, number, avg, tree, root, called, definition, possible, either, subset, child, uct, probability, notice, structure, identify, branching] [complexity, max, case, bound, algorithm, set, want, define, problem, call, consider, provide, optimistic, optimal, munos, may, defined, oracle, theorem, difficulty, conference, expected, function] [sampling, end, factor] [can, order, following, first, high, min, note, one] [sample, finite, measure, infinite, two, maximum, estimate, exponential, specific, quantity, polynomial, random, result] [planning, value, state, model, next, action, transition, information, mdp, control, potentially, policy, reward, decision, stop] [generative, depth, figure, previous, use, using, neural, like]
Maximization of Approximately Submodular Functions
Thibaut Horel, Yaron Singer


[definition, many, called, among, theory] [submodular, function, algorithm, case, set, constant, learning, ratio, lower, greedy, problem, monotone, show, cardinality, max, let, maximizing, obtain, exists, coverage, bound, submodularity, lemma, theorem, implies, class, even, returned, close, achieves, define, access, drawn, uniformly, inequality, active, general, proof, matroid, version, optimal, consider, since, convex, otherwise, bounded] [approximation, approximate, optimization, size, objective] [can, approximately, one, error, high, assumption, noise, via, solution, noisy, main, second, note, also, first] [random, given, result, curvature, two, application, additive, estimated, independent, exponential] [value, information, range] [using, used, input]
Adversarial Multiclass Classification: A Risk Minimization Perspective
Rizal Fathony, Anqi Liu, Kaiser Asif, Brian Ziebart


[consistency, constraint, number, potential, possible] [loss, multiclass, minimization, function, set, learning, risk, hinge, theorem, algorithm, label, svm, margin, llw, binary, universally, argmaxj, surrogate, max, since, best, peter, minimax, equilibrium, conference, convex, case] [dual, machine, method, optimization, parameter, datasets, large, processing] [vector, support, linear, can, formulation, also, relative, min, computational, low] [kernel, fisher, empirical, data, provides, journal, space, consistent, gaussian, true, universal, theoretical, two, given, multivariate, efficiency] [game, model, information, value] [adversarial, feature, training, classification, using, classifier, prediction, three, table, performance, accuracy, neural, input, use, generation, dataset, figure]
Learning Sensor Multiplexing Design through Back-propagation
Ayan Chakrabarti


[number, find, level] [learning, optimal, since, even, set] [inference, full, method, standard] [noise, measurement, one, can, also, see, hard, significantly, first, sensing, sparse, note, computational, ieee] [corresponding, joint, measure, based, two, entropy, given] [sensor, design, measured, across, light, learns, choice, intensity] [color, pattern, reconstruction, network, training, image, bayer, multiplexing, neural, using, learned, learn, approach, layer, use, jointly, different, used, channel, input, demosaicking, rgb, traditional, camera, output, train, deep, trained, psnr, higher, cfa, performance, single, able, better, similar, table, aliasing, encode, coded, convolutional, proposed, final, figure]
Multi-step learning and underlying structure in statistical models
Maia Fraser


[possible, probability, definition, structure, subset] [learning, class, bound, lower, case, let, shattering, function, algorithm, assume, concept, will, upper, complexity, expected, problem, example, learner, prove, define, consider, supremum, theorem, worst, loss, access, version, depends, defined, sup, hypothesis, since, implies, pac, best, now] [size, marginal, marginals] [dimension, can, also, error, one, compatibility, assumption, noisy, group, first, invariant, sufficient] [sample, distribution, joint, given, data, true, measure, manifold, uniform, two, generalization, determined, underlying, section, statistical, conditionals, finite, reference] [framework, model, action, stick, next] [using, labeled, unlabeled, use, used, figure, training, three]
Learning Bayesian networks with ancestral constraints
Eunice Yuh-Jie Chen, Yujia Shen, Arthur Choi, Adnan Darwiche


[ancestral, dag, constraint, edge, ordering, structure, path, decomposable, variable, graph, directed, iff, tree, contains, negative, find, cpdag, topological, number, ancestor, entailed, programming, adding, knowledge, enforce, compatible, shortest, denoted, represent] [set, learning, optimal, consider, satisfies, conference, algorithm, problem, finding, since, function, oracle, let, satisfy, show, exists, will, case, may] [bayesian, machine] [can, one, denote, also, linear, note, following, first, percentage] [given, based, section, two, space, positive, equation, maximum] [search, infer, artificial, time, heuristic] [approach, network, using, table, used, proposed, different, dataset, impact, international, available]
Coevolutionary Latent Feature Processes for Continuous-Time User-Item Interactions
Yichen Wang, Nan Du, Rakshit Trivedi, Le Song


[interaction, base, number, mae, influence, contains] [function, will, convex, may, online, service, algorithm, since] [latent, averaging, method, gradient, objective, compute, efficient, epoch, term, optimization, due] [user, can, item, rank, low, poissontensor, lowrankhawkes, coevolving, one, matrix, also, observed, stic, factorization, recommendation, collaborative, hence, group, iptv, movie, product, interact, fip, ank] [point, data, two, conditional, based, space, given, testing, kernel, exponential] [time, model, temporal, intensity, process, event, hawkes, history, drift, nature, reddit, new, coevolutionary, yelp, poisson, framework] [feature, prediction, different, using, figure, capture, better]
Universal Correspondence Network
Christopher B. Choy, JunYoung Gwak, Silvio Savarese, Manmohan Chandraker


[negative, many, mining, number] [learning, loss, since, fast, surrogate, margin] [faster, efficient, large, normalization, optimization] [also, accurate, note, global, hard, second, sparse, invariant, can] [metric, geometric, neighbor, nearest, space, test, universal, estimation] [raw, prior, across, directly] [correspondence, convolutional, spatial, network, use, image, dense, feature, transformer, semantic, patch, fully, contrastive, sift, visual, figure, similarity, table, training, ucn, dataset, pck, performance, used, ground, flow, truth, using, pair, kitti, different, propose, train, shape, deep, layer, keypoints, neural, keypoint, learn, daisy, accuracy, without, trained, siamese, object, pascal, cnn, novel, matching, various, cub, generate, appearance, several]
Active Learning with Oracle Epiphany
Tzu-Kuo Huang, Lihong Li, Ara Vartanian, Saleema Amershi, Xiaojin Zhu


[number, disagreement, probability, induced, threshold, total] [oracle, epiphany, active, learning, query, algorithm, epical, complexity, version, will, agnostic, hypothesis, label, may, passive, cal, bound, case, unknown, set, erm, learner, define, let, mcal, since, online, problem, theorem, induce, every, webpage, risk, appendix, least, assume, realizable, analyze, abstain, regret, show, huang, consider, formalize] [standard, end, term, processing, machine, step, factor] [analysis, can, error, one, note, also, following, small, order] [space, two, empirical, section, distribution, given, additive, theoretical, test] [information, model, human, basketball, current] [region, neural, input, without, unlabeled, classification, labeled, figure]
Exploiting Tradeoffs for Exact Recovery in Heterogeneous Stochastic Block Models
Amin Jalali, Qiyang Han, Ioana Dumitriu, Maryam Fazel


[community, sbm, number, block, heterogenous, nmin, pmin, detection, program, many, probability, recoverability, graph, recoverable, configuration, yij, summary, notice, possible, edge, total, clustering] [theorem, convex, consider, example, case, provide, bound, appendix, equivalent, algorithm, proof, assume, concentration, considered, general, upper, lower] [log, stochastic, size, likelihood, parameter, large, efficiently] [can, recovery, small, exact, matrix, semidefinite, one, observed, condition, via, recover, see, high, still, spectral, connectivity, following, first, relative, sparse, analysis, computational] [random, given, well, maximum, section, estimator, two, based, statistical] [model, information] [similar, different, dense, understanding, use]
Measuring the reliability of MCMC inference with bidirectional Monte Carlo
Roger B. Grosse, Siddharth Ancha, Daniel M. Roy


[probabilistic, reverse, programming, number, many, partition, quality] [bound, upper, lower, appendix, may, function, show, ratio, obtain, consider, set, since] [log, posterior, inference, approximate, divergence, bread, mcmc, stochastic, jeffreys, monte, stan, carlo, webppl, dkl, method, hyperparameters, bdmc, unbiased, sampling, validate, convergence, expectation, qrev, importance, bidirectional, initial, automatic, rev] [can, one, real, exact, analysis, order, matrix, also, first, nonnegative] [data, distribution, true, section, estimate, sample, chain, given, estimator, two, based, described, statistical, fitted] [simulated, forward, model, state, behavior, target, markov] [using, used, use, accuracy, bounding, evaluate, produced, sequence, different]
Adaptive Maximization of Pointwise Submodular Functions With Budget Constraint
Nguyen Cuong, Huan Xu


[possible, partial, constraint, half] [cost, utility, adaptive, set, submodular, function, greedy, submodularity, budget, pointwise, problem, theorem, active, learning, modular, fworst, satisfies, setting, best, optimal, consider, budgeted, general, let, alc, blc, constant, monotone, assume, maximizes, case, guarantee, put, coverage, class, observe, proof, may, show, andreas, will, considered, depends, define, worst] [optimization, factor, select, supplementary] [can, also, item, one, minimal, still, note] [two, data, given, selected, section, uniform, theoretical, generalization] [policy, realization, state, information, value, next] [better, previous, using, table]
NESTT: A Nonconvex Primal-Dual Splitting Method for Distributed and Stochastic Optimization
Davood Hajinezhad, Mingyi Hong, Tuo Zhao, Zhaoran Wang


[variable] [problem, algorithm, convex, function, define, case, rate, consider, will, set, theorem, let, setting, constant, show, learning, achieves, worst] [nonconvex, convergence, gradient, stochastic, distributed, nonsmooth, optimization, saga, nestt, smooth, method, sampling, dual, proximal, iteration, update, rgi, quadratic, incremental, picked, descent, processing, splitting, primal, sgd, converges, siam] [can, following, note, component, linear, suppose, local, optimality, min, assumption, solution, one, gap, also, analysis, matrix, signal, sublinear, see, comparison, related] [based, uniform, given, stationary, section, result, data, journal] [information, agent] [different, randomly, proposed, using, used, generated, neural, recent, table]
Generalized Correspondence-LDA Models (GC-LDA) for Identifying Functional Regions in the Brain
Timothy Rubin, Oluwasanmi O. Koyejo, Michael N. Jones, Tal Yarkoni


[number, probability, indicator, variable, assigned, total, david] [function, set, version, equivalent] [constrained, processing, inference] [can, one, via, note, generalized, also, paper, onto, sampled] [word, topic, distribution, functional, subregions, gaussian, peak, token, document, data, subregion, neurosynth, neuroimaging, sample, symmetry, multinomial, two, based, section, associated, mixture, lda, linguistic, correspond, multivariate, refer, given, corresponding, provided, described] [model, brain, cognitive, fmri, unconstrained, human, ability, significant, across, modeling] [spatial, activation, figure, neural, single, three, using, generative, correspondence, capture, database, shown, predicting, language, extracted, novel, used, different, text, region]
A Minimax Approach to Supervised Learning
Farzan Farnia, David Tse


[rule, probability, find, variable] [loss, minimax, problem, learning, theorem, function, set, svm, consider, max, minimizing, will, expected, duality, version, logarithmic, convex, optimal, call, erm, defined, risk, show, hinge, define, general, conference, let] [likelihood, quadratic, machine, step, supplementary, marginal, minimizes, divergence, material, logistic, standard] [linear, can, generalized, regression, following, also, solving, error, via, formulation, min] [maximum, distribution, entropy, bayes, conditional, given, principle, empirical, robust, based, specific, journal, statistical, discrete, underlying, denotes, address] [decision, information, model, uncertainty] [approach, figure, prediction, training, classification, using, supervised, propose, applying, task, predict, proposed, different, seen]
Residual Networks Behave Like Ensembles of Relatively Shallow Networks
Andreas Veit, Michael J. Wilber, Serge Belongie


[path, number, many, collection, removing, block, valid, ensemble] [show, remaining, even, learning, follows] [gradient, stochastic, depend, strongly, batch, processing] [can, error, one, short, view, first, behave, though, fraction] [length, distribution, test, data, result, sample] [individual, module, long, simple, whether, early, exhibit] [residual, network, neural, figure, deleting, training, layer, deep, depth, effective, single, input, vgg, performance, output, highway, dropping, convolutional, skip, unraveled, building, preprint, impact, lesion, investigate, computer, contribute, like, arxiv, shown, trained, downsampling, use, seen, smoothly, flow, vision, agnitude, perform, visual, imagenet, vanishing, deeper, traditional, work, previous]
Variational Bayes on Monte Carlo Steroids
Aditya Grover, Stefano Ermon


[variable, directed, probability, number, parity, probabilistic] [learning, lower, algorithm, bound, will, defined, theorem, set, problem, tight, class, binary] [variational, inference, latent, marginal, log, sampling, posterior, sigmoid, approximate, mod, monte, approximation, carlo, approximating, gradient, amortized, iid, importance, stochastic, optimization, intractable, caltech, key] [can, projected, projection, high] [random, discrete, based, distribution, data, given, family, estimator, true, mean, empirical, test, sample, theoretical] [belief, model, new, median, choice, demonstrate] [using, network, generative, approach, performance, neural, used, use, dataset, single, deep, layer, trained]
Gradient-based Sampling: An Adaptive Importance Sampling for Least-squares
Rong Zhu


[according] [algorithm, since, get, bound, cost, fast, risk, make, theorem, problem, improved, show, excess, argue, considered] [sampling, grad, subsample, pilot, lev, size, gradient, large, method, full, importance, efficient, approximating, supplementary, approximate, xti, replacement, mse, unif, initial, optimization, alev, datasets, computing, log, establishes] [can, computational, solution, matrix, error, also, analysis, real, one, much, small, synthetic, running, leverage] [data, estimate, random, sample, section, uniform, mixture, logarithm, given, statistical, empirical, two, statistically, way, journal] [time, model, poisson, simple, information, choice] [performance, figure, input, use, better, table, various, proposed, similar, different]
Stochastic Variational Deep Kernel Learning
Andrew G. Wilson, Zhiting Hu, Ruslan R. Salakhutdinov, Eric P. Xing


[structure, base, number, many, runtime, probabilistic] [learning, show, conference] [variational, stochastic, inducing, inference, large, scalable, marginal, likelihood, sampling, extra, latent, exploiting, interpolation, scalability, gradient, method, full, machine, intelligence, expressive, wilson, processing, hensman, airline, standard, efficient] [can, correlated, matrix, local, see, regression, sparse] [gaussian, kernel, additive, data, covariance, mixing, section, procedure] [model, process, time, information, artificial] [deep, classification, dnn, training, accuracy, neural, layer, output, figure, input, hidden, network, used, approach, proposed, use, performance, learn, trained, multiple, table, using, applied, architecture, preprint, arxiv, international, learned]
Threshold Bandits, With and Without Censored Feedback
Jacob D. Abernethy, Kareem Amin, Ruihao Zhu


[threshold, potential, number, probability, present] [arm, regret, bandit, bound, ucb, algorithm, censored, function, learner, setting, learning, xit, problem, max, round, now, define, let, nit, kmucb, uncensored, assume, expected, lemma, optimistic, mab, will, theorem, set, inequality, proof, consider, playing, show, peter, chosen, confidence, dkwucb, receive, bounded, deviation, drawn] [stochastic, standard, log, end] [can, one, order, observed, note, error, analysis, following, arg, significantly, relies, min] [given, estimator, sample, distribution, estimate, uniform, two, fixed, mean, based, classical, section, empirical, survival, exp, procedure, dark] [value, reward, time, feedback, information, policy, rti, new, pool, payoff, simply] [adversarial, use]
Error Analysis of Generalized Nyström Kernel Regression
Hong Chen, Haifeng Xia, Heng Huang, Weidong Cai


[subset, coefficient, obtained, theory] [learning, function, hypothesis, fast, theorem, general, rate, bound, dependent, complexity, studied, let, least, optimal, sin, drawn, consider, just, constant, since] [approximation, sampling, method, convergence, select, key, year, parameter, operator, university] [can, analysis, regression, error, regularization, matrix, following, computation, min, symmetric, norm, condition, also, related, computational, paper, column, linear, generalized, square, leverage, rmse] [kernel, gnkr, data, generalization, theoretical, space, empirical, subsampling, positive, knn, sample, associated, given, indefinite, random, satisfactory, two, well, gaussian, fixed, introduce] [continuous, design] [used, previous, performance, table, prediction, different]
Without-Replacement Sampling for Stochastic Gradient Methods
Ohad Shamir


[number, average, split] [algorithm, learning, convex, loss, will, least, bound, function, online, expected, consider, get, regret, smaller, uniformly, rademacher, let, show, complexity, theorem, assume, since, lipschitz, proof, instance, upper, chosen, setting, notation, set] [stochastic, gradient, sampling, machine, svrg, distributed, convergence, optimization, descent, transductive, parameter, suboptimality, size, respect, usually, expectation] [can, also, one, condition, need, note, solution, regularized, assumption, linear, suppose, order, good, analysis, much, require] [data, random, permutation, well, application, result] [individual, communication, required, long, assuming] [using, without, similar, use, randomly, single, several, used, arxiv, preprint, applied]