Ruben Ohana


Briefly

I am a second-year Ph.D. student under the supervision of Florent Krzakala (EPFL, formerly at ENS), Alessandro Rudi (INRIA - DIENS, Sierra Team) and Laurent Daudet (LightOn). I do research in Machine Learning at the Ecole Normale Supérieure in Paris and in collaboration with the startup LightOn.

My research interests include random features for kernel approximation, random matrices, and alternative training methods for deep learning. More precisely, my current work uses random features to approximate kernel methods and to study deep learning frameworks such as Reservoir Computing (accepted as an Oral @ NeurIPS 2020). I am also exploring new machine learning applications of the Optical Processing Unit developed by LightOn, such as kernel approximation and adversarial defenses.
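
To make the first of these concrete, here is a minimal sketch of random Fourier features (Rahimi & Recht, 2007) approximating a Gaussian kernel with an explicit finite-dimensional feature map; the dimensions and bandwidth below are illustrative, not tied to any of my papers.

    # Random Fourier features: k(x, y) ~= phi(x) . phi(y) for the RBF kernel
    import numpy as np

    rng = np.random.default_rng(0)
    d, D, gamma = 10, 2048, 0.5        # input dim, number of features, bandwidth

    x, y = rng.standard_normal(d), rng.standard_normal(d)

    # Directions W ~ N(0, 2*gamma*I) and phases b ~ U[0, 2*pi)
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(D, d))
    b = rng.uniform(0, 2 * np.pi, size=D)

    def phi(v):
        return np.sqrt(2.0 / D) * np.cos(W @ v + b)

    exact = np.exp(-gamma * np.sum((x - y) ** 2))   # true RBF kernel value
    print(f"exact {exact:.4f}  approx {phi(x) @ phi(y):.4f}")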

Before starting my Ph.D., I graduated with an engineering degree in Physics from ESPCI Paris, an MSc in Condensed Matter from the Master ICFP at the Ecole Normale Supérieure, and an MSc in Statistics/Machine Learning from the Master in Mathematics at Sorbonne University. I did an internship at the NTT Basic Research Laboratories in Japan, where I worked on the quantum spin Hall effect in InAs/GaSb double quantum wells, both experimentally and theoretically, under the supervision of Hiroshi Irie. I did my first master's thesis at LIGO (the gravitational-wave observatory) at the Massachusetts Institute of Technology, where I studied the next generation of lasers to be integrated in a future upgrade of the interferometer, under the supervision of Peter Fritschel. I did my second master's thesis in the Quantum Information group at LIP6, Sorbonne University, where I studied contextuality in quantum information networks, under the supervision of Damian Markham.

You can find me on Google Scholar, Twitter and LinkedIn. Here is a short CV.

Contact

Publications and Preprints

  • R. Ohana*, A. Cappelli*, J. Launay, L. Meunier, I. Poli, F. Krzakala. Adversarial Robustness by Design through Analog Computing and Synthetic Gradients, 2021, preprint.
    [arXiv, Github]

    Abstract: We propose a new defense mechanism against adversarial attacks inspired by an optical co-processor, providing robustness without compromising natural accuracy in both white-box and black-box settings. This hardware co-processor performs a nonlinear fixed random transformation, where the parameters are unknown and impossible to retrieve with sufficient precision for large enough dimensions. In the white-box setting, our defense works by obfuscating the parameters of the random projection. Unlike other defenses relying on obfuscated gradients, we find we are unable to build a reliable backward differentiable approximation for obfuscated parameters. Moreover, while our model reaches a good natural accuracy with a hybrid backpropagation - synthetic gradient method, the same approach is suboptimal if employed to generate adversarial examples. We find the combination of a random projection and binarization in the optical system also improves robustness against various types of black-box attacks. Finally, our hybrid training method builds robust features against transfer attacks. We demonstrate our approach on a VGG-like architecture, placing the defense on top of the convolutional features, on CIFAR-10 and CIFAR-100.
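
    As a rough numpy sketch (a simulation only: the real transform runs in analog optics and its parameters cannot be read out), the defended layer applies a binarization followed by a fixed complex random projection and a squared-modulus measurement; all names and shapes here are illustrative.

      import numpy as np

      rng = np.random.default_rng(0)
      d_in, d_out = 512, 1024

      # Fixed random parameters; on the physical device these stay hidden in the optics
      R = (rng.standard_normal((d_out, d_in))
           + 1j * rng.standard_normal((d_out, d_in))) / np.sqrt(2)

      def optical_layer(x):
          x_bin = np.sign(x)             # binarize the incoming features
          return np.abs(R @ x_bin) ** 2  # intensity measurement |Rx|^2

      print(optical_layer(rng.standard_normal(d_in)).shape)  # (1024,)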

  • M. Refinetti*, S. d'Ascoli*, R. Ohana, S. Goldt. The dynamics of learning with feedback alignment, 2020, preprint.
    [arXiv, Github, Twitter thread]

    Abstract: Direct Feedback Alignment (DFA) is emerging as an efficient and biologically plausible alternative to the ubiquitous backpropagation algorithm for training deep neural networks. Despite relying on random feedback weights for the backward pass, DFA successfully trains state-of-the-art models such as Transformers. On the other hand, it notoriously fails to train convolutional networks. An understanding of the inner workings of DFA to explain these diverging results remains elusive. Here, we propose a theory for the success of DFA. We first show that learning in shallow networks proceeds in two steps: an alignment phase, where the model adapts its weights to align the approximate gradient with the true gradient of the loss function, is followed by a memorisation phase, where the model focuses on fitting the data. This two-step process has a degeneracy breaking effect: out of all the low-loss solutions in the landscape, a network trained with DFA naturally converges to the solution which maximises gradient alignment. We also identify a key quantity underlying alignment in deep linear networks: the conditioning of the alignment matrices. The latter enables a detailed understanding of the impact of data structure on alignment, and suggests a simple explanation for the well-known failure of DFA to train convolutional neural networks. Numerical experiments on MNIST and CIFAR10 clearly demonstrate degeneracy breaking in deep non-linear networks and show that the align-then-memorize process occurs sequentially from the bottom layers of the network to the top.
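
    To illustrate the mechanism on a toy regression problem (a sketch, not the paper's experimental setup), here is DFA on a two-layer network: the backward pass projects the output error through a fixed random matrix B instead of the transposed forward weights.

      import numpy as np

      rng = np.random.default_rng(0)
      n_in, n_hid, n_out, lr = 20, 64, 5, 0.05

      W1 = 0.1 * rng.standard_normal((n_hid, n_in))
      W2 = 0.1 * rng.standard_normal((n_out, n_hid))
      B = 0.1 * rng.standard_normal((n_hid, n_out))   # fixed random feedback weights

      X = rng.standard_normal((100, n_in))
      Y = rng.standard_normal((100, n_out))

      for _ in range(200):
          H = np.tanh(X @ W1.T)            # forward pass
          e = H @ W2.T - Y                 # output error (squared loss)
          dH = (e @ B.T) * (1 - H ** 2)    # DFA: e @ B.T replaces backprop's e @ W2
          W2 -= lr * e.T @ H / len(X)
          W1 -= lr * dH.T @ X / len(X)

      print(f"final MSE: {np.mean((np.tanh(X @ W1.T) @ W2.T - Y) ** 2):.4f}")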

  • A. Sohbi, R. Ohana, I. Zaquine, E. Diamanti, D. Markham. Experimental Approach to Demonstrating Contextuality for Qudits, 2020, under review.
    [arXiv]

    Abstract: We propose a method to experimentally demonstrate contextuality with a family of tests for qudits. The experiment we propose uses a qudit encoded in the path of a single photon and its temporal degrees of freedom. We consider the impact of noise on the effectiveness of these tests, taking the approach of ontologically faithful non-contextuality. In this approach, imperfections in the experimental setup must be taken into account in any faithful ontological (classical) model, which limits how much the statistics can deviate within different contexts. In this way we bound the precision of the experimental setup under which ontologically faithful non-contextual models can be refuted. We further consider the noise tolerance through different types of decoherence models on different types of encodings of qudits. We quantify the effect of the decoherence on the required precision for the experimental setup in order to demonstrate contextuality in this broader sense.

  • R. Ohana*, J. Dong*, M. Rafayelyan, F. Krzakala. Reservoir Computing meets Recurrent Kernels and Structured Transforms, Advances in Neural Information Processing Systems 33 (Oral @ NeurIPS 2020).
    [NeurIPS, Oral (starts at 46:30), arXiv, Github, Twitter thread]

    Abstract: Reservoir Computing is a class of simple yet efficient Recurrent Neural Networks where internal weights are fixed at random and only a linear output layer is trained. In the large size limit, such random neural networks have a deep connection with kernel methods. Our contributions are threefold: a) We rigorously establish the recurrent kernel limit of Reservoir Computing and prove its convergence. b) We test our models on chaotic time series prediction, a classic but challenging benchmark in Reservoir Computing, and show how the Recurrent Kernel is competitive and computationally efficient when the number of data points remains moderate. c) When the number of samples is too large, we leverage the success of structured Random Features for kernel approximation by introducing Structured Reservoir Computing. The two proposed methods, Recurrent Kernel and Structured Reservoir Computing, turn out to be much faster and more memory-efficient than conventional Reservoir Computing.
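
    A minimal echo-state-style sketch of the idea (illustrative sizes and scalings, not the paper's models): the recurrent and input weights are random and fixed, and only a linear readout is fitted by ridge regression.

      import numpy as np

      rng = np.random.default_rng(0)
      n_res, T = 200, 500

      u = np.sin(np.linspace(0, 20 * np.pi, T + 1))  # toy input time series
      W = 0.9 * rng.standard_normal((n_res, n_res)) / np.sqrt(n_res)  # fixed reservoir
      W_in = rng.standard_normal(n_res)              # fixed input weights

      # Drive the reservoir: x_{t+1} = tanh(W x_t + W_in u_t)
      X, x = np.zeros((T, n_res)), np.zeros(n_res)
      for t in range(T):
          x = np.tanh(W @ x + W_in * u[t])
          X[t] = x

      # Train only the linear readout for one-step-ahead prediction (ridge regression)
      w_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(n_res), X.T @ u[1:])
      print(f"train MSE: {np.mean((X @ w_out - u[1:]) ** 2):.2e}")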

  • R. Ohana, J. Wacker, J. Dong, S. Marmin, F. Krzakala, M. Filippone, L. Daudet. Kernel Computations from large-scale random features obtained by Optical Processing Units, International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2020).
    [ICASSP, arXiv, Github]

    Abstract: Approximating kernel functions with random features (RFs) has been a successful application of random projections for nonparametric estimation. However, performing random projections presents computational challenges for large-scale problems. Recently, a new optical hardware called Optical Processing Unit (OPU) has been developed for fast and energy-efficient computation of large-scale RFs in the analog domain. More specifically, the OPU performs the multiplication of input vectors by a large random matrix with complex-valued i.i.d. Gaussian entries, followed by the application of an element-wise squared absolute value operation - this last nonlinearity being intrinsic to the sensing process. In this paper, we show that this operation results in a dot-product kernel that has connections to the polynomial kernel, and we extend this computation to arbitrary powers of the feature map. Experiments demonstrate that the OPU kernel and its RF approximation achieve competitive performance in applications using kernel ridge regression and transfer learning for image classification. Crucially, thanks to the use of the OPU, these results are obtained with time and energy savings.
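
    Simulated in numpy, the feature map is simply |Wx|^2 for a complex i.i.d. Gaussian W, and a short Gaussian-moment calculation gives the limiting kernel E[|w.x|^2 |w.y|^2] = ||x||^2 ||y||^2 + (x.y)^2. The sketch below checks this convergence with illustrative dimensions (on the device, the projection runs in analog optics at much larger scale).

      import numpy as np

      rng = np.random.default_rng(0)
      d, D = 10, 100_000            # input dim, number of random features

      x, y = rng.standard_normal(d), rng.standard_normal(d)

      # Complex Gaussian matrix with unit-variance entries (the optics, in simulation)
      W = (rng.standard_normal((D, d))
           + 1j * rng.standard_normal((D, d))) / np.sqrt(2)

      phi = lambda v: np.abs(W @ v) ** 2        # OPU feature map
      approx = phi(x) @ phi(y) / D              # Monte-Carlo kernel estimate
      exact = (x @ x) * (y @ y) + (x @ y) ** 2  # limiting dot-product kernel

      print(f"exact {exact:.2f}  approx {approx:.2f}")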

  • H. Irie, T. Akiho, F. Couedo, R. Ohana, S. Suzuki, H. Onomitsu, K. Muraki. Impact of epitaxial strain on the topological-nontopological phase diagram and semimetallic behavior of InAs/GaSb composite quantum wells, 2020, Physical Review B.
    [Phys. Rev. B, arXiv]

    Abstract: We study the influence of epitaxial strain on the electronic properties of InAs/GaSb composite quantum wells (CQWs), host structures for quantum spin Hall insulators, by transport measurements and eight-band k⋅p calculations. Using different substrates and buffer layer structures for crystal growth, we prepare two types of samples with vastly different strain conditions. CQWs with a nearly strain-free GaSb layer exhibit a resistance peak at the charge neutrality point that reflects the opening of a topological gap in the band-inverted regime. In contrast, for CQWs with 0.50% biaxial tensile strain in the GaSb layer, semimetallic behavior indicating a gap closure is found for the same degree of band inversion. Additionally, with the tensile strain, the boundary between the topological and nontopological regimes is located at a larger InAs thickness. Eight-band k⋅p calculations reveal that tensile strain in GaSb not only shifts the phase boundary but also significantly modifies the band structure, which can result in the closure of an indirect gap and make the system semimetallic even in the topological regime. Our results thus provide a global picture of the topological-nontopological phase diagram as a function of layer thicknesses and strain.