I know there's real development and innovation here, but any time I hear about randomly or freely-wired neural networks, I can't help but be reminded of the "hacker koan":
In the days when Sussman was a novice, Minsky once came to him as he sat hacking at the PDP-6. "What are you doing?", asked Minsky. "I am training a randomly wired neural net to play Tic-tac-toe", Sussman replied. "Why is the net wired randomly?", asked Minsky. "I do not want it to have any preconceptions of how to play", Sussman said. Minsky then shut his eyes. "Why do you close your eyes?" Sussman asked his teacher. "So that the room will be empty." At that moment, Sussman was enlightened.
I wonder if this gets at Chomsky's theory of an innate, built-in language faculty, and if so, whether it would be useful to intentionally pattern the initial arrangement of a neural network to match what biology has already laid out.
Haven't heard the term "freely wired" before, but FAIR released an exploration of randomly wired neural networks which seems conceptually similar. https://arxiv.org/abs/1904.01569
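For anyone curious what "randomly wired" means mechanically, here is a toy sketch, not the paper's actual architecture (which, as I recall, samples graphs with classical random-graph generators and uses convolutional nodes): just sample a random DAG and run a forward pass in topological order. All sizes and the sparsity value are made up for illustration.

```python
# Toy sketch of the "randomly wired" idea: sample a random DAG over N nodes,
# then evaluate nodes in topological order, each one aggregating its parents.
import numpy as np

rng = np.random.default_rng(0)
N, dim = 8, 16
# Random upper-triangular adjacency -> guaranteed acyclic (a DAG).
adj = np.triu(rng.random((N, N)) < 0.3, k=1)

# One weight matrix per node (hypothetical choice, purely illustrative).
weights = [rng.standard_normal((dim, dim)) * 0.1 for _ in range(N)]

x = rng.standard_normal(dim)        # input fed to node 0
acts = [None] * N
acts[0] = x
for i in range(1, N):
    parents = np.nonzero(adj[:, i])[0]
    agg = sum(acts[p] for p in parents) if len(parents) else x
    acts[i] = np.maximum(0.0, weights[i] @ agg)   # ReLU node

print(acts[-1][:4])  # activations of the last node
```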
I don't see how that relates. You don't stick to a single random wiring; you only make the wiring random so that the space of possible wirings of the network is not constrained by the classical sequential paradigm.
The story is more about how a random set of weights WILL have preconceptions of how to play. If you look at the condition numbers or spectra of random normal matrices, they are very much not random.
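A quick numerical illustration of that point (a rough NumPy sketch; the matrix size is arbitrary): the singular values of an i.i.d. Gaussian matrix are highly structured. With entries scaled by 1/sqrt(n), the largest singular value concentrates near 2 across draws, while the condition number blows up, so "random weights" still impose strong structure.

```python
# Spectra of random Gaussian matrices are anything but arbitrary:
# the largest singular value concentrates, the condition number explodes.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
for trial in range(3):
    A = rng.standard_normal((n, n)) / np.sqrt(n)
    s = np.linalg.svd(A, compute_uv=False)   # singular values, descending
    print(f"trial {trial}: max sv = {s[0]:.3f}, condition number = {s[0]/s[-1]:.1e}")
```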
Back in engineering school, a million years ago, my partner and I used simulated annealing to “design” a digital circuit implementation in CMOS (a 32-bit CRC), minimizing various parameters like wire length to optimize its function. It worked shockingly well in simulation.
I am a huge fan of using randomized starting states and then allowing the computer to discover the best architecture. It produces, if nothing else, surprising results.
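For the curious, here is a toy version of that kind of annealing experiment. Everything in it (the netlist, the grid size, the cooling schedule) is invented for illustration, not the original CRC design; it just shows the accept/reject loop that makes simulated annealing work.

```python
# Toy simulated annealing: place gates on a grid to minimize total wire length.
import math, random

random.seed(0)
n_gates, grid = 16, 8
nets = [(random.randrange(n_gates), random.randrange(n_gates)) for _ in range(40)]
pos = {g: (random.randrange(grid), random.randrange(grid)) for g in range(n_gates)}

def wirelength(p):
    # Manhattan distance summed over all two-pin nets.
    return sum(abs(p[a][0] - p[b][0]) + abs(p[a][1] - p[b][1]) for a, b in nets)

T, cost = 10.0, wirelength(pos)
while T > 0.01:
    g = random.randrange(n_gates)
    old = pos[g]
    pos[g] = (random.randrange(grid), random.randrange(grid))   # random move
    new_cost = wirelength(pos)
    # Accept downhill moves always, uphill moves with Boltzmann probability.
    if new_cost < cost or random.random() < math.exp((cost - new_cost) / T):
        cost = new_cost
    else:
        pos[g] = old          # reject: undo the move
    T *= 0.999                # cool down

print("final total wire length:", cost)
```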
This is good; I think research in this direction will yield the next breakthrough in ML.
The current hierarchical feed-forward model of neural networks is what's limiting our advances. If you look at ResNet or DenseNet, their skip connections are hacks to bypass that limitation, and they bring great improvements. And if you look at the routing-by-agreement technique for capsule networks, it's clear that feeding information from later layers back to earlier layers improves things as well.
And even in our own visual cortex, it's only up to V1 that things are remotely hierarchical; after that it's a mess of looping wiring between areas.
We need to throw away the current neural network paradigm and adopt a new one that can express both feed-forward and feedback connections innately.
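For concreteness, a minimal sketch of the skip-connection idea mentioned above (ResNet-style), assuming PyTorch: the block adds its input back to its output, so features and gradients can bypass the stacked layers.

```python
# Minimal ResNet-style residual block: output = ReLU(x + F(x)).
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        y = self.conv2(self.relu(self.conv1(x)))
        return self.relu(x + y)   # the skip connection: identity + transformed path

x = torch.randn(1, 32, 16, 16)
print(ResidualBlock(32)(x).shape)   # torch.Size([1, 32, 16, 16])
```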
It's not like that hasn't been tried before, though. Boltzmann machines have all-to-all connectivity and RNNs have feedback connections (and sometimes backward connections).
NNs have been studied since well before the 90s, and a lot has been tried already. I think one should also keep Sutton's bitter lesson in mind.
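As a concrete example of what "all-to-all connectivity" means here, a tiny Gibbs-sampling loop for a Boltzmann machine (random symmetric weights, no bias terms, purely illustrative):

```python
# Tiny Boltzmann machine: every unit is coupled to every other unit
# through a symmetric weight matrix, and we Gibbs-sample the binary states.
import numpy as np

rng = np.random.default_rng(0)
n = 6
W = rng.standard_normal((n, n))
W = (W + W.T) / 2          # symmetric coupling
np.fill_diagonal(W, 0.0)   # no self-connections
s = rng.integers(0, 2, n)  # binary unit states

for sweep in range(100):
    for i in range(n):
        # each unit sees every other unit through W (all-to-all wiring)
        p_on = 1.0 / (1.0 + np.exp(-W[i] @ s))
        s[i] = rng.random() < p_on

print("sampled state:", s)
```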
Convolutional neural networks were also tried and failed to fulfill their potential for 20 years (or 50, depending on how far back you go) until suddenly they didn't, and now they're ubiquitous. You cannot discard a whole area of research because one or a few implementations of a concept have so far failed to become competitive.
All-to-all connectivity is possibly the worst paradigm unless you want to do architecture search, so I wouldn't hold it up as an example of anything wrt this concept.
As for recurrent neural networks, to my knowledge they've only been used for recurrence in time or space, not recurrence between layers for a single sample (though I might be wrong), so they're not relevant to what I'm describing. There is, however, some work on skip connections (forward and backward) that takes inspiration from their gating mechanisms.
If freely wired neural networks are DAGs, I wonder how cyclic graphs of neurons behave, or even whether that model would be theoretically meaningful or computationally feasible.
Yep, they've existed for decades. Look up "RNN" (recurrent neural network) and "LSTM" (long short-term memory). They were the standard for neural-network-based time-series processing for a while, until they were recently supplanted by Transformers.
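A minimal usage sketch, assuming PyTorch: an LSTM carries a hidden state forward through time, so the "cycle" is unrolled over the time dimension rather than wired as a literal loop in the graph.

```python
# Minimal LSTM over a batch of sequences; the recurrence is over time steps.
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
x = torch.randn(4, 10, 8)      # batch of 4 sequences, 10 time steps, 8 features
out, (h, c) = lstm(x)          # out: per-step outputs, (h, c): final hidden/cell state
print(out.shape, h.shape)      # torch.Size([4, 10, 16]) torch.Size([1, 4, 16])
```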