Let’s build a simple neural network from scratch in pure Python plus NumPy, and train it with a genetic algorithm.
Let’s start by importing the libraries we need in a Jupyter notebook:
import os, sys

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.pyplot import cm
import scipy.ndimage as ndimage

%matplotlib inline
if os.name == 'posix':
    %config InlineBackend.figure_format = 'retina'  # retina display settings
Let us define the activation function. We choose a sigmoid here, but it can easily be swapped for a tanh or one of the many ReLU variants.
def sigmoid(x):
    return 1 / (1 + np.exp(-x))
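As a quick sanity check, the sigmoid squeezes any real input into (0, 1), is 0.5 at zero, and works element-wise on NumPy arrays, which is what the network relies on:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

print(sigmoid(0.0))                            # 0.5
print(sigmoid(np.array([-10.0, 0.0, 10.0])))   # all values inside (0, 1)
```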
Let us create our neural network class. In the __init__ function we require network_structure, a list of integers defining the size of each layer of the network: network_structure[0] is the size of the input layer, while network_structure[-1] is the size of the output one. The number of degrees of freedom of the network can be obtained through the genome_size property. Using set_genome one can load a set of genomes into the network. Its input (genome) is a 2D NumPy array containing a sequence of different genomes; its shape must be (number_of_genomes, genome_size). The feedforward function uses the network to map a collection of inputs to a collection of outputs. The input shape must be (number_of_genomes, input_size), while the shape of the output is (number_of_genomes, output_size).
class NeuralNetwork(object):
    def __init__(self, network_structure):
        self.network_structure = network_structure
        self.structure = []
        i = 0
        for a, b in zip(self.network_structure[:-1], self.network_structure[1:]):
            # [weights start, weights end, biases end, input size, output size]
            self.structure.append([i, i + a*b, i + a*b + b, a, b])
            i = i + a*b + b
        self._genome_size = i

    def set_genome(self, genome):
        population = genome.shape[0]
        self.weights = []
        self.biases = []
        for w_start, w_end, b_end, a, b in self.structure:
            self.weights.append(genome[:, w_start:w_end].reshape(population, a, b))
            self.biases.append(genome[:, w_end:b_end].reshape(population, b))

    def feedforward(self, inputs):
        layer = inputs
        for weight, bias in zip(self.weights, self.biases):
            layer = sigmoid(np.einsum('ji,jik->jk', layer, weight) + bias)
        return layer

    @property
    def genome_size(self):
        return self._genome_size
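A quick check of the bookkeeping: for the [7, 8, 4, 1] structure we use later, the genome packs 7·8 + 8 = 64, 8·4 + 4 = 36, and 4·1 + 1 = 5 parameters, 105 in total, and feedforward maps a (number_of_genomes, 7) input batch to a (number_of_genomes, 1) output. A self-contained sketch of that check:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

class NeuralNetwork(object):
    def __init__(self, network_structure):
        self.network_structure = network_structure
        self.structure = []
        i = 0
        for a, b in zip(network_structure[:-1], network_structure[1:]):
            self.structure.append([i, i + a*b, i + a*b + b, a, b])
            i = i + a*b + b
        self._genome_size = i

    def set_genome(self, genome):
        population = genome.shape[0]
        self.weights, self.biases = [], []
        for w_start, w_end, b_end, a, b in self.structure:
            self.weights.append(genome[:, w_start:w_end].reshape(population, a, b))
            self.biases.append(genome[:, w_end:b_end].reshape(population, b))

    def feedforward(self, inputs):
        layer = inputs
        for weight, bias in zip(self.weights, self.biases):
            # one weight matrix and bias vector per genome, applied in batch
            layer = sigmoid(np.einsum('ji,jik->jk', layer, weight) + bias)
        return layer

    @property
    def genome_size(self):
        return self._genome_size

network = NeuralNetwork([7, 8, 4, 1])
print(network.genome_size)  # 105 = (7*8 + 8) + (8*4 + 4) + (4*1 + 1)

genomes = np.random.uniform(-1, 1, size=(50, network.genome_size))
network.set_genome(genomes)
out = network.feedforward(np.zeros((50, 7)))
print(out.shape)  # (50, 1)
```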
Let’s use the network we just created to solve a simple task: we generate a horizontal random racing track, and the player wins if it reaches the end without running off the road. A random track can be generated as follows:
xx = np.arange(0, 400)
freq = np.exp(-(xx/10 - 20)**2/100)
par = (xx/10 - 25)**2/100
top_barrier = np.sin(xx/10*freq)*0.5 + 0.1 + np.sin(xx/10) + par
top_barrier = top_barrier - np.mean(top_barrier) + 0.5  # center the road around zero
bottom_barrier = top_barrier - 1
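Whatever profile we pick for the top barrier, this construction guarantees a road of constant width 1, centered around zero, which is where the player starts. A minimal check (using a simplified sine profile in place of the article’s track):

```python
import numpy as np

xx = np.arange(0, 400)
# any profile works; the road width is fixed by the construction below
top_barrier = np.sin(xx / 10) * 0.5
top_barrier = top_barrier - np.mean(top_barrier) + 0.5  # center the road around zero
bottom_barrier = top_barrier - 1

print(np.allclose(top_barrier - bottom_barrier, 1.0))  # constant road width
```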
Let’s define the player physics: a dot with a given 2D position and speed. Every evolution step changes the position vector by the speed vector; moreover, the speed vector is updated by the force vector (mass = 1).
class Player(object):
    def __init__(self, p0, v0):
        self.p = p0
        self.v = v0

    def step(self, force=0.1):
        self.p = self.p + self.v/2
        self.v = self.v + force
        self.p = self.p + self.v/2
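The two half-position updates amount to a leapfrog-style integration step. A worked example with a single player starting at rest position 0 with speed 1: the first half step moves it to 0.5, the force raises the speed to 1.5, and the second half step lands it at 1.25:

```python
import numpy as np

class Player(object):
    def __init__(self, p0, v0):
        self.p = p0
        self.v = v0

    def step(self, force=0.1):
        self.p = self.p + self.v/2   # half position update with the old speed
        self.v = self.v + force      # speed update (mass = 1)
        self.p = self.p + self.v/2   # half position update with the new speed

player = Player(np.array([0.0]), np.array([1.0]))
player.step(force=0.5)
print(player.p)  # [1.25]: 0 + 1.0/2 + 1.5/2
print(player.v)  # [1.5]
```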
Let’s define two helper functions. The former runs the simulation given a set of genomes, the track, and the neural network. The latter plots the track together with the player paths for a given set of genomes.
The run function takes as input a set of genomes and creates the players. The players variable has shape (number_of_genomes, track_length), where track_length corresponds to the horizontal size of the track. Moreover, the sensors list contains the positions of three sensors owned by the player: at each frame, the player knows the position of the track boundaries a few pixels ahead. We initialize the neural network with the genomes and run the simulation for track_length - last_sensor steps. Finally, the function returns the scores and the genomes, sorted by score. If get_path==True, the path of each player is returned instead.
def run(genomes, top_barrier, bottom_barrier, network, get_path=False, sensors=[0, 5, 9]):
    population = genomes.shape[0]
    player = Player(np.zeros(population), np.zeros(population))
    network.set_genome(genomes)
    players = np.ones([population, top_barrier.shape[0]])
    max_score = top_barrier.shape[0] - sensors[-1]
    scores = np.ones(population)*max_score
    for index in range(max_score):
        players[:, index] = player.p
        # 7 inputs: speed, plus distance to both barriers at each of the 3 sensors
        inputs = np.array([player.v,
                           player.p - top_barrier[index + sensors[0]],
                           player.p - bottom_barrier[index + sensors[0]],
                           player.p - top_barrier[index + sensors[1]],
                           player.p - bottom_barrier[index + sensors[1]],
                           player.p - top_barrier[index + sensors[2]],
                           player.p - bottom_barrier[index + sensors[2]]
                           ]).reshape(7, population).transpose()
        force = network.feedforward(inputs).flatten() - 0.5
        player.step(force=force)
        is_out = np.logical_or(player.p > top_barrier[index],
                               player.p < bottom_barrier[index])
        current = (1 - is_out)*max_score + is_out*index
        scores = np.min([scores, current], axis=0).astype(int)
    if get_path:
        return [players, scores]
    results = [scores, genomes]
    return list(zip(*sorted(zip(*results), key=lambda x: -x[0])))
plot_genome plots the track and the path chosen by each genome. The better a genome performs, the darker its path color.
def plot_genome(genomes, top_barrier, bottom_barrier, network, alpha=0.5):
    genomes = np.array(genomes)
    xx = np.arange(top_barrier.shape[0])
    paths, scores = run(genomes, top_barrier, bottom_barrier, network, get_path=True)
    fig, ax = plt.subplots(1, figsize=(16, 4))
    for y, x_max in zip(paths, scores):
        ax.plot(y[:x_max],
                color=cm.viridis(1 - (x_max/np.max(scores)*0.9 + 0.05)),
                alpha=alpha)
    ax.fill_between(xx, top_barrier, y2=np.max(top_barrier), color='#dddddd')
    ax.fill_between(xx, np.min(bottom_barrier), y2=bottom_barrier, color='#dddddd')
    ax.set_xlim(0, np.max(xx))
    ax.set_ylim(np.min(bottom_barrier), np.max(top_barrier))
    return ax
Let’s initialize the neural network with the structure [7, 8, 4, 1]: 7 inputs, two hidden layers with 8 and 4 neurons respectively, and one output. We set the population to 50 genomes, initialized at random from a uniform distribution between -1 and 1. The result of running the simulation with this first set of random genomes is plotted below.
network = NeuralNetwork(network_structure=[7, 8, 4, 1])
POPULATION = 50

np.random.seed(42)
genomes = np.random.uniform(-1, 1, size=(POPULATION, network.genome_size))
results = run(genomes, top_barrier, bottom_barrier, network)
As expected, the first population performs very poorly. Nevertheless, we can already see that some genomes perform much better than others. We will see that defining the next generation of genomes as variations of the best performers quickly leads to good drivers.
Evolution and selection
We now need to define how the next generation of genomes is created starting from the previous one. We choose a simple approach here: to create a child, we randomly pick a parent from the list of parents sorted by performance, sampling the index from an exponential distribution so that the best performers are chosen most often. We then mutate the genome by adding to each of its weights a random number drawn from a uniform distribution between -mutation_rate and +mutation_rate.
def evolution_step(results, mutation_rate=0.1):
    scores, genomes = results
    population = len(genomes)
    new_generation = []
    for _ in range(population):
        # pick a parent, favouring the best performers (index 0 is the best)
        parent = genomes[min(int(np.random.exponential(scale=5)), population - 1)]
        # mutate every weight of the genome by a small uniform random amount
        child = parent + np.random.uniform(-mutation_rate, mutation_rate,
                                           size=parent.shape)
        new_generation.append(child)
    return np.array(new_generation)
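To see how strongly this selection scheme favours the best performers, we can sample parent indices the same way (exponential distribution with scale 5, clipped to the population size) and check where they land; with this scale, the large majority of parents come from the top 10 genomes:

```python
import numpy as np

np.random.seed(1)
population = 50

# parent indices drawn as in evolution_step; index 0 is the best performer
indices = [min(int(np.random.exponential(scale=5)), population - 1)
           for _ in range(10_000)]

# theoretical fraction below index 10 is 1 - exp(-10/5) ~ 0.86
print(np.mean(np.array(indices) < 10))
```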
We can now run the simulation for a number of epochs and look at the performance of the model. Below, we report the evolution after 30, 60, and 120 epochs, plotting the first 20 performers.
NUMBER_OF_EPOCHS = 120

scores = []
for index in range(NUMBER_OF_EPOCHS):
    genomes = evolution_step(results, mutation_rate=0.1)
    results = run(genomes, top_barrier, bottom_barrier, network)
    scores.append(np.mean(results[0][:20]))  # mean score of the best 20 performers
Looking at the performance at each evolution step, we can see that between 50 and 80 epochs the model learns how to stay on track. Beyond 100 epochs, almost all the genomes complete the track without problems.
Testing the model
We now want to check whether the model learned to drive or just memorized the track. To do so, we create a brand new, much harder track and test how the model performs in an unknown environment, with no further training. Let’s create the track:
test_xx = np.arange(0, 4000)

np.random.seed(0)
test_top_barrier = np.random.random(test_xx.shape)
test_top_barrier = ndimage.gaussian_filter1d(test_top_barrier, 10)*20
test_top_barrier = test_top_barrier - np.mean(test_top_barrier) + 0.5  # center the road around zero
test_bottom_barrier = test_top_barrier - 1
And run the simulation, showing again the best 20 performers.
ax = plot_genome(results[1][:20], test_top_barrier, test_bottom_barrier, network)
As we can see, all 20 genomes reach the end of the track! We conclude that the neural network learned to stay on track, and that its knowledge is general rather than track-specific.