Magic in algorithm implementation

It’s quite common to get stuck when implementing an algorithm: no matter how hard you try, it just doesn’t converge.

There’s a good line in the classic movie “The Matrix”: sooner or later, you’re going to realize there’s a difference between knowing the path and walking the path.

Whenever you get stuck, it’s a good chance to understand things more deeply and learn more.

Among these takeaways, one interesting observation is that implementing an algorithm is not like constructing a building: a whole building won’t collapse because one small brick is out of place, but an algorithm implementation will.

Let’s start with a few examples, and the list will surely grow:

case 1: leave out the plus sign

In a simple neural network, during the training phase, we are trying to learn the right “weight” with an update like the one below:

Weight += Error * Learning_rate * Input

Here, if the small “+” sign is left out, the weight parameter won’t converge: every training pass starts from scratch instead of building on the previously tuned weights. Leave the “+” out and your program won’t generate any “error” or “warning”; it just doesn’t converge.
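
To make the difference concrete, here is a minimal sketch of a single-neuron training loop; the data, learning rate, and epoch count are made up purely for illustration. Swapping the “+=” for “=” in the update line makes the weights jump around instead of converging:

import numpy as np

# toy data for illustration: targets come from a known weight vector [2.0, -1.0]
X = np.array([[0.5, 1.0], [1.0, -0.5], [-0.3, 0.8], [0.9, 0.4]])
y = X @ np.array([2.0, -1.0])

weight = np.zeros(2)
learning_rate = 0.1

for epoch in range(100):
    for x_i, y_i in zip(X, y):
        error = y_i - x_i @ weight              # how far off the current prediction is
        weight += error * learning_rate * x_i   # accumulate the correction on top of previous weights
        # with "weight = error * learning_rate * x_i" instead, every update
        # discards the previous weights and the loop never converges

print(weight)   # close to [2.0, -1.0]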

case 2: leave out the brackets

This is a language-specific case:

import numpy as np
threshold = 0.1
rand = np.random.random   # note: no () here
if rand < threshold:
    do_something()

Under Python 2.7, the program will run with no errors or warnings, except that “do_something()” will never be executed.

“np.random.random” is a NumPy function object, while “np.random.random()” returns a float in [0.0, 1.0). In Python 2.7, comparing a function to a float is allowed (and essentially meaningless), so the error is quite covert: the program does not throw any warning or error message. Under Python 3 the same comparison raises a TypeError, which makes this particular slip easier to catch.
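
For reference, a corrected sketch looks like this (“do_something” is defined as a placeholder so the snippet runs on its own):

import numpy as np

def do_something():
    print("below threshold")

threshold = 0.1
rand = np.random.random()   # the () calls the function and returns a float in [0.0, 1.0)
if rand < threshold:
    do_something()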

This might not be a perfect case for emphasizing the importance of a single symbol; leaving out brackets is also quite common in formulas. We will try to come up with more later.

case 3: dot product on different objects

import numpy as np
a = np.mat("1 2;3 4")
b = np.mat("1 -1;0 1")
print(a * b)

print(np.multiply(a, b))

a = np.mat("1 2;3 4").A
b = np.mat("1 -1;0 1").A
print(a * b)

Now, the question: will these three prints produce the same output? If not, which one should be used for the dot product when doing the convolution operation in a deep neural network?

They do not: the type of the first “a” is matrix, for which “*” means the matrix product, while the second (after “.A”) is an ndarray, for which “*” means the element-wise product; “np.multiply” is element-wise for both. So the first print differs from the other two. To get the element-wise product that a convolution step needs, either transform the matrix into an ndarray or replace “*” with “np.multiply”.
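
To see why this matters for convolution, here is a minimal sketch of a single convolution step on a made-up 2x2 window and kernel; the element-wise product summed over the window is what the operation needs, and the matrix product silently gives a different number:

import numpy as np

patch = np.array([[1, 2], [3, 4]])     # 2x2 window of the input feature map (made up)
kernel = np.array([[1, -1], [0, 1]])   # 2x2 convolution kernel (made up)

conv_value = np.sum(patch * kernel)               # element-wise then sum: 1 - 2 + 0 + 4 = 3
same_value = np.sum(np.multiply(patch, kernel))   # identical result
wrong_value = np.sum(patch @ kernel)              # matrix product then sum: 6, not what we want

print(conv_value, same_value, wrong_value)        # 3 3 6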

NumPy offers several related product operations:

print(np.dot(a, b))        # matrix product (for 2-D inputs)
print(np.matmul(a, b))     # matrix product
print(a * b)               # element-wise for ndarray, matrix product for np.matrix
print(a @ b)               # matrix product (Python 3.5+)
print(np.multiply(a, b))   # element-wise product

Watch out for the correct usage.
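
As a quick sanity check (reusing the same made-up arrays as plain ndarrays): “np.dot”, “np.matmul”, and “@” all give the matrix product for 2-D ndarrays, while “*” and “np.multiply” give the element-wise product; only on np.mat objects does “*” switch to the matrix product.

import numpy as np

a = np.mat("1 2;3 4").A   # plain 2-D ndarrays
b = np.mat("1 -1;0 1").A

print(np.array_equal(np.dot(a, b), a @ b))        # True: both are the matrix product
print(np.array_equal(np.matmul(a, b), a @ b))     # True
print(np.array_equal(a * b, np.multiply(a, b)))   # True: both are element-wise
print(np.array_equal(a @ b, a * b))               # False: the two families differ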