INTRODUCTION
CS112 - Fall 2024
Later this semester, you will create a working neural network in Java, using only your own code. In later
classes, you will probably use neural network libraries developed by others to learn about many facets
of Machine Learning. But in this class, you will learn that there is no magic in making a neural network—
it is something you can build yourself...though the fact that neural networks perform so well does seem
like magic.
What is a neuron?
A neuron is a nerve cell or brain cell. Neurons are found in any animal with a brain or some approximation of a brain. (This is almost every type of multicellular animal: people, insects, fish, even jellyfish...but not sea sponges!)
Nerve cells have multiple inputs and a single output. The inputs come from other nerve cells or from "sensors" such as the eye's retina or the ear, and the outputs go to other nerve cells or to "actuators" such as muscles or organs.
Ok, What is a Neuron in a Computer?
Researchers were intrigued by the ability of large networks of animal neurons – that is, "brains" – to store information, make decisions, and learn. They began experimenting with simple computer functions that mimicked the understood bioelectrical operation of neurons, and they got surprisingly good results on a variety of different tasks.
A popular computer function, the "perceptron", was introduced by Frank Rosenblatt in 1957. The basic operation of a perceptron is:
output = activation(w_1*input_1 + w_2*input_2 + ... + w_n*input_n + b)
where:
- w_i are a set of weights, one per input
- input_i are the inputs to the perceptron
- b is an additive bias
- activation() is an "activation function", a nonlinear function
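The perceptron operation described above can be sketched in a few lines of Java. This is only an illustration: the class name PerceptronSketch, the method names, and the simple activation used here are assumptions made for the example, not part of the lab.

```java
/** Hypothetical illustration of the perceptron operation; names are not from the lab. */
final class PerceptronSketch {

    // Stand-in activation for this example only: passes positives, zeroes out negatives.
    static double activation(double x) {
        return x > 0 ? x : 0.0;
    }

    // output = activation(w_1*input_1 + ... + w_n*input_n + b)
    static double output(double[] weights, double bias, double[] inputs) {
        if (weights.length != inputs.length) {
            throw new IllegalArgumentException("weights and inputs must have the same length");
        }
        double sum = bias;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * inputs[i];
        }
        return activation(sum);
    }

    public static void main(String[] args) {
        double[] w = {0.5, -0.25};
        double[] in = {2.0, 4.0};
        // 0.5*2.0 + (-0.25)*4.0 + 0.5 = 1.0 - 1.0 + 0.5 = 0.5
        System.out.println(output(w, 0.5, in)); // prints 0.5
    }
}
```

Note that the weighted sum is computed first and the activation is applied once, to the total.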
What Kind of Problems?
In Paul's last project before leaving the video industry, he learned about neural networks and used them to answer the following question: "If a TV operator wants to process N video channels, and each video channel has a spatial resolution of X pixels wide by Y pixels high, and some encoded bit-rate of B megabits per second, how many computer servers are needed to process the videos without overloading?" [1] The solution was a neural network that took in the resolutions and bit-rates as inputs and returned a number of servers. This solution saved Paul's employer over $1M per year in cloud computing costs.
Another problem frequently solved by neural networks is image recognition. A neural network can be trained on labeled images of cats, dogs, lions, tigers, sheep, elephants, etc. Then, when a new image is fed to the network, it can usually identify the type of animal shown in the image.
And of course, Large Language Models such as ChatGPT use Neural Networks to answer questions.
What is "Training"?
The output of a neuron of course depends on the values of the weights and bias (and the activation function). The output of a neural network—a network of neurons—depends on the weights and biases of all of the neurons in a network.
"Training" is a computational process that takes a large set of inputs, and corresponding known outputs, and adjusts every neuron's weights and bias, so that the output of the neural network gets closer and closer to the known output for every input. When training is complete, a new input can be fed to the neural network, even if the input was not in the set used for training, and it should produce a correct output.
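To make "adjusts every neuron's weights and bias" concrete, here is a minimal sketch of one common approach: gradient descent on the squared error of a single neuron. It is an assumption for illustration and may differ from the recipe discussed in class. The class TrainingSketch, the method trainStep(), and the learning rate are hypothetical names and values, and the sketch uses a plain rectifier without any input scaling.

```java
/** Hypothetical sketch of one training update; not the course's training recipe. */
final class TrainingSketch {
    double[] weights;
    double bias;

    TrainingSketch(double[] initialWeights, double initialBias) {
        this.weights = initialWeights.clone();
        this.bias = initialBias;
    }

    // Plain rectified output: max(0, sum of w_i * input_i + b).
    double output(double[] inputs) {
        double sum = bias;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * inputs[i];
        }
        return sum > 0 ? sum : 0.0;
    }

    // One gradient-descent step on the squared error (output - expected)^2 / 2.
    // Returns the error measured before the update.
    double trainStep(double[] inputs, double expected, double learningRate) {
        double out = output(inputs);
        double error = out - expected;
        // The rectifier's slope is 1 where the neuron is active and 0 where it is clamped.
        double slope = out > 0 ? 1.0 : 0.0;
        for (int i = 0; i < weights.length; i++) {
            weights[i] -= learningRate * error * slope * inputs[i];
        }
        bias -= learningRate * error * slope;
        return error;
    }

    public static void main(String[] args) {
        TrainingSketch n = new TrainingSketch(new double[]{0.1, 0.1}, 0.1);
        double[] inputs = {1.0, 2.0};
        for (int step = 0; step < 50; step++) {
            n.trainStep(inputs, 1.0, 0.1);
        }
        System.out.println(n.output(inputs)); // should be very close to 1.0
    }
}
```

Each step nudges the weights and bias in the direction that shrinks the error for that one input. Real training repeats this over many different inputs, not just one.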
Why do neural networks work? This is not really understood--just like we don't understand how brains
work, at any large scale. There are plenty of alternative computational models for making decisions, but
this model seems to work quite well.
This Week's Lab
For this week, you will build and test a Neuron. In a file RELUNeuron.java, please write a class RELUNeuron. For the activation function, use the "RELU function" [2]:
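double activation(double x) {
    // Scale the incoming value down by a factor of 20 before rectifying.
    x /= 20.0;
    // RELU: pass positive values through, replace everything else with 0.
    return x > 0 ? x : 0.0;
}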
[1] This is somewhat simplified from the actual question...
[2] "RELU" stands for "Rectified Linear Unit", which doesn't make things any clearer, does it? In this context, "rectified" comes from Electrical Engineering, where a "rectifier" blocks electrical currents that are "less than 0", i.e., going in the wrong direction, but permits currents in the proper direction.
For class RELUNeuron:
- The constructor takes in the number of inputs for the Neuron, and initializes each weight and the bias to a random value between -1.0 and +1.0.
- An output() method takes in a double[] array of inputs and calculates the proper output value.
- A write() method writes the neuron's weights and bias to a DataOutputStream (see DataOutputStream's writeDouble() method).
- A read() method reads the neuron's weights and bias from a DataInputStream (see DataInputStream's readDouble() method).
You'll need several instance variables (fields), of course: weights, bias, and maybe more.
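The write() and read() methods described above come down to calling writeDouble() and readDouble() in matching order. The sketch below shows that round trip using an in-memory buffer instead of a file. The class name RoundTripSketch, the static helper signatures, and the choice to store the weights first and the bias last are assumptions for illustration; the lab does not prescribe a layout.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

/** Hypothetical round trip of weights and bias through raw binary doubles. */
final class RoundTripSketch {

    // Writes every weight, then the bias, as raw 8-byte doubles (assumed order).
    static void write(DataOutputStream out, double[] weights, double bias) throws IOException {
        for (double w : weights) {
            out.writeDouble(w);
        }
        out.writeDouble(bias);
    }

    // Reads values back in the same order: fills weights in place, returns the bias.
    static double read(DataInputStream in, double[] weights) throws IOException {
        for (int i = 0; i < weights.length; i++) {
            weights[i] = in.readDouble();
        }
        return in.readDouble();
    }

    public static void main(String[] args) throws IOException {
        double[] weights = {0.25, -0.5, 0.75};
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            write(out, weights, -0.125);
        }
        double[] restored = new double[weights.length];
        double bias;
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            bias = read(in, restored);
        }
        System.out.println(Arrays.equals(weights, restored) && bias == -0.125); // prints true
    }
}
```

Because the values are stored as raw binary, the doubles read back are bit-for-bit identical to the ones written, so an exact comparison is safe here.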
To train your Neuron, I will give you a bunch of training data files, each of which contains 501 double values. These values are not saved as text; they are saved as raw binary double values (I used a DataOutputStream). For each file:
- the first 500 values are inputs to your Neuron,
- the last value is the expected output from your Neuron.
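Reading one file in the layout just described could look like the sketch below. The class name TrainingFileSketch and its fields are hypothetical, and the example builds fake file contents in memory so that it runs without the real training data.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical loader for one training file: 500 inputs followed by 1 expected output. */
final class TrainingFileSketch {
    static final int INPUT_COUNT = 500;

    final double[] inputs = new double[INPUT_COUNT];
    double expected;

    // Reads 501 raw binary doubles from the stream, in file order.
    static TrainingFileSketch read(InputStream source) throws IOException {
        TrainingFileSketch example = new TrainingFileSketch();
        try (DataInputStream in = new DataInputStream(new BufferedInputStream(source))) {
            for (int i = 0; i < INPUT_COUNT; i++) {
                example.inputs[i] = in.readDouble();
            }
            example.expected = in.readDouble();
        }
        return example;
    }

    // Builds fake file contents in memory so the sketch runs without real data.
    static byte[] fakeFile() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            for (int i = 0; i < INPUT_COUNT; i++) {
                out.writeDouble(i);
            }
            out.writeDouble(42.0);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // For a real training file, pass a java.io.FileInputStream for that file instead.
        TrainingFileSketch example = read(new ByteArrayInputStream(fakeFile()));
        System.out.println(example.inputs[499] + " " + example.expected); // prints 499.0 42.0
    }
}
```

If a file holds fewer than 501 values, readDouble() throws an EOFException, which is a useful early warning that the file is not in the expected format.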
In class, we will discuss the details of how to train your Neuron. Can you improve this basic training recipe to reach a smaller error faster?
After you train your Neuron on the provided training data, please save your resulting weights and bias (using your write() method) to a file called weights.dbl.
A critical part of this week's lab is for you to design, execute, and document a set of tests for your Neuron. For this week, please write another Java file TestNeuron.java. This class of course tests your Neuron. You should think about how to do this!
- You must test all of your Neuron's methods, to make sure they work properly.
  o How do you test the constructor?
  o How do you test write() and read()?
  o How to test output()?
  o How to test train()?
- You must write up a 1-2 page document describing how you tested, how your TestNeuron class works, and your test results. (You do not need to talk about your Neuron class.)
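As a starting point for thinking about test structure, here is a minimal sketch of a hand-rolled test pattern. It checks only the activation function given earlier in this handout. The class name TestSketch and the helpers check() and close() are hypothetical. Designing the tests for the constructor, output(), write(), and read() is still up to you.

```java
/** Hypothetical test pattern: named checks, a tolerance for doubles, and a failure count. */
final class TestSketch {
    static int failures = 0;

    // Activation function copied from this handout.
    static double activation(double x) {
        x /= 20.0;
        return x > 0 ? x : 0.0;
    }

    // Doubles are compared within a tolerance instead of with ==.
    static boolean close(double actual, double expected, double tolerance) {
        return Math.abs(actual - expected) <= tolerance;
    }

    // Records and reports the result of one named check.
    static void check(String name, boolean passed) {
        System.out.println((passed ? "PASS " : "FAIL ") + name);
        if (!passed) {
            failures++;
        }
    }

    public static void main(String[] args) {
        check("positive input is scaled by 1/20", close(activation(40.0), 2.0, 1e-12));
        check("negative input is clamped to zero", close(activation(-5.0), 0.0, 1e-12));
        check("zero input gives zero", close(activation(0.0), 0.0, 1e-12));
        System.out.println(failures + " failure(s)");
    }
}
```

Giving each check a descriptive name makes the printed results easy to paste into your writeup.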
Conclusion
In this project you learned how to build and test a computer "neuron" using nothing but ordinary arithmetic and Boolean logic. In a few weeks we will build, train, and test a real Neural Network.
Rubric
▪ 0-20 points for code quality and proper operation and training of RELUNeuron.java
▪ 10 points if your weights.dbl gives a reasonable error with test files
▪ 0-20 points for your TestNeuron.java
▪ 0-20 points on your test writeup
Further Reading
"How LLM's work." LINK