Basic Classification with a Neural Network with Keras and TensorFlow

27 JANUARY 2025
Mark Sikaundi - Data Scientist and AI Researcher.

A new generation of African talent brings cutting-edge AI to scientific challenges

Keras is a high-level neural networks API written in Python. Early releases could run on top of TensorFlow, CNTK, or Theano; today Keras ships with TensorFlow as tensorflow.keras, and Keras 3 can also run on JAX and PyTorch backends. It allows for easy and fast prototyping, supports both convolutional networks and recurrent networks, and runs seamlessly on both CPUs and GPUs.

Keras is designed to be user-friendly, modular, and extensible. It provides a simple and intuitive interface that allows you to build complex neural networks with just a few lines of code. Keras also supports a wide range of neural network architectures, including feedforward networks, convolutional networks, recurrent networks, and more.

TensorFlow is an open-source machine learning library developed by Google. It provides a flexible and efficient framework for building and training machine learning models, including neural networks. With TensorFlow, you can easily create custom models, train them on large datasets, and deploy them in production environments.

TensorFlow is designed to be scalable, allowing you to train models on distributed systems with multiple GPUs and CPUs. It also provides tools for visualizing and debugging your models, making it easier to understand and improve their performance.

In this tutorial, we will use Keras with TensorFlow as the backend to build a basic classification model. We will train the model on the famous MNIST dataset, which consists of 28x28 pixel grayscale images of handwritten digits. The goal is to classify the digits into one of ten classes (0-9) based on the pixel values.

We will start by loading the dataset and preprocessing the images. We will then build a simple neural network with a single hidden layer and train it on the training data. Finally, we will evaluate the model on the test data and visualize the results.

By the end of this tutorial, you will have a basic understanding of how to build and train a neural network using Keras and TensorFlow. You will also have a working classification model that can recognize handwritten digits with high accuracy.

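To get started, you will need the following Python libraries: TensorFlow, Keras, NumPy, and Matplotlib. Recent TensorFlow releases already include Keras as tensorflow.keras, which is what the code in this tutorial imports, so listing keras separately is optional. You can install them using pip: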

pip install tensorflow keras numpy matplotlib

Once you have installed the required libraries, you can proceed with the tutorial. Let's get started!

Step 2: Understand the Basic Structure of a Keras Model

A Keras model is a collection of layers that are connected in a sequential or functional manner. Each layer in the model performs a specific type of processing, such as feature extraction or classification. The layers communicate with each other through connections, which carry the output of one layer to the input of the next layer.

Keras models are typically built using the Sequential API or the Functional API. Here, we'll cover the basic structure of a Keras model using the Sequential API.


import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
            

Step 2 Define the model:

The Sequential model is a linear stack of layers. You can create a Sequential model by passing a list of layer instances to the constructor.


model = Sequential([
    Dense(128, activation='relu', input_shape=(784,)),
    Dense(10, activation='softmax')
])
    

Here, we define a Sequential model with two layers: a hidden layer with 128 neurons and a ReLU activation function, and an output layer with 10 neurons and a softmax activation function. The input shape of the model is (784,) because each 28x28 pixel image of a handwritten digit is flattened into a vector of 28 x 28 = 784 values before it is passed to the model.
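The model expects flat vectors, so the images need a small preprocessing step first. Here is a minimal sketch of that step, assuming the pixels arrive as unsigned 8-bit integers in the range 0-255, as they do when MNIST is loaded with tf.keras.datasets.mnist.load_data(). The helper name preprocess_images and the random stand-in images are illustrative assumptions, not part of the original tutorial:

```python
import numpy as np

def preprocess_images(images):
    """Flatten (n, 28, 28) uint8 images to (n, 784) floats scaled to [0, 1]."""
    flat = images.reshape(len(images), -1)   # 28 * 28 = 784 features per image
    return flat.astype("float32") / 255.0    # scale pixel intensities to [0, 1]

# Stand-in data with the same shape and dtype as MNIST images (assumption:
# real data would come from tf.keras.datasets.mnist.load_data()).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

X_demo = preprocess_images(fake_images)
print(X_demo.shape, X_demo.dtype)
```

Applying the same function to both the training and the test images would produce arrays that match the (784,) input shape used by the model.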

Step 3 Compile the model:

Before training the model, you need to compile it with a loss function, an optimizer, and evaluation metrics.


model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
    

Here, we compile the model with the Adam optimizer, the sparse categorical crossentropy loss function, and the accuracy metric. Adam is a popular optimization algorithm that adapts the learning rate for each parameter and works well for training deep neural networks. Sparse categorical crossentropy is the loss for multi-class classification when the labels are integers (0-9 here) rather than one-hot vectors. The accuracy metric reports the fraction of samples classified correctly; Keras computes it on the training data during training and on any validation or test data you supply.
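To make the loss and the metric concrete, the sketch below recomputes both by hand with NumPy on a tiny made-up batch. The function names and the example probabilities are assumptions chosen for illustration; the Keras implementation additionally clips probabilities for numerical stability, which this sketch leaves out:

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, probs):
    """Mean negative log-probability assigned to the correct integer label."""
    picked = probs[np.arange(len(y_true)), y_true]
    return float(-np.mean(np.log(picked)))

def accuracy(y_true, probs):
    """Fraction of samples whose most probable class equals the label."""
    return float(np.mean(np.argmax(probs, axis=1) == y_true))

# Three samples, three classes; each row is a softmax-style distribution.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.4, 0.3]])
y_true = np.array([0, 1, 2])   # integer labels, not one-hot vectors

loss_value = sparse_categorical_crossentropy(y_true, probs)
acc_value = accuracy(y_true, probs)
print(loss_value, acc_value)
```

The third sample is misclassified (the most probable class is 1, the label is 2), so it lowers the accuracy and contributes the largest term to the loss.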

Step 4 Train the model:

Now we can train the model on the training data.


# X_train / X_test: float arrays of shape (n, 784); y_train / y_test: integer labels 0-9
model.fit(X_train, y_train, epochs=10, batch_size=32, validation_data=(X_test, y_test))
    

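Here, we train the model on the training data (X_train, y_train) for 10 epochs with a batch size of 32. Because we pass the test data (X_test, y_test) as validation_data, Keras also reports the loss and accuracy on it after each epoch, which lets us monitor performance; this data is not used to update the weights.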
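The relationship between epochs, batch size, and the number of weight updates can be sketched in plain NumPy. This is only an illustration of how a shuffled dataset is cut into mini-batches, assuming the standard MNIST training set of 60,000 images; the helper iterate_minibatches is a hypothetical name and not a Keras function:

```python
import math
import numpy as np

def iterate_minibatches(n_samples, batch_size, rng):
    """Yield index arrays that cover a shuffled dataset one batch at a time."""
    order = rng.permutation(n_samples)
    for start in range(0, n_samples, batch_size):
        yield order[start:start + batch_size]

n_samples, batch_size, epochs = 60_000, 32, 10   # 60,000 = MNIST training set size
steps_per_epoch = math.ceil(n_samples / batch_size)

rng = np.random.default_rng(0)
batches = list(iterate_minibatches(n_samples, batch_size, rng))
print(steps_per_epoch, steps_per_epoch * epochs)
```

Each batch corresponds to one gradient update, so the number of steps per epoch multiplied by the number of epochs gives the total number of updates performed during training.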

Step 5 Evaluate the model:

Next, we evaluate the model on the test data.


loss, accuracy = model.evaluate(X_test, y_test)
print('Test accuracy:', accuracy)
    

Here, we evaluate the model on the test data (X_test, y_test) and print the test accuracy. This gives us an idea of how well the model generalizes to new, unseen data.

Step 6 Visualize the results:

Finally, we can visualize the results of the model on the test data.


import matplotlib.pyplot as plt
import numpy as np  # needed for np.argmax below

predictions = model.predict(X_test)
plt.figure(figsize=(10, 10))
for i in range(25):
    plt.subplot(5, 5, i + 1)
    plt.imshow(X_test[i].reshape(28, 28), cmap='gray')
    plt.title('Predicted: {}'.format(np.argmax(predictions[i])))
    plt.axis('off')
plt.show()
    

Here, we use the model to make predictions on the test data (X_test) and visualize the results. We plot the first 25 images in the test data along with the predicted class labels. This gives us an idea of how well the model is performing on the test data.
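Each row returned by model.predict holds ten probabilities, and np.argmax picks the most likely digit. The sketch below shows that step on a small hand-built array and also lists which predictions disagree with the labels. The probabilities and labels are made up for illustration and do not come from a trained model:

```python
import numpy as np

# Hypothetical predictions for four images over ten digit classes:
# a flat background of 0.05 with one peak of 0.55 per row (rows sum to 1).
predictions = np.full((4, 10), 0.05)
peaks = np.array([7, 2, 1, 0])
predictions[np.arange(4), peaks] = 0.55

y_true = np.array([7, 2, 1, 4])                      # made-up labels for the demo
predicted_labels = np.argmax(predictions, axis=1)    # most probable class per image
confidence = predictions.max(axis=1)                 # probability of that class
wrong = np.flatnonzero(predicted_labels != y_true)   # indices of misclassified images
print(predicted_labels, confidence, wrong)
```

Plotting only the images at the indices in wrong is often more informative than plotting the first 25 test images, because it shows where the model struggles.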

What are layers in Keras

There are several types of layers in a Keras model, including input layers, hidden layers, and output layers. The input layer receives the data and passes it to the first hidden layer. The hidden layers process the data one after another, each taking the output of the previous layer as its input. The output layer produces the final output of the model, which can be used for classification or regression tasks.

The layers in a Keras model are connected in a sequential or functional manner. In a sequential model, the layers are stacked on top of each other, with the output of one layer feeding into the input of the next layer. In a functional model, the layers are connected in a more complex manner, allowing for more flexibility in the model structure.

In the next section, we will build a simple Keras model with a single hidden layer and train it on the MNIST dataset. We will then evaluate the model on the test data and visualize the results.

Step 3: Build a Simple Keras Model

In this section, we will build a simple Keras model with a single hidden layer. The model will consist of three layers: an input layer, a hidden layer, and an output layer. We will use the Sequential API to create the model and add the layers one by one.

The input layer will receive the 28x28 pixel grayscale images of handwritten digits, flattened to 784 values each. The hidden layer will extract features from the pixel values, and the output layer will turn those features into the final classification result.

The hidden layer will consist of 128 neurons with the ReLU activation function. ReLU is a popular activation function that introduces non-linearity into the model, allowing it to learn complex patterns in the data.

The output layer will consist of 10 neurons with the softmax activation function. Softmax is a common activation function used in classification tasks, as it produces a probability distribution over the classes.
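The two layers described above amount to two matrix multiplications with an activation after each. The following NumPy sketch runs one forward pass through a 784-128-10 network and counts its parameters. The weights are random stand-ins (an assumption; a trained model would have learned values), so the output probabilities are meaningless beyond showing the shapes and the fact that each row sums to 1:

```python
import numpy as np

def relu(x):
    """Replace negative values with zero, leave positive values unchanged."""
    return np.maximum(x, 0.0)

def softmax(z):
    """Turn each row of scores into a probability distribution."""
    z = z - z.max(axis=1, keepdims=True)   # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
# Randomly initialised weights stand in for trained ones (assumption).
W1, b1 = rng.normal(scale=0.05, size=(784, 128)), np.zeros(128)
W2, b2 = rng.normal(scale=0.05, size=(128, 10)), np.zeros(10)

x = rng.random((2, 784))              # two fake flattened images in [0, 1)
hidden = relu(x @ W1 + b1)            # hidden layer: 128 activations per image
probs = softmax(hidden @ W2 + b2)     # output layer: 10 class probabilities

n_params = W1.size + b1.size + W2.size + b2.size
print(probs.shape, probs.sum(axis=1), n_params)
```

The parameter count is 784 x 128 + 128 for the hidden layer plus 128 x 10 + 10 for the output layer, which should match the total that model.summary() reports for the Keras model defined earlier.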

In the next section, we will implement the model in Keras and train it on the MNIST dataset. We will then evaluate the model on the test data and visualize the results.

Step 4: Train the Keras Model

In this section, we will train the Keras model on the MNIST dataset. We will use the model to classify the handwritten digits into one of ten classes (0-9) based on the pixel values.

We will start by loading the MNIST dataset and preprocessing the images. We will then build the Keras model with a single hidden layer and train it on the training data. Finally, we will evaluate the model on the test data and visualize the results.

The training process consists of the steps introduced above: define the model, compile it, fit it to the training data, evaluate it on the test data, and visualize the predictions.
