Create a Neural Network to Classify Movie Reviews as Positive or Negative

Reading Time: 3 minutes

Today, I’m going to show you how to build a neural network that classifies movie reviews as positive or negative.

Step 1: Prepare the Data

First, we need data to train our neural network. We will use the IMDB movie reviews dataset, which contains 50,000 reviews labeled as positive or negative, split evenly into 25,000 for training and 25,000 for testing.

We can obtain this dataset using the tf.keras.datasets package of TensorFlow, as shown below:

import tensorflow as tf

(train_data, train_labels), (test_data, test_labels) = tf.keras.datasets.imdb.load_data(num_words=10000)

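Here, we load the training and testing data and keep only the 10,000 most frequent words in the dataset. Rarer words are replaced by a reserved "unknown word" index, and each review is returned as a list of integer word indices rather than raw text.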
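Since each review arrives as a list of integers, it can help to see how such a list maps back to words. The following is a minimal sketch, assuming the default Keras IMDB convention in which indices 0, 1 and 2 are reserved for padding, start-of-sequence and unknown words, so real words are shifted by 3. The tiny toy_word_index dictionary is hypothetical and only stands in for the mapping returned by tf.keras.datasets.imdb.get_word_index().

```python
# Hypothetical miniature vocabulary; the real mapping (word -> rank, starting
# at 1) would come from tf.keras.datasets.imdb.get_word_index().
toy_word_index = {"the": 1, "movie": 2, "was": 3, "great": 4}

INDEX_OFFSET = 3  # assumed default: 0 = padding, 1 = start, 2 = unknown
RESERVED = {0: "<PAD>", 1: "<START>", 2: "<UNK>"}

def decode_review(encoded, word_index):
    """Turn a list of integer indices back into a readable string."""
    reverse = {rank + INDEX_OFFSET: word for word, rank in word_index.items()}
    reverse.update(RESERVED)
    # Any index missing from the vocabulary is shown as unknown.
    return " ".join(reverse.get(i, "<UNK>") for i in encoded)

decoded = decode_review([1, 4, 5, 6, 7], toy_word_index)
print(decoded)  # <START> the movie was great
```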

Step 2: Preprocess the Data

Once we have the data, we need to preprocess it before feeding it into our neural network. Reviews vary in length, so we pad the short ones with zeros and truncate the long ones so that every sequence has exactly 256 elements.

train_data = tf.keras.preprocessing.sequence.pad_sequences(train_data, maxlen=256)
test_data = tf.keras.preprocessing.sequence.pad_sequences(test_data, maxlen=256)
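To make the effect of padding concrete, here is a plain-Python sketch of the idea. It is not the Keras implementation; the helper name pad is made up for illustration, and it assumes the pad_sequences defaults, where both padding and truncation happen at the start of the sequence ("pre") unless stated otherwise.

```python
def pad(sequences, maxlen, padding="pre", truncating="pre", value=0):
    """Illustrative stand-in for pad_sequences, returning lists of equal length."""
    padded = []
    for seq in sequences:
        seq = list(seq)
        if len(seq) > maxlen:
            # "pre" truncation drops the beginning and keeps the last maxlen items.
            seq = seq[-maxlen:] if truncating == "pre" else seq[:maxlen]
        fill = [value] * (maxlen - len(seq))
        padded.append(fill + seq if padding == "pre" else seq + fill)
    return padded

reviews = [[1, 2, 3], [4, 5, 6, 7, 8, 9]]
pre_padded = pad(reviews, maxlen=5)
post_padded = pad(reviews, maxlen=5, padding="post")
print(pre_padded)   # [[0, 0, 1, 2, 3], [5, 6, 7, 8, 9]]
print(post_padded)  # [[1, 2, 3, 0, 0], [5, 6, 7, 8, 9]]
```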

Step 3: Define the Neural Network Architecture

Now, we can define the architecture of our neural network. In this example, we will use a small network made of an embedding layer, a pooling layer, one hidden dense layer, and one output layer.

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(10000, 16),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

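Here, we define an embedding layer, which maps each word index in the input sequence to a dense 16-dimensional vector. The pooling layer then averages those vectors across the sequence, so every review is reduced to a single fixed-length vector regardless of how many words it contains. Finally, two dense layers follow: a hidden layer with ReLU activation and an output layer with a sigmoid activation, which produces a value between 0 and 1 that can be read as the probability that the review is positive.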
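To see how the shapes change from layer to layer, the forward pass can be sketched in NumPy. This is only an illustration with random, untrained weights, not what Keras runs internally; the variable names are made up, and it assumes padding positions are averaged like any other token, which is the behaviour when the embedding layer does not mask zeros.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim, hidden_units = 10000, 16, 16
batch_size, seq_len = 4, 256

# Random stand-ins for the weights the real model would learn.
embedding = rng.normal(scale=0.1, size=(vocab_size, embed_dim))
w_hidden = rng.normal(scale=0.1, size=(embed_dim, hidden_units))
b_hidden = np.zeros(hidden_units)
w_out = rng.normal(scale=0.1, size=(hidden_units, 1))
b_out = np.zeros(1)

def forward(token_ids):
    embedded = embedding[token_ids]          # (batch, seq_len, embed_dim)
    pooled = embedded.mean(axis=1)           # (batch, embed_dim)
    hidden = np.maximum(0.0, pooled @ w_hidden + b_hidden)  # ReLU
    logits = hidden @ w_out + b_out          # (batch, 1)
    return 1.0 / (1.0 + np.exp(-logits))     # sigmoid

tokens = rng.integers(0, vocab_size, size=(batch_size, seq_len))
probs = forward(tokens)
print(probs.shape)  # (4, 1)

# Trainable parameters: embedding + hidden layer + output layer.
n_params = (vocab_size * embed_dim
            + (embed_dim * hidden_units + hidden_units)
            + (hidden_units * 1 + 1))
print(n_params)  # 160289
```

Almost all of the parameters sit in the embedding table, which is why limiting the vocabulary to 10,000 words keeps the model small.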

Step 4: Compile and Train the Neural Network

Once we have defined the architecture of the neural network, we can compile and train it using the training dataset.

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
history = model.fit(train_data, train_labels, epochs=10, validation_data=(test_data, test_labels))

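Here, we compile the neural network with the Adam optimizer and the binary cross-entropy loss function, which suits a two-class problem with a single sigmoid output. Then, we train the network for 10 epochs on the training data, using the testing data for validation. Note that this means the validation metrics reported during training are computed on the same data used for the final evaluation in Step 5.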
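As a rough illustration of what the loss measures, binary cross-entropy can be written in a few lines of plain Python. This is a sketch of the formula rather than the Keras implementation; the function name and the clipping constant eps are assumptions chosen for the example.

```python
import math

def binary_cross_entropy(labels, probs, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all examples."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)

# Same labels, once with mostly correct and once with mostly wrong predictions.
mostly_right = binary_cross_entropy([1, 0], [0.9, 0.1])
mostly_wrong = binary_cross_entropy([1, 0], [0.1, 0.9])
print(round(mostly_right, 4))  # 0.1054
print(round(mostly_wrong, 4))  # 2.3026
```

The loss grows quickly when the model is confident and wrong, which is what pushes training toward well-calibrated probabilities.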

Step 5: Evaluate the Neural Network

Finally, we can evaluate the accuracy of our neural network using the testing data.

loss, accuracy = model.evaluate(test_data, test_labels)
print('Accuracy: {:.2f}%'.format(accuracy * 100))
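For a sigmoid output, accuracy amounts to thresholding each predicted probability at 0.5 and counting how often the result matches the label. A small sketch with made-up labels and probabilities (not real model output) shows the calculation:

```python
def binary_accuracy(labels, probs, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    predictions = [1 if p > threshold else 0 for p in probs]
    correct = sum(int(pred == y) for pred, y in zip(predictions, labels))
    return correct / len(labels)

# Hypothetical values chosen for illustration only.
labels = [1, 0, 1, 1]
probs = [0.92, 0.35, 0.48, 0.71]

accuracy = binary_accuracy(labels, probs)
print('Accuracy: {:.2f}%'.format(accuracy * 100))  # Accuracy: 75.00%
```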

Complete Example

import tensorflow as tf
from tensorflow import keras

# Load the IMDB dataset
imdb = keras.datasets.imdb
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

# Preprocess the data
train_data = keras.preprocessing.sequence.pad_sequences(train_data, value=0, padding='post', maxlen=256)
test_data = keras.preprocessing.sequence.pad_sequences(test_data, value=0, padding='post', maxlen=256)

# Define the neural network architecture
model = keras.Sequential()
model.add(keras.layers.Embedding(10000, 16))
model.add(keras.layers.GlobalAveragePooling1D())
model.add(keras.layers.Dense(16, activation=tf.nn.relu))
model.add(keras.layers.Dense(1, activation=tf.nn.sigmoid))

# Compile and train the neural network
model.compile(optimizer=tf.optimizers.Adam(),
              loss='binary_crossentropy',
              metrics=['accuracy'])

history = model.fit(train_data,
                    train_labels,
                    epochs=10,
                    batch_size=512,
                    validation_data=(test_data, test_labels),
                    verbose=1)

# Evaluate the neural network
results = model.evaluate(test_data, test_labels)
print('Test accuracy: ', results[1])

In this example, we first load the IMDB dataset using the load_data() function from the keras.datasets.imdb module. Then, we preprocess the data using the pad_sequences() function from the keras.preprocessing.sequence module to ensure that all reviews have the same length. Unlike the snippet in Step 2, this version passes padding='post', so the zeros are appended to the end of each review instead of being added at the start.

Next, we define the neural network architecture using the keras.Sequential class, and add layers one at a time with its add() method.

After defining the architecture, we compile and train the neural network using the compile() and fit() methods, respectively. In this case, we use the Adam optimizer and the binary_crossentropy loss function, and train the neural network for 10 epochs with a batch size of 512.

Finally, we evaluate the neural network on the testing data using the evaluate() method, which returns the loss followed by the accuracy, and print the accuracy to the console.
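To classify a new review with the trained model, its text has to be encoded the same way as the training data. The sketch below shows one way this could look, assuming the default Keras IMDB convention (1 marks the start, 2 marks unknown words, real words are shifted by 3) and the 'post' padding used in the complete example. The encode_review helper and the tiny toy_word_index are hypothetical; the real mapping would come from keras.datasets.imdb.get_word_index().

```python
# Hypothetical miniature vocabulary standing in for the real word index.
toy_word_index = {"the": 1, "movie": 2, "was": 3, "great": 4}

def encode_review(text, word_index, num_words=10000, maxlen=256):
    """Encode raw text as a fixed-length list of integer word indices."""
    ids = [1]  # assumed start-of-sequence marker
    for word in text.lower().split():
        rank = word_index.get(word)
        idx = rank + 3 if rank is not None else 2  # 2 = unknown word
        # Words outside the num_words vocabulary are also treated as unknown.
        ids.append(idx if idx < num_words else 2)
    ids = ids[-maxlen:]                      # keep the last maxlen tokens
    return ids + [0] * (maxlen - len(ids))   # 'post' padding with zeros

encoded = encode_review("The movie was surprisingly great", toy_word_index, maxlen=8)
print(encoded)  # [1, 4, 5, 6, 2, 7, 0, 0]
```

With the default maxlen of 256, the encoded list could be wrapped in an array of shape (1, 256) and passed to model.predict() to obtain the predicted probability that the review is positive.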
