Using and Deploying GPT4All, an Alternative to Llama-2 and GPT-4 for Low-Resource PCs, with Python and Docker

Reading time: 2 minutes

Hello, today we are going to learn how to deploy GPT4All, an open-source alternative to GPT-4 that is licensed for commercial use and consumes fewer resources than Llama-2.

In this tutorial, we will learn how to run GPT4All in a Docker container and how to use its Python library to generate responses directly in code, outside of a chat interface.

It is a model similar to Llama-2, but it needs neither a GPU nor an internet connection once the model is downloaded. However, it requires approximately 16 GB of RAM to run properly (you can create swap files if needed).
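For example, on a Linux host you could add a 4 GB swap file with the following standard commands (this is general Linux administration, not something specific to GPT4All):

sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile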

If you need more information about GPT4All, you can find it on their official website: https://gpt4all.io/index.html

The first thing we need to do is install Docker and Docker Compose in our environment (https://devcodelight.com/instalar-docker-y-docker-compose-en-oracle-linux/).

Once they are installed, we will prepare the container:

  • Create a file named container_python inside a Dockerfile/ folder.

Our container_python file contains the following:

# syntax=docker/dockerfile:1
FROM python:3.8

# Update the package index and make sure pip is up to date
RUN apt-get update -y && apt-get install -y python3-pip
RUN pip install --upgrade pip

# Install the GPT4All Python bindings
RUN pip install gpt4all

# Keep the container running so we can exec into it
CMD ["tail", "-f", "/dev/null"]

With this configuration, we already have a prepared environment. Now let’s create the docker-compose.yml file:

version: "3.1"

services:
  auto_chatgpt:
    build:
      context: ./Dockerfile
      dockerfile: container_python
    restart: unless-stopped
    container_name: auto_chatgpt
    working_dir: /app
    volumes:
      - ./app:/app

In this case, we tell Compose to build the image from Dockerfile/container_python, and we mount the local ./app folder into the container at /app; this is where we will place the .py file we are about to create.

Now, create a file named gpt.py inside the app/ folder:

from gpt4all import GPT4All

# Download (on the first run) and load the model
model = GPT4All("orca-mini-3b.ggmlv3.q4_0.bin")

# Generate a response for the prompt and print it
output = model.generate("Tell me the capital of Spain")
print(output)
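If you want more control over the output, the generate method accepts tuning parameters, and recent versions of the gpt4all bindings also provide a chat_session context manager that keeps conversation history between prompts. A minimal sketch (parameter names are taken from the 1.x Python bindings; verify them against your installed version):

from gpt4all import GPT4All

model = GPT4All("orca-mini-3b.ggmlv3.q4_0.bin")

# max_tokens caps the length of the response; temp controls randomness
output = model.generate("Tell me the capital of Spain", max_tokens=100, temp=0.7)
print(output)

# A chat session keeps previous turns in context for follow-up questions
with model.chat_session():
    print(model.generate("Tell me the capital of Spain", max_tokens=100))
    print(model.generate("And what is its population?", max_tokens=100))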

With this, everything is ready. We can now launch the container with the following command:

docker-compose up -d

Since the container does not run the script by itself, we can execute it by entering the container:

docker exec -it auto_chatgpt /bin/bash

This opens a shell inside the container, where we can run:

python gpt.py

The result will be printed to the console. The first run may take a while, since the model has to be downloaded.
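Alternatively, you can skip the interactive shell and run the script directly, since docker exec also accepts a command:

docker exec auto_chatgpt python /app/gpt.py

Note that the gpt4all library stores downloaded models in a cache directory inside the container (in recent versions of the Python bindings this is ~/.cache/gpt4all, but check your version). If the container is recreated, the download is repeated unless you mount that path as an additional volume, for example - ./models:/root/.cache/gpt4all under volumes in docker-compose.yml.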

This is the project structure:

.
├── Dockerfile/
│   └── container_python
├── app/
│   └── gpt.py
├── docker-compose.yml
└── .gitignore

*I created the .gitignore file to prevent cache files from being committed to the repository.
