Hello, today we are going to learn how to deploy GPT4All, an open-source, commercially usable alternative to GPT-4 that also consumes fewer resources than Llama-2.
In this tutorial, we will run GPT4All in a Docker container and use its Python library to send prompts and get responses directly from code, outside of a chat interface.
It runs models similar to Llama-2, but without needing a GPU or an internet connection. However, it requires approximately 16GB of RAM to work properly (you can create a swap file if needed, as shown below).
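As a reference, on most Linux systems a swap file can be created like this (a generic sketch, not specific to GPT4All; adjust the 8G size to your machine):

```bash
# Allocate an 8GB swap file (adjust the size as needed)
sudo fallocate -l 8G /swapfile
# Restrict permissions, format it as swap, and enable it
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
```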
If you need more information about GPT4All, you can find it on their official website: https://gpt4all.io/index.html
The first thing we need to do is to have Docker/Docker Compose installed in our environment (https://devcodelight.com/instalar-docker-y-docker-compose-en-oracle-linux/)
Once installed, we will set up the container with the following files:
- Create a file named container_python inside a folder called Dockerfile/
Our container_python file contains the following:
```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.8

#WORKDIR /app

# Install the gpt4all Python bindings
RUN apt-get update -y
RUN apt-get install python3-pip -y
RUN pip install --upgrade pip
RUN pip install gpt4all

# Run a command to keep the container running
CMD tail -f /dev/null
```
With this configuration, the environment is ready. Now let's create the docker-compose.yml file:
```yaml
version: "3.1"

services:
  auto_chatgpt:
    build:
      context: ./Dockerfile
      dockerfile: container_python
    restart: unless-stopped
    container_name: auto_chatgpt
    working_dir: /app
    volumes:
      - ./app:/app
```
Here we tell Docker Compose to build the image from Dockerfile/container_python, and we mount the local ./app folder into the container at /app, which is where we will place the .py file we are about to create.
Now, create a file named gpt.py inside the app/ folder:
```python
from gpt4all import GPT4All

# Load the model (downloaded automatically on the first run)
model = GPT4All("orca-mini-3b.ggmlv3.q4_0.bin")

# Generate a response for the prompt and print it
output = model.generate("Tell me the capital of Spain")
print(output)
```
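If you need more control over the output, the gpt4all library also accepts generation parameters, and recent versions of the Python bindings include a chat session for multi-turn conversations. A minimal sketch, assuming your installed gpt4all version supports `chat_session()` and these keyword arguments:

```python
from gpt4all import GPT4All

model = GPT4All("orca-mini-3b.ggmlv3.q4_0.bin")

# chat_session() keeps the conversation history between prompts
# (available in recent versions of the gpt4all Python bindings)
with model.chat_session():
    # max_tokens limits the response length; temp controls randomness
    first = model.generate("Tell me the capital of Spain", max_tokens=64, temp=0.7)
    print(first)
    # The follow-up prompt can rely on the previous answer
    second = model.generate("And what is its population?", max_tokens=64, temp=0.7)
    print(second)
```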
With this, everything is ready to run. Now we can start the container with the following command:
```bash
docker-compose up -d
```
Since the container does not run the script automatically, we execute our file by entering the container:
```bash
docker exec -it auto_chatgpt /bin/bash
```
A shell will open inside the container, where we can run:
```bash
python gpt.py
```
The result will be printed to the console. The first run may take a while, since the model has to be downloaded.
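Alternatively, since the compose file sets the working directory to /app, you can run the script in a single step without opening an interactive shell:

```bash
docker exec auto_chatgpt python gpt.py
```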
This is the project structure:
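```
.
├── docker-compose.yml
├── Dockerfile/
│   └── container_python
├── app/
│   └── gpt.py
└── .gitignore
```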
*I created the .gitignore file to keep Python cache files out of the repository.
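A minimal .gitignore for this setup could be:

```
__pycache__/
*.pyc
```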