Install Llama 2, Meta's open-source and commercially licensed AI that competes with ChatGPT, using Python and Docker (with GPU access)

Reading time: 3 minutes

Hello, today we will see how to download and install Llama 2, the Meta AI that competes with ChatGPT (GPT-3.5). To make this easier, I have created a Docker Compose setup that will help us prepare the environment.

The first thing we need to do is go to the Llama 2 page and request download access on Meta's own website: https://ai.meta.com/llama/

It will ask for an email address; I recommend using a university or research email account to get approved as quickly as possible.

Next, you will receive an email with download instructions and your unique custom URL.

To get started, we first need to download the Llama 2 code, without the model weights. For this, we go to the project's GitHub repository and clone it:

https://github.com/facebookresearch/llama
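For example, a minimal sketch, assuming you clone it into the ‘app’ folder we create in the next step (so the container can see it at /app). Cloning into a folder named llama-main matches the folder name the later steps use; downloading the ZIP from GitHub produces the same name:

# Clone the repository into app/llama-main
git clone https://github.com/facebookresearch/llama.git app/llama-main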

Once downloaded, we can either install it directly on our machine or use Docker Compose.

I will create a Docker Compose setup that you can use to install this AI:

  • Create a folder named ‘app’ at the root of the project.
  • Create another folder named ‘Dockerfile’, where we will add our custom environment configuration file (both folders can be created with the one-liner below).
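From the root of the project, both folders can be created in one go:

mkdir -p app Dockerfile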

Create a file called ‘container_python’ inside the Dockerfile folder (Dockerfile/container_python) and add this content:

# syntax=docker/dockerfile:1
#FROM python:3.11.3
FROM python:3.8

#WORKDIR /app

# -y answers "yes" automatically so the build does not stop at prompts
RUN apt-get update -y
RUN apt-get install python3-pip -y
RUN pip install --upgrade pip
# Keep the container running so we can exec into it
CMD ["tail", "-f", "/dev/null"]

Now create a file at the root of the project named docker-compose.yml:

version: "3.1"

services:

  llama_ia_python:
    build:
      context: ./Dockerfile
      dockerfile: container_python
    restart: unless-stopped
    container_name: llama_ia_python
    working_dir: /app
    volumes:
      - ./app:/app

With this, our environment is now set up.

Now, we have to launch the container with the command:

docker compose up -d
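You can verify that it is up with:

docker compose ps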

With the container running in the background, we can execute our commands inside it. I recommend opening a shell directly in the container:

docker exec -it llama_ia_python bash

This will open a shell inside the container.

The first thing we need to do is navigate to the llama repository:

cd llama-main

Then run download.sh to download the models (if you get a permission error, make it executable first with chmod +x download.sh):

./download.sh

It will now ask for the download URL. This is the URL you received in the email (“When asked for your unique custom URL, please insert the following:”); copy it and paste it in.

It will then ask which models we want to download; to download all of them, just press Enter.

It will start downloading the models.

Once the models are downloaded, I recommend backing them up somewhere else, since the download URL is only valid for 24 hours and allows 5 downloads. Each model ends up in its own folder inside llama-main (for example, llama-2-7b), next to the tokenizer files.
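For example, a minimal sketch from inside llama-main; the backup path is just an illustration, and the folder names depend on which models you downloaded:

# Copy the 7B weights and the tokenizer files to a backup folder
mkdir -p /app/model_backup
cp -r llama-2-7b tokenizer.model tokenizer_checklist.chk /app/model_backup/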

With the models in place, we can install the llama package (this also installs its dependencies, such as PyTorch):

pip install -e .
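Before running the examples, it is worth checking that PyTorch can actually see the GPU from inside the container:

python -c "import torch; print(torch.cuda.is_available())"

If this prints False, the container has no GPU access (see the note at the end).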

Now, to check that everything works, run the text completion example:

torchrun --nproc_per_node 1 example_text_completion.py \
    --ckpt_dir llama-2-7b/ \
    --tokenizer_path tokenizer.model \
    --max_seq_len 128 --max_batch_size 4
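The repository also includes a chat example; assuming you downloaded the 7B chat model, it runs the same way:

torchrun --nproc_per_node 1 example_chat_completion.py \
    --ckpt_dir llama-2-7b-chat/ \
    --tokenizer_path tokenizer.model \
    --max_seq_len 512 --max_batch_size 6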

*Note: If you use Docker, make sure the container can access the GPU; otherwise this step will not work. If it doesn’t work through Docker Compose, you can try running it locally instead.
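One way to grant GPU access, assuming an NVIDIA card with the NVIDIA Container Toolkit installed on the host, is to add a device reservation to the service in docker-compose.yml (a sketch; adjust it to your setup):

services:
  llama_ia_python:
    # ... same configuration as above ...
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]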
