Hello! Today we will see how to download and install Llama 2, Meta AI's language model that competes with ChatGPT (GPT-3.5). To make this easier, I have created a Docker Compose setup that will help us prepare the environment.

The first thing we need to do is go to the Llama 2 page and request access to the models from Meta's own website: https://ai.meta.com/llama/
The form will ask for an email address. I recommend using a university or research email account to get approved as quickly as possible:

Next, you will receive an email with installation instructions.
To get started, we first need to download the Llama 2 code, which does not include the model weights. For this, we go to their GitHub repository and clone it (or download it as a ZIP):
https://github.com/facebookresearch/llama
Once downloaded, we can either install it directly on our machine or use Docker Compose.
Here is the Docker Compose setup that you can use to install it:
- Create a folder named ‘app’ at the root of the project and place the downloaded llama repository inside it, so that the container can see it.
- Now create another folder named ‘Dockerfile’, where we will add our custom environment configuration file.
Create a file called ‘container_python’ inside that folder (Dockerfile/container_python) and add this content:
# syntax=docker/dockerfile:1
#FROM python:3.11.3
FROM python:3.8
#WORKDIR /app
# -y answers "yes" automatically to any confirmation prompt
RUN apt-get update -y
RUN apt-get install -y python3-pip
RUN pip install --upgrade pip
# Run a command to keep the container running
CMD tail -f /dev/null
Now create a file at the root of the project named docker-compose.yml with this content:
version: "3.1"
services:
  llama_ia_python:
    build:
      context: ./Dockerfile
      dockerfile: container_python
    restart: unless-stopped
    container_name: llama_ia_python
    working_dir: /app
    volumes:
      - ./app:/app
With this, our environment is now set up.
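Before launching anything, it can help to confirm that the files are where Docker Compose expects them. The following is a minimal sketch, assuming it is run from the root of the project; the function name missing_paths is only illustrative and not part of any tool.

```python
from pathlib import Path

# Paths described in the steps above, relative to the project root.
EXPECTED = ["docker-compose.yml", "Dockerfile/container_python", "app"]


def missing_paths(project_root="."):
    """Return the expected paths that do not exist under project_root."""
    root = Path(project_root)
    return [p for p in EXPECTED if not (root / p).exists()]


missing = missing_paths()
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("Project layout looks complete.")
```

An empty list means the three items from the steps above are present; it does not check their contents.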
Now, we have to launch the container with the command:
docker compose up -d
This leaves the container running in the background so that we can run commands in it. I recommend opening a shell inside the container directly with Docker:
docker exec -it llama_ia_python bash
This will open the console for us.
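The first thing we need to do is navigate to the llama repository folder (it is named llama-main if you downloaded the ZIP from GitHub, or llama if you used git clone):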
cd llama-main
Then run download.sh to download the models. If you get a permission error, run chmod +x download.sh first.
./download.sh
The script will ask for your download URL. This URL is in the email you received (“When asked for your unique custom URL, please insert the following:”); copy it and paste it.
It will then ask which models we want to download. To download all of them, just press Enter.
It will start downloading the models:

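Once the models are downloaded, I recommend saving a copy of them somewhere else, because the download link only works for 24 hours and can only be used 5 times. They are saved inside the repository folder, one folder per model (for example llama-2-7b/), next to the tokenizer.model file:
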
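The backup can be done by hand, but here is a minimal sketch of how it could be scripted. It assumes the model folders are named llama-2-* (like llama-2-7b/) and that the tokenizer files start with "tokenizer"; the function name backup_models and the backup path in the usage note are only illustrative.

```python
import shutil
from pathlib import Path


def backup_models(repo_dir, backup_dir):
    """Copy downloaded model folders and tokenizer files into backup_dir.

    Assumption: model folders start with "llama-2-" and tokenizer files
    start with "tokenizer". Adjust the prefixes if your download differs.
    Returns the names of the items that were copied.
    """
    repo = Path(repo_dir)
    backup = Path(backup_dir)
    backup.mkdir(parents=True, exist_ok=True)
    copied = []
    for item in sorted(repo.iterdir()):
        if item.is_dir() and item.name.startswith("llama-2-"):
            # dirs_exist_ok (Python 3.8+) allows re-running after a partial copy.
            shutil.copytree(item, backup / item.name, dirs_exist_ok=True)
            copied.append(item.name)
        elif item.is_file() and item.name.startswith("tokenizer"):
            shutil.copy2(item, backup / item.name)
            copied.append(item.name)
    return copied
```

For example, inside the container you could call backup_models("/app/llama-main", "/app/models_backup"). Because /app is mounted from the ./app folder, the copy ends up on the host machine as well.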
Once the models are downloaded, we can install llama:
pip install -e .
Now, to check that it works, run this command:
torchrun --nproc_per_node 1 example_text_completion.py \
    --ckpt_dir llama-2-7b/ \
    --tokenizer_path tokenizer.model \
    --max_seq_len 128 --max_batch_size 4
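If you want to try your own prompts, the example script can be reduced to a few lines. This is a minimal sketch based on what example_text_completion.py does, assuming the llama package is installed and the 7B model is in llama-2-7b/. The file name my_prompt.py, the function name complete and the prompt are only illustrative.

```python
import os


def complete(prompts, ckpt_dir="llama-2-7b/", tokenizer_path="tokenizer.model"):
    """Generate one text completion per prompt with the downloaded model."""
    # Imported here so the file can be opened without llama installed.
    from llama import Llama

    generator = Llama.build(
        ckpt_dir=ckpt_dir,
        tokenizer_path=tokenizer_path,
        max_seq_len=128,
        max_batch_size=4,
    )
    results = generator.text_completion(
        prompts,
        max_gen_len=64,
        temperature=0.6,
        top_p=0.9,
    )
    return [result["generation"] for result in results]


# torchrun sets LOCAL_RANK, so this part only runs when launched through it.
if __name__ == "__main__" and "LOCAL_RANK" in os.environ:
    for text in complete(["I believe the meaning of life is"]):
        print(text)
```

Save it next to the example script (for instance as my_prompt.py) and launch it the same way: torchrun --nproc_per_node 1 my_prompt.py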
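*Note: If you use Docker, make sure the container can access the GPU; otherwise this command will not work. For NVIDIA cards, this requires the NVIDIA Container Toolkit on the host and a GPU reservation for the service in docker-compose.yml. If it doesn’t work with Docker Compose, you can try running it locally.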
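A quick way to see whether the container can reach the GPU is to ask PyTorch directly. This is a minimal sketch, assuming PyTorch was installed as a dependency by pip install -e . (the function name gpu_status is only illustrative):

```python
def gpu_status():
    """Return a short description of whether PyTorch can see a CUDA GPU."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    if not torch.cuda.is_available():
        return "PyTorch is installed, but no CUDA GPU is visible"
    names = [
        torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())
    ]
    return "CUDA GPUs visible: " + ", ".join(names)


print(gpu_status())
```

If it reports that no GPU is visible inside the container but a GPU shows up when you run the same script on the host, the problem is in the Docker GPU configuration rather than in Llama 2 itself.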

