Language Models: General or Instructional – A Key Comparison

Reading time: < 1 minute. We need to understand that not all models are created equal; the key difference lies between general models and instruct models. General models. Example question: “Explain how a neural network works”. A general model’s response may be an extensive explanation that includes unnecessary concepts or jumps between topics. Instruct models. Example question: “Explain how a neural network works step by … Read more
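As a minimal sketch of the difference described above (the role names follow the common chat-message convention used by OpenAI-compatible APIs; the system prompt and function names are illustrative, not tied to any specific product):

```python
# Sketch: the same question framed for a base (general) model versus an
# instruction-tuned model. Names and the system prompt are illustrative.

def base_prompt(question: str) -> str:
    # A base model is a text continuator: it just completes raw text,
    # which is why its answers can drift or ramble.
    return question

def instruct_messages(question: str) -> list:
    # An instruct model expects structured turns with explicit roles and
    # was fine-tuned to follow the instruction rather than free-continue.
    return [
        {"role": "system", "content": "Answer concisely and step by step."},
        {"role": "user", "content": question},
    ]

question = "Explain how a neural network works"
print(base_prompt(question))
print(instruct_messages(question))
```

The structural difference (plain string vs. role-tagged turns) is what lets an instruct model stay on topic and answer step by step.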

How to activate GPU access from a Docker container, for example, to access an LLM model.

Reading time: 2 minutes. Enabling GPU access is essential if we need to run an LLM model and use the GPU’s VRAM. Prerequisites: make sure an NVIDIA GPU is installed in your machine, with its drivers installed and up to date. Check with: nvidia-smi. You should see your GPU and driver version. Additionally, Docker must be installed. NVIDIA Container … Read more
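Assuming the NVIDIA drivers and the NVIDIA Container Toolkit are already installed, a quick smoke test looks like the following (the CUDA image tag is only an example):

```shell
# Expose all host GPUs to a throwaway container and run nvidia-smi inside it.
# Requires the NVIDIA Container Toolkit; the image tag is illustrative.
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```

If the container prints the same GPU table you see on the host, GPU passthrough is working.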

Using VLLM with Docker for Deploying Our LLM Models in Production

Reading time: 2 minutes. vLLM is an optimized inference server (it uses PagedAttention) that supports models such as Llama 3, Mistral, Gemma, Phi, and Qwen. It exposes an OpenAI-compatible API, perfect for easy integration. We will create the Docker Compose file that lets us deploy it. File: docker-compose.yml

```yaml
version: "3.9"
services:
  vllm:
    image: vllm/vllm-openai:latest
    container_name: vllm
    restart: unless-stopped
    ports:
      - "8000:8000"
```

… Read more
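Once the container is up, the OpenAI-compatible endpoint can be called with a plain HTTP request. A hedged sketch (the model name and localhost URL are assumptions; adjust them to your deployment, and the request is only attempted, not required, to build the payload):

```python
# Sketch of a chat-completions request against vLLM's OpenAI-compatible API.
# The model name and URL below are assumptions; match your own deployment.
import json
import urllib.request

def build_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }

payload = build_request("meta-llama/Meta-Llama-3-8B-Instruct", "Hello!")

if __name__ == "__main__":
    try:
        req = urllib.request.Request(
            "http://localhost:8000/v1/chat/completions",
            data=json.dumps(payload).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req, timeout=10) as resp:
            print(json.load(resp)["choices"][0]["message"]["content"])
    except OSError:
        # Server not running: we still built a valid payload.
        print("vLLM server not reachable; payload built for", payload["model"])
```

Because the API mirrors OpenAI’s, the official `openai` client library can also be pointed at `http://localhost:8000/v1` instead of hand-building requests.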

Using LangChain with Ollama to Create an AI Agent Tool

Reading time: < 1 minute. Today we are going to learn how to connect LangChain to our deployed Ollama server with a small example: an AI tool. First we need Ollama deployed: here is how. Once Ollama is deployed you will obtain the endpoint, and you can then use LangChain with Python to connect to it remotely. You … Read more
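A minimal sketch of the idea: a plain Python function exposed as a LangChain tool, wired to an Ollama-served model. The model name, `base_url`, and the `langchain-ollama` package are assumptions about your setup; the tool function itself works standalone.

```python
# Sketch: a simple function a LangChain agent could call as a tool, plus
# (guarded) wiring to an Ollama-served model. Model name and base_url are
# assumptions; adjust them to your own deployment.

def word_count(text: str) -> int:
    """Toy tool: count the words in the input text."""
    return len(text.split())

if __name__ == "__main__":
    try:
        # Requires: pip install langchain-ollama langchain-core
        from langchain_core.tools import tool
        from langchain_ollama import ChatOllama

        llm = ChatOllama(model="llama3", base_url="http://localhost:11434")
        counting_tool = tool(word_count)          # wrap as a LangChain tool
        llm_with_tools = llm.bind_tools([counting_tool])
        reply = llm_with_tools.invoke(
            "How many words are in 'hello brave new world'?"
        )
        print(reply)
    except Exception as exc:
        # LangChain/Ollama not available: the tool still works on its own.
        print("LangChain/Ollama unavailable:", exc)
```

The model decides when to call the tool; LangChain handles serializing the function signature into the tool schema sent to Ollama.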

How to Create a Multimodal Chatbot with AI Generative

Reading time: 2 minutes. In 2025, LLaMA (Large Language Model Meta AI) has established itself as one of the most versatile options for local or cloud-based chatbots, capable of processing text, images, and audio. In this tutorial, you will learn to create a multimodal chatbot using only LLaMA. Llama 3 is available in 8B and 70B parameter versions; for local … Read more
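As a sketch of what a multimodal (text + image) chat message can look like, here is the OpenAI-style content-parts format that many local LLaMA servers accept. The field names follow that convention; the server and model must actually support vision input (e.g. a Llama vision variant), and the image bytes below are a placeholder.

```python
# Sketch: a multimodal user message combining text and an image encoded as
# a base64 data URL, in the OpenAI-style content-parts format. Whether it
# is accepted depends on the serving stack and a vision-capable model.
import base64

def image_part(image_bytes: bytes, mime: str = "image/png") -> dict:
    """Encode raw image bytes as an image_url content part (data URL)."""
    b64 = base64.b64encode(image_bytes).decode()
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime};base64,{b64}"},
    }

def multimodal_message(text: str, image_bytes: bytes) -> dict:
    """Build a single user turn containing both text and an image."""
    return {
        "role": "user",
        "content": [{"type": "text", "text": text}, image_part(image_bytes)],
    }

msg = multimodal_message("What is in this picture?", b"\x89PNG placeholder")
print(msg["content"][0]["text"])
```

The same message dict can be dropped into the `messages` list of any OpenAI-compatible chat request.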

How to Create a Mobile App with Local AI Using Mistral and Transformers.js with React Native

Reading time: 2 minutes. Can you imagine an app that works with artificial intelligence without an internet connection? Today I’ll show you how to use Transformers.js and a quantized model like Mistral 7B in the browser or on your mobile, without sending data to external servers. You get total privacy by using your own device, it’s free to use, works … Read more
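A hedged sketch of the Transformers.js pattern the article refers to. The package and model identifiers are assumptions; in practice a 7B model is usually too large for the browser, so a small quantized model stands in here for illustration.

```javascript
// Sketch: on-device text generation with Transformers.js. The package
// (@xenova/transformers) and model id are assumptions; a small quantized
// model is used because full 7B weights rarely fit in a browser runtime.
async function generate(prompt) {
  const { pipeline } = await import("@xenova/transformers");
  // Downloads/caches the model locally on first use, then runs on-device.
  const generator = await pipeline("text-generation", "Xenova/distilgpt2");
  const out = await generator(prompt, { max_new_tokens: 30 });
  return out[0].generated_text;
}

generate("Hello").then(console.log).catch(() => {
  console.log("model unavailable in this environment");
});
```

Because inference runs entirely in the JavaScript runtime, the same function works in a browser page or a React Native WebView without sending the prompt to a server.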