Can you imagine an app that runs artificial intelligence without an internet connection? Today I’ll show you how to use Transformers.js with a quantized model like Mistral 7B, in the browser or on your phone, without sending data to external servers.

Everything runs on your own device, so you get total privacy. It’s free to use and works on mobile, desktop, and the web.
You need to:

Install Transformers.js:

```bash
npm install @xenova/transformers
```

Then load the model:
```js
import { pipeline } from '@xenova/transformers';

const runModel = async () => {
  // The first run downloads and caches the model; later runs work offline.
  const classifier = await pipeline(
    'text-classification',
    'Xenova/distilbert-base-uncased-finetuned-sst-2-english'
  );

  const result = await classifier('This app idea is incredible.');
  // Logs something like: [{ label: 'POSITIVE', score: 0.99... }]
  console.log(result);
};

runModel();
```
This example uses a basic sentiment-analysis model. You can replace it with a more powerful one (like Mistral):
```js
const chat = await pipeline('text-generation', 'Xenova/mistral-7b-instruct');
const output = await chat('Can you explain what Transformers.js is in simple language?');
```
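In practice you’d wrap this in an async function and pass generation options. Here is a minimal sketch; `max_new_tokens`, `temperature`, and `do_sample` are standard Transformers.js generation parameters, and the prompt is just an example:

```js
import { pipeline } from '@xenova/transformers';

const runChat = async () => {
  // First run downloads and caches the weights; later runs work offline.
  const chat = await pipeline('text-generation', 'Xenova/mistral-7b-instruct');

  const output = await chat('Can you explain what Transformers.js is in simple language?', {
    max_new_tokens: 128, // cap the length of the reply
    temperature: 0.7,    // add a little randomness
    do_sample: true,     // sample instead of greedy decoding
  });

  // The pipeline returns an array like [{ generated_text: '...' }]
  console.log(output[0].generated_text);
};

runChat();
```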
Note: these models use WebGPU if it’s available. On mobile devices, you can run them inside a WebView or with native WebGPU (experimental on Android and iOS 17+).
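If you want to check for WebGPU support before committing to a large download, `navigator.gpu` is the standard detection point. This minimal sketch uses only the Web API, nothing specific to Transformers.js:

```js
// Quick capability check before loading a large model.
const hasWebGPU = 'gpu' in navigator;
console.log(hasWebGPU
  ? 'WebGPU available: the model can run on the GPU'
  : 'No WebGPU: inference falls back to WASM on the CPU');
```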
You can integrate this into a React Native app by using a <WebView> to run the model on the phone without an internet connection:
```jsx
import React from 'react';
import { WebView } from 'react-native-webview';

export default function App() {
  return (
    <WebView source={{ uri: 'file:///android_asset/index.html' }} />
  );
}
```
In index.html, you include your model using Transformers.js.
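As a rough sketch, the script inside that index.html could look like the module below: Transformers.js is imported from the jsDelivr CDN (the import style shown in the library’s README), and the result is posted back to React Native via `window.ReactNativeWebView.postMessage`, which you would read with the `onMessage` prop of `<WebView>`. For a truly offline app, bundle the library and model files in your assets instead of fetching them from a CDN.

```js
// Inside a <script type="module"> in index.html.
// For real offline use, ship the library and model in the app's assets
// instead of importing them from a CDN.
import { pipeline } from 'https://cdn.jsdelivr.net/npm/@xenova/transformers';

const classifier = await pipeline(
  'text-classification',
  'Xenova/distilbert-base-uncased-finetuned-sst-2-english'
);
const result = await classifier('This app idea is incredible.');

// Send the result to the React Native side (read it with the
// onMessage prop of <WebView>).
window.ReactNativeWebView?.postMessage(JSON.stringify(result));
```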
BONUS: Use Hugging Face models in .gguf format
If you prefer quantized models, you can combine them with llama.cpp-based engines that run in the browser through WebAssembly (for example, wllama). This lets you run Mistral 7B in Q4 or Q5 quantization in the browser or on a mobile device.
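As an illustration, a sketch with wllama follows. Treat the API names (Wllama, loadModelFromUrl, createCompletion), the WASM paths, and the model URL as assumptions based on that project’s documentation, and verify them against its README:

```js
import { Wllama } from '@wllama/wllama';

// Assumed paths to the engine's WebAssembly binaries; ship them with
// your app so everything works offline.
const CONFIG_PATHS = {
  'single-thread/wllama.wasm': '/wasm/single-thread/wllama.wasm',
  'multi-thread/wllama.wasm': '/wasm/multi-thread/wllama.wasm',
};

const wllama = new Wllama(CONFIG_PATHS);

// Hypothetical URL to a quantized Mistral 7B GGUF file.
await wllama.loadModelFromUrl('https://example.com/mistral-7b-instruct.Q4_K_M.gguf');

const answer = await wllama.createCompletion('Explain Transformers.js briefly.', {
  nPredict: 64, // number of tokens to generate
});
console.log(answer);
```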
Remember that an application like this takes up a lot of space: a 4-bit quantized 7B model alone weighs roughly 4 GB.
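Before triggering a download of that size, you can ask the browser how much storage is available; `navigator.storage.estimate()` is a standard web API for this:

```js
// Check available browser storage before downloading a multi-gigabyte model.
navigator.storage.estimate().then(({ usage, quota }) => {
  console.log(`Using ${(usage / 1e9).toFixed(2)} GB of a ${(quota / 1e9).toFixed(2)} GB quota`);
});
```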
