Instructions to use pearl-ai/Gemma-4-31B-it-pearl with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Inference
- HuggingChat
pearl-ai/Gemma-4-31B-it-pearl
Pearl Gemma 4 instruction-tuned checkpoint for Pearl inference and mining. Like our other Pearl-certified models, it is intended to run with the Pearl vLLM mining plugin so inference can participate in Pearl mining (Proof-of-Useful-Work alongside useful compute). Layout and runtime fields are in config.json.
- Project website: https://pearlresearch.ai
- Pearl repository: https://github.com/pearl-research-labs/pearl
- Miner / vLLM plugin docs: https://github.com/pearl-research-labs/pearl/tree/master/miner
Benchmarks
Results from our evaluation runs (lmms-eval + vLLM, full test sets). Original is google/gemma-4-31B-it; Pearl is this checkpoint.
| Model | GPQA | MMLU | HumanEval (pass@1) | MGSM3 | MMMU-Pro Vision* | Video-MME (short) |
|---|---|---|---|---|---|---|
| Original | 77.27% | 90.93% | 94.70% | 88.62% | 54.57% | 79.0% |
| Pearl | 77.37% | 90.56% | 94.15% | 89.09% | 54.45% | 78.2% |
* MMMU-Pro Vision (Pearl / Original): lmms-eval mmmu_pro_vision, direct-answer prompting, max_new_tokens=256, full test set (1730 samples). Google reports 76.9% on MMMU Pro in the Gemma 4 model card; they do not publish the eval recipe for that figure (prompting, subset, or aggregation).
Pearl mining (vLLM plugin)
Pearl mining means serving through the Pearl miner stack: a pearld node (RPC), pearl-gateway, and the vLLM miner build that loads the Pearl plugin (NoisyGEMM / gateway integration). Details are in the miner README.
Typical flow:
- Run
pearldwith RPC enabled. - Start the Pearl miner / vLLM image or workspace (plugin-enabled vLLM).
- Point the server at this model; gateway + miner components handle mining-side integration.
High-level prerequisites:
- Python 3.12,
uv, CUDA + NVIDIA GPU (see miner docs for supported architectures) - Rust toolchain (for Pearl miner build paths)
- A running
pearldwith RPC credentials for the gateway
Docker (recommended for mining)
From the Pearl repository root:
docker buildx build -t vllm_miner . -f miner/vllm-miner/Dockerfile
docker run --rm -it --gpus all \
-p 8000:8000 -p 8337:8337 -p 8339:8339 \
-e PEARLD_RPC_URL=<PEARLD_URL> \
-e PEARLD_RPC_USER=<RPC_USER> \
-e PEARLD_RPC_PASSWORD=<RPC_PASSWORD> \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--shm-size 8g \
vllm_miner:latest \
pearl-ai/Gemma-4-31B-it-pearl \
--host 0.0.0.0 --port 8000 \
--max-model-len 8192 \
--gpu-memory-utilization 0.9 \
--enforce-eager
Inference with vLLM
Serve from the Hugging Face Hub id (or from a local directory containing this snapshot):
uv run vllm serve pearl-ai/Gemma-4-31B-it-pearl \
--host 0.0.0.0 \
--port 8000 \
--max-model-len 8192 \
--gpu-memory-utilization 0.9 \
--enforce-eager
Flags (same as above, one line):
uv run vllm serve pearl-ai/Gemma-4-31B-it-pearl --host 0.0.0.0 --port 8000 --max-model-len 8192 --gpu-memory-utilization 0.9 --enforce-eager
These commands load the full model (text + vision). Append --language-model-only if you only need text — it disables the vision tower and uses less GPU memory.
Model details
- Architecture:
Gemma4ForConditionalGeneration(model_type:gemma4) - Modalities: Text, Image
License
Use and redistribution are subject to the Gemma license terms from Google; this repository is a Pearl distribution of weights derived from that ecosystem.
Limitations
Models can produce incorrect or unsafe outputs. Validate in your environment before production use.
- Downloads last month
- 15,873