This is a merge of pre-trained language models created using mergekit.
Merge Details
Merge Method
This model was merged using the MoE DELLA merge method, with google/gemma-4-26B-A4B as the base.
Models Merged
The following models were included in the merge:
- google/gemma-4-26B-A4B
- BeaverAI/Orion-26B-A4B-v1b-GGUF
- Darkhn/Gemma-4-26B-A4B-Animus-V14.1-FFT
- Gryphe/Gemma-4-26B-A4B-StyleTune-V2
- Gryphe/Pantheon-Reasoning-26B-A4B-1.1
- ReadyArt/Serenity-26B-A4B-GGUF
- zerofata/G4-MeroMero-26B-A4B
⚙️ Configuration
architecture: Gemma4ForConditionalGeneration
base_model: B:\26B\google_gemma-4-26B-A4B
models:
  - model: B:\26B\BeaverAI_Orion-26B-A4B-v1b-GGUF
    parameters:
      weight:
        - filter: "lm_head"
          value: 0.0
        - filter: "embed_tokens"
          value: 0.0
        - value: 0.2
      density: 0.9
      epsilon: 0.09
  - model: B:\26B\ReadyArt_Serenity-26B-A4B-HB16-Q8_0
    parameters:
      weight:
        - filter: "lm_head"
          value: 0.0
        - filter: "embed_tokens"
          value: 0.0
        - value: 0.2
      density: 0.9
      epsilon: 0.09
  - model: B:\26B\zerofata_G4-MeroMero-26B-A4B
    parameters:
      weight:
        - filter: "lm_head"
          value: 0.0
        - filter: "embed_tokens"
          value: 0.0
        - value: 0.2
      density: 0.9
      epsilon: 0.09
  - model: B:\26B\Darkhn_Gemma-4-26B-A4B-Animus-V14.1-FFT
    parameters:
      weight:
        - filter: "lm_head"
          value: 0.0
        - filter: "embed_tokens"
          value: 0.0
        - value: 0.2
      density: 0.9
      epsilon: 0.09
  - model: B:\26B\Gryphe--Pantheon-Reasoning-26B-A4B-1.1
    parameters:
      weight:
        - filter: "lm_head"
          value: 0.0
        - filter: "embed_tokens"
          value: 0.0
        - value: 0.2
      density: 0.9
      epsilon: 0.09
  - model: B:\26B\Gryphe--Gemma-4-26B-A4B-StyleTune-V2
    parameters:
      weight:
        - filter: "lm_head"
          value: 1.0
        - filter: "embed_tokens"
          value: 1.0
        - value: 0.0
      density: 0.9
      epsilon: 0.09
merge_method: moe_della # v3 patches missing lm_head and embed_tokens
parameters:
  lambda: 1.0
  normalize: false
  int8_mask: false
  rescale: true
  router_strategy: della # average # random_init
  blend_experts: true
dtype: float32
out_dtype: bfloat16
tokenizer:
  source: union
chat_template: auto
name: Goetia 26B A4B v1.4
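The per-model `weight` lists in the configuration above can be read as "first matching filter wins, otherwise fall back to the unfiltered value". The sketch below illustrates that reading in plain Python. It is not mergekit code: `resolve_weight` and the two constant names are hypothetical, and substring matching with first-match-wins is an assumption about how the filters are evaluated.

```python
# Per-model "weight" entries copied from the config above, as (filter, value)
# pairs. A filter of None marks the unconditional fallback entry.
DONOR_WEIGHT = [("lm_head", 0.0), ("embed_tokens", 0.0), (None, 0.2)]
STYLETUNE_WEIGHT = [("lm_head", 1.0), ("embed_tokens", 1.0), (None, 0.0)]


def resolve_weight(entries, tensor_name):
    """Return the value of the first entry whose filter occurs in tensor_name.

    Assumption: filters are plain substring matches and the first match wins.
    """
    for filt, value in entries:
        if filt is None or filt in tensor_name:
            return value
    raise KeyError(f"no weight entry matches {tensor_name!r}")


if __name__ == "__main__":
    for name in (
        "lm_head.weight",
        "model.language_model.embed_tokens.weight",
        "model.language_model.layers.0.mlp.down_proj.weight",
    ):
        donor = resolve_weight(DONOR_WEIGHT, name)
        style = resolve_weight(STYLETUNE_WEIGHT, name)
        print(f"{name}: donor={donor} styletune={style}")
```

Under this reading, the five 0.2-weighted models share the body tensors while StyleTune-V2 alone supplies `lm_head` and `embed_tokens`.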
This model is NOT currently uncensored. It still produces refusals and has standard jailbreak resistance.
However, it can easily be decensored using Heretic; I have not yet had time to uncensor v1.4. To ablate using ARA (recommended), I'd start with the JSON below and follow the steps outlined in the v1.3 readme.
🧙 Heretic Grimoire
reproduce.json
{
  "version": "1.2.0-dev",
  "base_model": "Naphula/Goetia-26B-A4B-v1.3",
  "timestamp": "2026-06-19T08:04:47Z",
  "metrics": {
    "kl_divergence": 0.030937770381569862,
    "refusals": 3,
    "n_bad_prompts": 100
  },
  "parameters": {
    "start_layer_index": "14",
    "end_layer_index": "26",
    "preserve_good_behavior_weight": "1.4404",
    "steer_bad_behavior_weight": "0.0100",
    "overcorrect_relative_weight": "0.9144",
    "neighbor_count": "15"
  },
  "target_components": [
    "attn.o_proj"
  ],
  "hardware": "RTX 6000 Blackwell (96GB)"
}
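Note that every value under `parameters` in reproduce.json is stored as a string, while the `metrics` are real numbers. The sketch below shows one way to read the file back into typed values and derive the refusal rate. It is only an illustration, not Heretic's own loader: `load_parameters` and `INT_KEYS` are hypothetical names, and the JSON is an excerpt of the file above.

```python
import json

# Excerpt of reproduce.json as shown above (only the fields used here).
REPRODUCE_JSON = """
{
  "metrics": {
    "kl_divergence": 0.030937770381569862,
    "refusals": 3,
    "n_bad_prompts": 100
  },
  "parameters": {
    "start_layer_index": "14",
    "end_layer_index": "26",
    "preserve_good_behavior_weight": "1.4404",
    "steer_bad_behavior_weight": "0.0100",
    "overcorrect_relative_weight": "0.9144",
    "neighbor_count": "15"
  }
}
"""

# Parameters that are whole numbers; everything else is treated as a float.
INT_KEYS = {"start_layer_index", "end_layer_index", "neighbor_count"}


def load_parameters(text):
    """Convert the string-valued parameters to numbers and compute the refusal rate."""
    data = json.loads(text)
    params = {
        key: int(value) if key in INT_KEYS else float(value)
        for key, value in data["parameters"].items()
    }
    metrics = data["metrics"]
    refusal_rate = metrics["refusals"] / metrics["n_bad_prompts"]
    return params, refusal_rate


if __name__ == "__main__":
    params, refusal_rate = load_parameters(REPRODUCE_JSON)
    print(params)
    print(f"refusal rate: {refusal_rate:.0%}")
```

The metrics in this file were recorded against the v1.3 `base_model`, so they describe that run rather than v1.4.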
💡 Details
v1.4 was made the same way as v1.3, except with additional patches and fewer models. See this page and the section below for more details.
Critical patch notes for G4 26B A4B moe_della merges
auto.py fix
#if _get_tied_weight_keys is None:
#    LOG.warning(
#        "Unable to get tied weights - incompatible transformers version",
#    )
#    tied_keys = None
#else:
#    tied_keys = _get_tied_weight_keys(model)
####
# Force untying for Gemma 4 configurations to ensure lm_head is compiled
tied_keys = None
####
if ignore_on_save is not None:
    ignore_on_save = set(ignore_on_save)
plan.py fix
# Original code, commented out:
#for model, w_in in zip(models, weights_in):
#    index = LoaderCache().get(model).index
#    if any(
#        name in index.tensor_paths
#        for name in [w_in.name] + (w_in.aliases or [])
#    ):
#        any_weight = True
#        break
# Patched code: wrap the aliases in list() so they can be concatenated with [w_in.name]
for model, w_in in zip(models, weights_in):
    index = LoaderCache().get(model).index
    if any(
        name in index.tensor_paths
        for name in [w_in.name] + list(w_in.aliases or [])
    ):
        any_weight = True
        break
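The only change in the plan.py fix is the `list(...)` wrapper around the aliases. The snippet below reproduces the failure it avoids, assuming the aliases reach this code as a tuple rather than a list (an assumption about the parsed architecture definition). Both function names are hypothetical.

```python
def candidate_names_old(name, aliases):
    """Unpatched expression: fails when aliases is a tuple."""
    return [name] + (aliases or [])


def candidate_names_fixed(name, aliases):
    """Patched expression: accepts a list, a tuple, or None."""
    return [name] + list(aliases or [])


if __name__ == "__main__":
    aliases = ("model.language_model.embed_tokens.weight",)  # tuple, not list

    try:
        candidate_names_old("lm_head.weight", aliases)
    except TypeError as exc:
        # Python refuses to concatenate a list with a tuple.
        print("old expression failed:", exc)

    print(candidate_names_fixed("lm_head.weight", aliases))
    print(candidate_names_fixed("lm_head.weight", None))
```

With no aliases the two expressions behave the same, which would explain why the problem only shows up once a weight definition carries an `aliases` entry.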
gemma4.json
{
  "model_type": "gemma4",
  "architectures": [
    "Gemma4ForConditionalGeneration"
  ],
  "num_layers_config_key": "text_config.num_hidden_layers",
  "vocab_size_config_key": "text_config.vocab_size",
  "pre_weights": [
    { "name": "model.language_model.embed_tokens.weight", "is_embed": true },
    { "name": "model.embed_vision.embedding_projection.weight", "optional": true },
    { "name": "model.vision_tower.std_bias", "optional": true },
    { "name": "model.vision_tower.std_scale", "optional": true },
    { "name": "model.vision_tower.patch_embedder.input_proj.weight", "optional": true },
    { "name": "model.vision_tower.patch_embedder.position_embedding_table", "optional": true }
  ],
  "layer_templates": {
    "weights": [
      { "name": "model.language_model.layers.${layer_index}.self_attn.q_proj.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.self_attn.k_proj.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.self_attn.v_proj.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.self_attn.o_proj.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.self_attn.q_norm.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.self_attn.k_norm.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.mlp.gate_proj.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.mlp.up_proj.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.mlp.down_proj.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.input_layernorm.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.post_attention_layernorm.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.pre_feedforward_layernorm.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.post_feedforward_layernorm.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.post_feedforward_layernorm_1.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.post_feedforward_layernorm_2.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.pre_feedforward_layernorm_2.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.router.per_expert_scale", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.router.proj.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.router.scale", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.experts.gate_up_proj", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.experts.down_proj", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.layer_scalar", "optional": true }
    ]
  },
  "post_weights": [
    { "name": "model.language_model.norm.weight" },
    {
      "name": "lm_head.weight",
      "is_embed": true,
      "optional": true,
      "aliases": ["model.language_model.embed_tokens.weight"]
    }
  ]
}
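Each `layer_templates` entry in gemma4.json is a name pattern with a `${layer_index}` placeholder that is filled in once per layer. The sketch below expands a few of those patterns with Python's `string.Template`, which happens to use the same `${...}` syntax. It only illustrates the idea, and mergekit's own substitution code may differ. `expand_layer` is a hypothetical helper, and the layer count of 2 is a stand-in for the value read from `text_config.num_hidden_layers`.

```python
from string import Template

# A few of the layer templates from gemma4.json above.
LAYER_TEMPLATES = [
    "model.language_model.layers.${layer_index}.self_attn.q_proj.weight",
    "model.language_model.layers.${layer_index}.router.proj.weight",
    "model.language_model.layers.${layer_index}.experts.gate_up_proj",
]


def expand_layer(templates, layer_index):
    """Fill ${layer_index} in every template for one layer."""
    return [Template(t).substitute(layer_index=layer_index) for t in templates]


if __name__ == "__main__":
    num_layers = 2  # hypothetical; the real value comes from the model config
    for layer_index in range(num_layers):
        for tensor_name in expand_layer(LAYER_TEMPLATES, layer_index):
            print(tensor_name)
```

Because nearly every entry is marked `"optional": true`, a pattern that expands to a tensor name missing from a given checkpoint should be skipped rather than treated as an error.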
yaml
Identical to the configuration listed under ⚙️ Configuration above.
With the above patches applied, the merge now appears to run correctly. It remains to be seen whether it generates the full-size 49.4 GB safetensors output instead of 47 GB.
[MoE_DELLA Audit] Layer: lm_head.weight | Lambda=1.00
[BASE] google_gemma-4-26B-A4B
Darkhn_Gemma-4-26B-A4B-Animus-V14.1-FFT : 0.0% (W:0.00 D:0.90 E:0.09 N:68.94)
Gryphe--Gemma-4-26B-A4B-StyleTune-V2 : ██████████████████████████████████████████████████ 100.0% (W:1.00 D:0.90 E:0.09 N:68.94)
Gryphe--Pantheon-Reasoning-26B-A4B-1.1 : 0.0% (W:0.00 D:0.90 E:0.09 N:68.79)
zerofata_G4-MeroMero-26B-A4B : 0.0% (W:0.00 D:0.90 E:0.09 N:68.94)
Executing graph: 20%|███████████████████████████████████▊ | 1352/6635 [03:53<2:01:35, 1.38s/it]
WARNING:mergekit.graph:Fast path OOM, falling back to chunking
WARNING:mergekit.graph:OOM at chunk 64, reducing to 32 (attempt 1, progress: 0/128)
WARNING:mergekit.graph:OOM at chunk 32, reducing to 16 (attempt 2, progress: 0/128)
Executing graph: 20%|████████████████████████████████████ | 1360/6635 [04:04<1:03:57, 1.37it/s]
WARNING:mergekit.graph:Fast path OOM, falling back to chunking
WARNING:mergekit.graph:OOM at chunk 64, reducing to 32 (attempt 1, progress: 0/128)
WARNING:mergekit.graph:OOM at chunk 32, reducing to 16 (attempt 2, progress: 0/128)
Executing graph: 21%|█████████████████████████████████████▏ | 1384/6635 [04:19<13:57, 6.27it/s]
[MoE_DELLA Audit] Layer: model.language_model.layers.0.mlp.down_proj.weight | Lambda=1.00
[BASE] google_gemma-4-26B-A4B
BeaverAI_Orion-26B-A4B-v1b-GGUF : ██████████ 20.1% (W:0.20 D:0.90 E:0.09 N:47.15)
Darkhn_Gemma-4-26B-A4B-Animus-V14.1-FFT : █████████ 19.9% (W:0.20 D:0.90 E:0.09 N:46.66)
Gryphe--Gemma-4-26B-A4B-StyleTune-V2 : 0.0% (W:0.00 D:0.90 E:0.09 N:46.66)
Gryphe--Pantheon-Reasoning-26B-A4B-1.1 : ██████████ 20.1% (W:0.20 D:0.90 E:0.09 N:47.17)
ReadyArt_Serenity-26B-A4B-HB16-Q8_0 : █████████ 19.9% (W:0.20 D:0.90 E:0.09 N:46.66)
zerofata_G4-MeroMero-26B-A4B : █████████ 19.9% (W:0.20 D:0.90 E:0.09 N:46.66)
Executing graph: 21%|█████████████████████████████████████▎ | 1392/6635 [04:20<11:58, 7.29it/s]
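The percentages in the audit output above are consistent with each model's share of W × N, that is, its configured weight multiplied by the N value printed next to it. The sketch below checks that against the `down_proj` block. The formula is inferred from the logged numbers, not taken from the merge method's source, and reading N as a per-model norm is an assumption. `contribution_shares` is a hypothetical name.

```python
# (model, W, N) triples copied from the down_proj audit block above.
AUDIT = [
    ("BeaverAI_Orion-26B-A4B-v1b-GGUF", 0.20, 47.15),
    ("Darkhn_Gemma-4-26B-A4B-Animus-V14.1-FFT", 0.20, 46.66),
    ("Gryphe--Gemma-4-26B-A4B-StyleTune-V2", 0.00, 46.66),
    ("Gryphe--Pantheon-Reasoning-26B-A4B-1.1", 0.20, 47.17),
    ("ReadyArt_Serenity-26B-A4B-HB16-Q8_0", 0.20, 46.66),
    ("zerofata_G4-MeroMero-26B-A4B", 0.20, 46.66),
]


def contribution_shares(rows):
    """Return each model's percentage share of the total W * N."""
    total = sum(w * n for _, w, n in rows)
    if total == 0:
        return {name: 0.0 for name, _, _ in rows}
    return {name: 100.0 * w * n / total for name, w, n in rows}


if __name__ == "__main__":
    for name, share in contribution_shares(AUDIT).items():
        print(f"{name:45s} {share:5.1f}%")
```

Applying the same calculation to the `lm_head.weight` block gives StyleTune-V2 the whole share, since it is the only model with a non-zero weight there.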