📜 Goetia 26B A4B v1.4

Goetia Grimoire

🧙‍♀ The Invocation

This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged using the MoE DELLA merge method, with google/gemma-4-26B-A4B as the base.
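For orientation, DELLA-style merging prunes each model's delta from the base by magnitude rank before the deltas are combined. The sketch below shows one way the density and epsilon values from the configuration further down could map to per-parameter keep probabilities, assuming a linear ramp from density - epsilon (smallest magnitude) to density + epsilon (largest). The ramp shape and the helper name keep_probabilities are assumptions for illustration, not taken from the moe_della source.

```python
def keep_probabilities(magnitudes, density, epsilon):
    """Assign each delta a keep probability by magnitude rank (illustrative only).

    Smallest |delta| gets density - epsilon, largest gets density + epsilon,
    with a linear ramp in between. This is an assumed shape, not moe_della code.
    """
    n = len(magnitudes)
    if n == 1:
        return [density]
    # order[0] is the index of the smallest-magnitude delta
    order = sorted(range(n), key=lambda i: abs(magnitudes[i]))
    probs = [0.0] * n
    for rank, idx in enumerate(order):
        frac = rank / (n - 1)
        probs[idx] = (density - epsilon) + 2.0 * epsilon * frac
    return probs


# Toy deltas (made up); density and epsilon are the values used in this merge.
deltas = [0.02, -0.5, 0.1, -0.01, 0.3]
probs = keep_probabilities(deltas, density=0.9, epsilon=0.09)

# With rescale enabled, a surviving delta is divided by its keep probability
# so that its expected contribution is unchanged.
rescaled_if_kept = [d / p for d, p in zip(deltas, probs)]
```

With density 0.9 and epsilon 0.09, this ramp keeps every probability inside the 0.81 to 0.99 window, so even the smallest deltas survive most of the time.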

Models Merged

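The following models were included in the merge (named as they appear in the configuration below):

BeaverAI_Orion-26B-A4B-v1b-GGUF
ReadyArt_Serenity-26B-A4B-HB16-Q8_0
zerofata_G4-MeroMero-26B-A4B
Darkhn_Gemma-4-26B-A4B-Animus-V14.1-FFT
Gryphe--Pantheon-Reasoning-26B-A4B-1.1
Gryphe--Gemma-4-26B-A4B-StyleTune-V2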

⚙️ Configuration

architecture: Gemma4ForConditionalGeneration
base_model: B:\26B\google_gemma-4-26B-A4B
models:
  - model: B:\26B\BeaverAI_Orion-26B-A4B-v1b-GGUF
    parameters:
      weight:
        - filter: "lm_head"
          value: 0.0
        - filter: "embed_tokens"
          value: 0.0
        - value: 0.2
      density: 0.9
      epsilon: 0.09
  - model: B:\26B\ReadyArt_Serenity-26B-A4B-HB16-Q8_0
    parameters:
      weight:
        - filter: "lm_head"
          value: 0.0
        - filter: "embed_tokens"
          value: 0.0
        - value: 0.2
      density: 0.9
      epsilon: 0.09
  - model: B:\26B\zerofata_G4-MeroMero-26B-A4B
    parameters:
      weight:
        - filter: "lm_head"
          value: 0.0
        - filter: "embed_tokens"
          value: 0.0
        - value: 0.2
      density: 0.9
      epsilon: 0.09
  - model: B:\26B\Darkhn_Gemma-4-26B-A4B-Animus-V14.1-FFT
    parameters:
      weight:
        - filter: "lm_head"
          value: 0.0
        - filter: "embed_tokens"
          value: 0.0
        - value: 0.2
      density: 0.9
      epsilon: 0.09
  - model: B:\26B\Gryphe--Pantheon-Reasoning-26B-A4B-1.1
    parameters:
      weight:
        - filter: "lm_head"
          value: 0.0
        - filter: "embed_tokens"
          value: 0.0
        - value: 0.2
      density: 0.9
      epsilon: 0.09
  - model: B:\26B\Gryphe--Gemma-4-26B-A4B-StyleTune-V2
    parameters:
      weight:
        - filter: "lm_head"
          value: 1.0
        - filter: "embed_tokens"
          value: 1.0
        - value: 0.0
      density: 0.9
      epsilon: 0.09
merge_method: moe_della # v3 patches missing lm_head and embed_tokens
parameters:  
  lambda: 1.0
  normalize: false
  int8_mask: false
  rescale: true
  router_strategy: della # average # random_init
  blend_experts: true
dtype: float32
out_dtype: bfloat16
tokenizer:
  source: union
chat_template: auto
name: Goetia 26B A4B v1.4
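The per-model weight lists above route lm_head and embed_tokens entirely to StyleTune-V2 while the other five models share the remaining tensors at 0.2 each. A minimal sketch of that resolution follows, assuming a rule applies when its filter string occurs in the tensor name and the first matching rule wins. resolve_weight is a hypothetical helper, not a mergekit function.

```python
def resolve_weight(tensor_name, rules):
    """Return the value of the first rule whose filter matches tensor_name.

    A rule without a "filter" key acts as the default. Substring matching
    and first-match-wins are assumptions made for this illustration.
    """
    for rule in rules:
        pattern = rule.get("filter")
        if pattern is None or pattern in tensor_name:
            return rule["value"]
    raise KeyError(f"no rule matches {tensor_name!r}")


# Rules shared by the five 0.2-weight models in the config
DONOR_RULES = [
    {"filter": "lm_head", "value": 0.0},
    {"filter": "embed_tokens", "value": 0.0},
    {"value": 0.2},
]

# Rules for Gryphe--Gemma-4-26B-A4B-StyleTune-V2
STYLETUNE_RULES = [
    {"filter": "lm_head", "value": 1.0},
    {"filter": "embed_tokens", "value": 1.0},
    {"value": 0.0},
]

for name in (
    "lm_head.weight",
    "model.language_model.embed_tokens.weight",
    "model.language_model.layers.0.mlp.down_proj.weight",
):
    print(name, resolve_weight(name, DONOR_RULES), resolve_weight(name, STYLETUNE_RULES))
```

Under these assumptions the resolved values line up with the W: column in the audit log later in this card: StyleTune-V2 at 1.00 for lm_head.weight and 0.00 for the layer weights, the other models at 0.00 and 0.20 respectively.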

This model is NOT currently uncensored: it still refuses some prompts and shows standard jailbreak resistance.

However, it can easily be decensored using Heretic; I have not had time to uncensor v1.4 yet. To ablate using ARA (recommended), I'd start with the JSON below, which was recorded against v1.3, and follow the steps outlined in the v1.3 readme.

🧙 Heretic Grimoire

reproduce.json

{
  "version": "1.2.0-dev",
  "base_model": "Naphula/Goetia-26B-A4B-v1.3",
  "timestamp": "2026-06-19T08:04:47Z",
  "metrics": {
    "kl_divergence": 0.030937770381569862,
    "refusals": 3,
    "n_bad_prompts": 100
  },
  "parameters": {
    "start_layer_index": "14",
    "end_layer_index": "26",
    "preserve_good_behavior_weight": "1.4404",
    "steer_bad_behavior_weight": "0.0100",
    "overcorrect_relative_weight": "0.9144",
    "neighbor_count": "15"
  },
  "target_components": [
    "attn.o_proj"
  ],
  "hardware": "RTX 6000 Blackwell (96GB)"
}
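The parameters in reproduce.json are stored as strings, so they need converting before any arithmetic. The sketch below parses an excerpt of the file above, converts the parameters to numbers, and computes the refusal rate from the metrics. It only reads the JSON and does not call Heretic; load_run and to_number are hypothetical helper names.

```python
import json

# Excerpt of reproduce.json, values copied from the file above
REPRODUCE_JSON = """
{
  "metrics": {
    "kl_divergence": 0.030937770381569862,
    "refusals": 3,
    "n_bad_prompts": 100
  },
  "parameters": {
    "start_layer_index": "14",
    "end_layer_index": "26",
    "preserve_good_behavior_weight": "1.4404",
    "steer_bad_behavior_weight": "0.0100",
    "overcorrect_relative_weight": "0.9144",
    "neighbor_count": "15"
  }
}
"""


def to_number(text):
    """Convert a numeric string to int when possible, otherwise float."""
    try:
        return int(text)
    except ValueError:
        return float(text)


def load_run(raw):
    """Parse the JSON text and return numeric parameters plus the refusal rate."""
    data = json.loads(raw)
    params = {key: to_number(value) for key, value in data["parameters"].items()}
    metrics = data["metrics"]
    refusal_rate = metrics["refusals"] / metrics["n_bad_prompts"]
    return params, refusal_rate


params, refusal_rate = load_run(REPRODUCE_JSON)
print(params)
print(f"refusal rate: {refusal_rate:.1%}")
```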

💡 Details

v1.4 was made the same way as v1.3, except with additional patches and fewer models. See this page and the section below for more details.


Critical patch notes for G4 26B A4B moe_della merges

auto.py fix

    #if _get_tied_weight_keys is None:
    #    LOG.warning(
    #        "Unable to get tied weights - incompatible transformers version",
    #    )
    #    tied_keys = None
    #else:
    #    tied_keys = _get_tied_weight_keys(model)
    ####
    # Force untying for Gemma 4 configurations to ensure lm_head is compiled
    tied_keys = None
    ####
    if ignore_on_save is not None:
        ignore_on_save = set(ignore_on_save)

plan.py fix

            #for model, w_in in zip(models, weights_in):
            #    index = LoaderCache().get(model).index
            #    if any(
            #        name in index.tensor_paths
            #        for name in [w_in.name] + (w_in.aliases or [])
            #    ):
            #        any_weight = True
            #        break
            for model, w_in in zip(models, weights_in):
                index = LoaderCache().get(model).index
                if any(
                    name in index.tensor_paths
                    for name in [w_in.name] + list(w_in.aliases or [])
                ):
                    any_weight = True
                    break
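The only functional change in the plan.py fix is wrapping the aliases in list(...). A small sketch of why that matters follows, assuming aliases can arrive as a tuple (the patch does not say which type it is): Python refuses to concatenate a list with a tuple. candidate_names is a hypothetical helper used only to show the pattern.

```python
def candidate_names(name, aliases):
    """Build the list of tensor names to look up, tolerating tuple or None aliases."""
    return [name] + list(aliases or [])


# Assumed shape of the data: the primary name plus a tuple of aliases
name = "lm_head.weight"
aliases = ("model.language_model.embed_tokens.weight",)

# The unpatched expression fails when aliases is a tuple
try:
    _ = [name] + (aliases or [])
    unpatched_error = None
except TypeError as exc:
    unpatched_error = exc

# The patched expression works for a tuple, a list, or None
patched = candidate_names(name, aliases)
no_aliases = candidate_names(name, None)

print("unpatched:", repr(unpatched_error))
print("patched:", patched)
```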

gemma4.json

{
  "model_type": "gemma4",
  "architectures": [
    "Gemma4ForConditionalGeneration"
  ],
  "num_layers_config_key": "text_config.num_hidden_layers",
  "vocab_size_config_key": "text_config.vocab_size",
  "pre_weights": [
    { "name": "model.language_model.embed_tokens.weight", "is_embed": true },
    { "name": "model.embed_vision.embedding_projection.weight", "optional": true },
    { "name": "model.vision_tower.std_bias", "optional": true },
    { "name": "model.vision_tower.std_scale", "optional": true },
    { "name": "model.vision_tower.patch_embedder.input_proj.weight", "optional": true },
    { "name": "model.vision_tower.patch_embedder.position_embedding_table", "optional": true }
  ],
  "layer_templates": {
    "weights": [
      { "name": "model.language_model.layers.${layer_index}.self_attn.q_proj.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.self_attn.k_proj.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.self_attn.v_proj.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.self_attn.o_proj.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.self_attn.q_norm.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.self_attn.k_norm.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.mlp.gate_proj.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.mlp.up_proj.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.mlp.down_proj.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.input_layernorm.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.post_attention_layernorm.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.pre_feedforward_layernorm.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.post_feedforward_layernorm.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.post_feedforward_layernorm_1.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.post_feedforward_layernorm_2.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.pre_feedforward_layernorm_2.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.router.per_expert_scale", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.router.proj.weight", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.router.scale", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.experts.gate_up_proj", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.experts.down_proj", "optional": true },
      { "name": "model.language_model.layers.${layer_index}.layer_scalar", "optional": true }
    ]
  },
  "post_weights": [
    { "name": "model.language_model.norm.weight" },
    { 
      "name": "lm_head.weight", 
      "is_embed": true, 
      "optional": true,
      "aliases": ["model.language_model.embed_tokens.weight"]
    }
  ]
}
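Each entry under layer_templates stands for one tensor per layer, with ${layer_index} filled in for every layer index. The sketch below expands two of the templates above using Python's string.Template, which happens to share the ${...} placeholder syntax. mergekit's own expansion code may work differently, and expand_layer_names and the layer count of 2 are illustrative assumptions.

```python
from string import Template

# Two templates copied from the layer_templates section above
LAYER_TEMPLATES = [
    "model.language_model.layers.${layer_index}.mlp.down_proj.weight",
    "model.language_model.layers.${layer_index}.experts.gate_up_proj",
]


def expand_layer_names(templates, num_layers):
    """Return the concrete tensor names for layers 0..num_layers-1."""
    names = []
    for layer_index in range(num_layers):
        for template in templates:
            names.append(Template(template).substitute(layer_index=layer_index))
    return names


# num_layers=2 is a toy value; the real count comes from text_config.num_hidden_layers
expanded = expand_layer_names(LAYER_TEMPLATES, num_layers=2)
for tensor_name in expanded:
    print(tensor_name)
```

The first expanded name, model.language_model.layers.0.mlp.down_proj.weight, is the same tensor that appears in the audit log further down.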

yaml

Identical to the merge config listed under ⚙️ Configuration above.

With the above patches applied, the merge now appears to run correctly. The remaining check is whether it writes the full-size 49.4GB of safetensors instead of 47GB.

[MoE_DELLA Audit] Layer: lm_head.weight | Lambda=1.00
  [BASE] google_gemma-4-26B-A4B
  Darkhn_Gemma-4-26B-A4B-Animus-V14.1-FFT           :                                                      0.0% (W:0.00 D:0.90 E:0.09 N:68.94)
  Gryphe--Gemma-4-26B-A4B-StyleTune-V2              : ██████████████████████████████████████████████████ 100.0% (W:1.00 D:0.90 E:0.09 N:68.94)
  Gryphe--Pantheon-Reasoning-26B-A4B-1.1            :                                                      0.0% (W:0.00 D:0.90 E:0.09 N:68.79)
  zerofata_G4-MeroMero-26B-A4B                      :                                                      0.0% (W:0.00 D:0.90 E:0.09 N:68.94)
Executing graph:  20%| 1352/6635 [03:53<2:01:35,  1.38s/it]
WARNING:mergekit.graph:Fast path OOM, falling back to chunking
WARNING:mergekit.graph:OOM at chunk 64, reducing to 32 (attempt 1, progress: 0/128)
WARNING:mergekit.graph:OOM at chunk 32, reducing to 16 (attempt 2, progress: 0/128)
Executing graph:  20%| 1360/6635 [04:04<1:03:57,  1.37it/s]
WARNING:mergekit.graph:Fast path OOM, falling back to chunking
WARNING:mergekit.graph:OOM at chunk 64, reducing to 32 (attempt 1, progress: 0/128)
WARNING:mergekit.graph:OOM at chunk 32, reducing to 16 (attempt 2, progress: 0/128)
Executing graph:  21%| 1384/6635 [04:19<13:57,  6.27it/s]
[MoE_DELLA Audit] Layer: model.language_model.layers.0.mlp.down_proj.weight | Lambda=1.00
  [BASE] google_gemma-4-26B-A4B
  BeaverAI_Orion-26B-A4B-v1b-GGUF                   : ██████████                                          20.1% (W:0.20 D:0.90 E:0.09 N:47.15)
  Darkhn_Gemma-4-26B-A4B-Animus-V14.1-FFT           : █████████                                           19.9% (W:0.20 D:0.90 E:0.09 N:46.66)
  Gryphe--Gemma-4-26B-A4B-StyleTune-V2              :                                                      0.0% (W:0.00 D:0.90 E:0.09 N:46.66)
  Gryphe--Pantheon-Reasoning-26B-A4B-1.1            : ██████████                                          20.1% (W:0.20 D:0.90 E:0.09 N:47.17)
  ReadyArt_Serenity-26B-A4B-HB16-Q8_0               : █████████                                           19.9% (W:0.20 D:0.90 E:0.09 N:46.66)
  zerofata_G4-MeroMero-26B-A4B                      : █████████                                           19.9% (W:0.20 D:0.90 E:0.09 N:46.66)
Executing graph:  21%| 1392/6635 [04:20<11:58,  7.29it/s]
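Once the merge finishes, one way to check whether lm_head and embed_tokens made it into the output is to inspect the shard index. The sketch below assumes the output folder contains the usual model.safetensors.index.json with a weight_map and a metadata.total_size field. summarize_index is a hypothetical helper, and the sample index is a tiny made-up stand-in, not real output from this merge.

```python
import json
from pathlib import Path


def summarize_index(index):
    """Report whether the head and embedding tensors are listed, plus total size in GB."""
    weight_map = index.get("weight_map", {})
    total_size = index.get("metadata", {}).get("total_size")
    return {
        "has_lm_head": "lm_head.weight" in weight_map,
        "has_embed_tokens": "model.language_model.embed_tokens.weight" in weight_map,
        "total_gb": None if total_size is None else total_size / 1e9,
    }


def summarize_index_file(output_dir):
    """Load model.safetensors.index.json from output_dir and summarize it."""
    path = Path(output_dir) / "model.safetensors.index.json"
    with path.open("r", encoding="utf-8") as handle:
        return summarize_index(json.load(handle))


# Made-up miniature index for demonstration only (sizes are placeholders)
sample_index = {
    "metadata": {"total_size": 2_000_000_000},
    "weight_map": {
        "model.language_model.embed_tokens.weight": "model-00001-of-00002.safetensors",
        "lm_head.weight": "model-00002-of-00002.safetensors",
    },
}
report = summarize_index(sample_index)
print(report)
```

On a real run, summarize_index_file would be pointed at the merge output directory. A missing lm_head.weight entry there would suggest the tied-weights patch did not take effect.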