🔄 In a Training Loop

Michał Wiliński

MWilinski

https://michal-wilinski.com

AI & ML interests

Machine Learning, Reinforcement Learning

Recent Activity

updated a model about 21 hours ago

MWilinski/qwen2.5-3b-gail

updated a model 2 days ago

MWilinski/qwen2.5-3b-sft-irl

liked a Space 3 days ago

gemma-challenge/gemma-interactions-view

Papers 3

arxiv:2505.13291

arxiv:2502.06037

arxiv:2409.13530

spaces 3

models 10

MWilinski/qwen2.5-3b-gail-pirate

Updated May 5

MWilinski/qwen2.5-3b-sft-pirate

Updated May 5

MWilinski/qwen2.5-3b-gail-frozen

Updated Apr 18

MWilinski/qwen2.5-3b-gail-unfrozen

Updated Apr 18

MWilinski/qwen2.5-3b-dpo-irl

Updated Apr 16

MWilinski/qwen2.5-3b-sft-irl

Updated Apr 16

MWilinski/qwen2.5-3b-gail

Updated Mar 27

MWilinski/dro-v-qwen3-1.7b-paperlike

Updated Mar 13

MWilinski/dro-qwen3-1.7b-full-fixed-tau

Updated Feb 27

MWilinski/dro-qwen3-1.7b-full

Updated Feb 27

datasets 19

MWilinski/rlhf-irl-pirate-expert

Viewer • Updated Apr 30 • 6k • 6

MWilinski/rlhf-irl

Viewer • Updated Apr 15 • 14k • 82

MWilinski/hh-rlhf-helpful-base-rollouts-gpt-oss-20b-diverse-openrouter

Viewer • Updated Mar 24 • 200 • 31

MWilinski/hh-rlhf-harmless-base-rollouts-gpt-oss-20b-diverse-openrouter

Viewer • Updated Mar 24 • 200 • 11

MWilinski/hh-rlhf-irl

Viewer • Updated Mar 23 • 10k • 7

MWilinski/hh-rlhf-helpful-base-rollouts-gpt-5.1-policy

Viewer • Updated Mar 10 • 2k • 9

MWilinski/hh-rlhf-harmless-base-rollouts-gpt-5.1-policy

Viewer • Updated Mar 10 • 2k • 11

MWilinski/hh-rlhf-helpful-base-rollouts-gpt-5.1-child

Viewer • Updated Mar 10 • 1.5k • 28

MWilinski/hh-rlhf-helpful-base-rollouts-gpt-5.1-adult

Viewer • Updated Mar 10 • 1.5k • 16

MWilinski/hh-rlhf-harmless-base-rollouts-gpt-5.1-adult

Viewer • Updated Mar 10 • 1.5k • 21

spaces 3

Urban Autonomy Instance Segmentation

HF-Docs-QA

bit
