AI & ML interests
None yet
Organizations
Models 25
Anna4242/qwen25-7b-multihop-grpo-checkpoint-200
8B • Updated • 1
Anna4242/qwen25-7b-singlehop-grpo-checkpoint-200
8B • Updated • 4
Anna4242/qwen25-3b-instruct-grpo-merged
3B • Updated • 1
Anna4242/qwen25-3b-base-grpo
Text Generation
• Updated • 2
Anna4242/qwen25-7b-full-sft-multihop
8B • Updated • 2
Anna4242/qwen25-3b-full-sft-multihop
3B • Updated • 1
Anna4242/qwen25-7b-sft-grpo-checkpoint-200
Reinforcement Learning
• Updated
Anna4242/qwen25-3b-original-sft-ep1-grpo-checkpoint-200
Text Generation
• Updated • 2
Anna4242/Qwen2.5-7B-Instruct-onlyrl-step-1000
8B • Updated • 1
Anna4242/Qwen2.5-7B-Instruct-Singlehop-SFT
8B • Updated • 2