Flint

Show HN: Flint – A 30B model fine-tuned for less repetition

43 AI Score

Show_hn other Added Apr 16, 2026

Details

Sector

other

Total Funding

Last Round

About

Frontier LLMs show very little output diversity, even for open-ended queries. We built Flint to see if we could reverse this. It's a fine-tuned Qwen3 30B model specifically trained to produce higher-entropy output when asked open-ended questions.

Flint significantly increases the NoveltyBench score compared to the base model, without significantly reducing the score on non-creative benchmarks like MMLU-STEM.

This shows that divergence tuning doesn't have to be a tax on base capabilities.

Flint scores 7.47/10 on NoveltyBench, while most frontier models score between 1.8 and 3.2.
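To make the "higher entropy" idea concrete, here is a minimal sketch of one way to quantify output diversity: sample the same open-ended prompt several times and compute the Shannon entropy of the distinct responses. This is an illustration only. It is not NoveltyBench's scoring method and not Flint's training objective, and the function names (`normalize`, `response_entropy`, `distinct_fraction`) are hypothetical. Treating responses as identical only when they match exactly after normalization is a crude assumption; a real evaluation would likely group semantically equivalent answers.

```python
import math
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants count as equal."""
    return " ".join(text.lower().split())


def response_entropy(responses: list[str]) -> float:
    """Shannon entropy (in bits) of the empirical distribution of responses.

    Assumption: two responses are "the same" only if they match exactly
    after normalization. 0.0 means every sample was identical.
    """
    if not responses:
        raise ValueError("need at least one response")
    counts = Counter(normalize(r) for r in responses)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def distinct_fraction(responses: list[str]) -> float:
    """Fraction of samples that are unique after normalization."""
    if not responses:
        raise ValueError("need at least one response")
    return len({normalize(r) for r in responses}) / len(responses)


if __name__ == "__main__":
    # Hypothetical samples for one prompt, e.g. "Name a color."
    repetitive = ["Blue", "blue", "Blue ", "blue"]
    diverse = ["Blue", "Teal", "Ochre", "Magenta"]
    print(response_entropy(repetitive), distinct_fraction(repetitive))
    print(response_entropy(diverse), distinct_fraction(diverse))
```

Under this proxy, a model that always returns the same answer scores zero entropy, and a model that returns a different answer on every sample reaches the maximum of log2(n) bits for n samples.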

AI Score Reasoning

Flint addresses a legitimate technical gap in LLMs (the lack of creative diversity) with impressive benchmark results that suggest a high degree of technical proficiency. However, the project currently lacks commercial traction, team visibility, and a clear moat against frontier labs that could implement similar divergence tuning at scale.

Source

Show_hn — View original →