stepfun 3.5 flash pls
hi Unsloth org, we need StepFun 3.5 Flash, that's really important for us.
really need it
Step is already quantized and I'm using it right now. You only need to replace the files in the llama bin folder with the ones from the fresh edit branch to support it. If you don't like llama.cpp's native web-browser server chat (which isn't saved), the same trick works in oobabooga: just replace the llama files in its bin folder (a rough sketch of that swap is below). Frankly, I can't tell these coding models apart. Mirothinker makes many errors in code, but it has a neat bug where it tries to offer improvements to the code at the end of each iteration (this doesn't exist in other models, and the improvement bug disappears if you try to tune Mirothinker with strict system prompt rules). If I weren't so lazy I would put all the results on GitHub; I'll ask StepFun to prepare it.
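In case it helps anyone, here is a minimal sketch of that file swap in Python. The paths are only placeholders, point them at wherever your freshly built llama.cpp binaries and the llama files bundled with oobabooga actually live on your install:

```python
import shutil
from pathlib import Path

# Placeholder paths (adjust to your setup): SRC is the freshly built llama.cpp
# binaries from the edit branch, DST is the llama bin folder that oobabooga /
# the llama.cpp server actually loads from.
SRC = Path("~/llama.cpp/build/bin").expanduser()
DST = Path("~/text-generation-webui/llama_bin").expanduser()

def swap_llama_files(src: Path, dst: Path) -> None:
    """Back up the current llama files, then copy the new ones over them."""
    backup = dst.with_name(dst.name + ".bak")
    if not backup.exists():
        shutil.copytree(dst, backup)       # keep a copy so you can roll back
    for f in src.iterdir():
        if f.is_file():
            shutil.copy2(f, dst / f.name)  # overwrite binaries/libs in place

if __name__ == "__main__":
    swap_llama_files(SRC, DST)
    print(f"Copied llama files from {SRC} to {DST} (backup kept as .bak)")
```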
But last Friday Mirothinker offered a great idea, with Python code, for llama.cpp: introduce a LoRA adapter for low-quant models, the point being to bring a Q2-sized model up to Q8-sized quality just by applying a LoRA (which really works, as anyone who has tried them with image generation knows). Some people have already seen the idea and rushed out research papers about it (two days ago; that wasn't me, there was a video on YouTube about it).
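I don't have the exact code Mirothinker produced at hand, but the application side of the idea looks roughly like this sketch using llama-cpp-python, with placeholder file names for the low-bit base model and the adapter that would be trained to recover the quantization loss:

```python
from llama_cpp import Llama

# Placeholder files: a heavily quantized base model plus a LoRA adapter
# trained to compensate for the quantization error (the Q2 -> ~Q8 idea).
llm = Llama(
    model_path="model-Q2_K.gguf",        # low-bit base, small on disk and in RAM
    lora_path="quant-repair-lora.gguf",  # adapter applied on top at load time
    n_ctx=4096,
    n_threads=44,                        # matches the 44-thread Xeon below
)

out = llm.create_completion(
    "Write a Python function that reverses a linked list.",
    max_tokens=256,
    temperature=0.2,
)
print(out["choices"][0]["text"])
```

The llama.cpp CLI equivalent is just passing `--lora quant-repair-lora.gguf` next to `-m model-Q2_K.gguf`; the hard part the papers are about is training the adapter itself, not loading it.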
My hardware:
- Intel Xeon E5-2699 v4, LGA2011-3, 22 cores / 44 threads (2016): $110
- Gigabyte motherboard, C612 chipset, 12 RAM slots, VGA out (2016): $150
- Samsung/Hynix ECC RAM, 12 x 64 GB = 768 GB: ~$900
- VGA monitor
- IKEA chair
- Runs: trillion-parameter DeepSeeks and Kimis in Q5-Q6, 400-500B models in BF16