Yes, I agree a superconducting ring as a battery isn't very practical at all. I was shocked by how little it can store. I'd pictured it as a super-massive power line that could store as much as you want: electrons are small, so surely any number of them would fit, and the current racing through the loop would just keep increasing (my EE training isn't the best), which I figured would add up to a lot. I didn't know that the magnetic field ends up being the limiting factor. In some ways it's still a lot of storage, it just isn't useful for anything. Still cool.
I wanted to see how much energy a superconductor could store if it's being used like a battery (with the electrons just going round and round in a loop with no resistance - wheeeee!) This tool helps model that. See how high you can get the stored energy to go.
As you can see from the numbers, this is not a very practical solution for any of your energy storage needs. Still fun to play with, so I thought I'd share.
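For a rough sense of the numbers, here's a minimal sketch of the physics (this is not the linked tool, just the textbook circular-loop inductance approximation plus E = ½LI², with assumed, roughly Nb-Ti-like example values):

```python
import math

MU0 = 4 * math.pi * 1e-7  # vacuum permeability, H/m

def ring_energy(R, a, Jc):
    """Energy stored in a superconducting ring.

    R  -- major radius of the loop (m)
    a  -- radius of the wire cross-section (m)
    Jc -- critical current density of the superconductor (A/m^2)

    Uses the standard thin-loop inductance approximation
    L ~= mu0 * R * (ln(8R/a) - 2) and E = 0.5 * L * I^2,
    with the current capped by the critical current density.
    """
    L = MU0 * R * (math.log(8 * R / a) - 2)  # self-inductance, H
    I_max = Jc * math.pi * a ** 2            # max current before quench, A
    return 0.5 * L * I_max ** 2              # stored energy, J

# Assumed example: a 1 m diameter ring of 1 cm diameter wire,
# Jc = 1e9 A/m^2 (order of magnitude for Nb-Ti).
E = ring_energy(R=0.5, a=0.005, Jc=1e9)
print(f"{E:.0f} J (~{E / 3600:.2f} Wh)")
```

With these values you get only a few kilojoules, i.e. a couple of watt-hours: less than a phone battery, which matches the "not very practical" conclusion above. (Real SMES systems use many turns and much larger coils, but the magnetic limit is the same story.)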
It might interest people to know that you can also easily fine-tune the text portion of this specific model (E2B) to behave however you want! I fine-tuned it to talk like a pirate, but you can get it to do anything you have (or can generate) training data for. (This wouldn't carry over to the text-to-speech portion, though.) So you can easily train it to act a certain way or give certain types of responses.
I've benchmarked this on an actual Mac Mini M4 with 24 GB of RAM and averaged 24.4 t/s on Ollama versus 19.45 t/s on LM Studio for the same ~10 GB model (gemma4:e4b), a difference that held across three runs with both models warmed up beforehand. Unless there's an error in my methodology, which is easy to reproduce[1], Ollama is a full 25% faster. That's an enormous difference. Try it for yourself before making such claims.
[1] Script at: https://pastebin.com/EwcRqLUm. It warms up both models and keeps them in memory, so you'll want to close almost all other applications first. Install both Ollama and LM Studio, download the models, and change the path to where you installed the model. Interestingly, I had to go through three different AIs to write this script: ChatGPT (where I'm a Pro subscriber) thought about it and then returned nothing (shenanigans since I was benchmarking a competitor?), I had run out of my weekly session limit on Claude Pro Max 20x credits (wonder why I need a local coding agent!), and then Google rose to the challenge and wrote the benchmark for me. I didn't try writing a benchmark like this locally; I'll try that next and report back.
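For anyone who'd rather not fetch the pastebin, the core of the Ollama-side measurement can be sketched like this (my own minimal version, not the linked script; it assumes Ollama's default local endpoint and a model name you'd substitute):

```python
import json
import urllib.request

def tokens_per_second(eval_count, eval_duration_ns):
    """Ollama's non-streaming /api/generate response reports eval_count
    (generated tokens) and eval_duration (nanoseconds); t/s is the ratio."""
    return eval_count / (eval_duration_ns / 1e9)

def bench_ollama(model, prompt,
                 url="http://localhost:11434/api/generate"):
    """Run one non-streaming generation and return tokens/second."""
    body = json.dumps({"model": model,
                       "prompt": prompt,
                       "stream": False}).encode()
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return tokens_per_second(data["eval_count"], data["eval_duration"])

# Usage (needs a running Ollama server with the model already pulled):
# print(bench_ollama("gemma4:e4b", "Write a haiku about benchmarks."))
```

Note that eval_duration covers generation only; the first call after startup also pays model-load time (reflected in total_duration), so run a warm-up generation first and average over several runs, as the script above does.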
It depends on the hardware, backend and options. I've recently tried running some local AIs (Qwen3.5 9B for the numbers here) on an older AMD 8GB VRAM GPU (so vulkan) and found that:
llama.cpp is about 10% faster than LM Studio with the same options.
LM Studio is about 3x faster than Ollama with the same options (~38 t/s vs ~13 t/s), but messes up tool calls.
Ollama ended up slowest on the 9B, Qwen3.5 35B, and some random other 8B model.
Note that this isn't some rigorous study or performance benchmark. I just found Ollama unacceptably slow and wanted to try out the other options.
In case someone would like to know what these are like on this hardware, I tested Gemma 4 32B (the ~20 GB model, the largest Gemma model Google has published) and gemma4:e4b (the ~10 GB model) on this exact setup (a Mac Mini M4 with 24 GB of RAM, using Ollama), and I livestreamed it:
The ~10 GB model is super speedy, loading in a few seconds and responding almost instantly. If you just want to see its performance, it says hello around the 2-minute mark in the video (and fast!), while the ~20 GB model says hello around 5 minutes 45 seconds in; the difference in their loading times and speed is substantial. I also had each of them complete a difficult coding task. Both got it correct, but the 20 GB model was much slower: a bit too slow for day-to-day use on this setup, plus it would take almost all the memory. The 10 GB model fits comfortably on a 24 GB Mac Mini with plenty of RAM left for everything else, and it seems usable for small but useful coding tasks.
If anyone here is interested in their creative writing style, I gave both the 10 GB and 20 GB models the prompt "write a short story"; here are the results: [1]
Neither really has the structure of a short story, though the 20 GB model's is more interesting and has two characters rather than just one.
In another comment, I gave them coding tasks; if you want to see how fast they are at coding (on a 24 GB Mac Mini M4 with 10 cores), you can watch the livestream here: [2]
Both models completed the fairly complex coding task well.