Hacker News | objektif's comments

This is pretty insightful, thank you. Which provider are you guys using? Is it also over the phone, or fully web/app based? Do you have any resources you can point me to so I can learn about this?

We use a bunch. At the moment we mainly self-host (using Pipecat), use Daily, and work with a few niche boutique suppliers who built things for us.

There is a great resource for learning this stuff: the CEO of Daily, Kwindla Kramer, hosted a series of 1-hour sessions on low-latency voice AI. Here:

https://youtube.com/playlist?list=PLzU2zoMTQIHjMPZ-OnpC3ozZs...

Some of this is a bit outdated but most of it is very valuable.

Kwindla posts a lot of extremely useful stuff on X and LinkedIn, including working, easily replicable sub-500ms setups.


Beautiful, thanks. We are also looking at this, and another complication is that transcripts can get pretty messy: updates, corrections, etc.

Are there faster mini/nano versions as well?


Not this time, no.


Usually, those get released a few weeks later.


Does anyone know a good low-latency LLM API provider? We looked at Cerebras and Groq, but they have zero capacity right now. GPT models are too slow for us at the moment. Gemini models are better, but not really at the same level as GPT.


This depends a bit on your cost sensitivity and which model families you want support for, but Baseten and Fireworks have been my go-to.

Currently Baseten has ~610ms TTFT and ~82 tk/s for Kimi K2.6, which is roughly 2x the throughput of GPT-5.4 (per their OpenRouter stats). GLM 5 is slightly slower on both metrics, but still strong.
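
To put those two numbers together, here is a rough back-of-the-envelope sketch. It assumes tokens arrive at a steady rate after the first one, and the 20-token "first sentence" is a made-up figure for illustration; `response_latency_ms` is a hypothetical helper, not part of any provider's API.

```python
def response_latency_ms(ttft_ms: float, tokens_per_s: float, n_tokens: int) -> float:
    """Approximate time until n_tokens of output have streamed, in milliseconds.

    Simplification: time to first token, then a constant decode rate.
    """
    return ttft_ms + (n_tokens / tokens_per_s) * 1000.0


# TTFT and throughput are the figures quoted above (Kimi K2.6 on Baseten).
# The 20-token first sentence is an assumption, not a measurement.
first_sentence_ms = response_latency_ms(ttft_ms=610, tokens_per_s=82, n_tokens=20)
print(f"~{first_sentence_ms:.0f} ms until a 20-token first sentence is available")
```

For short replies TTFT dominates, so for voice-style use cases it is probably the number to compare first; throughput matters more as responses get longer.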


No. They like stealing land.


What are you basing how good they are on? Personal experience or some benchmarks?


Benchmarks. We have internal ones comparing reasoning fine-tuned models vs. frontier models + prompts.

For some use cases the result ranges from parity performance at 1/20th the cost to better performance at 1/10th the cost. The trade-off is, of course, narrow applicability.


How can I learn more about these models? Are they open source?


There are plenty of OSS fine-tuned models + base models around. If you're looking to do this on your own dataset, it's worth getting in touch with cartesien.io or wiring up https://github.com/SalesforceAIResearch/PretrainRL-pipeline


Thank you.


Yeah, we care about Iranian protesters; you got this right.


That's not what I said.


Will Randian tech bros start calling for socialism soon? Inshallah.


He sounds greedy as fuck. He speed-ran a buggy POS to sell to a model co? Obvious as day, what is there to see?


PG commissioned dan on X to send anyone who criticizes Andrej or Pete to the gulag.


Anyone using claws for something meaningful in a startup environment? I want to try but not sure what we can do with this.


PR. Say you fired all your friends and replaced them with mac minis.


Haha, good point. Once I do, how much money can I raise on my Series Z?

