Whether something is "impractical" depends on your expectations. High-latency un... | Hacker News


zozbot234 14 hours ago | on: Outsourcing plus local AI will soon become more ec...

Whether something is "impractical" depends on your expectations. High-latency unattended inference is definitely viable, even though it doesn't align much with what's being run in hyperscale datacenters.


dns_snek 13 hours ago

I'd like to meet the person who's been using a 1 token/second system as their primary LLM for at least a few weeks. Anyone?

I think 1 token/second is optimistic here - and even then it's over 11 days per million tokens.
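
For anyone who wants to check the arithmetic, here is a rough sketch. It assumes generation runs continuously at a steady rate and ignores prompt processing time; the helper name `days_per_tokens` is just made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def days_per_tokens(num_tokens: int, tokens_per_second: float) -> float:
    """Wall-clock days to generate num_tokens at a steady rate (hypothetical helper)."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second / SECONDS_PER_DAY


# One million tokens at 1 token/second, running around the clock.
days_at_1_tps = days_per_tokens(1_000_000, 1.0)
print(f"{days_at_1_tps:.2f} days")  # 11.57 days
```

Halving the rate doubles the wait, so at 0.5 tokens/second the same million tokens would take a bit over 23 days.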
