Hacker News
Unified Vision-Language Agents – Detect, Segment, OCR, Generate and More (github.com/vlm-run)
5 points by fzysingularity 4 months ago | 1 comment
Replace OCR with Vision Language Models (github.com/vlm-run)
292 points by EarlyOom on Feb 26, 2025 | 125 comments
Show HN: Visually parse an entire YouTube video frame by frame (github.com/vlm-run)
5 points by EarlyOom on Feb 21, 2025
A Node.js SDK for calling Vision Language Models (github.com/vlm-run)
6 points by EarlyOom on Feb 20, 2025
Run structured extraction on documents/images locally with Ollama and Pydantic (github.com/vlm-run)
170 points by EarlyOom on Feb 20, 2025 | 29 comments
