Show HN: Yapit – PDF and webpage reader with TTS that doesn't suck https://ift.tt/GgEXIQ5

Yapit converts PDFs and web pages to audio, with a vision-LLM pipeline that handles math and complex layout instead of garbling them.

I built it because I read a lot of papers and content online, but drift off after two paragraphs. Listening while following along keeps me focused and lowers the bar to actually start.

Every TTS tool I tried broke on complex formatting: papers with math, citations, figure references, and page numbers in the middle of sentences. You either get garbled output or you're listening to raw LaTeX.

Yapit converts everything to markdown as a common format. For web pages, defuddle ( https://ift.tt/R2I5Zus ) handles the extraction: it strips clutter and presents the main article content in a clean, consistent format. For PDFs, a vision LLM rewrites each page into markdown with annotation tags that separate what you see from what gets read aloud. Math is rendered visually but gets spoken alt text. Citations like "[13]" or "(Schmidhuber, 1970)" are displayed but not read aloud. Page numbers and headers are removed entirely.

Both extraction and audio are cached by content hash, so the same content is never processed or synthesized twice.

Self-hosting works with any OpenAI-compatible TTS server (vLLM-Omni, ...) and any OpenAI-compatible vision model for PDF extraction:

    git clone --depth 1 https://ift.tt/BMbleIt && cd yapit
    cp .env.selfhost.example .env.selfhost
    make self-host

Kokoro TTS also runs in the browser via WebGPU on desktop.

Try it on Attention Is All You Need (all voices cached, no account needed): https://ift.tt/3OzuiVt...

Or paste any URL: https://ift.tt/Ndny1T0 https://ift.tt/uWRr3Pm...

GitHub: https://ift.tt/TDYh38s (AGPL-3)

April 6, 2026 at 05:28AM
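The "cached by content hash" idea mentioned in the post can be illustrated with a small sketch. This is not Yapit's actual code. The class `ContentHashCache`, the `namespace` parameter (used here to keep different voices apart), and the stand-in `fake_synthesize` function are all hypothetical names chosen for illustration, and the cache lives in memory where a real service would presumably use persistent storage.

```python
import hashlib
from typing import Callable, Dict, List


class ContentHashCache:
    """In-memory cache keyed by the SHA-256 digest of the input bytes.

    Hypothetical illustration only; not taken from the Yapit codebase.
    """

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    @staticmethod
    def key_for(content: bytes, namespace: str = "") -> str:
        # The namespace (e.g. a voice name) is hashed together with the
        # content so identical text under different settings gets its own entry.
        digest = hashlib.sha256()
        digest.update(namespace.encode("utf-8"))
        digest.update(b"\0")  # separator between namespace and content
        digest.update(content)
        return digest.hexdigest()

    def get_or_compute(
        self,
        content: bytes,
        compute: Callable[[bytes], bytes],
        namespace: str = "",
    ) -> bytes:
        key = self.key_for(content, namespace)
        if key not in self._store:
            # Cache miss: do the expensive work once and remember the result.
            self._store[key] = compute(content)
        return self._store[key]


# Stand-in for an expensive extraction or TTS call (assumption for the demo).
calls: List[bytes] = []


def fake_synthesize(text: bytes) -> bytes:
    calls.append(text)
    return b"AUDIO:" + text


cache = ContentHashCache()
sentence = b"Attention is all you need."
first = cache.get_or_compute(sentence, fake_synthesize, namespace="voice-a")
second = cache.get_or_compute(sentence, fake_synthesize, namespace="voice-a")
third = cache.get_or_compute(sentence, fake_synthesize, namespace="voice-b")
```

In this sketch the second request for the same sentence and voice is served from the cache, while the same sentence under a different namespace triggers one more call to the synthesizer. Keying on the content instead of the URL or filename means two uploads of the same document share one cache entry.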
