
I'm familiar with those tools, but they don't preserve the formatting of the source document when it's a PDF, and certain types of embedded resources don't transfer properly.

Some PDF documents don't contain the source text as digitized text, either; they're just a bundle of scanned images.


