User:Qfissler: Difference between revisions

m (Plugins compared. Sections rationalised. TOC added.)
m (SM pdf parts extract note)
Line 54: Line 54:
[[Tekwiki_Guidelines]]
<br>
==Working with PDFs==
===Extracting Part Refs===
I had fun extracting lists of references to the [[151-0367-00]] transistor from the [[475]] service manual.
Get the text from the manual:
$ pdftotext 475_SM.pdf 475_SM.txt
The tables are split but remarkably consistent, so it's not too difficult to work back and pick up the previous column:
$ grep -B12 151-0367 475_SM.txt
Grab the refs I want and paste them into a text editor, then turn the newlines into comma-plus-space separators.
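The newline-to-comma step could also be done in the shell rather than in a text editor. A minimal sketch, assuming the refs have been saved one per line; the function name `join_refs` and the sample references are made up for illustration and are not taken from the manual:

```shell
# Hypothetical helper: join the lines on stdin into a single "a, b, c" list.
join_refs() {
  # paste -s merges all lines into one, -d, separates them with commas;
  # sed then widens each comma to comma + space.
  paste -sd, - | sed 's/,/, /g'
}

# Example with made-up circuit references:
printf 'Q101\nQ205\nQ330\n' | join_refs
```

This assumes the refs themselves contain no commas, since every comma in the joined line gets a space added after it.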


===OCR===
Line 62: Line 76:
`ocrmypdf` seems to work very well - the recognised text lines up with the image text - best results so far.
Will also try `pdfsandwich`  
Possible workflow: extract the pages from a PDF, remove any duff pages, rotate any pages that read better rotated, run `img2pdf` and `ocrmypdf` on them, then rebuild the PDF.
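The extract, tidy, OCR and rebuild steps might be strung together roughly as follows. This is only a sketch, assuming `pdfimages` (from poppler-utils), `img2pdf` and `ocrmypdf` are installed and that the scan holds one embedded image per page; the function name `rebuild_with_ocr` and the `page` file prefix are hypothetical. Removing and rotating pages is a judgment call, so it is left as a manual step:

```shell
# Sketch only: rebuild a scanned PDF with an OCR text layer.
# The function name and the "page" prefix are hypothetical.
rebuild_with_ocr() {
  src="$1"   # scanned input PDF
  out="$2"   # OCRed output PDF
  work="$(mktemp -d)" || return 1

  # 1. Dump the embedded page images as PNG: page-000.png, page-001.png, ...
  pdfimages -png "$src" "$work/page" || return 1

  # 2. Manual step: delete duff pages and rotate any that need it here,
  #    e.g. with ImageMagick: mogrify -rotate 90 "$work/page-003.png"

  # 3. Wrap the remaining images in a PDF, one image per page.
  img2pdf "$work"/page-*.png -o "$work/images.pdf" || return 1

  # 4. Add the recognised text layer to the image-only PDF.
  ocrmypdf "$work/images.pdf" "$out"
}
```

It would be called as `rebuild_with_ocr scan.pdf scan_ocr.pdf`, where both file names are placeholders. If pages need pruning or rotating, running the steps one at a time by hand is probably the easier route.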


  ~/Electronics/Scopes/TekTronix/2215/Letter$ ls -l