2 changes: 1 addition & 1 deletion AUTHORS.rst
@@ -10,4 +10,4 @@ Development Lead
Contributors
------------

None yet. Why not be the first?
* Dale Furrow <[email protected]>
1 change: 1 addition & 0 deletions CHANGES.rst
@@ -2,6 +2,7 @@
Version Date Changes
------- -------- ------

v0.4.0 9/22/16 Add deskew, text-recognition, gui-dialog for save, multithreading
v0.3.0 8/25/14 Allow arbitrary page sizes and auto-crops
v0.1.0 1/1/14 First release
======= ======== ======
26 changes: 19 additions & 7 deletions README.rst
@@ -27,7 +27,9 @@ Features
* `Integrates with ScanBd <http://virantha.github.io/scanpdf/html>`_ to respond to hardware button presses
* Automatically removes blank pages.
* Scans in color, and automatically down-converts into 1-bit B/W image for text/greyscale images
* Auto-crops to the proper page size.
* (optionally) Auto-crops to the proper page size.
* (optionally) applies unpaper formatting to finished images
* (optionally) applies pdf-sandwich text-recognition to finished pdf

Usage:
------
@@ -37,6 +39,10 @@ The simplest way to use this is:

scanpdf scan pdf <pdffile>

Or alternatively

scanpdf scan pdf (To bring up a file-save dialog to direct the finished pdf file.)

This will first perform the scan, and then the conversion to PDF. If you want
to split up the scan and the PDF conversion into two separate invocations (for
reasons clarified below), then you can do:
@@ -64,15 +70,19 @@ additional post-processing using unpaper_:
::

--dpi=<dpi> DPI to scan in [default: 300]
--device=<device> Scanning device (sub '%' for spaces)
--crop Run ImageMagick cropping routine
--tmpdir=<dir> Temporary directory
--keep-tmpdir Whether to keep the tmp dir after scanning or not [default: False]
--face-up=<true/false> Face-up scanning [default: True]
--keep-blanks Don't check for and remove blank pages
--blank-threshold=<ths> Percentage of white to be marked as blank [default: 0.97]
--post-process Run unpaper to deskew/clean up

--blank-threshold=<ths> Percentage of white to be marked as blank [default: 0.97]
--post-process Process finished images with unpaper
--text-recognize Run pdfsandwich for text recognition
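
As a rough illustration of what ``--blank-threshold`` expresses, the sketch below treats a page as blank when the fraction of near-white pixels reaches the threshold. The function names, the ``white_level`` cutoff, and the flat list of 8-bit grayscale values used as input are all assumptions made for this example; the diff does not show how scanpdf actually measures whiteness.

```python
def white_fraction(pixels, white_level=200):
    """Return the fraction of 8-bit grayscale pixels at or above white_level.

    white_level is a hypothetical cutoff chosen for this sketch.
    """
    pixels = list(pixels)
    if not pixels:
        raise ValueError("cannot compute a white fraction for an empty page")
    white = sum(1 for p in pixels if p >= white_level)
    return white / len(pixels)


def is_blank(pixels, blank_threshold=0.97, white_level=200):
    """Treat the page as blank when enough of it is near-white.

    blank_threshold mirrors the documented default of 0.97.
    """
    return white_fraction(pixels, white_level) >= blank_threshold
```

With the default of 0.97, a page that is 98% white would be dropped as blank, while one that is 90% white would be kept. Raising the threshold keeps more sparsely printed pages.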

Right now, I'm assuming this is getting called via ScanBD, so I don't have the option to manually specify the
scanner. If you really want to use this standalone, for now, please just set the ``SCANBD_DEVICE`` environment
variable to your scanner device name before running this script.
By default, this assumes it is being called via ScanBD. To use it standalone, either pass the ``--device``
option or set the ``SCANBD_DEVICE`` environment variable to your scanner device name before running this script.
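
A minimal sketch of how the two ways of naming the scanner could be combined is shown below. The function name ``resolve_device`` is hypothetical, and the precedence (an explicit ``--device`` value wins over ``SCANBD_DEVICE``) is an assumption; the only behaviour taken from the option text is that ``%`` stands in for spaces.

```python
import os


def resolve_device(device_option=None, environ=None):
    """Pick the scanner device name (hypothetical helper for illustration).

    Assumed precedence: an explicit --device value first, then the
    SCANBD_DEVICE environment variable, else None.
    """
    environ = os.environ if environ is None else environ
    if device_option:
        # The option help says to substitute '%' for spaces.
        return device_option.replace("%", " ")
    return environ.get("SCANBD_DEVICE")
```

For example, ``resolve_device("My%Scanner:001")`` would yield ``My Scanner:001``, which avoids having to quote spaces when the command line is assembled by another tool.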


Installation
@@ -87,6 +97,8 @@ Requires ImageMagick and SANE to be installed, for the command line tools:
* ``identify``
* ``ps2pdf``
* ``scanadf``
* ``unpaper``
* ``pdfsandwich``

Also requires epstopdf.

2 changes: 1 addition & 1 deletion TODO.rst
@@ -2,5 +2,5 @@ Todo list
=========

- Make it more generic in terms of stand-alone usage
- Add docstrings
- Consider changing default blank-threshold value

63 changes: 63 additions & 0 deletions docs/deskew_readme.txt
@@ -0,0 +1,63 @@
Deskew
------------------------
by Marek Mauder
http://galfar.vevb.net/deskew/
https://bitbucket.org/galfar/app-deskew

v1.10 2014-03-04

Overview
------------------------

Deskew is a command line tool for deskewing scanned text documents.
It uses Hough transform to detect "text lines" in the image. As an output, you get
an image rotated so that the lines are horizontal.

There are binaries built for these platforms (located in Bin folder):
Win32, Win64, Linux 32bit+64bit, Mac OSX 32bit. Some binaries have a suffix
identifying their platform (deskew64.exe, deskew-osx, etc.).

You can find some test images in TestImages folder and
scripts to run tests (RunTests.bat and runtests.sh) in Bin.
Note that scripts just call 'deskew' command so you may need
to rename binary for your platform to just 'deskew'.

Usage
------------------------

deskew [-o output] [-a angle] [-t a|threshold] [-b color] [-r rect] [-f format] [-s info] input
-o output: Output image file (default: out.png)
-a angle: Maximal skew angle in degrees (default: 10)
-t a|threshold: Auto threshold or value in 0..255 (default: a)
-b color: Background color in hex format RRGGBB (default: trns. black)
-r rect: Skew detection only in content rectangle (pixels):
left,top,right,bottom (default: whole page)
-f format: Force output pixel format (values: b1|g8|rgba32)
-s info: Info dump (any combination of):
s - skew detection stats, p - program parameters
input: Input image file

Supported file formats
Input: BMP, JPG, PNG, JNG, GIF, DDS, TGA, PBM, PGM, PPM, PAM, PFM, PSD, TIF (depends on platform)
Output: BMP, JPG, PNG, JNG, GIF, DDS, TGA, PGM, PPM, PAM, PFM, PSD, TIF (depends on platform)
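
The usage synopsis above can be turned into an argument list from Python along the following lines. This is only a sketch: the helper name ``build_deskew_command`` and its defaults are assumptions for illustration, and it covers just the ``-o``, ``-a``, ``-t`` and ``-b`` flags listed in the synopsis. It does not reflect how scanpdf itself invokes the bundled binary.

```python
def build_deskew_command(input_path, output_path="out.png", max_angle=10,
                         threshold="a", background=None, binary="deskew"):
    """Build an argv list for the deskew CLI (hypothetical helper).

    threshold is either "a" (auto) or an integer in 0..255, as in the
    usage text. background is an RRGGBB hex string or None to omit -b.
    """
    if threshold != "a" and not 0 <= int(threshold) <= 255:
        raise ValueError("threshold must be 'a' or a value in 0..255")
    cmd = [binary, "-o", output_path, "-a", str(max_angle), "-t", str(threshold)]
    if background is not None:
        cmd += ["-b", background]
    # The input image is the final positional argument.
    cmd.append(input_path)
    return cmd
```

The resulting list can be handed to ``subprocess.run(cmd, check=True)``. Passing a list rather than a shell string sidesteps quoting problems with file names that contain spaces.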

Version History
------------------------
1.10 2014-03-04:
- TIFF support for Win64 and 32/64bit Linux
- forced output formats
- fix: output file names were always lowercase
- fix: preserves resolution metadata (e.g. 300dpi) of input when writing output
1.00 2012-06-04:
- background color
- "area of interest" content rect
- 64bit and Mac OSX support
- PSD and TIFF (win32) support
- show skew detection stats and program parameters
0.95 2010-12-28:
- Added auto thresholding
- Imaging library updated.
0.90 2010-02-12:
-Initial version


2 changes: 2 additions & 0 deletions requirements.txt
@@ -1 +1,3 @@
docopt>=0.6.1
wx>=3.0.2.0

30 changes: 23 additions & 7 deletions scanpdf.egg-info/PKG-INFO
@@ -1,6 +1,6 @@
Metadata-Version: 1.0
Name: scanpdf
Version: 0.3.0
Version: 0.4.0
Summary: Utility to use SANE/scanadf to scan to PDF
Home-page: UNKNOWN
Author: Virantha N. Ekanayake
@@ -35,6 +35,9 @@ Description: Scan PDF - Easy scans in Linux with a document scanner like the Fuj
* `Integrates with ScanBd <http://virantha.github.io/scanpdf/html>`_ to respond to hardware button presses
* Automatically removes blank pages.
* Scans in color, and automatically down-converts into 1-bit B/W image for text/greyscale images
* deskews images
* (optionally) applies unpaper formatting to finished images
* (optionally) applies pdf-sandwich text-recognition to finished pdf

Usage:
------
@@ -43,6 +46,12 @@ Description: Scan PDF - Easy scans in Linux with a document scanner like the Fuj
::

scanpdf scan pdf <pdffile>

Or alternatively

scanpdf scan pdf

To bring up a file-save dialog to direct the finished pdf file.

This will first perform the scan, and then the conversion to PDF. If you want
to split up the scan and the PDF conversion into two separate invocations (for
@@ -70,16 +79,21 @@ Description: Scan PDF - Easy scans in Linux with a document scanner like the Fuj

::

--dpi=<dpi> DPI to scan in [default: 300]
--dpi=<dpi> DPI to scan in [default: 300]
--device=<device> Scanning device (sub '%' for spaces)
--crop Run ImageMagick cropping routine
--tmpdir=<dir> Temporary directory
--keep-tmpdir Whether to keep the tmp dir after scanning or not [default: False]
--face-up=<true/false> Face-up scanning [default: True]
--keep-blanks Don't check for and remove blank pages
--blank-threshold=<ths> Percentage of white to be marked as blank [default: 0.97]
--post-process Run unpaper to deskew/clean up
--blank-threshold=<ths> Percentage of white to be marked as blank [default: 0.97]
--post-process Process finished images with unpaper
--text-recognize Run pdfsandwich for text recognition


Right now, I'm assuming this is getting called via ScanBD, so I don't have the option to manually specify the
scanner. If you really want to use this standalone, for now, please just set the ``SCANBD_DEVICE`` environment
variable to your scanner device name before running this script.
scanner. If you really want to use this standalone, for now, either set the --device option or just set the
``SCANBD_DEVICE`` environment variable to your scanner device name before running this script.


Installation
@@ -88,12 +102,14 @@ Description: Scan PDF - Easy scans in Linux with a document scanner like the Fuj

$ pip install scanpdf

Requires ImageMagick and SANE to be installed, for the command line tools:
Requires ImageMagick, SANE, unpaper and pdfsandwich to be installed, for the command line tools:

* ``convert``
* ``identify``
* ``ps2pdf``
* ``scanadf``
* ``unpaper``
* ``pdfsandwich``

Also requires epstopdf.

3 changes: 2 additions & 1 deletion scanpdf.egg-info/requires.txt
@@ -1 +1,2 @@
docopt>=0.6.1
docopt>=0.6.1
wx>=3.0.2.0
Binary file added scanpdf/deskew64
Binary file not shown.