Merge >700 commits into master. Yay!
sahib committed Oct 25, 2015
2 parents d514de2 + 5a2ed61 commit 31b8110
Showing 143 changed files with 20,390 additions and 4,281 deletions.
7 changes: 6 additions & 1 deletion .gitignore
@@ -15,9 +15,14 @@ docs/rmlint.1
.scons*
.sconf*
.rope*
*.pyc

docs/_build
__pycache__

lib/config.h
lib/formats/py.c
lib/formats/py.c
lib/formats/sh.c
uninstall-
gui/app/resources/app.gresource
*.a
13 changes: 10 additions & 3 deletions .travis.yml
@@ -1,16 +1,23 @@
language: c
install:
- sudo apt-get update
- sudo apt-get install python3-sphinx python3-nose gettext python3-setuptools valgrind
- sudo apt-get install libblkid-dev libelf-dev libglib2.0-dev libjson-glib-dev
- sudo apt-get install python3-sphinx python3-nose gettext python3-setuptools
- sudo apt-get install libblkid-dev libelf-dev libglib2.0-dev libjson-glib-dev
- sudo easy_install3 pip
- sudo /usr/local/bin/pip install sphinx_bootstrap_theme

compiler:
- clang
- gcc

notifications:
email:
- [email protected]
- [email protected]

script: scons VERBOSE=1 && scons config && export USE_VALGRIND=1 && PEDANTIC=1 PRINT_CMD=1 sudo nosetests3 -a '!slow'
script:
- scons VERBOSE=1
- scons config
- export RM_TS_PRINT_CMD=1
- export RM_TS_PEDANTIC=0
- sudo -E nosetests3 -s -v -a '!slow'
2 changes: 1 addition & 1 deletion .version
@@ -1 +1 @@
2.2.0 Dreary Dropbear
2.4.0 Myopic Micrathene
93 changes: 89 additions & 4 deletions CHANGELOG.md
@@ -4,7 +4,91 @@ All notable changes to this project will be documented in this file.

The format follows [keepachangelog.com]. Please stick to it.

## [2.3.0 (No name yet)] -- [unreleased]
## [2.4.0 Myopic Micrathene] -- 2015-10-25

### Fixed

- ``rmlint`` should compile on Mac OSX now.
- Bugfix: Broken ``chown`` calls in sh script (thanks Shukrat Mukimov)
- Bugfix: memory corruption when specifying ``-T dd`` alone.
- Bugfix: Make ``-D`` and ``-k / -K`` play together nicely (thanks phiresky).
- Smaller compile time troubles fixed.
- Progressbar uses timeout-based redraws which leads to much smoother drawing
and less cpu footprint.
- ``pretty`` formatter (default) now produces validly escaped commands.
It is still intended for visual output only, which is why a note about this
was added.

### Added

- A fully working graphical user interface which is installed as a python module
by default (can be disabled via a compile option, i.e. ``scons --without-gui``).
It can be started via ``rmlint --gui``.
- Support for automatic deduplication on btrfs using ``BTRFS_IOC_FILE_EXTENT_SAME``.
The shell script will now contain calls to ``rmlint --btrfs $source $dest``
for duplicates on ``btrfs`` filesystems if the user specified ``-c sh:clone``.
- Benchmark suite that will track the performance of rmlint from release to release.
This helps developers detect any speed regressions or improvements and is a tool
to help develop and validate optimization strategies.
- Shell/Python-script now does more sanity checks before removing and can be told to
re-compare files byte-by-byte before removing them (``-p`` option when running
the ``.sh`` file).
- Add a new ``--hash`` option so rmlint can be used as a very fast file hashing
utility, e.g. ``rmlint --hash`` works like ``sha1sum``, and ``rmlint --hash -d md5``
works like ``md5sum``. It also supports sha256, sha512, murmur{128}, spooky{32,64,128},
city{128}.
- ``--sort-by`` learned new keys: ``l`` (path length) and ``d`` (path depth).
- New ``--unmatched-basename`` option only finds twins with differing basenames.
- Smaller performance and memory optimisations in shredder.
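
The ``--hash`` entry above compares the option to ``sha1sum`` and ``md5sum``. As a rough illustration of what such a utility does (this is not rmlint's implementation, which is written in C), the following Python sketch hashes a file chunk by chunk with the standard ``hashlib`` module. The names ``hash_file``, ``format_line`` and ``CHUNK_SIZE`` are made up for this example.

```python
import hashlib

CHUNK_SIZE = 64 * 1024  # hypothetical read size, chosen only for illustration


def hash_file(path, algorithm="sha1"):
    """Return the hex digest of the file at `path`, read chunk by chunk."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()


def format_line(path, algorithm="sha1"):
    """Format one result like sha1sum/md5sum do: digest, two spaces, path."""
    return "{}  {}".format(hash_file(path, algorithm), path)
```

Reading in fixed-size chunks keeps memory use constant regardless of file size, which is presumably what any fast hashing tool would want as well.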

### Changed

- ``-g`` now checks if there is already a ``sh`` and ``json`` formatter before
it adds one.
- ``-PP`` now defaults to ``xxhash`` as hashing algorithm.
- ``-o / --output`` learned to guess the formatter you want to use from the file ending.
For example ``-o /tmp/test.json`` will work like ``-o json:/tmp/test.json``.
- JSON output contains ``rmlint`` version and revision now.
- ``--replay`` learned to merge several json files.
- Internal refactoring (credits go to Daniel) of the scheduler and hashing
library. The duplicate finding process has been split into separate modules.
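
The ``-o / --output`` entry above says the formatter can be guessed from the file ending. A minimal Python sketch of that idea follows, assuming only the ``sh``, ``json`` and ``py`` formatters that this commit mentions. The function name ``normalize_output_spec`` and the ``KNOWN_FORMATTERS`` set are hypothetical, and rmlint's real lookup may behave differently for unknown endings.

```python
import os

# Hypothetical subset of formatter names; this commit mentions sh, json and py.
KNOWN_FORMATTERS = {"sh", "json", "py"}


def normalize_output_spec(spec):
    """Turn '/tmp/test.json' into 'json:/tmp/test.json' if the ending is known."""
    prefix, sep, _ = spec.partition(":")
    if sep and prefix in KNOWN_FORMATTERS:
        return spec  # already in explicit 'formatter:path' form
    ending = os.path.splitext(spec)[1].lstrip(".")
    if ending in KNOWN_FORMATTERS:
        return "{}:{}".format(ending, spec)
    return spec  # unknown ending: leave the decision to the caller


print(normalize_output_spec("/tmp/test.json"))
```

Checking for an explicit ``formatter:`` prefix first means a path such as ``json:/tmp/out.txt`` is passed through untouched rather than being guessed a second time.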

## [2.3.0 Ominous Oscar] -- 2015-06-15

### Fixed

- Compiles on Mac OSX now. See also: https://github.com/sahib/rmlint/issues/139
- Fix a crash that happened with ``-e``.
- Protect other lint than duplicates by ``-k`` or ``-K``.
- ``chown`` in sh script fixed (was ``chmod`` by accident).

### Added

- ``--replay``: Re-output a previously written json file. Allow filtering
by using all other standard options (like size or directory filtering).
- ``--sort-by``: Similar to ``-S``, but sorts groups of files. So showing
the group with the biggest size sucker is as easy as ``-y s``.
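
To make the idea of sorting whole groups concrete, here is a small Python sketch under stated assumptions: each group is a list of ``(path, size)`` tuples, and the ``s`` key is taken to mean the combined size of a group, sorted ascending. Neither the data layout nor the sort direction is specified in this changelog, and ``sort_groups_by_size`` is a hypothetical name.

```python
def group_size(group):
    """Combined size in bytes of all files in one duplicate group."""
    return sum(size for _path, size in group)


def sort_groups_by_size(groups):
    """Return the groups ordered by combined size, smallest first (assumed)."""
    return sorted(groups, key=group_size)


# Made-up example data: two groups of duplicate files.
groups = [
    [("/tmp/a/big.iso", 4096), ("/tmp/b/big.iso", 4096)],
    [("/tmp/a/note.txt", 10), ("/tmp/b/note.txt", 10)],
]
largest_group = sort_groups_by_size(groups)[-1]
```

With an ascending order the largest group ends up last, which would place it at the bottom of terminal output, right above the prompt.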

### Changed

- ``-S``'s long option is ``--rank-by`` now (previously ``--sortcriteria``).
- ``-o`` can guess the formatter from the filename if given.
- Remove some optimisations that gave no visible effect.
- Simplified FIEMAP optimisation to reduce initial delay and reduce memory overhead
- Improved hashing strategy for large disks (do repeated smaller sweeps across
the disk instead of incrementally hashing every file on the disk)

## [2.2.1 Dreary Dropbear Bugfixes]

### Fixed

- Incorrect handling of -W, --no-with-color option
- Handling of $PKG_CONFIG in SConstruct
- Failure to build manpage
- Various BSD compatibility issues
- Nonstandard header sequence in modules using fts
- Removed some unnecessary warnings


## [2.2.0 Dreary Dropbear] -- 2015-05-09

@@ -44,7 +128,7 @@ The format follows [keepachangelog.com]. Please stick to it.
physical disk to enable fast reading without disk thrash. The improved
algorithm now increases the number of cpu threads used to hash the data
as it is read in. Also an improved mutex strategy reduces the wait time
before the hash results can be processed.
before the hash results can be processed.
Note the new threading strategy is particularly effective on the
"paranoid" (byte-by-byte) file comparison method (option -pp), which is
now almost as fast as the default (SHA1 hash) method.
@@ -60,7 +144,7 @@ The format follows [keepachangelog.com]. Please stick to it.
the core got slower very fast due to linear lookups. Fixed.
- performance regression: No SSDs were detected due to two bugs.
- commandline aborts also on non-fatal option misuses.
- Some statistic counts were updated wrong sometimes.
- Some statistic counts were updated wrong sometimes.
- Fixes in treemerge to respect directories tagged as originals.
- Ignore "evil" fs types like bindfs, nullfs completely.
- Fix race in file tree traversal.
@@ -100,7 +184,8 @@ The format follows [keepachangelog.com]. Please stick to it.
Initial release of the rewrite.

[unreleased]: https://github.com/sahib/rmlint/compare/master...develop
[2.2.0 Dreary Dropbear]: https://github.com/sahib/rmlint/compare/master...develop
[2.2.1 Dreary Dropbear Bugfixes]: https://github.com/sahib/rmlint/compare/master...develop
[2.2.0 Dreary Dropbear]: https://github.com/sahib/rmlint/releases/tag/v2.2.0
[2.1.0 Malnourished Molly]: https://github.com/sahib/rmlint/releases/tag/v2.1.0
[2.0.0 Personable Pidgeon]: https://github.com/sahib/rmlint/releases/tag/v2.0.0
[keepachangelog.com]: http://keepachangelog.com/
5 changes: 3 additions & 2 deletions README.rst
@@ -1,6 +1,7 @@

======


.. image:: https://raw.githubusercontent.com/sahib/rmlint/develop/docs/_static/logo.png
:align: center

@@ -94,8 +95,8 @@ AUTHORS
Here's a list of developers to blame:

=================================== ============================= ===========================================
*Christopher Pahl* https://github.com/sahib 2010-2014
*Daniel Thomas* https://github.com/SeeSpotRun 2014-2014
*Christopher Pahl* https://github.com/sahib 2010-2015
*Daniel Thomas* https://github.com/SeeSpotRun 2014-2015
=================================== ============================= ===========================================

There are some other people that helped us of course.
89 changes: 61 additions & 28 deletions SConstruct
@@ -72,6 +72,10 @@ def check_git_rev(context):
rev = subprocess.check_output('git log --pretty=format:"%h" -n 1', shell=True)
except subprocess.CalledProcessError:
print('Unable to find git revision.')
except AttributeError:
# Patch for some special sandbox permission problems.
# See https://github.com/sahib/rmlint/issues/143#issuecomment-139929733
print('Not allowed.')

rev = rev or 'unknown'
conf.env['gitrev'] = rev
@@ -85,7 +89,7 @@ def check_libelf(context):
if GetOption('with_libelf') is False:
rc = 0

if rc and tests.CheckHeader(context, 'libelf.h'):
if rc and tests.CheckHeader(context, 'libelf.h', header="#include <stdlib.h>"):
rc = 0

if rc and tests.CheckLib(context, ['libelf']):
@@ -98,6 +102,19 @@ def check_libelf(context):
return rc


def check_uname(context):
rc = 1

if rc and tests.CheckHeader(context, 'sys/utsname.h', header=""):
rc = 0

conf.env['HAVE_UNAME'] = rc

context.did_show_result = True
context.Result(rc)
return rc


def check_gettext(context):
rc = 1

@@ -210,6 +227,22 @@ def check_sysctl(context):
return rc


def check_posix_fadvise(context):
rc = 1

if tests.CheckDeclaration(
context, 'posix_fadvise',
includes='#include <fcntl.h>'
):
rc = 0

conf.env['HAVE_POSIX_FADVISE'] = rc

context.did_show_result = True
context.Result(rc)
return rc


def check_xattr(context):
rc = 1

@@ -260,33 +293,29 @@ def check_c11(context):
return rc


def check_sse42(context):
if GetOption('with_sse') is False:
def check_sqlite3(context):
rc = 1
if tests.CheckHeader(context, 'sqlite3.h'):
rc = 0
else:
rc = 1

if tests.CheckDeclaration(context, '__SSE4_2__'):
rc = 0
else:
conf.env.Prepend(CFLAGS=['-msse4.2'])

conf.env['HAVE_SSE42'] = rc
if tests.CheckLib(context, ['sqlite3']):
rc = 0

conf.env['HAVE_SQLITE3'] = rc
context.did_show_result = True
context.Result(rc)
return rc


def check_sqlite3(context):
def check_btrfs_h(context):
rc = 1
if tests.CheckHeader(context, 'sqlite3.h'):
rc = 0

if tests.CheckLib(context, ['sqlite3']):
if tests.CheckHeader(
context, 'linux/btrfs.h',
header='#include <stdlib.h>\n#include <sys/ioctl.h>'
):
rc = 0

conf.env['HAVE_SQLITE3'] = rc
conf.env['HAVE_BTRFS_H'] = rc
context.did_show_result = True
context.Result(rc)
return rc
@@ -409,7 +438,7 @@ AddOption(
action='store', metavar='DIR', help='libdir name (lib or lib64)'
)

for suffix in ['libelf', 'gettext', 'fiemap', 'blkid', 'json-glib']:
for suffix in ['libelf', 'gettext', 'fiemap', 'blkid', 'json-glib', 'gui']:
AddOption(
'--without-' + suffix, action='store_const', default=False, const=False,
dest='with_' + suffix
@@ -464,16 +493,18 @@ conf = Configure(env, custom_tests={
'check_libelf': check_libelf,
'check_fiemap': check_fiemap,
'check_xattr': check_xattr,
'check_sse42': check_sse42,
'check_sha512': check_sha512,
'check_blkid': check_blkid,
'check_sysctl': check_sysctl,
'check_posix_fadvise': check_posix_fadvise,
'check_sys_block': check_sys_block,
'check_bigfiles': check_bigfiles,
'check_c11': check_c11,
'check_gettext': check_gettext,
'check_sqlite3': check_sqlite3,
'check_linux_limits': check_linux_limits
'check_linux_limits': check_linux_limits,
'check_btrfs_h': check_btrfs_h,
'check_uname': check_uname
})

if not conf.CheckCC():
@@ -549,9 +580,6 @@ if 'clang' in os.path.basename(conf.env['CC']):
conf.env.Append(CCFLAGS=['-fcolor-diagnostics']) # Colored warnings
conf.env.Append(CCFLAGS=['-Qunused-arguments']) # Hide wrong messages

conf.env.Append(CCFLAGS=['-march=native'])
conf.check_sse42()

# Optional flags:
conf.env.Append(CFLAGS=[
'-Wall', '-W', '-Wextra',
@@ -567,9 +595,9 @@ env.ParseConfig(pkg_config + ' --cflags --libs ' + ' '.join(packages))

conf.env.Append(_LIBFLAGS=['-lm'])

conf.check_sysctl()
conf.check_blkid()
conf.check_sys_block()
conf.check_sysctl()
conf.check_libelf()
conf.check_fiemap()
conf.check_xattr()
@@ -578,6 +606,9 @@ conf.check_sha512()
conf.check_gettext()
conf.check_sqlite3()
conf.check_linux_limits()
conf.check_posix_fadvise()
conf.check_btrfs_h()
conf.check_uname()

if conf.env['HAVE_LIBELF']:
conf.env.Append(_LIBFLAGS=['-lelf'])
@@ -589,10 +620,14 @@ if conf.env['HAVE_SQLITE3']:
env = conf.Finish()

library = SConscript('lib/SConscript')
program = SConscript('src/SConscript', exports='library')
SConscript('tests/SConscript', exports='program')
programs = SConscript('src/SConscript', exports='library')
env.Default(library)

SConscript('tests/SConscript', exports='programs')
SConscript('po/SConscript')
SConscript('docs/SConscript')
SConscript('gui/SConscript')


def build_tar_gz(target=None, source=None, env=None):
tarball = 'rmlint-{a}.{b}.{c}.tar.gz'.format(
@@ -669,7 +704,6 @@ if 'config' in COMMAND_LINE_TARGETS:
Find non-stripped binaries (needs libelf) : {libelf}
Optimize using ioctl(FS_IOC_FIEMAP) (needs linux) : {fiemap}
Support for SHA512 (needs glib >= 2.31) : {sha512}
Support for SSE4.2 instructions for fast CityHash : {sse42}
Support for swapping metadata to disk (needs SQLite3) : {sqlite3}
Build manpage from docs/rmlint.1.rst : {sphinx}
Support for caching checksums in file's xattr : {xattr}
@@ -709,7 +743,6 @@ Type 'scons' to actually compile rmlint now. Good luck.
blkid=yesno(env['HAVE_BLKID']),
fiemap=yesno(env['HAVE_FIEMAP']),
sha512=yesno(env['HAVE_SHA512']),
sse42=yesno(env['HAVE_SSE42']),
sqlite3=yesno(env['HAVE_SQLITE3']),
bigfiles=yesno(env['HAVE_BIGFILES']),
bigofft=yesno(env['HAVE_BIG_OFF_T']),