# destor

**Repository Path**: chen-boju/destor

## Basic Information

- **Project Name**: destor
- **Description**: An experimental platform for chunk-level data deduplication.
- **Primary Language**: C
- **License**: GPL-3.0
- **Default Branch**: master
- **Created**: 2024-10-01
- **Last Updated**: 2025-03-29

## README

This repository contains the source code for the paper:

##### Cai Deng, Boju Chen

### The Logic of Fingerprint Upgrade in Deduplicated Storage

This project supports fingerprint upgrade, specifically upgrading 20-byte SHA-1 fingerprints to 32-byte SHA-256 fingerprints. Our implementation of the naïve and the deduplication-aware index is based on the open-source Destor storage system: https://github.com/fomy/destor

Features
--------
1. LFU, PFU, Con-PFU, and Sim-PFU upgrade algorithms;
2. A new restore algorithm for the upgraded fingerprints.

Environment
-----------
Linux 64bit.

Dependencies
------------
1. libssl-dev is required to calculate SHA-1 and SHA-256 digests;
2. RocksDB (librocksdb) is required for the external key-value store;
3. GLib 2.32 or a later version;

    > libglib.so and glib.h may not be found when you first install it.
    > The header files (originally located in /usr/local/include/glib-2.0 and /usr/local/lib/glib-2.0/include) must be moved to a searchable path, such as "/usr/local/include".
    > A link named libglib.so should also be created in "/usr/local/lib".

4. The Makefile is automatically generated by GNU autoconf and automake.

Compile
-------
If all dependencies are installed, compiling destor is straightforward:

>autoreconf -ivf
>
>./configure
>
>make
>
>make install

To uninstall destor, run

>make uninstall

Running
-------
If compilation and installation succeed, the executable file, destor, should have been installed to /usr/local/bin by default.
Create a directory named `log` and a config file named `destor.config` in the directory where you run destor. A sample destor.config is in the project directory.

### backup

The following command creates a deduplicated backup from a given dataset path:

`destor /path/to/data -p"a line as in config file"`

`/path/to/data` can be a file or a directory. If it is a directory, the entire directory will be recursively backed up into the backup directory defined in destor.config. If it is a file, each line of the file is treated as the path of a file to be backed up.

### sha-1 restore

To restore the data from the backup, run

`destor -r0 /path/to/restore -p"a line as in config file"`

### upgrade fingerprints

To upgrade fingerprints from SHA-1 to SHA-256, run

`destor -u0 -i -p"a line as in config file"`

Upgrade methods:

- 0: LFU
- 1: PFU
- 2: Con-PFU
- 3: Sim-PFU

### sha-256 restore

To restore the data from the upgraded (SHA-256) fingerprints, run

`destor -n1 /path/to/restore -p"a line as in config file"`

Configuration
-------------
When destor is initialized, it reads a configuration file with the fixed name `destor.config`, in which each line gives a parameter name followed by its value.

For example:

```
working-directory "/home/cbj/destor_working_directory"
direct-reads 1
simulation-level no
```

The following parameters are used when upgrading fingerprints:

* working-directory: path to the directory that will contain the deduplicated backups.
* fingerprint-index-cache-size: in-memory cache size, in **bytes**, for the fingerprint index.
* direct-reads: 1 for direct I/O reads, 0 for normal reads.
* simulation-level: "no" for regular files, "all" for trace files.
* recipe-cdc-max/exp/min-size: the maximum/expected/minimum size (number of chunks) of a logic recipe after dividing large recipes into sub-recipes through CDC.

Warnings
--------
1. Only one backup and one upgrade are supported: version 0 corresponds to the backup, and version 1 corresponds to the upgrade.
You can restore data from both versions, but with different commands (`destor -r0` and `destor -n1`). Other commands from the original destor are not supported.
2. The backup task is forced to use the original htable kvstore, so its speed does not represent actual backup performance.
3. The file backupversion.count is disabled; see recipestore.c:22.

Author
------
Boju Chen

Email : cbj.vip@qq.com

(Feel free to contact me if you have any questions about PFU. I would appreciate bug reports.)
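
Example setup
-------------
The Running section asks for a `log` directory and a `destor.config` file in the directory where destor is started. The following is a minimal sketch of that preparation, not part of the project itself. It assumes a scratch directory created with `mktemp -d`, and it assumes that creating the working directory in advance is harmless. The names `run_dir` and `work_dir` are illustrative only. The parameter values are the ones from the Configuration example, with the working directory pointed at the scratch location.

```shell
#!/bin/sh
# Sketch only: prepares a directory from which destor could be run.
# run_dir and work_dir are illustrative names, not part of destor.
set -eu

run_dir=$(mktemp -d)          # scratch location; any directory would do
work_dir="$run_dir/working"   # assumed location for the deduplicated backups

# destor looks for a `log` directory and a `destor.config` where it is started.
mkdir -p "$run_dir/log" "$work_dir"

# One parameter per line: the parameter name followed by its value.
cat > "$run_dir/destor.config" <<EOF
working-directory "$work_dir"
direct-reads 1
simulation-level no
EOF

echo "run destor from: $run_dir"
```

After this, change into the printed directory and run the backup, upgrade, and restore commands from the Running section there.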