
High memory consumption in duplicity v0.8.X when backing up many small files

Hello,

I have a problem after migrating duplicity from version 0.7.19 (Oracle Linux 7.9) to version 0.8.21 (Oracle Linux 8.5).

On OEL8 I see significant memory consumption. After approximately 2 hours of running duplicity, all available RAM (6 GB) is exhausted, then swap (1 GB) is used until the duplicity process is killed by the OOM killer.

On OEL7, the same operation barely changes memory consumption. Memory usage for the entire operating system does not exceed 900 MB (!).

The backup covers over 13 million small files (~1KB-20KB each) spread across many nested directories. The files take up over 1TB in total.

Environment OEL8 (Oracle Linux 8.5)

  • duplicity 0.8.21 (also 0.8.18)
  • Args: /bin/duplicity --no-encryption --verbosity info --progress --log-file /opt/backup/backup.log --volsize=1024 --tempdir=/opt/backup/tmp/ --name BACKUP --archive-dir /opt/backup/ARCHIVE /opt/store file:///opt/backup/BACKUP/FD
  • Linux os-store 5.4.17-2136.302.7.2.1.el8uek.x86_64 #2 SMP Tue Jan 18 12:11:34 PST 2022 x86_64 x86_64
  • /usr/bin/python3.6 3.6.8 (default, Nov 10 2021, 06:50:23)
  • [GCC 8.5.0 20210514 (Red Hat 8.5.0-3.0.2)]
  • Temp has 3979046998016 available, backup will use approx 1395864371.

Environment OEL7 (Oracle Linux 7.9)

  • duplicity 0.7.19 (April 29, 2019)
  • Args: /bin/duplicity --no-encryption --verbosity info --progress --log-file /opt/backup/backup.log --volsize=1024 --tempdir=/opt/backup/tmp/ --name BACKUP --archive-dir /opt/backup/ARCHIVE /opt/store file:///opt/backup/BACKUP/FD
  • Linux os-store 5.4.17-2136.302.7.2.1.el7uek.x86_64 #2 SMP Tue Jan 18 13:44:44 PST 2022 x86_64 x86_64
  • /bin/python2 2.7.5 (default, Mar 12 2021, 14:55:44)
  • [GCC 4.8.5 20150623 (Red Hat 4.8.5-44.0.3)]
  • Temp has 3980725600256 available, backup will use approx 1395864371.

Steps to reproduce

Prepare the /opt/store directory with files (bash):

mkdir -p /opt/store/000; cd /opt/store/000
for i in {001..100}; do
  mkdir $i; cd $i
  for j in {001..999}; do
    mkdir $j; cd $j
    for k in {001..999}; do
      dd if=/dev/urandom of=file_$k bs=1 count=$(( RANDOM + 1024 ))
    done
    cd ..
  done
  cd ..
done

and run:

duplicity --no-encryption --verbosity info --progress --log-file /opt/backup/backup.log --volsize=1024 --tempdir=/opt/backup/tmp/ --name BACKUP --archive-dir /opt/backup/ARCHIVE /opt/store file:///opt/backup/BACKUP/FD

I needed at least 2-3 million files before the anomaly became clearly observable.
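Since only a few million files seem to be needed, a scaled-down generator may be enough to reproduce this. The following is only a sketch under my own assumptions: make_tree and its parameters are hypothetical names, and head -c is swapped in for dd bs=1 so that each file is written in one go. The size range (1024 to 33791 bytes) matches RANDOM + 1024 from the loop above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of duplicity): make_tree ROOT TOP SUB FILES
# Creates TOP x SUB nested directories under ROOT, each holding FILES files
# of random content, 1024..33791 bytes each (RANDOM is 0..32767 in bash).
make_tree() {
  local root=$1 top=$2 sub=$3 files=$4 i j k dir
  for (( i = 1; i <= top; i++ )); do
    for (( j = 1; j <= sub; j++ )); do
      dir=$(printf '%s/%03d/%03d' "$root" "$i" "$j")
      mkdir -p "$dir"
      for (( k = 1; k <= files; k++ )); do
        # head -c copies the requested number of bytes in large reads,
        # instead of one byte per read as dd bs=1 does
        head -c "$(( RANDOM + 1024 ))" /dev/urandom \
          > "$(printf '%s/file_%03d' "$dir" "$k")"
      done
    done
  done
}
```

For example, make_tree /opt/store/000 3 1000 999 would create roughly 3 million files (3 x 1000 x 999).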

Tests

I also ran tests on another test host (OEL8, ~3 million files, file size: ~1KB-20KB, total: ~50GB), not using the RPM packages provided by the repository but building duplicity from source. The conclusions are the same as with the RPM packages:

  1. duplicity 0.8.21 + python 3.6.8 => high memory consumption (attachment: dupl-0.8.21_python-3.6.8.txt)
  2. duplicity 0.8.21 + python 2.7.18 => memory consumption also increases, but slightly less than with python 3.6 (attachment: dupl-0.8.21_python-2.7.18.txt)
  3. duplicity 0.7.19 + python 3.6.8 => not supported, not tested
  4. duplicity 0.7.19 + python 2.7.18 => memory usage does not change; this is the expected behavior (attachment: dupl-0.7.19_python-2.7.18.txt)

The attached details were captured after ~15-30 minutes of runtime. Memory usage:

[Image: duplicity-mem_used-compare]
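To collect comparable per-process numbers on another host, something like the sketch below could be used. It is not how the measurements above were taken; it is a minimal example that assumes Linux with /proc mounted, and rss_kb and sample_rss are hypothetical helper names. It reads the VmRSS field from /proc/PID/status at a fixed interval until the process goes away.

```shell
#!/usr/bin/env bash
# rss_kb PID: print the resident set size of PID in kB (Linux /proc only).
rss_kb() {
  awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

# sample_rss PID [INTERVAL]: print "epoch_seconds rss_kb" every INTERVAL
# seconds (default 60) until PID no longer reports a resident set size.
sample_rss() {
  local pid=$1 interval=${2:-60} rss
  while [ -d "/proc/$pid" ]; do
    rss=$(rss_kb "$pid" 2>/dev/null) || break
    # Exited (zombie) processes have no VmRSS line, so stop on empty output
    [ -n "$rss" ] || break
    printf '%s %s\n' "$(date +%s)" "$rss"
    sleep "$interval"
  done
}
```

A possible use is to start the duplicity command from the steps above in the background and then run sample_rss $! 60 > rss.log (rss.log is just an example file name), which gives one sample per minute that can be compared between 0.7.19 and 0.8.21.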