Ok, I think I got the hang of this. Let’s write it down so it can help others.
Intro: S3 buckets as primary storage for Nextcloud
Simple Storage Service (S3) comes from Amazon Web Services (but is now supported by a variety of other providers) and is used to host massive amounts of files, with the ability to select different access tiers for scalability and cost management. Buckets can easily be encrypted server-side, replicated in the cloud, and versioned for safety, and switching between redundant S3 buckets is as easy as changing an ID/secret string in config.php.
All in all, they are pretty great because they enable cheap data hosting while keeping the data away from the front-end server (hosting Linux/Apache/Nextcloud), where SSD space is more costly. This allows me to keep a 30 GB volume for the server part, which gets fully backed up once a day as a snapshot (OS, programs, Apache and Nextcloud config, and database at once), and these snapshots can be used to spawn a new computing instance or to roll back the running instance in a matter of seconds.
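For reference, the primary-storage setup lives in the objectstore section of config.php. A minimal sketch, with placeholder credentials, bucket name, and region that you must adapt to your provider:

```php
// config/config.php — S3 as primary storage (all values are placeholders)
'objectstore' => [
    'class' => '\\OC\\Files\\ObjectStore\\S3',
    'arguments' => [
        'bucket' => 'my-nextcloud-bucket',  // placeholder bucket name
        'key'    => 'AKIAXXXXXXXX',         // placeholder access key ID
        'secret' => 'xxxxxxxxxxxxxxxx',     // placeholder secret key
        'region' => 'eu-west-3',            // placeholder region
        'use_ssl' => true,
    ],
],
```

Switching to a replica bucket then really is just editing these few lines.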
But…
S3 buckets are not filesystems. And that’s where it gets messy with Nextcloud.
Using sudo -u www-data php /var/www/nextcloud/occ files:cleanup or sudo -u www-data php /var/www/nextcloud/occ files:scan-app-data has no effect on external containers (that is, anything not hosted in the local /var/www/nextcloud). Meaning:
- there is no garbage collection whatsoever on S3 buckets:
  - file versions are supposed to be kept as long as the user's used storage space is below 50 % of their quota. With S3 buckets, given that you pay for what you use, you might be tempted to leave quotas unlimited, but then versions are kept forever.
  - deleting files deletes neither the database entries for their thumbnails nor the actual thumbnail files on the S3 bucket (issue reported in 2020).
- you can't manage the garbage collection yourself, because:
  - as mentioned above, S3 is not a filesystem: it's a collection of objects named like urn:oid:xxxx, where xxxx is an incremental number known only to your Nextcloud database.
  - you can't clean your DB manually either, because the only link between the files living on your devices and their S3 object IDs is in the Nextcloud database and would be lost forever.
  - the oc_filecache database table references the thumbnail paths as nextcloud/data/appdata_xxxx/preview/*, which is super misleading because this folder exists neither in the local /var/www/nextcloud directory nor on the S3 bucket, while the thumbnails are very much hosted on the S3 bucket (as shown by the storage column of the oc_filecache table).
- emptying the trash bin doesn't even work on S3 buckets: issue reported in 2020.
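You can see the mismatch yourself with a query along these lines, run against your live Nextcloud database (a sketch assuming the default oc_ table prefix), which joins each cached preview path to the storage backend it actually lives on:

```sql
-- Which backend really holds the preview files referenced by oc_filecache?
SELECT f.fileid,
       f.path,                -- misleading appdata_*/preview/* path
       s.id AS storage_backend -- e.g. object::store:amazon::xxxx
FROM oc_filecache AS f
JOIN oc_storages AS s ON s.numeric_id = f.storage
WHERE f.path LIKE '%preview%'
LIMIT 10;
```

The path column suggests a local folder, while storage_backend points at the object store.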
Solving my problem
Problem 1: unoptimized DB
For some reason, Nextcloud doesn't routinely optimize its database. After running sudo mysqlcheck -o --all-databases, my DB shrank from 3.8 GB to 2.5 GB, but it took ages.
To run it every day at 1 am, add a cron job as the root user (sudo crontab -e) with the following line: 0 1 * * * mysqlcheck -o --all-databases
Problem 2: no garbage collection after deleting apps (plugins)
I found tables for apps I had removed a couple of months ago, like Duplicate Finder and Nextcloud Mail, even though I deleted all my accounts in Nextcloud Mail before removing the app. Running DROP TABLE oc_useless_table;
on each leftover table made me recover 20 MB.
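To spot leftover tables in the first place, it helps to list all tables by size through information_schema (a sketch assuming your database is named nextcloud; adjust the schema name to yours):

```sql
-- Largest tables first: leftovers from removed apps stand out here
SELECT table_name,
       ROUND((data_length + index_length) / 1024 / 1024, 1) AS size_mb
FROM information_schema.tables
WHERE table_schema = 'nextcloud'
ORDER BY size_mb DESC
LIMIT 20;
```

Tables named after apps you no longer have installed are candidates for dropping, after a backup.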
Problem 3: no garbage collection for removed external storages
I added my website server as FTP external storage to see if that was useful, but it was slower than mounting FTP and DAV directly in my OS file browser, so I removed it. Yet all the indexed files and their versions were still in the DB.
- List known storages: SELECT * FROM oc_storages;
+------------+------------------------------------+-----------+--------------+
| numeric_id | id                                 | available | last_checked |
+------------+------------------------------------+-----------+--------------+
| 4          | object::store:amazon::xxxx         | 1         | NULL         |
| 6          | object::user:xxx                   | 1         | NULL         |
| 7          | local::/var/www/nextcloud/data/    | 1         | NULL         |
| 8          | home::xxx                          | 1         | NULL         |
| 10         | home::xxx                          | 1         | NULL         |
| 12         | ftp::user@server.com/              | 1         | 1675297283   |
+------------+------------------------------------+-----------+--------------+
Note that removed storages can still have a 1 in the available column, so I don't know what that column actually tracks. Find the numeric_id of the storage you don't use anymore; here it's 12.
- Clear the file cache of files from this removed storage: DELETE FROM oc_filecache WHERE storage=12;
- Run SQL optimize on oc_filecache: OPTIMIZE TABLE oc_filecache;
Problem 4: tags are poorly implemented
I use only 1 tag, on 6 files, out of the 2.6 million files hosted on my Nextcloud instance. The only reason I'm even using that tag is for file retention rules.
This seems to be enough to create a full oc_systemtag_object_mapping table containing 2.6 million entries and weighing close to a GB for no reason.
I removed the file retention rule (I'll manage that by hand), the File retention app, and the Collaborative tags app, then dropped the whole offending table: DROP TABLE oc_systemtag_object_mapping;
Nextcloud will either re-create the table or prevent it from being dropped, so we again need to optimize it after re-creation: OPTIMIZE TABLE oc_systemtag_object_mapping;
The database is now down to 1.6 GB.
Problem 5: versions are kept forever
Because the file version garbage collection is triggered by an account's size limit, it never runs with unlimited quotas.
In /var/www/nextcloud/config/config.php, I added 'versions_retention_obligation' => '14,30', to have versions retained for at least 14 days but at most 30 days, as described in the documentation.
Then run sudo -u www-data php /var/www/nextcloud/occ versions:expire. It barely changes the database size, though.
Problem 6: orphaned thumbnails are not deleted
This guy wrote a garbage collection script that properly links S3 object IDs with files to check for orphaned previews (whose parent file has been removed):
- Get it: wget http://raw.githubusercontent.com/charleypaulus/nextcloudtools/main/delete-old-previews.sh
- Install rclone: sudo apt install rclone (on Ubuntu)
- Back up your Nextcloud database
- Run it: sudo /bin/bash delete-old-previews.sh
- Note the number of orphaned thumbnails, then escape with Ctrl+D, because the S3 bucket is hardcoded in the script and it will fail.
I have 655 orphaned files in there, so I will not bother adapting the script for now.
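The core idea of the script boils down to a set difference: objects present in the bucket that the database no longer references. A hypothetical miniature with hand-made ID lists (the real script builds them from rclone and oc_filecache):

```shell
# Toy ID lists standing in for the real exports (values are made up)
printf '123\n456\n789\n' | sort > ids_in_bucket.txt  # e.g. from rclone ls
printf '123\n789\n' | sort > ids_in_db.txt           # e.g. from oc_filecache
# comm -23 keeps lines unique to the first file:
# present in the bucket but absent from the DB, i.e. orphans
comm -23 ids_in_bucket.txt ids_in_db.txt   # prints 456
```

Both inputs must be sorted for comm to work; anything it prints is safe to delete only if your DB export was complete.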
Conclusion
I'm not impressed with Nextcloud's scalability. My instance is 4 months old, has 2 users and 4 connected devices, and it has already gathered a lot of dust. From what I see here, the software can't scale, and I wouldn't want to be the sysadmin of an organization whose users exercise the full range of its features.
What worries me is that most of the issues I faced with S3 were reported 2 years ago. They could be solved by a handful of SQL scripts in the right places. Yet Nextcloud is developing niche features like Talk (when we already have Jitsi and Matrix) or Mastodon-like stuff instead of fixing basic file handling on external storage. And by "basic file handling" I mean deleting DB entries and caches when a file is deleted on a filesystem; I'm not asking for a neutrino vacuum cleaner.
More importantly, none of these issues are properly documented in the user docs, so everything looks OK at install time and the bad surprises arise later. The documentation says S3 is supported, so you assume it's properly supported, and it's not.
I decreased the size of the DB from 3.8 GB to 1.6 GB (which is still a lot) and brought the CPU consumption of MariaDB from 15 % back down to 0.5-1 %. This took me 3 days and the help of @SysKeeper (thanks a lot).