General
Why is it required
Choosing the Reference Agent in the job
How does it work
Removing or changing the Reference Agent
Limitations and peculiarities
General
Starting with v3.0.0 Resilio Connect Agent introduces a Reference Agent for Synchronisation jobs, and starting with v4.0.0 it's also available for File caching and Hybrid work jobs.
Assigning a Reference Agent is compulsory in a sync job that is configured to sync NTFS or POSIX permissions. Without it, the job shows a warning on save.
In jobs where file permissions are not synced, a Reference Agent is not required but can still be used: it speeds up the initial folder merge with other RW Agents, as described below.
The Reference Agent is selected among the Read-Write Agents of version 3.0.0 or newer in the job. This Agent can be part of an RW group. Mobile Agents cannot be a Reference Agent.
Starting with Resilio v4.0.0 the whole High Availability group can be selected as Reference Agent.
Agents older than version 3.0.0 can only have Read-Only or Read-Only Selective Sync permissions in a job together with the Reference Agent. If the job contains an older Agent with Read-Write (or Read-Write Selective Sync) permissions, the Reference Agent option is grayed out and cannot be selected. See below for other limitations.
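The selection rules above can be summarised in a short sketch. This is only an illustrative model, not Management Console code: the Agent class, its fields, and the permission labels are hypothetical names, and the sketch assumes that only plain Read-Write Agents (not Read-Write Selective Sync) are offered as candidates.

```python
from dataclasses import dataclass

MIN_VERSION = (3, 0, 0)                      # first version with Reference Agent support
WRITE_PERMISSIONS = {"RW", "RW_SELECTIVE"}   # hypothetical permission labels


@dataclass
class Agent:
    """Hypothetical model of an Agent in a job."""
    name: str
    version: tuple          # e.g. (3, 0, 0)
    permission: str         # "RW", "RW_SELECTIVE", "RO" or "RO_SELECTIVE"
    is_mobile: bool = False


def reference_agent_candidates(agents):
    """Return the Agents that could be selected as Reference Agent.

    An empty list models the grayed-out selector.
    """
    # An older Agent with any write permission blocks selection entirely.
    if any(a.version < MIN_VERSION and a.permission in WRITE_PERMISSIONS
           for a in agents):
        return []
    # Assumption: only plain RW, non-mobile Agents of 3.0.0+ are candidates.
    return [a for a in agents
            if a.permission == "RW"
            and a.version >= MIN_VERSION
            and not a.is_mobile]
```

For example, a job with a 3.0.0 RW server, a mobile RW Agent, and a 2.9.0 Read-Only Agent yields only the server as a candidate; changing the 2.9.0 Agent to Read-Write empties the list.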
Why is it required
The Reference Agent is required to keep file permissions synced correctly and in the expected direction. This is especially the case for pre-seeded folders, where file permissions may differ between Agents. Due to the distributed nature of Resilio Connect, without a Reference Agent the file permissions from all read-write Agents are merged into a single tree and synced to all other Agents. As a result, unexpected permissions may be assigned, granting unintended access to files and folders.
All Agents in the job go through the initial synchronisation process: they reconcile file permissions and pre-seeded files with those on the Reference Agent (see below for details).
Choosing the Reference Agent is still advisable even if the job is not configured to synchronise file permissions, but it's not required. With the Reference Agent other RW Agents will merge their folder tree faster.
Choosing the Reference Agent in the job
Due to its important role, a Reference Agent should be a stable, always-online server.
Only Agents of version 3.0 and newer can be selected as a Reference Agent.
Starting with Resilio v4.0.0, a High Availability group can be selected as a Reference Agent. Choosing a single Agent from an HA group is not supported; the whole group acts as the Reference Agent.
In an existing job, a newly added Agent cannot be used as a Reference Agent in the job until it performs the initial sync with the others (see below for details about initial sync).
Upgrading the existing jobs
Currently existing jobs will continue working as they are after upgrading to Resilio Connect v3.0. It's possible to select one of the RW Agents to be a Reference Agent in the job though. If the job is configured to sync file permissions and is edited in any way, choosing a Reference Agent becomes required.
Online Agents that are not yet synced with the Reference Agent will perform the initial synchronisation. Agents that are currently offline will do the same as soon as they come online, even if they are already synced.
How does it work
When a Reference Agent is selected, all other read-write and read-only Agents in the job will perform the process of initial synchronisation and report the corresponding status:
1) all Agents will locally disable inheritance of file permissions for the sync root folder.
2) all Agents will overwrite all local file permissions of the root folder and files/folders inside with those from the Reference Agent.
File permissions on the Reference Agent remain unchanged and are distributed to other agents in the job.
1. Any file changes on RW agents won't be synced back until the initial sync is completed.
2. If some files are newer on RW Agents, or are missing on the Reference Agent but are present on other RW Agents in the job, these files with their permissions will be synced back to the Reference Agent after initial sync is completed.
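The reconciliation described above can be pictured with a small sketch. It is a simplified model under stated assumptions, not the Agent's actual algorithm: the function name and the dictionary layout (path mapped to a modification time and a permission string) are hypothetical. Files that are newer on the RW Agent are only listed for sync-back here, since the text does not spell out how their permissions are treated during the overwrite step.

```python
def plan_initial_sync(reference, agent):
    """Classify an RW Agent's files for initial synchronisation.

    Both arguments map a path to a (mtime, permissions) tuple.
    Returns (overwrite_permissions, sync_back) as sorted path lists.
    """
    overwrite_permissions = []  # permissions replaced by the Reference Agent's
    sync_back = []              # sent to the Reference Agent after initial sync

    for path, (mtime, perms) in agent.items():
        if path not in reference:
            # Missing on the Reference Agent: synced back with its permissions.
            sync_back.append(path)
            continue
        ref_mtime, ref_perms = reference[path]
        if mtime > ref_mtime:
            # Newer on the RW Agent: synced back with its permissions.
            sync_back.append(path)
        elif perms != ref_perms:
            # Same or older file: local permissions are overwritten.
            overwrite_permissions.append(path)

    return sorted(overwrite_permissions), sorted(sync_back)


reference = {"a.txt": (100, "r--"), "b.txt": (100, "rw-")}
agent = {"a.txt": (100, "rwx"), "b.txt": (200, "rwx"), "c.txt": (50, "rwx")}
overwrite, back = plan_initial_sync(reference, agent)
# overwrite == ["a.txt"], back == ["b.txt", "c.txt"]
```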
Initial sync is a one-time process that starts right after an Agent is added to a job and continues for this Agent until it merges the folder tree with the Reference Agent, or until the initial synchronisation is stopped manually.
Errors about creating the share's identifying ID file do not interrupt initial synchronisation; it resumes after the error is fixed. Starting with v3.2.1, initial synchronisation is not stopped if any error appears on the Reference Agent, even if the error is ignored.
If multiple files are added to or updated on the Reference Agent, it will have to process them, which will delay completion of the initial synchronisation. If necessary, add the custom parameter enable_file_system_notifications:false to the Agent profile, so that the Reference Agent is not notified about new/updated files.
During this process the Agents report status "initial synchronisation" in the job run.
Removing pre-existing inherited permissions is a lengthy operation and can take a significant amount of time on a large dataset (up to 20 hours for a dataset of 200 million files).
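For rough planning, the figure above can be turned into a back-of-the-envelope estimate. The sketch below assumes that the time scales linearly with the number of files, which the text does not state; actual duration depends on storage, file system, and load, so treat the result as an upper-bound guess only. The function names are hypothetical.

```python
REFERENCE_FILES = 200_000_000   # dataset size quoted above
REFERENCE_HOURS = 20            # upper bound quoted above


def files_per_second():
    """Implied processing rate at the quoted upper bound."""
    return REFERENCE_FILES / (REFERENCE_HOURS * 3600)


def estimate_hours(file_count):
    """Linear extrapolation (an assumption) from the quoted figure."""
    return file_count * REFERENCE_HOURS / REFERENCE_FILES


print(round(files_per_second()))   # about 2778 files per second
print(estimate_hours(50_000_000))  # 5.0 hours under the linear assumption
```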
It's highly advisable not to interrupt initial sync. The initial sync can be interrupted manually or by removing the Reference Agent.
To manually stop initial synchronisation, go to the Job run -> Agents, select an Agent, and click stop on the Overview tab. An event about manually stopping the initial synchronisation is recorded in the audit log.
If an Agent process is restarted, it will continue the initial synchronisation.
Initial synchronisation will start again if the job path is changed for an Agent.
Initial synchronisation will not start again if the job path is changed for the Reference Agent itself.
Peculiarities for High Availability groups
HA group is selected as Reference Agent: the main peculiarity is that the whole group acts as a Reference Agent. All the job run activities and the initial synchronisation are performed by the group's leader.
Cross-platform synchronization of file permissions
Using High Availability groups on a system that cannot apply the replicated permissions (for example, on Linux with replicated NTFS permissions or vice versa) should be avoided. It may lead to unexpected permission issues and access problems. Always ensure that HA groups are used on systems with compatible file permission structures.
HA group is in the job with Reference Agent: all the Agents in the group will stay in the 'initial sync' state until all of them complete it. Initial sync can be manually stopped on an Agent in the HA group, and in this case it will be automatically stopped on the other Agents in the group. If a new Agent is added to the HA group after initial sync is complete, initial sync will be restarted on all Agents in the group.
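The behaviour of an HA group that is in a job with a Reference Agent can be modelled as a small state machine. This is an illustrative sketch only: the class, method names, and state labels are hypothetical, and cases the text does not cover (for example, adding an Agent while initial sync is still running or stopped) are handled by simply adding the new member.

```python
class HAGroupInitialSync:
    """Hypothetical model of initial sync state across an HA group."""

    def __init__(self, agents):
        # Every member starts in initial sync.
        self.progress = {name: "running" for name in agents}

    def finish(self, name):
        """Mark one member as having completed its own initial sync."""
        if self.progress[name] == "running":
            self.progress[name] = "finished"

    def stop(self, name):
        """Stopping on one member stops initial sync on all members."""
        for member in self.progress:
            self.progress[member] = "stopped"

    def add_agent(self, name):
        """Adding a member after completion restarts initial sync for all."""
        if self.status() == "complete":
            for member in self.progress:
                self.progress[member] = "running"
        self.progress[name] = "running"

    def status(self):
        """Group-wide status: 'initial sync' until every member finishes."""
        states = set(self.progress.values())
        if states == {"finished"}:
            return "complete"
        if "stopped" in states:
            return "stopped"
        return "initial sync"
```

With two members, the group reports 'initial sync' after only the first one finishes and 'complete' after both do; adding a third member at that point puts all three back into initial sync.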
What will happen next
After the initial sync is complete, file permissions will be consistent on all Agents in the job, apart from cases with locked or inaccessible files as mentioned above.
Additionally, the RW Agents will check the local folder tree. If there are files that are missing from the Reference Agent, or are newer than the files on the Reference Agent, these will be synced back with their permissions. Once an RW Agent completes its initial synchronisation, it can be used as a Reference Agent.
Removing or changing the Reference Agent
It is possible to change a Reference Agent in the job. However, only one of the RW Agents that has already completed initial sync can be a new Reference Agent. You cannot select a new agent in the job as a Reference Agent.
Do not change the Reference Agent in the job until the other Agents complete the initial sync.
In Resilio Connect v3.4.x, the Reference Agent can be removed from the job only if the job is not configured to synchronise file permissions. Starting with v3.5.0, the Reference Agent can no longer be removed from an already configured job.
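The version-dependent removal rule can be written as a small decision function. This is a sketch with a hypothetical function name; it covers only the versions mentioned above and returns None for anything earlier than v3.4.0, since the text does not describe those versions.

```python
def can_remove_reference_agent(version, syncs_permissions):
    """Return True/False per the rule above, or None if the version is not covered.

    version is a (major, minor, patch) tuple of Resilio Connect.
    """
    if version >= (3, 5, 0):
        # v3.5.0 and newer: removal from a configured job is not possible.
        return False
    if version >= (3, 4, 0):
        # v3.4.x: removal is allowed only when permissions are not synced.
        return not syncs_permissions
    # Earlier versions are not described in this article.
    return None


print(can_remove_reference_agent((3, 4, 2), syncs_permissions=False))  # True
print(can_remove_reference_agent((3, 5, 0), syncs_permissions=False))  # False
```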
When the Reference Agent is removed from the job, the Agents that haven't finished initial sync yet will suspend the job and report an error. The error will remain on these Agents until a new Reference Agent is chosen or the initial sync is stopped manually. Removing the Reference Agent should be avoided: file permissions may become inconsistent, which leads to the very problem the Reference Agent is supposed to solve. Newly added RW Agents will perform the initial sync with one of the existing RW Agents that have already completed their initial sync, but will also report an error about the missing Reference Agent.
If the Reference Agent is offline, newly added RW Agents will also perform the initial sync with one of the existing RW Agents that have completed their initial sync.
Limitations and peculiarities
Agents of older versions (below 3.0.0) with Read-Write or Selective Sync Read-Write permissions cannot be used in a job together with Reference Agent. This also means that they cannot be a part of a group in such a job, including auto-groups - the agents won't be auto-assigned to such a group.
RW Agents that are performing initial synchronisation report Read-Only access to the share in the Agent UI. After it's done, the share switches to RW access in the Agent UI; however, a restart of the Agent is required for TSS shares on macOS.
Enabling synchronisation of file permissions in the middle of a job run does not take effect; it requires recreating the job. If recreating the job is not possible at the moment and is postponed, do not restart the Agent in such a job run.
Status 'Initial synchronisation' is a virtual status and cannot be filtered on job run overview or in the Agents' table in the job run.
Any errors that appear on the Reference Agent don't interrupt the initial synchronisation, even if the error is ignored on the MC. If initial synchronisation runs for too long for no apparent reason, check the list of ignored errors.
In MC v3.3.0 and newer restarting the job on the Reference Agent is not possible.