Code Sync Across Network-Restricted Machines at CERN
When working with CERN’s network-restricted machines, often used in experimental trigger development, we may need to connect through multiple SSH hops. Typically, this means logging into lxplus, then jumping to an intermediate node like lbgw, and finally reaching the restricted machine. If we prefer local tools like VSCode or JetBrains, this setup can be challenging; installing and maintaining these editors in a restricted environment is resource-intensive and often impractical.
To solve this, we’ll set up SSH key-based authentication across all nodes and use a bidirectional sync script that automatically keeps both our local and remote environments in sync. This setup allows us to work with all our favorite local tools while maintaining an up-to-date remote environment in real time. CERN IT’s testing of two-factor authentication (2FA) for SSH access also makes this a timely solution, as it eliminates the need for repeated 2FA prompts.
With this setup, we avoid installing heavy remote servers for VSCode or JetBrains, keeping our development environment flexible and efficient.
If you just need the script, you can download it as a gist here.
Required Dependencies and Supported OS
This setup requires a few essential tools that are typically pre-installed on Linux and macOS:
- SSH: Secure access across multiple hops is central to this setup.
- rsync: Provides efficient file syncing between local and remote systems.
OS Compatibility
- Linux: Fully compatible with Linux distributions, especially those using apt (Debian/Ubuntu) or yum/dnf (RHEL/Fedora) for package management.
- macOS: Compatible with Homebrew-installed dependencies.
Installing Dependencies
On Linux, we can install rsync with:
# Debian/Ubuntu
sudo apt install rsync
# RHEL/Fedora
sudo yum install rsync
For macOS:
- Install Homebrew if we haven’t already:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
- Install rsync:
brew install rsync
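Before moving on, it may help to confirm that both tools are available; a quick check like the following should print version information on either OS:
# Quick sanity check that ssh and rsync are installed (works on Linux and macOS)
ssh -V
rsync --version | head -n 1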
Setting Up SSH Key Authentication Across Hops
Using SSH keys simplifies access and removes the need to re-enter credentials at each hop, which is especially useful for those testing CERN IT’s 2FA SSH access.
Generate an SSH key pair on our local machine if we haven’t already:
ssh-keygen -t rsa -b 4096 -C "email@example.com"
Once we have the keys, copy the public key to each machine in our SSH chain, which might look like lxplus ➞ lbgw ➞ restricted machine. Set up the keys in the following order:
1. Add the key to lxplus:
ssh-copy-id -i ~/.ssh/id_rsa.pub username@lxplus.cern.ch
2. Add the key to lbgw via lxplus:
ssh -J username@lxplus.cern.ch username@lbgw 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys' < ~/.ssh/id_rsa.pub
3. Add the key to the restricted machine via lbgw:
ssh -J username@lxplus.cern.ch,username@lbgw username@restricted_machine 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys' < ~/.ssh/id_rsa.pub
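To check that the whole chain works without password prompts, we can run a simple command on the restricted machine through both jumps; if the keys are set up correctly, this returns immediately:
# Should print the hostname of the restricted machine without asking for a password
ssh -J username@lxplus.cern.ch,username@lbgw username@restricted_machine 'hostname'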
This key setup enables seamless SSH access across hops without repeated authentication, allowing us to reach the final machine without interruption.
To streamline further, configure our ~/.ssh/config file with the multi-hop settings:
Host lxplus
    User username
    HostName lxplus.cern.ch
    ForwardAgent yes

Host lbgw
    User username
    HostName lbgw
    ProxyJump lxplus

Host restricted_machine
    User username
    HostName hostname_of_restricted_machine
    ProxyJump lbgw
With this setup, we can reach the restricted machine by typing ssh restricted_machine.
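The same alias works for any tool that tunnels over SSH, so rsync and scp can address the restricted machine directly as well; for example (assuming the Host entries above, with illustrative paths):
# Run a command on the restricted machine through both hops
ssh restricted_machine 'uname -a'
# Copy a directory there using the same alias (paths are illustrative)
rsync -avz ./my_project/ restricted_machine:~/my_project/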
Writing a Real-Time Bidirectional Sync Script with Checksum Comparison
With SSH keys and simplified connections in place, we can now create a script that monitors our local and remote project directories for changes using checksum comparison. By generating a checksum of each directory, the script detects when changes occur and syncs in both directions as needed. This approach provides efficient, real-time sync across our machines.
The sync script is organized into logical parts for flexibility and ease of maintenance.
Script Structure
Before writing the script, think about its structure to help organize necessary functions. Let’s outline the requirements:
Requirements:
- Sync changes from local to remote and vice versa.
- Monitor both local and remote directories for changes.
- Update files on both machines whenever changes are detected.
- Handle both local and remote directories.
- Generate checksums for local and remote directories.
- Implement best practices for logging and error handling.
Based on the requirements, we can proceed with writing the script in functional blocks. The following section will show how we can achieve these requirements using simple functions.
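Before filling in the details, one possible top-level layout for the script is sketched below; the function names match the blocks described in the sections that follow:
#!/bin/bash
# Rough skeleton of sync.sh (one possible layout):
#   1. Parse arguments and define logging helpers
#   2. generate_checksum    - checksum a directory tree
#   3. sync_local_to_remote / sync_remote_to_local - rsync in each direction
#   4. monitor_changes      - loop: compare checksums, sync when they differ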
Functions
Parameters and Logging
#!/bin/bash
# Parse user inputs
REMOTE_USER="$1"
REMOTE_SERVER="$2"
REMOTE_PATH="$3"
LOCAL_PATH="$4"
SYNC_INTERVAL="${5:-10}"
# Define colors for logs
GREEN='\033[0;32m'
RED='\033[0;31m'
NC='\033[0m' # No color
log_info() { echo -e "${GREEN}[INFO]${NC} $1"; }
log_error() { echo -e "${RED}[ERROR]${NC} $1"; }
This part sets up color-coded logging functions and parses user inputs for the remote and local paths, with an optional sync interval (defaulting to 10 seconds).
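Since the first four positional arguments are required, it may be worth adding a small guard right after parsing so the script fails early with a usage message; a minimal sketch:
# Abort early if any required argument is missing (sketch; adjust as needed)
if [ -z "$REMOTE_USER" ] || [ -z "$REMOTE_SERVER" ] || [ -z "$REMOTE_PATH" ] || [ -z "$LOCAL_PATH" ]; then
    log_error "Usage: $0 <remote_user> <remote_server> <remote_path> <local_path> [sync_interval]"
    exit 1
fi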
Checksum Generation
# Function to generate a checksum file list in a directory
generate_checksum() {
find "$1" -type f ! -path "*/node_modules/*" -exec md5sum {} + 2>/dev/null | sort | md5sum | awk '{print $1}'
}
This function generates a checksum for a directory, excluding node_modules. To expand the list of excluded directories, pass additional arguments to the function:
# Function to generate a checksum file list with exclusions
generate_checksum() {
find "$1" -type f ! -path "*/node_modules/*" ! -path "*/$2/*" -exec md5sum {} + 2>/dev/null | sort | md5sum | awk '{print $1}'
}
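For example, with this variant we could exclude both node_modules and a second directory when checksumming the local tree (the name "build" below is just an illustration):
# Example call: the second argument names an extra directory to skip
local_checksum=$(generate_checksum "$LOCAL_PATH" "build")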
Bidirectional Sync Functions
# Sync function from local to remote
sync_local_to_remote() {
log_info "Syncing changes from local to remote"
rsync -avz --exclude 'node_modules' "$LOCAL_PATH/" "${REMOTE_USER}@${REMOTE_SERVER}:${REMOTE_PATH}"
}
# Sync function from remote to local
sync_remote_to_local() {
log_info "Syncing changes from remote to local"
rsync -avz --exclude 'node_modules' "${REMOTE_USER}@${REMOTE_SERVER}:${REMOTE_PATH}/" "$LOCAL_PATH"
}
These functions use rsync to copy files and directories, excluding node_modules.
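If more directories need to be skipped, rsync accepts repeated --exclude flags (or --exclude-from with a file of patterns); a variant of the local-to-remote function might look like this, where the extra patterns are only examples:
# Variant with additional exclusions (the extra patterns are illustrative)
sync_local_to_remote() {
    log_info "Syncing changes from local to remote"
    rsync -avz --exclude 'node_modules' --exclude '.git' --exclude 'build' \
        "$LOCAL_PATH/" "${REMOTE_USER}@${REMOTE_SERVER}:${REMOTE_PATH}"
}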
Monitoring for Changes
# Monitor changes based on checksum comparison
monitor_changes() {
while true; do
local_checksum=$(generate_checksum "$LOCAL_PATH")
remote_checksum=$(ssh "$REMOTE_USER@$REMOTE_SERVER" "cd $REMOTE_PATH && find . -type f ! -path '*/node_modules/*' -exec md5sum {} + 2>/dev/null | sort | md5sum | awk '{print \$1}'")
if [ "$local_checksum" != "$remote_checksum" ]; then
log_info "Detected changes. Syncing..."
sync_remote_to_local
sync_local_to_remote
else
log_info "No changes detected. Skipping sync."
fi
# Wait before the next check
sleep "$SYNC_INTERVAL"
done
}
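Note that nothing above actually starts the loop, so the last lines of the script should invoke monitor_changes; something like this, with an optional trap so Ctrl-C exits cleanly:
# Stop cleanly on Ctrl-C, then start the monitoring loop
trap 'log_info "Stopping sync."; exit 0' INT TERM
monitor_changes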
Now our script is ready to run. Save it as sync.sh.
Running the Script
After creating the script, make it executable and run it with our desired paths and sync interval:
chmod +x sync.sh
./sync.sh username restricted_machine /path/to/remote/project /path/to/local/project 10
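Because the script reconnects on every cycle, the restricted_machine alias from ~/.ssh/config is what makes the multi-hop path transparent here. For longer sessions we might prefer to run it in the background and keep its output in a log file; one way, purely as an illustration:
# Run the sync loop in the background and capture its output (illustrative)
nohup ./sync.sh username restricted_machine /path/to/remote/project /path/to/local/project 10 > sync.log 2>&1 &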
This setup lets us work seamlessly with our local development tools without installing them on the remote environment. With dependencies aligned for Linux and macOS, this guide offers flexibility for CERN developers working on network-restricted machines, reducing the friction caused by multi-hop SSH connections and 2FA.