Day-2 operations guide - MacStadium Docs

This guide covers the ongoing operational tasks for managing a production MacStadium VDI environment. It assumes your deployment is complete, users are onboarded, and you’re now responsible for day-to-day operations and maintenance. For a full list of available playbooks and common variable combinations, see the Ansible quick reference.

What this guide covers:

Routine capacity and image management
User lifecycle operations
Backup and recovery

What this guide assumes:

You’ve completed the initial deployment process
You’re familiar with basic Orka operations
You have admin access to Citrix Cloud, Orka hosts, and your Ansible control node
Your environment is operational with active users

Prerequisites:

SSH access to MacStadium VDI hosts
Citrix Cloud admin credentials
A project set up with an Ansible control node running Orka Engine
Access to your container registry

For advanced configuration topics (bridged networking, HDX tuning, automation, multi-tenancy), see Advanced Configuration. For incident response, change management, and compliance, see Incident Response & Change Management.

Routine operations

Capacity management

Monitoring host utilization:

Check the current VM distribution across your Orka hosts by running the following Ansible playbook:

ansible-playbook -i inventory list.yml

This playbook lists every VM in your environment and the host it runs on. Watch for uneven distribution (for example, one host overloaded while others sit idle), hosts approaching their VM limit, and resource warnings in the log output.

To check resource usage on your hosts, run the following Ansible ad-hoc command:

ansible hosts -i inventory -m shell -a "top -l 1 | grep -E 'CPU|PhysMem'"

Keep sustained CPU usage below 80%. Also watch for memory pressure (swap usage) and confirm the available disk space on /var/orka. The command above reports CPU and physical memory only, so check swap and disk separately.
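To turn the 80% guideline into a quick check, the sketch below parses the "CPU usage" summary line that macOS top -l 1 prints and reports whether the host is over the threshold. The function names cpu_busy_pct and over_threshold are illustrative, and the sample line stands in for real host output. It evaluates a single sample, so judging sustained load still means comparing several runs.

```shell
#!/bin/sh
# Sketch: flag a host whose CPU busy percentage exceeds a threshold.
# Assumes the macOS "top -l 1" summary format:
#   CPU usage: 12.50% user, 25.00% sys, 62.50% idle

# Print the busy percentage (100 - idle) for a "CPU usage" line.
cpu_busy_pct() {
  printf '%s\n' "$1" | awk '/CPU usage/ {
    for (i = 2; i <= NF; i++)
      if ($i ~ /^idle/) {
        idle = $(i - 1)
        sub(/%/, "", idle)
        printf "%.1f\n", 100 - idle
      }
  }'
}

# Succeed (exit 0) when busy percentage $1 is above threshold $2.
over_threshold() {
  awk -v busy="$1" -v limit="$2" 'BEGIN { exit !(busy > limit) }'
}

# Sample line standing in for real output from a host.
sample='CPU usage: 55.10% user, 30.40% sys, 14.50% idle'
busy=$(cpu_busy_pct "$sample")
if over_threshold "$busy" 80; then
  echo "WARN: CPU busy ${busy}% exceeds 80%"
else
  echo "OK: CPU busy ${busy}%"
fi
```

In practice you would feed cpu_busy_pct the "CPU usage" line captured from each host instead of the sample string.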

Setting up basic monitoring

Use a cron job on your Ansible control node to capture daily stats. For example, the following cron entry records a capacity snapshot every day at 6:00 AM:

# Open crontab editor
crontab -e

# Add this cron entry inside the editor:
0 6 * * * ansible-playbook -i /path/to/inventory /path/to/list.yml > /var/log/orka-capacity-$(date +\%Y\%m\%d).log 2>&1

Review these capacity logs weekly to spot trends before they become larger issues.

Scaling up: Adding new Mac hosts

When you need to scale up:

Your existing Orka hosts are consistently above 70% CPU utilization
Users are reporting slowness during peak hours
You are planning to add more desktops than your current host capacity supports

Steps to add a new Mac host:

Provision physical Mac hardware with MacStadium
- Contact MacStadium support to add nodes to your private cloud
- Request a host in the same subnet as your existing infrastructure
- Install MacStadium VDI on your new Mac host machine(s)
Add host to Ansible inventory

Edit inventory.ini:

[hosts]
mac-node-1 ansible_host=10.0.100.10
mac-node-2 ansible_host=10.0.100.11
mac-node-3 ansible_host=10.0.100.12
mac-node-4 ansible_host=10.0.100.13  # New host

[all:vars]
ansible_user=admin
ansible_become=yes

Verify connectivity to the new host:

     ansible mac-node-4 -i inventory -m ping

Confirm the Orka Engine version matches existing host(s):

     ansible hosts -i inventory -m shell -a "orka-engine --version"

If the new host has a different Orka version, upgrade any existing hosts or downgrade the new host to match. Version mismatches can cause deployment issues.
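A version comparison like this is easy to script. The sketch below assumes you have collected one version string per host (for example, by saving the output of the command above) and checks that they are all identical. The function name versions_match and the sample version numbers are hypothetical.

```shell
#!/bin/sh
# Sketch: succeed only if every version string on stdin is identical.
# Input: one version per line (blank lines are ignored).
versions_match() {
  distinct=$(grep -v '^[[:space:]]*$' | sort -u | wc -l | tr -d ' ')
  [ "$distinct" -le 1 ]
}

# Hypothetical values standing in for per-host "orka-engine --version" output.
if printf '%s\n' "3.2.0" "3.2.0" "3.2.0" | versions_match; then
  echo "All hosts report the same Orka Engine version"
else
  echo "Version mismatch: align hosts before deploying"
fi
```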

Pull required images to the new host:

     ansible-playbook -i inventory pull_image.yml -e "remote_image_name=registry.example.com/citrix-vda/sonoma-finance:v2.0" --limit mac-node-4

Repeat this for each image your environment uses. This prevents slow first deployments when users need desktops on the new host.

Test VM deployment on the new host:

     ansible-playbook -i inventory deploy.yml -e "vm_name=test-new-host-01" -e "vm_image=registry.example.com/citrix-vda/sonoma-finance:v2.0" --limit mac-node-4

Verify that the test VM boots, that its Citrix VDA registers, and that the desktop is accessible. Once confirmed, delete the test VM:

     ansible-playbook -i inventory delete.yml -e "vm_name=test-new-host-01"

Deploy production VMs

With the new host verified, you can deploy additional desktops. Use your existing Ansible playbooks, which distribute VMs across all available hosts automatically:

     ansible-playbook -i inventory deploy.yml -e "vm_name=citrix-vda-finance-01" -e "vm_image=registry.example.com/citrix-vda/sonoma-finance:latest"

Run this command once for each additional VM, using a unique vm_name each time.

Estimated timeline: Allow 2-4 hours for full host integration and testing.
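Because the deploy command is repeated once per VM, a small loop can generate the names for you. The sketch below prints the deploy commands for a numbered range rather than running them, so you can review the list first. The function name deploy_commands and the two-digit numbering are assumptions based on the example names in this guide.

```shell
#!/bin/sh
# Sketch: print one deploy command per VM for a numbered range.
# Usage: deploy_commands PREFIX FIRST LAST IMAGE
deploy_commands() {
  prefix=$1 first=$2 last=$3 image=$4
  n=$first
  while [ "$n" -le "$last" ]; do
    # Zero-pad to two digits, matching names such as citrix-vda-finance-01.
    vm_name=$(printf '%s-%02d' "$prefix" "$n")
    printf 'ansible-playbook -i inventory deploy.yml -e "vm_name=%s" -e "vm_image=%s"\n' \
      "$vm_name" "$image"
    n=$((n + 1))
  done
}

# Dry run: review the output, then pipe it to "sh" to execute.
deploy_commands citrix-vda-finance 1 3 \
  registry.example.com/citrix-vda/sonoma-finance:latest
```

Printing first keeps the loop safe to experiment with; nothing is deployed until the output is executed.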

Scaling Down: Decommissioning Hosts

When you might scale down:

Your user count has reduced (for example, seasonal workers have been offboarded)
You are consolidating to newer hardware
Cost optimization during low-usage periods

Important note: Decommissioning MacStadium VDI hosts requires migrating or deleting VMs first. MacStadium VDI does not support live VM migration between hosts.

Steps to decommission a host:

Identify VMs on the target host:

     ansible-playbook -i inventory list.yml | grep mac-node-4

Note the names of all VMs running on the host you’re removing.

Choose your migration strategy:

Option A: Delete and redeploy pooled desktops. These can be deleted and recreated on other hosts without impacting users.

     # Delete specific VM
     ansible-playbook -i inventory vm.yml -e "vm_name=citrix-vda-finance-abc123" -e "desired_state=absent"
     
     # Redeploy VM
     ansible-playbook -i inventory deploy.yml -e "vm_name=citrix-vda-finance-new-01" -e "vm_image=registry.example.com/citrix-vda/sonoma-finance:latest"

Option B: Snapshot and recreate VMs for dedicated desktops with user data. If users have local data that must be preserved:

Notify users 48 hours in advance
Have your users back up critical data to their network drives
Take VM snapshots
Delete VMs from the old host and redeploy them on the remaining hosts
Restore your user data from existing backups

Most environments avoid this by enforcing network storage policies where user data is never stored locally on VMs.

Remove the host from your Ansible inventory

Edit inventory.ini and remove the host:

     [hosts]
     mac-node-1 ansible_host=10.0.100.10
     mac-node-2 ansible_host=10.0.100.11
     mac-node-3 ansible_host=10.0.100.12
     # mac-node-4 ansible_host=10.0.100.13  (removed)

Verify VM distribution across hosts

     ansible-playbook -i inventory list.yml

Confirm that your VMs are now running only on the remaining hosts.

Contact MacStadium to decommission hardware

Once a MacStadium VDI host is empty and removed from your inventory, notify MacStadium support to remove the node from your private cloud.

Estimated timeline: 4-8 hours, depending on your VM count and migration complexity.

Image updates and patching

macOS Security Updates

Frequency: Monthly (as Apple releases updates)

Testing requirement: Always test updates on non-production VMs before rolling out to users.

Recommended workflow:

Create a test image

Deploy a test VM from your current golden image:

     ansible-playbook -i inventory deploy.yml -e "vm_name=image-test-01" -e "vm_image=registry.example.com/citrix-vda/sonoma-finance:v2.0"

Access the test VM and install updates

     # SSH into the Orka node
     ssh admin@10.0.100.10
     
     # List VMs and filter for the test VM
     orka-engine vm list | grep image-test
     
     # Open VNC connection to the VM
     open vnc://10.0.101.50

Inside the VM:

Navigate to System Settings → General → Software Update
Install all available updates
Reboot as needed

Verify Citrix VDA still functions:

Check VDA registration: System Preferences → Citrix VDA
Test user login through Citrix Workspace
Test HDX features (clipboard, file transfer, USB)

Capture the updated VM as a new image version by running the create_image.yml playbook:

ansible-playbook -i inventory create_image.yml -e "vm_image=registry.example.com/citrix-vda/sonoma-finance:v2.0" -e "remote_image_name=registry.example.com/citrix-vda/sonoma-finance:v2.1"

Delete the test VM

     ansible-playbook -i inventory delete.yml -e "vm_name=image-test-01"

Pilot update rollout

Deploy the updated image to a small group of users first, for example:

ansible-playbook -i inventory deploy.yml -e "vm_name=citrix-vda-finance-pilot-01" -e "vm_image=registry.example.com/citrix-vda/sonoma-finance:v2.1"

Run this command once for each additional pilot VM, using a unique vm_name each time. Monitor the pilot for 3-5 business days and collect user feedback on new errors, performance changes, and application compatibility issues.

Full production rollout

If the pilot succeeds, you can proceed to update all VMs in production. For pooled desktops, this is straightforward: delete each VM, then redeploy it from the new image version.

# Delete existing VMs from finance group
ansible-playbook -i inventory delete.yml -e "vm_name=citrix-vda-finance-01"

Run this command once for each VM to remove.

# Redeploy with new image version
ansible-playbook -i inventory deploy.yml -e "vm_name=citrix-vda-finance-01" -e "vm_image=registry.example.com/citrix-vda/sonoma-finance:v2.1"

Run this command once for each additional VM, using a unique vm_name each time.

Schedule the rollout for a maintenance window (evenings or weekends are recommended) to avoid user disruption.

For dedicated desktops, users will lose their local data unless it has been backed up. Notify your users in advance (5 business days recommended) and provide them with data backup instructions.

Estimated update timeline: One week (testing) + one day (pilot) + 1-2 hours (full production rollout)
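For a pool with many VMs, the delete-and-redeploy pair can be generated from a list of names. The sketch below reads VM names on standard input and prints the two commands for each one. It only prints, so nothing is deleted until you choose to run the output. The function name rollout_commands is hypothetical.

```shell
#!/bin/sh
# Sketch: print a delete + redeploy command pair for each VM name on stdin.
# Usage: rollout_commands IMAGE < vm-names.txt
rollout_commands() {
  image=$1
  while IFS= read -r vm_name; do
    [ -n "$vm_name" ] || continue   # skip blank lines
    printf 'ansible-playbook -i inventory delete.yml -e "vm_name=%s"\n' "$vm_name"
    printf 'ansible-playbook -i inventory deploy.yml -e "vm_name=%s" -e "vm_image=%s"\n' \
      "$vm_name" "$image"
  done
}

printf '%s\n' citrix-vda-finance-01 citrix-vda-finance-02 |
  rollout_commands registry.example.com/citrix-vda/sonoma-finance:v2.1
```

Passing an earlier tag instead of v2.1 produces the equivalent commands for redeploying from an older image.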

Application Updates

Frequency: Varies by application (update applications quarterly or as needed)

Process: Follow the process described in the macOS Security Updates section, but install the application updates on the test VM before creating the new golden image.

Example: Updating Xcode

Deploy a test VM from the current golden image
Install new Xcode version from Mac App Store or Apple Developer
Test Xcode functionality (build a test project)
Create a new golden image
Pilot the new golden image with your developer team
Roll the new golden image out to production

Note: Keep a CHANGELOG.MD file to show what’s included in each image version. You can use image tags to track this:

registry.example.com/citrix-vda/dev-tools:v1.0 - Xcode 14.3, Sonoma 14.0
registry.example.com/citrix-vda/dev-tools:v1.1 - Xcode 15.0, Sonoma 14.1
registry.example.com/citrix-vda/dev-tools:v1.2 - Xcode 15.2, Sonoma 14.3

Store your CHANGELOG.MD file in your project’s git repository alongside your Ansible playbooks.
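If you follow the vMAJOR.MINOR tag scheme shown in the examples above, the next tag can be derived mechanically, which helps keep the changelog and the registry in step. The sketch below bumps the minor number. next_minor_tag is a hypothetical helper, not part of the playbooks.

```shell
#!/bin/sh
# Sketch: bump the minor part of a tag of the form vMAJOR.MINOR (v1.2 -> v1.3).
next_minor_tag() {
  tag=$1
  major=${tag%%.*}          # everything before the first dot, e.g. "v1"
  minor=${tag#*.}           # everything after the first dot, e.g. "2"
  case $minor in
    ''|*[!0-9]*) echo "unexpected tag format: $tag" >&2; return 1 ;;
  esac
  printf '%s.%d\n' "$major" $((minor + 1))
}

next_minor_tag v1.2
```

Tags that do not match the scheme, such as latest, are rejected rather than guessed at.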

Citrix VDA Updates

Frequency: Quarterly (Citrix releases updates every 3-4 months)

Check for updates: Citrix Cloud Console → Updates & Announcements

Update process:

Download the new VDA installer from Citrix
Deploy a test VM from your current golden image
Install the new VDA version:
- Copy the VDA installer to the VM
- Run the VDA installer (this may require uninstalling the old version first)
- Reboot the VM
Verify VDA registration and HDX functionality works as expected
Create a new golden image with the updated Citrix VDA version
Pilot the new image and roll out to production using the same process described in the macOS Security Updates section

Note: Always test Citrix VDA updates in a non-production environment, as these can occasionally introduce compatibility issues with specific macOS versions or applications.

Rollback plan: Keep the previous golden image version available for 30 days after production rollout. If any issues arise, you can quickly redeploy from the old image with the following commands:

# Delete existing VMs from finance group
ansible-playbook -i inventory delete.yml -e "vm_name=citrix-vda-finance-01"

Run this command once for each VM to remove.

# Redeploy with previous image version (rollback)
ansible-playbook -i inventory deploy.yml -e "vm_name=citrix-vda-finance-01" -e "vm_image=registry.example.com/citrix-vda/sonoma-finance:v2.0"

Run this command once for each additional VM, using a unique vm_name each time.

User Lifecycle

Onboarding New Users

Scenario: A new employee needs access to a macOS desktop.

For pooled desktops:

Add the new user to a Citrix Delivery Group:

Navigate to Citrix Cloud Console → Manage → Delivery Groups → Select group → Edit. Click the “Users” tab → Add users → Search by name or email → Select → Save

Verify capacity

Check whether you have unassigned VMs available by counting the VMs in the group:

ansible-playbook -i inventory list.yml | grep -c citrix-vda-finance

Compare the VM count against the number of users in the Delivery Group. If you need more desktops, deploy them with commands like the following:

# Add two more desktops
ansible-playbook -i inventory deploy.yml -e "vm_name=citrix-vda-finance-11" -e "vm_image=registry.example.com/citrix-vda/sonoma-finance:latest"

ansible-playbook -i inventory deploy.yml -e "vm_name=citrix-vda-finance-12" -e "vm_image=registry.example.com/citrix-vda/sonoma-finance:latest"
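The capacity comparison is simple arithmetic: the shortfall is the number of users minus the number of VMs, floored at zero. A minimal sketch, assuming you size for one desktop per user; vms_needed is an illustrative name.

```shell
#!/bin/sh
# Sketch: how many more VMs to deploy for a given user count.
# Usage: vms_needed USER_COUNT VM_COUNT
vms_needed() {
  shortfall=$(( $1 - $2 ))
  [ "$shortfall" -gt 0 ] || shortfall=0   # never report a negative number
  echo "$shortfall"
}

# 12 users and 10 existing VMs leave a shortfall of two desktops.
vms_needed 12 10
```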

User logs in

The user opens Citrix Workspace, authenticates, and clicks their assigned desktop. Citrix then assigns them an available VM from the pool.

Expected wait time: 5 minutes for admin tasks + 2-3 minutes for the user’s first login.

For dedicated desktops: Follow the same process described for pooled desktops, but ensure you deploy exactly as many VMs as you have users. Each user gets their own VM with persistent data.

Reassigning Desktops

Scenario: A user moves to a different department and needs different applications.

For pooled desktops:

Remove the user from their old Delivery Group
Add user to the new Delivery Group
User logs out and logs back in, and gets assigned a VM from the new pool

No VM changes are needed, as the user automatically gets a different desktop.

For dedicated desktops: If the user needs to keep their data, this requires manual intervention:

Have the user back up their important files to network storage
Delete the user’s old VM
Deploy a new VM from the appropriate golden image
Add the user to their new Delivery Group
The user then restores their backed-up files

Alternatively, if user data doesn’t need to be preserved, proceed to delete the old VM and deploy a new one.

Offboarding and Data Retention

Scenario: An employee leaves the company or no longer needs macOS access.

Process:

Remove user from the Citrix Delivery Group

Citrix Cloud Console → Manage → Delivery Groups → Select group → Edit → Users → Remove user → Save

For dedicated desktops: handle data retention

If the user had a dedicated VM, choose one of the following options.

Option A: Keep VM for 30 days (common policy)

Do nothing immediately. Keep the VM running, but inaccessible. After 30 days, delete it:

ansible-playbook -i inventory vm.yml -e "vm_name=citrix-vda-finance-abc123" -e "desired_state=absent"

Option B: Archive user data before VM deletion

SSH to the Orka host running the VM
Use orka-engine vm backup or host-level snapshots to capture the VM disk (if your environment supports this)
Store the VM backup for the required retention period (check your company’s data retention policies)
Delete the VM

Option C: Immediate deletion (pooled desktops)

For pooled desktops where user data isn’t preserved, no action is needed. The user simply can’t log in anymore, and if they regain access later, their next login assigns them to a different VM.

Reclaim capacity if needed: After offboarding multiple users, you may have excess VMs. If usage is consistently below capacity, remove the surplus:

# List VMs in finance group
ansible-playbook -i inventory list.yml | grep citrix-vda-finance

# Delete 3 VMs from finance group
ansible-playbook -i inventory delete.yml -e "vm_name=citrix-vda-finance-10"

ansible-playbook -i inventory delete.yml -e "vm_name=citrix-vda-finance-11"

ansible-playbook -i inventory delete.yml -e "vm_name=citrix-vda-finance-12"

Backup and Recovery

VM Snapshot Strategies

Important limitation: Orka Engine does not have native VM snapshot functionality built into the Ansible playbooks. Snapshots must be handled at the host storage level.

Available backup approaches:

Approach 1: Golden image versioning (recommended for most environments)

Rather than backing up individual VMs, maintain a version history of your golden images. This works well for pooled desktops where user data isn’t stored on VMs.

How it works:

Keep the last 3-4 versions of each golden image in your container registry
If any issues arise, redeploy VMs from the previous golden image version
User data is stored on network file shares, not on VMs

Implementation:

ansible-playbook -i inventory create_image.yml -e "vm_image=registry.example.com/citrix-vda/sonoma-finance:v2.1" -e "remote_image_name=registry.example.com/citrix-vda/sonoma-finance:v2.2"

In your container registry, configure image retention policies to keep the following number of versions:

Production images: Keep the last four golden image versions (approximately 4-6 months)
Development/test images: Keep the last two golden image versions
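If your registry lacks a built-in retention policy, the same rule can be applied to a list of tags. The sketch below sorts tags in version order and prints the ones that fall outside the newest N. How you list and delete tags depends on your registry, so those steps are left out. It relies on sort -V (version sort), which is available in GNU coreutils and in the sort shipped with recent macOS releases; tags_to_prune is a hypothetical name.

```shell
#!/bin/sh
# Sketch: print the tags that fall outside the newest KEEP versions.
# Usage: tags_to_prune KEEP < tags.txt   (one tag per line)
tags_to_prune() {
  sort -V | awk -v keep="$1" '
    { tag[NR] = $0 }
    END { for (i = 1; i <= NR - keep; i++) print tag[i] }
  '
}

# Keep the last four versions; anything older is a pruning candidate.
printf '%s\n' v2.0 v2.1 v1.9 v2.2 v2.3 v1.8 | tags_to_prune 4
```

Version sort matters here because a plain lexical sort would place v1.10 before v1.9.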

Approach 2: Host-level storage snapshots (for dedicated desktops)

If your users have dedicated VMs with local data that must be preserved, use host-level tools:

SSH to your MacStadium VDI host
Use APFS snapshot capabilities on the host:

# Create a Time Machine local snapshot
tmutil localsnapshot

# Or use custom scripts to snapshot /var/orka/vms/<vm-name>

Restore the desktop by copying the image snapshot back to the VM disk location

Note: This approach is not automated in MacStadium VDI. You’ll need to build custom tooling or manual procedures.

Approach 3: Third-party backup tools

Some environments integrate enterprise backup tools (Veeam, Commvault, etc.) at the host level. Consult your backup vendor’s documentation for macOS virtualization support.

Image Backup Procedures

Back up your golden images regularly:

Method 1: Container registry replication

Configure your container registry to replicate to a secondary registry for disaster recovery.

Primary: registry.example.com
Secondary: backup-registry.example.com (located in a different datacenter)

Most container registries (Docker, GitHub Container Registry, Harbor, JFrog Artifactory) support replication. Consult your registry’s official documentation for more information.

Method 2: Export images to file storage

Manually export images for offline backup:

ansible-playbook -i inventory pull_image.yml -e "remote_image_name=registry.example.com/citrix-vda/sonoma-finance:v2.0"

Disaster Recovery Runbook

Scenario: Complete loss of Orka hosts (datacenter failure)

Prerequisites:

Secondary MacStadium VDI environment located in a different datacenter (requires MacStadium private cloud in multiple locations)
Golden images replicated to a secondary registry accessible from the DR site
Ansible inventory configured with DR hosts

Recovery steps:

Update your Ansible inventory to point to your disaster recovery hosts

     [hosts]
     mac-dr-node-1 ansible_host=10.1.100.10
     mac-dr-node-2 ansible_host=10.1.100.11
     
     [all:vars]
     ansible_user=admin
     ansible_become=yes

Pull images to disaster recovery hosts

     ansible-playbook -i inventory-dr pull_image.yml -e "remote_image_name=backup-registry.example.com/citrix-vda/sonoma-finance:v2.0"
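The DR commands use the same image path as production with only the registry host changed. A small helper can make that substitution so the two sets of image references never drift apart. dr_image_ref is a hypothetical name, and the registry hosts are the example values used in this guide.

```shell
#!/bin/sh
# Sketch: swap the registry host in an image reference.
# Usage: dr_image_ref IMAGE_REF DR_REGISTRY_HOST
dr_image_ref() {
  ref=$1 dr_host=$2
  path=${ref#*/}            # strip everything up to the first "/"
  printf '%s/%s\n' "$dr_host" "$path"
}

dr_image_ref registry.example.com/citrix-vda/sonoma-finance:v2.0 \
  backup-registry.example.com
```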

Deploy VMs in the disaster recovery environment

     ansible-playbook -i inventory-dr deploy.yml -e "vm_name=citrix-vda-finance-01" -e "vm_image=backup-registry.example.com/citrix-vda/sonoma-finance:v2.0"

Run this command once for each additional VM, using a unique vm_name each time.

Update Citrix Cloud configuration

VMs will automatically register with Citrix Cloud if:

The disaster recovery VMs can reach Citrix Cloud (outbound HTTPS)
Citrix VDA configuration includes the correct Cloud Connector details

If Cloud Connectors are also lost, you’ll need to deploy new ones in the disaster recovery environment first. See Citrix documentation for Cloud Connector installation.

Notify users of temporary environment changes

Users may experience:

Different VM IP addresses (if you have IP-based network policies)
Slightly different VM performance characteristics
Needing to reconnect to their desktop through Citrix Workspace

Expected RTO (Recovery Time Objective): 2-4 hours depending on your VM count and image sizes.

Expected RPO (Recovery Point Objective): This depends on your golden image replication frequency.

With real-time registry replication: At most a few minutes of changes are lost.
With daily image backups: Up to 24 hours of configuration changes may be lost.

Cost consideration: Most customers don’t maintain a full disaster recovery environment due to hardware costs. Alternatively, you can accept a longer RTO and work with MacStadium to provision new hosts on demand during disaster recovery.

Data Restoration Workflows

Scenario: User accidentally deletes important files

For pooled desktops:

User data should not be stored on VMs. Redirect users to restore from network file shares, OneDrive, or other corporate backup systems. If user data was incorrectly stored on a pooled VM and is now lost, there is no recovery path. Use this as a learning opportunity to reinforce data storage policies.

For dedicated desktops:

Recovery depends on your backup approach. If you are using host-level snapshots:

SSH to the MacStadium VDI host
Stop the affected VM:

     ansible-playbook -i inventory vm.yml -e "vm_name=citrix-vda-finance-abc123" -e "desired_state=stopped"

Restore VM disk from snapshot (manual process; depends on host storage configuration)

Start the affected VM:

     ansible-playbook -i inventory vm.yml -e "vm_name=citrix-vda-finance-abc123" -e "desired_state=running"

If you are using third-party backup tools, follow your vendor’s restore procedures to restore specific files or a full VM.

Best practice: Train users that local VM storage is not backed up. Enforce policies requiring all important user data to be stored on network drives, cloud storage, or source control systems.
