Table of contents
- Common issues and solutions
- Emergency procedures
- Best practices
- Hardware tips and recommendations
- Contacting support
If you are having trouble understanding the terminology in this document, please refer to the Glossary.
Common issues and solutions
Graphics card issues
Symptoms
- Projector display issues
- Missing or blank displays
- System overheating
- Non-functioning video ports
- Graphics card fan failure
Confirm port functionality
- Verify all video ports are functional.
- Each port on your graphics card drives one projector output. If you notice that one projector is failing to output, try swapping a known working projector to the port that is having the issue. If the issue persists on the new projector, the graphics card may be faulty and need to be replaced.
- When using HDMI, ensure adapters are "Active" rather than "Passive".
- If there are any identifying markers (model number, serial number) on your cable adapters, you can search for them online to determine if they are "passive" or "active".
- For DisplayPort to HDMI conversions, use active adapters only, and confirm that the cables are not "unidirectional".
Replace failed graphics cards
- Replace with recommended models, see here.
Hard drive failures
Symptoms
- System won't boot
- Kernel errors
- Slow performance
- Dataset having trouble loading
- System freezes
Important: Confirm backups of all of your site-custom datasets and playlists exist
- These are located at /shared/sos/site-backup.sos1/. You'll want to confirm that your important data is safe, namely files under /home/sos/sosrc, /home/sosdemo/sosrc/, and /shared/sos/media/site-custom/.
- If your backups are out of date, please make manual backups of your site-custom datasets and playlists.
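As a quick way to confirm the backups described above, the following sketch reports whether the backup folder exists and whether anything in it has changed recently. The folder path comes from this article; the 7-day threshold and the function name `backup_status` are assumptions chosen for illustration, not part of the SOS software.

```shell
#!/bin/sh
# Sketch: report whether a backup folder exists and was updated recently.
# BACKUP_DIR is the path named in this article; MAX_AGE_DAYS is an assumed example value.
BACKUP_DIR="${BACKUP_DIR:-/shared/sos/site-backup.sos1}"
MAX_AGE_DAYS="${MAX_AGE_DAYS:-7}"

# backup_status DIR DAYS -> prints "missing", "stale", or "fresh".
# An empty folder counts as "stale" because it holds no recent files.
backup_status() {
    dir="$1"
    max_days="$2"
    if [ ! -d "$dir" ]; then
        echo "missing"
        return
    fi
    # Look for at least one file modified within the last DAYS days.
    recent="$(find "$dir" -type f -mtime -"$max_days" 2>/dev/null | head -n 1)"
    if [ -n "$recent" ]; then
        echo "fresh"
    else
        echo "stale"
    fi
}

echo "Backup folder $BACKUP_DIR: $(backup_status "$BACKUP_DIR" "$MAX_AGE_DAYS")"
```

A result of "missing" or "stale" suggests making a manual backup before doing any further troubleshooting on the drive.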
Check repair and recovery options
- Check system logs for errors
- If you are familiar with system logs, Ubuntu comes preinstalled with an application called Logs. The Hardware section of this application may show errors related to hard drive operation.
- Run disk health diagnostics
- Open the Disks application on your SOS machine, and navigate to each of your hard drives on the left-hand side. Click the vertical "..." icon in the top right, and choose the "SMART data and self-tests" option. If errors appear here, the drive may have failing sectors and need to be replaced.
- Recovery options
- Consider upgrading to Solid State Drives (SSDs) for better reliability.
- For complete failure:
- Install a fresh Ubuntu OS with the help of the SOS Support Team.
- Reinstall SOS software with the help of the SOS Support Team.
- Restore backed-up datasets and playlists.
- Preventive measures
- Maintain regular backups.
- These should be automatic, but you should regularly confirm that they exist under the folder /shared/sos/site-backup.sos1.
- Monitor disk health.
- See "Run disk health diagnostics" detailed above.
- Consider dual-drive setup (500GB boot drive + 1TB shared drive).
- This is our standard setup configuration. If the main boot drive fails, your custom data and settings on the shared drive are left untouched.
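The SMART check described under "Run disk health diagnostics" can also be run from a terminal. The sketch below assumes the smartmontools package is installed (it provides the `smartctl` command) and summarizes the overall health line that `smartctl -H` prints. The function name `smart_verdict` and the device name `/dev/sda` are placeholders, not values from your system.

```shell
#!/bin/sh
# Sketch: summarize the overall health line from `smartctl -H` output.
# Assumes smartmontools is installed: sudo apt-get install smartmontools

# Reads smartctl output on stdin and prints "ok", "failing", or "unknown".
smart_verdict() {
    report="$(cat)"
    if printf '%s\n' "$report" | grep -Eq 'test result: PASSED|SMART Health Status: OK'; then
        echo "ok"
    elif printf '%s\n' "$report" | grep -Eq 'test result: FAILED'; then
        echo "failing"
    else
        echo "unknown"
    fi
}

# Example (needs root; /dev/sda is a placeholder device name):
#   sudo smartctl -H /dev/sda | smart_verdict
```

A "failing" or "unknown" result is a reason to look at the full report in the Disks application rather than a final diagnosis.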
Boot-related issues
Symptoms
- System stuck at boot screen
- Kernel panic errors during startup
- Black/blank screen after booting
- Server termination errors
Check hardware connections
- Verify RAM seating.
- Disconnecting and reconnecting your RAM chips from the motherboard may help.
- Confirm hard drive cables are connected properly within the SOS computer.
Remove the Nvidia driver
- Boot into recovery mode if possible, or connect through SSH from another machine.
- If able to boot into recovery mode, try uninstalling your Nvidia drivers. The best way to do this is to open a Terminal window and run the command
sudo apt-get purge '*nvidia*' && sudo apt-get autoremove
and then try to boot as normal. The quotes keep the shell from expanding the pattern against file names in the current folder before apt-get sees it.
- SOS requires an Nvidia driver to be installed, so you'll want to reinstall the driver once you've booted up successfully.
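Before purging, it can help to note which Nvidia packages are currently installed so that the same driver version can be reinstalled afterwards. The following is a minimal sketch that filters the output of `dpkg -l`; the function name `installed_nvidia` is hypothetical.

```shell
#!/bin/sh
# Sketch: list installed packages whose names contain "nvidia".

# Reads `dpkg -l` output on stdin and prints the names of packages that are
# fully installed (status "ii") and have "nvidia" in their name.
installed_nvidia() {
    awk '$1 == "ii" && tolower($2) ~ /nvidia/ { print $2 }'
}

# dpkg ships with Ubuntu; the guard only keeps the sketch harmless elsewhere.
if command -v dpkg >/dev/null 2>&1; then
    dpkg -l | installed_nvidia
fi
```

Saving this list alongside your other system notes gives Support a record of the driver that was in place before the removal.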
Power-related issues
Symptoms
- Sudden shutdowns
- System instability
- Failed boot attempts
- Component failures after power events
After power events
- Check all power connections.
- If you're using a power strip for your SOS machine, try to bypass that and plug your machine directly into the wall outlet.
- Verify UPS functionality if present.
- There should be status indicators on the front screen of this device. If any are red or blinking, you may have a UPS failure.
- Test system stability after restart by running the SOS software for some time.
Preventive measures
- Install UPS systems.
- These are essentially large backup batteries that keep your infrastructure running in the event of a power loss at your location.
- Document proper shutdown procedures.
- Disconnecting the power cable while the machine is running could lead to software issues that prevent the system from booting. Only do this if it is absolutely necessary and the machine will not shut down using the standard Shut Down operation.
Performance issues
Symptoms
- Slow dataset loading
- System lag
- Long boot times
- Delayed response to controls
Performance optimization
- Verify disk space availability on your OS hard drive.
- You can see your available Hard Drive space using the Disks application on your SOS computer.
- Check system resources usage.
- Open the application System Monitor on your SOS machine and verify that no resources are exceeding 80%.
- Monitor temperature levels in the System Monitor application or by running the
nvidia-settings -c :2
command in a terminal.
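The same checks can be sketched from a terminal. The example below prints how full the root filesystem is and, if the `nvidia-smi` utility that ships with the Nvidia driver is present, the GPU temperatures. Applying the 80% guideline to disk space is an assumption made here for illustration, and `disk_use_percent` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: report disk usage and GPU temperature from a terminal.
# LIMIT reuses the 80% figure above; applying it to disk space is an assumption.
LIMIT="${LIMIT:-80}"

# Prints the used percentage (integer, no % sign) of the filesystem holding $1.
disk_use_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

used="$(disk_use_percent /)"
if [ "$used" -ge "$LIMIT" ]; then
    echo "Warning: / is ${used}% full"
else
    echo "/ is ${used}% full"
fi

# GPU name and temperature in degrees C, if the Nvidia utilities are installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,temperature.gpu --format=csv,noheader \
        || echo "nvidia-smi could not query the GPU"
fi
```

To check the shared drive as well, call `disk_use_percent` with a path on that drive, for example `/shared`.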
System maintenance
- Regular system updates.
- These can be done through the Software Updater application, built into Ubuntu. These are all safe to do on a regular basis. Please refrain from updating to a new version of the Ubuntu operating system (e.g., Ubuntu 24.04) without first contacting SOS Support, as this will most likely prevent your SOS application from running.
- Remove unnecessary applications.
- Schedule regular reboots.
- We do recommend keeping your SOS machine on most of the time, but after any updates it is a good idea to reboot the system so that they take effect.
Replace failing components
- Most commonly, performance issues are a symptom of failing hardware.
- Replace failing hard drives, including both the boot drive and "SHARED" drive.
- Replace failing graphics cards.
- Removing the graphics cards from your system will help determine if they are the cause. Sometimes failing graphics cards will slow the system to a crawl.
Emergency procedures
- System failure
- Switch to backup system if available.
- Most sites have an identical Hot Swap computer, SOS2. In the event of a system failure, please swap all connections to SOS2 and confirm it is working as expected. If the backup system crashes as well, the issue may lie in your network configuration.
- Document error messages.
- Take screenshots of errors to provide to Support.
- Contact support with detailed information.
- Data recovery
- Maintain offline backups.
- Keeping a copy of your /shared/sos/site-backup.sos1/ folder somewhere other than the SOS computer itself will help to recover data in case of a hard drive failure.
- Keep spare hardware available.
- The most important items that are at risk of failure are the graphics cards and the hard drives.
- Test recovery processes regularly and confirm your backups are current.
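One way to keep an offline copy of the backup folder is to pack it into a dated archive on an external drive or network share. The sketch below uses `tar` for this; the function name `archive_folder` and the destination path in the example are hypothetical and should be replaced with a location that exists at your site.

```shell
#!/bin/sh
# Sketch: pack a folder into a dated .tar.gz archive in another location.

# archive_folder SRC DEST_DIR -> prints the path of the archive it created.
archive_folder() {
    src="$1"
    dest_dir="$2"
    stamp="$(date +%Y%m%d)"
    archive="$dest_dir/$(basename "$src")-$stamp.tar.gz"
    mkdir -p "$dest_dir" || return 1
    # -C makes paths inside the archive relative to the parent folder of SRC.
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    echo "$archive"
}

# Example (the destination is a placeholder for an external drive):
#   archive_folder /shared/sos/site-backup.sos1 /media/external/sos-offline
```

Listing the archive with `tar -tzf` afterwards is a simple way to confirm that it can be read back, which fits the advice above to test recovery processes regularly.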
Best practices
Backup system management
- Maintain functioning backup system.
- This is your Hot Swap computer, usually named SOS2.
- Regular testing of backup hardware.
- Confirming that your backup SOS2 computer is functional on a regular basis will help in the event of a system failure, as connections can quickly be swapped to migrate to the new system.
- Keep backup system updated with security patches and SOS software updates.
Documentation
- Keep hardware specifications of your SOS computer recorded.
- Graphics card models
- Dell computer model (ex. Precision Tower 5860)
- Document all system changes.
- Software updates, Hardware replacements.
- Maintain software update history.
- Record custom configurations.
- These can include custom Kiosk configurations located at /shared/sos/kiosk/.
Hardware tips and recommendations
Graphics cards
- Please see our current recommendations here:
Storage
- Consider SSD upgrades.
- Maintain separate boot/data drives, as is standard in SOS installations.
- Practice regular backup solutions outside of the automatic backup.
- Monitor drive health using the Disks tool.
Contacting support
If issues persist after trying these solutions, contact SOS support at sos.support@noaa.gov with the following information:
- SOS version number
- iPad app version number
- iOS version
- Detailed description of the issue
- Screenshots if applicable
- Steps already attempted to resolve the issue
For urgent hardware issues:
- Document specific error messages
- Provide system specifications
- Include recent system changes
- Prepare TeamViewer access capability (if available)
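To make the system specifications easier to provide, the following sketch gathers a few basic details into a text file that can be attached to a support email. It relies only on standard Ubuntu utilities and skips any that are not installed. The function name `collect_sysinfo` and the output file name are arbitrary examples, and the SOS version number still needs to be added by hand.

```shell
#!/bin/sh
# Sketch: gather basic system details into a text file for a support request.

# collect_sysinfo OUTFILE -> writes kernel, OS, and graphics details to OUTFILE.
collect_sysinfo() {
    out="$1"
    {
        echo "Collected: $(date)"
        echo "Kernel: $(uname -r)"
        if command -v lsb_release >/dev/null 2>&1; then
            echo "OS: $(lsb_release -ds)"
        fi
        if command -v lspci >/dev/null 2>&1; then
            echo "Graphics devices:"
            lspci | grep -Ei 'vga|3d|display' || echo "  (none reported)"
        fi
        if command -v nvidia-smi >/dev/null 2>&1; then
            echo "Nvidia driver and GPUs:"
            nvidia-smi --query-gpu=name,driver_version --format=csv,noheader \
                || echo "  (nvidia-smi could not query the GPU)"
        fi
    } > "$out"
}

# Example:
#   collect_sysinfo sos-sysinfo.txt
```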