When you are running an on-premises IT environment, you most likely have some kind of monitoring solution. If something goes wrong in the infrastructure, you are notified almost immediately and you can take appropriate action.
Things are different when using cloud services. Your services are running in a datacenter somewhere else, controlled by another organization, and you reach them over an internet connection that can sometimes be unreliable.
For a customer I checked out Exoprise, a SaaS (Software as a Service) based cloud monitoring solution, with surprising results.
Exoprise private site
Exoprise is a cloud monitoring solution that is offered as SaaS. This means Exoprise is running in a datacenter somewhere, and you have a subscription for using the services. Because it is SaaS, installing and configuring the monitoring solution is just a matter of minutes.
Exoprise itself runs in a datacenter somewhere, but to use it you configure your own private site. A private site is a Win32 service running on a Windows server in your own environment, and the cloud services are monitored from there. Because it runs in your environment, the private site experiences the cloud services the same way your local users do.
You can install and configure multiple sensors in your private site. Each service has its own sensor. There are sensors for Exchange Online, Microsoft Teams (Messages and A/V), Skype for Business, Free/Busy, ADFS, OneDrive, SSLCheck, Amazon, Google, Salesforce and more. Tons of sensors are available.
When configured, a sensor performs synthetic transactions against the cloud service. For example, the Exchange Online sensor looks at the average logon time, message transfer speed and network latency. The results are shown graphically for your own sensor, and because it is a SaaS solution, Exoprise can also compare your results against the results of the rest of the world, which they call the ‘crowd’. This is shown in the following screenshot: the lower line is my own sensor, the upper (thinner) line is the crowd average.
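To make the idea of a synthetic transaction a bit more concrete, here is a minimal sketch in Python. It is not Exoprise code and says nothing about how their sensors are built. The function names, the `fake_logon` stand-in and the crowd average of 250 ms are all assumptions for illustration. It only shows the principle: time a scripted action, then compare your own average against a crowd average.

```python
import statistics
import time
from typing import Callable, List


def run_synthetic_transaction(action: Callable[[], None]) -> float:
    """Run one scripted action and return its duration in milliseconds."""
    start = time.perf_counter()
    action()
    return (time.perf_counter() - start) * 1000.0


def compare_to_crowd(own_samples_ms: List[float], crowd_average_ms: float) -> float:
    """Return own mean minus crowd average; negative means faster than the crowd."""
    return statistics.mean(own_samples_ms) - crowd_average_ms


def fake_logon() -> None:
    # Hypothetical stand-in for a real logon against a cloud service.
    time.sleep(0.01)


if __name__ == "__main__":
    samples = [run_synthetic_transaction(fake_logon) for _ in range(5)]
    # 250.0 ms is an invented crowd average, used only for this example.
    print(f"own mean: {statistics.mean(samples):.1f} ms")
    print(f"delta vs crowd: {compare_to_crowd(samples, 250.0):+.1f} ms")
```

A negative delta corresponds to the situation in the graph, where my own line sits below the crowd line.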
With everyone working from home due to the Covid-19 crisis, a lot of organizations have been implementing Microsoft Teams rapidly. As an admin you want to know how Teams is performing from your local environment. When you configure a Teams AV sensor, it performs all kinds of synthetic transactions, very similar to the actions a regular user performs. The sensor uses a test identity in your Teams environment and communicates with a bot in Exoprise. This way it can measure the call quality of Teams: logon time, A/V stream metrics (audio jitter, packet loss, bitrate), frames per second, etc.
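For readers who wonder what numbers like audio jitter and packet loss actually represent, the sketch below shows one common way to compute them. The jitter function follows the running interarrival jitter estimate from RFC 3550 (the RTP specification). Whether the Teams AV sensor calculates it this way is an assumption on my part, and the function names and input format are hypothetical.

```python
from typing import List, Tuple


def interarrival_jitter(packets: List[Tuple[float, float]]) -> float:
    """Running jitter estimate in the style of RFC 3550.

    packets holds (send_time_ms, receive_time_ms) pairs in arrival order.
    """
    jitter = 0.0
    for (s_prev, r_prev), (s_cur, r_cur) in zip(packets, packets[1:]):
        # Difference in transit time between two consecutive packets.
        d = (r_cur - r_prev) - (s_cur - s_prev)
        # Smooth with gain 1/16, as RFC 3550 does.
        jitter += (abs(d) - jitter) / 16.0
    return jitter


def packet_loss_percent(received_seq: List[int]) -> float:
    """Percentage of sequence numbers missing between the lowest and highest seen.

    Simplification: sequence number wrap-around is ignored.
    """
    if not received_seq:
        return 0.0
    expected = max(received_seq) - min(received_seq) + 1
    lost = expected - len(set(received_seq))
    return 100.0 * lost / expected


if __name__ == "__main__":
    # Packets sent every 20 ms; the second one arrives 16 ms later than expected.
    print(interarrival_jitter([(0.0, 50.0), (20.0, 86.0)]))
    # Sequence number 3 never arrived.
    print(packet_loss_percent([1, 2, 4, 5]))
```

With a perfectly steady stream the jitter stays at zero. It only grows when packets arrive with uneven spacing, which is exactly what you hear as choppy audio.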
When configuring a sensor, an alarm is automatically created. This alarm can be configured during creation:
When a transaction triggers the alarm, a message is sent to the configured email address, which gives you an immediate indication that something is wrong with the service. To be honest, I was a bit surprised by how often alarms are generated, and thus how often you receive an email. The email shows which sensor generated the alarm, some analysis information and the alarm details, as shown in the following screenshot.
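As an illustration of what an alarm rule boils down to, here is a small sketch. It does not reflect the actual Exoprise alarm settings. The `AlarmRule` fields (a threshold in milliseconds and a number of consecutive bad samples) are assumptions I made up to show the general idea, and to show why a stricter rule produces fewer emails.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class AlarmRule:
    # Hypothetical settings; not the real Exoprise alarm options.
    threshold_ms: float
    consecutive: int


def alarm_indices(samples_ms: Iterable[float], rule: AlarmRule) -> List[int]:
    """Return the sample indices at which an alarm would fire.

    An alarm fires once per streak, at the moment the streak of samples
    above the threshold reaches the required length.
    """
    fired: List[int] = []
    streak = 0
    for i, value in enumerate(samples_ms):
        streak = streak + 1 if value > rule.threshold_ms else 0
        if streak == rule.consecutive:
            fired.append(i)
    return fired


if __name__ == "__main__":
    samples = [100.0, 900.0, 950.0, 980.0, 100.0, 990.0]
    print(alarm_indices(samples, AlarmRule(threshold_ms=500.0, consecutive=1)))
    print(alarm_indices(samples, AlarmRule(threshold_ms=500.0, consecutive=2)))
```

Requiring two bad samples in a row ignores the single spike at the end of the list, so the same data produces one alarm instead of two.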
Besides the alarm messages you’ll also get a weekly overview with real user performance data gathered from ‘the crowd’, averaged over all Exoprise sensors that are deployed. This shows overall trends for all cloud services, for example:
There are multiple cloud monitoring solutions available, and I had the opportunity to have a look at the Exoprise solution. I was surprised by the ease of configuration and ease of use. It is a cloud-based service: pull out your credit card and you’ll be up and running within 20 minutes. It is a great tool for monitoring your environment, but at the same time it is a great tool for troubleshooting purposes (when you are a consultant).
I was also surprised by the data that was gathered by the Exoprise sensors. It shows immediately when something is wrong, and you are notified before your users start calling the service desk to report that something is not working. And that happened more often than I had expected.