Using data analytics to anticipate potential workplace
A team from Aureolis has demonstrated how SAS® Viya® visual text analytics can take unstructured data and make it quantifiable to identify causes of sick leave and absence from work.
More and more working-age people are absent from work. It is often unclear how many are suffering from the same illness, which makes it difficult to identify underlying causes that may stem from the work environment. When someone goes on sick leave, their employer collects health documents and personal statements, information that is sensitive and specific to each individual.
These documents are considered unstructured text data: they contain natural language and are often difficult to evaluate or compare.
“We decided to find out if unstructured data could be translated into useful statistics for employers.”
Analyst, Hackathon Team Aureolis
For the 2019 SAS® Viya® Hackathon, Aureolis made a test model for extracting data from unstructured documents and turning it into visual data.
The process is broken down into four steps: cleaning and preparing the data for text analytics, preliminary analysis, reiteration, and finally, visualizing the results.
From unstructured to quantifiable data
Placing the model in the SAS® Viya® environment allows the data to be generated quickly. The process begins with raw data, which is cleaned. Keywords are then pulled from the text so that the data can be analyzed. The preliminary step involves running a quick analysis of the basic text and beginning to identify categorized topics, for example ‘stress’.
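The cleaning and keyword steps can be illustrated with a minimal Python sketch. It is not the team's SAS® Viya® model: the stop-word list, the function names and the sample statements are all invented for illustration, and plain Python stands in for the text analytics tooling.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real project would use a much fuller one.
STOP_WORDS = {
    "a", "an", "and", "at", "by", "due", "from", "i", "in",
    "is", "my", "of", "on", "the", "to", "was", "with",
}


def clean(text: str) -> list[str]:
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def keyword_counts(documents: list[str]) -> Counter:
    """Count in how many documents each keyword appears (once per document)."""
    counts: Counter = Counter()
    for doc in documents:
        counts.update(set(clean(doc)))
    return counts


# Invented sample statements, not real health data.
statements = [
    "Absent due to stress at work.",
    "Back pain from lifting; also stress.",
    "Burnout and stress in the team.",
]

counts = keyword_counts(statements)
print(counts["stress"])  # 3: the keyword occurs in all three statements
```

Counting each keyword once per document, rather than once per occurrence, keeps the numbers comparable to "how many people mention this", which is the quantity the later steps build on.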
Next, the analysis is run again in the reiteration stage, this time grabbing synonyms that could be grouped together and possibly show similar or identical illnesses that have been described with different words. Often in the first analysis, terms are combined incorrectly, and others are dropped accidentally. This model is a continuous process that requires multiple iterations to achieve optimal results.
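One way to picture the reiteration stage is as a synonym map that is reviewed and extended between passes. The sketch below is only an assumption about how such grouping could work; the map contents, the function names and the example terms are hypothetical and do not come from the Aureolis model.

```python
# Hypothetical synonym map: variant term -> canonical topic.
SYNONYMS = {
    "burnout": "stress",
    "exhaustion": "stress",
    "overworked": "stress",
    "backache": "back pain",
    "lumbago": "back pain",
}


def normalize(terms: list[str], synonyms: dict[str, str]) -> list[str]:
    """Map each term to its canonical topic; unknown terms stay unchanged."""
    return [synonyms.get(term, term) for term in terms]


def topics_for_document(terms: list[str], synonyms: dict[str, str]) -> set[str]:
    """Return the distinct topics mentioned in one document."""
    return set(normalize(terms, synonyms))


# First pass: "fatigue" is not yet known, so it remains a separate topic.
terms = ["burnout", "fatigue", "lumbago"]
first_pass = topics_for_document(terms, SYNONYMS)

# After reviewing the results, the map is extended and the analysis is rerun.
revised = {**SYNONYMS, "fatigue": "stress"}
second_pass = topics_for_document(terms, revised)

print(sorted(first_pass))
print(sorted(second_pass))
```

Keeping the map as plain data means each iteration only changes the mapping, not the analysis code, which fits the repeated review-and-rerun cycle described above.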
The final step is the visualization of the data. The topics are plotted into a diagram that links them to the number of employees affected by the illness. That way companies can quickly scan the information and identify what some of the most common workplace illnesses are.
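The link between topics and the number of affected employees can be sketched as a simple count followed by a text bar chart. This is an illustrative stand-in for the SAS® Viya® visualization, under the assumption that each employee has already been assigned a set of topics; the employee IDs, topics and function names are invented.

```python
from collections import Counter


def employees_per_topic(topics_by_employee: dict[str, set[str]]) -> Counter:
    """Count how many employees are associated with each topic."""
    counts: Counter = Counter()
    for topics in topics_by_employee.values():
        counts.update(topics)
    return counts


def text_bar_chart(counts: Counter) -> str:
    """Render the counts as a text bar chart, most common topic first."""
    if not counts:
        return ""
    width = max(len(topic) for topic in counts)
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return "\n".join(
        f"{topic.ljust(width)} | {'#' * n} {n}" for topic, n in ordered
    )


# Invented, anonymized sample data.
topics_by_employee = {
    "emp-001": {"stress"},
    "emp-002": {"stress", "back pain"},
    "emp-003": {"back pain"},
    "emp-004": {"stress"},
}

chart_counts = employees_per_topic(topics_by_employee)
print(text_bar_chart(chart_counts))
```

Because each employee contributes a set of topics, a person who mentions the same illness several times is still counted once for that topic.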
“With this data, companies will be one step ahead and will have the tools to identify and possibly prevent potential risks within the workplace that may contribute to illnesses.”
Analyst, Hackathon Team Aureolis
Challenge
- It is difficult to compare unstructured data based on natural language text.
- Employers have the raw data but are not sure how to extract information from it and determine trends in illnesses among working-age people.
Solution
- Analyze unstructured text data with SAS® Viya® visual text analytics.
- Make a visual representation of the data to see similarities.
Benefit
- Prevent some of the potential risks within the workplace that could cause illnesses.
- Decrease the amount of sick leave.
About Hack In SAS Viya
In association with Intel and the SAS Nordic User Group “Fans”, SAS hosted the competition “Smarter Together” for SAS partners. Here, teams of analysts and data scientists demonstrated the value of open data combined with SAS® Viya® in the cloud and open source technologies.
SAS provided the SAS® Viya® platform with a number of SAS tools in the cloud for easy access through saasnow.com. The nine participating teams built showcases based on a number of SAS software products, including Visual Analytics, Visual Statistics, Visual Data Mining & Machine Learning, Econometrics, and Optimization.