Data Requirements and Mapping
The simplest way to start a Process Analysis project is to upload prepared CSV files manually. A manual import helps you get familiar with Timeline and the features it offers, create a mapping for future automatic uploads, or verify the uploaded information. The guidelines below explain how to prepare a file with the proper structure and how to map its fields.
Note. In addition to the manual file upload, you can import data to the project using different data sources or upload processed data from Repository.
For details, see:
ETL in the Cloud
Where to get data for upload
You can obtain data about processes and events from a variety of sources:
- Database systems (e.g., patient data in a hospital)
- Comma-separated files (CSV) or spreadsheets
- Transaction logs (e.g., a trading system)
- Business suite/ERP systems (SAP, Oracle, etc.)
- Message logs (e.g., from IBM middleware)
Getting data ready for upload
The uploaded data must meet certain criteria. The points below cover the essentials of creating a correct file for upload.
1. Data structure
A file uploaded to Timeline must have a specific structure. The data file consists of rows, each representing a record that something happened to a specific object at a particular time. The first line defines the field names; the remaining rows contain records whose values follow the order of those field names. Such information can be extracted from IT systems, databases, or logs.
The file must have three mandatory columns (Timeline ID, Timestamp, and Event name) and can include any number of optional columns. Columns can have arbitrary names, as no naming rules are imposed.
Data structure sample
| N008 | 1/16/2017 7:20:15 | Ticket Registered | John Smith | Charlotte | Insurance claims |
| N123 | 4/10/2017 10:33:58 | Ticket Closed | Anna Brown | Boston | Refund |
For example, a single record in the data file can show that support ticket N008 was registered on 1/16/2017 7:20:15. The record may also show that John Smith created this ticket and marked it with an "Insurance claims" comment. The other record can indicate that support ticket N123 was closed on 4/10/2017 10:33:58 by Anna Brown in the Boston office.
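A file with this structure can be produced with any tool that writes plain text. As a minimal sketch, the following Python snippet writes the two sample records above to a CSV file; the column names and the file name events.csv are illustrative, since Timeline imposes no naming rules.

```python
import csv

# Each row is one record: something happened to a specific object
# at a particular time. Column names are arbitrary; "events.csv"
# is an illustrative file name, not one required by Timeline.
header = ["Ticket ID", "Timestamp", "Event", "Employee", "Office", "Comment"]
rows = [
    ["N008", "1/16/2017 7:20:15", "Ticket Registered", "John Smith", "Charlotte", "Insurance claims"],
    ["N123", "4/10/2017 10:33:58", "Ticket Closed", "Anna Brown", "Boston", "Refund"],
]

with open("events.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(header)   # first line defines the field names
    writer.writerows(rows)    # remaining lines contain the records
```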
2. Mandatory and optional fields within the data structure
So that the program can extract the events that belong to one process and recreate its history as a timeline, the uploaded data must contain three mandatory fields. These columns can be named arbitrarily in the uploaded file.
- Timeline ID
A column with some identifiers of objects you want to track over time. This could be an Order ID, Claim ID, Patient Encounter Number, Support Ticket Number, and so on.
- Event name
A column describing what happened to the object at a time – Order Submitted, Patient Departed, Adjuster Assigned, Ticket Escalated, etc.
- Timestamp
A column with timestamps showing when something happened in the life of the object. This column generally contains a date and time. If a date with no time is provided, midnight (12:00 AM, 00:00:00) is used.
Important. Make sure you save timestamps in the file in one of the following formats:
- 1/16/2017 7:20:15
- 1/16/2017 7:20:15 AM
- 2017-01-26 7:20:15
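To check timestamps before upload, the formats listed above can be expressed as datetime format strings. The helper below is a sketch, not part of Timeline; the date-only fallback illustrates that a missing time defaults to midnight.

```python
from datetime import datetime

# The three supported formats from the list above, plus a
# date-only fallback to illustrate the midnight default.
FORMATS = [
    "%m/%d/%Y %H:%M:%S",     # 1/16/2017 7:20:15
    "%m/%d/%Y %I:%M:%S %p",  # 1/16/2017 7:20:15 AM
    "%Y-%m-%d %H:%M:%S",     # 2017-01-26 7:20:15
    "%m/%d/%Y",              # date only -> midnight (00:00:00)
]

def parse_timestamp(value: str) -> datetime:
    """Try each supported format in turn (hypothetical helper)."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"Unrecognized timestamp: {value!r}")
```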
Other columns are optional and become attributes for timelines. For example, you can map a column with the office location to see which processes occur mostly in that location.
3. File format
The data should be placed in a CSV file, which stores tabular data in plain text. No two columns in the file may have the same name. The file should use the English (United States) locale and be encoded in US-ASCII or UTF-8.
File sample - CSV
TimelineID;Timestamp;Event name;Employee;Location
A;1/16/2017 7:20:15;Student Applied;John;Boston
A;03.10.2017 16:54;Student Accepted;Mary;Boston
A;04.11.2017 15:04;Bill Generated;Ann;Charlotte
B;02.01.2017 9:15;Student Applied;John;Boston
B;03.02.2017 16:20;Student Accepted;Mary;Boston
Important. If the values in any field include commas, the format of the file may break. In this case, enclose such values in double quotes. Microsoft Excel does this automatically; however, some tools, such as the MS SQL Export Wizard, require manual configuration.
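When generating CSV files programmatically, a CSV library handles this quoting for you. The sketch below uses Python's csv module, whose default minimal quoting wraps only the fields that contain the delimiter:

```python
import csv
import io

# csv.writer with the default QUOTE_MINIMAL dialect quotes a field
# only when it contains the delimiter (here, a comma) -- the same
# behavior Excel applies automatically when saving.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["N008", "1/16/2017 7:20:15", "Ticket Registered", "Insurance, claims"])

# The last field contains a comma, so it is emitted as "Insurance, claims"
# (in double quotes); the other fields are written unquoted.
print(buf.getvalue())
```

Tools that build CSV lines by plain string concatenation need the same quoting applied by hand.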
Once you have uploaded your file, you need to map the data. This is the process of associating the data received from IT systems with the attributes displayed in Timeline. For example, map the mandatory Timestamp column so that the program pulls the process timestamps from that column. Or map a column with employees' names using a New attribute label to be able to break timelines down by this dimension and see who is in charge of specific processes.
Note. Timeline imports only mapped table columns as event attributes. Fields without mapping labels are not uploaded to the project.
The descriptions below contain information about the mapping process, and mandatory and optional fields.
1. How to map fields
In the Map columns step, you create a mapping for the uploaded data. Drag and drop labels to the respective columns. You should map at least 3 mandatory columns: Timeline ID, Timestamp, and Event name. For example, drag and drop a Timeline ID label to mark the column that contains such data.
When you upload a file to a non-empty project, the program may show a previously created mapping if it matches the uploaded data. You can reassign the mapping if you want to change it.
After you map the required fields, click Label all as attributes to automatically create attributes from all other columns.
2. How to define mandatory fields
The program uses information from the uploaded file to create timelines for further analysis. Map the following columns with the corresponding labels so that the program can extract timelines from the uploaded data:
- Timeline ID
Map this label to the column that contains the IDs of the monitored objects. An object may correspond to the ID of an order, claim, patient form, support ticket, etc. The ID will link all the events associated with the given object.
- Timestamp
Drag and drop this label to the column containing the time of the events occurring throughout the lifetime of the object.
Important. Make sure that the data file uses the correct date and time format. See Mandatory and optional fields within the data structure section for details.
- Event name
Map this label to the column containing the events associated with the object at any given time. Examples of such events are Order received, Customer called, Survey commissioned, etc.
3. How to define and use optional fields
In addition to the mandatory columns, the file can contain any number of additional fields to be used as dimensional attributes. You can filter by these fields, group and break down by them, or use them as additional information when analyzing processes.
During mapping, the program offers two pre-configured labels that can be mapped optionally:
- Event category
This is an auxiliary field that you can use if your data file contains events of many different types. It is recommended for files containing more than 150 different event types. The Event category field groups related events of the same type into a separate subset.
- Event number
This is an auxiliary field that you may want to use if your data file contains events with identical date and/or time stamps. To order such events, map this label to a column that specifies their order. This order is used when multiple timeline events have identical date and/or time stamps. The data in this column numbers the events 1, 2, 3, and so on.
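If the source system does not provide such a column, one can be derived before upload. The sketch below is a hypothetical preprocessing step, not a Timeline feature: it assigns a running number per (timeline, timestamp) pair, so events that share an identical date/time stamp get an explicit order.

```python
from collections import defaultdict
from itertools import count

# Sample rows: (timeline ID, timestamp, event name). The first two
# events share a timestamp, so they need an explicit order.
rows = [
    ("A", "1/16/2017 7:20:15", "Student Applied"),
    ("A", "1/16/2017 7:20:15", "Form Validated"),
    ("A", "1/17/2017 9:00:00", "Student Accepted"),
]

# One counter per (timeline ID, timestamp) pair, starting at 1.
counters = defaultdict(lambda: count(1))
numbered = [
    (tid, ts, event, next(counters[(tid, ts)]))
    for tid, ts, event in rows
]
# The two same-timestamp events receive numbers 1 and 2; the later
# event starts a fresh counter and receives 1.
```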
Other optional columns
Besides the optional fields above, you can map any other columns to fields that will be used as attributes. You can then use these fields as filters, group or ungroup elements by them, or use them as a source of additional information for process analysis.
Drag and drop a New attribute label onto any of the columns. The label takes the name of the column in the original file; click the pencil icon to rename the attribute. The attribute is uploaded to the project for all events that have it. After setting up an optional attribute and uploading the data, you can, for example, filter timelines by attribute values and configure analysis modules based on them, such as Interval measurements.
Process view page
After the data upload, the program generates timelines from the processed data and opens the Process view page. By default, the program shows the Primary path view, a graph that displays the most common flow of events in the timelines. You can switch to the Milestone view to explore the current timeline set. On the right, the Default board shows overview statistics for the selected view module, such as Count of timelines and Time range.
For more details on the new analysis tool, see Getting Started with Process View.