Recently, we were working on representation of Timeseries data. What we had in mind was to give the user an ability to choose the start and end date and represent the data related to farmers’ sowing. We simply hooked up a Graphing library from Google (Google Charts) and got the data represented.
This works perfectly fine when we are looking for data over a week or perhaps a month. But what about when we want to access data for a few month?
Look below, the diagram makes no sense to a user who is supposed to analyze and generate conclusions according to the data.
Understanding the problem
We realized that the requirement is to represent data in a smart way based on the user’s requirement. We decidedd to generate an aggregate date chart ie a chart with x-axis containing range of dates instead of a single date. We can aggregate the data automagically based on the date range set by the user.
This is something I hoped libraries would come built-in with, but couldn’t find any major open source libraries which do that, after trying Google Charts/Charts.js. Some premium libraries have these features included.
This is an easy to do yet very useful addition to any of these charting libraries. I will try to implement this in a library independent mechanism here using some very basic reproducible principles.
But, isn’t it better to pre-process the data and send it from server?
It is, in big data graphical implementations, such as monitoring data, where this calculation can also be done on the server side if there are millions of records, and that is something Time series optimized databases excel at. In this scenario, we are looking for a quick way to get us to convert a set of datapoints into a visual representation.
Solution
- Find the best possible representation of the data, is it months? Is it days? We create a hashing function to convert each date into this representation. An example is the start of week / start of month.
- We convert every date into the hashes as required. Why are the hashes still dates? We want to use the Date support present in the charting libraries and not implement the library ourselves.
- If you are looking at the above hash(start of week / start of month), Moment.js makes it extremely easy using the helpful
startOf()
functions. If you are looking to hash differently (eg: mid of week / mid of month), you will need some extra computation such as conversion to timestamp and then averaging. - Finally plot the data on a time axis to get an aggregate representation of your data.
In the implementation details below, I have added a plug and play solution to convert data into a representation which can be easily plotted to get an Aggregate date chart!
Implementation details
These libaries already support date as an axis, so the only next step is to aggregate the data over periods. Our data is simple to understand, a set of points and their dates:
eg:
[{'date': 'Jun 12'}, {'date': 'Jun 18'}, {'date': 'Jun 19'} ... {'date': 'Aug 12'}]
The final data that we need for our charting library to work with is a mapping between a date and a y-axis value:
{ 'Jun 12': 1, 'Jun 18': 1, 'Jun 19': 1, 'Aug 12': 1 }
How to represent the same data when we have data of 100 days? In that case, we simply show data in weekly buckets. Since the charting library would still expect a date as input, we would use the start of the week as the x-axis bucket. The hash function we are using is start of the week.
{ 'Jun 11': 1, // Refers to Jun 11 - Jun 17 'Jun 18': 2 // Refers to Jun 18 - Jun 24 'Aug 06': 1 // Refers to Aug 06 - Aug 12 }
Similarly, when the data goes higher to say 365 days, it makes sense to show buckets of months and so on. Here the date input to the charting library is start of month.
{ 'Jun 1': 3, // Refers to the month of Jun 'Aug 1': 1 // Refers to the month of Aug }
Code
if (daysRange < 30) { split_type = 'day'; } else if (daysRange < 120) { split_type = 'week'; } else if (daysRange < 600) { split_type = 'month'; } else { split_type = 'year'; } // Hash functon depends on split type used. const hash_func = date => moment(date) .startOf(split_type) .toDate() .toDateString();
Optionally, we also write function to calculate the information shown in the tooltip
const tooltip_func = (date, value) => { if (split_type === 'day') { return value + ' on ' + moment(date).format('DD MMMM'); // 1 on 3 Jun } else if (split_type === 'week') { return value + ' in ' + ( moment(date).format('DD MMMM') + ' - ' + moment(date) .add(7, 'days') .format('DD MMMM YYYY') // 10 in 3 Jun - 10 Jun 2018 ); } else if (split_type === 'month') { return value + ' in ' + moment(date).format('MMMM YYYY'); // 40 in Jun 2018 } return value + ' in ' + moment(date).format('YYYY'); // 150 in 2018 };
Here’s what we get:
Perfect experience for our users. A simplistic implementation which helps us express aggregate time data as expected by our users.