This article first appeared in Digital Edge, The Edge Malaysia Weekly on November 2, 2020 - November 8, 2020
Much of Malaysia’s rapid digitisation effort that is being undertaken by both the public and private sectors is built on a foundation of high-quality and granular data. Consequently, the Malaysian Administrative Modernisation and Management Planning Unit (Mampu) launched an open data portal (www.data.gov.my) in 2014 to make some government data freely available for the public to use and republish at no cost. While this is an admirable initiative, there are critics who complain about things such as the quality of data provided and the policies that govern open government data. Digital Edge spoke to experts on the state of play.
When it comes to the state of open government data in Malaysia, there is plenty of room for improvement, says Kuek Ser Kuang Keng, data journalist and founder of Data-N, a data journalism training programme for newsrooms.
“First, some of the important data is not timely. This means that the latest data you can obtain could be two to three years old. This data, such as that for crime and road accidents, is important to journalists and researchers, and can affect people’s everyday lives,” says Kuek, an open data advocate who has provided consultation to the Malaysian government on the implementation of its open data policy.
Second, he notes, the data is not granular, meaning it is not detailed enough. “For example, in the US, the demographic data is so granular that you can get the information down to the sub-districts, which is their version of a mukim. You can even get demographic data, such as age, gender and income levels, down to a single block, which is one or two streets.
“In contrast, the level of granularity in Malaysia is by districts, such as Petaling in Selangor. Petaling can mean Petaling Jaya plus all the areas surrounding it, with a population of a few hundred thousand people. This kind of data is not very useful,” he says.
To illustrate the importance of high-quality open government data, Kuek cites an example of a budding entrepreneur looking to open a restaurant in the US. To pick a location for the restaurant, the entrepreneur can visit the census website to find out a neighbourhood’s income levels and direct and indirect competitors, which will significantly impact the decision-making process.
However, it will be difficult to do the same in Malaysia. The entrepreneur would have to visit the site of the location himself, or hire expensive intelligence and market survey services generally reserved for large corporations. Thus, having open government data can help reduce the cost of running a business and the gap between large corporations and small and medium enterprises (SMEs).
Kuek also notes that Malaysia’s open data policy and standards are behind Indonesia and Singapore, where data can be accessed via an application programming interface (API). An API allows data to be collected and updated automatically without requiring significant manual input.
“Even during the Covid-19 pandemic, the data that we have access to in Malaysia is semi-open. Internationally, you can just download the CSV file [which is a raw data format] from the website, and connect the database via an API.
“But in Malaysia, although the Ministry of Health does provide us with data during its daily briefings, it is still locked in a PDF format, and there is no API for us to pull the data from the system automatically.”
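The difference between a CSV file and a PDF can be shown with a minimal Python sketch. It assumes a small CSV with hypothetical column names (date, state, new_cases) and made-up placeholder values, none of which come from any real data release. Because the file is machine-processable, a few lines of code can total the figures without anyone retyping them.

```python
import csv
import io

# Hypothetical sample, not real figures: the column names and values are
# placeholders invented purely for illustration.
SAMPLE_CSV = """date,state,new_cases
2020-01-01,State A,10
2020-01-01,State B,5
2020-01-02,State A,7
"""


def total_by_state(csv_text):
    """Sum the new_cases column for each state in a CSV string."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        state = row["state"]
        totals[state] = totals.get(state, 0) + int(row["new_cases"])
    return totals


totals = total_by_state(SAMPLE_CSV)
print(totals)
```

The same figures locked in a PDF would first have to be extracted or retyped by hand before any such calculation could run.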
“While a great deal of government data is available and published online, most of it is not in an ‘open’ format,” says Ashraf Shaharudin, a research associate at Khazanah Research Institute (KRI).
Ashraf published a discussion paper titled “Open Government Data: Principles, Benefits and Evaluations” on Sept 22 this year. He says that, to be considered open, data should fulfil a set of requirements: it must be complete, granular, timely, accessible, machine-processable and non-proprietary.
“For example, a lot of annual report data is published in a PDF format. This means that the data is not machine-processable and is not downloadable in bulk,” he says.
“There is also the issue of granularity, where data published is not down to the lowest level of granularity for a meaningful and accurate analysis, not only in terms of spatial aspects (such as local districts as opposed to state) but also in terms of gender, ethnicity, income brackets, age and so on.”
Ashraf says KRI utilises open government data to conduct analysis on socio-economic issues and provide evidence-based policy recommendations. An example would be KRI’s biennial flagship publication “The State of Households” (SOH) report, which uses data largely obtained from the “Household Income and Expenditure” report published by the Department of Statistics Malaysia (DOSM).
He also points out that there are other open data portals besides the one managed by Mampu. While there may be some overlap in data sets, KRI has utilised open data obtained from DOSM and the Ministry of International Trade and Industry, which have their own open data portals.
To improve the current state of Malaysia’s open data policy, authorities should make government data open by default, and the system should be institutionalised. This initiative is also in line with the government’s digital transformation agenda, says Ashraf.
He adds that the authorities should consider streamlining the roles of government agencies in managing and sharing data. There should also be standardisation of data formats as this allows users to easily combine different data sets. The authorities should also have a proper data inventory for easy data searchability, improve the API of data portals and digitise old archives.
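Why standardised formats make data sets easy to combine can be sketched in a few lines of Python. The example assumes two hypothetical data sets that share a common district_code field. The field names, codes and values are placeholders for illustration only and do not come from any real portal.

```python
# Hypothetical records: the field names, codes and values are placeholders.
population = [
    {"district_code": "D01", "population": 1000},
    {"district_code": "D02", "population": 2500},
]
clinics = [
    {"district_code": "D01", "clinics": 2},
    {"district_code": "D02", "clinics": 5},
]


def join_on(key, left, right):
    """Merge rows from two lists of dicts that share the same key value."""
    right_index = {row[key]: row for row in right}
    return [
        {**row, **right_index[row[key]]}
        for row in left
        if row[key] in right_index
    ]


combined = join_on("district_code", population, clinics)
print(combined)
```

If one agency labelled districts by code and another by a differently spelt name, this simple join would fail, and users would have to reconcile the labels by hand first.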
“Based on multiple open government data evaluations that I have conducted for my discussion paper, Malaysia trails behind neighbouring countries such as the Philippines and Indonesia, as well as many other developing countries [in terms of the quality of open government data],” says Ashraf.
“If the data is not published in an open format, it will not be useful for data users, hence it will be underutilised. Therefore, government data needs to be made open first, then we can move towards increasing awareness about data available for use and enhancing data skills among data users and the public.”
One key issue surrounding Malaysian open data currently is that the datasets are not updated often enough, says Fusionex director of big data analytics Gan Chun Yee.
“Without the most current information for us to utilise, data becomes obsolete and users risk their analytics results being no longer accurate or relevant. Without a reason to trust the open data, organisations are less likely to rely on open data, and that hinders its chances of bringing value,” says Gan.
Fusionex is an international data technology provider that uses open government data, both foreign and domestic, as a reference point for its data analytics exercises. When used together with the company’s proprietary software systems, open data has helped the company gain a deeper understanding of local, regional and global business landscapes.
Gan further explains that it is important for open data to be current and updated, as well as trustworthy, accurate, reliable and complete. These attributes help Fusionex better understand the current state of the industry, benchmark business performance and adopt the most appropriate predictive models for use in any given scenario.
Having low-quality data will affect the results of any analytical model, he adds. If not addressed immediately, it may lead to undesirable outcomes that may have adverse effects on the decision-making process.
Despite its weaknesses, Malaysia’s open data portal does have a wide variety of datasets available. Gan says the introduction of the portal in 2014 enabled Fusionex to utilise greater volumes of data from verified sources and to obtain datasets from a central platform.
This has led to better insights with the potential to spark the development of new products and services, accurately segment markets and improve operational efficiencies, which in turn leads to savings on expenditure and resources.
“The content and knowledge found in these datasets can be used to fuel advanced analytics, machine learning and optimisation to fast-track the development of innovative tools and services.
“Malaysia is fortunate as we have numerous sources of data and publicly funded surveys; DOSM has generated a wellspring of solid data. The data compiled regarding various demographics across a multitude of industries can be put to good use by data analysts and scientists for the benefit of all of us, as it provides useful insights that can be used to guide decision-making and strategic planning,” says Gan.
He points out that Malaysia’s open data policy is still at a nascent stage, and requires a lot of work to realise its fullest potential. He hopes to see the authorities implementing a robust framework and process flow to ensure the legitimacy of the data and to maintain data consistency.
If implemented correctly, however, Gan believes Malaysia’s open government data has the capability of creating new opportunities and innovation across all businesses, especially for entrepreneurs and SMEs, which may not have as many resources at their disposal as the titans of their respective industries.
Red Angpow, a data analytics company, uses both local and international open data to conduct analysis on the real estate industry and for urban planning.
However, co-founder Faizal Abd Kadir finds himself using more data from global providers, such as World Bank open data, than from local ones, owing to the global providers’ better data structure, availability of historical data and granularity.
He also prefers data obtained from private sources, such as property listing portals, associations and even social media and blogs, because many of them have better standards for collecting and publishing data compared with the ones adopted by those handling the official open data portal.
Faizal says he still contacts the respective ministries to request more detailed data than what is currently available to the public, which is something he did during his previous stint in consumer banking, well before the launch of the open data portal in 2014.
He points out that basic information such as housing transactions, granular data from the National Property Information Centre (Napic) and mapping data is not available on the open data portal.
“The government data is quite outdated, and the latest data you can get is from 2018. Some reports are even from as far back as 2010. The data is also not granular, and it is categorised on the state level, or even the country level,” says Faizal.
“If you look at Singapore, in terms of data structure and tools, it would have visualisation tools built in [to the website] where we can immediately visualise the data and publish it. Here, we only have raw data, which is very simple data.”
Faizal notes that the push for open and transparent data started a long time ago in western countries, whereas Malaysia started only in 2014, hence it needs to catch up quite a bit. However, he believes that the issues faced stem from the lack of importance placed on open data.
“Managing open data is not a daily job for some government departments, so it tends to be forgotten. There is a need to monitor the performance of each ministry, and they should comply with proper open data policies. Right now, we do not see that,” he says.
“This is not only just for me, but for other businesses as well. Having well-structured, clean and timely data helps us analyse it more quickly, as compared with preparing and cleaning up the data, which takes a lot of time.
“Imagine, it only takes one or two government employees to clean up the data before publishing it. But now, we have hundreds of companies employing hundreds of employees to clean up the same set of data, and it is a waste of resources.”