Should you use Microsoft Excel as a database?

A blunder by the UK government cleared that one up.
5 October 2020

Not a database. Source: Shutterstock

Excel is the go-to tool for creating spreadsheets and performing calculations with restricted data sets. For that kind of work, it’s invaluable. But can (or should) it be used as a database?

The answer from a number of cybersecurity experts is, roundly, ‘no’. This is not what Excel was intended for. Excel is useful for small tasks but not for handling large quantities of data.

One of the key issues is security: the software, while simple to use and easy to share and collaborate on, does not provide any significant level of security or user permissions. It’s not designed to be multi-user, so simultaneous use can easily corrupt a file and cause performance issues. If several users do have access, multiple versions could be updated and stored independently, causing confusion over which version is the latest.

Secondly, when used as a database, Excel has a limited capacity per sheet (1,048,576 rows by 16,384 columns in the current .xlsx format) and can become unmanageable as the volume of stored data grows and has to be split across multiple sheets.
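To make the per-sheet limit concrete, here is a minimal Python sketch that checks whether a table would fit in a single worksheet. The limits are the published .xlsx worksheet maximums; the function names and the assumption of one header row per sheet are illustrative, not part of any Excel API.

```python
# Worksheet limits for the .xlsx format (Excel 2007 and later).
XLSX_MAX_ROWS = 1_048_576
XLSX_MAX_COLS = 16_384


def fits_in_one_sheet(n_rows: int, n_cols: int, header_rows: int = 1) -> bool:
    """Return True if n_rows x n_cols of data, plus headers, fits in one sheet."""
    return n_rows + header_rows <= XLSX_MAX_ROWS and n_cols <= XLSX_MAX_COLS


def sheets_needed(n_rows: int, header_rows: int = 1) -> int:
    """Number of sheets needed if every sheet repeats the header rows (assumed)."""
    per_sheet = XLSX_MAX_ROWS - header_rows
    return max(1, -(-n_rows // per_sheet))  # ceiling division


if __name__ == "__main__":
    print(fits_in_one_sheet(2_000_000, 12))
    print(sheets_needed(2_000_000))
```

A check like this only tells you when the data has outgrown the sheet. It does nothing about the security, versioning or multi-user problems described above.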

Paul Norris, Senior Systems Engineer EMEA at Tripwire, told TechHQ that Excel was an “excellent tool” to report and filter data, but that for large datasets, investments should be made in technology that can securely process them and ensure accurate results.

“It’s not unheard of that organizations today use common tools to process data using desktop tools, however, it’s evident that there is a limit to how much data these tools can handle before it becomes unresponsive and potentially produce reports that may have missing data.”

Additional problems include lack of resiliency and potential data loss, he added, with limited controls on what can be deleted and restored if lost.

Excel spreadsheet blunder misses 16,000 UK Covid-19 cases

It came as a surprise, then, when it emerged today that a spike in COVID-19 cases reported in the UK over the weekend was due to a backlog caused by a Microsoft Excel error.

As reported by Public Health England, 15,841 cases between September 25 and October 2 weren’t uploaded to the government dashboard because column lists on an Excel spreadsheet reached their maximum size. This stopped new records from being added automatically.
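The failure described here is a silent one: records stopped being added and nothing flagged it. As a hedged illustration only (this is not how the Public Health England pipeline worked, and every name below is hypothetical), a loader can refuse a batch outright and raise an error when a size limit would be exceeded, rather than dropping rows quietly.

```python
class CapacityError(Exception):
    """Raised when a batch would push the table past its row limit."""


def append_records(table: list, new_records: list, max_rows: int) -> int:
    """Append all of new_records or none of them; never truncate silently.

    `table`, `new_records` and `max_rows` are hypothetical stand-ins for a
    sheet, an incoming batch of results and the sheet's row limit.
    """
    if len(table) + len(new_records) > max_rows:
        raise CapacityError(
            f"batch of {len(new_records)} records does not fit: "
            f"{len(table)} of {max_rows} rows already used"
        )
    table.extend(new_records)
    return len(table)


if __name__ == "__main__":
    sheet = ["case-1", "case-2"]
    append_records(sheet, ["case-3"], max_rows=3)  # fits exactly
    try:
        append_records(sheet, ["case-4"], max_rows=3)
    except CapacityError as err:
        print("upload halted:", err)
```

Failing loudly does not remove the limit, but it turns missing data into an alert that someone can act on.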

For this reason, details weren’t passed on to the country’s test and trace system and, as a result, health officials are now racing to inform tens of thousands of individuals who may have come into close contact with those who have tested positive but weren’t properly reported.

The issue has since been “solved” by splitting the Excel files into batches.

The UK government’s NHS Test and Trace operation cost approximately £12 billion (US$15.5 billion), with much of the work outsourced to companies like Serco.

A number of cybersecurity experts commented on the use of Excel, including Richard Bingley, founder of the Covent Garden-based Global Security Academy, who told the Evening Standard: “It’s very easy to code in errors, which causes over corruption in the data.”

“It’s surprising that these large population statistics are being collected, inputted, and stored in Excel. These are the types of development glitches that should have been ironed out before the system went live.”

Martin Jartelius, CSO at Outpost24, questioned how the storing of medical information in Excel could be viewed as “anything apart from the outmost temporary of solutions […]”

“It is not strange if this was the solution day one, week one, month one, but to see that it’s still in use and having hit the limits of its capacity is more than embarrassing. And to see that the solution has been to ‘split the file in batches’ rather than finding a proper solution to an actual problem even more so.”