How Many of Your Spreadsheets are Topping Record Limits?

By Michelle Meyer

Depending on the number of columns and your version of Excel, record limits vary. But, generally speaking, in the later versions of Excel, if you’re topping 500K records, you’re starting to push limits.

It’s not an uncommon problem. Offices are managing increased amounts of data (compared to 10 years ago). It isn’t just data volume, there’s a lot of other work involved as well. Just to maintain all the spreadsheets, you most likely have to regularly export a lot of the data out of your proprietary records management system. Then, of course, there are all those columns of information that have to be hand entered, because your records management system doesn’t have the data.

In addition, the information in all those spreadsheets is related. So… beyond all the copying/pasting of data and importing information from your records management system, you also have to edit data across the multiple spreadsheets. If there is a change to the data, you have to make sure that change is reflected in every affected spreadsheet.

On top of it all, if you didn’t create the spreadsheet system, you can’t always understand the logic behind it. This causes a lot of problems when you have to copy/paste/ and start new spreadsheets for new datasets, or new time periods. You’re concerned because there are so many different spreadsheets, with so much data and so much copying and pasting of data, that there may be errors.

Worst, is the time you spend building reports from the data in various spreadsheets. Just trying to bring it all together, synthesize it in a way that reflects reality, and makes sense to those who want the report takes hours of your time.

Data volume is one thing, maintaining large datasets in a spreadsheet is another dimension altogether. Once your Excel record-sets get so large that you’re pushing volume limits, you’re also pushing limits of manageability. Excel is designed to analyse data, it is not designed to store (and manage) large volumes of data.

Data is dynamic and managing it requires dynamic solutions

You’re not alone; this problem is quite common in offices. And there are productive ways to deal with it. Firstly, you’ll want to keep in mind one reality when it comes to managing information. Data is dynamic and managing it requires dynamic solutions.

Look at your data through the lens of evolution. Since data is dynamic, proper management requires the ability to evolve and transform. Managing the information with spreadsheets may have made sense when you started tracking the data years ago. But, if you’re pushing volume limits; then you’re also spending an inordinate amount of time maintaining the data, and you’re probably concerned about data accuracy as well. These are all signs that it is time to move to a new stage in data evolution.

Microsoft created its data management products to accommodate information evolution. The first stage of data evolution is the spreadsheet stage. One person may need to track data specific to his/her job and just starts a new spreadsheet. As time passes, and more people need access to the information, the original spreadsheet application grows and multiplies and data volume explodes. Before long the situation has evolved into the system you have today.

Actually this situation can be viewed as an opportunity. Since the spreadsheets are pushing very unforgiving limits, the potential to take your information management to the next stage in data evolution is high. The second level of data evolution is the local database level. At this level, data from various sources are often combined and streamlined into one core local database.

Managing Local Data with a Database instead of a Spreadsheet

Local databases are unlike your enterprise software solutions. Local databases are “local” to one group of users. These types of databases manage information that doesn’t quite “fit” into system wide proprietary software solutions. Local databases are a step-up from spreadsheet applications. When you move away from spreadsheets to databases, you gain the ability to manage your data in a more streamlined fashion.

In addition databases are designed to handle high volume, so the size limits you have in Excel go away. Databases allow you to organize data in such a way that you can eliminate duplicate data entry and the necessity to reproduce spreadsheets for various datasets. Database structure also allows for building standardized reports that can be used for multiple datasets.

Moving away from a spreadsheet solution, to a local database solution, will solve a lot of the problems you’re experiencing with your spreadsheet applications. You’ll be able to manage related information in one core database, rather than distributed across many different spreadsheets.

It’s still possible to import data from proprietary software. But, since databases are built off of a table structure, with relationship capabilities, you will be able to store imported data in the same database. Database relationships will eliminate the need copy/paste and start new spreadsheets for new datasets. With tables and table relationships the information can all be stored in the same database.

If a local database is properly built, the need for average users to understand database logic decreases. Properly built databases will dramatically reduce the need for average users to build and rebuild reports, or other output vehicles. Since standardised reports, exports, and datasheets can be built into the system and reused over and over again, only the person programming the database really needs to comprehend the underlying logic.

Database table structure allows for more control over data integrity, reducing duplicate records, and keeping your data clean. Spreadsheets are fraught with the capacity to produce errors.

One of the reasons errors occur so much in spreadsheet applications, is because end-users copy/paste and reproduce the same spreadsheet “template” for multiple datasets. This dynamic can cause breakdown in data integrity when users forget to check calculations during the reproduction process. It can cause breakdown in data integrity when users forget to check for data duplication in the reproduction process. With a database solution, this problem is completely eliminated because proper database construction eliminates the need to reproduce databases for new datasets.

The Microsoft Office Suite is the most popular software on the market. This is because the Office Suite comes with different software packages for managing different office jobs. Microsoft Word manages word processing. Microsoft Outlook handles all the email and calendar tasks. Powerpoint is great for professional presentations. Excel is best used for data analysis. And Access is the database portion of MS Office Pro, it is best used for data storage and management. Within the Microsoft suite of products, Excel is the first stage in data management and Access is the next step up from Excel.

Microsoft gives most offices the tools they need for daily operations. But, Microsoft Access is the most underutilized portion of the Microsoft Office Suite. This can be a costly mistake for any office, because using Microsoft Access to collect, store and manage your local data can save you hours of time, a great deal of headaches with your spreadsheet applications, and it can save you money.

Record volume alone makes the move to Access, worth your time and money. MS Access applications can be developed with SQL server database backends. SQL Server is quite capable of handling high data volume, managing referential integrity between different record-sets, and much more. You can find more information about the benefits of using Microsoft Access for local database needs here.

Michelle Meyer is a consultant who specialises in managing data local to one group of people and integrating that data with larger proprietary software systems.