Monday, June 8, 2009

Data Correction and Cleansing Mechanisms


One of the most overlooked issues during system implementations is data cleanup. Before going live with a new system, it's almost always wise to cleanse, fix, and consolidate data from the legacy system.

It's both hard and naive to try to fix enterprise data in a vacuum, so let's consider people, process, and technology together.

It is often the consultant's role to identify potential or probable duplicates during a project. Whether I use one of the many specialized data cleanup tools or a stalwart such as Microsoft Access or Excel, I have found that it's typically not terribly difficult to identify potential duplicates--i.e., questionable records. The key word here is "potential": many records need to be examined manually before they can be consolidated, purged, or retired.
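To make the identification step concrete, here is a minimal sketch in Python of flagging potential duplicates by name similarity. It is not taken from any particular cleanup tool: the record layout (`id`, `name`), the sample vendors, the helper names, and the 0.85 threshold are all assumptions made for illustration. The output is a review list of suspects, not a list of confirmed duplicates.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def normalize(name):
    """Lowercase, drop punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    return " ".join(cleaned.split())


def find_potential_duplicates(records, threshold=0.85):
    """Return (id_a, id_b, score) for pairs whose names look alike.

    These are candidates for manual review, not confirmed duplicates.
    The threshold is an assumed starting point, not a recommended value.
    """
    suspects = []
    for a, b in combinations(records, 2):
        score = SequenceMatcher(
            None, normalize(a["name"]), normalize(b["name"])
        ).ratio()
        if score >= threshold:
            suspects.append((a["id"], b["id"], round(score, 2)))
    return suspects


# Hypothetical legacy vendor records, invented for illustration.
vendors = [
    {"id": 101, "name": "Acme Supply, Inc."},
    {"id": 102, "name": "ACME Supply Inc"},
    {"id": 103, "name": "Baker Tools"},
]
suspects = find_potential_duplicates(vendors)
```

Comparing every pair is fine for a few thousand records but grows quadratically, so a larger legacy table would likely need to be grouped first (for example by postal code) before pairs are compared.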

However, identification is simply the first step in the process--and often the easiest. Once suspect records have been isolated, they must be investigated and ultimately fixed. Here's where it's usually a good idea to stop using phrases such as "not terribly difficult."

Some people become defensive when presented with data errors. Generally speaking, I try to say very innocently that "someone may have done something wrong." I find that it's much less confrontational than pointing a finger. Often, end-users are quick to plead ignorance or blame predecessors for mistakes. In the event that they themselves have made the mistakes (audit trails are pretty hard to dispute), the tone of the conversation is quite different. There's usually a reason that an end-user did what s/he did.

It's ultimately the client's role to make the call on what to do with suspect records. Far too often, however, end-users do not have the time, desire, or skill set to make these calls. (See my post last month on the different focuses of consultants and end-users.) Failure to address data issues in a timely manner typically causes many problems, from cascading delays on other project tasks to incomplete testing.


On IT projects, vendors during the sales cycle (and project managers during the engagement) sometimes underestimate the amount of time required to clean up key enterprise information. Technology helps with this imperative exercise but is no panacea for sloppy data.

With more than a decade of experience, Phil Simon assists organizations in all phases of systems consulting, including vendor selection, project management, business needs analysis, gap analysis, system testing and design, end-user training, interface and custom report development, and documentation. The result for his clients: superior systems, increased ROI, and a healthier bottom line.

Phil is the author of the book Why New Systems Fail and a seasoned independent systems consultant. He started his company in 2002 after six years of related corporate experience. With his extensive knowledge of both well-known and homegrown applications, he has cultivated over twenty clients from a wide variety of industries, including healthcare, manufacturing, retail, and the public sector.

Phil is a graduate of the School of Industrial and Labor Relations at Cornell University (MILR) and Carnegie Mellon (B.S., Policy and Management). He lives in northern New Jersey, USA.
