As an admin who's managed Salesforce for 10+ enterprises (including healthcare, SaaS, and manufacturing), I've seen data quality issues cost companies millions. Bad data isn't just messy—it causes failed campaigns, compliance risks, and broken integrations. You don't need to be a data scientist to fix it. Here's the battle-tested checklist I use on day one of every new org.
Before anything else, verify your master data is clean. I once inherited a retail org where 42% of accounts had duplicate or near-duplicate names ("Acme Inc" vs "Acme Inc. Ltd"). This caused revenue leakage in their analytics. Start with exact-name matches; variants like the pair above need fuzzy matching on top:
SELECT Name, COUNT(Id) FROM Account GROUP BY Name HAVING COUNT(Id) > 1
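The GROUP BY query only catches exact name matches, so a pair like "Acme Inc" and "Acme Inc. Ltd" slips through. Here is a minimal Python sketch of one way to catch those on exported rows. The suffix list, the function names, and the sample records are my own illustrative assumptions, not a Salesforce feature:

```python
import re
from collections import defaultdict

# Hypothetical suffix list; extend it for your own data.
LEGAL_SUFFIXES = {"inc", "ltd", "llc", "corp", "co", "gmbh"}

def normalize_account_name(name: str) -> str:
    """Lowercase, drop punctuation, and strip trailing legal suffixes."""
    words = re.sub(r"[^a-z0-9\s]", " ", name.lower()).split()
    while words and words[-1] in LEGAL_SUFFIXES:
        words.pop()
    return " ".join(words)

def group_near_duplicates(accounts):
    """Return {normalized name: [Ids]} for names shared by 2+ accounts."""
    groups = defaultdict(list)
    for acct in accounts:
        groups[normalize_account_name(acct["Name"])].append(acct["Id"])
    return {key: ids for key, ids in groups.items() if key and len(ids) > 1}

# Made-up sample rows in the shape of an exported Account query.
sample = [
    {"Id": "001A", "Name": "Acme Inc"},
    {"Id": "001B", "Name": "Acme Inc. Ltd"},
    {"Id": "001C", "Name": "Globex LLC"},
]
print(group_near_duplicates(sample))  # {'acme': ['001A', '001B']}
```

Treat the output as a review queue, not a merge list: aggressive normalization will also group unrelated companies that happen to share a short name.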
For contacts, check for matching emails across accounts:
SELECT Email, COUNT(Id) FROM Contact WHERE Email != null GROUP BY Email HAVING COUNT(Id) > 1
Missing or invalid data in key fields triggers regulatory nightmares. At one healthcare client, missing NPI numbers (physician IDs) blocked 30% of claims submissions. Audit these fields:
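SELECT COUNT(Id) FROM Account WHERE Industry = null to spot gaps.
SELECT Industry FROM Account WHERE Industry NOT IN ('MFG', 'Retail', 'Tech') to find values outside your approved list.
Phone formats need a different approach. SOQL's LIKE supports only the % and _ wildcards, not character classes such as [0-9], and SOQL has no -- comment syntax, so a pattern filter on Phone will not run. Pull the values with SELECT Id, Phone FROM Contact WHERE Phone != null and check the format outside SOQL, or enforce it going forward with a validation rule that uses REGEX().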
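Format checks like the phone audit are easy to run on exported rows with a regular expression. A rough Python sketch follows. The rule it applies (only digits and common formatting characters, 10 to 15 digits in total) is my assumption; adjust it for extensions or local numbering in your org. The sample contacts are made up:

```python
import re

# Assumed rule: digits plus common formatting characters only.
ALLOWED_CHARS = re.compile(r"^[0-9+\-().\s]+$")

def is_suspect_phone(phone):
    """True if the value has stray characters or an implausible digit count."""
    if not phone or not ALLOWED_CHARS.match(phone):
        return True
    digits = re.sub(r"\D", "", phone)
    # Assumed plausible range: 10 to 15 digits.
    return not 10 <= len(digits) <= 15

# Made-up rows in the shape of an exported Contact query.
contacts = [
    {"Id": "003A", "Phone": "(415) 555-0100"},
    {"Id": "003B", "Phone": "call front desk"},
    {"Id": "003C", "Phone": "555-01"},
]
suspect_ids = [c["Id"] for c in contacts if is_suspect_phone(c["Phone"])]
print(suspect_ids)  # ['003B', '003C']
```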
Integrations fail when source data is bad. A manufacturing client's ERP sync failed daily because 25% of Product SKUs had stray trailing spaces ("P-100" vs "P-100 "). Fix this before going live:
SELECT External_ID__c, COUNT(Id) FROM Product2 GROUP BY External_ID__c HAVING COUNT(Id) > 1
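The duplicate query treats "P-100" and "P-100 " as different values, so padded keys need their own check. Here is a small Python sketch for exported Product2 rows. It reuses the External_ID__c field from the query; the function name and sample records are hypothetical:

```python
from collections import defaultdict

def find_whitespace_issues(products, field="External_ID__c"):
    """Return (Ids with padded values, {trimmed value: Ids} that collide)."""
    padded = []
    by_trimmed = defaultdict(list)
    for row in products:
        value = row.get(field) or ""
        trimmed = value.strip()
        if value != trimmed:
            padded.append(row["Id"])
        if trimmed:
            by_trimmed[trimmed].append(row["Id"])
    collisions = {key: ids for key, ids in by_trimmed.items() if len(ids) > 1}
    return padded, collisions

# Made-up rows in the shape of an exported Product2 query.
products = [
    {"Id": "01tA", "External_ID__c": "P-100"},
    {"Id": "01tB", "External_ID__c": "P-100 "},
    {"Id": "01tC", "External_ID__c": "P-200"},
]
padded, collisions = find_whitespace_issues(products)
print(padded)      # ['01tB']
print(collisions)  # {'P-100': ['01tA', '01tB']}
```

Check collisions before trimming in bulk: once the padding is gone, those records become true duplicates on the key.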
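SELECT Id, Name, OwnerId FROM Account WHERE Owner.IsActive = false to find accounts owned by deactivated users. (OwnerId on Account is a required lookup to User, so truly orphaned owners are rare; inactive owners are the practical problem.)
Old data buried in archives causes errors. A financial services client had 10K inactive leads from 2018 that were still triggering email campaigns. Audit ruthlessly: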
SELECT Id FROM Lead WHERE IsConverted = false AND CreatedDate < LAST_YEAR
SELECT Id FROM Contact WHERE AccountId = null
Don't just scan; act. My rule: if a field isn't used in 80% of reports or workflows, kill it. And always, always validate before deleting. I once deleted a "Lead Source" option that had 200 records, only to realize it was the only one used by a critical partner. Use SELECT LeadSource, COUNT(Id) FROM Lead GROUP BY LeadSource ORDER BY COUNT(Id) DESC LIMIT 5 to find your top 5 sources before pruning, then check the record count for any value you plan to remove.
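A top-5 list will not surface a low-volume value like the 200-record source in that story, so I also like a per-value check before pruning. Here is a tiny Python sketch. It assumes you have already pulled per-value record counts (for example from a GROUP BY query) into a dictionary; the function name and numbers are illustrative only:

```python
def check_before_pruning(usage_counts, candidates):
    """Split candidate picklist values into unused and still-in-use."""
    safe, blocked = [], {}
    for value in candidates:
        count = usage_counts.get(value, 0)
        if count == 0:
            safe.append(value)
        else:
            blocked[value] = count
    return safe, blocked

# Hypothetical counts per LeadSource value.
usage_counts = {"Web": 5400, "Partner Referral": 200, "Trade Show": 0}
safe, blocked = check_before_pruning(
    usage_counts, ["Partner Referral", "Trade Show", "Fax"]
)
print(safe)     # ['Trade Show', 'Fax']
print(blocked)  # {'Partner Referral': 200}
```

Anything in blocked needs a migration plan and a conversation with whoever owns those records before the value goes away.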
Quality data isn't a project—it's a daily habit. The checklist above is your shield against chaos. But let's be real: even the best admins miss something. That's why I use OrgScanner to run a 5-minute health scan on every org I touch. It catches duplicates, missing fields, and integration risks I’d otherwise overlook.
Get your free Salesforce health scan today—no strings, no sales pitch. Just actionable insights to fix the 3 data issues that’ll cost you the most this quarter.