- File Size 2.11 MB
- File Count 1
- Create Date February 14, 2024
- Last Updated February 14, 2024
Data cleaning (also called data cleansing) is the process of finding and dealing with problematic data points within a data set. It can involve fixing or removing incomplete data, cross-checking data against a validated data set, standardizing inconsistent data, and more. By commonly cited estimates, data scientists spend 50% to 80% of their time cleaning data before they can start analyzing it.
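To make these three activities concrete, here is a minimal sketch in Python. The ebook itself works in Excel, so this is only an illustration under stated assumptions: the sample rows, the field names, the `VALID_COUNTRIES` reference set, the `COUNTRY_ALIASES` table, and the `clean_rows` helper are all hypothetical and do not come from the ebook.

```python
# Hypothetical sample records; field names and values are invented for illustration.
RAW_ROWS = [
    {"name": "Ana", "country": " usa ", "age": "34"},
    {"name": "Ben", "country": "U.S.A.", "age": ""},
    {"name": "Chen", "country": "Canada", "age": "41"},
    {"name": "Dee", "country": "Atlantis", "age": "29"},
]

# Assumed validated reference set, plus known spelling variants of its entries.
VALID_COUNTRIES = {"United States", "Canada"}
COUNTRY_ALIASES = {
    "usa": "United States",
    "u.s.a.": "United States",
    "canada": "Canada",
}


def clean_rows(rows):
    """Return (clean, rejected) lists after three basic cleaning steps."""
    clean, rejected = [], []
    for row in rows:
        # 1. Standardize inconsistent spellings of the same value.
        key = row["country"].strip().lower()
        country = COUNTRY_ALIASES.get(key, row["country"].strip())
        fixed = {**row, "country": country}

        # 2. Set aside incomplete rows (any blank field).
        if any(not str(value).strip() for value in fixed.values()):
            rejected.append((fixed, "incomplete"))
        # 3. Cross-check the value against the validated reference set.
        elif country not in VALID_COUNTRIES:
            rejected.append((fixed, "unknown country"))
        else:
            clean.append(fixed)
    return clean, rejected


clean, rejected = clean_rows(RAW_ROWS)
```

Rejected rows are kept alongside a reason rather than silently dropped, so that each one can be reviewed and either fixed or removed.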
In this ebook, we describe simple yet crucial techniques to help you clean your data effectively. So that you can apply what you learn later, we have included a number of examples and exercises for hands-on practice. All you need is Microsoft Excel (version 2007 or later). Don’t worry if you are unfamiliar with Excel; each chapter will teach you everything you need to know.