Data Cleaning as a Process
◼ Data discrepancy detection
  ◼ Use metadata (e.g., domain, range, dependency, distribution)
  ◼ Check for field overloading
  ◼ Check the uniqueness rule, consecutive rule, and null rule
  ◼ Use commercial tools
    ◼ Data scrubbing: use simple domain knowledge (e.g., postal codes, spell-checking) to detect errors and make corrections
    ◼ Data auditing: analyze the data to discover rules and relationships, then detect violators (e.g., use correlation and clustering to find outliers)
◼ Data migration and integration
  ◼ Data migration tools: allow transformations to be specified
  ◼ ETL (Extraction/Transformation/Loading) tools: allow users to specify transformations through a graphical user interface
◼ Integration of the two processes
  ◼ Iterative and interactive (e.g., Potter's Wheel)
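The uniqueness, consecutive, and null rules listed above can be sketched as simple checks over a list of records. This is a minimal illustration, not how any particular commercial tool works: the function names, the sample field names (`check_no`, `zip`), and the set of markers treated as null (`""`, `"?"`, `"N/A"`) are all assumptions made for the example.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]

# Assumed null markers; a real null rule would be defined per dataset.
NULL_MARKERS = {"", "?", "N/A"}


def is_null(value: Any) -> bool:
    """Treat None and the assumed marker strings as null."""
    return value is None or (isinstance(value, str) and value.strip() in NULL_MARKERS)


def check_null_rule(records: List[Record], field: str) -> List[int]:
    """Return indices of records whose field is null or missing."""
    return [i for i, r in enumerate(records) if is_null(r.get(field))]


def check_unique_rule(records: List[Record], field: str) -> List[int]:
    """Return indices of records that repeat an earlier non-null value."""
    seen = set()
    duplicates = []
    for i, r in enumerate(records):
        value = r.get(field)
        if is_null(value):
            continue
        if value in seen:
            duplicates.append(i)
        seen.add(value)
    return duplicates


def check_consecutive_rule(records: List[Record], field: str) -> List[int]:
    """Return integer values missing between the lowest and highest value."""
    values = {r[field] for r in records if not is_null(r.get(field))}
    if not values:
        return []
    return [v for v in range(min(values), max(values) + 1) if v not in values]


# Hypothetical sample data for illustration only.
records = [
    {"check_no": 101, "zip": "61801"},
    {"check_no": 102, "zip": ""},
    {"check_no": 102, "zip": "61820"},
    {"check_no": 105, "zip": None},
]

null_violations = check_null_rule(records, "zip")
unique_violations = check_unique_rule(records, "check_no")
missing_values = check_consecutive_rule(records, "check_no")
```

Each check returns the violators rather than repairing them, which matches the split on the slide: detection comes first, and corrections are specified afterward as transformations.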