Handling Redundancy in Data Integration
◼ Redundant data often arise when multiple databases are integrated
  ◆ Object identification: The same attribute or object may have different names in different databases
  ◆ Derivable data: One attribute may be a "derived" attribute in another table, e.g., annual revenue
◼ Redundant attributes can often be detected by correlation analysis and covariance analysis
◼ Careful integration of data from multiple sources can help reduce or avoid redundancies and inconsistencies, and improve mining speed and quality
School of Software Engineering, Tongji University
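As an illustration of how correlation and covariance analysis might flag a redundant attribute, here is a minimal sketch in Python. The table, its column names (`monthly_revenue`, `annual_revenue`, `employees`), the helper `redundant_pairs`, and the 0.95 threshold are all hypothetical choices made for this example, not part of the slides. It assumes numeric attributes with non-zero variance.

```python
from math import sqrt


def covariance(xs, ys):
    """Sample covariance of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def pearson(xs, ys):
    """Pearson correlation coefficient; assumes both variances are non-zero."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


def redundant_pairs(table, threshold=0.95):
    """Return attribute pairs whose absolute correlation meets the threshold.

    `table` maps attribute names to equal-length lists of numbers.
    The threshold is an illustrative assumption, not a standard value.
    """
    names = list(table)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(table[a], table[b])) >= threshold:
                pairs.append((a, b))
    return pairs


# Hypothetical integrated table: annual_revenue is derivable from
# monthly_revenue (12 * monthly), so the two attributes are redundant.
table = {
    "monthly_revenue": [10.0, 12.0, 9.0, 15.0, 11.0],
    "annual_revenue": [120.0, 144.0, 108.0, 180.0, 132.0],
    "employees": [5.0, 3.0, 8.0, 4.0, 7.0],
}

print(redundant_pairs(table))
```

With this data only the pair (`monthly_revenue`, `annual_revenue`) is flagged, since one is an exact multiple of the other. A high correlation only suggests redundancy; whether an attribute can actually be dropped still requires a check against the schema.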