Outlier detection from ETL Execution trace

Computer Science – Databases

Scientific paper

Details

2011 3rd International Conference on Electronics Computer Technology (ICECT 2011)

10.1109/ICECTECH.2011.5942112

Extract, Transform, Load (ETL) is an integral part of Data Warehousing (DW) implementation. The commercial tools used for this purpose capture a large amount of execution trace in the form of various log files containing a plethora of information. However, there has been hardly any initiative in which ETL logs have been analyzed proactively to improve ETL efficiency. In this paper we use an outlier detection technique to find the processes that vary most from the group in terms of execution trace. As our experiment was carried out on actual production processes, we treat any outlier as a signal rather than as noise. To identify the input parameters for the outlier detection algorithm, we conducted a survey among the developer community with a varied mix of experience and expertise. We use simple text parsing to extract from the logs the features shortlisted through the survey. Subsequently we applied a clustering-based outlier detection technique to the logs. Through this process we reduced our domain of detailed analysis from 500 logs to 44 logs (about 8 percent). Among the 5 outlier clusters, 2 are of genuine concern, while the other 3 stand out because of the huge number of rows involved.
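The pipeline described in the abstract (parse features from each log, cluster the logs, then inspect the small clusters) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract names neither the log format, the surveyed features, nor the clustering algorithm. The log fields (`Rows processed`, `Errors`, `Elapsed seconds`), the use of k-means with farthest-point initialization, and the `max_share` threshold are all assumptions chosen only for illustration.

```python
import math
import re

# Hypothetical log fields; real ETL tools emit their own formats, and the
# paper's surveyed features are not listed in the abstract.
PATTERNS = {
    "rows": re.compile(r"rows processed:\s*(\d+)", re.I),
    "errors": re.compile(r"errors:\s*(\d+)", re.I),
    "seconds": re.compile(r"elapsed seconds:\s*(\d+(?:\.\d+)?)", re.I),
}


def parse_log(text):
    """Extract one numeric feature vector from a log; missing fields become 0."""
    vector = []
    for pattern in PATTERNS.values():
        match = pattern.search(text)
        vector.append(float(match.group(1)) if match else 0.0)
    return vector


def standardize(rows):
    """Z-score each column so that no single feature dominates the distance."""
    columns = list(zip(*rows))
    means = [sum(c) / len(c) for c in columns]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(columns, means)
    ]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]


def kmeans(points, k, iters=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def outlier_indices(vectors, k=3, max_share=0.2):
    """Flag logs in clusters holding at most max_share of all logs (assumed rule)."""
    labels = kmeans(standardize(vectors), k)
    sizes = {j: labels.count(j) for j in set(labels)}
    limit = max_share * len(vectors)
    return [i for i, lab in enumerate(labels) if sizes[lab] <= limit]
```

Calling `outlier_indices([parse_log(t) for t in log_texts], k=5)` would return the positions of the logs that fall into small clusters. As the abstract notes, a flagged cluster is only a candidate for detailed analysis: some clusters may stand out simply because of a very large row count.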

Profile ID: LFWR-SCP-O-17990
