Monday, 14 September 2009

Merge Rows as Columns / Transpose records


Requirement: Converting rows to columns


Source rows:

Customer  Product  Cost
Cust1     P1       10
Cust1     P2       20
Cust1     P3       30
Cust2     ABC      10
Cust2     P2       25
Cust2     Def      10

Target rows:

Customer  Product1  Cost1  Product2  Cost2  Product3  Cost3
Cust1     P1        10     P2        20     P3        30
Cust2     ABC       10     P2        25     Def       10

The illustration above shows the requirement: we had to merge multiple records into one record based on certain criteria. The design had to be reusable, since each dimension in the data mart required this flattening logic.

1. Approach:
An Aggregator transformation could group the records by a key, but retrieving the values of a particular column as individual columns is a challenge. We therefore designed a component, ‘Flattener’, based on the Expression transformation.
Flattener is a reusable component: a mapplet that flattens records.
It consists of an Expression and a Filter transformation. The Expression clubs incoming records together based on a configurable condition, and the Filter transformation decides when a record is written to the target.
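To make the target behaviour concrete, here is a minimal Python sketch of the same row-to-column flattening outside Informatica. It is only a reference for the expected result, not the mapplet itself. It assumes the input is already sorted by customer, and the function name `flatten_rows` and the tuple layout are hypothetical, chosen for illustration.

```python
from itertools import groupby
from operator import itemgetter


def flatten_rows(rows):
    """Merge consecutive rows that share a customer into one wide record.

    rows: iterable of (customer, product, cost) tuples, sorted by customer.
    Returns a list of lists: [customer, product1, cost1, product2, cost2, ...].
    """
    flattened = []
    for customer, group in groupby(rows, key=itemgetter(0)):
        record = [customer]
        for _, product, cost in group:
            # Each incoming row contributes one (product, cost) pair.
            record.extend([product, cost])
        flattened.append(record)
    return flattened


rows = [
    ("Cust1", "P1", 10), ("Cust1", "P2", 20), ("Cust1", "P3", 30),
    ("Cust2", "ABC", 10), ("Cust2", "P2", 25), ("Cust2", "Def", 10),
]
result = flatten_rows(rows)
```

Here `result` holds one list per customer, matching the two target rows in the requirement.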

2. Design:
The mapplet can receive up to five inputs, of the following data types:
i_Col1 (string), e.g. Customer
i_Col2 (string), e.g. Product
i_Col3 (decimal), e.g. Cost
i_Col4 (decimal)
i_Col5 (date/time)
The port names are kept generic and cover several data types, so that the mapplet can be used in any scenario that needs records flattened.
The mapplet emits 15×5 output ports, in repeating sets of five:
o_F1_1 (string), e.g. Customer
o_F2_1 (string), e.g. Product1
o_F3_1 (decimal), e.g. Cost1
o_F4_1 (decimal)
o_F5_1 (date/time)
o_F1_2 (string), e.g. Customer
o_F2_2 (string), e.g. Product2
o_F3_2 (decimal), e.g. Cost2
o_F4_2 (decimal)
o_F5_2 (date/time)
… … and so on
The output record has repeating sets of five columns, where each set corresponds to one incoming row. The number of sets can be increased if the requirement calls for it, and only the required fields need to be mapped. For the example above we use just two strings and one decimal, to map Customer, Product and Cost.
The mapplet receives records from its parent mapping. The Expression first saves each incoming value to a variable port and compares it with its counterpart from the previous row, which stays held in the variable for as long as the condition to flatten is satisfied.
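Syntax to store current and previous values (port name, data type, port type where i = input and v = variable, expression):
i_Col2     string  i
prv_Col2   string  v  curr_Col2
curr_Col2  string  v  i_Col2
Variable ports are evaluated in the order listed, so prv_Col2 picks up the previous row's value before curr_Col2 is overwritten with the incoming one.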
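The previous/current pattern can be sketched in Python as follows. This is an illustrative analogue rather than Informatica code: the class name `PrevCurrTracker` is hypothetical, and `None` stands in for whatever initial value the variable ports hold on the first row.

```python
class PrevCurrTracker:
    """Mimics two variable ports that are evaluated top-down for every row."""

    def __init__(self):
        self.prv_col2 = None   # plays the role of prv_Col2
        self.curr_col2 = None  # plays the role of curr_Col2

    def next_row(self, i_col2):
        # prv_Col2 is evaluated first, so it picks up the value that
        # curr_Col2 still holds from the previous row.
        self.prv_col2 = self.curr_col2
        # curr_Col2 is evaluated second and takes the incoming value.
        self.curr_col2 = i_col2
        return self.prv_col2, self.curr_col2


tracker = PrevCurrTracker()
pairs = [tracker.next_row(product) for product in ["P1", "P2", "P3"]]
```

Swapping the two assignments would make both attributes hold the current value, which is why the port order matters.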
The condition used to flatten records is parameterized and set before the mapping runs, which makes the code easier to reuse. The parameterized logic is passed to the Expression transformation through a mapplet parameter. The value is evaluated as an expression, and the result is a flag value of either 1 or 2.
Syntax for port – flag
flag integer v $$Expr_compare
An example of the parameterized expression:
$$Expr_compare = iif (curr_Col1 = prv_Col1 AND curr_Col2 != prv_Col2, 1, iif (curr_Col1 != prv_Col1, 2))
A variable port named “rec_count” is incremented, based on the flag.
Syntax for port – rec_count
rec_count integer v iif (flag=2,0, iif (flag=1,rec_count + 1,rec_count))
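The interplay of the flag and rec_count can be traced with a small Python sketch. The two functions mirror the example expressions above. The function names are hypothetical, and the fallback value 0 for rows that match neither branch is an assumption, since the example expression gives no explicit else value.

```python
def compare_flag(curr_col1, prv_col1, curr_col2, prv_col2):
    """Python rendering of the example $$Expr_compare expression."""
    if curr_col1 == prv_col1 and curr_col2 != prv_col2:
        return 1  # same key, new value: keep flattening
    if curr_col1 != prv_col1:
        return 2  # key changed: a new group starts
    return 0      # assumption: neither branch matched (e.g. a duplicate row)


def next_rec_count(flag, rec_count):
    """Python rendering of the rec_count port."""
    if flag == 2:
        return 0
    if flag == 1:
        return rec_count + 1
    return rec_count


rows = [("Cust1", "P1"), ("Cust1", "P2"), ("Cust1", "P3"),
        ("Cust2", "ABC"), ("Cust2", "P2")]
prv = (None, None)  # stand-in for the ports' initial values
rec_count = 0
trace = []
for curr in rows:
    flag = compare_flag(curr[0], prv[0], curr[1], prv[1])
    rec_count = next_rec_count(flag, rec_count)
    trace.append((flag, rec_count))
    prv = curr
```

In `trace`, each group starts with flag 2 and rec_count 0, and rec_count then increases by one for every further row of the same customer.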
The Expression transformation then uses the values in the ports “flag” and “rec_count” to decide the placeholder for each incoming value, i.e. the column in the target table that the data will ultimately move into. This process is iterative and continues for as long as the comparison logic ($$Expr_compare) holds, i.e. until all records of the group have been flattened. An example of a placeholder expression is shown below:
v_Field1 <data type> v iif (flag=2 AND rec_count=0, curr_Col1, v_Field1)
Port “write_flag_1” is set to 1 when the comparison logic fails (meaning flattening is complete). The Filter transformation lets a record through to the target only once it is completely transposed.
Filter condition:
write_flag_1 integer v iif (flag=2 AND write_flag>1 ,1,0)
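Putting the pieces together, the following Python sketch shows one possible reading of the placeholder and write-flag logic as a row-by-row state machine. It is not the author's exact filter condition: the function name `flatten_stream`, the simplified key-only comparison, the padding with `None`, and the flush of the last group at end of input are all assumptions made for illustration.

```python
def flatten_stream(rows, max_sets=3):
    """Yield one fixed-width record per key from rows sorted by key.

    rows: iterable of (customer, product, cost) tuples.
    max_sets: number of repeating output sets (the post uses 15).
    """
    def emit(key, slots):
        # Pad unused sets so every record has the same number of columns,
        # much like unmapped output ports staying empty.
        padded = slots + [(None, None)] * (max_sets - len(slots))
        return [key] + [value for pair in padded for value in pair]

    prv_key = None
    slots = []  # plays the role of the v_Field* placeholder ports
    for key, product, cost in rows:
        flag = 2 if key != prv_key else 1  # simplified comparison on the key
        if flag == 2 and slots:
            # Key changed: the held record is complete, so let it through.
            yield emit(prv_key, slots)
            slots = []
        if len(slots) < max_sets:
            # The slot index corresponds to rec_count for this row.
            slots.append((product, cost))
        # Assumption: rows beyond max_sets are dropped in this sketch.
        prv_key = key
    if slots:
        # Assumption: the final group is flushed at end of input.
        yield emit(prv_key, slots)


rows = [
    ("Cust1", "P1", 10), ("Cust1", "P2", 20), ("Cust1", "P3", 30),
    ("Cust2", "ABC", 10), ("Cust2", "P2", 25),
]
records = list(flatten_stream(rows))
```

A group with fewer rows than `max_sets` ends up with trailing `None` values, which corresponds to leaving the later output sets unpopulated.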

3. Outcome:
After developing and implementing the code we found it to be a useful utility, so we thought of sharing it. We would like to hear suggestions from readers on other ways of achieving the same functionality. Please do share your views.

1 comment:

  1. I feel Informatica is the best way of providing a more concrete base through which problems can be solved.

    Informatica Read Rest API
