Wednesday, 23 April 2014

How to Extract the Middle Name of an Employee from Ename?

Suppose the data in the ename column looks like this:
empno    ename
1        Sudhansu Sekher Dash
2        Amiya Prasad Mishra
In the target we have to send only the middle name, like this:
empno    ename
1        Sekher
2        Prasad
These are the steps for achieving this:
  1. Drag the source into the mapping and connect it to an Expression transformation.
  2. In the Expression, create two ports: Name1 (a variable port) and Middle_Name (an output port).
  3. In Name1, write the condition for extracting the middle name.
  4. In Middle_Name, write the condition to output that value.
  5. Then send the rows to the target.
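The condition in Name1 is typically a SUBSTR/INSTR expression that picks out the word between the first and second spaces of ename. As an illustration (plain Python, not PowerCenter expression language), the equivalent logic is:

```python
def middle_name(ename):
    """Return the word between the first and second spaces of a
    three-part name: the same idea as the usual SUBSTR/INSTR
    expression in the Name1 variable port."""
    parts = ename.split()
    # No middle word to return for one- or two-part names.
    return parts[1] if len(parts) >= 3 else None

print(middle_name("Sudhansu Sekher Dash"))  # Sekher
print(middle_name("Amiya Prasad Mishra"))   # Prasad
```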
Thanks
Ur's Hari
If you like this post, please share it by clicking on g+1 Button.

Tuesday, 22 April 2014

Interview Questions Set 5

101. What is the difference between an Expression transformation and an Aggregator transformation?
An Expression transformation permits you to perform calculations on a row-by-row basis only, while an Aggregator lets you perform calculations on groups of rows.
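To make the contrast concrete, here is a small Python sketch (an illustration with made-up sample rows, not PowerCenter code): the first loop is an Expression-style row-by-row calculation, the second an Aggregator-style group calculation.

```python
# Made-up sample rows for illustration.
rows = [{"dept": 10, "sal": 1000},
        {"dept": 10, "sal": 2000},
        {"dept": 20, "sal": 1500}]

# Expression-style: a calculation applied to every row individually.
for r in rows:
    r["bonus"] = r["sal"] * 0.1

# Aggregator-style: a calculation that collapses groups of rows.
totals = {}
for r in rows:
    totals[r["dept"]] = totals.get(r["dept"], 0) + r["sal"]

print(totals)  # {10: 3000, 20: 1500}
```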
102.What is a source qualifier? What is meant by Query Override?
A Source Qualifier represents the rows that the PowerCenter Server reads from a relational or flat file source when it runs a session. When a relational or flat file source definition is added to a mapping, it is connected to a Source Qualifier transformation.

The PowerCenter Server generates a query for each Source Qualifier transformation whenever it runs the session. The default query is a SELECT statement containing all the source columns. The Source Qualifier lets you override this default query by changing the default settings of the transformation properties. The list of selected ports, and the order in which they appear in the default query, should not be changed in the overridden query.
103. What are the contents of a session log in general?
-    Allocation of system shared memory
-    Execution of pre-session / post-session commands
-    Session initialization
-    Creation of SQL commands for reader/writer threads
-    Start/end timings for target loading
-    Errors encountered during the session
-    Load summary of reader/writer/DTM statistics
104. What are the types of partitioning in Informatica?
  I. Round-robin partitioning: The Informatica Server distributes data evenly among all partitions. Use round-robin partitioning where you want each partition to process approximately the same number of rows.
  II. Hash partitioning: The Informatica Server applies a hash function to a partition key to group data among partitions. If you select hash auto-keys, the Informatica Server uses all grouped or sorted ports as the partition key. If you select hash user keys, you specify a number of ports to form the partition key. Use hash partitioning where you want to ensure that the Informatica Server processes groups of rows with the same partition key in the same partition.
  III. Key range partitioning: You specify one or more ports to form a compound partition key. The Informatica Server passes data to each partition depending on the ranges you specify for each port. Use key range partitioning where the sources or targets in the pipeline are partitioned by key range.
  IV. Pass-through partitioning: The Informatica Server passes all rows at one partition point to the next partition point without redistributing them. Choose pass-through partitioning where you want to create an additional pipeline stage to improve performance, but do not want to change the distribution of data across partitions.
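The first two schemes can be sketched in a few lines of Python (an illustration of the assignment idea only, not Informatica internals; the modulo stands in for the real hash function):

```python
NUM_PARTITIONS = 3
# Made-up rows; "dept" plays the role of the partition key.
depts = [10, 20, 10, 30, 20, 10]

# Round-robin: rows are dealt out evenly, regardless of content.
round_robin = [i % NUM_PARTITIONS for i in range(len(depts))]

# Hash: rows with the same key always land in the same partition.
hash_part = [dept % NUM_PARTITIONS for dept in depts]

print(round_robin)  # [0, 1, 2, 0, 1, 2]
print(hash_part)    # [1, 2, 1, 0, 2, 1]
```

Note that round-robin balances row counts, while hash partitioning keeps all rows for dept 10 together even though the partitions end up uneven.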
105. How will you optimize a Lookup transformation?
  I. Delete the unwanted columns from the lookup, as they inflate the lookup cache considerably.
  II. If the Lookup transformation is after the Source Qualifier and there is no active transformation in between, you can instead use a SQL override in the Source Qualifier.
  III. If the cache you assigned for the lookup is not sufficient to hold the lookup's data or index, whatever does not fit is spilled into the cache files designated in $PMCacheDir. When PowerCenter does not find the data you are looking up in the cache, it swaps data from the files into the cache and keeps doing this until it finds the data. This is quite expensive, being an I/O operation. Increase the cache so that the whole data set resides in memory.
  IV. Sequential and concurrent caches: the 8.x version of PowerCenter gives us the option to build the lookup caches either concurrently or sequentially, depending on the business rule. If no business rule dictates otherwise, concurrent cache building is a very handy option.

106. What is the difference between a connected lookup and an unconnected lookup?
Connected Lookup:
- Takes input values directly from other transformations in the pipeline.
- Dynamic caching can be used.
- Returns the default value when there is no matching record.
- Returns as many columns as you link from the lookup table.
Unconnected Lookup:
- Doesn't take inputs directly from any other transformation; it can be used in any transformation (like an Expression) and invoked as a function using the :LKP expression, so an unconnected lookup can be called multiple times in a mapping.
- Dynamic caching is not applicable.
- Returns NULL when there is no matching record.
- Has only one return port.

107. What is the use of an update override query in the target instance?
  I. The update can be performed based on a target column value.
  II. The update can be performed without defining a key column in Informatica or in the database.

108.What is a data warehouse?
It is a process for building decision support systems and a knowledge management environment that supports
both day-to-day tactical decision making and long-term business strategies.
Bill Inmon’s definition:  "Subject-oriented, integrated, time variant, non-volatile collection of data in support
of management's decision making process."
Ralph Kimball’s definition:  "A copy of transaction data specifically structured for query and analysis."
109. List out which transformations are active and which are passive.
Aggregator - active & connected
Source Qualifier - active & connected
Filter - active & connected
Expression - passive & connected
Joiner - active & connected
Lookup - passive & connected/unconnected
HTTP - passive & connected
Normalizer - active & connected
Rank - active & connected
Router - active & connected
Sequence Generator - passive & connected
Sorter - active & connected
Stored Procedure - passive & connected/unconnected
Union - active & connected
Java - active & connected
SQL - active/passive & connected
110. What is the use of index, what is the impact in the performance when DML operation is performed in the indexed table?
Oracle uses indexes to avoid the need for large full-table scans and disk sorts, which are required when the SQL optimizer cannot find an efficient way to service the SELECT query.
Performance decreases when DML operations such as INSERT, UPDATE and DELETE are carried out, because the indexes on the table must be maintained along with the data.
111. Which transformations cannot be used upstream of a Union transformation?
Sequence generator and Update strategy
112. How do you configure mapping variables and how do you use them?
Go to Mappings --> Parameters and Variables and create $$variable_name.
There are 3 aggregation types, namely COUNT, MIN and MAX.
Once the variable is created, it can be used in an Expression transformation with functions like SETCOUNTVARIABLE($$variable_name), SETMAXVARIABLE($$Variable_name, value) and SETVARIABLE($$Variable_name, value).
SETCOUNTVARIABLE($$variable_name)
Counts the rows evaluated by the function and increments the current value of a mapping variable based on the count. Increases the current value by one for each row marked for insertion. Decreases the current value by one for each row marked for deletion. Keeps the current value the same for each row marked for update or reject.
SETMAXVARIABLE( $$Variable_name, value )
Sets the current value of a mapping variable to the higher of two values: the current value of the variable or the value you specify. The function executes only if a row is marked for insert.
SETVARIABLE ($$Time, SYSDATE)
Use SETVARIABLE with mapping variables that have a Max/Min aggregation type. The SETVARIABLE function executes only if a row is marked as insert or update.
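The per-row behaviour of SETCOUNTVARIABLE described above can be sketched in Python (an illustration, not PowerCenter code):

```python
def set_count_variable(current, row_type):
    """Increment for rows marked for insert, decrement for delete,
    leave unchanged for update or reject."""
    if row_type == "insert":
        return current + 1
    if row_type == "delete":
        return current - 1
    return current  # update or reject

# Made-up row types flowing through a session run.
value = 0
for row_type in ["insert", "insert", "delete", "update", "insert"]:
    value = set_count_variable(value, row_type)
print(value)  # 2
```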
113. How are errors handled when Pushdown Optimization is enabled?
When the Integration Service pushes transformation logic to the database, it cannot track errors that occur in the database.
When the Integration Service runs a session configured for full pushdown optimization and an error occurs, the database handles the errors. When the database handles errors, the Integration Service does not write reject rows to the reject file.
If we configure a session for full pushdown optimization and the session fails, the Integration Service cannot perform incremental recovery because the database processes the transformations. Instead, the database rolls back the transactions. If the database server fails, it rolls back transactions when it restarts. If the Integration Service fails, the database server rolls back the transaction.
114. What are some common error code prefixes shown in the session log?
BR - errors related to the reader process, including ERP, flat file and relational sources
CMN - errors related to databases, memory allocation, lookup, joiner and internal errors
DBGR - errors related to the debugger
SE, TM, WID - errors related to transformations
PMF - errors related to caching in aggregator, lookup, joiner and rank
RR - errors related to relational sources
EP - errors related to external procedures
LM - errors related to the Load Manager
REP - errors related to repository functions
VAR - errors related to mapping variables
WRT - errors related to the writer
115.What are the two types of errors that occur in a session run?
Fatal and Non-Fatal errors
Non-Fatal Errors: a non-fatal error does not force the session to stop on its first occurrence. Establish the error threshold in the session Properties tab with the "Stop on errors" option. When you enable this option, the server counts non-fatal errors that occur in the reader, writer and transformations.
  I. Reader errors can include alignment errors while running a session in Unicode mode.
  II. Writer errors can include key constraint violations, loading NULL into a NOT NULL field, and database errors.
  III. Transformation errors can include conversion errors and any condition set up as an ERROR, such as NULL input.
Fatal Errors: This occurs when the server cannot access the source, target or repository. This can include loss of connection or target database errors, such as lack of database space to load data.
If the session uses Normalizer or Sequence Generator transformations and the server cannot update the sequence values in the repository, a fatal error occurs.
116.What are the different types of commit interval?
Target based commit and Source based commit
(a)    Target based commit
-    The server commits data based on the number of target rows and the key constraints on the target table. The commit point also depends on the buffer block size and the commit interval.
-    During a session, the server continues to fill the writer buffer after it reaches the commit interval. When the buffer block is full, the Informatica server issues a commit command. As a result, the amount of data committed at the commit point generally exceeds the commit interval.
-    The server commits data to each target based on primary/foreign key constraints.
(b)    Source based commit
-    Server commits data based on the number of source rows. The commit point is the commit interval you configure in the session properties.
-    During a session, the server commits data to the target based on the number of rows from an active source in a single pipeline. The rows are referred to as source rows.
-    A pipeline consists of a source qualifier and all the transformations and targets that receive data from source qualifier.
-    Although the Filter, Router and Update Strategy transformations are active transformations, the server does not use them as active sources in a source based commit session.
-    When a server runs a session, it identifies the active source for each pipeline in the mapping. The server generates a commit row from the active source at every commit interval.
-    When each target in the pipeline receives the commit row, the server performs the commit.
117. Explain about Reject Loading?
  I. During a session, the server creates a reject file for each target instance in the mapping. If the writer of the target rejects data, the server writes the rejected row into the reject file.
  II. You can correct the rejected data and re-load it into relational targets using the reject loading utility. (You cannot load rejected data into a flat file target.)
  III. Reading rejected data:
Ex: 3,D,1,D,D,0,D,1094345609,D,0,0.00
There are two main things: the row indicator and the column indicator.
Row indicator: the row indicator tells the writer what to do with the row of data.

Row Indicator  Meaning  Rejected By
0              Insert   Writer or target
1              Update   Writer or target
2              Delete   Writer or target
3              Reject   Writer


Column indicator: a column indicator accompanies every column of data (the first one comes right after the row indicator) and defines the type of that column's data.

Column Indicator  Meaning     Writer Treats As
D                 Valid data  Good data. The target accepts it unless a database error occurs, such as finding a duplicate key.
?                 Overflow    Bad data
N                 Null        Bad data
T                 Truncated   Bad data
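Putting the two indicator tables together, a reject-file row can be decoded with a short script. This is a hedged sketch, not an official utility, and the sample row is made up:

```python
ROW_MEANING = {"0": "Insert", "1": "Update", "2": "Delete", "3": "Reject"}
COL_MEANING = {"D": "Valid", "?": "Overflow", "N": "Null", "T": "Truncated"}

def parse_reject_row(line):
    """Split one reject-file row into its row indicator and
    (column indicator, value) pairs."""
    fields = line.strip().split(",")
    row_indicator = ROW_MEANING.get(fields[0], "Unknown")
    # After the row indicator, indicators and data values alternate.
    pairs = [(COL_MEANING.get(fields[i], fields[i]), fields[i + 1])
             for i in range(1, len(fields) - 1, 2)]
    return row_indicator, pairs

# Made-up sample row: an insert with two valid columns and a null one.
print(parse_reject_row("0,D,101,D,Alice,N,NULL"))
```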


118. Explain the various threads created during a session run?
Thread                 Function
Master thread          Handles stop and abort requests from the Load Manager
Mapping thread         One thread for each session; fetches session and mapping information, compiles the mapping, cleans up after execution
Reader thread          One thread for each partition; relational sources use relational threads and flat files use file threads
Writer thread          One thread for each partition; writes to the target
Transformation thread  One or more transformation threads for each partition

119. How to set up Event-Raise and Event-Wait tasks?
Ans:
  I. Create an event: right-click the workflow and choose Edit -> Events tab -> create an event.
  II. Place an Event-Raise task in the workflow space and give the created event name in the "User Defined Event" attribute (Properties tab).
  III. Place an Event-Wait task in the workflow space. There are 2 options in its Events tab:
     Pre-defined: this is a file watch.
     User-defined: this is used for the event created in the workflow properties.
When the Event-Raise task executes, it raises the event, which triggers the Event-Wait task to continue.

64.   Timer task:
Ans: With the Timer task you can specify a period of time to wait before the Integration Service runs the next task in the workflow.
Two options:
Absolute time: give the exact time when to start the next task, or refer to a date-time variable.
Relative time: give the hours, minutes and seconds to wait, measured
    - from the start time of this task,
    - from the start time of the parent workflow/worklet, or
    - from the start time of the top-level workflow.

65.   Decision task:
Ans: Use the Decision task instead of multiple link conditions in a workflow. Instead of specifying multiple link conditions, use the predefined condition variable in a Decision task to simplify link conditions.

66.   Assignment task:
Ans: You can assign a value to a user-defined workflow variable with the Assignment task.

67.   Control task:
Ans: We can use the Control task to stop, abort, or fail the top-level workflow or the parent workflow based on an input link condition.
Control Option      Description
Fail Me             Fails the Control task.
Fail Parent         Marks the status of the workflow or worklet that contains the Control task as failed.
Stop Parent         Stops the workflow or worklet that contains the Control task.
Abort Parent        Aborts the workflow or worklet that contains the Control task.
Fail Top-Level WF   Fails the workflow that is running.
Stop Top-Level WF   Stops the workflow that is running.
Abort Top-Level WF  Aborts the workflow that is running.

HOW TO REVERSE THE CONTENTS OF FLAT FILE – INFORMATICA

I have a flat file and want to reverse its contents, which means the first record should become the last record and the last record should become the first, and then load into the target file.
As an example consider the source flat file data as
Informatica Enterprise Solution
Informatica Power center
Informatica Power exchange
Informatica Data quality
The target flat file data should look as
Informatica Data quality
Informatica Power exchange
Informatica Power center
Informatica Enterprise Solution
Solution:
Follow the below steps for creating the mapping logic
  • Create a new mapping.
  • Drag the flat file source into the mapping.
  • Create an expression transformation and drag the ports of source qualifier transformation into the expression transformation.
  • Create the below additional ports in the expression transformation and assign the corresponding expressions
Variable port: v_count = v_count+1
Output port o_count = v_count
  • Now create a sorter transformation and drag the ports of expression transformation into it.
  • In the sorter transformation specify the sort key as o_count and sort order as DESCENDING.
  • Drag the target definition into the mapping and connect the ports of sorter transformation to the target.
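The same logic can be sketched in Python (an illustration, not PowerCenter code): number each row as it is read, then sort by that number in descending order, exactly as the expression and sorter transformations do above.

```python
rows = [
    "Informatica Enterprise Solution",
    "Informatica Power center",
    "Informatica Power exchange",
    "Informatica Data quality",
]
# v_count / o_count: number each row as it arrives.
numbered = [(i + 1, row) for i, row in enumerate(rows)]
# Sorter on o_count, descending.
reversed_rows = [row for _, row in sorted(numbered, reverse=True)]
print(reversed_rows[0])  # Informatica Data quality
```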

HOW TO LOAD ALL RECORDS EXCEPT LAST N - INFORMATICA

Q) I want to load all the records from my source, which is a file, except the last 5 records. In an interview this can be asked as: "How do you remove the footer record, which is the last record?"
Example: My source file contains the following records:
Name
----
A
B
C
D
E
F
G
After excluding the last 5 records, I want to load A and B into the target. How do we implement the mapping logic for this in Informatica?
Solution: Follow the below steps
  • Create a sequence generator transformation and connect the ports of the source qualifier transformation, along with the NEXTVAL port of the sequence generator, to a sorter transformation.
  • In the sorter transformation, check the key box corresponding to the NEXTVAL port and change the direction to Descending.
  • Create one more sequence generator transformation and a filter transformation.
  • Connect the NEXTVAL port of the second sequence generator transformation to the filter, and the Name port of the sorter transformation to the filter.
  • Specify the filter condition as NEXTVAL > 5.
  • Save the mapping. Create a workflow and session, save the workflow and run it.
You can use the same approach to remove a footer record from the source by specifying the filter condition as NEXTVAL > 1.
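A hedged Python sketch of the same idea (an illustration, not PowerCenter code); it keeps rows whose position counted from the end is greater than N, which is what the descending NEXTVAL filter achieves, while preserving the original order:

```python
def all_but_last_n(records, n=5):
    """Keep rows whose position counted from the end is greater than n
    (the second sequence generator's NEXTVAL in the mapping above),
    preserving the original order."""
    total = len(records)
    return [rec for i, rec in enumerate(records) if total - i > n]

print(all_but_last_n(["A", "B", "C", "D", "E", "F", "G"]))  # ['A', 'B']
```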

Thursday, 17 April 2014

Interview Questions Set 4

76. How do you assign values from one session to another session?
1. Create a workflow variable.
2. Once the first session succeeds, use "Post-session on success variable assignment" to assign the session value to the workflow variable.
3. In the second session, use "Pre-session variable assignment" to re-assign the workflow variable value to a session variable.
 

77. I want to run the second session only after running the first session 3 times?
1. Create a workflow variable, Name: $$Session_Counter, Datatype: Integer, Persistent: checked.

2. Create an Assignment task and place it in between the two sessions.

3. Edit the Assignment task -> Expressions tab and add the following expression:
IIF($$Session_Counter = 3, 1, $$Session_Counter + 1)

4. In the link from the Assignment task to the second session, set a condition so the second session runs only when $$Session_Counter reaches 3.
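The behaviour of that expression can be sketched in Python (an illustration, not PowerCenter code): the persistent counter cycles 1, 2, 3 across runs, and the second session is released on every run where it reaches 3.

```python
def next_counter(current):
    # IIF($$Session_Counter = 3, 1, $$Session_Counter + 1)
    return 1 if current == 3 else current + 1

counter = 0  # assumed initial value of the persistent variable
for run in range(1, 7):
    counter = next_counter(counter)
    action = "run session 2" if counter == 3 else "skip session 2"
    print(f"run {run}: counter={counter}, {action}")
```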

78. How do you override the default lookup ORDER BY?
We can override the default lookup ORDER BY clause by placing two dashes '--' as a comment notation after the ORDER BY clause, to suppress the ORDER BY clause that the Integration Service generates.

79. What happens if one of the Source Qualifier ports is not linked to the source instance?
The Integration Service throws an internal error, for example: The Source Qualifier contains an unbounded field [LOC].

80. What is the difference between TRUE and FALSE in the Filter transformation?
A filter condition returns TRUE or FALSE for each row that the Integration Service evaluates, depending on whether the row meets the specified condition.
- For each row that returns TRUE, the Integration Service passes the row through the transformation.
- For each row that returns FALSE, the Integration Service drops the row and writes a message to the session log.

81. Can We Use Source and Target objects inside Mapplets?
We can use Source objects inside the mapplets, but we cannot use Target objects inside the mapplets.

82. Explain the User Defined Functions in Informatica?
User-defined functions extend the PowerCenter transformation language. We can create and manage user-defined functions with the PowerCenter transformation language in the Designer. We can add them to expressions in the Designer or Workflow Manager to reuse expression logic and build complex expressions. User-defined functions are available to other users in a repository.

83. What is pmcmd command and why we use it?
pmcmd is a program you use to communicate with the Integration Service. With pmcmd, we can perform some of the tasks that you can also perform in the Workflow Manager, such as starting and stopping workflows and sessions.

84. How you run workflow or session using command?
Command to run Workflow:
pmcmd startworkflow -sv IntService -d Domain -u user -p password -f Practice wf_s_m_sample

Command to run Session:
pmcmd starttask -sv IntService -d Domain -u user -p password -f Practice –w wf_s_sample s_m_sample

85. How many ways you can run a workflow or session or task?
We can run a workflow or session or task in 2 ways, Command line mode or Interactive mode.

86. What is use of code page?
A code page contains encoding to specify characters in a set of one or more languages. The code page is selected based on the source of the data. For example, if the source contains Japanese text, then the code page should be selected to support Japanese text.

When a code page is chosen, the program or application for which the code page is set, refers to a specific set of data that describes the characters the application recognizes. This influences the way that application stores, receives, and sends character data.

87. What is a surrogate key?
A surrogate key is a substitution for the natural primary key. It is a unique identifier or number for each record of a dimension table that can be used for the primary key to the table instead of a "real" or natural key.

88. What is difference between Mapplet and reusable transformation?
Mapplet: a mapplet is a set of transformations created in the Mapplet Designer based on some business logic. It can be reused in more than one mapping, or across other folders if it is created in a shared folder.
Reusable Transformation: a single transformation with some business logic, created in the Transformation Developer. It can be reused in more than one mapplet or mapping, or across other folders if it is created in a shared folder.

89. What is lookup function and how do you call it?
Lookup function is nothing, but searching for a value in a lookup source columns. The lookup function compares data in a lookup source to a value we specify. When the Integration Service finds the search value in the lookup table, it returns the value from a specified column in the same row in the lookup table.

Example:
Syntax: LOOKUP(result, search1, value1, search2, value2), where result is the port whose value is returned, each search is a lookup source column, and each value is the input value to match.

LOOKUP(:TD.EMP.HIREDATE, :TD.EMP.EMPID, 112233, :TD.EMP.DEPTNO, 10)        

Note:
- This LOOKUP function is used in an Expression transformation instead of an unconnected lookup or an uncached lookup.
- It returns NULL if the search does not find any matching values.
- It returns an error if the search finds more than one matching value.
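A Python analogue of the LOOKUP call above (an illustration with made-up sample rows, not PowerCenter code), showing the NULL and error behaviours from the notes:

```python
# Made-up sample rows standing in for the EMP lookup source.
EMP = [
    {"EMPID": 112233, "DEPTNO": 10, "HIREDATE": "1998-02-01"},
    {"EMPID": 445566, "DEPTNO": 20, "HIREDATE": "2001-06-15"},
]

def lookup(rows, result, **conditions):
    """Return the `result` column of the single row matching all
    search/value conditions; None (NULL) if no match; error if many."""
    matches = [r for r in rows if all(r[k] == v for k, v in conditions.items())]
    if not matches:
        return None
    if len(matches) > 1:
        raise ValueError("search finds more than one matching value")
    return matches[0][result]

print(lookup(EMP, "HIREDATE", EMPID=112233, DEPTNO=10))  # 1998-02-01
print(lookup(EMP, "HIREDATE", EMPID=999999))             # None
```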

90. What is the difference between truncate and delete options in the session?
The Truncate option truncates all the data from the target table before loading.
The Delete option deletes from the table only the records that are flagged for deletion in the Update Strategy.

91. What is Forward Rejected Rows?
The Forward Rejected Rows option is used to either pass rejected rows to the next transformation or drop them. By default, the Integration Service forwards rejected rows to the next transformation: it flags the rows for reject and writes them to the session log file and the reject file. If you do not select the Forward Rejected Rows option, the Integration Service drops rejected rows and writes them to the session log file.

92. Under what condition selecting Sorted Input in aggregator may fail the session?
If unsorted data is passed to the Aggregator while the Sorted Input option is checked, the session fails.

93. What are the difference between joiner transformation and source qualifier transformation?
The Source Qualifier transformation is used to join source data from the relation source residing in the same schema, database or system.
The Joiner transformation is used to join source data from two related heterogeneous sources residing in different locations or file systems. You can also join data from the same source.

94. What are two types of processes that run the session?
The two types of processes that runs the session are Load Manager and DTM process.
  • Load manager process starts the session, creates DTM process, and sends post session email when the session completes.
  • DTM process creates threads to initialize the session, read, write and transform data and handle pre-session and post-session operations.
95. What are the Update Strategy transformation constants and their values?
The Update Strategy transformation constants are Insert, Update, Delete and Reject.
Operation  Constant   Numeric Value
Insert     DD_INSERT  0
Update     DD_UPDATE  1
Delete     DD_DELETE  2
Reject     DD_REJECT  3

96. What is a shared folder?
A shared folder is a global or local folder in which we can create objects like sources, targets, transformations, reusable transformations, mapplets, mappings, sessions, worklets and workflows. After creating an object in a shared folder, it can be used in any other folder within the repository as a shortcut object.

Any modification you make to an object in the shared folder is automatically reflected in the other folders.

97. What is a worklet and its uses?
A worklet is an object that represents a set of tasks created to reuse a set of business logic. It can be used in more than one workflow within the folder.

98. What is file list and why it is used?
A file list contains the names and paths of files that have the same definition or structure. When you specify the source file type as "Indirect", the Integration Service reads data from all the files in the file list when the session runs.

99. What is Session and Batches?
Session: a session is a set of instructions that tells the Integration Service how and when to move data from source to target.
Batch: a group of sessions is called a batch.

100. What are slowly changing dimensions?
Over a period of time data changes; if you want to maintain the history of those changes, the process of maintaining that history in a dimension is called a Slowly Changing Dimension.