Tuesday, 22 April 2014

Interview Questions Set 5

101. What is the difference between an Expression transformation and an Aggregator transformation?
An Expression transformation lets you perform calculations only on a row-by-row basis. In an Aggregator transformation you can perform calculations on groups of rows.
102. What is a Source Qualifier? What is meant by Query Override?
The Source Qualifier represents the rows that the PowerCenter Server reads from a relational or flat file source when it runs a session. When a relational or flat file source definition is added to a mapping, it is connected to a Source Qualifier transformation.

The PowerCenter Server generates a query for each Source Qualifier transformation when it runs the session. The default query is a SELECT statement containing all the source columns. The Source Qualifier lets you override this default query by changing the transformation properties; this is the query override. The list of selected ports, and the order in which they appear in the default query, should not be changed in the overridden query.
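Example (a minimal sketch, assuming a hypothetical EMPLOYEES source with EMP_ID, EMP_NAME and DEPT_ID columns):

-- Default query generated for the Source Qualifier
SELECT EMPLOYEES.EMP_ID, EMPLOYEES.EMP_NAME, EMPLOYEES.DEPT_ID
FROM EMPLOYEES

-- Overridden query: same ports in the same order, with an extra filter
SELECT EMPLOYEES.EMP_ID, EMPLOYEES.EMP_NAME, EMPLOYEES.DEPT_ID
FROM EMPLOYEES
WHERE EMPLOYEES.DEPT_ID = 10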
103. What are the contents of a session log, in general?
-    Allocation of system shared memory
-    Execution of pre-session and post-session commands
-    Session initialization
-    Creation of SQL commands for reader/writer threads
-    Start/end timings for target loading
-    Errors encountered during the session
-    Load summary of reader/writer/DTM statistics
104. What are the types of partitioning in Informatica?
I. Round-robin partitioning. The Informatica Server distributes data evenly among all partitions. Use round-robin partitioning where you want each partition to process approximately the same number of rows.
II. Hash partitioning. The Informatica Server applies a hash function to a partition key to group data among partitions. If you select hash auto-keys, the Informatica Server uses all grouped or sorted ports as the partition key. If you select hash user keys, you specify a number of ports to form the partition key. Use hash partitioning where you want to ensure that the Informatica Server processes groups of rows with the same partition key in the same partition.
III. Key range partitioning. You specify one or more ports to form a compound partition key. The Informatica Server passes data to each partition depending on the ranges you specify for each port. Use key range partitioning where the sources or targets in the pipeline are partitioned by key range.
IV. Pass-through partitioning. The Informatica Server passes all rows at one partition point to the next partition point without redistributing them. Choose pass-through partitioning where you want to create an additional pipeline stage to improve performance, but do not want to change the distribution of data across partitions.
105. How will you optimize a Lookup transformation?
I. Delete unwanted columns from the lookup, since they increase the size of the lookup cache considerably (see the override sketch after this list).
II. If the Lookup transformation comes right after the Source Qualifier and there is no active transformation in between, you can instead push the lookup logic into a SQL override on the Source Qualifier.
III. If the cache assigned to the lookup is not large enough to hold the lookup data or index, whatever does not fit is spilled to the cache files in $PMCacheDir. When PowerCenter does not find the data you are looking up in the cache, it swaps data from the files into the cache and keeps doing this until it finds the data. This is quite expensive, being an I/O operation, so increase the cache size so that the whole data set resides in memory.
IV. Sequential and concurrent caches: PowerCenter 8.x gives the option to build the lookup caches either concurrently or sequentially, depending on the business rule. If no business rule dictates otherwise, concurrent cache building is a very handy option.
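As a rough illustration of point I, a lookup SQL override can restrict the cache to just the columns and rows you need (a minimal sketch, assuming a hypothetical DEPARTMENTS lookup table with DEPT_ID, DEPT_NAME and ACTIVE_FLAG columns):

SELECT DEPARTMENTS.DEPT_ID, DEPARTMENTS.DEPT_NAME
FROM DEPARTMENTS
WHERE DEPARTMENTS.ACTIVE_FLAG = 'Y'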

106. What is the difference between a connected lookup and an unconnected lookup?
-    A connected lookup takes input values directly from other transformations in the pipeline. An unconnected lookup does not take inputs directly from any other transformation; it can be invoked from any transformation (such as an Expression) as a function using the :LKP expression, so it can be called multiple times in a mapping.
-    A connected lookup can use dynamic caching. Dynamic caching is not applicable to an unconnected lookup.
-    A connected lookup returns the default value when there is no matching record. An unconnected lookup returns NULL when there is no matching record.
-    A connected lookup can return as many columns as there are in the lookup table. An unconnected lookup has only one return port.
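For example, an unconnected lookup transformation named LKP_GET_DEPT (a hypothetical name) could be called from an output port of an Expression transformation like this:

DEPT_NAME_OUT = :LKP.LKP_GET_DEPT(DEPT_ID)

Here DEPT_ID is the input passed to the lookup, and the single return port of LKP_GET_DEPT supplies the value of DEPT_NAME_OUT.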

107. What is the use of the update override query in the target instance?
I. The update can be performed based on a target column value.
II. The update can be performed without defining a key column in Informatica or in the database (see the sketch below).
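A minimal sketch of a target update override, assuming a hypothetical T_EMPLOYEES target with EMP_NAME and SALARY ports (:TU. refers to the ports of the target instance):

UPDATE T_EMPLOYEES
SET SALARY = :TU.SALARY
WHERE EMP_NAME = :TU.EMP_NAME

Because the WHERE clause is written by hand, the update can match on EMP_NAME even though it is not defined as a key column.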

108. What is a data warehouse?
It is a process for building decision support systems and a knowledge management environment that supports both day-to-day tactical decision making and long-term business strategies.
Bill Inmon's definition: "A subject-oriented, integrated, time-variant, non-volatile collection of data in support of management's decision making process."
Ralph Kimball's definition: "A copy of transaction data specifically structured for query and analysis."
109. List which transformations are active and which are passive.
Aggregator - active & connected
Source Qualifier - active & connected
Filter - active & connected
Expression - passive & connected
Joiner - active & connected
Lookup - passive & connected/unconnected
HTTP - passive & connected
Normalizer - active & connected
Rank - active & connected
Router - active & connected
Sequence Generator - passive & connected
Sorter - active & connected
Stored Procedure - passive & connected/unconnected
Union - active & connected
Java - active & connected
SQL - active/passive & connected
110. What is the use of an index? What is the impact on performance when DML operations are performed on an indexed table?
Oracle uses indexes to avoid large-table full-table scans and disk sorts, which are required when the SQL optimizer cannot find an efficient way to service the SELECT query.
Performance of DML operations such as INSERT, UPDATE and DELETE is reduced, because every change to the table also has to maintain its indexes.
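For instance, assuming a hypothetical EMPLOYEES table, the index below lets queries on DEPT_ID avoid a full-table scan, but every INSERT, UPDATE or DELETE on EMPLOYEES must also maintain the index:

CREATE INDEX IDX_EMP_DEPT ON EMPLOYEES (DEPT_ID);

SELECT EMP_NAME
FROM EMPLOYEES
WHERE DEPT_ID = 10;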
111. Which transformations cannot be used upstream from a Union transformation?
Sequence Generator and Update Strategy.
112. How do you configure mapping variables, and how are they used?
In the Designer, go to Mappings --> Parameters and Variables and create the variable $$variable_name.
There are three aggregation types: Count, Min and Max.
Once the variable is created, it can be used in an Expression transformation with functions such as SETCOUNTVARIABLE($$variable_name), SETMAXVARIABLE($$variable_name, value) and SETVARIABLE($$variable_name, value).
SETCOUNTVARIABLE($$variable_name)
Counts the rows evaluated by the function and increments the current value of a mapping variable based on the count. It increases the current value by one for each row marked for insertion, decreases it by one for each row marked for deletion, and keeps it the same for each row marked for update or reject.
SETMAXVARIABLE($$variable_name, value)
Sets the current value of a mapping variable to the higher of the current value and the value passed in. The function executes only if a row is marked as insert.
SETVARIABLE($$variable_name, value), for example SETVARIABLE($$Time, SYSDATE)
Used with mapping variables that have the Max/Min aggregation type. The SETVARIABLE function executes only if a row is marked as insert or update.
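As a small sketch, assuming a hypothetical mapping variable $$MAX_LOAD_DATE with the Max aggregation type and a LOAD_DATE input port, an output or variable port in an Expression transformation could be defined as:

SETMAXVARIABLE($$MAX_LOAD_DATE, LOAD_DATE)

After a successful session run, the final value of the variable is saved to the repository, so the next run can start from it.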
113. How are errors handled when pushdown optimization is enabled?
When the Integration Service pushes transformation logic to the database, it cannot track errors that occur in the database.
When the Integration Service runs a session configured for full pushdown optimization and an error occurs, the database handles the errors. When the database handles errors, the Integration Service does not write reject rows to the reject file.
If you configure a session for full pushdown optimization and the session fails, the Integration Service cannot perform incremental recovery because the database processes the transformations. Instead, the database rolls back the transactions. If the database server fails, it rolls back transactions when it restarts. If the Integration Service fails, the database server rolls back the transaction.
114. What are some common error codes shown in the session log?
BR - errors related to the reader process, including ERP, flat file and relational sources
CMN - errors related to databases, memory allocation, lookups, joiners and internal errors
DBGR - errors related to the debugger
SE, TM, WID - errors related to transformations
PMF - errors related to caching in the Aggregator, Lookup, Joiner and Rank transformations
RR - errors related to relational sources
EP - errors related to external procedures
LM - errors related to the Load Manager
REP - errors related to repository functions
VAR - errors related to mapping variables
WRT - errors related to the writer
115. What are the two types of errors that occur in a session run?
Fatal and non-fatal errors.
Non-fatal errors: a non-fatal error does not force the session to stop on its first occurrence. Establish the error threshold in the session properties tab with the "Stop on errors" option. When you enable this option, the server counts non-fatal errors that occur in the reader, writer and transformations.
I. Reader errors can include alignment errors while running a session in Unicode mode.
II. Writer errors can include key constraint violations, loading NULL into a NOT NULL field and database errors.
III. Transformation errors can include conversion errors and any condition set up as an ERROR, such as NULL input.
Fatal errors: a fatal error occurs when the server cannot access the source, target or repository. This can include loss of connection or target database errors, such as lack of database space to load data.
If the session uses Normalizer or Sequence Generator transformations and the server cannot update the sequence values in the repository, a fatal error occurs.
116. What are the different types of commit interval?
Target-based commit and source-based commit.
(a)    Target-based commit
-    The server commits data based on the number of target rows and the key constraints on the target table. The commit point also depends on the buffer block size and the commit interval.
-    During a session, the server continues to fill the writer buffer after it reaches the commit interval. When the buffer block is full, the Informatica server issues a commit command. As a result, the amount of data committed at the commit point generally exceeds the commit interval.
-    The server commits data to each target based on primary/foreign key constraints.
(b)    Source-based commit
-    The server commits data based on the number of source rows. The commit point is the commit interval you configure in the session properties.
-    During a session, the server commits data to the target based on the number of rows from an active source in a single pipeline. These rows are referred to as source rows.
-    A pipeline consists of a source qualifier and all the transformations and targets that receive data from the source qualifier.
-    Although the Filter, Router and Update Strategy transformations are active transformations, the server does not use them as active sources in a source-based commit session.
-    When the server runs a session, it identifies the active source for each pipeline in the mapping. The server generates a commit row from the active source at every commit interval.
-    When each target in the pipeline receives the commit row, the server performs the commit.
117. Explain reject loading.
I. During a session, the server creates a reject file for each target instance in the mapping. If the writer or the target rejects data, the server writes the rejected row into the reject file.
II. You can correct the rejected data and reload it into relational targets using the reject loading utility. (You cannot load rejected data into a flat file target.)
III. Reading rejected data:
Ex: 3,D,1,D,D,0,D,1094345609,D,0,0.00
There are two main parts: the row indicator and the column indicators.
Row indicator: the row indicator tells what the writer was trying to do with the row of data when it was rejected.

Row Indicator    Meaning    Rejected By
0                Insert     Writer or target
1                Update     Writer or target
2                Delete     Writer or target
3                Reject     Writer


Column indicator: a column indicator appears after every column of data and defines the type of the data preceding it.

Column Indicator    Meaning       Writer Treats As
D                   Valid data    Good data. The target accepts it unless a database error occurs, such as finding a duplicate key.
?                   Overflow      Bad data
N                   Null          Bad data
T                   Truncated     Bad data


118. Explain the various threads created during a session run.

Thread                    Function
Master thread             Handles stop and abort requests from the Load Manager.
Mapping thread            One thread for each session. Fetches session and mapping information, compiles the mapping and cleans up after execution.
Reader thread             One thread for each partition. Relational sources use relational reader threads and flat file sources use file reader threads.
Writer thread             One thread for each partition. Writes to the target.
Transformation thread     One or more transformation threads for each partition.

119. How do you set up Event-Raise and Event-Wait tasks?
Ans:
I. Create an event: right-click the workflow, choose Edit, go to the Events tab and create an event.
II. Place an Event-Raise task in the workflow space and give the created event name in the "User Defined Event" property (Properties tab).
III. Place an Event-Wait task in the workflow space. There are two options in its Events tab:
     Pre-defined: this is a file watch.
     User-defined: this is used for the event created in the workflow properties.
When the Event-Raise task executes, it raises the event, which triggers the Event-Wait task to continue.

120. Timer task:
Ans: You can specify the period of time to wait before the Integration Service runs the next task in the workflow with the Timer task.
Two options:
Absolute time - give the exact time when to start the next task, or refer to a date-time variable.
Relative time - give the hours, minutes and seconds to wait, measured:
                - from the start time of this task,
                - from the start time of the parent workflow/worklet, or
                - from the start time of the top-level workflow.

121. Decision task:
Ans: Use the Decision task instead of multiple link conditions in a workflow. Instead of specifying multiple link conditions, use the predefined condition variable in a Decision task to simplify link conditions.
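A small sketch of how this is typically used, assuming hypothetical sessions s_Load_A and s_Load_B and a Decision task named Dec_Check. Set the decision condition to:

$s_Load_A.Status = SUCCEEDED AND $s_Load_B.Status = SUCCEEDED

Then the links that follow only need to test the single predefined variable $Dec_Check.Condition, instead of repeating the full condition on every link.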

122. Assignment task:
Ans: You can assign a value to a user-defined workflow variable with the Assignment task.

123. Control task:
Ans: We can use the Control task to stop, abort, or fail the top-level workflow or the parent workflow based on an input link condition.
Control Option        Description
Fail Me               Fails the Control task.
Fail Parent           Marks the status of the workflow or worklet that contains the Control task as failed.
Stop Parent           Stops the workflow or worklet that contains the Control task.
Abort Parent          Aborts the workflow or worklet that contains the Control task.
Fail Top-Level WF     Fails the workflow that is running.
Stop Top-Level WF     Stops the workflow that is running.
Abort Top-Level WF    Aborts the workflow that is running.

Thanks
Ur's Hari