1. What is a commit interval, and what are its types?
A commit interval is the interval at which the PowerCenter server commits data to targets during a session. The commit interval is the number of rows you want to use as the basis for the commit point.
Target-based commit: the PowerCenter server commits data based on the number of target rows and the key constraints on the target table. The commit point also depends on the buffer block size and the commit interval.
Source-based commit: the PowerCenter server commits data based on the number of source rows read, rather than the rows written to the target.
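To make the idea concrete, here is a minimal Python sketch of commit-interval batching; `connection` and `execute_insert` are hypothetical stand-ins, not Informatica APIs:

```python
# Illustrative sketch only: the batching idea behind a commit interval,
# not Informatica internals.
def load_rows(rows, connection, commit_interval=10_000):
    """Write rows to a target, committing every `commit_interval` rows."""
    pending = 0
    for row in rows:
        connection.execute_insert(row)   # hypothetical target writer
        pending += 1
        if pending >= commit_interval:
            connection.commit()          # commit point reached
            pending = 0
    if pending:
        connection.commit()              # flush the final partial batch
```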
2. Difference between connected and unconnected lookup?

| Connected Lookup | Unconnected Lookup |
| --- | --- |
| Receives input values directly from the pipeline. | Receives input values from the result of a :LKP expression in another transformation. |
| You can use a dynamic or static cache. | You can use only a static cache. |
| The cache includes all lookup columns used in the mapping (that is, lookup table columns included in the lookup condition and lookup table columns linked as output ports to other transformations). | The cache includes all lookup/output ports in the lookup condition and the lookup/return port. |
| Can return multiple columns from the same row, or insert into the dynamic lookup cache. | You designate one return port (R); it returns one column from each row. |
| If there is no match for the lookup condition, the Informatica server returns the default value for all output ports. If you configure dynamic caching, the Informatica server inserts rows into the cache. | If there is no match for the lookup condition, the Informatica server returns NULL. |
| Passes multiple output values to another transformation; link the lookup/output ports to another transformation. | Passes one output value to another transformation; the lookup/output/return port passes the value to the transformation calling the :LKP expression. |
| Supports user-defined default values. | Does not support user-defined default values. |
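As a rough illustration of the difference, here is a Python sketch (assumed data and function names, not Informatica code): the connected style sits in the row pipeline and can emit several columns, while the unconnected style is called like a function and returns a single value, as a :LKP expression does:

```python
# Conceptual sketch only; PRODUCT_LOOKUP and the field names are assumptions.
PRODUCT_LOOKUP = {                     # lookup source keyed on product_id
    "P1": {"price": 9.99, "category": "book"},
    "P2": {"price": 19.99, "category": "toy"},
}

def connected_lookup(row):
    """Connected style: in the pipeline, can return several columns at once."""
    match = PRODUCT_LOOKUP.get(row["product_id"])
    if match is None:
        match = {"price": 0.0, "category": "unknown"}  # user-defined defaults
    row.update(match)                  # multiple output ports feed downstream
    return row

def lkp_category(product_id):
    """Unconnected style: invoked like :LKP, returns exactly one value."""
    match = PRODUCT_LOOKUP.get(product_id)
    return match["category"] if match else None  # NULL when no match
```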
3. Explain the index cache and the data cache?
The Informatica server stores lookup condition values in the index cache and output values in the data cache.
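A minimal sketch of the split, using plain Python dicts rather than Informatica's cache files:

```python
# Index cache holds the condition (key) values; data cache holds the
# output values, looked up by position.
index_cache = {}   # condition value -> slot in the data cache
data_cache = []    # output values for each cached row

def cache_row(condition_value, output_values):
    index_cache[condition_value] = len(data_cache)
    data_cache.append(output_values)

def lookup(condition_value):
    slot = index_cache.get(condition_value)
    return data_cache[slot] if slot is not None else None

cache_row("P1", {"price": 9.99})
print(lookup("P1"))   # {'price': 9.99}
```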
4. Explain the workflow process?
The PowerCenter server uses both process memory and system shared memory to perform these tasks.
Load Manager process: locks the workflow and workflow tasks, and starts the DTM to run the sessions.
Data Transformation Manager (DTM) process: performs session validations; creates threads to initialize the session, read, write, and transform data; and handles pre- and post-session operations.
The default memory allocation is 12,000,000 bytes.
5. How does the Informatica server sort string values in a Rank transformation?
When the Informatica server runs in ASCII data movement mode, it sorts session data using a binary sort order. If you configure the session to use a binary sort order, the Informatica server calculates the binary value of each string and returns the specified number of rows with the highest binary values for the string.
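The following Python snippet mimics a binary sort order by comparing raw byte values (illustrative only):

```python
# Binary-sort ranking: compare strings by their raw byte values.
names = ["apple", "Zebra", "banana", "APPLE"]

def binary_key(s):
    return s.encode("ascii")          # byte value of each character

top2 = sorted(names, key=binary_key, reverse=True)[:2]
print(top2)   # ['banana', 'apple'] -- uppercase sorts below lowercase in ASCII
```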
6. Can you use the mapping parameters or variables created in one mapping in another mapping?
No. Mapping parameters or variables can be used only in the transformations of the same mapping or mapplet in which they were created.
7. Difference between the source filter and the Filter transformation?
A source filter filters data only from relational sources, whereas a Filter transformation can filter data from any type of source.
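A small sketch of the distinction, with assumed table and column names: the source filter becomes part of the source query, while the Filter transformation drops rows after they are read:

```python
# Illustrative only; `cursor` and the orders table are assumptions.
def read_with_source_filter(cursor):
    # Relational source only: the database applies the predicate.
    cursor.execute("SELECT id, amount FROM orders WHERE amount > 100")
    return cursor.fetchall()

def pipeline_filter(rows):
    # Works for any source (flat file, queue, ...) already read into the flow.
    return [row for row in rows if row["amount"] > 100]
```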
8. What is data driven?
The Informatica server follows the instructions coded into Update Strategy transformations within the session mapping to determine how to flag records for insert, update, delete, or reject. If you do not choose the Data Driven option in the session settings, the Informatica server ignores all Update Strategy transformations in the mapping.
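As a sketch, the DD_* flag values below match PowerCenter's update strategy constants, but the routing logic itself is a made-up example:

```python
# Data-driven idea: each row is flagged by mapping logic, and the writer
# honors the flag. The rules below are purely illustrative.
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3

def flag_row(row, existing_keys):
    if row["status"] == "cancelled":
        return DD_DELETE                 # remove the matching target row
    if row["key"] in existing_keys:
        return DD_UPDATE                 # row already exists in the target
    if row["amount"] is None:
        return DD_REJECT                 # bad data, do not load
    return DD_INSERT                     # brand-new row
```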
9. Difference between OLTP and OLAP

| # | OLTP | OLAP |
| --- | --- | --- |
| 1 | Application oriented (e.g., a purchase order is functionality of an application) | Subject oriented (subject in the sense of customer, product, item, time) |
| 2 | Used to run the business | Used to analyze the business |
| 3 | Detailed data | Summarized data |
| 4 | Repetitive access | Ad-hoc access |
| 5 | Few records accessed at a time (tens); simple queries | Large volumes accessed at a time (millions); complex queries |
| 6 | Small database | Large database |
| 7 | Current data | Historical data |
| 8 | Clerical users | Knowledge users |
| 9 | Row-by-row loading | Bulk loading |
| 10 | Time invariant | Time variant |
| 11 | Normalized data | De-normalized data |
| 12 | E-R schema | Star schema |
10. What are the types of data warehousing?
EDW (Enterprise Data Warehouse)
- It provides a central database for decision support throughout the enterprise.
- It is a collection of data marts.
Data Mart
- It is a subset of a data warehouse.
- It is a subject-oriented database which supports the needs of individual departments in an organization.
- It is called a high-performance query structure.
- It supports a particular line of business, such as sales or marketing.
ODS (Operational Data Store)
- It is defined as an integrated view of operational databases, designed to support operational monitoring.
- It is a collection of operational data sources designed to support transaction processing.
- Data is refreshed near real-time and used for business activity.
- It is an intermediate layer between OLTP and OLAP which helps to create instant reports.
11. Data Modeling Types:
- Conceptual Data Modeling
- Logical Data Modeling
- Physical Data Modeling
- Dimensional Data Modeling
1. Conceptual Data Modeling
- A conceptual data model includes all major entities and relationships, does not contain much detail about attributes, and is often used in the initial planning phase.
- A conceptual data model is created by gathering business requirements from various sources such as business documents, discussions with functional teams, business analysts, subject matter experts, and the end users who do the reporting on the database. Data modelers create the conceptual data model and forward it to the functional team for review.
- Conceptual data modeling gives the functional and technical teams an idea of how the business requirements will be projected into the logical data model.
2. Logical Data Modeling
- The logical data model is the actual implementation and extension of the conceptual data model. It includes all required entities, attributes, key groups, and relationships that represent business information and define business rules.
3. Physical Data Modeling
- The physical data model includes all required tables, columns, relationships, and database properties for the physical implementation of the database. Database performance, indexing strategy, physical storage, and denormalization are important parameters of a physical model.
Logical vs. Physical Data Modeling

| Logical Data Model | Physical Data Model |
| --- | --- |
| Represents business information and defines business rules | Represents the physical implementation of the model in a database |
| Entity | Table |
| Attribute | Column |
| Primary Key | Primary Key Constraint |
| Alternate Key | Unique Constraint or Unique Index |
| Inversion Key Entry | Non-Unique Index |
| Rule | Check Constraint, Default Value |
| Relationship | Foreign Key |
| Definition | Comment |
Dimensional Data Modeling
- A dimensional model consists of fact and dimension tables.
- It is an approach to developing schema designs for the database.
Types of Dimensional Modeling
- Star schema
- Snowflake schema
- Starflake schema (or hybrid schema)
- Multi-star schema
12. What are the important aspects of the star schema and the snowflake schema?
- In a star schema, every dimension has a primary key.
- In a star schema, a dimension table will not have any parent table.
- Whereas in a snowflake schema, a dimension table will have one or more parent tables.
- In a star schema, hierarchies for the dimensions are stored in the dimension table itself.
- Whereas in a snowflake schema, hierarchies are broken into separate tables. These hierarchies help to drill down the data from the topmost level to the lowermost level.
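A minimal Python sketch of the structural difference, using a hypothetical product hierarchy:

```python
# Star schema: the whole product hierarchy lives in one dimension row.
star_product_dim = [
    {"product_key": 1, "product": "Pen", "brand": "Acme", "category": "Stationery"},
]

# Snowflake schema: the hierarchy is normalized into parent tables.
snow_category = {10: {"category": "Stationery"}}
snow_brand    = {20: {"brand": "Acme", "category_key": 10}}
snow_product  = [
    {"product_key": 1, "product": "Pen", "brand_key": 20},
]

# Drilling up in the snowflake requires joins through the parent tables.
row = snow_product[0]
brand = snow_brand[row["brand_key"]]
category = snow_category[brand["category_key"]]
print(row["product"], brand["brand"], category["category"])  # Pen Acme Stationery
```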
13. What are the types of facts in a fact table?
- Additive: measures that can be summed up across all dimensions.
- Ex: sales revenue
- Semi-additive: measures that can be summed up across some dimensions but not others (typically not across time).
- Ex: current balance
- Non-additive: measures that cannot be summed up across any of the dimensions.
- Ex: student attendance
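A toy Python example of additivity (made-up numbers):

```python
# Summing revenue is valid across any dimension, but summing an account
# balance across time is not.
facts = [
    {"month": "Jan", "account": "A", "revenue": 100, "balance": 500},
    {"month": "Feb", "account": "A", "revenue": 120, "balance": 520},
]

total_revenue = sum(f["revenue"] for f in facts)   # 220 -- meaningful (additive)
summed_balance = sum(f["balance"] for f in facts)  # 1020 -- meaningless across time
latest_balance = facts[-1]["balance"]              # 520 -- use last/average instead
print(total_revenue, summed_balance, latest_balance)
```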
14. Why do we need a staging area for a DWH?
- The staging area is where operational data is cleaned before being loaded into the data warehouse.
- Cleansing here means merging data that comes from different sources.
- It is the area where most of the ETL work is done.
Data Cleansing
- It is used to remove duplicates.
- It is used to correct wrong email addresses.
- It is used to identify missing data.
- It is used to convert data types.
- It is used to capitalize names and addresses.
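A short Python sketch of these cleansing steps on hypothetical customer records (field names are assumptions):

```python
import re

raw = [
    {"name": "john smith", "email": "john@@example.com", "city": None},
    {"name": "john smith", "email": "john@@example.com", "city": None},  # duplicate
    {"name": "ann lee",    "email": "ann@example.com",   "city": "york"},
]

def cleanse(records):
    seen, clean = set(), []
    for r in records:
        key = (r["name"], r["email"])
        if key in seen:                               # remove duplicates
            continue
        seen.add(key)
        r["email"] = re.sub(r"@+", "@", r["email"])   # correct bad addresses
        if r["city"] is None:                         # identify missing data
            r["city"] = "UNKNOWN"
        r["name"] = r["name"].title()                 # capitalize names
        clean.append(r)
    return clean

print(cleanse(raw))
```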
Informatica Architecture
The PowerCenter Domain:
The domain is the primary unit of administration. An installation can have a single domain or multiple domains. A domain is a collection of nodes and services.
Nodes:
A node is the logical representation of a machine in a domain. One node in the domain acts as a gateway node to receive service requests from clients and route them to the appropriate service and node.
Integration Service:
The Integration Service does all the real work. It extracts data from sources, processes it according to the business logic, and loads data to targets.
Repository Service:
The Repository Service fetches data from the repository and sends it back to the requesting components (mostly the client tools and the Integration Service).
PowerCenter Repository:
The repository is a relational database which stores all the metadata created in PowerCenter.
PowerCenter Client Tools:
The PowerCenter Client consists of multiple tools.
PowerCenter Administration Console:
This is a web-based administration tool you can use to administer the PowerCenter installation.
Thanks,
Hari