Saturday, 29 March 2014

What are the code-reusable components in Informatica PowerCenter?

The general purpose of code reusability is to reduce unnecessary coding, which ultimately reduces development time and improves supportability. In this article, let's look at the different options available in Informatica PowerCenter to make your code reusable.
1. Mapplet
2. Reusable Transformation
3. Shared Folder
4. Global Repository
5. Worklet
6. Reusable Session
7. Reusable Tasks
8. Mapping Parameter
9. Mapping Variable
10. Workflow Variable
11. Worklet Variable
1. Mapplet
Mapplet is a reusable object that you create in the Mapplet Designer. It contains a set of transformations and lets you reuse the transformation logic in multiple mappings. When you use a mapplet in a mapping, you use an instance of the mapplet. Any change made to the mapplet is inherited by all instances of the mapplet.

If you have several fact tables that require a series of dimension keys, you can create a mapplet containing a series of Lookup transformations to find each dimension key. You can then use the mapplet in each fact table mapping.

2. Reusable Transformation
Reusable Transformations can be created in Transformation Developer and can be reused in multiple mappings. When you use a reusable transformation in a mapping, you use an instance of the transformation. Any change made to the transformation is inherited by all its instances.

If you have a business rule to trim spaces from customer name and customer address columns, then you can create a reusable expression transformation to trim spaces from the column, which can be reused in multiple mappings.



3. Shared Folder
A shared folder can be created in a repository from the Repository Manager. Objects in the shared folder can be accessed via shortcuts from other folders in the same repository.

You can create all the reusable framework objects (CDC framework, error-handling framework, etc.) in the shared folder and access them through shortcuts from different folders in the same repository.
4. Global Repository
A Global Repository can be created in a repository domain and linked to multiple Local Repositories. An object created in the Global Repository is accessible from the Local Repositories through shortcuts to the global object. Any change to a Global Repository object is inherited by all its shortcut objects.

You can create all the reusable framework objects (CDC framework, error-handling framework, etc.) in the Global Repository and access them through shortcuts from different Local Repositories.
5. Worklet
A worklet is an object created by combining a set of tasks to build workflow logic. A worklet can be reused in multiple workflows and can be configured to run concurrently. You can create a worklet in the Worklet Designer.
The reusable worklet implemented in the Change Data Capture framework, discussed in one of our prior articles, is a practical application of a worklet.
6. Reusable Session
A session is a set of instructions that tells the Integration Service how and when to move data from sources to targets. You can create a reusable Session task in the Task Developer. A reusable session can be used in multiple workflows and even in a worklet.

The reusable session used in the Operational Metadata Logging framework, discussed in one of our prior articles, is a practical implementation of a reusable session.
7. Reusable Tasks
Apart from the reusable Session task, you can create reusable Email and Command tasks in the Task Developer. These reusable tasks can be used in multiple workflows and worklets.

A reusable Email task can be used to create a standard session-failure notification, which can then be reused across different Session tasks.
8. Mapping Parameter
Mapping parameters define values that remain constant throughout a session, such as state sales tax rates. When declared in a mapping or mapplet, $$ParameterName is a user-defined mapping parameter.
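At run time, mapping parameters are typically supplied through a parameter file assigned to the session or workflow. A minimal sketch of such a file, with hypothetical folder, workflow, session, and parameter names:

```
[MyFolder.WF:wf_load_sales.ST:s_m_load_sales]
$$SalesTaxRate=0.0825
$$StateCode=CA
```

Each section header scopes the values to a specific session, so the same $$ParameterName can take different values in different sections.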
9. Mapping Variable
Mapping variables define values that can change during a session. The Integration Service saves the value of a mapping variable to the repository at the end of each successful session run and uses that value the next time you run the session. When declared in a mapping or mapplet, $$VariableName is a mapping variable. Read more about mapping variables in the article Change Data Capture (CDC) Made Easy Using Mapping Variables.
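The persist-and-reuse cycle behind a mapping variable can be sketched outside PowerCenter in a few lines of Python; here a local JSON file stands in for the repository, and the row and variable names are invented for illustration:

```python
# Sketch: incremental extraction driven by a persisted "max value" variable.
# PowerCenter saves the variable to the repository at session end; here a
# local JSON file plays that role.
import json
import os

STATE_FILE = "wf_state.json"  # stands in for the repository

def load_last_max(default=0):
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return json.load(f)["last_max_id"]
    return default

def extract_incremental(rows):
    """Return rows newer than the persisted max, then persist the new max."""
    last_max = load_last_max()
    new_rows = [r for r in rows if r["id"] > last_max]  # filter on $$LastMaxId
    if new_rows:
        state = {"last_max_id": max(r["id"] for r in new_rows)}  # SetMaxVariable
        with open(STATE_FILE, "w") as f:
            json.dump(state, f)  # "saved at the end of a successful run"
    return new_rows

rows = [{"id": 1}, {"id": 2}, {"id": 3}]
print(len(extract_incremental(rows)))  # 3 on the first run
print(len(extract_incremental(rows)))  # 0 on a rerun: the max is persisted
```

The first run extracts everything; a rerun with the same input extracts nothing, because the saved maximum now excludes those rows.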
10. Workflow Variable
You can create user-defined workflow variables to pass values between mappings and sessions within a workflow. Although workflow variables alone do not provide code reusability, they work with other components to enable it. Read more about workflow variables in the article An ETL Framework for Change Data Capture.
11. Worklet Variable
User-defined worklet variables can be created in worklets to pass values between mappings, sessions, and worklets within a workflow. Worklet variables by themselves do not provide code reusability, but they work with other components to facilitate it. Learn more about worklet variables in the article An ETL Framework for Change Data Capture.
In addition to the parameters and variables mentioned above, Informatica PowerCenter provides many more types of variables and parameters that add flexibility for building reusable code, such as:
• Service variables
• Service process variables
• Session parameters
• $Source and $Target connection variables
• Email variables
• Local variables
• Built-in variables
Hope you enjoyed this post. Please leave your comments and share how you use these features in your projects.
Thanks
Ur's Hari
If you like this post, please share it by clicking on g+1 Button.


How to create a staging area in your database?

A staging area in a data warehouse is used as a temporary space to hold all the records from the source system. So, more or less, it should be an exact replica of the source system, except for the load strategy, where we use the truncate-and-reload option.
Create the staging tables using the same layout as your source tables, or use the Generate SQL option in the Warehouse Designer.
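The truncate-and-reload strategy can be sketched with SQLite (the table names are hypothetical, and DELETE stands in for TRUNCATE, which SQLite lacks):

```python
# Sketch: staging table mirrors the source layout; each load truncates
# the staging table first, then reloads everything from the source.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Source table and its exact replica in the staging area.
cur.execute("CREATE TABLE src_customers (id INTEGER, name TEXT)")
cur.execute("CREATE TABLE stg_customers (id INTEGER, name TEXT)")
cur.executemany("INSERT INTO src_customers VALUES (?, ?)",
                [(1, "Asha"), (2, "Ravi")])

def reload_staging(cur):
    cur.execute("DELETE FROM stg_customers")  # TRUNCATE equivalent
    cur.execute("INSERT INTO stg_customers SELECT * FROM src_customers")

reload_staging(cur)
reload_staging(cur)  # rerunning never duplicates rows: truncate comes first
print(cur.execute("SELECT COUNT(*) FROM stg_customers").fetchone()[0])  # 2
```

Because the table is emptied before every load, the process is idempotent: running it twice leaves the staging area identical to running it once.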


What is the Informatica architecture?

The PowerCenter domain
            It is the primary unit of administration.
            An environment can have a single domain or multiple domains.
            It is a collection of nodes and services.
Nodes
   A node is the logical representation of a machine in a domain.
One node in the domain acts as a gateway node to receive service requests from clients and route them to the appropriate service and node.
Integration Service:
The Integration Service does all the real work: it extracts data from sources, processes it per the business logic, and loads the data into targets.
Repository Service:
The Repository Service fetches metadata from the repository and sends it back to the requesting components (mostly the client tools and the Integration Service).
PowerCenter Repository:
The repository is a relational database that stores all the metadata created in PowerCenter.
PowerCenter Client Tools:
The PowerCenter client consists of multiple tools, such as the Designer, Workflow Manager, Workflow Monitor, and Repository Manager.
PowerCenter Administration Console:
This is a web-based administration tool you can use to administer the PowerCenter installation.

What are the types of data warehousing?

i. EDW (Enterprise Data Warehouse)
• It provides a central database for decision support throughout the enterprise.
• It is a collection of data marts.
ii. Data Mart
• It is a subset of a data warehouse.
• It is a subject-oriented database that supports the needs of individual departments in an organization.
• It is called a high-performance query structure.
• It supports a particular line of business, such as sales or marketing.
iii. ODS (Operational Data Store)
• It is defined as an integrated view of operational databases, designed to support operational monitoring.
• It is a collection of operational data sources designed to support transaction processing.
• Data is refreshed in near real time and used for business activity.
• It is an intermediate layer between OLTP and OLAP that helps to create instant reports.

What are the unsupported repository objects for a mapplet?

      i. COBOL source definitions
      ii. Joiner transformations
      iii. Normalizer transformations
      iv. Non-reusable Sequence Generator transformations
      v. Pre- or post-session stored procedures
      vi. Target definitions
      vii. XML source definitions
      viii. IBM MQ source definitions

Tuesday, 25 March 2014

Working with the pmcmd command in Informatica, and how to run a workflow in UNIX

Informatica provides four built-in command-line programs, or utilities, to interact with Informatica features. They are:
  • infacmd
  • infasetup
  • pmcmd
  • pmrep
pmcmd is a command-line utility provided by Informatica to perform the following tasks:
  • Start workflows.
  • Start workflow from a specific task.
  • Stop, Abort workflows and Sessions.
  • Schedule the workflows.
How to use PMCMD Command in Informatica:
1. Scheduling the workflow 
The pmcmd command syntax for scheduling the workflow is shown below:
pmcmd scheduleworkflow -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name workflow-name
You cannot specify the scheduling options here. This command just schedules the workflow for its next run.
2. Start workflow 
The following pmcmd command starts the specified workflow:
pmcmd startworkflow -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name workflow-name
3. Stop workflow 
The pmcmd command to stop an Informatica workflow is shown below:
pmcmd stopworkflow -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name workflow-name
4. Start a workflow from a task
You can start a workflow from a specified task, as shown below:
pmcmd startworkflow -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -startfrom task-name workflow-name
5. Stopping a task. 
The following pmcmd command stops the specified task instance:
pmcmd stoptask -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -w workflow-name task-name
6. Aborting workflow and task. 
The following pmcmd commands are used to abort workflow and task in a workflow:
pmcmd abortworkflow -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name workflow-name

pmcmd aborttask -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -w workflow-name task-name
Running a workflow in UNIX

Syntax: pmcmd startworkflow -sv <service name> -d <domain name> -u <user name> -p <password> -f <folder name> <workflow name>
Example:

pmcmd startworkflow -service ${INFA_SERVICE} -domain ${INFA_DOMAIN} -uv xxx_PMCMD_ID -pv PSWD -folder ${ETLFolder} -wait ${ETLWorkflow}
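To make such shell invocations less error-prone, the argument list can be assembled in one place. This Python sketch only builds the command rather than executing it; the service, domain, folder, and workflow names are placeholders:

```python
# Sketch: assemble a pmcmd startworkflow command line for use with subprocess.
def pmcmd_startworkflow(service, domain, user_env, pwd_env,
                        folder, workflow, wait=True):
    cmd = ["pmcmd", "startworkflow",
           "-service", service, "-domain", domain,
           "-uv", user_env,   # -uv/-pv read user and password from
           "-pv", pwd_env,    # environment variables, keeping credentials
           "-folder", folder]  # off the visible command line
    if wait:
        cmd.append("-wait")   # block until the workflow finishes
    cmd.append(workflow)      # workflow name is positional, last
    return cmd

cmd = pmcmd_startworkflow("int_svc", "dom_dev", "INFA_USER", "INFA_PWD",
                          "ETL_FOLDER", "wf_load_sales")
print(" ".join(cmd))
```

Passing the resulting list to subprocess.run(cmd, check=True) would launch the workflow and raise on a non-zero pmcmd exit code.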

What are the new features of Informatica 9 compared to Informatica 8.6? Or, what are the differences between Informatica 9.x and 8.x?

Informatica 9 empowers line-of-business managers and business analysts to identify bad data and fix it faster. Architecture-wise there are no differences between Informatica 8 and 9, but some new features were added in PowerCenter 9.

New Client tools
Informatica 9 includes the Informatica Developer and Informatica Analyst client tools.
The Informatica Developer tool is Eclipse-based and supports both data integration and data quality for enhanced productivity. From here you can update and refine those same rules, and create composite data objects (e.g., get customer details from a number of different sources and aggregate them into a Customer data object).
The Informatica Analyst tool is a browser-based tool for analysts, stewards and line of business managers.  This tool supports data profiling, specifying and validating rules (Scorecards), and monitoring data quality.
Informatica Administrator
The PowerCenter Administration Console has been renamed the Informatica Administrator.
The Informatica Administrator is now a core service in the Informatica domain that is used to configure and manage all Informatica services, security, and other domain objects (such as connections) used by the new services.
The Informatica Administrator has a new interface. Some of the properties and configuration tasks from the PowerCenter Administration Console have been moved to different locations in the Informatica Administrator, and it has been expanded to include the new services and objects.
Cache Update in Lookup Transformation
You can update the lookup cache based on the results of an expression. When an expression is true, you can add to or update the lookup cache. You can update the dynamic lookup cache with the results of an expression.
Database deadlock resilience
In previous releases, when the Integration Service encountered a database deadlock during a lookup, the session failed. Effective in 9.0, the session will not fail. When a deadlock occurs, the Integration Service attempts to run the last statement in a lookup. You can configure the number of retry attempts and time period between attempts.
Multiple rows return
Lookups can now be configured as active transformations to return multiple rows. We can configure the Lookup transformation to return all rows that match a lookup condition. A Lookup transformation is an active transformation when it can return more than one row for any given input row.
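The difference between the old and new behavior can be pictured with a plain dictionary of lists: a passive lookup returns at most one match per input row, while an active lookup emits one output row per match. The customer and order data here are invented for illustration:

```python
# Sketch: passive vs. active lookup semantics.
from collections import defaultdict

lookup = defaultdict(list)
for cust_id, order in [(1, "A-100"), (1, "A-101"), (2, "B-200")]:
    lookup[cust_id].append(order)

def passive_lookup(cust_id):
    matches = lookup.get(cust_id, [])
    return matches[0] if matches else None  # one row in, at most one row out

def active_lookup(cust_id):
    return lookup.get(cust_id, [])          # one row in, N rows out

print(passive_lookup(1))  # A-100
print(active_lookup(1))   # ['A-100', 'A-101']
```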
Limit the Session Log
You can limit the size of session logs for real-time sessions. You can limit the size by time or by file size. You can also limit the number of log files for a session.
Auto-commit
We can enable auto-commit for each database connection. Each SQL statement in a query defines a transaction. A commit occurs when the SQL statement completes or the next statement is executed, whichever comes first. 
Passive transformation
We can configure the SQL transformation to run in passive mode instead of active mode. When the SQL transformation runs in passive mode, the SQL transformation returns one output row for each input row.
Connection management
Database connections are centralized in the domain. We can create and view database connections in Informatica Administrator, Informatica Developer, or Informatica Analyst. Create, view, edit, and grant permissions on database connections in Informatica Administrator.
Monitoring
We can monitor profile jobs, scorecard jobs, preview jobs, mapping jobs, and SQL Data Services for each Data Integration Service. View the status of each monitored object on the Monitoring tab of Informatica Administrator.
Deployment
We can deploy, enable, and configure deployment units in the Informatica Administrator. Deploy Deployment units to one or more Data Integration Services. Create deployment units in Informatica Developer.
Model Repository Service
An application service that manages the Model repository. The Model repository is a relational database that stores the metadata for projects created in Informatica Analyst and Informatica Developer. The Model repository also stores run-time and configuration information for applications deployed to a Data Integration Service.
Data Integration Service
Application service that processes requests from Informatica Analyst and Informatica Developer to preview or run data profiles and mappings. It also generates data previews for SQL data services and runs SQL queries against the virtual views in an SQL data service. Create and enable a Data Integration Service on the Domain tab of Informatica Administrator.
XML Parser
The XML Parser transformation can validate an XML document against a schema. The XML Parser transformation routes invalid XML to an error port. When the XML is not valid, the XML Parser transformation routes the XML and the error messages to a separate output group that We can connect to a target.
Enforcement of licensing restrictions
PowerCenter will enforce the licensing restrictions based on the number of CPUs and repositories.
Also Informatica 9 supports data integration for the cloud as well as on premise. You can integrate the data in cloud applications, as well as run Informatica 9 on cloud infrastructure.

What is the difference between STOP and ABORT?

When we issue the STOP command on an executing session task, the Integration Service stops reading data from the source but continues processing, writing, and committing data to the targets. If the Integration Service cannot finish processing and committing the data, we can issue the ABORT command.
In contrast, the ABORT command has a timeout period of 60 seconds. If the Integration Service cannot finish processing and committing data within the timeout period, it kills the DTM process and terminates the session.

How to segregate the Duplicate and Distinct rows from source table to separate target tables?

Source Table:
Col 1   Col 2   Col 3
A       B       C
L       M       N
A       B       C
O       P       Q
O       P       Q
X       Y       Z

Target Table 1 (Duplicate Records):
Col 1   Col 2   Col 3
A       B       C
O       P       Q

Target Table 2 (Distinct Records):
Col 1   Col 2   Col 3
L       M       N
X       Y       Z

Solution:
Step 1: Drag the source into the mapping and connect it to an Aggregator transformation.
Step 2: In the Aggregator transformation, group by the key column and add a new port, call it count_rec, to count the key column.
Step 3: Connect a Router transformation to the Aggregator from the previous step. In the Router, create two groups, one named "Distinct" and the other named "Duplicate".
In the Distinct group write count_rec = 1, and in the Duplicate group write count_rec > 1.
Step 4: Connect the two groups to the corresponding target tables.
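The Aggregator-plus-Router logic above can be sketched outside PowerCenter in a few lines of Python: count the occurrences of each row, then route on count_rec = 1 versus count_rec > 1:

```python
# Sketch: segregate distinct and duplicate rows, mirroring the
# Aggregator (group by key, count) followed by a Router.
from collections import Counter

source = [("A", "B", "C"), ("L", "M", "N"), ("A", "B", "C"),
          ("O", "P", "Q"), ("O", "P", "Q"), ("X", "Y", "Z")]

count_rec = Counter(source)  # Aggregator: group by the key and count

distinct  = [row for row, n in count_rec.items() if n == 1]  # Router: count_rec = 1
duplicate = [row for row, n in count_rec.items() if n > 1]   # Router: count_rec > 1

print(duplicate)  # [('A', 'B', 'C'), ('O', 'P', 'Q')]
print(distinct)   # [('L', 'M', 'N'), ('X', 'Y', 'Z')]
```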