Saturday, 29 March 2014

What are the code-reusable components in Informatica PowerCenter?

The general purpose of code reusability is to reduce unnecessary coding, which ultimately reduces development time and improves supportability. In this article, let's look at the different options available in Informatica PowerCenter to make your code reusable.
1. Mapplet
2. Reusable Transformation
3. Shared Folder
4. Global Repository
5. Worklet
6. Reusable Session
7. Reusable Tasks
8. Mapping Parameter
9. Mapping Variable
10. Workflow Variable
11. Worklet Variable
1. Mapplet
Mapplet is a reusable object that you create in the Mapplet Designer. It contains a set of transformations and lets you reuse the transformation logic in multiple mappings. When you use a mapplet in a mapping, you use an instance of the mapplet. Any change made to the mapplet is inherited by all instances of the mapplet.

If you have several fact tables that require a series of dimension keys, you can create a mapplet containing a series of Lookup transformations to find each dimension key. You can then use the mapplet in each fact table mapping.

2. Reusable Transformation
Reusable Transformations can be created in Transformation Developer and can be reused in multiple mappings. When you use a reusable transformation in a mapping, you use an instance of the transformation. Any change made to the transformation is inherited by all its instances.

If you have a business rule to trim spaces from customer name and customer address columns, then you can create a reusable expression transformation to trim spaces from the column, which can be reused in multiple mappings.



3. Shared Folder
A shared folder can be created in a repository from the Repository Manager. Objects in the shared folder can be accessed via shortcuts from other folders in the same repository.

You can create all the reusable framework objects (CDC framework, error-handling framework, etc.) in the shared folder and access them through shortcuts from different folders in the same repository.
4. Global Repository
A Global Repository can be created in a repository domain and linked to multiple Local Repositories. An object created in the Global Repository is accessible from the Local Repositories through shortcuts to the global object. Any change to a Global Repository object is inherited by all its shortcut objects.

You can create all the reusable framework objects (CDC framework, error-handling framework, etc.) in the Global Repository and access them through shortcuts from different Local Repositories.
5. Worklet
A worklet is an object created by combining a set of tasks to build workflow logic. A worklet can be reused in multiple workflows and can be configured to run concurrently. You can create a worklet in the Worklet Designer.
The reusable worklet implemented in the Change Data Capture framework, discussed in one of our prior articles, is a practical application of a worklet.
6. Reusable Session
A session is a set of instructions that tells the Integration Service how and when to move data from sources to targets. You can create a reusable Session task in the Task Developer. A reusable session can be used in multiple workflows and even in a worklet.

The reusable session used in the Operational Metadata Logging framework, discussed in one of our prior articles, is a practical implementation of a reusable session.
7. Reusable Tasks
Apart from the reusable Session task, you can create reusable Email and Command tasks in the Task Developer. These reusable tasks can be used in multiple workflows and worklets.

A reusable Email task can be used to create a standard session-failure notification, which can then be reused across different Session tasks.
8. Mapping Parameter
Mapping parameters define values that remain constant throughout a session, such as state sales tax rates. When declared in a mapping or mapplet, $$ParameterName is a user-defined mapping parameter.
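At run time, mapping parameters are typically supplied through a parameter file assigned to the session or workflow. A minimal sketch of such a file, with hypothetical folder, workflow, session, and parameter names:

```
[MyFolder.WF:wf_load_sales.ST:s_m_load_sales]
$$SalesTaxRate=0.0825
$$StateCode=CA
```

Each section header scopes the values to a specific session, so the same $$ParameterName can take different values in different sections.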
9. Mapping Variable
Mapping variables define values that can change during a session. The Integration Service saves the value of a mapping variable to the repository at the end of each successful session run and uses that value the next time you run the session. When declared in a mapping or mapplet, $$VariableName is a mapping variable. Read more about mapping variables in the article Change Data Capture (CDC) Made Easy Using Mapping Variables.
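The persist-and-reuse cycle behind a mapping variable can be sketched outside PowerCenter in a few lines of Python; here a local JSON file stands in for the repository, and the row and variable names are invented for illustration:

```python
# Sketch: incremental extraction driven by a persisted "max value" variable.
# PowerCenter saves the variable to the repository at session end; here a
# local JSON file plays that role.
import json
import os

STATE_FILE = "wf_state.json"  # stands in for the repository

def load_last_max(default=0):
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return json.load(f)["last_max_id"]
    return default

def extract_incremental(rows):
    """Return rows newer than the persisted max, then persist the new max."""
    last_max = load_last_max()
    new_rows = [r for r in rows if r["id"] > last_max]  # filter on $$LastMaxId
    if new_rows:
        state = {"last_max_id": max(r["id"] for r in new_rows)}  # SetMaxVariable
        with open(STATE_FILE, "w") as f:
            json.dump(state, f)  # "saved at the end of a successful run"
    return new_rows

rows = [{"id": 1}, {"id": 2}, {"id": 3}]
print(len(extract_incremental(rows)))  # 3 on the first run
print(len(extract_incremental(rows)))  # 0 on a rerun: the max is persisted
```

The first run extracts everything; a rerun with the same input extracts nothing, because the saved maximum now excludes those rows.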
10. Workflow Variable
You can create user-defined workflow variables to pass values between mappings and sessions within a workflow. Although workflow variables alone do not provide code reusability, they work with other components to enable it. Read more about workflow variables in the article An ETL Framework for Change Data Capture.
11. Worklet Variable
User-defined worklet variables can be created in worklets to pass values between mappings, sessions, and worklets within a workflow. Worklet variables by themselves do not provide code reusability, but they work with other components to facilitate it. Learn more about worklet variables in the article An ETL Framework for Change Data Capture.
In addition to the parameters and variables mentioned above, Informatica PowerCenter provides many more types of variables and parameters that add flexibility for building reusable code, such as:
• Service variables
• Service process variables
• Session parameters
• $Source and $Target connection variables
• Email variables
• Local variables
• Built-in variables
Hope you enjoyed this post. Please leave your comments and share how you use these features in your projects.
Thanks
Ur's Hari
If you like this post, please share it by clicking on g+1 Button.


How to create a staging area in your database?

A staging area in a data warehouse is used as a temporary space to hold all the records from the source system. So, more or less, it should be an exact replica of the source system, except for the load strategy, where we use the truncate-and-reload option.
Create the staging tables using the same layout as your source tables, or use the Generate SQL option in the Warehouse Designer.
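The truncate-and-reload strategy can be sketched with SQLite (the table names are hypothetical, and DELETE stands in for TRUNCATE, which SQLite lacks):

```python
# Sketch: staging table mirrors the source layout; each load truncates
# the staging table first, then reloads everything from the source.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Source table and its exact replica in the staging area.
cur.execute("CREATE TABLE src_customers (id INTEGER, name TEXT)")
cur.execute("CREATE TABLE stg_customers (id INTEGER, name TEXT)")
cur.executemany("INSERT INTO src_customers VALUES (?, ?)",
                [(1, "Asha"), (2, "Ravi")])

def reload_staging(cur):
    cur.execute("DELETE FROM stg_customers")  # TRUNCATE equivalent
    cur.execute("INSERT INTO stg_customers SELECT * FROM src_customers")

reload_staging(cur)
reload_staging(cur)  # rerunning never duplicates rows: truncate comes first
print(cur.execute("SELECT COUNT(*) FROM stg_customers").fetchone()[0])  # 2
```

Because the table is emptied before every load, the process is idempotent: running it twice leaves the staging area identical to running it once.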


What is the Informatica architecture?

The PowerCenter domain
            It is the primary unit of administration.
            An environment can have a single domain or multiple domains.
            It is a collection of nodes and services.
Nodes
   A node is the logical representation of a machine in a domain.
One node in the domain acts as a gateway node to receive service requests from clients and route them to the appropriate service and node.
Integration Service:
The Integration Service does all the real work: it extracts data from sources, processes it per the business logic, and loads the data into targets.
Repository Service:
The Repository Service fetches metadata from the repository and sends it back to the requesting components (mostly the client tools and the Integration Service).
PowerCenter Repository:
The repository is a relational database that stores all the metadata created in PowerCenter.
PowerCenter Client Tools:
The PowerCenter client consists of multiple tools, such as the Designer, Workflow Manager, Workflow Monitor, and Repository Manager.
PowerCenter Administration Console:
This is a web-based administration tool you can use to administer the PowerCenter installation.

What are the types of data warehousing?

i. EDW (Enterprise Data Warehouse)
• It provides a central database for decision support throughout the enterprise.
• It is a collection of data marts.
ii. Data Mart
• It is a subset of a data warehouse.
• It is a subject-oriented database that supports the needs of individual departments in an organization.
• It is called a high-performance query structure.
• It supports a particular line of business, such as sales or marketing.
iii. ODS (Operational Data Store)
• It is defined as an integrated view of operational databases, designed to support operational monitoring.
• It is a collection of operational data sources designed to support transaction processing.
• Data is refreshed in near real time and used for business activity.
• It is an intermediate layer between OLTP and OLAP that helps to create instant reports.

What are the unsupported repository objects for a mapplet?

      i. COBOL source definitions
      ii. Joiner transformations
      iii. Normalizer transformations
      iv. Non-reusable Sequence Generator transformations
      v. Pre- or post-session stored procedures
      vi. Target definitions
      vii. XML source definitions
      viii. IBM MQ source definitions

Tuesday, 25 March 2014

Working with the pmcmd command in Informatica, and how to run a workflow in UNIX

Informatica provides four built-in command-line programs, or utilities, to interact with Informatica features. They are:
  • infacmd
  • infasetup
  • pmcmd
  • pmrep
pmcmd is a command-line utility provided by Informatica to perform the following tasks:
  • Start workflows.
  • Start workflow from a specific task.
  • Stop, Abort workflows and Sessions.
  • Schedule the workflows.
How to use PMCMD Command in Informatica:
1. Scheduling the workflow 
The pmcmd command syntax for scheduling the workflow is shown below:
pmcmd scheduleworkflow -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name workflow-name
You cannot specify the scheduling options here. This command just schedules the workflow for its next run.
2. Start workflow 
The following pmcmd command starts the specified workflow:
pmcmd startworkflow -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name workflow-name
3. Stop workflow 
The pmcmd command to stop an Informatica workflow is shown below:
pmcmd stopworkflow -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name workflow-name
4. Start a workflow from a task
You can start a workflow from a specified task, as shown below:
pmcmd startworkflow -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -startfrom task-name workflow-name
5. Stopping a task. 
The following pmcmd command stops the specified task instance:
pmcmd stoptask -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -w workflow-name task-name
6. Aborting workflow and task. 
The following pmcmd commands are used to abort workflow and task in a workflow:
pmcmd abortworkflow -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name workflow-name

pmcmd aborttask -service informatica-integration-Service -d domain-name -u user-name -p password -f folder-name -w workflow-name task-name
Running a workflow in UNIX

Syntax: pmcmd startworkflow -sv <service name> -d <domain name> -u <user name> -p <password> -f <folder name> <workflow name>
Example:

pmcmd startworkflow -service ${INFA_SERVICE} -domain ${INFA_DOMAIN} -uv xxx_PMCMD_ID -pv PSWD -folder ${ETLFolder} -wait ${ETLWorkflow}
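To make such shell invocations less error-prone, the argument list can be assembled in one place. This Python sketch only builds the command rather than executing it; the service, domain, folder, and workflow names are placeholders:

```python
# Sketch: assemble a pmcmd startworkflow command line for use with subprocess.
def pmcmd_startworkflow(service, domain, user_env, pwd_env,
                        folder, workflow, wait=True):
    cmd = ["pmcmd", "startworkflow",
           "-service", service, "-domain", domain,
           "-uv", user_env,   # -uv/-pv read user and password from
           "-pv", pwd_env,    # environment variables, keeping credentials
           "-folder", folder]  # off the visible command line
    if wait:
        cmd.append("-wait")   # block until the workflow finishes
    cmd.append(workflow)      # workflow name is positional, last
    return cmd

cmd = pmcmd_startworkflow("int_svc", "dom_dev", "INFA_USER", "INFA_PWD",
                          "ETL_FOLDER", "wf_load_sales")
print(" ".join(cmd))
```

Passing the resulting list to subprocess.run(cmd, check=True) would launch the workflow and raise on a non-zero pmcmd exit code.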

What are the new features of Informatica 9 compared to Informatica 8.6? Or, what are the differences between Informatica 9.x and 8.x?

Informatica 9 empowers line-of-business managers and business analysts to identify bad data and fix it faster. Architecture-wise there are no differences between Informatica 8 and 9, but some new features were added in PowerCenter 9.

New Client tools
Informatica 9 includes the Informatica Developer and Informatica Analyst client tools.
The Informatica Developer tool is Eclipse-based and supports both data integration and data quality for enhanced productivity. From here you can update and refine those same rules, and create composite data objects (e.g., get customer details from a number of different sources and aggregate them into a Customer data object).
The Informatica Analyst tool is a browser-based tool for analysts, stewards and line of business managers.  This tool supports data profiling, specifying and validating rules (Scorecards), and monitoring data quality.
Informatica Administrator
The PowerCenter Administration Console has been renamed the Informatica Administrator.
The Informatica Administrator is now a core service in the Informatica domain that is used to configure and manage all Informatica services, security, and other domain objects (such as connections) used by the new services.
The Informatica Administrator has a new interface. Some of the properties and configuration tasks from the PowerCenter Administration Console have been moved to different locations in the Informatica Administrator, and it has been expanded to include the new services and objects.
Cache Update in Lookup Transformation
You can update the lookup cache based on the results of an expression. When an expression is true, you can add to or update the lookup cache. You can update the dynamic lookup cache with the results of an expression.
Database deadlock resilience
In previous releases, when the Integration Service encountered a database deadlock during a lookup, the session failed. Effective in 9.0, the session will not fail. When a deadlock occurs, the Integration Service attempts to run the last statement in a lookup. You can configure the number of retry attempts and time period between attempts.
Multiple rows return
Lookups can now be configured as active transformations to return multiple rows. We can configure the Lookup transformation to return all rows that match a lookup condition. A Lookup transformation is an active transformation when it can return more than one row for any given input row.
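The difference between the old and new behavior can be pictured with a plain dictionary of lists: a passive lookup returns at most one match per input row, while an active lookup emits one output row per match. The customer and order data here are invented for illustration:

```python
# Sketch: passive vs. active lookup semantics.
from collections import defaultdict

lookup = defaultdict(list)
for cust_id, order in [(1, "A-100"), (1, "A-101"), (2, "B-200")]:
    lookup[cust_id].append(order)

def passive_lookup(cust_id):
    matches = lookup.get(cust_id, [])
    return matches[0] if matches else None  # one row in, at most one row out

def active_lookup(cust_id):
    return lookup.get(cust_id, [])          # one row in, N rows out

print(passive_lookup(1))  # A-100
print(active_lookup(1))   # ['A-100', 'A-101']
```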
Limit the Session Log
You can limit the size of session logs for real-time sessions. You can limit the size by time or by file size. You can also limit the number of log files for a session.
Auto-commit
We can enable auto-commit for each database connection. Each SQL statement in a query defines a transaction. A commit occurs when the SQL statement completes or the next statement is executed, whichever comes first. 
Passive transformation
We can configure the SQL transformation to run in passive mode instead of active mode. When the SQL transformation runs in passive mode, the SQL transformation returns one output row for each input row.
Connection management
Database connections are centralized in the domain. We can create and view database connections in Informatica Administrator, Informatica Developer, or Informatica Analyst. Create, view, edit, and grant permissions on database connections in Informatica Administrator.
Monitoring
We can monitor profile jobs, scorecard jobs, preview jobs, mapping jobs, and SQL Data Services for each Data Integration Service. View the status of each monitored object on the Monitoring tab of Informatica Administrator.
Deployment
We can deploy, enable, and configure deployment units in the Informatica Administrator. Deploy Deployment units to one or more Data Integration Services. Create deployment units in Informatica Developer.
Model Repository Service
An application service that manages the Model repository. The Model repository is a relational database that stores the metadata for projects created in Informatica Analyst and Informatica Developer. The Model repository also stores run-time and configuration information for applications deployed to a Data Integration Service.
Data Integration Service
Application service that processes requests from Informatica Analyst and Informatica Developer to preview or run data profiles and mappings. It also generates data previews for SQL data services and runs SQL queries against the virtual views in an SQL data service. Create and enable a Data Integration Service on the Domain tab of Informatica Administrator.
XML Parser
The XML Parser transformation can validate an XML document against a schema. The XML Parser transformation routes invalid XML to an error port. When the XML is not valid, the XML Parser transformation routes the XML and the error messages to a separate output group that We can connect to a target.
Enforcement of licensing restrictions
PowerCenter will enforce the licensing restrictions based on the number of CPUs and repositories.
Also Informatica 9 supports data integration for the cloud as well as on premise. You can integrate the data in cloud applications, as well as run Informatica 9 on cloud infrastructure.

What is the difference between STOP and ABORT?

When we issue the STOP command on an executing session task, the Integration Service stops reading data from the source but continues processing, writing, and committing data to the targets. If the Integration Service cannot finish processing and committing the data, we can issue the ABORT command.
In contrast, the ABORT command has a timeout period of 60 seconds. If the Integration Service cannot finish processing and committing data within the timeout period, it kills the DTM process and terminates the session.

How to segregate the Duplicate and Distinct rows from source table to separate target tables?

Source Table:
Col 1   Col 2   Col 3
A       B       C
L       M       N
A       B       C
O       P       Q
O       P       Q
X       Y       Z

Target Table 1 (Duplicate Records):
Col 1   Col 2   Col 3
A       B       C
O       P       Q

Target Table 2 (Distinct Records):
Col 1   Col 2   Col 3
L       M       N
X       Y       Z

Solution:
Step 1: Drag the source into the mapping and connect it to an Aggregator transformation.
Step 2: In the Aggregator transformation, group by the key column and add a new port, call it count_rec, to count the key column.
Step 3: Connect a Router transformation to the Aggregator from the previous step. In the Router, create two groups, one named "Distinct" and the other named "Duplicate".
In the Distinct group write count_rec = 1, and in the Duplicate group write count_rec > 1.
Step 4: Connect the two groups to the corresponding target tables.
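The Aggregator-plus-Router logic above can be sketched outside PowerCenter in a few lines of Python: count the occurrences of each row, then route on count_rec = 1 versus count_rec > 1:

```python
# Sketch: segregate distinct and duplicate rows, mirroring the
# Aggregator (group by key, count) followed by a Router.
from collections import Counter

source = [("A", "B", "C"), ("L", "M", "N"), ("A", "B", "C"),
          ("O", "P", "Q"), ("O", "P", "Q"), ("X", "Y", "Z")]

count_rec = Counter(source)  # Aggregator: group by the key and count

distinct  = [row for row, n in count_rec.items() if n == 1]  # Router: count_rec = 1
duplicate = [row for row, n in count_rec.items() if n > 1]   # Router: count_rec > 1

print(duplicate)  # [('A', 'B', 'C'), ('O', 'P', 'Q')]
print(distinct)   # [('L', 'M', 'N'), ('X', 'Y', 'Z')]
```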