Data Warehousing Archives

Why Business Intelligence (BI) Needs a Semantic Data Model

In the rapidly evolving digital landscape, data has become the cornerstone of effective decision-making in businesses. The sheer volume and complexity of data collected by organizations today necessitate efficient methods to not only store and manage this data but also to analyze and interpret it in meaningful ways. This is where Business Intelligence (BI) and semantic data models come into play, bridging the gap between raw data and actionable insights. A semantic data model, in particular, organizes and represents corporate data in a manner that reflects the meaning and relationships among data items, enabling end-users to access data using familiar …

DataStage Lookup Types

DataStage provides several lookup types to choose from. This article covers some of the most common ones (Normal, Range, Sparse, and Caseless lookups) and explains why each is useful, so you can pick whichever works best for your data pipeline.

Normal lookup

There are two lookup types available in DataStage: the normal lookup type and the sparse lookup type. Normal lookup stores data …
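As a rough illustration of the difference between the two lookup types named above, here is a minimal Python sketch. It is not DataStage code. It assumes the common reading of the terms: a normal lookup loads the reference data into memory once, while a sparse lookup sends one query to the database per input row. The `country` table and the function names are hypothetical and exist only for this example.

```python
import sqlite3


def build_reference_db() -> sqlite3.Connection:
    """Create a small in-memory reference table (hypothetical example data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE country (code TEXT PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO country VALUES (?, ?)",
        [("US", "United States"), ("DE", "Germany")],
    )
    return conn


def normal_lookup(conn: sqlite3.Connection, input_codes: list) -> list:
    """Normal-style lookup: read the whole reference table once, then probe memory."""
    reference = dict(conn.execute("SELECT code, name FROM country"))
    return [reference.get(code) for code in input_codes]


def sparse_lookup(conn: sqlite3.Connection, input_codes: list) -> list:
    """Sparse-style lookup: run one database query for every input row."""
    results = []
    for code in input_codes:
        row = conn.execute(
            "SELECT name FROM country WHERE code = ?", (code,)
        ).fetchone()
        results.append(row[0] if row else None)
    return results


if __name__ == "__main__":
    conn = build_reference_db()
    codes = ["US", "XX", "DE"]
    print(normal_lookup(conn, codes))
    print(sparse_lookup(conn, codes))
```

Both functions return the same matches. The trade-off in this sketch is memory against query count: the in-memory version suits a small reference table, while the per-row version may be preferable when the reference table is very large compared with the input.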

What Are ETL Reconciliation Jobs?

Reconciliation ETL jobs are used to record and report that the expected volumes of data were loaded by establishing the volume of data to be loaded (normally in rows), the data actually loaded, and the variance (if any) between them. This job usually initiates notification and reporting processes when a variance exists for resolution by the designated department and/or agency. Reconciliation ETL jobs are used to monitor and report key process values, often used with control sequences to determine process behaviors. When a data warehouse needs to merge two or more data sets, ETL Reconciliation Jobs can be of great …
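The expected-versus-loaded comparison described above can be sketched in a few lines of Python. This is a minimal illustration rather than a DataStage job. The class and function names are hypothetical, and it assumes the variance is simply the expected row count minus the loaded row count.

```python
from dataclasses import dataclass


@dataclass
class ReconciliationResult:
    expected_rows: int  # volume of data that should have been loaded
    loaded_rows: int    # volume of data actually loaded

    @property
    def variance(self) -> int:
        # Positive: rows missing from the target; negative: extra rows loaded.
        return self.expected_rows - self.loaded_rows

    @property
    def needs_notification(self) -> bool:
        # Any non-zero variance should trigger notification and reporting.
        return self.variance != 0


def reconcile(expected_rows: int, loaded_rows: int) -> ReconciliationResult:
    """Record the expected and loaded volumes for one load."""
    if expected_rows < 0 or loaded_rows < 0:
        raise ValueError("row counts must be non-negative")
    return ReconciliationResult(expected_rows, loaded_rows)


if __name__ == "__main__":
    result = reconcile(expected_rows=1000, loaded_rows=998)
    if result.needs_notification:
        print(f"Variance of {result.variance} rows; notify the responsible team")
```

In practice the two counts would come from the source extract and the target table, and the notification step would feed whatever reporting process the designated department uses.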

What is data staging?

Data staging is a method for moving data out-of-band. It allows for simpler source system queries and auditing, as well as easy recovery of corrupted data. In today’s information-driven society, companies must make information available fast. In addition, it must be reliable, non-volatile, and sufficiently accurate. An industry push is underway to automate data staging to free up resources for analysis.

Data staging is an out-of-band method of moving data

Data staging is a technique for moving data between different data stores and systems. The data is stored in staging areas that can serve as recovery points if necessary. These …

What Are the Core Capabilities of InfoSphere Information Server?

Three Core Capabilities of Information Server

InfoSphere Information Server (IIS) has three core capabilities:

What the Core Capabilities Provide

The three core capabilities translate into the high-level business processes:

Information Governance – Understand and collaborate
Provides a centrally managed repository and approach, which provides:

Data Integration – Transform and deliver
A data integration capability, which provides:

Data Quality – Cleanse and monitor
To turn data assets into trusted information:

Related References

IBM Knowledge Center, InfoSphere Information Server Version 11.5.0
Overview of IBM InfoSphere Information Server, Introduction to InfoSphere Information Server

ETL Error Handling Effective Practices

ETL (extract, transform, and load) error handling practices can vary, but three basic approaches can go a long way toward making error handling effective. Effective error handling begins in the requirements and design phases. All too often, error handling is left to the build phase and then falls to individual developers' practices. This is an area where standard practices are not well defined or adopted by the ETL developer community. So, here are a few effective error handling practices which will contribute to process stability, information timeliness, information accuracy, and reduce the level …

Infosphere Datastage – Useful Date Transformations

Here are a few DataStage date transformations that I have found useful. Many of these can also be accomplished in SQL if you are sourcing your data from an RDBMS.

Useful Date Transformations

| Item | Description | Result |
| --- | --- | --- |
| Tomorrow | DateFromDaysSince(1, CurrentDate()) | |
| Yesterday | DateFromDaysSince(-1, CurrentDate()) | |
| Convert date to string with dashes | DateToString(<< Date_Field or CurrentDate() >>, "%YYYY-%MM-%DD") | 2011-11-02 |
| Convert date to string without dashes | DateToString(<< Date_Field or CurrentDate() >>, "%YYYY%MM%DD") | 20111102 |
| Get short month name | DateToString(<< Date_Field or CurrentDate() >>, "%mmm") | Feb |
| Get long month name | DateToString(<< Date_Field or CurrentDate() >>, "%mmmm") | February |
| Get short weekday name | DateToString(<< Date_Field or CurrentDate() >>, "%eee") | Tue |
| Get long weekday name | … | |
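For readers who want to check the date arithmetic and string formats above outside DataStage, the same transformations have close analogues in Python's standard `datetime` module. This is a sketch for comparison only, not DataStage syntax. The function name `date_examples` and the dictionary keys are hypothetical, and the month and weekday names depend on the active locale.

```python
from datetime import date, timedelta


def date_examples(today: date) -> dict:
    """Return Python analogues of the date transformations listed above."""
    return {
        # DateFromDaysSince(1, ...) and DateFromDaysSince(-1, ...)
        "tomorrow": (today + timedelta(days=1)).isoformat(),
        "yesterday": (today - timedelta(days=1)).isoformat(),
        # Date to string with and without dashes
        "with_dashes": today.strftime("%Y-%m-%d"),
        "without_dashes": today.strftime("%Y%m%d"),
        # Short and long month names (locale-dependent)
        "short_month": today.strftime("%b"),
        "long_month": today.strftime("%B"),
        # Short and long weekday names (locale-dependent)
        "short_weekday": today.strftime("%a"),
        "long_weekday": today.strftime("%A"),
    }


if __name__ == "__main__":
    for name, value in date_examples(date.today()).items():
        print(f"{name}: {value}")
```

Passing a fixed date such as `date(2011, 11, 2)` instead of `date.today()` makes the output repeatable, which is convenient when comparing results against a DataStage job.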

Technology – Using Logical Data Lakes

Today, data-driven decision making is at the center of all things. The emergence of data science and machine learning has further reinforced the importance of data as the most critical commodity in today’s world. From FAAMG (the biggest five tech companies: Facebook, Amazon, Apple, Microsoft, and Google) to governments and non-profits, everyone is busy leveraging the power of data to achieve final goals. Unfortunately, this growing demand for data has exposed the inefficiency of the current systems to support the ever-growing data needs. This inefficiency is what led to the evolution of what we today know as Logical Data Lakes. …

Infosphere Datastage – Standard Practice- Sequence Naming Conventions

Standard practices help you and others understand your work. This can be very important when working on large teams, working across team boundaries, or when large, complex sets of processes and objects are involved. The benefit of naming conventions, coupled with standard practices, should be obvious, but teams often don’t follow or document their conventions. So, these standard naming conventions may help when none exist or when you need to assemble your own.

<<SomeIdentifier >> = should be replaced with appropriate information

Sequence Object Naming Conventions

Entity | Convention
Master Control Sequence (parent) | …

DataStage – How to Pass the Invocation ID from one Sequence to another

When you are controlling a chain of sequences in a job stream and taking advantage of reusable (multiple-instance) jobs, it is useful to pass the Invocation ID from the master controlling sequence down to each job run. This can easily be done, without needing to manually enter the values in each of the sequences, by leveraging the DSJobInvocationId variable.

For this to work:

The job must have ‘Allow Multiple Instance’ enabled
The Invocation ID must be provided in the Parent sequence (the Invocation Name must be entered)
The receiving child …
