Author: Bert Swope

Where do data models fit in the Software Development Life Cycle (SDLC) Process?

In the classic Software Development Life Cycle (SDLC) process, data models are typically initiated, by model type, at key process steps and are maintained as detail is added and refinement occurs. The Concept Data Model (CDM) is usually created in the Planning phase. However, creation of the Concept Data Model can slide forward or backward somewhat within the System Concept Development, Planning, and Requirements Analysis phases, depending upon whether the application being modeled is a custom development effort or a modification of a Commercial-Off-The-Shelf (COTS) application. The CDM is maintained, as necessary, through the remainder of the SDLC

Continue reading

IBM InfoSphere DataStage – Parallel Environment Variables

Most of this list of Parallel Environment Variables can be found in the IBM InfoSphere DataStage, Version 11.5 documentation. However, I have started to find variables that I use but that are not included in the IBM list. So, for simplicity, I will make additions and clarifications to the IBM list on this page as I run across them.

Performance Tuning

These environment variables are frequently used in tuning DataStage performance.

APT_BUFFER_FREE_RUN (See also Buffering)
APT_BUFFER_MAXIMUM_MEMORY (See also Buffering)
APT_COMPRESS_BOUNDED_FIELDS
APT_FILE_IMPORT_BUFFER_SIZE (See also Reading and Writing Files)
APT_FILE_EXPORT_BUFFER_SIZE (See also Reading and Writing Files)
TMPDIR (This variable also specifies the directory for

Continue reading

DataStage Large VARCHAR Performance

To the IBM InfoSphere Information Server Parallel Environment Variables, I would add the APT_COMPRESS_BOUNDED_FIELDS variable under “Reading and Writing Files” or, perhaps, create a new category called “Performance” for VARCHAR handling. Using this parameter can help with the overall performance of jobs that have a significant number of VARCHAR fields, especially large VARCHAR fields. This parameter can also reduce sort and dataset storage space consumption within parallel jobs. It is applicable to IBM InfoSphere Information Server 8.0.1 fp3 and more recent versions. To enable the parameter, ensure that APT_COMPRESS_BOUNDED_FIELDS is set to 1 in your project environment variables; you may

Continue reading

Netezza / PureData – Now() Command For Current Date

There is more than one way to retrieve current date information from PureData Analytics (Netezza). Using the Now() command is an easy way to retrieve the current date from within SQL.

Example SQL

Select now() as "Today" from _V_DUAL;

Related Reference

PureData – Current Date Function
Netezza/PureData SQL Date Formatting Examples
Netezza/PureData – How to add days to a date field
Date/time functions: PureData System for Analytics, PureData System for Analytics 7.1.0, IBM Netezza Database User’s Guide, Netezza SQL basics, Netezza SQL extensions, Date/time functions

Continue reading

Netezza / PureData Current Date Function

There is more than one way to retrieve current date information from PureData Analytics (Netezza). Using the Current_Date function (similar to the GETDATE() function) is an easy way to retrieve the current date via SQL.

Example SQL

Select current_date as "Today" from _V_DUAL;

Related Reference

PureData System for Analytics, PureData System for Analytics 7.2.1, IBM Netezza database user documentation, Netezza, SQL command reference, Functions
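
To see how current_date compares with the now() command from the previous post, the two can be selected side by side. This is a minimal sketch, assuming a Netezza / PureData session; the column aliases are illustrative only and do not come from the IBM documentation.

```sql
-- Sketch only: assumes a Netezza / PureData session; aliases are illustrative.
-- now() returns a timestamp (date and time); current_date returns the date only.
SELECT now()               AS "Now Timestamp",
       CAST(now() AS DATE) AS "Now As Date",
       current_date        AS "Today"
FROM _V_DUAL;
```

If only the calendar date is needed, current_date avoids the extra cast.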

Continue reading