Sunday, November 6, 2016

Datastage important links

 BI and DWH


  • Dimensional Modelling
  • OLAP and BI

Sequencer Jobs


  • ExecCommand activity
  • Exception activity
  • Job Activity
  • Routine activity
  • User variables activity
  • Sequencer activity
  • Start Loop and End Loop activity
  • Terminator activity
  • Nested condition activity
  • Notification activity
  • Oozie Workflow Activity
  • Wait-For-File activity

Tuesday, December 1, 2009

IBM Certified Solution Developer - InfoSphere DataStage syllabus

1. IBM Certified Solution Developer - InfoSphere DataStage v8.0
Target Audience:

  • Professionally design and develop an efficient and scalable DataStage solution to a complex enterprise level business problem
  • Configure a scalable parallel environment including clustered and distributed configurations
  • Collect, report on, and resolve issues identified through key application performance indicators
  • Be proficient in extending the capabilities of the parallel framework using the provided APIs (buildops, wrappers, and components).
  • Serve as the primary customer interface for expertise on product usage and functionality.
  • Develop and integrate DataStage with schedulers, general infrastructure, and operational environments.
  • Work as part of a data integration group supporting data warehouse initiatives.

Recommended Prerequisite Skills:

  • Design, implement, test, and deploy parallel solutions.
  • DataStage job development, testing, implementation, problem solving/performance tuning.
  • DataStage environment management
  • ProfileStage analysis
  • QualityStage rule development
  • Data cleansing
  • MetaStage metadata management.
Requirements:

Knowledge of parallel concepts (data collection, data skew, data partitioning, buffers, sorting, aggregation, data collecting), complex algorithm implementations, and best practices with regard to naming, deploying, etc. Knowledge of product extension via wrappers and buildops, and of setting up configuration files.

  • Use of DataStage tools
      • Administrator
      • Designer
      • Manager
  • UNIX and/or Windows system
  • Enterprise Scheduling Tools (e.g. CRON, Autosys, Unicenter), C/C++.
  • Operating system proficiency
  • Ability to construct SQL statements
  • Programming skills in a high-level language.

Test(s) Required:

  • Test 418 - IBM Information Platform Solutions Certification (Price - $200)

Test information:

    • Number of questions: 70
    • Time allowed in minutes: 90
    • Required passing score: 75%
    • Test languages: English

Test Objectives:

1. DataStage v8 Configuration (5%)
  1. Describe how to properly configure DataStage V.8.0.
  2. Identify tasks required to create and configure a project to be used for V.8.0 jobs.
  3. Given a configuration file, identify its components and its overall intended purpose.
2. MetaData (5%)
  1. Demonstrate knowledge of Orchestrate schema.
  2. Identify the method of importing metadata.
  3. Given a scenario, demonstrate knowledge of runtime column propagation.
3. Persistent Storage (10%)
  1. Given a scenario, explain the process of importing/exporting data to/from the framework (e.g., the sequential file, external source/target).
  2. Given a scenario, describe the proper use of a sequential file.
  3. Given a scenario, describe the proper usage of CFF (native not plug-in).
  4. Describe the proper usage of FileSets and DataSets.
  5. Describe the use of FTP stage for remote data.
  6. Identify importing/exporting of XML data.
4. Parallel Architecture (10%)
  1. Given a scenario, demonstrate proper use of data partitioning and collecting.
  2. Given a scenario, demonstrate knowledge of parallel execution.
5. Databases (10%)
  1. Given a scenario, demonstrate a proper selection of database stages and database specific stage properties.
  2. Identify source database options.
  3. Given a scenario, demonstrate knowledge of target database options.
  4. Given a scenario, describe how to design a v.8.0 ETL job that will extract data from a DBMS, combine it with data from another source, and load it to another DBMS target.
  5. Demonstrate knowledge of working with NLS database sources and targets.
6. Data Transformation (10%)
  1. Given a scenario, demonstrate knowledge of default type conversions, output mappings, and associated warnings.
  2. Given a scenario, demonstrate proper selections of Transformer stage vs. other stages.
  3. Given a scenario, describe Transformer stage capabilities (including stage variables, link variables, DataStage macros, constraints, system variables, link ordering, @PARTITIONNUM, functions).
  4. Demonstrate the use of Transformer stage variables (e.g., to identify key grouping boundaries on incoming data).
  5. Identify process to add functionality not provided by existing DataStage stages. (e.g., wrapper, buildops, user def functions/routines).
  6. Given a scenario, demonstrate proper use of SCD stage.
  7. Demonstrate job design knowledge of using RCP (modify, filter, dynamic transformer).
7. Job Components (10%)
  1. Demonstrate knowledge of Join, Lookup and Merge stages.
  2. Given a scenario, demonstrate knowledge of SORT stage.
  3. Given a scenario, demonstrate an understanding of Aggregator stage.
  4. Describe the proper usage of change capture/change apply.
  5. Demonstrate knowledge of Real-time components.
8. Job Design (10%)
  1. Demonstrate knowledge of shared containers.
  2. Given a scenario, describe how to minimize SORTS and repartitions.
  3. Demonstrate knowledge of creating restart points and methodologies.
  4. Given a scenario, demonstrate proper use of standards.
  5. Explain the process necessary to run multiple copies of the source (job multi-instance).
  6. Demonstrate knowledge of real-time vs. batch job design.
9. Monitoring and Troubleshooting (10%)
  1. Given a scenario, demonstrate knowledge of parallel job score.
  2. Given a scenario, identify and define environment variables that control DataStage v.8.0 with regard to added functionality and reporting.
  3. Given a process list (scenario), identify conductor, section leader, and player process.
  4. Given a scenario, identify areas that may improve performance (e.g., buffer size, repartitioning, config files, operator combination, etc.).
  5. Demonstrate knowledge of runtime metadata analysis and performance monitoring.
10. Job Management and Deployment (10%)
  1. Demonstrate knowledge of advanced find.
  2. Given a scenario, demonstrate knowledge and the purpose of impact analysis.
  3. Demonstrate knowledge and purpose of job compare.
  4. Given a scenario, articulate the change control process.
11. Job Control and Runtime Management (10%)
  1. Demonstrate knowledge of message handlers.
  2. Identify the use of dsjob command line utility.
  3. Given a scenario, demonstrate an ability to use job sequencers (e.g., exception handling, re-startable, dependencies, passing return value from routine, parameter passing and job status).
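
For the dsjob objective above, here is a minimal dry-run sketch that only builds and prints a dsjob command line rather than running it. The project name MyProject, the job name LoadCustomers, the parameter LOAD_DATE and the helper build_dsjob_cmd are all hypothetical. The -run, -jobstatus and -param options are standard dsjob options, but check them against your installed version.

```shell
#!/bin/sh
# Dry-run sketch: print the dsjob command instead of executing it.
# build_dsjob_cmd, MyProject, LoadCustomers and LOAD_DATE are hypothetical names.
build_dsjob_cmd() {
    project=$1
    job=$2
    shift 2
    # -jobstatus makes dsjob wait for the job and derive its exit code from the job status
    cmd="dsjob -run -jobstatus"
    # every remaining argument is passed as a NAME=VALUE job parameter
    for p in "$@"
    do
        cmd="$cmd -param $p"
    done
    echo "$cmd $project $job"
}

build_dsjob_cmd MyProject LoadCustomers LOAD_DATE=2009-12-01
```

Printing the command first makes it easy to review the parameters before handing the same line to a scheduler such as cron.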


Sunday, November 15, 2009

Unix shell programs, shell scripts, basic examples

//Date :
Shell script to find simple Interest
Shell script :
read p
read n
read r
si=`expr $p \* $n \* $r`
#si=`expr $si / 100`
PI=$(echo "scale=6; $si / 100" | bc)
#echo $si
echo $PI
//Date :
Shell script to find leap year
Shell script :
yy=0
isleap="false"
echo -n "Enter year (yyyy) : "
read yy
# find out if it is a leap year or not
if [ $((yy % 4)) -ne 0 ] ; then
# not a leap year : keep the old value of isleap
# (":" is a no-op; a branch holding only comments is a syntax error)
:
elif [ $((yy % 400)) -eq 0 ] ; then
# yes, it's a leap year
isleap="true"
elif [ $((yy % 100)) -eq 0 ] ; then
# not a leap year : keep the old value of isleap
:
else
# it is a leap year
isleap="true"
fi
if [ "$isleap" = "true" ];
then
echo "$yy is leap year"
else
echo "$yy is NOT leap year"
fi
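
The same leap year rule can be sketched as a small function, which makes it easy to check a few known years. This is only an illustration. The function name is_leap and the sample years are my own choices, not part of the original lab script.

```shell
#!/bin/sh
# is_leap is a hypothetical helper: prints "true" or "false" for the year given as $1
is_leap() {
    yy=$1
    if [ $((yy % 400)) -eq 0 ]; then
        echo "true"      # divisible by 400: leap year
    elif [ $((yy % 100)) -eq 0 ]; then
        echo "false"     # century year not divisible by 400: not a leap year
    elif [ $((yy % 4)) -eq 0 ]; then
        echo "true"      # divisible by 4 but not by 100: leap year
    else
        echo "false"
    fi
}

for y in 1900 2000 2009 2012
do
    echo "$y: $(is_leap "$y")"
done
```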
//Date:
Shell script to find the number is odd or even
Shell script :
echo -n "Enter number : "
read n
rem=$(( $n % 2 ))
if [ $rem -eq 0 ]
then
echo "$n is even number"
else
echo "$n is odd number"
fi
//Date :
Shell script to perform arithmetic operations
Shell script :
a=$1
op="$2"
b=$3
if [ $# -lt 3 ]
then
echo "$0 num1 opr num2"
echo "opr can be +, -, / , x"
exit 1
fi
case "$op" in
+) echo $(( $a + $b ));;
-) echo $(( $a - $b ));;
/) echo $(( $a / $b ));;
x) echo $(( $a * $b ));;
*) echo "Error ";;
esac
//Date:
Shell script to print Fibonacci series
Shell script :
i=1
a=0
b=1
echo ${a},
while [ $i != 10 ]
do
b=`expr $a + $b`
echo ${b},
c=$a
a=$b
b=$c
i=`expr $i + 1`
done
//Date :
Menu Driven shell script
Shell script :
while :
do
clear
echo " M A I N - M E N U"
echo "1. Date"
echo "2. List of users currently logged"
echo "3. Present working directory"
echo "4. List of files"
echo "5. Exit"
echo -n "Please enter option [1 - 5]"
read opt
case $opt in
1) echo " $(date) ";
read enterKey;;
2) echo "*********** List of users currently logged $(who) ";
read enterKey;;
3) echo "You are in $(pwd) directory";
echo "Press [enter] key to continue. . .";
read enterKey;;
4) echo " $(ls) ";
read enterKey;;
5) exit 0;;
*) echo "$opt is an invalid option. Please select option between 1-5 only";
echo "Press [enter] key to continue. . .";
read enterKey;;
esac
done
//Date:
Shell script to print sum of digits
Shell script :
read n
sum=0
sd=0
while [ $n -gt 0 ]
do
sd=`expr $n % 10`
sum=`expr $sum + $sd`
n=`expr $n / 10`
done
echo "Sum of digits for number is $sum"
//Date:
Shell script to find biggest of three numbers
Shell script :
read n1
read n2
read n3
# check the all-equal case first, then use -ge so two-way ties still get an answer
if [ "$n1" -eq "$n2" ] && [ "$n2" -eq "$n3" ]
then
echo "All the three numbers are equal"
elif [ "$n1" -ge "$n2" ] && [ "$n1" -ge "$n3" ]
then
echo "$n1 is biggest number"
elif [ "$n2" -ge "$n1" ] && [ "$n2" -ge "$n3" ]
then
echo "$n2 is biggest number"
else
echo "$n3 is biggest number"
fi
//Date :
Shell script to find factorial of a number
Shell script:
read n
# fact holds the running product n * (n-1) * ... * 2
fact=1
while [ "$n" -gt 1 ]
do
fact=`expr $fact \* $n`
n=`expr $n - 1`
done
echo " $fact "
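
To check the factorial logic against known values, the loop can be wrapped in a function. This is a minimal sketch. The function name factorial is hypothetical, and it uses shell arithmetic expansion in place of expr.

```shell
#!/bin/sh
# factorial is a hypothetical helper: prints n! for a non-negative integer $1
factorial() {
    n=$1
    fact=1
    while [ "$n" -gt 1 ]
    do
        fact=$((fact * n))   # multiply the running product by n
        n=$((n - 1))
    done
    echo "$fact"
}

factorial 5
```

For an input of 5 this multiplies 5 * 4 * 3 * 2 and prints 120. An input of 0 or 1 skips the loop and prints 1.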
Sequential stage in datastage