
DataStage Error Collection (continuously updated)
Part of the DataStage article series
1 Error when running the dsadmin command
Cannot load program dsadmin because of the following errors:
Cannot load module /opt/IBM/InformationServer/Server/DSEngine/lib/libvmdsapi.so.
Dependent module libACS_client_cpp.a(shr.so) could not be loaded.
Cannot load module libACS_client_cpp.a(shr.so).
System error: A file or directory in the path name does not exist.
Cannot load module dsadmin.
Dependent module /opt/IBM/InformationServer/Server/DSEngine/lib/libvmdsapi.so could not be loaded.
Cannot load module .
1.1 Error description
Running the dsadmin command on the AIX 6.1 command line failed with errors about not being able to load the associated .so files, even though the DS environment variables had already been set:
#DataStage
export DSHOME=/opt/IBM/InformationServer/Server/DSEngine
#parallel engine
export APT_ORCHHOME=/opt/IBM/InformationServer/Server/PXEngine
#parallel engine
export APT_CONFIG_FILE=/opt/IBM/InformationServer/Server/Configurations/default.apt
export PATH=$PATH:$DSHOME/bin:$APT_ORCHHOME/bin
#AIX LIBPATH,linux LD_LIBRARY_PATH
export LIBPATH=$LIBPATH:$DSHOME/lib:$APT_ORCHHOME/lib
export ASBHOME=/opt/IBM/InformationServer/ASBNode
#environment
$DSHOME/dsenv
Checking with ldd reported the following errors:
$ ldd /opt/IBM/InformationServer/Server/DSEngine/lib/libvmdsapi.so
/opt/IBM/InformationServer/Server/DSEngine/lib/libvmdsapi.so needs:
/lib/libc.a(shr_64.o)
/lib/libpthread.a(shr_xpg5_64.o)
Cannot find libACS_client_cpp.a(shr.so)
Cannot find libACS_common_cpp.a(shr.so)
Cannot find libinvocation_cpp.a(shr.so)
Cannot find libxmogrt-xlC6.a
Cannot find libIISCrypto.so
/lib/libC.a(shr_64.o)
/lib/libC.a(ansi_64.o)
/lib/libcrypt.a(shr_64.o)
/lib/libC.a(ansicore_64.o)
/lib/libC.a(shrcore_64.o)
/lib/libC.a(shr3_64.o)
/lib/libC.a(shr2_64.o)
The libraries could not be found, yet the files do exist in one of the subdirectories:
$ ls -l /opt/IBM/InformationServer/ASBNode/lib/cpp/
-rwxr-xr-x    4117562 Nov 09 2013  libACS_client_cpp.a
-rwxr-xr-x            Nov 09 2013  libACS_common_cpp.a
-rwxr-xr-x    2010742 Nov 09 2013  libASB_agent_config_client_cpp.a
-rwxr-xr-x            Nov 09 2013  libinvocation_cpp.a
Echoing some of the environment variables defined in the dsenv file from the command line produced no output.
1.2 Solution
The errors above point to an environment configuration problem. The documentation stresses that $DSHOME/dsenv is a very important file and must be sourced in the profile. I had referenced it, but it never took effect because it was not referenced correctly: rechecking the profile showed that the leading "." was missing from the dsenv line;
#environment
$DSHOME/dsenv
Change it to:
#environment
. $DSHOME/dsenv
How do you know whether the environment variables have taken effect? A simple check is whether UDTHOME and UDTBIN are set in the current environment; both variables are set by the dsenv shipped with 8.5, 8.7 and 9.1 (a quick check is sketched below this excerpt).
if [ -z "$UDTHOME" ]
then
    UDTHOME=/opt/IBM/InformationServer/Server/DSEngine/ud41 ; export UDTHOME
    UDTBIN=/opt/IBM/InformationServer/Server/DSEngine/ud41/bin ; export UDTBIN
fi
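A quick way to confirm in the current shell that dsenv really has been sourced (a minimal sketch; empty output for the UDT variables suggests it has not):

# check that the dsenv settings are visible in the current shell
$ echo "UDTHOME=$UDTHOME"
$ echo "UDTBIN=$UDTBIN"
$ env | grep -E 'UDTHOME|UDTBIN|DSHOME|APT_ORCHHOME'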
2 Error when stopping WAS
/opt/IBM/InformationServer/ASBServer/bin/MetadataServer.sh
ADMU0116I: Tool information is being logged in file
/opt/IBM/WebSphere/AppServer/profiles/InfoSphere/logs/server1/stopServer.log
ADMU0128I: Starting tool with the InfoSphere profile
ADMU3100I: Reading configuration for server: server1
ADMU0509I: The server "server1" cannot be reached. It appears to be stopped.
ADMU0211I: Error details may be seen in the file:
/opt/IBM/WebSphere/AppServer/profiles/InfoSphere/logs/server1/stopServer.log
2.1 Error description
When shutting down WAS, the application server could not be stopped. Checking the log file:
FFDC Incident emitted on /opt/IBM/WebSphere/AppServer/bin/./client_ffdc/ffdc.8567577.txt com.ibm.websphere.
management.AdminClientFactory.createAdminClient 275
[1/21/15 10:09:16:236 GMT+08:00]
WsServerStop
ADMU3002E: Exception attempting to process server server1
[1/21/15 10:09:16:236 GMT+08:00]
WsServerStop
ADMU3007E: Exception com.ibm.websphere.management.exception.Conne
ctorException: com.ibm.websphere.management.exception.ConnectorException: ADMC0016E: The system cannot create a SOAP connecto
r to connect to host nhdbtest07 at port 8881.
[1/21/15 10:09:16:237 GMT+08:00]
WsServerStop
ADMU3007E: Exception com.ibm.websphere.management.exception.Conne
ctorException: com.ibm.websphere.management.exception.ConnectorException: ADMC0016E: The system cannot create a SOAP connecto
r to connect to host nhdbtest07 at port 8881.
at com.ibm.ws.management.connector.ConnectorHelper.createConnector(ConnectorHelper.java:606)
at com.ibm.ws.management.tools.WsServerStop.runTool(WsServerStop.java:372)
at com.ibm.ws.management.tools.AdminTool.executeUtility(AdminTool.java:269)
at com.ibm.ws.management.tools.WsServerStop.main(WsServerStop.java:112)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
at java.lang.reflect.Method.invoke(Method.java:611)
at com.ibm.wsspi.bootstrap.WSLauncher.launchMain(WSLauncher.java:234)
at com.ibm.wsspi.bootstrap.WSLauncher.main(WSLauncher.java:95)
at com.ibm.wsspi.bootstrap.WSLauncher.run(WSLauncher.java:76)
java.security.cert.CertificateExpiredException: NotAfter: Wed Sep 09 10:51:29 GMT+08:00 2015]
at com.ibm.ws.management.connector.soap.SOAPConnectorClient.reconnect(SOAPConnectorClient.java:422)
at com.ibm.ws.management.connector.soap.SOAPConnectorClient.<init>(SOAPConnectorClient.java:222)
... 40 more
Caused by: [SOAPException: faultCode=SOAP-ENV:C msg=Error opening socket: javax.net.ssl.SSLHandshakeException: com.ibm.
jsse2.util.h: PKIX path validation failed: java.security.cert.CertPathValidatorException: The certificate expired at Wed Sep
09 10:51:29 GMT+08:00 2015; internal cause is:
java.security.cert.CertificateExpiredException: NotAfter: Wed Sep 09 10:51:29 GMT+08:00 2015; targetException=java.la
ng.IllegalArgumentException: Error opening socket: javax.net.ssl.SSLHandshakeException: com.ibm.jsse2.util.h: PKIX path valid
ation failed: java.security.cert.CertPathValidatorException: The certificate expired at Wed Sep 09 10:51:29 GMT+08:00 2015; i
nternal cause is:
java.security.cert.CertificateExpiredException: NotAfter: Wed Sep 09 10:51:29 GMT+08:00 2015]
at org.apache.soap.transport.http.SOAPHTTPConnection.send(SOAPHTTPConnection.java:475)
at org.apache.soap.rpc.Call.WASinvoke(Call.java:451)
at com.ibm.ws.management.connector.soap.SOAPConnectorClient$4.run(SOAPConnectorClient.java:372)
at com.ibm.ws.security.util.AccessController.doPrivileged(AccessController.java:118)
at com.ibm.ws.management.connector.soap.SOAPConnectorClient.reconnect(SOAPConnectorClient.java:365)
... 41 more
[10/9/15 20:09:02:685 GMT+08:00]
ADMU0509I: The server "server1" cannot be reached. It appears to
be stopped.
[10/9/15 20:09:02:685 GMT+08:00]
ADMU0211I: Error details may be seen in the file: /opt/IBM/W
ebSphere/AppServer/profiles/InfoSphere/logs/server1/stopServer.log
2.2 Solution
Go to the server profile's bin directory, usually /opt/IBM/WebSphere/AppServer/profiles/InfoSphere/bin, and then inspect SystemErr.log, SystemOut.log and the other logs under /opt/IBM/WebSphere/AppServer/profiles/InfoSphere/logs/server1. Note that the paths may differ on your installation; the stop/start commands print the log path in their output (a sketch follows the example error below). Errors seen there have included:
java.sql.SQLException: [IBM][Oracle JDBC Driver][Oracle]ORA-28001: the password has expired
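A minimal sketch of the checks described above; the profile path is the one used in this installation, while the administrative user name and password are placeholders to replace with your own:

# retry the stop with explicit credentials, then inspect the logs the tools point to
cd /opt/IBM/WebSphere/AppServer/profiles/InfoSphere/bin
./stopServer.sh server1 -username wasadmin -password '********'
tail -100 ../logs/server1/stopServer.log
tail -100 ../logs/server1/SystemOut.log ../logs/server1/SystemErr.log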
3 Error when importing table definitions
An unexpected exception occurred accessing the repository:
<com/ascential/asb/cas/shared/ConnectorServiceException>
<![CDATA[An unexpected exception occurred accessing the repository: ]]>
<![CDATA[com.ascential.asb.cas.shared.ConnectorServiceException: An unexpected exception occurred accessing the repository:
at com.ascential.asb.cas.shared.ConnectorAccessServiceBeanSupport.persist(ConnectorAccessServiceBeanSupport.java:5345)
at com.ascential.asb.cas.shared.ConnectorAccessServiceBeanSupport.discoverSchema(ConnectorAccessServiceBeanSupport.java:3549)
at com.ascential.asb.cas.service.impl.ConnectorAccessServiceBean.discoverSchema(ConnectorAccessServiceBean.java:3177)
at com.ascential.asb.cas.service.EJSRemoteStatelessConnectorAccess_6ccddb18.discoverSchema(Unknown Source)
at com.ascential.asb.cas.service._EJSRemoteStatelessConnectorAccess_6ccddb18_Tie.discoverSchema__com_ascential_asb_cas_shared_util_ConnectionHandle__com_ascential_xmeta_emf_util_EObjectMemento__CORBA_WStringValue__boolean__boolean__boolean__boolean__CORBA_WStringValue(_EJSRemoteStatelessConnectorAccess_6ccddb18_Tie.java:820)
at com.ascential.asb.cas.service._EJSRemoteStatelessConnectorAccess_6ccddb18_Tie._invoke(_EJSRemoteStatelessConnectorAccess_6ccddb18_Tie.java:355)
at com.ibm.CORBA.iiop.ServerDelegate.dispatchInvokeHandler(ServerDelegate.java:669)
at com.ibm.CORBA.iiop.ServerDelegate.dispatch(ServerDelegate.java:523)
at com.ibm.rmi.iiop.ORB.process(ORB.java:523)
at com.ibm.CORBA.iiop.ORB.process(ORB.java:1575)
at com.ibm.rmi.iiop.Connection.doRequestWork(Connection.java:2992)
at com.ibm.rmi.iiop.Connection.doWork(Connection.java:2875)
at com.ibm.rmi.iiop.WorkUnitImpl.doWork(WorkUnitImpl.java:64)
at com.ibm.ejs.oa.pool.PooledThread.run(ThreadPool.java:118)
at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1783)
3.1 Error description
This error occurred when importing Oracle table definitions into a DS project (not via ODBC; an ODBC import worked at the time). The DS logs showed no errors; the last thing recorded was the ConnectorServiceException above. The official documentation explains that the user attempting the import does not have sufficient privileges for the operation.
3.2 Solution
The cause is that the current user lacks the required privileges. Log in to the console (http://hostname:9080/ibm/iis/console), click the Administration tab, choose Users and Groups => Users, select the user in the middle pane, click Open User on the right, grant the user the Common Metadata Administrator and Common Metadata Importer roles, save, and then log in again from the client.
4 Job compilation errors
4.1 Compilation error 1: missing compiler component
Output from transformer compilation follows:
##I IIS-DSEE-TFCN-:21(000) &
IBM InfoSphere DataStage Enterprise Edition 9.1.0.6791
Copyright (c) -2012 IBM Corporation. All rights reserved
##I IIS-DSEE-TFCN-:21(001) & conductor uname: -s=AIX; -r=1; -v=6; -n=nhdbtest07; -m=00F
##I IIS-DSEE-TOSH-:21(002) & orchgeneral: loaded
##I IIS-DSEE-TOSH-:21(003) & orchsort: loaded
##I IIS-DSEE-TOSH-:21(004) & orchstats: loaded
##W IIS-DSEE-TOSH-:21(007) & Parameter specified but not used in flow: DSPXWorkingDir
##E IIS-DSEE-TBLD-:21(009) & Error when checking composite operator: Subprocess command failed with exit status 32,256.
##E IIS-DSEE-TFSR-:21(010) & Could not check all operators because of previous error(s)
##W IIS-DSEE-TFTM-:21(011) & Error when checking composite operator: The number of reject datasets "0" is less than the number of input datasets "1".
##W IIS-DSEE-TBLD-:21(012) & Error when checking composite operator: Output from subprocess: sh: /usr/vacpp/bin/xlC_r:
not found.
##I IIS-DSEE-TBLD-:21(013) & Error when checking composite operator: /usr/vacpp/bin/xlC_r
-I/opt/IBM/InformationServer/Server/PXEngine/include -O -q64 -qtbtable=full -c /opt/IBM/dsprojects/dstest/RT_BP7.O/V0S9_JoinDataFromTabToTable_Tran_Joined.C -o /opt/IBM/dsprojects/dstest/RT_BP7.O/V0S9_JoinDataFromTabToTable_Tran_Joined.tmp.o.
##E IIS-DSEE-TCOS-:21(014) & Creation of a step finished with status = FAILED. (JoinDataFromTabToTable.Tran_Joined)
*** Internal Generated Transformer Code follows:
0002: // Generated file to implement the V0S9_JoinDataFromTabToTable_Tran_Joined transform operator.
0005: // define our input/output link names
0006: inputname 0 DSLink15;
0007: outputname 0 Select_
0009: initialize {
// define our control variables
int8 RowRejected0;
int8 NullSetVar0;
0016: mainloop {
// initialise the rejected row variable
RowRejected0 = 1;
// evaluate columns (no constraints) for link: Select_tran
Select_tran.OBJECT_ID = DSLink15.DATA_OBJECT_ID;
writerecord 0;
RowRejected0 = 0;
0027: finish {
*** End of Internal Generated Transformer Code
4.1.1 Error description
The error occurred in the Transformer stage while compiling a parallel job containing a Transformer stage on DataStage on AIX 6.1. The official explanation is that the XL C/C++ compiler is not installed on the machine. Checking the installed filesets at the time gave the following output:
$lslpp -l |grep -i xlC
xlC.aix61.rte
XL C/C++ Runtime for AIX 6.1
C for AIX Preprocessor
xlC.msg.en_US.cpp
C for AIX Preprocessor
xlC.msg.en_US.rte
XL C/C++ Runtime
XL C/C++ Runtime
xlC.sup.aix50.rte
XL C/C++ Runtime for AIX 5.2
$lslpp -l ipfx.rte
Fileset ipfx.rte not installed.
$lslpp -ch|grep vac
and the compiler executable (/usr/vacpp/bin/xlC_r) did not exist. That file is configured as the default DS compiler; in a newly created project's environment you can see the following settings:
APT_COMPILEOPT:-O -q64 -qtbtable=full -c
APT_COMPILER:/usr/vacpp/bin/xlC_r
APT_LINKER:/usr/vacpp/bin/xlC_r
APT_LINKOPT:-G -q64
4.1.2 Solution
Download the XL_C_C_plus_plus_for_AIX_V11.1 package, extract it, change to the XL_C_C_plus_plus_for_AIX_V11.1/usr/sys/inst.images directory, and run smitty installp to install it.
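A sketch of the install steps, assuming the package was downloaded to /tmp as a .tar.gz archive (the archive name and location are assumptions; run as root):

# unpack the XL C/C++ 11.1 package and install it from its inst.images directory
cd /tmp
gunzip -c XL_C_C_plus_plus_for_AIX_V11.1.tar.gz | tar -xvf -
cd XL_C_C_plus_plus_for_AIX_V11.1/usr/sys/inst.images
smitty installp                  # interactive install from the current directory
installp -acgXY -d . all         # or non-interactive: apply, commit, auto-prereqs, accept license
lslpp -l | grep -i vacpp         # verify the vacpp filesets are COMMITTED
ls -l /usr/vacpp/bin/xlC_r       # the compiler DataStage expects should now exist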
4.1.3 Output after installation
$lslpp -l |grep -i xlC
xlC.adt.include
C Set ++ Application
xlC.aix61.rte
XL C/C++ Runtime for AIX 6.1
C for AIX Preprocessor
xlC.msg.en_US.cpp
C for AIX Preprocessor
xlC.msg.en_US.rte
XL C/C++ Runtime
XL C/C++ Runtime
xlC.sup.aix50.rte
XL C/C++ Runtime for AIX 5.2
$lslpp -l ipfx.rte
Fileset ipfx.rte not installed.
[nhsjjhetl01:root]lslpp -ch|grep vac
/usr/lib/objrepos:vac.Bnd:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;59
/usr/lib/objrepos:vac.C:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;10
/usr/lib/objrepos:vac.aix50.lib:99.99.::COMMIT:COMPLETE:06/12/12:17;06;59
/usr/lib/objrepos:vac.aix52.lib:99.99.::COMMIT:COMPLETE:06/12/12:17;06;59
/usr/lib/objrepos:vac.aix53.lib:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;58
/usr/lib/objrepos:mon.search:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;10
/usr/lib/objrepos:vac.html.en_US.C:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;09
/usr/lib/objrepos:vac.html.ja_JP.C:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;09
/usr/lib/objrepos:vac.html.zh_CN.C:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;08
/usr/lib/objrepos:vac.include:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;30
/usr/lib/objrepos:vac.lib:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;28
/usr/lib/objrepos:vac.lic:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;29
/usr/lib/objrepos:vac.licAgreement:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;08
/usr/lib/objrepos:vac.man.EN_US:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;04
/usr/lib/objrepos:vac.man.ZH_CN:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;00
/usr/lib/objrepos:vac.man.Zh_CN:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;01
/usr/lib/objrepos:vac.man.en_US:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;05
/usr/lib/objrepos:vac.man.zh_CN:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;02
/usr/lib/objrepos:vac.msg.en_US.C:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;18
/usr/lib/objrepos:vac.ndi:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;07
/usr/lib/objrepos:vac.pdf.en_US.C:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;52
/usr/lib/objrepos:vac.pdf.zh_CN.C:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;51
/usr/lib/objrepos:vacpp.Bnd:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;50
/usr/lib/objrepos:vacpp.cmp.aix50.lib:99.99.::COMMIT:COMPLETE:06/12/12:17;06;39
/usr/lib/objrepos:vacpp.cmp.aix50.tools:99.99.::COMMIT:COMPLETE:06/12/12:17;06;39
/usr/lib/objrepos:vacpp.cmp.aix52.lib:99.99.::COMMIT:COMPLETE:06/12/12:17;06;39
/usr/lib/objrepos:vacpp.cmp.aix52.tools:99.99.::COMMIT:COMPLETE:06/12/12:17;06;39
/usr/lib/objrepos:vacpp.cmp.aix53.lib:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;38
/usr/lib/objrepos:vacpp.cmp.aix53.tools:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;38
/usr/lib/objrepos:vacpp.cmp.core:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;22
/usr/lib/objrepos:vacpp.cmp.include:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;40
/usr/lib/objrepos:vacpp.cmp.lib:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;40
/usr/lib/objrepos:vacpp.cmp.rte:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;40
/usr/lib/objrepos:vacpp.cmp.tools:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;40
/usr/lib/objrepos:mon:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;38
/usr/lib/objrepos:vacpp.html.en_US:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;37
/usr/lib/objrepos:vacpp.html.ja_JP:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;36
/usr/lib/objrepos:vacpp.html.zh_CN:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;36
/usr/lib/objrepos:vacpp.lic:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;35
/usr/lib/objrepos:vacpp.licAgreement:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;34
/usr/lib/objrepos:vacpp.man.EN_US:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;32
/usr/lib/objrepos:vacpp.man.ZH_CN:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;30
/usr/lib/objrepos:vacpp.man.Zh_CN:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;28
/usr/lib/objrepos:vacpp.man.en_US:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;26
/usr/lib/objrepos:vacpp.man.zh_CN:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;24
/usr/lib/objrepos:vacpp.memdbg.aix50.lib:99.99.::COMMIT:COMPLETE:06/12/12:17;06;23
/usr/lib/objrepos:vacpp.memdbg.aix50.rte:99.99.::COMMIT:COMPLETE:06/12/12:17;06;23
/usr/lib/objrepos:vacpp.memdbg.aix52.lib:99.99.::COMMIT:COMPLETE:06/12/12:17;06;23
/usr/lib/objrepos:vacpp.memdbg.aix52.rte:99.99.::COMMIT:COMPLETE:06/12/12:17;06;23
/usr/lib/objrepos:vacpp.memdbg.aix53.lib:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;23
/usr/lib/objrepos:vacpp.memdbg.aix53.rte:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;23
/usr/lib/objrepos:vacpp.memdbg.lib:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;21
/usr/lib/objrepos:vacpp.memdbg.rte:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;21
/usr/lib/objrepos:vacpp.msg.en_US.cmp.core:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;44
/usr/lib/objrepos:vacpp.msg.en_US.cmp.tools:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;20
/usr/lib/objrepos:vacpp.ndi:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;22
/usr/lib/objrepos:vacpp.pdf.en_US:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;20
/usr/lib/objrepos:vacpp.pdf.zh_CN:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;06;20
/usr/lib/objrepos:vacpp.samples.ansicl:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;31
/etc/objrepos:vac.C:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;17
/etc/objrepos:vacpp.cmp.core:11.1.0.0::COMMIT:COMPLETE:06/12/12:17;07;26
4.2 Compilation error 2: abnormal job status
4.2.1 Error description
The error occurred when recompiling a job that had stopped abnormally. Before the compile, the job had hung and would not respond to anything; it could not be stopped normally or through the Director client, so in the end its process was killed from the command line and the DS runtime files under &PH& were deleted. After that the job status showed as "Crashed";
4.2.2 Solution
Go to the DSHOME directory, start the uvsh command line, and type LIST.READU EVERY to list the current DS locks;
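Entering the engine shell typically looks like this (a minimal sketch, assuming you are dsadm and DSHOME points at the 9.1 engine shown earlier):

# source the engine environment, then start the engine shell;
# LIST.READU EVERY is typed at the > prompt that follows
cd $DSHOME
. ./dsenv
bin/uvsh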
DataStage Command Language 9.1 Licensed Materials - Property of IBM
(c) Copyright IBM Corp.  All Rights Reserved.
DSEngine logged on: Thursday, October 29,
>LIST.READU EVERY
Active Group Locks:                                   Record Group Group Group
Device.... Inode.... Netnode Userno Lmode G-Address.  Locks ...RD ...SH ...EX

Active Record Locks:
Device.... Inode.... Netnode Userno   Pid Login Id   Item-ID.............
                                    28607 dsadm      RT_CONFIG11
                                    28607 dsadm      dstage1&!DS.ADMIN!&
                                     1219 dsadm      dstage1&!DS.ADMIN!&
                                     8734 dsadm      dstage1&!DS.ADMIN!&
                                    28607 dsadm      ClusterMergeDataFromTabToSeqFile.fifo
After locating the relevant lock entries, type LOGTO UV and then release the lock with UNLOCK INODE #Inode USER #User ALL, where #Inode is the value from the Inode.... column above and #User is the value from the Userno column;
>UNLOCK INODE 1228935 USER 36929 ALL
Clearing Record locks.
Clearing GROUP locks.
Clearing FILE locks.
>LIST.READU EVERY
Active Group Locks:                                   Record Group Group Group
Device.... Inode.... Netnode Userno Lmode G-Address.  Locks ...RD ...SH ...EX

Active Record Locks:
Device.... Inode.... Netnode Userno   Pid Login Id   Item-ID.............
                                    28607 dsadm      RT_CONFIG11
                                    28607 dsadm      dstage1&!DS.ADMIN!&
                                     1219 dsadm      dstage1&!DS.ADMIN!&
                                     8734 dsadm      dstage1&!DS.ADMIN!&
After releasing a lock you can run LIST.READU EVERY again to list the remaining locks. Note: if your job contains many stages and the stage logic is complex, DS may create a number of additional locks whose Item-ID is not the job name but something like dstage1&!DS.ADMIN!& above. Releasing only the lock whose Item-ID matches the job name will not solve the problem in that case; you must release the other extra locks as well, so be careful.
Then try recompiling the job. If that still does not work:
# uv -admin -info
Details for DataStage Engine release 9.1.0.0 instance "ade"
===============================================================================
Install history   : Installed by root (admin:dsadm) on T15:17:42.766
Instance tag
Engine status     : Running w/active nls
Engine location   : /disk2/IBM/EngineTier/Server/DSEngine
Binary location   : /disk2/IBM/EngineTier/Server/DSEngine/bin
Impersonation
Administrator
Autostart mode
Autostart link(s) : /etc/rc.d/init.d/ds.rc
                  : /etc/rc.d/rc2.d/S999ds.rc
                  : /etc/rc.d/rc3.d/S999ds.rc
                  : /etc/rc.d/rc4.d/S999ds.rc
                  : /etc/rc.d/rc5.d/S999ds.rc
Startup script    : /disk2/IBM/EngineTier/Server/DSEngine/sample/ds.rc
Cache Segments
User Segments
3 phantom printer segments!
C Stime Tty
52053 dsadm    00:00:04 dsapi_slave 7 6 0 4
52169 dsadm    00:00:00 phantom DSD.RUN ClusterMer
52413 dsadm    00:02:13 dsapi_slave 7 6 0 4
# kill -9 13367
Kill the DSD.RUN process and the dsapi_slave processes. Doing this usually terminates them abnormally and leaves the job status as Crashed;
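Before killing anything it helps to list the engine processes still attached to the hung job; a sketch (DSD.RUN, DSD.OshMonitor and dsapi_slave are the usual DataStage process names, and <pid> is a placeholder for the process id you find):

# list the phantom and slave processes, then kill the ones belonging to the hung job
ps -ef | grep -E 'DSD.RUN|DSD.OshMonitor|dsapi_slave' | grep -v grep
kill <pid>            # try a normal kill first
kill -9 <pid>         # only if the process refuses to exit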
# dsjob -jobinfo dstage1 ClusterMergeDataFromTabToSeqFile
Job Status      : CRASHED (96)
Job Controller  : not available
Job Start Time  : Thu Oct 29 15:22:49 2015
Job Wave Number : 1
User Status     : not available
Job Control
Interim Status  : NOT RUNNING (99)
Invocation ID   : not available
Last Run Time   : Fri Oct 30 09:08:37 2015
Job Process ID
Invocation List : ClusterMergeDataFromTabToSeqFile
Job Restartable : 0
Status code = 0
In DS, CRASHED can mean several things: the job terminated abnormally, the compile failed, or an internal error occurred. At this point you can reset the job to return it to its initial state;
dsjob -run -mode RESET dstage1 ClusterMergeDataFromTabToSeqFile
Status code = 0
# dsjob -jobinfo dstage1 ClusterMergeDataFromTabToSeqFile
Job Status      : RESET (21)
Job Controller  : not available
Job Start Time  : Fri Oct 30 09:37:53 2015
Job Wave Number : 0
User Status     : not available
Job Control
Interim Status  : NOT RUNNING (99)
Invocation ID   : not available
Last Run Time   : Fri Oct 30 09:37:53 2015
Job Process ID
Invocation List : ClusterMergeDataFromTabToSeqFile
Job Restartable : 0
Status code = 0
After the RESET completes you can try a compile or a VALIDATE run. If that still does not solve the problem, restart the engine (a restart sketch follows the VALIDATE example below).
# dsjob -run -mode VALIDATE dstage1 ClusterMergeDataFromTabToSeqFile
Status code = 0
# dsjob -jobinfo dstage1 ClusterMergeDataFromTabToSeqFile
Job Status      : RUNNING (0)
Job Controller  : not available
Job Start Time  : Fri Oct 30 09:42:24 2015
Job Wave Number : 1
User Status     : not available
Job Control
Interim Status  : NOT RUNNING (99)
Invocation ID   : not available
Last Run Time   : 1 08:00:00 1970
Job Process ID
Invocation List : ClusterMergeDataFromTabToSeqFile
Job Restartable : 0
Status code = 0
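If the engine does have to be restarted, a minimal sketch (assuming the standard uv -admin interface shown earlier; run as dsadm or root after all clients and jobs have disconnected):

# stop and restart the DataStage engine
cd $DSHOME
. ./dsenv
bin/uv -admin -stop
bin/uv -admin -info      # wait until the engine reports it is no longer running
bin/uv -admin -start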
5 Agent attach error
5.1 Error description
When importing Oracle table definitions through a Connector import, the error 31531 not available was reported. Checking the port on the AIX 6.1 server:
$netstat -Ana|grep 31531
PCB/ADDR   Proto  Local Address        Foreign Address      (state)
feb3b8     tcp    192.168.1.12.31531   192.168.1.12.33436   CLOSE_WAIT
fcbb8      tcp4   192.168.1.12.31531   192.168.1.12.33438   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33440   CLOSE_WAIT
febb8      tcp4   192.168.1.12.31531   192.168.1.12.33442   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33444   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33446   CLOSE_WAIT
fad9bb8    tcp4   192.168.1.12.31531   192.168.1.12.33449   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33452   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33454   CLOSE_WAIT
faf23b8    tcp4   192.168.1.12.31531   192.168.1.12.33456   CLOSE_WAIT
febb8      tcp4   192.168.1.12.31531   192.168.1.12.33458   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33460   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33462   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33464   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33468   CLOSE_WAIT
fad0bb8    tcp4   192.168.1.12.31531   192.168.1.12.33470   CLOSE_WAIT
fcd6bb8    tcp4   192.168.1.12.31531   192.168.1.12.33472   CLOSE_WAIT
fabb8      tcp4   192.168.1.12.31531   192.168.1.12.33474   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33477   CLOSE_WAIT
ff3b8      tcp4   192.168.1.12.31531   192.168.1.12.33479   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33482   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33484   CLOSE_WAIT
fdd23b8    tcp4   192.168.1.12.31531   192.168.1.12.33486   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33488   CLOSE_WAIT
fac03b8    tcp4   192.168.1.12.31531   192.168.1.12.33490   CLOSE_WAIT
fc3b8      tcp4   192.168.1.12.31531   192.168.1.12.33492   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33495   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33497   CLOSE_WAIT
fc3b8      tcp4   192.168.1.12.31531   192.168.1.12.33499   CLOSE_WAIT
fe3b8      tcp4   192.168.1.12.31531   192.168.1.12.33501   CLOSE_WAIT
fc8bb8     tcp4   192.168.1.12.31531   192.168.1.12.33503   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33505   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33507   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33509   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33511   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33513   CLOSE_WAIT
fcd23b8    tcp4   192.168.1.12.31531   192.168.1.12.33515   CLOSE_WAIT
fae6bb8    tcp4   192.168.1.12.31531   192.168.1.12.33517   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33519   CLOSE_WAIT
fc3b8      tcp4   192.168.1.12.31531   192.168.1.12.33521   CLOSE_WAIT
fd13b8     tcp4   192.168.1.12.31531   192.168.1.12.33523   CLOSE_WAIT
ff3b8      tcp4   192.168.1.12.31531   192.168.1.12.33525   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33528   CLOSE_WAIT
ffdebb8    tcp4   192.168.1.12.31531   192.168.1.12.33530   CLOSE_WAIT
ffc2bb8    tcp4   192.168.1.12.31531   192.168.1.12.33532   CLOSE_WAIT
fc93b8     tcp4   192.168.1.12.31531   192.168.1.12.33534   CLOSE_WAIT
fae43b8    tcp4   192.168.1.12.31531   192.168.1.12.33536   CLOSE_WAIT
ffd73b8    tcp4   192.168.1.12.31531   192.168.1.12.33538   CLOSE_WAIT
fbbbb8     tcp4   192.168.1.12.31531   192.168.1.12.33540   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33542   CLOSE_WAIT
fdbb8      tcp4   192.168.1.12.31531   192.168.1.12.33544   CLOSE_WAIT
fcca3b8    tcp4   192.168.1.12.31531   192.168.1.12.33546   CLOSE_WAIT
faabb8     tcp4   192.168.1.12.31531   192.168.1.12.33548   CLOSE_WAIT
fabb8      tcp4   192.168.1.12.31531   192.168.1.12.33550   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33552   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33555   CLOSE_WAIT
fdbb8      tcp4   192.168.1.12.31531   192.168.1.12.33557   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33559   CLOSE_WAIT
fc3b8      tcp4   192.168.1.12.31531   192.168.1.12.33561   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33563   CLOSE_WAIT
ff3b8      tcp4   192.168.1.12.31531   192.168.1.12.33565   CLOSE_WAIT
ffbb8      tcp4   192.168.1.12.31531   192.168.1.12.33567   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33571   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33573   CLOSE_WAIT
fbfbb8     tcp4   192.168.1.12.31531   192.168.1.12.33575   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33577   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33579   CLOSE_WAIT
fec4bb8    tcp4   192.168.1.12.31531   192.168.1.12.33581   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33584   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33586   CLOSE_WAIT
fab4bb8    tcp4   192.168.1.12.31531   192.168.1.12.33588   CLOSE_WAIT
febb8      tcp4   192.168.1.12.31531   192.168.1.12.33590   CLOSE_WAIT
                  192.168.1.12.31531   192.168.1.12.33592   CLOSE_WAIT
fdcbb8     tcp4   192.168.1.12.31531   192.168.1.12.33594   CLOSE_WAIT
ffd3bb8    tcp4   192.168.1.12.31531   192.168.1.12.33597   CLOSE_WAIT
fb7bb8     tcp4   192.168.1.12.31531   192.168.1.12.33601   CLOSE_WAIT
fd7bb8     tcp4   192.168.1.12.31531   192.168.1.12.33603   CLOSE_WAIT
ffba3b8    tcp    192.168.1.12.35035   192.168.1.12.31531
                  192.168.1.12.35038   192.168.1.12.31531
fee1bb8    tcp    192.168.1.12.35041   192.168.1.12.31531
                  192.168.1.12.35044   192.168.1.12.31531
ffde3b8    tcp    192.168.1.12.35047   192.168.1.12.31531
                  192.168.1.12.31531   192.168.1.12.35050   CLOSE_WAIT
fbcf3b8    tcp    192.168.1.12.35050   192.168.1.12.31531   FIN_WAIT_2
ff43b8     tcp    192.168.1.12.35053   192.168.1.12.31531
                  192.168.1.12.35056   192.168.1.12.31531
There were many connections stuck in the CLOSE_WAIT state. Checking with rmsock shows that for some of them the owning process no longer exists and the connection is effectively closed;
$rmsock f3b8 tcpcb
socket 0xf97008 is removed.
Something caused them to end up in CLOSE_WAIT, for example the client program crashed and exited abnormally, or the network connection between client and server was broken abnormally.
5.2 Solution
On AIX you can use rmsock (removes a socket that does not have a file descriptor) to remove the orphaned sockets; if a socket is still held by a process, look the process up by its PID and then kill it (a batch-cleanup sketch appears at the end of this section).
rmsock feb3b8 tcpcb
The socket 0xfeb008 is being held by proccess
(RunAgent).
$ps -ef|grep
root      7:13:02  pts/1  60:33 /opt/IBM/InformationServer/ASBNode/bin/RunAgent -Xbootclasspath/a:conf -Djava.ext.dirs=apps/jre/lib/ext:lib/java:eclipse/plugins:eclipse/plugins/com.ibm.isf.client -Djava.class.path=conf -Djava.security.auth.login.config=/opt/IBM/InformationServer/ASBNode/eclipse/plugins/com.ibm.isf.client/auth.conf -Dcom.ibm.CORBA.ConfigURL=file:/opt/IBM/InformationServer/ASBNode/eclipse/plugins/com.ibm.isf.client/sas.client.props -Dcom.ibm.SSL.ConfigURL=file:/opt/IBM/InformationServer/ASBNode/eclipse/plugins/com.ibm.isf.client/ssl.client.props -Dcom.ibm.CORBA.enableClientCallbacks=true -Dcom.ibm.CORBA.FragmentSize=128000 -class com/ascential/asb/agent/impl/AgentImpl run
[nhdbtest07:root]kill -9
Once the CLOSE_WAIT connections had been cleaned up, the connection worked normally again.
netstat -Ana|grep 31531
fd33b8     tcp    192.168.1.12.35538   192.168.1.12.31531
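To clean up a large number of such sockets in one pass, something like the following can be used (a sketch for AIX only; run as root and verify each socket before removing it):

# walk every CLOSE_WAIT connection on port 31531 and remove the ones without a file
# descriptor; for sockets still held by a process, rmsock prints the holding process instead
netstat -Ana | grep '\.31531 ' | grep CLOSE_WAIT | awk '{print $1}' |
while read pcb; do
    rmsock "$pcb" tcpcb
done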
6 Job cannot find shared libraries at run time
At run time the job cannot find libccora11g.so and libclntsh.so.11.1:
Error loading connector library libccora11g.so. libclntsh.so.11.1: cannot open shared object file: No such file or directory
6.1 Error description
The error occurs when running a job that contains an Oracle Connector or any other stage that accesses an Oracle database; testing the connection inside the stage may well succeed.
6.2 Solution
1) First confirm, as the Oracle user, that you can connect to and query the database normally;
2) Confirm that the Oracle environment variables in dsenv are configured correctly (a typical example follows the commands below);
3) Check whether libccora11g.so, or a link to it, exists under $ORACLE_HOME/lib;
4) If all of the above are fine, locate libccora11g.so under the Engine installation directory (it is usually at EngineTier/Server/StagingArea/Installed/OracleConnector/Server/linux/libccora11g.so) and create a symbolic link to it in $ORACLE_HOME/lib:
# find /disk2/IBM/EngineTier -name "libccora11g.so"
# ln -s /disk2/IBM/EngineTier/Server/StagingArea/Installed/OracleConnector/Server/linux/libccora11g.so $ORACLE_HOME/lib
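For step 2, the Oracle-related entries in dsenv usually look something like the following (the ORACLE_HOME path is an illustrative assumption; on AIX the library path variable is LIBPATH rather than LD_LIBRARY_PATH):

# typical Oracle client settings appended to $DSHOME/dsenv (illustrative values)
ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1 ; export ORACLE_HOME
PATH=$PATH:$ORACLE_HOME/bin ; export PATH
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ORACLE_HOME/lib ; export LD_LIBRARY_PATH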
Next, locate the install.liborchoracle file, edit it, and find the following content:
install_driver() {
case $version in
9 ) VER='9i';;
10 ) VER='10g';;
If the database you are using is 11g, change it to:
install_driver() {
case $version in
9 ) VER='9i';;
10|11) VER='10g';;
Then save the file, exit, and run it.
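A sketch of running the edited script; its location varies by release, so locate it first and substitute the path that find reports:

# load the engine environment, find the edited installer script and run it
. $DSHOME/dsenv
find /disk2/IBM -name install.liborchoracle 2>/dev/null
sh /disk2/IBM/<path_reported_by_find>/install.liborchoracle    # hypothetical path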
7 Job run-time exception
main_program: Fatal Error: The set of available nodes for op2 (parallel inserted tsort operator {key={value=DEPTNO, subArgs={desc, nulls={value=first}}}}(0)) is empty.
This set is influenced by calls to addNodeConstraint(),
addResourceConstraint() and setAvailableNodes().
If none of these
functions have been called on this operator, then the default node
pool must be empty.
This step has 5 datasets:
ds0: {op0[] (sequential Select_department)
eOther(APT_HashPartitioner { key={ value=DEPTNO, subArgs={ desc } })#>eCollectAny
op3[] (parallel inserted tsort operator {key={value=DEPTNO, subArgs={desc}}}(1))}
ds1: {op1[] (parallel Select_employee)
eOther(APT_HashPartitioner { key={ value=DEPTNO, subArgs={ desc } })#>eCollectAny
op2[] (parallel inserted tsort operator {key={value=DEPTNO, subArgs={desc, nulls={value=first}}}}(0))}
ds2: {op2[] (parallel inserted tsort operator {key={value=DEPTNO, subArgs={desc, nulls={value=first}}}}(0))
[pp] eSame=>eCollectAny
op4[] (parallel Merge_2)}
ds3: {op3[] (parallel inserted tsort operator {key={value=DEPTNO, subArgs={desc}}}(1))
[pp] eSame=>eCollectAny
op4[] (parallel Merge_2)}
ds4: {op4[] (parallel Merge_2)
&eCollectAny
op5[] (sequential APT_RealFileExportOperator1 in Sequential_File_3)}
It has 6 operators:
op0[] {(sequential Select_department)
op1[] {(parallel Select_employee)
op2[] {(parallel inserted tsort operator {key={value=DEPTNO, subArgs={desc, nulls={value=first}}}}(0))
op3[] {(parallel inserted tsort operator {key={value=DEPTNO, subArgs={desc}}}(1))
op4[] {(parallel Merge_2)
op5[] {(sequential APT_RealFileExportOperator1 in Sequential_File_3)
7.1 Error description
I created a job like this:
Oracle_Connector1 --> Merge --> Sequential_File
Oracle_Connector2 --^
The job ran on a two-node cluster; in a non-cluster environment the same job ran successfully.
7.2 Error analysis
Reading the error carefully shows that after op0 (Oracle_Connector1) and op1 (Oracle_Connector2), DS automatically inserted the op2 and op3 operators (parallel inserted tsort operator): at run time DS sorts the data itself before the Merge. The error complains about the set of nodes available to one of those inserted sorts, so I checked the APT_CONFIG_FILE configuration file and found that one node had its pools set to "io";
node "node1"
fastname "dsconductor01"
pools "conductor"
resource disk "/tmp/ds/resource" {pools ""}
resource scratchdisk "/tmp/ds/scratch" {pools ""}
node "node2"
fastname "dscompute01"
pools "io"
resource disk "/tmp/ds/resource" {pools ""}
resource scratchdisk "/tmp/ds/scratch" {pools ""}
Setting pools to "io" marks the node as one with better I/O capacity. Because its pools entry no longer contains the default pool "", the node is removed from the default node pool, which is where DS schedules the automatically inserted tsort operators.
7.3 Solution
There are two ways to solve this problem:
1) Put the io node back into the default DS node pool (pools "") in the APT_CONFIG_FILE configuration file (see the sketch after this list). The drawback is that doing so explicitly discards a resource designation that is useful and can noticeably improve cluster performance.
2) Set the environment variable APT_NO_SORT_INSERTION to True for this job. This carries some risk: in some cases we do not know when DS would need to sort the data, and setting the variable effectively tells DS that the data is already sorted, so no sorts are inserted automatically. If a downstream stage such as Join or Merge actually depends on that inserted sort, the result is incorrect data or other follow-on errors.
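For option 1, a sketch of the node entry after the change: listing both the default pool "" and "io" keeps the node available for the automatically inserted sorts while still marking it as an I/O node (names and paths are the ones from the configuration above):

node "node2"
{
    fastname "dscompute01"
    pools "" "io"
    resource disk "/tmp/ds/resource" {pools ""}
    resource scratchdisk "/tmp/ds/scratch" {pools ""}
}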