Oozie Shell Action Configuration

Shell Action

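A shell action runs a shell command. It requires a job-tracker, a name-node, and whatever arguments the command needs.

A shell action can be configured to create or delete HDFS directories before the shell job starts.

Configuration can be supplied through a configuration file (the job-xml element) or through the inline configuration element.

EL expressions can be used in the inline configuration. Values set in the inline configuration override the same properties in the job-xml file.

Note that the Hadoop properties mapred.job.tracker and fs.default.name must not be set in the inline configuration.

As with Hadoop map-reduce jobs, files and archives can be attached to the shell job. For details, see http://archive.cloudera.com/cdh/3/oozie/WorkflowFunctionalSpec.html#a3.2.2.1_Adding_Files_and_Archives_for_the_Job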

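The standard output (STDOUT) of the shell job is available to the workflow after the shell job finishes. This information can be used by decision nodes. If output capture is enabled for the shell job, the output of the shell command must meet the following two requirements:

  • The output must be a valid Java properties file.
  • The output must not exceed 2KB.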
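
As a minimal sketch of output that satisfies both requirements, the fragment below runs echo with a single key=value argument, so STDOUT is one line in Java properties format and far below the 2KB limit. The key name status and its value OK are hypothetical, chosen only for illustration. The capture-output element is what asks Oozie to keep this output.

```xml
<!-- Sketch only: the key "status" and the value "OK" are illustrative assumptions. -->
<shell xmlns="uri:oozie:shell-action:0.1">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <exec>echo</exec>
    <!-- echo prints "status=OK", which is a one-line Java properties file -->
    <argument>status=OK</argument>
    <capture-output/>
</shell>
```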

Syntax:

<workflow-app name="[WF-DEF-NAME]" xmlns="uri:oozie:workflow:0.2">
    ...
    <action name="[NODE-NAME]">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>[JOB-TRACKER]</job-tracker>
            <name-node>[NAME-NODE]</name-node>
            <prepare>
               <delete path="[PATH]"/>
               ...
               <mkdir path="[PATH]"/>
               ...
            </prepare>
            <job-xml>[SHELL SETTINGS FILE]</job-xml>
            <configuration>
                <property>
                    <name>[PROPERTY-NAME]</name>
                    <value>[PROPERTY-VALUE]</value>
                </property>
                ...
            </configuration>
            <exec>[SHELL-COMMAND]</exec>
            <argument>[ARG-VALUE]</argument>
                ...
            <argument>[ARG-VALUE]</argument>
            <env-var>[VAR1=VALUE1]</env-var>
               ...
            <env-var>[VARN=VALUEN]</env-var>
            <file>[FILE-PATH]</file>
            ...
            <archive>[FILE-PATH]</archive>
            ...
            <capture-output/>
        </shell>
        <ok to="[NODE-NAME]"/>
        <error to="[NODE-NAME]"/>
    </action>
    ...
</workflow-app>

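The prepare element lists the directories to delete or create before the job starts. Each path must start with hdfs://HOST:PORT.

The job-xml element specifies an existing configuration file.

The configuration element contains the properties passed to the shell job.

The exec element contains the path of the shell command to execute. Arguments can be passed to the command.

Each argument element specifies one argument passed to the shell script.

The env-var element contains an environment variable passed to the shell command. Each env-var element holds exactly one variable and its value. If the value refers to a variable such as $PATH, write it as PATH=$PATH:mypath. Do not use ${PATH}, because it would be evaluated by the EL parser.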
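
The env-var rules above might look like this in practice. This is a sketch under assumptions: script.sh, the mypath directory, and the APP_MODE variable are placeholder names, not values from any real deployment.

```xml
<!-- Sketch only: script.sh, mypath and APP_MODE are hypothetical names. -->
<shell xmlns="uri:oozie:shell-action:0.1">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <exec>script.sh</exec>
    <!-- One variable per env-var element; $PATH is written without braces
         so that the EL parser leaves it alone -->
    <env-var>PATH=$PATH:mypath</env-var>
    <env-var>APP_MODE=batch</env-var>
    <file>script.sh#script.sh</file>
</shell>
```

Writing ${PATH} here instead would be treated as an EL expression and resolved (or rejected) by Oozie before the command ever runs.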

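The capture-output element tells Oozie to capture the STDOUT of the shell script. The captured output can then be read from the workflow definition through an EL function. The original documentation gives this as String action:output(String node, String key); the Oozie workflow specification also defines wf:actionData(String node), which returns the captured properties as a map.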
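
A decision node could consume the captured output roughly as follows. This sketch assumes the wf:actionData EL function from the Oozie workflow specification, an action named shell1 that has capture-output enabled and prints a line such as status=OK, and target nodes named end and fail. The node name check-output and the key status are hypothetical.

```xml
<!-- Sketch only: "check-output", "status" and "OK" are illustrative assumptions.
     The ok transition of the shell1 action would point at this node. -->
<decision name="check-output">
    <switch>
        <!-- wf:actionData('shell1') exposes the captured properties as a map -->
        <case to="end">${wf:actionData('shell1')['status'] eq 'OK'}</case>
        <default to="fail"/>
    </switch>
</decision>
```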

Example:

<workflow-app xmlns='uri:oozie:workflow:0.2' name='shell-wf'>
    <start to='shell1' />
    <action name='shell1'>
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                  <name>mapred.job.queue.name</name>
                  <value>${queueName}</value>
                </property>
            </configuration>
            <exec>${EXEC}</exec>
            <argument>A</argument>
            <argument>B</argument>
            <file>${EXEC}#${EXEC}</file> <!--Copy the executable to compute node's current working directory -->
        </shell>
        <ok to="end" />
        <error to="fail" />
    </action>
    <kill name="fail">
        <message>Script failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name='end' />
</workflow-app>

The job properties file for this workflow is:

oozie.wf.application.path=hdfs://localhost:8020/user/kamrul/workflows/script
#Execute is expected to be in the Workflow directory.
#Shell Script to run
EXEC=script.sh
#CPP executable. Executable should be binary compatible to the compute node OS.
#EXEC=hello
#Perl script
#EXEC=script.pl
jobTracker=localhost:8021
nameNode=hdfs://localhost:8020
queueName=default

Running a Java program from a jar:

<workflow-app xmlns='uri:oozie:workflow:0.2' name='shell-wf'>
    <start to='shell1' />
    <action name='shell1'>
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                  <name>mapred.job.queue.name</name>
                  <value>${queueName}</value>
                </property>
            </configuration>
            <exec>java</exec>
            <argument>-classpath</argument>
            <argument>./${EXEC}:$CLASSPATH</argument>
            <argument>Hello</argument>
            <file>${EXEC}#${EXEC}</file> <!--Copy the jar to compute node current working directory -->
        </shell>
        <ok to="end" />
        <error to="fail" />
    </action>
    <kill name="fail">
        <message>Script failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name='end' />
</workflow-app>

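The <file> element copies the specified file to the machine that runs the script. If the job reports that a file cannot be found, try adding a file element for it.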

The corresponding properties file is:

oozie.wf.application.path=hdfs://localhost:8020/user/kamrul/workflows/script
#Hello.jar file is expected to be in the Workflow directory.
EXEC=Hello.jar
jobTracker=localhost:8021
nameNode=hdfs://localhost:8020
queueName=default

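Shell Action Logging

The stdout and stderr of a shell action are redirected to the stdout of the Oozie launcher map-reduce task that runs the shell command.

The Oozie web console shows only a small part of the log. The detailed log is available in the Hadoop job-tracker web UI.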

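Shell Action Limitations

A shell action can run any shell command, but the following limitations apply:

Interactive commands are not supported.

Commands cannot be run as a different user with sudo.

Users must explicitly upload any third-party libraries they need. Oozie uses the Hadoop distributed cache to upload, tag, and use them.

The shell command runs on an arbitrary Hadoop compute node, and the set of utilities installed by default may differ between nodes. Most common Unix utilities are usually installed on all compute nodes. The important point is that Oozie can only run commands that are installed on the compute nodes or uploaded through the distributed cache. In other words, any file the command needs must be uploaded with the file element.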

http://archive.cloudera.com/cdh/3/oozie/DG_ShellActionExtension.html

When reposting, please credit the source: http://jyd.me/
