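This post records the process of installing a CDH Hadoop cluster with yum, covering HDFS, YARN, Hive, and HBase. The installation uses CDH 5.4, so every step below applies to CDH 5.4. 0. Environment. System environment: Operating system: CentOS...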
Fixing the "database disk image is malformed" error on CentOS
Running a yum update, that is, the yum update command, fails with the following error: Error: database disk image is...
Checking SELinux status and disabling SELinux
To check the SELinux status:
1. /usr/sbin/sestatus -v    ## SELinux is on if the "SELinux status" field reads enabled
   SELinux status: enabled
2. getenforce ...
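As a minimal sketch of the check described above, the following Python function reads text in the shape of the `sestatus` output and reports whether the "SELinux status" field is `enabled`. The function name `selinux_enabled` and the sample string are illustrative assumptions; the sketch only parses text and does not run `sestatus` itself.

```python
def selinux_enabled(sestatus_output: str) -> bool:
    """Return True if the 'SELinux status' field reads 'enabled'."""
    for line in sestatus_output.splitlines():
        # Split each line into "field name" and "value" at the first colon.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "SELinux status":
            return value.strip() == "enabled"
    raise ValueError("no 'SELinux status' field found")


if __name__ == "__main__":
    # Hypothetical sample text in the shape shown in the excerpt.
    sample = "SELinux status:                 enabled"
    print(selinux_enabled(sample))
```

Raising an error when the field is missing, rather than returning False, avoids mistaking unparseable output for a disabled SELinux.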
Ubuntu iptables
https://www.digitalocean.com/community/tutorials/how-to-set-up-a-firewall-using-iptables-on-ubuntu-14-04 Introduction Setting up a good firewall is an essential step to take in securing any modern operating system. Most Linux distributions ship with a...
Avro specification: schema definition and declaration
Based on http://avro.apache.org/docs/current/spec.html Avro...
The ultimate date regular expressions
1. Simple date check (YYYY/MM/DD): ^\d{4}(\-|\/|\.)\d{1,2}\1\d{1,2}$
2. Extended date check (YYYY/MM/DD|...
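To illustrate how the simple pattern behaves, here is a small Python sketch that applies it unchanged. The helper name `looks_like_date` and the sample strings are assumptions made for illustration. The backreference `\1` forces the second separator to match the first, and the pattern checks only the shape of the string, not whether the month and day are in range.

```python
import re

# The simple pattern from the text: 4-digit year, a separator (-, / or .),
# 1-2 digit month, the SAME separator again (\1), then a 1-2 digit day.
SIMPLE_DATE = re.compile(r"^\d{4}(\-|\/|\.)\d{1,2}\1\d{1,2}$")


def looks_like_date(text: str) -> bool:
    """Return True if text has the YYYY<sep>MM<sep>DD shape."""
    return SIMPLE_DATE.match(text) is not None


if __name__ == "__main__":
    for candidate in ["2015-06-01", "2015/6/1", "2015.06.01",
                      "2015-06/01", "2015-13-45"]:
        print(candidate, looks_like_date(candidate))
```

Mixed separators such as `2015-06/01` are rejected, while `2015-13-45` is accepted because no range check is performed. Rejecting impossible dates needs either a more elaborate pattern or a real date parser.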
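Administrative division codes across China, for looking up the first 6 digits of ID card numbers
110000 Beijing
110101 Dongcheng District
110102 Xicheng District
110105 Chaoyang District
110106 Fengtai District
110107 Shijingshan District
110108 Haidian District
110109 Mentougou District
110111 Fangshan District
110112 Tongzhou District
110113...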
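A lookup over this table might be sketched as follows in Python. The dictionary holds only the codes visible in the excerpt above, and the function names and the sample number (a district prefix padded with zeros, not a real ID) are hypothetical. The province-level lookup relies on the convention that the first two digits identify the province, so the province code is those two digits followed by `0000`.

```python
# Only the codes visible in the excerpt; the full table is much longer.
DIVISION_CODES = {
    "110000": "Beijing",
    "110101": "Dongcheng District",
    "110102": "Xicheng District",
    "110105": "Chaoyang District",
    "110106": "Fengtai District",
    "110107": "Shijingshan District",
    "110108": "Haidian District",
    "110109": "Mentougou District",
    "110111": "Fangshan District",
    "110112": "Tongzhou District",
}


def lookup_division(id_number: str):
    """Return the division name for the first 6 digits, or None if unknown."""
    return DIVISION_CODES.get(id_number[:6])


def lookup_province(id_number: str):
    """Return the province-level name: first 2 digits padded with '0000'."""
    return DIVISION_CODES.get(id_number[:2] + "0000")


if __name__ == "__main__":
    # Hypothetical placeholder: a district prefix followed by zeros.
    sample = "110108" + "0" * 12
    print(lookup_province(sample), lookup_division(sample))
```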
Configuring Parquet in Hive (CDH4.3)
CDH4.3 does not ship a ready-made Parquet package, so using the Parquet format in Hive or Impala requires a manual installation. When creating a Parquet table, you must specify the Parquet InputFormat, OutputFormat, and SerDe. The CREATE TABLE statement is as follows:

    hive> create table parquet_test(x int, y string)
        > row format serde 'parquet.hive.serde.ParquetHiveSerDe'
        > stored as inputformat 'parquet.hive.DeprecatedParquetInputFormat'
        > outputformat 'parquet.hive.DeprecatedParquetOutputFormat';
    FAILED: SemanticException : Output Format must implement HiveOutputFormat, otherwise it should be either IgnoreKeyTextOutputFormat or SequenceFileOutputFormat

Submitting the statement fails because the class parquet.hive.DeprecatedParquetOutputFormat is not on Hive's CLASSPATH. The class lives in parquet-hive-1.2.5.jar under $IMPALA_HOME/lib, so creating a symlink in $HIVE_HOME/lib is enough:

    cd $HIVE_HOME/lib
    ln -s $IMPALA_HOME/lib/parquet-hive-1.2.5.jar

Submitting the CREATE TABLE statement again produces a different error:

    hive> create table parquet_test(x int, y string)
        > row format serde 'parquet.hive.serde.ParquetHiveSerDe'
        > stored as inputformat 'parquet.hive.DeprecatedParquetInputFormat'
        > outputformat 'parquet.hive.DeprecatedParquetOutputFormat';
    Exception in thread "main" java.lang.NoClassDefFoundError: parquet/hadoop/api/WriteSupport
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.hadoop.hive.ql.plan.CreateTableDesc.validate(CreateTableDesc.java:403)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:8858)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8190)
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:459)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:349)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:938)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
    Caused by: java.lang.ClassNotFoundException: parquet.hadoop.api.WriteSupport
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
        ... 20 more

This error is caused by missing Parquet jar files. Downloading them directly into $HIVE_HOME/lib resolves it:

    cd /usr/lib/hive/lib
    for f in parquet-avro parquet-cascading parquet-column parquet-common parquet-encoding parquet-generator parquet-hadoop parquet-hive parquet-pig parquet-scrooge parquet-test-hadoop2 parquet-thrift
    do
        curl -O https://oss.sonatype.org/service/local/repositories/releases/content/com/twitter/${f}/1.2.5/${f}-1.2.5.jar
    done
    curl -O https://oss.sonatype.org/service/local/repositories/releases/content/com/twitter/parquet-format/1.0.0/parquet-format-1.0.0.jar

Submitting the CREATE TABLE statement again now succeeds. Once the table exists, data from other tables has to be loaded into the Parquet table. Executing the HQL requires the Parquet jar files, and there are two ways to supply them: one is to run add... for every jar before running the statement
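The download loop above follows a fixed URL pattern, which can be sketched in Python to list the files that need to end up in $HIVE_HOME/lib. The base URL, module names, and versions are copied from the shell commands. The function names are illustrative assumptions, and the sketch only builds the URLs without downloading anything.

```python
# Base URL, module list, and versions are taken from the shell loop in the text.
BASE = ("https://oss.sonatype.org/service/local/repositories/"
        "releases/content/com/twitter")

MODULES = [
    "parquet-avro", "parquet-cascading", "parquet-column", "parquet-common",
    "parquet-encoding", "parquet-generator", "parquet-hadoop", "parquet-hive",
    "parquet-pig", "parquet-scrooge", "parquet-test-hadoop2", "parquet-thrift",
]


def jar_url(module: str, version: str) -> str:
    """Build the download URL for one jar, mirroring the curl commands."""
    return f"{BASE}/{module}/{version}/{module}-{version}.jar"


def all_jar_urls() -> list:
    """All 1.2.5 module jars plus the separately versioned parquet-format jar."""
    urls = [jar_url(module, "1.2.5") for module in MODULES]
    urls.append(jar_url("parquet-format", "1.0.0"))
    return urls


if __name__ == "__main__":
    for url in all_jar_urls():
        print(url)
```

Listing the URLs first makes it easy to compare the expected jar names against the contents of the lib directory before retrying the CREATE TABLE statement.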
Hadoop Hive SQL syntax in detail
Hive is a data warehouse analysis system built on Hadoop. It offers a rich set of SQL query features for analyzing data stored in Hadoop...
Honda CR-V Display Audio (DA) screen: iPhone screen mirroring that works while driving
I just found a method on a US forum and have personally verified that it works. 2015 CR-V...