Category Archives: Hadoop

“bad substitution” error running Spark on YARN

You can update the hdp.version property through the Ambari GUI. The steps are: 1. Go to ‘Ambari -> YARN -> Configs’ and open the ‘Advanced’ tab. 2. Scroll down to the end of the page, where you will find an option to add custom...

Comparing the old and new MapReduce APIs

Hadoop release 0.20 includes a new Java MapReduce API, which we also refer to as the context object (context...

Exception in thread “main” java.lang.IllegalArgumentException: Not a host:port pair:

...... 380 INFO zookeeper.ZooKeeper - Initiating client connection, connectString=hdp1.dg:2181 sessionTimeout=180000 watcher=hconnection 431 INFO zookeeper.ClientCnxn - Opening socket connection to server hdp1.dg/192.168.6.228:2181 441 INFO...
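The excerpt stops before the cause is shown, so the following is only a minimal sketch of the kind of check that produces a message like "Not a host:port pair". It assumes the address is expected in `host:port` form. The class name `HostPort` is hypothetical and is not the Hadoop or HBase implementation; the sample address is taken from the log above.

```java
import java.util.Objects;

/** Hypothetical helper illustrating host:port validation (not Hadoop/HBase code). */
public final class HostPort {
    public final String host;
    public final int port;

    private HostPort(String host, int port) {
        this.host = host;
        this.port = port;
    }

    /** Parses "host:port" and rejects anything else with IllegalArgumentException. */
    public static HostPort parse(String address) {
        Objects.requireNonNull(address, "address");
        int idx = address.lastIndexOf(':');
        // Reject a missing colon, an empty host, or an empty port.
        if (idx <= 0 || idx == address.length() - 1) {
            throw new IllegalArgumentException("Not a host:port pair: " + address);
        }
        int port;
        try {
            port = Integer.parseInt(address.substring(idx + 1));
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Not a host:port pair: " + address, e);
        }
        if (port < 0 || port > 65535) {
            throw new IllegalArgumentException("Not a host:port pair: " + address);
        }
        return new HostPort(address.substring(0, idx), port);
    }

    public static void main(String[] args) {
        HostPort ok = parse("hdp1.dg:2181");
        System.out.println(ok.host + " / " + ok.port);
        try {
            parse("hdp1.dg"); // no port, so this is rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Running a suspect address string through a check like this can help confirm whether the value being passed is malformed or whether the problem lies elsewhere.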

Change IP address of a Hadoop HDFS data node server and avoid Block pool errors

Change the host IP in Cloudera Manager. Change the host IP on all nodes: sudo nano /etc/hosts. If the master node IP changes, edit the IP in the Cloudera agent config.ini on all nodes: sudo nano /etc/cloudera-scm-agent/config.ini. Change the IP in the PostgreSQL database: For the...

UGI – UserGroupInformation

Using UserGroupInformation to submit Hadoop jobs. UserGroupInformation ugi = UserGroupInformation.createRemoteUser("hdfs"); try { ugi.doAs(new PrivilegedExceptionAction<Void>() { public Void run()...

DistributedCache Hadoop

Create a utility for adding a JAR to the distributed cache. Parameters: path is the local JAR path; conf is the Hadoop configuration. private static void addJarToDistributedCache(String path, Configuration conf) throws IOException { File jarFile = new...

java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.<init>()V

java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.<init>()V from class org.apache.hadoop.mapred.FileInputFormat at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:312) at...
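An IllegalAccessError on a library constructor often means the library version on the classpath differs from the one the calling code was compiled against; the excerpt does not say whether that is the case here. As a diagnostic sketch under that assumption, the snippet below prints which JAR a class is loaded from. The class name `ClassOrigin` and its placeholder return strings are hypothetical; it uses only standard JDK calls.

```java
import java.security.CodeSource;

/** Hypothetical diagnostic helper: reports where a class was loaded from. */
public final class ClassOrigin {

    /** Returns the code-source location of a class, or a placeholder if none is recorded. */
    public static String locate(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        // Classes loaded by the bootstrap loader have no code source.
        if (src == null || src.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return src.getLocation().toString();
    }

    /** Looks the class up by name without initializing it. */
    public static String locate(String className) {
        try {
            Class<?> cls = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            return locate(cls);
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the error message.
        String name = args.length > 0 ? args[0] : "com.google.common.base.Stopwatch";
        System.out.println(name + " loaded from: " + locate(name));
    }
}
```

Run it with the same classpath as the failing job; if the reported JAR is not the one you expected, a conflicting dependency is a likely suspect.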

bdutil

bdutil is a command-line script used to manage Hadoop instances on Google Compute Engine. It manages deployment, configuration, and shutdown of your Hadoop instances. Requirements: bdutil depends on the Google Cloud SDK. bdutil is supported...

Installing the MapR Client on CentOS or Red Hat

Remove any previous MapR software. You can use rpm -qa | grep mapr to get a list of installed MapR packages, then type the packages separated by spaces after the rpm...

How to dump the output to a file from Beeline?

https://community.hortonworks.com/questions/25789/how-to-dump-the-output-from-beeline.html
Hi, I am trying to dump the output of a Beeline query (below) to a file, but it prints all the logs along with the output in the file. beeline -u...