Wednesday, March 25, 2015

Hive - S3NativeFileSystem - Insert Overwrite Bug

We store all our data in S3. We create external tables pointing to the data in S3 and run hive queries on these tables.

In one of the use cases, we needed to update a table incrementally with incoming data.
The query was something like this –

INSERT OVERWRITE TABLE main_table1
SELECT distinct my_field
FROM
(
    SELECT my_field
        FROM new_table1
    UNION ALL
    SELECT my_field
      FROM main_table1
) s;

Basically what we are doing here is that we union new data with existing data and eliminate duplicates from this union.
The expected behavior is that the ‘main_table1’ should get updated  on each run with new data from ‘new_table’.
What we observed is this query was overwriting ‘main_table1’ with the data of ‘new_table’. That means on each run, we will lose all the old data and only new data will remain in the table.

The behavior is due to a bug in EMR’s S3 library. S3NativeFileSystem class deletes the S3 file pointed by ‘main_table1’ while preparing query plan itself!

Even simple ‘Explain’ statement for this insert-overwrite-query deletes the data file in S3! This can result in SERIOUS DATA LOSS!

Not-So-Good-Solution:
Use a staging area (tmp table) to store your results and then copy the result from tmp table to main table.

CREATE TABLE tmp_table….

INSERT OVERWRITE TABLE tmp_table
SELECT distinct my_field
FROM
(
    SELECT my_field
        FROM new_table1
    UNION ALL
    SELECT my_field
      FROM main_table1
) s;

INSERT OVERWRITE main_table1
SELECT my_field from tmp_table;

There is one problem with this solution though. If you stop the last INSERT OVERWRITE statement (INSERT OVERWRITE main_table1 SELECT my_field from tmp_table;) before it completes successfully, you lose all your data! (Remember? - The S3 file is deleted while preparing query plan itself!)

Stable Solution:
Use ‘INSERT INTO’ instead of ‘INSERT Overwrite’.

INSERT INTO main_table1
SELECT t1.my_field FROM new_table1 t1
WHERE t1.my_field NOT IN (SELECT t2.my_field FROM main_table1);

This will ensure that you incrementally update the ‘main_table1’ without using INSERT OVERWRITE!

- Sarang

Thursday, March 12, 2015

Tuning Yarn container for Oozie

Oozie is a popular workflow management tool for BigData applications.
To give some high level idea, following is the container allocation for a typical oozie workflow application with hive action.




If you are running a heavy job through Oozie, there are chances that the yarn container which runs oozie job (‘Oozie workflow’ container in above image) may give out of memory.
The memory allocation for yarn container for oozie can be increased with property ‘oozie.launcher.mapreduce.map.memory.mb’ & 'oozie.launcher.mapreduce.map.java.opts'. The default value is typically 1536.
The property can be updated in oozie application workflow definition (workflow.xml) to allocate additional memory to container.

   <global>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <job-xml>/user/hive/conf/hive-default.xml</job-xml>
        <configuration>
            <property>
                <name>hive.metastore.uris</name>
                <value>${metastore}</value>
            </property>
            <property>
                <name>hive.metastore.client.socket.timeout</name>
                <value>3600</value>
            </property>
            <property>
                <name>mapred.reduce.tasks</name>
                <value>-1</value>
            </property>
            <property>
                <name>oozie.launcher.mapreduce.map.memory.mb</name>
                <value>3072</value>
            </property>
            <property>
                <name>oozie.launcher.mapreduce.map.java.opts</name>
                <value>-Xmx2560m</value>
            </property>          
            <property>
                <name>oozie.launcher.yarn.app.mapreduce.am.resource.mb</name>
                <value>3072</value>
            </property>
            <property>
                <name>oozie.launcher.yarn.app.mapreduce.am.command-opts</name>
                <value>-Xmx2560m</value>
            </property>
        </configuration>
    </global>

- Sarang