How to use Impala with Avro files?

Common errors when exploring Avro with Impala and Hive:

XXXX is nullable in the file schema but not the table schema. (1 of 3 similar)

OR 

TFetchResultsResp(status=TStatus(errorCode=0, errorMessage='java.io.IOException: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.avro.AvroGenericRecordWritable cannot be cast to org.apache.hadoop.io.BinaryComparable', sqlState=None, infoMessages=['*org.apache.hive.service.cli.HiveSQLException:java.io.IOException: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.avro.AvroGenericRecordWritable cannot be cast to org.apache.hadoop.io.BinaryComparable:14:13', 'org.apache.hive.service.cli.operation.SQLOperation:getNextRowSet:SQLOperation.java:328', 'org.apache.hive.service.cli.operation.OperationManager:getOperationNextRowSet:OperationManager.java:262', 'org.apache.hive.service.cli.session.HiveSessionImpl:fetchResults:HiveSessionImpl.java:732', 'org.apache.hive.service.cli.CLIService:fetchResults:CLIService.java:438', 'org.apache.hive.service.cli.thrift.ThriftCLIService:FetchResults:ThriftCLIService.java:692', 'org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1553', 'org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults:getResult:TCLIService.java:1538', 'org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39', 'org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39', 'org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor:process:HadoopThriftAuthBridge.java:695', 'org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:285', 'java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1142', 'java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:617', 'java.lang.Thread:run:Thread.java:745', '*java.io.IOException:java.lang.ClassCastException: org.apache.hadoop.hive.serde2.avro.AvroGenericRecordWritable cannot be cast to org.apache.hadoop.io.BinaryComparable:18:4', 'org.apache.hadoop.hive.ql.exec.FetchOperator:getNextRow:FetchOperator.java:507', 
'org.apache.hadoop.hive.ql.exec.FetchOperator:pushRow:FetchOperator.java:414', 'org.apache.hadoop.hive.ql.exec.FetchTask:fetch:FetchTask.java:138', 'org.apache.hadoop.hive.ql.Driver:getResults:Driver.java:1662', 'org.apache.hive.service.cli.operation.SQLOperation:getNextRowSet:SQLOperation.java:323', '*java.lang.ClassCastException:org.apache.hadoop.hive.serde2.avro.AvroGenericRecordWritable cannot be cast to org.apache.hadoop.io.BinaryComparable:20:2', 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe:doDeserialize:LazySimpleSerDe.java:162', 'org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe:deserialize:AbstractEncodingAwareSerDe.java:76', 'org.apache.hadoop.hive.ql.exec.FetchOperator:getNextRow:FetchOperator.java:488'], statusCode=3), results=None, hasMoreRows=None)

Solution :
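The stack trace above shows LazySimpleSerDe being handed an AvroGenericRecordWritable, which suggests the table was declared without the Avro SerDe. The first error suggests the table's schema does not match the schema embedded in the data files. One plausible fix, sketched here as an assumption rather than a confirmed procedure, is to recreate the table in Hive with the Avro SerDe and point it at the same .avsc schema that was used to write the files. The helper name, table name, and paths below are hypothetical placeholders; the SerDe and input/output format class names are the standard Hive ones.

```python
def avro_table_ddl(table: str, location: str, schema_url: str) -> str:
    """Build a Hive CREATE EXTERNAL TABLE statement for existing Avro files.

    The columns are not listed: Hive derives them from the Avro schema
    referenced by 'avro.schema.url', so nullable fields in the file schema
    stay nullable in the table schema.
    """
    return (
        f"CREATE EXTERNAL TABLE {table}\n"
        "ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'\n"
        "STORED AS\n"
        "  INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'\n"
        "  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'\n"
        f"LOCATION '{location}'\n"
        f"TBLPROPERTIES ('avro.schema.url'='{schema_url}');"
    )


if __name__ == "__main__":
    # Hypothetical table name and HDFS paths, for illustration only.
    print(
        avro_table_ddl(
            table="my_avro_table",
            location="/data/my_avro_table",
            schema_url="hdfs:///schemas/my_avro_table.avsc",
        )
    )
```

The printed statement would be run in Hive. Since the table is created outside Impala, running INVALIDATE METADATA my_avro_table in Impala afterwards should make it visible there.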

 
