Monthly Archives: April 2016

HBase Basics Quiz

1. What is HBase?
   Distributed scalable database
   Distributed non-relational, open source database
   Distributed row-oriented database
   Distributed scalable Big Data store

2. HBase and HDFS
   Are both great for batch processing
   HBase is designed for sequential access only
   HDFS is designed for fast lookups for large tables
   All of the above
   None of the above

[…]

How to use Talend with the Dimelo API?

TalendExpert.com can help you ingest your data from Dimelo using Talend ETL. Our teams have developed and successfully delivered this integration, and we have built many reusable, ready-to-use template jobs for Dimelo integration. We also offer ready-to-use components that extract and update data through the Dimelo API using a very […]

Tables saved with the Spark SQL DataFrame.saveAsTable method are not compatible with Hive

Writing a DataFrame directly to a Hive table creates a table that is not compatible with Hive; the metadata stored in the metastore can only be correctly interpreted by Spark. For example:

    val hsc = new HiveContext(sc)
    import hsc.implicits._
    val df = sc.parallelize(data).toDF()
    df.write.format("parquet").saveAsTable(tableName)

creates a table with this metadata:

    inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat
    outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat

This is […]
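The excerpt above is cut off before it describes any workaround, so the following is only a sketch of one approach that is often suggested for Spark 1.x, not necessarily the one the post goes on to recommend. It registers the DataFrame as a temporary table and lets Hive create the target table through a CREATE TABLE ... STORED AS PARQUET AS SELECT statement, so that the metastore entry is written in a form Hive understands. The sample data, the column names, and the table names `staging_rows` and `demo_rows` are hypothetical, and the code assumes a Spark 1.x build with Hive support and a reachable metastore.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object HiveCompatibleWrite {
  def main(args: Array[String]): Unit = {
    // Assumes a Spark 1.x deployment built with Hive support.
    val sc = new SparkContext(new SparkConf().setAppName("hive-compatible-write"))
    val hsc = new HiveContext(sc)
    import hsc.implicits._

    // Hypothetical sample data standing in for `data` in the excerpt.
    val data = Seq((1, "a"), (2, "b"))
    val df = sc.parallelize(data).toDF("id", "label")

    // Expose the DataFrame to SQL under a temporary name (hypothetical).
    df.registerTempTable("staging_rows")

    // Let Hive create the table, so the metastore records Hive's own
    // Parquet input/output formats rather than Spark-specific metadata.
    hsc.sql("CREATE TABLE demo_rows STORED AS PARQUET AS SELECT * FROM staging_rows")

    // Print the table metadata to check the recorded formats.
    hsc.sql("DESCRIBE FORMATTED demo_rows").collect().foreach(println)

    sc.stop()
  }
}
```

Comparing the inputFormat and outputFormat lines printed at the end with the ones listed above is a quick way to see whether the table was created with Hive-readable metadata in your environment.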