With the original setup on HBase 1.3.6, MySQL data could not be written to HBase at all; the error was a class-not-found. After upgrading HBase to 2.1.0, Spark reads the MySQL data but the write to HBase now throws an error. The error log is as follows:

2022-02-24 12:08:09:174[INFO]: [Data ingestion]:[HBASE]:checking whether table exists: t1
2022-02-24 12:08:09:179[INFO]: [Data ingestion]:[HBASE]:table already exists, checking column family: cf1
2022-02-24 12:08:09:186[INFO]: [Data ingestion]:[HBASE]:tableDescriptor:'t1', {NAME => 'cf1', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
2022-02-24 12:08:09:190[INFO]: Got an error when resolving hostNames. Falling back to /default-rack for all
2022-02-24 12:08:09:189[INFO]: [Data ingestion]:[HBASE]:[WRITE]:writeDS:=====start=======
2022-02-24 12:08:10:191[INFO]: Got an error when resolving hostNames. Falling back to /default-rack for all
2022-02-24 12:08:10:201[INFO]: Code generated in 308.11907 ms
2022-02-24 12:08:10:263[INFO]: [Data ingestion]:[HBASE]:[WRITE]:DataFrame:=====MapPartitionsRDD[3] at rdd at HbaseDataSources.scala:214
2022-02-24 12:08:10:294[INFO]: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
2022-02-24 12:08:10:299[INFO]: Using output committer class org.apache.hadoop.mapred.FileOutputCommitter
2022-02-24 12:08:10:301[INFO]: File Output Committer Algorithm version is 2
2022-02-24 12:08:10:301[INFO]: FileOutputCommitter skip cleanup temporary folders under output directory:false, ignore cleanup failures: false
2022-02-24 12:08:10:301[WARN]: Output Path is null in setupJob()
2022-02-24 12:08:10:325[INFO]: Starting job: runJob at SparkHadoopWriter.scala:78
2022-02-24 12:08:10:341[INFO]: Got job 0 (runJob at SparkHadoopWriter.scala:78) with 1 output partitions
2022-02-24 12:08:10:342[INFO]: Final stage: ResultStage 0 (runJob at SparkHadoopWriter.scala:78)
2022-02-24 12:08:10:342[INFO]: Parents of final stage: List()
2022-02-24 12:08:10:344[INFO]: Missing parents: List()
spark.rdd.scope.noOverride===true
spark.jobGroup.id===946377927967121408
spark.rdd.scope==={"id":"6","name":"saveAsHadoopDataset"}
spark.job.description===mysql2hbase_2022-02-24 12:07:58_946377927967121408
spark.job.interruptOnCancel===false
=====jobStart.properties:{spark.rdd.scope.noOverride=true, spark.jobGroup.id=946377927967121408_, spark.rdd.scope={"id":"6","name":"saveAsHadoopDataset"}, spark.job.description=mysql2hbase_2022-02-24 12:07:58_946377927967121408, spark.job.interruptOnCancel=false}
Process:null
2022-02-24 12:08:10:348[INFO]: Submitting ResultStage 0 (MapPartitionsRDD[4] at map at HbaseDataSources.scala:215), which has no missing parents
2022-02-24 12:08:10:348[ERROR]: Listener ServerSparkListener threw an exception
scala.MatchError: null
at com.zyc.common.ServerSparkListener.onJobStart(ServerSparkListener.scala:32)
at org.apache.spark.scheduler.SparkListenerBus$class.doPostEvent(SparkListenerBus.scala:37)
at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:91)
at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$super$postToAll(AsyncEventQueue.scala:92)
at org.apache.spark.scheduler.AsyncEventQueue$$anonfun$org$apache$spark$scheduler$AsyncEventQueue$$dispatch$1.apply$mcJ$sp(AsyncEventQueue.scala:92)
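The decisive line is the ERROR near the end: ServerSparkListener.onJobStart (ServerSparkListener.scala:32) throws scala.MatchError: null. A MatchError means a match expression received a value that no case pattern covers, and null is the classic trigger, since type patterns like `case s: String` never match null. Note the `Process:null` line printed just before the error, and that the jobStart.properties dump shows spark.jobGroup.id arriving as `946377927967121408_` (trailing underscore), so a per-job lookup keyed on the bare id would miss. The listener source is not shown in this issue, so the following is only a minimal sketch of the failing pattern; the `processes` registry and the lookup key are assumptions:

```scala
import java.util.Properties

// Hypothetical reconstruction of the failure; the real code at
// ServerSparkListener.scala:32 is not shown in this issue.
object MatchErrorRepro {

  // Stand-in for whatever per-job registry the listener consults.
  private val processes = new java.util.concurrent.ConcurrentHashMap[String, String]()

  def onJobStart(properties: Properties): Unit = {
    val jobGroupId = properties.getProperty("spark.jobGroup.id")
    val process = processes.get(jobGroupId) // ConcurrentHashMap.get returns null on a miss
    println(s"Process:$process")            // mirrors the "Process:null" line in the log

    process match {
      // A type pattern never matches null; with no case covering null,
      // Scala throws scala.MatchError: null, as in the stack trace above.
      case p: String => println(s"tracking process $p")
    }
  }

  def main(args: Array[String]): Unit = {
    val props = new Properties()
    // The log shows the id arriving with a trailing "_".
    props.setProperty("spark.jobGroup.id", "946377927967121408_")
    onJobStart(props) // prints "Process:null", then throws scala.MatchError: null
  }
}
```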
zhaoyachao (maintainer) replied:

Hi, confirmed: this is a bug. It affects version 4.7.18 and all earlier versions, which cannot complete the HBase write. It will be fixed in release 5.0.0. Temporary workaround: modify the zdh_server source, in the Spark listener, as shown in the attached screenshot.

If you are using the packaged install, you can download the 4.7.10 source, modify this file, recompile, and copy the compiled class into the jar, as shown in the attached screenshot.
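The screenshots from the original reply did not survive in this text, so the exact patch is not visible. A plausible shape for the listener-side fix, under the same assumed names as the sketch above, is to wrap the nullable lookup in Option so that a job zdh_server does not know about is skipped instead of crashing the listener:

```scala
import java.util.Properties

// A defensive sketch (not the actual 5.0.0 patch): wrap the nullable
// lookup in Option so every outcome has an explicit case and
// onJobStart can no longer throw scala.MatchError: null.
object SafeListenerSketch {
  private val processes = new java.util.concurrent.ConcurrentHashMap[String, String]()

  def onJobStart(properties: Properties): Unit = {
    val process = for {
      id <- Option(properties.getProperty("spark.jobGroup.id"))
      p  <- Option(processes.get(id)) // Option(null) becomes None on a miss
    } yield p

    process match {
      case Some(p) => println(s"tracking process $p")
      case None    => // job was not started by zdh_server; ignore it
    }
  }

  def main(args: Array[String]): Unit =
    onJobStart(new Properties()) // no property set: prints nothing, no MatchError
}
```

With this shape a MatchError is impossible by construction, because Option[String] has exactly the two cases the match covers.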