Problem description / exception stack

    PlanExecutor error during aggregation :: caused by :: $sample stage could not find a non-duplicate document after 100 while using a random cursor. This is likely a sporadic failure, please try again.

    Figure 1: FAQ, mongodb2hive sampling error (caused by :: $sample stage)

    Solution

    # Add these two parameters at the task level
    source.spark.mongodb.input.partitionerOptions.samplesPerPartition=1
    source.spark.mongodb.input.partitionerOptions.partitionSizeMB=128
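
To see why these two settings help, here is a minimal sketch of how the number of documents requested from $sample scales with them. It assumes the MongoDB Spark connector's sampling partitioner sizes its sample roughly as samplesPerPartition × documentCount ÷ documentsPerPartition, with defaults of 64 MB and 10 samples per partition. The function name and the collection figures are hypothetical and used only for illustration.

```python
import math


def estimated_sample_count(doc_count, avg_doc_bytes, partition_size_mb, samples_per_partition):
    """Rough estimate of how many documents the $sample stage is asked to return.

    Assumption: sample size = samplesPerPartition * count / docsPerPartition,
    which approximates the connector's sampling partitioner, not its exact code.
    """
    partition_bytes = partition_size_mb * 1024 * 1024
    # Number of average-sized documents that fit in one partition (at least 1).
    docs_per_partition = max(1, partition_bytes // avg_doc_bytes)
    return math.floor(samples_per_partition * doc_count / docs_per_partition)


# Hypothetical collection: 50 million documents averaging 1 KiB each.
defaults = estimated_sample_count(50_000_000, 1024, partition_size_mb=64, samples_per_partition=10)
tuned = estimated_sample_count(50_000_000, 1024, partition_size_mb=128, samples_per_partition=1)

print(defaults, tuned)
```

Under these assumptions the tuned settings request about 20 times fewer sampled documents, which should make it less likely that the random cursor exhausts its 100 attempts to find a non-duplicate document.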

    Root cause

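    The MongoDB collection is fragmented. As a likely consequence, the $sample stage that the connector uses to sample documents for partitioning keeps returning duplicates from its random cursor and gives up after 100 attempts.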


    Author: 魏璐璐