Creating an Operand from a large NdArray fails with "exceeded maximum protobuf size of 2GB" #464

Open
@mullerhai

Hi,
I build an org.tensorflow.ndarray.DoubleNdArray from a Spark DataFrame. When I then try to create an Operand[TFloat64] from it with tf.constant, I get the following error:


scala> val featureVector = SparkConverter.sparkDataframeFeatureVectorConvertTfTensor(finalInputDf,"final_features" )
featureVector: org.tensorflow.ndarray.DoubleNdArray = org.tensorflow.ndarray.impl.dense.DoubleDenseNdArray@e3f6a6a0
scala> val ft  = tf.constant(featureVector)
[libprotobuf ERROR external/com_google_protobuf/src/google/protobuf/message_lite.cc:451] tensorflow.AttrValue exceeded maximum protobuf size of 2GB: 6279090916
org.tensorflow.exceptions.TFInvalidArgumentException: AttrValue missing value with expected type 'tensor'
         for attr 'value'
        ; NodeDef: {{node Const}}; Op<name=Const; signature= -> output:dtype; attr=value:tensor; attr=dtype:type>
  at org.tensorflow.internal.c_api.AbstractTF_Status.throwExceptionIfNotOK(AbstractTF_Status.java:87)
  at org.tensorflow.EagerOperationBuilder.execute(EagerOperationBuilder.java:314)
  at org.tensorflow.EagerOperationBuilder.build(EagerOperationBuilder.java:77)
  at org.tensorflow.EagerOperationBuilder.build(EagerOperationBuilder.java:64)
  at org.tensorflow.op.core.Constant.create(Constant.java:1350)
  at org.tensorflow.op.core.Constant.tensorOf(Constant.java:521)
  at org.tensorflow.op.Ops.constant(Ops.java:1669)
  ... 59 elided
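
The first line of the log gives the size that triggered the failure: 6279090916 bytes, well above the 2GB cap that protobuf places on a single serialized message. A minimal sketch of that comparison in plain Scala follows. The object and method names (ProtobufSizeCheck, fitsInProtobuf, estimateFloat64Bytes) are hypothetical helpers for illustration, not part of TensorFlow Java, and the estimate ignores the few bytes of protobuf framing around the tensor data.

```scala
object ProtobufSizeCheck {
  // protobuf tracks message sizes in a signed 32-bit int,
  // so a single message cannot exceed Int.MaxValue bytes (2147483647).
  val MaxProtobufBytes: Long = Int.MaxValue.toLong

  // True when a payload of numBytes could fit in one protobuf message.
  def fitsInProtobuf(numBytes: Long): Boolean =
    numBytes <= MaxProtobufBytes

  // Rough payload size of a Float64 tensor: 8 bytes per element.
  // (Assumption: framing overhead is ignored.)
  def estimateFloat64Bytes(shape: Seq[Long]): Long =
    shape.product * 8L
}
```

With the numbers from this issue, fitsInProtobuf(6279090916L) is false, while the smaller tensor reported further down (1058424696 bytes) passes the check.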

If I first filter the DataFrame down to a smaller subset, the same call succeeds:


scala> val featureVector = SparkConverter.sparkDataframeFeatureVectorConvertTfTensor(finalInputDf.filter(col("pay_status").equalTo(1)),"final_features" )
featureVector: org.tensorflow.ndarray.DoubleNdArray = org.tensorflow.ndarray.impl.dense.DoubleDenseNdArray@627077a

scala> val ft_small  = tf.constant(featureVector)
ft_small: org.tensorflow.op.core.Constant[org.tensorflow.types.TFloat64] = <Const 'Const_2'>

scala> ft_small.asTensor().numBytes()
res43: Long = 1058424696
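
Since the filtered data (about 1GB) works and the full data (about 6.3GB) does not, one possible direction, untested here, is to split the rows into chunks that each stay under the limit and build one constant per chunk. The sketch below only plans the row ranges. ChunkPlanner, maxRowsPerChunk, and chunkRanges are hypothetical names, the number of feature columns is an assumed input, and the default budget of half the protobuf cap is an arbitrary safety margin.

```scala
object ChunkPlanner {
  val MaxProtobufBytes: Long = Int.MaxValue.toLong
  val BytesPerDouble: Long = 8L

  // Largest number of rows whose Float64 payload stays within budgetBytes.
  // (Assumption: default budget is half the protobuf cap, for headroom.)
  def maxRowsPerChunk(numCols: Long, budgetBytes: Long = MaxProtobufBytes / 2): Long = {
    require(numCols > 0, "numCols must be positive")
    val rows = budgetBytes / (numCols * BytesPerDouble)
    require(rows > 0, "a single row already exceeds the budget")
    rows
  }

  // Half-open [start, end) row ranges that together cover 0 until totalRows.
  def chunkRanges(totalRows: Long, rowsPerChunk: Long): Vector[(Long, Long)] = {
    require(rowsPerChunk > 0, "rowsPerChunk must be positive")
    (0L until totalRows by rowsPerChunk)
      .map(start => (start, math.min(start + rowsPerChunk, totalRows)))
      .toVector
  }
}
```

For example, ChunkPlanner.chunkRanges(10L, 4L) gives the ranges (0,4), (4,8), (8,10). Each range could then be used to slice the DoubleNdArray before calling tf.constant on the slice. Whether the chunks can afterwards be combined into one tensor without hitting the same limit is not confirmed by anything in this issue.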
