Persisting a DenseMatrix in Apache Spark
Is there a recommended or proven efficient format or mechanism to persist a DenseMatrix in Apache Spark, or should I just write it to a file?
I am generating the DenseMatrix from an SVD operation and need to refer to it whenever user queries come in, so it will be looked up often.
Any help is appreciated.
If by DenseMatrix you
mean org.apache.spark.mllib.linalg.DenseMatrix
(the V factor of the SVD), it is a local data structure, so there is no Spark-specific way to handle this type of object.
One way to handle it is to write the serialized object directly to a file:
```scala
val oos = new java.io.ObjectOutputStream(new java.io.FileOutputStream("/tmp/foo"))
oos.writeObject(svd.V)
oos.close()
```
You can read it back later using a java.io.FileInputStream wrapped in a java.io.ObjectInputStream and calling readObject.
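A minimal round-trip sketch of that read-back path, using an Array[Double] as a stand-in payload (the file path and payload are hypothetical; svd.V works the same way because org.apache.spark.mllib.linalg.DenseMatrix is Serializable):

```scala
import java.io._

// Hypothetical path and payload; svd.V is written the same way,
// since DenseMatrix implements java.io.Serializable.
val path = "/tmp/matrix.ser"
val data = Array(1.0, 2.0, 3.0)

// Serialize to disk.
val oos = new ObjectOutputStream(new FileOutputStream(path))
try oos.writeObject(data) finally oos.close()

// Deserialize; readObject returns AnyRef, so cast back to the expected type.
val ois = new ObjectInputStream(new FileInputStream(path))
val restored = try ois.readObject().asInstanceOf[Array[Double]] finally ois.close()

println(restored.mkString(","))  // 1.0,2.0,3.0
```

The cast is the fragile part of this approach: Java serialization ties the file to the exact class (and serialVersionUID) that wrote it, which is one reason a text format can be preferable.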
Alternatively, you can use a human-readable serialization of your choice, such as JSON:
```scala
import net.liftweb.json.{NoTypeHints, Serialization}
import net.liftweb.json.Serialization.{read, write}

implicit val formats = Serialization.formats(NoTypeHints)

val serialized: String = write(svd.V)
// write to a file and read it back if needed ...

// deserialize
val deserialized = read[DenseMatrix](serialized)
```
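The "write to a file and read it back" step is plain string I/O; a small sketch using java.nio (the path and the sample JSON string are hypothetical stand-ins for the output of write(svd.V)):

```scala
import java.nio.file.{Files, Paths}
import java.nio.charset.StandardCharsets

// Hypothetical path; in practice `serialized` would come from write(svd.V).
val jsonPath = Paths.get("/tmp/matrix.json")
val serialized = """{"numRows":2,"numCols":2,"values":[1.0,0.0,0.0,1.0]}"""

// Persist the JSON string, then load it back unchanged.
Files.write(jsonPath, serialized.getBytes(StandardCharsets.UTF_8))
val loaded = new String(Files.readAllBytes(jsonPath), StandardCharsets.UTF_8)
println(loaded == serialized)  // true
```

Since the file is plain text, it stays readable and debuggable even if the Spark or lift-json versions change.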