CaffeSpark configuration
the layer index in the network protocol file
batch size
make a dummy data blob to be used by Solver threads
a dummy data blob
make a dummy data blob to be used by Solver threads
a dummy data blob
initialization of a Source within a process
true if successfully initialized
construct a sample RDD
spark context
RDD created from this source
create a batch of samples extracted from source queue
This method is invoked by the Transformer thread. An implementation should extract samples from the source queue, parse them, and produce a batch.
holder for sample Ids
holder for label blob
true if successful
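The batch-building contract described above (take samples from a queue, fill the id and label holders, report success) can be sketched as follows. This is only an illustration under stated assumptions, not the CaffeOnSpark API: `Sample`, `BatchSketch`, and the holder types are hypothetical, and the sketch uses a non-blocking `poll()` where a real source may block or use an end-of-data marker.

```scala
import java.util.concurrent.ArrayBlockingQueue

// Hypothetical sample type; the real element type of the source queue is not specified here.
final case class Sample(id: String, label: Float, data: Array[Byte])

object BatchSketch {
  // Fills the holders with one batch taken from the queue.
  // The batch size is assumed to equal the length of the holders.
  // Returns true only if a full batch was assembled.
  def nextBatch(queue: ArrayBlockingQueue[Sample],
                sampleIds: Array[String],
                labels: Array[Float]): Boolean = {
    require(sampleIds.length == labels.length, "holders must have the same length")
    val batchSize = sampleIds.length
    var count = 0
    while (count < batchSize) {
      val sample = queue.poll() // returns null when the queue is empty
      if (sample == null) return false
      sampleIds(count) = sample.id
      labels(count) = sample.label
      count += 1
    }
    true
  }
}
```

With two queued samples and holders of length 2, the first call fills both holders and returns true; a second call on the now-empty queue returns false.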
adjust batch size
ImageDataFrame is a built-in data source class using Spark dataframe format.
ImageDataFrame expects a dataframe with 2 required columns (label: String, data: byte[]) and 5 optional columns (id: String, channels: Int, height: Int, width: Int, encoded: Boolean).
ImageDataFrame can be configured via the following MemoryDataLayer parameters: (1) dataframe_column_select ... a collection of dataframe SQL selection statements (e.g. "sampleId as id", "abs(height) as height"); (2) image_encoded ... indicates whether the image data is encoded (default: false); (3) dataframe_format ... the dataframe format (default: parquet).
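To make the column layout concrete, here is a minimal sketch in plain Scala with no Spark dependency. `ImageRow` and `ImageRowSchema` are hypothetical names introduced for illustration. The sketch assumes the first required column is named `label`, and it models optional columns as `Option` because their defaults are not stated here.

```scala
// Hypothetical model of one row; field names follow the column list above.
final case class ImageRow(
  label: String,                   // required
  data: Array[Byte],               // required
  id: Option[String] = None,       // optional
  channels: Option[Int] = None,    // optional
  height: Option[Int] = None,      // optional
  width: Option[Int] = None,       // optional
  encoded: Option[Boolean] = None  // optional
)

object ImageRowSchema {
  val requiredColumns: Set[String] = Set("label", "data")
  val optionalColumns: Set[String] = Set("id", "channels", "height", "width", "encoded")

  // Required columns that are absent from the given dataframe column names.
  def missingRequired(columns: Set[String]): Set[String] =
    requiredColumns -- columns

  // Columns that are neither required nor optional.
  def unrecognized(columns: Set[String]): Set[String] =
    columns -- requiredColumns -- optionalColumns
}
```

A dataframe whose identifier column is called `sampleId` would show up under `unrecognized`; a selection statement such as "sampleId as id" in dataframe_column_select is the kind of renaming that would map it onto the expected `id` column.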