Packages

  • package root
  • package com
  • package here
  • package platform
  • package data
  • package processing

    This package provides the Data Processing Library for building distributed data processing applications.

    A Runner both implements the interface with the environment in which an application runs and starts the application. The application, in turn, is driven by a Driver, which controls and performs the distributed processing.

    Choose a Runner best suited for the environment where the application runs.

    The Driver performs one or more tasks, which read layers from input catalogs and write to one or more layers of an output catalog.

    The main entry point in the processing library is the com.here.platform.data.processing.driver.DriverBuilder class where you can add different kinds of tasks to the driver. The driver runs the tasks, and commits the final results to the output catalog.

    Tasks are implemented using one or more compilers.

    The simplest compiler is the direct compiler, which maps each input tile to N output tiles. The application needs to implement com.here.platform.data.processing.compiler.Direct1ToNCompiler.
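    As a sketch, the content-independent 1-to-N mapping behind the direct compiler can be illustrated with a plain function. TileKey and the quad-splitting scheme below are simplified stand-ins for illustration, not the library's actual types.

```scala
// Simplified stand-in for the library's tile key type; not the real API.
final case class TileKey(level: Int, index: Long)

// A typical content-independent 1-to-N mapping (here N = 4): each input
// tile maps to its four child tiles one zoom level down. The mapping
// depends only on the key, never on the tile's content, which is what
// makes this pattern incrementally compilable.
def mapTile(in: TileKey): Seq[TileKey] =
  (0 until 4).map(i => TileKey(in.level + 1, in.index * 4 + i))
```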

    Other, more complex compilation patterns are based on dependency tracking between input partitions and output partitions.

    The processing library supports the following patterns:

    - com.here.platform.data.processing.compiler.NonIncrementalCompiler: non-incremental compilation only
    - com.here.platform.data.processing.compiler.DepCompiler: non-incremental dependency calculation and incremental compilation
    - com.here.platform.data.processing.compiler.IncrementalDepCompiler: incremental dependency calculation and compilation
    - com.here.platform.data.processing.compiler.Direct1ToNCompiler: incremental compilation where every output tile depends on exactly one input tile, and this mapping is independent of tile content
    - com.here.platform.data.processing.compiler.DirectMToNCompiler: incremental compilation where every output tile depends on multiple input tiles, and this mapping is independent of tile content
    - com.here.platform.data.processing.compiler.MapGroupCompiler: incremental compilation where every output tile can depend on multiple input tiles, and this mapping depends on the tile content
    - com.here.platform.data.processing.compiler.RefTreeCompiler: fully managed, two-phase incremental compilation that can resolve references between input partitions; input/output dependency management is implemented by the library, so the developer doesn't need to provide this logic
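    The incremental patterns above all hinge on tracking which output partitions depend on which input partitions. A toy, self-contained illustration of that idea follows; the names are illustrative, not library API.

```scala
// Given the set of input partitions that changed and an output -> inputs
// dependency map, only outputs that depend on a changed input need to be
// recompiled; everything else can be reused from the previous run.
def outputsToRecompile(
    changedInputs: Set[String],
    deps: Map[String, Set[String]]): Set[String] =
  deps.collect { case (out, ins) if ins.exists(changedInputs) => out }.toSet
```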

    The application's main object normally mixes in a runner trait (such as PipelineRunner) to set up the Driver and to interface with the environment where the application runs. See the Main classes in the example compilers for more details.

    com.here.platform.data.processing.catalog, com.here.platform.data.processing.blobstore, and com.here.platform.data.processing.publisher contain utilities for accessing catalogs and payloads in a Spark-friendly way, providing an RDD-based abstraction over data and metadata. These classes are used by the processing library, but can also be used independently.

    Definition Classes
    data
  • package broadcast

    This package should be considered the preferred way of working with broadcast variables in a Data Processing Library compiler. It provides the basic functionality to create an instance of org.apache.spark.broadcast.Broadcast.

    This functionality is based on org.apache.spark.broadcast.Broadcast which offers a self-contained mechanism to create a single broadcast variable of a generic type.

    The toBroadcast() method is provided to create a broadcast variable and add the hash to the fingerprint of the com.here.platform.data.processing.driver.DriverContext, which is required in order for incremental compilations to work correctly.

    The package also enables developers to query the catalogs without the need to manage versions manually.

    Definition Classes
    processing
  • BroadcastCompiler

abstract class BroadcastCompiler[T] extends InputLayers with InputOptPartitioner

Offers the interface for compiling data that is to be stored in a Spark org.apache.spark.broadcast.Broadcast variable.

Compiled data is an object instance of type T. The suggested use of this functionality is to:

  • create the broadcast variables while setting up the driver.
  • pass the broadcast variables as parameters to the compiler being created.

This compiler class is the preferred way to work with broadcast variables in the Data Processing Library.

T

The type of the object to be stored in the broadcast variable.

Linear Supertypes
InputOptPartitioner, InputLayers, AnyRef, Any

Instance Constructors

  1. new BroadcastCompiler(context: DriverContext, id: String)(implicit arg0: ClassTag[T])

    BroadcastCompiler constructor that specifies the compiler ID.

    context

    The com.here.platform.data.processing.driver.DriverContext object that the compiler is running in.

    id

    The ID for the compiler. Used to store the fingerprint of the broadcast variable under an ID, which makes it independent of creation order.

  2. new BroadcastCompiler(context: DriverContext)(implicit arg0: ClassTag[T])

    BroadcastCompiler constructor.

    context

    The com.here.platform.data.processing.driver.DriverContext object that the compiler is running in.

Abstract Value Members

  1. abstract def calculateFingerprint(t: T): Int

    Calculates the fingerprint of the compiled data.

    t

    The data to calculate the fingerprint of.

    returns

    The calculated fingerprint.

    Attributes
    protected
    Note

    The fingerprint calculation needs to be stable, otherwise the correctness of incremental compilation will be compromised.
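    For example, a stable fingerprint for map-like compiled data can be obtained by sorting the entries before hashing, so the result does not depend on the map's iteration order. A minimal sketch, assuming the compiled data is a Map[String, Int]; the helper below is illustrative, not library API.

```scala
import scala.util.hashing.MurmurHash3

// Sort entries by key before hashing: the iteration order of an
// unordered Map is not guaranteed, so hashing entries in iteration
// order would produce an unstable fingerprint and compromise
// incremental compilation.
def fingerprintOf(data: Map[String, Int]): Int =
  MurmurHash3.orderedHash(data.toSeq.sortBy(_._1))
```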

  2. abstract def compile(input: RDD[(InKey, InMeta)], parallelism: Int): T

    Compiles an input metadata RDD into an object of type T.

    This function needs to be overridden to perform the compilation of the input metadata RDD.

    input

    The input metadata RDD.

    parallelism

    The parallelism of the RDDs. This parameter is normally needed to get partitioners from com.here.platform.data.processing.compiler.InputOptPartitioner and/or from com.here.platform.data.processing.compiler.OutputOptPartitioner traits.

    returns

    The compilation result.

    Attributes
    protected
  3. abstract def inLayers: Map[Id, Set[Id]]

    Represents layers of the input catalogs that you should query and provide to the compiler. These layers are grouped by input catalog and identified by catalog ID and layer ID.

    Definition Classes
    InputLayers
  4. abstract def inPartitioner(parallelism: Int): Option[Partitioner[InKey]]

    Specifies the partitioner to use when querying the input catalogs. If no partitioner is provided (this function returns None), the executor uses the default partitioner.

    parallelism

    The number of partitions the partitioner should partition the catalog into; this should match the parallelism of the Spark RDD containing the input partitions.

    returns

    The optional input partitioner with the parallelism specified.

    Definition Classes
    InputOptPartitioner
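A sketch of the optional-partitioner contract of inPartitioner: a hypothetical implementation might fall back to the default partitioner below some parallelism threshold and install a custom one otherwise. The key and partitioner types below are simplified stand-ins, not the library's.

```scala
// Simplified stand-in for the library's input key type.
final case class InKey(catalog: String, layer: String, partition: String)

// Simplified stand-in partitioner: routes a key to one of numPartitions
// buckets by hashing its partition name.
final class HashKeyPartitioner(val numPartitions: Int) {
  def partitionOf(key: InKey): Int =
    math.floorMod(key.partition.hashCode, numPartitions)
}

// Returning None tells the executor to use its default partitioner;
// returning Some installs the custom one with the given parallelism.
def inPartitioner(parallelism: Int): Option[HashKeyPartitioner] =
  if (parallelism <= 1) None
  else Some(new HashKeyPartitioner(parallelism))
```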

Concrete Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  6. final def compileBroadcast(parallelism: Int): Broadcast[T]

    Performs the compilation and the related operations to create the broadcast variable and to update the fingerprints. The id passed to the constructor, if specified, is used as the fingerprint ID.

    Class users should call this method. Note that this method updates mutable state shared through the com.here.platform.data.processing.driver.DriverContext object, so multiple calls to this function must happen in a deterministic order.

    parallelism

    The parallelism of the RDDs. This parameter is normally needed to get partitioners from com.here.platform.data.processing.compiler.InputOptPartitioner and/or from com.here.platform.data.processing.compiler.OutputOptPartitioner traits.

    returns

    The resulting broadcast variable.
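    Why the call order must be deterministic can be illustrated with an order-sensitive fingerprint combination; the driver-level accumulator sketched below is a toy stand-in, not the DriverContext implementation.

```scala
import scala.util.hashing.MurmurHash3

// If the overall fingerprint is an ordered combination of per-broadcast
// fingerprints, creating the broadcasts in a different order changes the
// combined result, which would spuriously invalidate incremental state.
def combinedFingerprint(perBroadcast: Seq[Int]): Int =
  MurmurHash3.orderedHash(perBroadcast)
```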

  7. final def compileBroadcastNoUpdateFingerprints(parallelism: Int): Broadcast[T]

    Performs the compilation and the related operations to create the broadcast variable without updating the fingerprints. WARNING: recompilation won't be triggered when a change in the broadcast object is detected.

    parallelism

    The parallelism of the RDDs. This parameter is normally needed to get partitioners from com.here.platform.data.processing.compiler.InputOptPartitioner and/or from com.here.platform.data.processing.compiler.OutputOptPartitioner traits.

    returns

    The resulting broadcast variable.

  8. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  9. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  10. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  11. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  12. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  13. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  14. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  15. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  16. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  17. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  18. def toString(): String
    Definition Classes
    AnyRef → Any
  19. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  20. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  21. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()

Inherited from InputOptPartitioner

Inherited from InputLayers

Inherited from AnyRef

Inherited from Any
