Packages

package broadcast

Provides the basic functionality to perform org.apache.spark.broadcast.Broadcast creation.

To use a broadcast variable in the processing library, developers are required to add the variable's hash to the fingerprints. This ensures that incremental compilations are not compromised. This package object provides the method toBroadcast() to use for that purpose. The package also offers a functionality to query catalogs by properly managing versions. The functionality is based on the org.apache.spark.broadcast.Broadcast which offers a self-contained mechanism to create a single broadcast variable of a generic type.

This compiler class is the preferred way of working with broadcast variables in the Data Processing Library.

Linear Supertypes
AnyRef, Any
Ordering
  1. Alphabetic
  2. By Inheritance
Inherited
  1. broadcast
  2. AnyRef
  3. Any
  1. Hide All
  2. Show All
Visibility
  1. Public
  2. All

Type Members

  1. abstract class BroadcastCompiler[T] extends InputLayers with InputOptPartitioner

    Offers the interface for the data compilation supposed to be stored on a Spark org.apache.spark.broadcast.Broadcast variable.

    Offers the interface for the data compilation supposed to be stored on a Spark org.apache.spark.broadcast.Broadcast variable.

    Compiled data is an object instance of type T. The suggested use of this functionality is to: - perform the broadcast creations during the driver setting up. - send the broadcast variables to the compiler which is under creation as a parameter.

    T

    the type of the object to be stored in the broadcast variable

Value Members

  1. def queryInputMeta(context: DriverContext, inLayers: Map[String, Set[String]], inPartitioner: InputOptPartitioner, parallelism: Int): JavaPairRDD[InKey, InMeta]

    Gets the input metadata from the input catalogs at the version that needs to be compiled.

    Gets the input metadata from the input catalogs at the version that needs to be compiled.

    context

    the com.here.platform.data.processing.java.driver.DriverContext object

    inLayers

    the input layers

    inPartitioner

    the input partition option, if empty a default hash based partitioner on the default parallelism will be used

    parallelism

    the parallelism of both the output RDD

    returns

    the input metadata RDD

  2. def toBroadcast[T](context: DriverContext, t: T, fingerprint: Int, fingerprintId: String, clazz: Class[T]): Broadcast[T]

    Creates a broadcast variable out of an object instance of type T.

    Creates a broadcast variable out of an object instance of type T.

    Please note that this method updates a mutable content in the com.here.platform.data.processing.java.driver.DriverContext object: it changes a shared status. Multiple calls to this functionality should then happen in a deterministic order.

    The normal use of this functionality is to: - perform the broadcast creations during the driver setting up. - send the broadcast variables to the compiler which is under creation as a parameter.

    T

    the type of the broadcast variable under creation

    context

    the com.here.platform.data.processing.java.driver.DriverContext object

    t

    the object to create a broadcast variable out ot it

    fingerprint

    the fingerprint of the parameter t that will be stored in the com.here.platform.data.processing.driver.Fingerprints

    fingerprintId

    The ID used to store the fingerprint.

    clazz

    the Class of type T

    returns

    the broadcast variable

    Note

    the stability of the provided fingerprint calculation is crucial to not compromise subsequent incremental compilations

  3. def toBroadcast[T](context: DriverContext, t: T, fingerprint: Int, clazz: Class[T]): Broadcast[T]

    Creates a broadcast variable out of an object instance of type T.

    Creates a broadcast variable out of an object instance of type T.

    Please note that this method updates a mutable content in the com.here.platform.data.processing.java.driver.DriverContext object: it changes a shared status. Multiple calls to this functionality should then happen in a deterministic order.

    The normal use of this functionality is to: - perform the broadcast creations during the driver setting up. - send the broadcast variables to the compiler which is under creation as a parameter.

    T

    the type of the broadcast variable under creation

    context

    the com.here.platform.data.processing.java.driver.DriverContext object

    t

    the object to create a broadcast variable out ot it

    fingerprint

    the fingerprint of the parameter t that will be stored in the com.here.platform.data.processing.driver.Fingerprints

    clazz

    the Class of type T

    returns

    the broadcast variable

    Note

    the stability of the provided fingerprint calculation is crucial to not compromise subsequent incremental compilations

  4. def toBroadcast[T](context: DriverContext, t: T, fingerprintId: String, clazz: Class[T]): Broadcast[T]

    Creates a broadcast variable out of an object instance of type T.

    Creates a broadcast variable out of an object instance of type T.

    Please note that this method updates a mutable content in the com.here.platform.data.processing.java.driver.DriverContext object: it changes a shared status. Multiple calls to this functionality should then happen in a deterministic order.

    The normal use of this functionality is to: - perform the broadcast creations during the driver setting up. - send the broadcast variables to the compiler which is under creation as a parameter.

    T

    the type of the broadcast variable under creation

    context

    the com.here.platform.data.processing.java.driver.DriverContext object

    t

    the object to create a broadcast variable out ot it

    fingerprintId

    The ID used to store the fingerprint.

    clazz

    the Class of type T

    returns

    the broadcast variable

    Note

    as object hashes are added to fingerprints, objects of type T are supposed to have stable hashCode() methods, implemented based of the actual object content

  5. def toBroadcast[T](context: DriverContext, t: T, clazz: Class[T]): Broadcast[T]

    Creates a broadcast variable out of an object instance of type T.

    Creates a broadcast variable out of an object instance of type T.

    Please note that this method updates a mutable content in the com.here.platform.data.processing.java.driver.DriverContext object: it changes a shared status. Multiple calls to this functionality should then happen in a deterministic order.

    The normal use of this functionality is to: - perform the broadcast creations during the driver setting up. - send the broadcast variables to the compiler which is under creation as a parameter.

    T

    the type of the broadcast variable under creation

    context

    the com.here.platform.data.processing.java.driver.DriverContext object

    t

    the object to create a broadcast variable out ot it

    clazz

    the Class of type T

    returns

    the broadcast variable

    Note

    as object hashes are added to fingerprints, objects of type T are supposed to have stable hashCode() methods, implemented based of the actual object content

  6. def toBroadcastNoUpdateFingerprints[T](context: DriverContext, t: T, clazz: Class[T]): Broadcast[T]

    Creates a broadcast variable out of an object instance of type T.

    Creates a broadcast variable out of an object instance of type T. WARNING: the recompilation won't be triggered if a change of the broadcast object is detected.

    The normal use of this functionality is to: - perform the broadcast creations during the driver setting up. - send the broadcast variables to the compiler which is under creation as a parameter.

    T

    the type of the broadcast variable under creation

    context

    the com.here.platform.data.processing.java.driver.DriverContext object

    t

    the object to create a broadcast variable out ot it

    clazz

    the Class of type T

    returns

    the broadcast variable

    Note

    As object hashes AREN'T added to the context fingerprints, so the recompilation won't be triggered if a change of the object is detected.

Inherited from AnyRef

Inherited from Any

Ungrouped