object RefTreeUtils
Type Members
-
type
Refs = Map[(Id, Id), RDD[(Key, Map[RefName, Set[Key]])]]
Type for storing References in an RDD.
Represents a map. The key of the map is (catalog of the source layer, source layer). The value of the map is an RDD of (source partition key, [reference name -> set of target partition keys]). In other words, it maps partitions on the source layer to the partitions they reference on the target layer.
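To make the shape of Refs easier to picture, here is a minimal sketch that mirrors it with plain Scala collections. The aliases Id, Key and RefName are simplified String stand-ins, and Seq replaces RDD so the snippet runs without Spark; all of these, along with LocalRefs, targets and the sample names, are assumptions for illustration and not the library's definitions.

```scala
object RefsShapeSketch {
  // Simplified stand-ins (assumptions): the real Id, Key and RefName types
  // come from the library, and the real value type is an RDD, not a Seq.
  type Id = String
  type Key = String
  type RefName = String
  type LocalRefs = Map[(Id, Id), Seq[(Key, Map[RefName, Set[Key]])]]

  // One source layer ("catalogA", "roads") with two source partitions.
  val refs: LocalRefs = Map(
    ("catalogA", "roads") -> Seq(
      "tile-1" -> Map("signs" -> Set("tile-1", "tile-2")),
      "tile-2" -> Map("signs" -> Set.empty[Key])
    )
  )

  // All target keys referenced by one source partition under one reference name.
  def targets(refs: LocalRefs, layer: (Id, Id), src: Key, name: RefName): Set[Key] =
    refs.getOrElse(layer, Seq.empty)
      .filter { case (key, _) => key == src }
      .flatMap { case (_, byName) => byName.getOrElse(name, Set.empty[Key]) }
      .toSet
}
```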
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
calcRefs(refTree: RefTree, inData: RDD[(Key, Meta)], resolveIn: ResolveInFn, numThreads: Int, storageLevel: StorageLevel, layerPartitioning: Map[(Id, Id), Scheme], debugConfig: ExecutorDebugConfig): Refs
Calculates all references using the given resolver.
This function takes every source layer in the RefTree that has non-empty references and resolves all of its partitions, producing a map from subject catalog-layer pairs to a mapping from source to target partition keys.
If numThreads > 1, calls are executed in parallel in an ExecutorService.
- refTree
a RefTree defining the reference structure
- inData
input data, will be accessed multiple times and must therefore be persisted or easy to rebuild from persisted data
- resolveIn
the interface implementing the function that resolves references
- numThreads
the number of resolving threads
- storageLevel
the storage level to be used when persisting
- layerPartitioning
layer partitioning configuration
- debugConfig
debug configuration of the executor
- returns
a map of one RDD per layer, with all the references for this layer
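The threading behaviour described for calcRefs (sequential unless numThreads > 1, otherwise an ExecutorService) can be pictured with the sketch below. It is not the library's implementation: resolveAll and resolveLayer are hypothetical names, and the per-layer work is reduced to an arbitrary function.

```scala
import java.util.concurrent.{Callable, Executors}

object ParallelResolveSketch {
  // Runs one task per layer: sequentially for numThreads <= 1, otherwise on a
  // fixed-size thread pool. `resolveLayer` is a hypothetical per-layer resolver.
  def resolveAll[L, R](layers: Seq[L], numThreads: Int)(resolveLayer: L => R): Map[L, R] =
    if (numThreads <= 1) {
      layers.map(l => l -> resolveLayer(l)).toMap
    } else {
      val pool = Executors.newFixedThreadPool(numThreads)
      try {
        // Submit everything first so the tasks can overlap.
        val futures = layers.map { l =>
          l -> pool.submit(new Callable[R] { def call(): R = resolveLayer(l) })
        }
        // Future.get blocks until the task finishes and rethrows failures
        // wrapped in an ExecutionException.
        futures.map { case (l, f) => l -> f.get() }.toMap
      } finally pool.shutdown()
    }
}
```

A fixed pool bounds the number of resolutions in flight, which is the role numThreads plays in the parameter list above.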
-
def
calcSubRefs(refTree: RefTree, refStructure: Seq[RefBase], subset: RDD[(Key, Meta)], inData: RDD[(Key, Meta)], resolveIn: ResolveInFn, numThreads: Int, storageLevel: StorageLevel, layerPartitioning: Map[(Id, Id), Scheme], debugConfig: ExecutorDebugConfig, precomputedRefs: Refs = Map.empty): Refs
Calculates all references required to compile a given subset of input partitions using the given resolver.
Iterates over the RefTree structure and, for each layer that has references, resolves all partitions from subset on inData for this layer and filters for the relevant references. Inner references are then calculated recursively and merged into the result.
If numThreads > 1, calls are executed in parallel in an ExecutorService.
- refTree
a RefTree defining the reference structure. It is only used for consistency checks of the data returned by the resolver
- refStructure
the sequence of entries on a given level of the passed refTree. Resolution starts at this level, using the partitions in the passed subset
- subset
a subset on inData which will be resolved. All resolved tiles will again be recursively resolved
- inData
the entire input data, will be accessed multiple times and must therefore be persisted or easy to recompute
- resolveIn
the interface implementing the function that resolves references
- numThreads
the number of resolving threads
- storageLevel
the storage level to be used when persisting
- layerPartitioning
layer partitioning configuration
- debugConfig
debug configuration of the executor
- precomputedRefs
precomputed references that can be used to skip some resolution steps. Each resolution step corresponds to a specific level of the RefTree (i.e. a set of reference names). If an RDD exists for the given layer and a key's entry contains a set of keys for all reference names of that level, the precomputed references are used for that key; otherwise resolveIn is called
- returns
a Map of one RDD per layer, with all the references for this layer
- Note
This implementation may resolve the same tile multiple times, in case the same layer appears on different levels in the tree. For large subsets it can be significantly slower than using RefTreeUtils.calcRefs and resolving the entire input data
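The rule for precomputedRefs described above can be sketched for a single key as follows. This is an illustration under assumptions, not the library's code: refsFor, levelRefNames and the resolve callback are hypothetical, Key and RefName are String stand-ins, and a plain Map replaces the per-layer RDD.

```scala
object PrecomputedRefsSketch {
  type Key = String
  type RefName = String

  // Returns the precomputed entry when it has a key set for every reference
  // name of the current level; otherwise falls back to the resolver.
  def refsFor(
      key: Key,
      levelRefNames: Set[RefName],
      precomputed: Map[Key, Map[RefName, Set[Key]]],
      resolve: (Key, Set[RefName]) => Map[RefName, Set[Key]]
  ): Map[RefName, Set[Key]] =
    precomputed.get(key) match {
      case Some(entry) if levelRefNames.subsetOf(entry.keySet) =>
        // Keep only the reference names that belong to this level.
        entry.filter { case (name, _) => levelRefNames(name) }
      case _ =>
        resolve(key, levelRefNames)
    }
}
```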
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
collectRefSources(refTree: RefTree): Map[RefName, (Id, Id)]
Creates a Map that for each ref name contains the source catalog and layer.
- refTree
a RefTree to convert to the map.
- returns
a Map that maps reference names to source layers.
-
def
collectRefTargets(refTree: RefTree): Map[RefName, (Id, Id)]
Creates a Map that for each ref name contains the target catalog and layer.
- refTree
RefTree to convert to the map
- returns
a Map which contains the target catalog and layer for each ref name
-
def
collectSourceLayers(refTree: RefTree): Set[(Id, Id)]
Collects all layers which are the source of a reference.
- refTree
a RefTree
- returns
a Set with all layers which are the source of a reference
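What the three collect helpers above return can be illustrated on a toy tree. The RefNode case class below is a hypothetical, simplified model invented for this sketch; it is not the library's RefTree definition, and refSources, refTargets and sourceLayers are illustrative names only.

```scala
object CollectSketch {
  type Id = String
  type RefName = String
  type Layer = (Id, Id) // (catalog, layer)

  // Hypothetical, simplified node; NOT the library's RefTree definition.
  final case class RefNode(
      name: RefName,
      source: Layer,
      target: Layer,
      children: Seq[RefNode] = Nil
  )

  // Depth-first list of every node in the tree.
  private def flatten(nodes: Seq[RefNode]): Seq[RefNode] =
    nodes.flatMap(n => n +: flatten(n.children))

  // Reference name -> source (catalog, layer).
  def refSources(roots: Seq[RefNode]): Map[RefName, Layer] =
    flatten(roots).map(n => n.name -> n.source).toMap

  // Reference name -> target (catalog, layer).
  def refTargets(roots: Seq[RefNode]): Map[RefName, Layer] =
    flatten(roots).map(n => n.name -> n.target).toMap

  // Every layer that is the source of at least one reference.
  def sourceLayers(roots: Seq[RefNode]): Set[Layer] =
    flatten(roots).map(_.source).toSet
}
```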
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
def
gatherReferences(refTree: RefTree, impacted: RDD[(Key, Unit)], refs: Refs, inData: RDD[(Key, Meta)], storageLevel: StorageLevel): RDD[(Key, (Key, Meta))]
Gathers all references needed by the partitions in impacted.
- refTree
a RefTree defining the structure of the references
- impacted
partition keys for which references shall be gathered
- refs
the Refs object with references
- inData
input data, needed to add Meta to references. Will be accessed multiple times and must therefore be persisted to a StorageLevel other than None
- storageLevel
the storage level to be used when persisting
- returns
all references for the impacted partitions (srcKey, (refKey, refMeta))
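The (srcKey, (refKey, refMeta)) output shape of gatherReferences can be sketched with plain collections. The snippet assumes String stand-ins for Key, Meta and RefName, uses Seq, Set and Map in place of RDDs, and assumes that referenced keys missing from the input data are simply skipped; none of this is confirmed by the library.

```scala
object GatherSketch {
  type Key = String
  type Meta = String
  type RefName = String

  // For every impacted key, emits one (srcKey, (refKey, refMeta)) entry per
  // referenced key that is present in the input data. Dropping references
  // that are absent from inData is an assumption of this sketch.
  def gather(
      impacted: Set[Key],
      refs: Seq[(Key, Map[RefName, Set[Key]])],
      inData: Map[Key, Meta]
  ): Seq[(Key, (Key, Meta))] =
    for {
      (src, byName) <- refs if impacted(src)
      refKey <- byName.values.flatten.toSeq.distinct
      meta <- inData.get(refKey).toSeq
    } yield src -> (refKey -> meta)
}
```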
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
getImpacted(refTree: RefTree, inChanges: RDD[(Key, Change)], refs: Refs): RDD[(Key, Unit)]
Calculates the partitions impacted by a change.
It uses a RefTree to know the structure of the references, and pre-resolved references in a Refs data structure to follow them.
- refTree
a RefTree defining the structure of the references
- inChanges
the partitions that need to be checked for impact
- refs
the Refs pre-resolved references to use for reference resolution
- returns
an RDD with all partitions impacted by the inChanges according to the given RefTree. This RDD does NOT have duplicate entries
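One plausible reading of impact calculation, reduced to a single reference level, is sketched below. It assumes that a partition is impacted if it changed itself or if one of its resolved references points at a changed partition; the real method follows the full RefTree, and the names here (impacted, changed) are hypothetical.

```scala
object ImpactSketch {
  type Key = String
  type RefName = String

  // A partition counts as impacted here if it changed itself or if any of its
  // resolved references points at a changed partition (single level only).
  def impacted(
      changed: Set[Key],
      refs: Seq[(Key, Map[RefName, Set[Key]])]
  ): Set[Key] = {
    val referencing = refs.collect {
      case (src, byName) if byName.values.exists(_.exists(changed)) => src
    }
    changed ++ referencing // a Set, so there are no duplicate entries
  }
}
```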
-
def
groupReferences(gathered: RDD[(Key, (Key, Meta))]): RDD[(Key, Map[Key, Meta])]
Groups the references gathered by gatherReferences into Maps.
A sanity check enforces that each input key has exactly one metadata value.
- gathered
references gathered and partitioned by key, one entry per reference
- returns
the references grouped in a Map, for every subject key
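The grouping and the one-metadata-per-key check can be sketched on local collections as follows. Key and Meta are String stand-ins, and the use of require (which throws IllegalArgumentException) is an assumption; the source does not say which exception the real sanity check raises.

```scala
object GroupSketch {
  type Key = String
  type Meta = String

  // Groups (subjectKey, (refKey, refMeta)) entries into one Map per subject key.
  def group(gathered: Seq[(Key, (Key, Meta))]): Map[Key, Map[Key, Meta]] =
    gathered.groupBy(_._1).map { case (subject, entries) =>
      val byRef = entries.map(_._2).groupBy(_._1).map { case (refKey, pairs) =>
        val metas = pairs.map(_._2).distinct
        // Sanity check: exactly one metadata value per referenced key.
        require(metas.size == 1, s"conflicting metadata for $refKey: $metas")
        refKey -> metas.head
      }
      subject -> byRef
    }
}
```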
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
def
updateRefs(refTree: RefTree, inChanges: RDD[(Key, Change)], oldRefs: Refs, resolveIn: ResolveInFn, numThreads: Int, storageLevel: StorageLevel, layerPartitioning: Map[(Id, Id), Scheme], debugConfig: ExecutorDebugConfig): Refs
Updates the references using the given resolver.
- refTree
a RefTree defining the reference structure
- inChanges
input data with the changes, will be accessed multiple times and must therefore be persisted or easy to recompute
- oldRefs
the references from the previous map version, where the impact of inChanges has to be applied. It must have been created with the same reference structure as given in RefTree
- resolveIn
the interface implementing the function that resolves references
- numThreads
the number of resolving threads
- storageLevel
the storage level to be used when persisting
- layerPartitioning
layer partitioning configuration
- debugConfig
debug configuration of the executor
- returns
the Refs data structures, where all RDDs are persisted
-
def
verifyRefTreeAgainstInputLayers(refTree: RefTree, inLayers: Map[Id, Set[Id]]): Unit
Verifies that the catalogs and layers mentioned in the RefTree are declared as input layers of the compiler. Otherwise throws IllegalArgumentException.
- refTree
the RefTree to be verified
- inLayers
the input layers configured for the compiler
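The check can be sketched as below, assuming the (catalog, layer) pairs mentioned in the RefTree have already been extracted into a set. The names verify and mentioned are hypothetical, and Id is a String stand-in.

```scala
object VerifySketch {
  type Id = String

  // `mentioned` stands in for the (catalog, layer) pairs extracted from a
  // RefTree; `inLayers` maps each input catalog to its declared layers.
  def verify(mentioned: Set[(Id, Id)], inLayers: Map[Id, Set[Id]]): Unit =
    mentioned.foreach { case (catalog, layer) =>
      // require throws IllegalArgumentException when the condition is false.
      require(
        inLayers.get(catalog).exists(_.contains(layer)),
        s"layer '$layer' of catalog '$catalog' is not a declared input layer"
      )
    }
}
```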
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()