SPARK-21195 - Automatically register new metrics from sources and wire default registry
SPARK-20952 - ParquetFileFormat should forward TaskContext to its ForkJoinPool
SPARK-20001 (SPARK-13587) - Support PythonRunner executing inside a Conda env (and R)
SPARK-17059 - Allow FileFormat to specify partition pruning strategy via splits
SPARK-24345 - Improve ParseError stop location when offending symbol is a token
SPARK-23795 - Make AbstractLauncher#self() protected
SPARK-18079 - CollectLimitExec.executeToIterator should perform per-partition limits
SPARK-15777 (Partial fix) - Catalog federation
- Make ExternalCatalog configurable beyond in-memory and Hive
- FileIndex for catalog tables is provided by the external catalog instead of the default implementation
Better pushdown for IN expressions in Parquet via UserDefinedPredicate (see SPARK-17091 for the original issue)
SafeLogging implemented for the following files:
- core: Broadcast, CoarseGrainedExecutorBackend, CoarseGrainedSchedulerBackend, Executor, MemoryStore, SparkContext, TorrentBroadcast
- kubernetes: ExecutorPodsAllocator, ExecutorPodsLifecycleManager, ExecutorPodsPollingSnapshotSource, ExecutorPodsSnapshot, ExecutorPodsWatchSnapshotSource, KubernetesClusterSchedulerBackend
- yarn: YarnClusterSchedulerBackend, YarnSchedulerBackend
SPARK-26626 - Limit the maximum size of repeatedly substituted aliases
- Gradle plugin to easily create custom docker images for use with k8s
- Filter rLibDir by exists so that daemon.R references the correct file (#460)
- Add pre-installed conda configuration and use it to find the rlib directory (#700)
- Support Arrow serialization of Python 2 strings (#678)
- SPARK-25908 - Removal of
- SPARK-25862 - Removal of
- SPARK-26127 - Removal of deprecated setters from tree regression and classification models
- SPARK-25867 - Removal of KMeans computeCost
- SPARK-26216 - Change to UserDefinedFunction type
- SPARK-26323 - Scala UDF null checking
- SPARK-26580 - Bring back Scala 2.11 null behaviour for primitive types
- SPARK-26133 - Old OneHotEncoder
- SPARK-11215 - StringIndexer multi column support
- SPARK-26616 - No document frequency in IDFModel