Installation

Add IndexTables to your project.

Maven

<dependency>
  <groupId>io.indextables</groupId>
  <artifactId>indextables4spark_2.12</artifactId>
  <version>0.4.0_3.5.3</version>
</dependency>

SBT

libraryDependencies += "io.indextables" %% "indextables4spark" % "0.4.0_3.5.3"

Spark Shell

spark-shell --packages io.indextables:indextables4spark_2.12:0.4.0_3.5.3

Databricks

  1. Download the shaded JAR from the releases page
  2. Upload it to a Unity Catalog volume (e.g., /Volumes/my_catalog/my_schema/artifacts/)
  3. Create an init script that copies the JAR to the Databricks jars directory:

#!/bin/sh
cp /Volumes/my_catalog/my_schema/artifacts/indextables_spark-0.4.0-linux-x86_64-shaded.jar /databricks/jars

  4. Upload the init script to your volume and configure it in your cluster settings under Advanced Options > Init Scripts
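The one-line script above works, but init scripts that fail silently can be hard to debug. A slightly more defensive variant is sketched below, assuming the same volume and JAR paths as in steps 2-3 (adjust `SRC` to wherever you uploaded the JAR):

```shell
#!/bin/sh
# Copy the IndexTables shaded JAR into the Databricks driver/executor classpath.
# Fail loudly if the JAR is missing so the cluster event log shows the cause.
SRC=/Volumes/my_catalog/my_schema/artifacts/indextables_spark-0.4.0-linux-x86_64-shaded.jar
DST=/databricks/jars

if [ -f "$SRC" ]; then
  cp "$SRC" "$DST"
else
  echo "IndexTables JAR not found at $SRC" >&2
  exit 1
fi
```

Exiting non-zero makes the cluster surface the init-script failure instead of starting without the library on the classpath.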

Requirements

Component       Version
Apache Spark    3.5.3
Java            11 or later
Scala           2.12

Register SQL Extensions

To use SQL commands like MERGE SPLITS and PREWARM CACHE, register the extensions. Note that spark.sql.extensions is a static configuration: it must be set before the SparkSession is created (setting it with a runtime SET statement has no effect). For example, when launching the shell:

spark-shell --conf spark.sql.extensions=io.indextables.spark.extensions.IndexTables4SparkExtensions

Or in spark-defaults.conf:

spark.sql.extensions=io.indextables.spark.extensions.IndexTables4SparkExtensions
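Applications that construct their own SparkSession can set the same property programmatically at build time. A minimal sketch (the app name is arbitrary):

```scala
import org.apache.spark.sql.SparkSession

// Register the IndexTables SQL extensions when the session is built.
// spark.sql.extensions is static, so it cannot be changed on a running session.
val spark = SparkSession
  .builder()
  .appName("indextables-example")
  .config(
    "spark.sql.extensions",
    "io.indextables.spark.extensions.IndexTables4SparkExtensions")
  .getOrCreate()
```

With the extensions registered this way, commands such as MERGE SPLITS are available through `spark.sql(...)` in the application.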

Next Steps