# Installation

Add IndexTables to your project.
## Maven

```xml
<dependency>
  <groupId>io.indextables</groupId>
  <artifactId>indextables_spark</artifactId>
  <version>0.5.3_spark_3.5.3</version>
  <classifier>linux-x86_64-shaded</classifier>
</dependency>
```
## SBT

```scala
libraryDependencies += "io.indextables" % "indextables_spark" % "0.5.3_spark_3.5.3" classifier "linux-x86_64-shaded"
```
## Spark Shell

```shell
spark-shell --packages io.indextables:indextables_spark:0.5.3_spark_3.5.3:linux-x86_64-shaded
```
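Once the shell starts, you can confirm the JAR is on the classpath by loading the extensions class used later in this guide (this only checks visibility; it does not register the extensions):

```scala
// Throws ClassNotFoundException if the IndexTables JAR is not on the classpath.
Class.forName("io.indextables.spark.extensions.IndexTables4SparkExtensions")
```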
## Databricks

1. Download the shaded JAR from Maven Central:

   ```
   https://repo1.maven.org/maven2/io/indextables/indextables_spark/0.5.3_spark_3.5.3/indextables_spark-0.5.3_spark_3.5.3-linux-x86_64-shaded.jar
   ```

2. Upload it to a Unity Catalog volume (e.g., `/Volumes/my_catalog/my_schema/artifacts/`).
3. Create an init script that copies the JAR to the Databricks jars directory:

   ```sh
   #!/bin/sh
   cp /Volumes/my_catalog/my_schema/artifacts/indextables_spark-0.5.3_spark_3.5.3-linux-x86_64-shaded.jar /databricks/jars
   ```

4. Upload the init script to your volume and configure it in your cluster settings under **Advanced Options > Init Scripts**.
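A slightly more defensive variant of the init script (a sketch, using the same example volume paths as above) fails cluster startup with a clear message if the JAR is missing, instead of silently starting a cluster without IndexTables:

```sh
#!/bin/sh
# Sketch of a defensive init script; paths are the example paths from this guide.
set -e
JAR=/Volumes/my_catalog/my_schema/artifacts/indextables_spark-0.5.3_spark_3.5.3-linux-x86_64-shaded.jar
if [ ! -f "$JAR" ]; then
  echo "IndexTables JAR not found: $JAR" >&2
  exit 1
fi
cp "$JAR" /databricks/jars
```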
## Requirements

| Component | Version |
|---|---|
| Apache Spark | 3.5.3 |
| Java | 11 or later |
| Scala | 2.12 |
## Register SQL Extensions

SQL commands such as `MERGE SPLITS` and `PREWARM CACHE` require the IndexTables extensions. Because `spark.sql.extensions` is a static configuration, it cannot be changed with a runtime `SET` command; supply it when the session is launched:

```shell
spark-shell --conf spark.sql.extensions=io.indextables.spark.extensions.IndexTables4SparkExtensions
```

Or in `spark-defaults.conf`:

```
spark.sql.extensions=io.indextables.spark.extensions.IndexTables4SparkExtensions
```
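In application code, the equivalent is to pass the extensions class (the one named above) when building the SparkSession; this uses the standard Spark builder API, and the app name is a hypothetical placeholder:

```scala
import org.apache.spark.sql.SparkSession

// spark.sql.extensions is a static config, so it must be in place
// before the session is first created.
val spark = SparkSession.builder()
  .appName("indextables-demo") // hypothetical app name
  .config("spark.sql.extensions",
    "io.indextables.spark.extensions.IndexTables4SparkExtensions")
  .getOrCreate()
```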
## Next Steps

- Quickstart - Create your first index in 5 minutes
- First Production Index - Deploy to S3 or Azure