Supported Schema Types

IndexTables supports all common Spark data types. Each Spark type maps to a Tantivy field type, as listed below.

Primitive Types

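| Spark Type | Tantivy Type | Filter Pushdown | Fast Field |
|---|---|---|---|
| StringType | Text/String | Yes | No |
| IntegerType | I64 | Yes | Yes |
| LongType | I64 | Yes | Yes |
| FloatType | F64 | Yes | Yes |
| DoubleType | F64 | Yes | Yes |
| BooleanType | Bool | Yes | Yes |
| DateType | Date | Yes | Yes |
| TimestampType | DateTime | Yes | Yes |
| BinaryType | Bytes | No | No |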
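
The mapping in the table can be written down as a small lookup, which may be handy for checking a schema before indexing. This is only a sketch: `tantivyTypeFor` is a hypothetical helper, not part of IndexTables, and it simply restates the table using Spark's `DataType` objects. The column names in the sample schema are illustrative.

```scala
import org.apache.spark.sql.types._

// Hypothetical helper that restates the primitive-type table.
// It is not an IndexTables API.
def tantivyTypeFor(dt: DataType): Option[String] = dt match {
  case StringType             => Some("Text/String")
  case IntegerType | LongType => Some("I64")
  case FloatType | DoubleType => Some("F64")
  case BooleanType            => Some("Bool")
  case DateType               => Some("Date")
  case TimestampType          => Some("DateTime")
  case BinaryType             => Some("Bytes")
  case _                      => None // complex or unlisted types
}

// Illustrative schema; the column names are assumptions
val schema = StructType(Seq(
  StructField("id", StringType),
  StructField("views", LongType),
  StructField("score", DoubleType),
  StructField("published", DateType),
  StructField("payload", BinaryType)
))

// Print the Tantivy type each column would map to
schema.fields.foreach { f =>
  val target = tantivyTypeFor(f.dataType).getOrElse("n/a")
  println(s"${f.name}: ${f.dataType.simpleString} -> $target")
}
```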

Complex Types

| Spark Type | Tantivy Type | Filter Pushdown | Notes |
|---|---|---|---|
| StructType | JSON | Yes (nested) | Auto-detected |
| ArrayType | JSON | Partial | Element access |
| MapType | JSON | Yes (keys) | Keys as strings |
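
As a sketch of the kinds of filters the pushdown column refers to, the snippet below uses standard Spark syntax for nested struct fields, array elements, and map keys. The column names (`user`, `tags`, `attrs`) and the small in-memory DataFrame are assumptions for illustration. Against an IndexTables table the same expressions would be applied to the DataFrame returned by `spark.read.format("indextables")`, and whether each one is pushed down follows the table above.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{array_contains, struct}

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("complex-type-filters")
  .getOrCreate()
import spark.implicits._

// Illustrative data: a struct column, an array column, and a map column
val df = Seq(
  ("alice", Seq("spark", "search"), Map("env" -> "prod")),
  ("bob", Seq("batch"), Map("env" -> "dev"))
).toDF("name", "tags", "attrs")
  .select(struct($"name").as("user"), $"tags", $"attrs")

// StructType: filter on a nested field
val byName = df.filter($"user.name" === "alice")

// ArrayType: element access by position, or a membership test
val firstTag = df.filter($"tags".getItem(0) === "spark")
val hasTag = df.filter(array_contains($"tags", "search"))

// MapType: look up a value by its string key
val prod = df.filter($"attrs".getItem("env") === "prod")
```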

String vs Text

String fields (default):

  • Exact value matching
  • Full filter pushdown
  • Use for: IDs, categories, status codes

Text fields:

  • Tokenized for full-text search
  • IndexQuery only
  • Use for: Documents, logs, descriptions

// Configure field types when writing the index
df.write.format("indextables")
  .option("spark.indextables.indexing.typemap.title", "string")
  .option("spark.indextables.indexing.typemap.content", "text")
  .save("path")
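
When many fields need an explicit type, the option keys can be generated from a map rather than written by hand. The sketch below assumes only the `spark.indextables.indexing.typemap.<field>` key pattern shown above; `typemapOptions` is a hypothetical helper, and the write call in the trailing comment assumes the writer accepts the same `indextables` format name used for reads.

```scala
// Hypothetical helper: builds typemap option keys from field -> type pairs.
// Only "string" and "text" are taken from this page; other values are rejected.
def typemapOptions(fieldTypes: Map[String, String]): Map[String, String] = {
  val allowed = Set("string", "text")
  fieldTypes.map { case (field, tpe) =>
    require(allowed.contains(tpe), s"Unsupported field type '$tpe' for '$field'")
    s"spark.indextables.indexing.typemap.$field" -> tpe
  }
}

val options = typemapOptions(Map(
  "title"   -> "string", // exact matching, full filter pushdown
  "content" -> "text"    // tokenized for full-text search
))

options.foreach { case (k, v) => println(s"$k = $v") }

// Possible usage (assumes `df` exists and the write format name matches the read format):
// df.write.format("indextables").options(options).save("path")
```

`DataFrameWriter.options` accepts a Scala map, so the generated keys can be passed in one call instead of chaining `.option(...)` per field.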

Date and Timestamp

// Spark DateType -> Tantivy Date
df.filter($"date" === "2024-01-15")

// Spark TimestampType -> Tantivy DateTime
df.filter($"timestamp" >= "2024-01-15T10:00:00")
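
The string comparisons above rely on Spark casting the literal to a date or timestamp. A slightly more explicit variant uses typed literals and a half-open time range. This is a sketch on a small in-memory DataFrame (an assumption, so the snippet runs without an index); the same filter expressions would apply to a DataFrame loaded through the `indextables` format.

```scala
import java.sql.{Date, Timestamp}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.lit

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("date-timestamp-filters")
  .getOrCreate()
import spark.implicits._

// Illustrative rows; column names match the snippets above
val df = Seq(
  (Date.valueOf("2024-01-15"), Timestamp.valueOf("2024-01-15 09:30:00")),
  (Date.valueOf("2024-01-15"), Timestamp.valueOf("2024-01-15 11:45:00")),
  (Date.valueOf("2024-01-16"), Timestamp.valueOf("2024-01-16 08:00:00"))
).toDF("date", "timestamp")

// Typed date literal instead of an implicit string-to-date cast
val onDay = df.filter($"date" === lit(Date.valueOf("2024-01-15")))

// Half-open timestamp range: from 10:00 (inclusive) to 12:00 (exclusive)
val inWindow = df.filter(
  $"timestamp" >= lit(Timestamp.valueOf("2024-01-15 10:00:00")) &&
  $"timestamp" < lit(Timestamp.valueOf("2024-01-15 12:00:00"))
)
```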

Binary

Binary fields are stored but not searchable:

// Stored for retrieval, not filterable
val df = spark.read.format("indextables").load("path")
df.select("binary_field").show()
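
Because binary columns are not filterable, one workable pattern is to narrow the rows with a filterable column first and then inspect the binary payload in Spark. The sketch below uses an in-memory DataFrame and a hypothetical `id` column as stand-ins; `length` and `base64` are standard Spark SQL functions, not IndexTables features.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{base64, length}

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("binary-retrieval")
  .getOrCreate()
import spark.implicits._

// Illustrative rows: a filterable id plus a binary payload
val df = Seq(
  ("doc-1", "hello".getBytes("UTF-8")),
  ("doc-2", "indextables".getBytes("UTF-8"))
).toDF("id", "binary_field")

// Filter on the string column, then read the binary column
val payload = df
  .filter($"id" === "doc-1")
  .select(
    $"id",
    length($"binary_field").as("size_bytes"),   // byte count of the payload
    base64($"binary_field").as("payload_b64")   // printable encoding
  )

payload.show(truncate = false)
```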