Databricks deployment process
Integer IDs with Spark
There are several ways to generate numeric values in Spark that can serve as unique identifiers. Here are some functions that can be used for this:
- row_number()
- monotonically_increasing_id()
- rdd.zipWithIndex()
- hash functions
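
The functions listed above differ in the kind of IDs they produce, and a small model can make that easier to see. The sketch below is plain Python, not Spark code: it treats a dataset as a list of partitions and imitates each approach. The helper names (`zip_with_index`, `monotonic_ids`, `row_numbers`, `hash_id`) and the sample data are hypothetical. The bit layout in `monotonic_ids` follows the Spark documentation's description of the current implementation (partition ID in the upper 31 bits, record number in the lower 33 bits), and SHA-256 is only a stand-in for whichever hash function is actually used.

```python
import hashlib
from typing import List, Tuple

# Hypothetical sample data: three partitions holding five rows in total.
partitions: List[List[str]] = [["a", "b"], ["c"], ["d", "e"]]


def zip_with_index(parts: List[List[str]]) -> List[Tuple[str, int]]:
    """Imitates rdd.zipWithIndex(): consecutive 0-based indices,
    assigned partition by partition in order."""
    out = []
    index = 0
    for part in parts:
        for row in part:
            out.append((row, index))
            index += 1
    return out


def monotonic_ids(parts: List[List[str]]) -> List[Tuple[str, int]]:
    """Imitates monotonically_increasing_id(): unique and increasing,
    but not consecutive. Partition index goes in the upper bits,
    the record number within the partition in the lower 33 bits."""
    return [
        (row, (p << 33) + i)
        for p, part in enumerate(parts)
        for i, row in enumerate(part)
    ]


def row_numbers(parts: List[List[str]]) -> List[Tuple[str, int]]:
    """Imitates row_number() over a window ordered by the row value:
    consecutive 1-based numbers, which requires a global ordering."""
    ordered = sorted(row for part in parts for row in part)
    return [(row, n) for n, row in enumerate(ordered, start=1)]


def hash_id(row: str) -> int:
    """Stand-in for a hash-based ID: first 8 bytes of SHA-256 as an
    unsigned 64-bit integer. Deterministic, but collisions are possible."""
    digest = hashlib.sha256(row.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


print(zip_with_index(partitions))  # indices 0..4
print(monotonic_ids(partitions))   # ids jump by 2**33 between partitions
print(row_numbers(partitions))     # numbers 1..5
print(hash_id("a") == hash_id("a"))  # True: same input, same id
```

Roughly, the trade-off this illustrates: row_number() and rdd.zipWithIndex() give consecutive values but depend on an ordering across the whole dataset, monotonically_increasing_id() avoids that coordination at the cost of gaps, and hash functions give repeatable IDs derived from the data itself without a uniqueness guarantee.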