8. Applying spatial analytics in the Era of Big Data¶

Most of the methods in spatial data science were developed in a time when the data volumes were much smaller. In today’s world it is quite typical to have very large datasets (e.g. billions of points) which bring challenges to store, process and analyze such data. For example conducting clustering with billion points using K-Means++ algorithm takes ages, hence, conducting such clustering task have required developing a whole new approach that allows parallelizing the clustering process using a dedicated K-Means|| algorithm (“Scalable K-Means”, Bahmani et al. 2012).

Warning

Nothing here. This is just a placeholder for demo purposes.

Would include information about:

parallel computing (take everything out of your computer)

distributed computing on a cluster

how to measure the performance of your code

common pitfalls that slows down your analytical processes