APIs in the Big Data World

Big data and REST APIs are often used together in modern data architectures. Here’s how they interact:

Ingestion gateway

  • Applications push events through REST endpoints
  • Gateway forwards those events to Kafka, Kinesis, or file landing zones
  • REST is the entry door, not the pipeline itself
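
The gateway role above can be sketched roughly as follows. This is a minimal illustration, not a production design: the `ingest` function, the `REQUIRED_FIELDS` schema, and the in-memory `event_log` are all hypothetical, and the deque only stands in for a real Kafka topic, Kinesis stream, or landing-zone writer.

```python
import json
import time
import uuid
from collections import deque

# Stand-in for a Kafka topic, Kinesis stream, or landing-zone writer (assumption).
event_log = deque()

# Hypothetical minimal event schema, for illustration only.
REQUIRED_FIELDS = {"event_type", "payload"}


def ingest(raw_body: bytes, sink=event_log):
    """Validate one POSTed event and hand it to the sink.

    Returns (http_status, response_dict), i.e. what a REST handler would send back.
    """
    try:
        event = json.loads(raw_body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return 400, {"error": "body is not valid JSON"}
    if not isinstance(event, dict) or not REQUIRED_FIELDS.issubset(event):
        return 400, {"error": "missing required fields"}

    # Stamp the event, then pass it on; the gateway does no processing itself.
    record = {**event, "id": str(uuid.uuid4()), "received_at": time.time()}
    sink.append(record)
    # 202 Accepted: the event is queued, not yet processed downstream.
    return 202, {"id": record["id"]}
```

Returning 202 rather than 200 reflects the bullet above: the REST call only admits the event, and the actual work happens later in the streaming or batch pipeline.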

Serving layer

  • Processed data lives in Hive, Elasticsearch, Druid, or Delta
  • APIs expose aggregated results to apps and dashboards
  • REST is the read interface on top of heavy compute
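
A serving endpoint of this kind might look like the sketch below. It assumes the heavy aggregation has already run elsewhere; an in-memory SQLite table stands in for Hive, Druid, or Delta, and the table name, the made-up demo rows, the `get_daily_totals` handler, and the `MAX_LIMIT` cap are all hypothetical.

```python
import json
import sqlite3

MAX_LIMIT = 100  # hypothetical cap so one request cannot pull the whole table


def build_demo_store():
    """In-memory SQLite standing in for the real serving store (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE daily_totals (day TEXT, country TEXT, orders INTEGER)")
    # Made-up rows, as if written by an upstream batch job.
    conn.executemany(
        "INSERT INTO daily_totals VALUES (?, ?, ?)",
        [("2024-01-01", "US", 120), ("2024-01-01", "DE", 45), ("2024-01-02", "US", 130)],
    )
    return conn


def get_daily_totals(conn, country, limit=10):
    """Handler body for a hypothetical GET /daily-totals?country=..&limit=.."""
    limit = max(1, min(int(limit), MAX_LIMIT))
    rows = conn.execute(
        "SELECT day, orders FROM daily_totals WHERE country = ? ORDER BY day LIMIT ?",
        (country, limit),
    ).fetchall()
    return json.dumps(
        {"country": country, "rows": [{"day": day, "orders": orders} for day, orders in rows]}
    )
```

The handler only reads precomputed aggregates and bounds the result size, which is what keeps the REST layer thin relative to the compute underneath it.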

Control plane

  • Spark job submission via REST
  • Kafka topic management
  • Cluster monitoring and scaling
  • Authentication and governance
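
As one illustration of Spark job submission via REST, the sketch below builds a batch request for Apache Livy, which accepts `POST /batches` with a JSON body. Livy is an assumption here (the notes do not name a specific service), and the host name, jar path, and class name are hypothetical. The request is only constructed, not sent, so the snippet runs without a live cluster.

```python
import json
import urllib.request


def build_livy_batch_request(livy_url, jar_path, class_name, args=()):
    """Build (but do not send) a Livy batch submission request."""
    body = {"file": jar_path, "className": class_name, "args": list(args)}
    return urllib.request.Request(
        url=f"{livy_url.rstrip('/')}/batches",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint and job, for illustration only.
req = build_livy_batch_request(
    "http://livy.example.internal:8998",
    "hdfs:///jobs/etl.jar",
    "com.example.DailyEtl",
    ["2024-01-01"],
)
# urllib.request.urlopen(req) would submit the job; omitted because it needs a live server.
```

Note that the REST call carries only a small job description. The jar and the data stay in cluster storage, which is why REST suits the control plane but not the data path.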

Microservices boundary

  • Each service owns a slice of data
  • APIs expose curated views
  • Internal pipelines stay streaming or batch
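
A curated view can be as simple as an allow-list of fields. In the sketch below, the `CustomerRecord` shape, its field names, and `PUBLIC_FIELDS` are all hypothetical; the point is only that the service decides what crosses its API boundary.

```python
from dataclasses import asdict, dataclass


@dataclass
class CustomerRecord:
    """Internal record owned by one service (hypothetical shape)."""
    customer_id: str
    email: str
    lifetime_value: float
    raw_clickstream_path: str


# Only these fields are exposed through the API; everything else stays internal.
PUBLIC_FIELDS = ("customer_id", "lifetime_value")


def curated_view(record):
    """Return the API-facing projection of an internal record."""
    full = asdict(record)
    return {name: full[name] for name in PUBLIC_FIELDS}
```

Using an allow-list rather than a deny-list means a field added to the internal record later is not exposed by accident.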

What REST is NOT in Big Data

  • Not used for bulk petabyte transfer
  • Not used inside Spark transformations
  • Not the transport between Kafka and processors

Examples of APIs

https://docs.redis.com/latest/rs/references/rest-api/

https://rapidapi.com/search/big-data

https://www.kaggle.com/discussions/general/315241

#apiinbigdata #kafka #sparks

Ver 6.0.1

Last change: 2026-01-17