Batch Processing


Batch processing is an efficient way of processing high volumes of data in which a group of transactions is collected over a period of time. Data is collected and then processed in batches, returning a batch result (Hadoop, for example, is focused on batch data processing). A batch system receives arriving data, combines it with historical data, and recomputes results by iterating over the entire combined data set.

The batch layer operates on the entire data set, which yields the most accurate results. That accuracy comes at the cost of high latency, because computing over the full data set takes a long time.
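The recompute-everything pattern described above can be sketched in a few lines of Python. This is only a minimal illustration, not how Hadoop or any particular framework implements it: the function name `recompute_totals`, the `(account, amount)` record shape, and the sample data are all hypothetical.

```python
from collections import defaultdict


def recompute_totals(historical, arriving):
    """Batch-style recomputation (illustrative sketch).

    Merges the newly arrived records into the historical data set,
    then iterates over the entire combined set to rebuild the result.
    """
    # Combine arriving data with everything collected so far.
    combined = historical + arriving

    # Recompute from scratch: every record is visited again,
    # which is why latency grows with the size of the data set.
    totals = defaultdict(float)
    for account, amount in combined:
        totals[account] += amount

    return combined, dict(totals)


# Hypothetical sample data: (account, amount) transactions.
historical = [("alice", 10.0), ("bob", 5.0)]
arriving = [("alice", 2.5), ("carol", 7.0)]

combined, totals = recompute_totals(historical, arriving)
print(totals)
```

Each run touches every record, old and new, so the result always reflects the complete data set, at the price of work proportional to its total size.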

Real Time Data Processing

Real-time processing involves continual input, processing, and output of data (for example, radar systems and bank ATMs). It makes it possible to take immediate action when low latency matters.
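As a rough contrast with recomputing over a full data set, the sketch below handles each event as it arrives and returns an updated result immediately. It is a minimal illustration under assumed names: the class `RunningTotals`, its `on_event` method, and the `(account, amount)` event shape are hypothetical and do not belong to any specific streaming system.

```python
class RunningTotals:
    """Keeps per-account totals that are updated one event at a time."""

    def __init__(self):
        self.totals = {}

    def on_event(self, account, amount):
        # Only the affected account is updated; no historical data
        # is re-read, so the cost per event stays constant.
        new_total = self.totals.get(account, 0.0) + amount
        self.totals[account] = new_total
        return new_total


# Hypothetical event stream: (account, amount) transactions.
events = [("alice", 10.0), ("bob", 5.0), ("alice", 2.5)]

stream = RunningTotals()
results = [stream.on_event(account, amount) for account, amount in events]
print(results)
```

Because a result is available after every single event, a caller can act on it right away, which is the property that matters when latency must stay low.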
