Erlang MapReduce: Learn with Quiz and Flashcards

Podcast

Play an AI-generated podcast conversation about this lesson

Questions and Answers

What happens to latency when the requests per second increase?

It goes up (correct)

It stays the same

It goes down

It fluctuates randomly

What can cause high percentile latencies?

Reduced packet loss

Network congestion (correct)

Server efficiency

Low network traffic

What is the purpose of using load testing tools like JMeter and Gatling?

To stress the backend (correct)

To improve network efficiency

To decrease latency

To increase server downtime

What do histograms and heat maps help track in relation to latency?

Tail latencies Signup and view all the answers

What does MTTR measure?

Mean Time to Recovery Signup and view all the answers

Why is reducing the MTTR crucial to improving availability?

To repair failed systems faster Signup and view all the answers

What is the purpose of the Protocol Manager in Uber's API gateway?

Translate different protocol payloads Signup and view all the answers

Which layer of the API gateway contains functions like authentication, authorization, rate limiting, and monitoring?

Middleware Signup and view all the answers

What is the role of the Endpoint Handler in Uber's API gateway?

Validate requests and transform them for backend services Signup and view all the answers

Which layer is responsible for making the actual calls to backend microservices in Uber's API gateway?

Requests to Backend Services Signup and view all the answers

In Uber's API gateway, where does the gateway have to deal with different types of protocol payloads like JSON, Thrift, Protobuf?

Protocol Manager Signup and view all the answers

Which layer of Uber's API gateway adds new middleware functions through a user interface?

Middleware Signup and view all the answers

What is the purpose of the partitioning function in the MapReduce process?

To group together key/value pairs with the same key Signup and view all the answers

How does the default partitioning function typically work in MapReduce?

It uses hashing to distribute key/value pairs into regions Signup and view all the answers

What happens after the master assigns reduce tasks to workers in MapReduce?

The workers read the intermediate key/value pairs from their region and sort them Signup and view all the answers

Why might a user want to provide their own partitioning function in MapReduce?

To group specific keys together in output files Signup and view all the answers

What is the main purpose of the Reduce function in MapReduce?

To combine occurrences of the same key Signup and view all the answers

After completing all map and reduce tasks, what action does the master take in MapReduce?

It wakes up the user program Signup and view all the answers

What strategy did Robinhood implement to handle the significant increase in system requests?

Horizontal scalability through sharding Signup and view all the answers

What is the purpose of the Routing Layer in Robinhood's infrastructure?

Handles routing external API requests to the correct shards Signup and view all the answers

What is the main function of the Aggregation Layer in Robinhood's system?

Acts as an intermediary layer for internal API traffic Signup and view all the answers

How did Robinhood make their system horizontally scalable?

Implemented sharding at the application layer Signup and view all the answers

Why does Robinhood split up their large dataset into smaller partitions?

To achieve better performance and scalability Signup and view all the answers

What is a key feature of Erlang according to the text?

Concurrency Signup and view all the answers

Which platform is mentioned as a collection of middleware, libraries, and tools for Erlang?

OTP Signup and view all the answers

What role does each shard play in Robinhood's infrastructure?

Holds a subset of users with its own resources Signup and view all the answers

What is Mnesia in relation to Erlang?

Distributed database part of OTP Signup and view all the answers

Why is Erlang described as requiring getting used to if you're unfamiliar with the paradigm?

It has a steep learning curve Signup and view all the answers

What benefit does Erlang bring in terms of code updating without application restarts?

Hotswapping code Signup and view all the answers

Which OS is mentioned as being used by WhatsApp for their servers?

FreeBSD Signup and view all the answers

Study Notes

Scalability and Latency

As requests per second increase, latency also increases until autoscaling kicks in and more machines are added to the server pool.
Load testing tools like JMeter and Gatling can be used to stress test the backend and measure average and percentile latencies.
High percentile latencies (e.g. 99.9% or 99.99% of all responses) may be important to measure, depending on the application.

Measuring Latency

Histograms and heat maps are commonly used to track tail latencies.
Tail latencies can be caused by network congestion, garbage collection pauses, packet loss, contention, and more.

Other Metrics

Mean Time to Recovery (MTTR) measures the average time it takes to repair a failed system.
Mean Time Between Maintenance (MTBM) measures the average time between maintenance activities on a system.

API Gateway

Uber's API gateway handles data payloads across multiple protocols and client device types.
The gateway has several layers: Protocol Manager, Middleware, Endpoint Handler, and Requests to Backend Services.
The Protocol Manager translates protocol payloads like JSON, Thrift, and Protobuf.
Middleware functions include authentication, authorization, rate limiting, logging, and monitoring.
The Endpoint Handler validates requests and transforms them for backend services.

Erlang

Erlang is a functional language designed for concurrency and fault tolerance.
It has high developer productivity and concise code.
Erlang allows for hot-swapping code without restarting the application.
WhatsApp uses Erlang for their backend, which enables quick iteration and long uptimes.

FreeBSD

WhatsApp uses FreeBSD as their operating system due to the founders' experience at Yahoo.
FreeBSD is used extensively in WhatsApp's servers.

MapReduce

The map phase partitions intermediate key/value pairs into R regions using a partitioning function.
The master assigns reduce tasks to workers and notifies them of the storage locations for their assigned key/value pairs.
Reduce workers read intermediate key/value pairs, sort them, and pass them to the user's Reduce function.

Robinhood's Scalability

In 2019, Robinhood's system received 100k requests per second at peak time, increasing to 750k requests per second in 2020.
To respond to this growth, Robinhood implemented sharding at the application layer to make their brokerage infrastructure horizontally scalable.
Sharding involves splitting a large dataset into smaller partitions (shards) with each shard having its own application servers, database, and deployment pipeline.
Robinhood built a routing layer to route external API requests to the correct shards and an aggregation layer to enable internal API traffic.

Studying That Suits You

Use AI to generate personalized quizzes and flashcards to suit your learning preferences.

Description

Learn about the features and advantages of Erlang programming language, designed for concurrency and fault tolerance, with high developer productivity. Understand how Erlang's functional nature and concise syntax contribute to its power. Explore the Open Telecom Platform (OTP) as part of Erlang's ecosystem.

Erlang Programming Language Overview

Choose a study mode