Amazon Managed Streaming for Apache Kafka (Amazon MSK) is a fully managed service that allows you to build and run applications that use Apache Kafka to process streaming data. It runs open-source versions of Apache Kafka. This means existing applications, tooling, and plugins from partners and the Apache Kafka community are supported without requiring changes to application code.
Customers use Amazon MSK for real-time data sharing with their end customers, who could be internal teams or third parties. These end customers manage Kafka clients, which are deployed in AWS, other managed cloud providers, or on premises. When migrating from self-managed Apache Kafka to Amazon MSK or moving clients between MSK clusters, customers want to avoid the need to reconfigure Kafka clients to use a different Domain Name System (DNS) name. Therefore, it's important to have a custom domain name for the MSK cluster that the clients can communicate with. Also, having a custom domain name makes the disaster recovery (DR) process simpler, because clients don't need to change the MSK bootstrap address when either a new cluster is created or a client connection needs to be redirected to a DR AWS Region.
MSK clusters use AWS-generated DNS names that are unique for each cluster, containing the broker ID, MSK cluster name, two service-generated subdomains, and the AWS Region, ending with amazonaws.com. The following figure illustrates this naming format.
MSK brokers use the same DNS name for the certificates used for Transport Layer Security (TLS) connections. The DNS name used by clients with TLS-encrypted authentication mechanisms must match the primary Common Name (CN) or Subject Alternative Name (SAN) of the certificate presented by the MSK broker to avoid hostname validation errors.
The solution discussed in this post provides a way for you to use a custom domain name for clients to connect to their MSK clusters when using SASL/SCRAM (Simple Authentication and Security Layer/Salted Challenge Response Mechanism) authentication only.
Solution overview
Network Load Balancers (NLBs) are a popular addition to the Amazon MSK architecture, along with AWS PrivateLink as a way to expose connectivity to an MSK cluster from other virtual private clouds (VPCs). For more details, see How Goldman Sachs builds cross-account connectivity to their Amazon MSK clusters with AWS PrivateLink. In this post, we walk through how to use an NLB to enable the use of a custom domain name with Amazon MSK when using SASL/SCRAM authentication.
The following diagram shows all components used by the solution.
SASL/SCRAM uses TLS to encrypt the Kafka protocol traffic between the client and Kafka broker. To use a custom domain name, the client needs to be presented with a server certificate matching that custom domain name. As of this writing, it isn't possible to modify the certificate used by the MSK brokers, so this solution uses an NLB to sit between the client and the MSK brokers.
An NLB works at the connection layer (Layer 4) and routes TCP or UDP protocol traffic. It doesn't validate the application data being sent and forwards the Kafka protocol traffic. The NLB provides the ability to use a TLS listener, where a certificate is imported into AWS Certificate Manager (ACM) and associated with the listener, enabling TLS negotiation between the client and the NLB. The NLB performs a separate TLS negotiation between itself and the MSK brokers. This NLB TLS negotiation to the target works exactly the same regardless of whether the certificates are signed by a public or private Certificate Authority (CA).
For the client to resolve DNS queries for the custom domain, an Amazon Route 53 private hosted zone is used to host the DNS records and is associated with the client's VPC to enable DNS resolution from the Route 53 VPC resolver.
Kafka listeners and advertised listeners
Kafka listeners (listeners) are the lists of addresses that Kafka binds to for listening. A Kafka listener consists of a hostname or IP, port, and protocol: <protocol>://<hostname>:<port>.
The Kafka client uses the bootstrap address to connect to one of the brokers in the cluster and issues a metadata request. The broker provides a metadata response containing the address information of each broker that the client needs to connect to. Advertised listeners (advertised.listeners) is a configuration option used by Kafka clients to connect to the brokers. By default, an advertised listener isn't set. After it's set, Kafka clients will use the advertised listeners instead of listeners to obtain the connection information for the brokers.
When Amazon MSK multi-VPC private connectivity is enabled, AWS sets the advertised.listeners configuration option to include the Amazon MSK multi-VPC DNS alias.
MSK brokers use the listener configuration to tell clients the DNS names to use to connect to the individual brokers for each authentication type enabled. Therefore, when clients are directed to use the custom domain name, you need to set a custom advertised listener for the SASL/SCRAM authentication protocol. Advertised listeners are unique to each broker; the cluster won't start if multiple brokers have the same advertised listener address.
Kafka bootstrap process and setup options
A Kafka client uses the bootstrap addresses to get the metadata from the MSK cluster, which in response provides the broker hostname and port (the listeners information by default, or the advertised listeners if configured) that the client needs to connect to for subsequent requests. Using this information, the client connects to the appropriate broker for the topic or partition that it needs to send to or fetch from. The following diagram shows the default bootstrap and topic or partition connectivity between a Kafka client and an MSK broker.
You have two options when using a custom domain name with Amazon MSK.
Option 1: Only a bootstrap connection through an NLB
You can use a custom domain name just for the bootstrap connection, where the advertised listeners are not set, so the client is directed to the default AWS cluster DNS name. This option is helpful when the Kafka client has direct network connectivity to both the NLB and the MSK broker's elastic network interface (ENI). The following diagram illustrates this setup.
No changes are required to the MSK brokers, and the Kafka client has the custom domain set as the bootstrap address. The Kafka client uses the custom domain bootstrap address to send a get metadata request to the NLB. The NLB sends the Kafka protocol traffic received from the Kafka client to a healthy MSK broker's ENI. That broker responds with metadata where only listeners is set, containing the default MSK cluster DNS name for each broker. The Kafka client then uses the default MSK cluster DNS name for the appropriate broker and connects to that broker's ENI.
Option 2: All connections through an NLB
Alternatively, you can use a custom domain name for the bootstrap and the brokers, where the custom domain name for each broker is set in the advertised listeners configuration. You need to use this option when Kafka clients don't have direct network connectivity to the MSK brokers' ENIs, for example, when Kafka clients need to use an NLB, AWS PrivateLink, or Amazon MSK multi-VPC endpoints to connect to an MSK cluster. The following diagram illustrates this setup.
The advertised listeners are set to use the custom domain name, and the Kafka client has the custom domain set as the bootstrap address. The Kafka client uses the custom domain bootstrap address to send a get metadata request, which is sent to the NLB. The NLB sends the Kafka protocol traffic received from the Kafka client to a healthy MSK broker's ENI. That broker responds with metadata where advertised listeners is set. The Kafka client uses the custom domain name for the appropriate broker, which directs the connection to the NLB, on the port set for that broker. The NLB sends the Kafka protocol traffic to that broker.
Network Load Balancer
The following diagram illustrates the NLB port and target configuration. A TLS listener on port 9000 is used for bootstrap connections, with all MSK brokers set as targets. The listener uses the TLS target type with target port 9096. A TLS listener port is used to represent each broker in the MSK cluster. In this post, there are three brokers in the MSK cluster, with TLS 9001 representing broker 1, up to TLS 9003 representing broker 3.
For all TLS listeners on the NLB, a single imported certificate with the domain name bootstrap.example.com is attached to the NLB. bootstrap.example.com is used as the Common Name (CN) so that the certificate is valid for the bootstrap address, and Subject Alternative Names (SANs) are set for all broker DNS names. If the certificate is issued by a private CA, clients need to import the root and intermediate CA certificates to the trust store. If the certificate is issued by a public CA, the root and intermediate CA certificates will already be in the default trust store.
The following table shows the required NLB configuration.
NLB Listener Type | NLB Listener Port | Certificate | NLB Target Type | NLB Targets |
TLS | 9000 | bootstrap.example.com | TLS | All broker ENIs |
TLS | 9001 | bootstrap.example.com | TLS | Broker 1 |
TLS | 9002 | bootstrap.example.com | TLS | Broker 2 |
TLS | 9003 | bootstrap.example.com | TLS | Broker 3 |
Domain Name System
For this post, a Route 53 private hosted zone is used to host the DNS records for the custom domain, in this case example.com. The private hosted zone is associated with the Amazon MSK VPC to enable DNS resolution for the client that's launched in the same VPC. If your client is in a different VPC than the MSK cluster, you need to associate the private hosted zone with that client's VPC.
The Route 53 private hosted zone isn't a required part of the solution. The most important part is that the client can perform DNS resolution against the custom domain and get the required responses. You can instead use your organization's existing DNS, a Route 53 public hosted zone, a Route 53 inbound resolver to resolve Route 53 private hosted zones from outside of AWS, or another DNS solution.
The following figure shows the DNS records used by the client to resolve to the NLB. We use bootstrap for the initial client connection, and use b-1, b-2, and b-3 to reference each broker's name.
The following table lists the DNS records required for a three-broker MSK cluster when using a Route 53 private or public hosted zone.
Record | Record Type | Value |
bootstrap | A | NLB Alias |
b-1 | A | NLB Alias |
b-2 | A | NLB Alias |
b-3 | A | NLB Alias |
The following table lists the DNS records required for a three-broker MSK cluster when using other DNS solutions.
Record | Record Type | Value |
bootstrap | CNAME | NLB DNS A record (for example, name-id.elb.region.amazonaws.com) |
b-1 | CNAME | NLB DNS A record |
b-2 | CNAME | NLB DNS A record |
b-3 | CNAME | NLB DNS A record |
In the following sections, we go through the steps to configure a custom domain name for your MSK cluster and for clients connecting with the custom domain.
Prerequisites
To deploy the solution, you need the following prerequisites:
Launch the CloudFormation template
Complete the following steps to deploy the CloudFormation template:
- Choose Launch Stack.
- Provide the stack name as msk-custom-domain.
- For MSKClientUserName, enter the user name of the secret used for SASL/SCRAM authentication with Amazon MSK.
- For MSKClientUserPassword, enter the password of the secret used for SASL/SCRAM authentication with Amazon MSK.
The CloudFormation template will deploy the following resources:
Set up the EC2 instance
Complete the following steps to configure your EC2 instance:
- On the Amazon EC2 console, connect to the instance msk-custom-domain-KafkaClientInstance1 using Session Manager, a capability of AWS Systems Manager.
- Switch to ec2-user:
- Run the following commands to configure the SASL/SCRAM client properties, create Kafka access control lists (ACLs), and create a topic named customer:
Create a certificate
For this post, we use self-signed certificates. However, it's recommended to use either a public certificate or a certificate signed by your organization's private key infrastructure (PKI).
For those who’re are utilizing an AWS personal CA for the personal key infrastructure, confer with Creating a non-public CA for directions to create and set up a non-public CA.
Use the openssl command to create a self-signed certificate. Modify the following command, adding the country code, state, city, and company:
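The original command isn't shown; this sketch creates a self-signed certificate whose CN is the bootstrap name and whose SANs cover the broker names. The domain names and subject fields are illustrative, and the -addext option requires OpenSSL 1.1.1 or later:

```shell
# Self-signed certificate valid for the bootstrap name (CN) and all broker names (SANs)
openssl req -new -newkey rsa:2048 -days 365 -nodes -x509 \
  -subj "/C=GB/ST=London/L=London/O=Example Org/CN=bootstrap.example.com" \
  -addext "subjectAltName=DNS:bootstrap.example.com,DNS:b-1.example.com,DNS:b-2.example.com,DNS:b-3.example.com" \
  -keyout key.pem -out cert.pem
```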
You can check the created certificate using the following command:
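For example, this prints the subject and SAN entries of the generated certificate (assuming cert.pem from the previous step):

```shell
# Show the CN and Subject Alternative Names of the generated certificate
openssl x509 -in cert.pem -noout -subject -ext subjectAltName
```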
Import the certificate to ACM
To use the self-signed certificate for the solution, you need to import the certificate to ACM:
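A sketch of the import, assuming the cert.pem and key.pem files from the previous step:

```shell
# Import the self-signed certificate and capture its ARN for later use
CERT_ARN=$(aws acm import-certificate \
  --certificate fileb://cert.pem \
  --private-key fileb://key.pem \
  --query CertificateArn --output text)
echo "$CERT_ARN"
```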
After it's imported, you can see the certificate in ACM.
Import the certificate to the Kafka client trust store
For the client to validate the server SSL certificate during the TLS handshake, you need to import the self-signed certificate to the client's trust store.
- Run the following command to use the JVM trust store to create your client trust store:
- Import the self-signed certificate to the trust store by using the following command. Provide the keystore password as changeit.
- You need to include the trust store certificate location config properties used by Kafka clients to enable certificate validation:
Set up DNS resolution for clients within the VPC
To set up DNS resolution for clients, create a private hosted zone for the domain and associate the hosted zone with the VPC where the client is deployed:
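A sketch of the hosted zone creation, with a placeholder Region and VPC ID:

```shell
# Private hosted zone for example.com, associated with the MSK VPC
aws route53 create-hosted-zone \
  --name example.com \
  --caller-reference "msk-custom-domain-$(date +%s)" \
  --vpc VPCRegion=eu-west-1,VPCId=vpc-0123456789abcdef0 \
  --hosted-zone-config Comment="MSK custom domain",PrivateZone=true
```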
Create EC2 target groups
Target groups route requests to individual registered targets, such as EC2 instances, using the protocol and port number that you specify. You can register a target with multiple target groups, and you can register multiple targets to one target group.
For this post, you need four target groups: one for each broker instance and one that will point to all the brokers and will be used by clients for Amazon MSK connection bootstrapping.
The target groups will receive traffic on port 9096 (SASL/SCRAM authentication) and will be associated with the Amazon MSK VPC:
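A sketch of the target group creation, with a placeholder VPC ID:

```shell
# Four IP-based TLS target groups on port 9096 (SASL/SCRAM)
for TG in b-1 b-2 b-3 b-all-bootstrap; do
  aws elbv2 create-target-group \
    --name "$TG" \
    --protocol TLS \
    --port 9096 \
    --target-type ip \
    --vpc-id vpc-0123456789abcdef0
done
```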
Register target groups with MSK broker IPs
You need to associate each target group with the broker instance (target) in the MSK cluster so that the traffic going through the target group can be routed to the individual broker instance.
Complete the following steps:
- Get the MSK broker hostnames:
This should show the brokers, which are part of the bootstrap address. The hostname of broker 1 looks like the following code:

To get the hostnames of the other brokers in the cluster, replace b-1 with values like b-2, b-3, and so on. For example, if you have six brokers in the cluster, you'll have six broker hostnames starting with b-1 through b-6.
- To get the IP address of the individual brokers, use the nslookup command:
- Modify the following commands with the IP addresses of each broker to create environment variables that will be used later:
Next, you need to register the broker IPs with the target groups. For broker b-1, you'll register the IP address with target group b-1.
- Provide the target group name b-1 to get the target group ARN. Then register the broker IP address with the target group.
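A sketch of those two commands, with a placeholder broker IP:

```shell
# Look up the target group ARN by name
TG_ARN=$(aws elbv2 describe-target-groups --names b-1 \
  --query 'TargetGroups[0].TargetGroupArn' --output text)

# Register broker 1's IP address with the target group
aws elbv2 register-targets --target-group-arn "$TG_ARN" \
  --targets Id=172.16.1.10
```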
- Repeat the steps of obtaining the IP address from the other broker hostnames and register the IP address with the corresponding target group for brokers b-2 and b-3:
- Also, you need to register all three broker IP addresses with the target group b-all-bootstrap. This target group will be used for routing the traffic for the Amazon MSK client connection bootstrap process.
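A sketch of that registration, with placeholder broker IPs:

```shell
TG_ARN=$(aws elbv2 describe-target-groups --names b-all-bootstrap \
  --query 'TargetGroups[0].TargetGroupArn' --output text)

# All three broker IPs back the bootstrap listener
aws elbv2 register-targets --target-group-arn "$TG_ARN" \
  --targets Id=172.16.1.10 Id=172.16.2.10 Id=172.16.3.10
```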
Set up NLB listeners
Now that you have the target groups created and the certificate imported, you're ready to create the NLB and listeners.
Create the NLB with the following code:
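A sketch of the NLB creation, with placeholder subnet IDs for the MSK VPC:

```shell
# Internal NLB spanning the MSK VPC subnets; capture its ARN for later steps
NLB_ARN=$(aws elbv2 create-load-balancer \
  --name msk-custom-domain-nlb \
  --type network \
  --scheme internal \
  --subnets subnet-aaaa1111 subnet-bbbb2222 subnet-cccc3333 \
  --query 'LoadBalancers[0].LoadBalancerArn' --output text)
```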
Next, you configure the listeners that will be used by the clients to communicate with the MSK cluster. You need to create four listeners, one for each target group, for ports 9000-9003. The following table lists the listener configurations.
Protocol | Port | Certificate | NLB Target Type | NLB Targets |
TLS | 9000 | bootstrap.example.com | TLS | b-all-bootstrap |
TLS | 9001 | bootstrap.example.com | TLS | b-1 |
TLS | 9002 | bootstrap.example.com | TLS | b-2 |
TLS | 9003 | bootstrap.example.com | TLS | b-3 |
Run the aws elbv2 create-listener command once for each port from 9000 through 9003, attaching the imported certificate and forwarding to the corresponding target group.
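A sketch that creates all four listeners in a loop, assuming $NLB_ARN and $CERT_ARN from the earlier steps (the associative array requires bash 4 or later):

```shell
# Map listener ports to their target groups
declare -A PORT_TG=( [9000]=b-all-bootstrap [9001]=b-1 [9002]=b-2 [9003]=b-3 )

for PORT in 9000 9001 9002 9003; do
  TG_ARN=$(aws elbv2 describe-target-groups --names "${PORT_TG[$PORT]}" \
    --query 'TargetGroups[0].TargetGroupArn' --output text)
  aws elbv2 create-listener \
    --load-balancer-arn "$NLB_ARN" \
    --protocol TLS --port "$PORT" \
    --certificates CertificateArn="$CERT_ARN" \
    --default-actions Type=forward,TargetGroupArn="$TG_ARN"
done
```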
Enable cross-zone load balancing
By default, cross-zone load balancing is disabled on NLBs. When disabled, each load balancer node distributes traffic to healthy targets in the same Availability Zone. For example, requests that come into the load balancer node in Availability Zone A will only be forwarded to a healthy target in Availability Zone A. If the only healthy target or the only registered target associated with an NLB listener is in a different Availability Zone than the load balancer node receiving the traffic, the traffic is dropped.
Because the NLB's bootstrap listener is associated with a target group that has all brokers registered across multiple Availability Zones, Route 53 will respond to DNS queries against the NLB DNS name with the IP addresses of NLB ENIs in Availability Zones with healthy targets.
When the Kafka client tries to connect to a broker through the broker's listener on the NLB, there will be a noticeable delay in receiving a response from the broker as the client tries to connect using all IPs returned by Route 53.
Enabling cross-zone load balancing distributes the traffic across the registered targets in all Availability Zones.
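Cross-zone load balancing is a load balancer attribute; a sketch of enabling it, assuming $NLB_ARN from the earlier step:

```shell
aws elbv2 modify-load-balancer-attributes \
  --load-balancer-arn "$NLB_ARN" \
  --attributes Key=load_balancing.cross_zone.enabled,Value=true
```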
Create DNS A records in a private hosted zone
Create DNS A records to route the traffic to the Network Load Balancer. The following table lists the records.
Record | Record Type | Value |
bootstrap | A | NLB Alias |
b-1 | A | NLB Alias |
b-2 | A | NLB Alias |
b-3 | A | NLB Alias |
Alias record types will be used, so you need the NLB's DNS name and hosted zone ID:
Create the bootstrap record, and then repeat this command to create the b-1, b-2, and b-3 records, modifying the Name field:
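A sketch of those commands, with a placeholder hosted zone ID; repeat the change-resource-record-sets call with Name set to b-1.example.com, b-2.example.com, and b-3.example.com:

```shell
# NLB DNS name and canonical hosted zone ID for the alias target
NLB_DNS=$(aws elbv2 describe-load-balancers --names msk-custom-domain-nlb \
  --query 'LoadBalancers[0].DNSName' --output text)
NLB_ZONE=$(aws elbv2 describe-load-balancers --names msk-custom-domain-nlb \
  --query 'LoadBalancers[0].CanonicalHostedZoneId' --output text)

# Create the bootstrap alias record pointing at the NLB
aws route53 change-resource-record-sets \
  --hosted-zone-id Z0123456789EXAMPLE \
  --change-batch '{"Changes":[{"Action":"CREATE","ResourceRecordSet":{
    "Name":"bootstrap.example.com","Type":"A",
    "AliasTarget":{"HostedZoneId":"'"$NLB_ZONE"'","DNSName":"'"$NLB_DNS"'","EvaluateTargetHealth":false}}}]}'
```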
Optionally, to optimize cross-zone data charges, you can set b-1, b-2, and b-3 to the IP address of the NLB ENI that's in the same Availability Zone as each broker. For example, if b-2 is using an IP address in subnet 172.16.2.0/24, which is in Availability Zone A, you should use the NLB ENI in that same Availability Zone as the value for the DNS record.
The next step details how to use a custom domain name for bootstrap connectivity only. If all Kafka traffic needs to go through the NLB, as discussed earlier, continue to the next section to set up advertised listeners.
Configure the advertised listener in the MSK cluster
To get the listener details for broker 1, provide entity-type as brokers and entity-name as 1 for the broker ID:
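A sketch of the describe command, with the tool path and $BOOTSTRAP carried over as assumptions from earlier steps; --all is needed because listeners is a static (non-dynamic) configuration:

```shell
/home/ec2-user/kafka/bin/kafka-configs.sh --bootstrap-server "$BOOTSTRAP" \
  --command-config /home/ec2-user/kafka/client_sasl.properties \
  --entity-type brokers --entity-name 1 \
  --describe --all | grep -i listener
```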
You'll get an output like the following:
Going forward, clients will connect through the custom domain name. Therefore, you need to configure the advertised listeners with the custom domain hostname and port. For this, you need to copy the listener details and change the CLIENT_SASL_SCRAM listener to b-1.example.com:9001.
While configuring the advertised listener, you also need to preserve the information about the other listener types in the advertised listener, because inter-broker communication also uses the addresses in the advertised listener.
Based on our configuration, the advertised listener for broker 1 will look like the following code, with everything after sensitive=false removed:
Modify the following command as follows:
- <<BROKER_NUMBER>> – Set to the broker ID being modified (for example, 1 for broker 1)
- <<PORT_NUMBER>> – Set to the port number corresponding to the broker ID (for example, 9001 for broker 1)
- <<REPLICATION_DNS_NAME>> – Set to the DNS name for REPLICATION
- <<REPLICATION_SECURE_DNS_NAME>> – Set to the DNS name for REPLICATION_SECURE
The command should look something like the following example:
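A sketch for broker 1, keeping the replication listeners alongside the custom SASL/SCRAM entry. The replication ports shown (9093 and 9095) and the placeholder DNS names are assumptions; substitute the values from the describe output:

```shell
/home/ec2-user/kafka/bin/kafka-configs.sh --bootstrap-server "$BOOTSTRAP" \
  --command-config /home/ec2-user/kafka/client_sasl.properties \
  --entity-type brokers --entity-name 1 \
  --alter --add-config \
  'advertised.listeners=[CLIENT_SASL_SCRAM://b-1.example.com:9001,REPLICATION://<<REPLICATION_DNS_NAME>>:9093,REPLICATION_SECURE://<<REPLICATION_SECURE_DNS_NAME>>:9095]'
```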
Run the command to add the advertised listener for broker 1.
You need to get the listener details for the other brokers and configure advertised.listeners for each.
Test the setup
Set the bootstrap address to the custom domain. This is the A record created in the private hosted zone.
List the MSK topics using the custom domain bootstrap address:
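A sketch of the listing command through the custom bootstrap address, where port 9000 is the NLB bootstrap listener and the paths are the same assumptions as earlier:

```shell
/home/ec2-user/kafka/bin/kafka-topics.sh \
  --bootstrap-server bootstrap.example.com:9000 \
  --command-config /home/ec2-user/kafka/client_sasl.properties \
  --list
```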
You should see the topic customer.
Clean up
To stop incurring costs, it's recommended to manually delete the private hosted zone, NLB, target groups, and imported certificate in ACM. Also, delete the CloudFormation stack to remove any resources provisioned by CloudFormation.
Use the following code to manually delete the aforementioned resources:
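A sketch of the cleanup, assuming the variables and placeholder IDs from the earlier steps; note that a hosted zone's record sets (other than the default NS and SOA) must be deleted before the zone itself:

```shell
# Delete the NLB and wait for the deletion to finish (can take a few minutes)
aws elbv2 delete-load-balancer --load-balancer-arn "$NLB_ARN"
aws elbv2 wait load-balancers-deleted --load-balancer-arns "$NLB_ARN"

# Delete the four target groups
for TG in b-1 b-2 b-3 b-all-bootstrap; do
  TG_ARN=$(aws elbv2 describe-target-groups --names "$TG" \
    --query 'TargetGroups[0].TargetGroupArn' --output text)
  aws elbv2 delete-target-group --target-group-arn "$TG_ARN"
done

# Delete the imported certificate and the private hosted zone
aws acm delete-certificate --certificate-arn "$CERT_ARN"
aws route53 delete-hosted-zone --id Z0123456789EXAMPLE
```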
You need to wait up to 5 minutes for the NLB deletion to complete:
Now you can delete the CloudFormation stack.
Summary
This post explains how you can use an NLB, Route 53, and the advertised listener configuration option in Amazon MSK to support custom domains with MSK clusters when using SASL/SCRAM authentication. You can use this solution to keep your existing Kafka bootstrap DNS name and reduce or remove the need to change client applications because of a migration, a recovery process, or multi-cluster high availability. You can also use this solution to put the MSK bootstrap and broker names under your custom domain, enabling you to bring the DNS name in line with your naming convention (for example, msk.prod.example.com).
Try the solution out for yourself, and leave your questions and feedback in the comments section.
About the Authors
Subham Rakshit is a Senior Streaming Solutions Architect for Analytics at AWS based in the UK. He works with customers to design and build streaming architectures so they can get value from analyzing their streaming data. His two little daughters keep him occupied most of the time outside work, and he loves solving jigsaw puzzles with them. Connect with him on LinkedIn.
Mark Taylor is a Senior Technical Account Manager at Amazon Web Services, working with enterprise customers to implement best practices, optimize AWS usage, and address business challenges. Prior to joining AWS, Mark spent over 16 years in networking roles across industries, including healthcare, government, education, and payments. Mark lives in Folkestone, England, with his wife and two dogs. Outside of work, he enjoys watching and playing football, watching movies, playing board games, and traveling.