
Set up a Lab Environment in Amazon Web Services


Summer 2014


This Certified Training Services Partner Program Guide (the Program Guide) is protected under
U.S. and international copyright laws, and is the exclusive property of MapR Technologies,
Inc. © 2014, MapR Technologies, Inc. All rights reserved.

PROPRIETARY AND CONFIDENTIAL INFORMATION



2014 MapR Technologies, Inc. All Rights Reserved.


Contents
Set up a Lab Environment in Amazon Web Services (AWS)
Part 1: Set up a lab environment in Amazon Web Services (AWS)
Lab Procedure
Create an AWS Account
Configure Virtual Private Cloud (VPC) Networking
Create AWS Virtual Machine Instances for Hadoop Installation
Create an AWS VM Instance for NFS Access
Log in to AWS Nodes
Managing Your Nodes
Terminating Your Instances and EBS Storage
Part 2: Set up passwordless ssh access between nodes


Set up a Lab Environment in Amazon Web Services (AWS)
Part 1: Set up a lab environment in Amazon Web Services (AWS)
This setup procedure shows you how to create your lab environment in AWS for the MapR
Hadoop Operations on-demand training. For a classroom or virtual instructor-led training
session, these AWS environments will already be set up for you, and your instructor will give you
further instructions on how to access your lab environment.
Follow the steps below in order so that the AWS lab environment is set up properly.

Lab Procedure
Create an AWS Account
You need to have an account on Amazon Web Services. If you already have an AWS account,
you can skip this task. Note that to create the account you will need to provide your email
address, billing information (credit card), and a phone number at which you can be contacted.
1. Point your Web browser to http://aws.amazon.com
2. Click the "Sign Up" button at the top right-hand side of the Web page
3. Select the "I am a new user" radio button
4. Type your email address in the "My e-mail address is:" text field
5. Click the "Sign in using our secure server" button
6. Fill out the "Login Credentials" Web form and click the "Continue" button
7. Fill out the "Contact Information" Web form and click the "Create Account and
Continue" button
8. Fill out the "Payment Information" Web form and click the "Continue" button
9. Fill out the "Identity Verification" Web form and click the "Call Me Now" button. Once
you reply to the phone call using your 4-digit code from this Web form, click the
"Continue to select your Support Plan" button
10. Fill out the "Support Plan" Web form. Note you will not need support services from
Amazon in order to run the labs in this class. Click the "Continue" button

11. Your AWS account is now provisioned and you can begin setting up the virtual machines
for your class

Configure Virtual Private Cloud (VPC) Networking


AWS provides two types of network configurations: VPC and "classic". The lab guide has been
written using the recommended VPC network. The configuration steps are below.
1. Point your Web browser to http://aws.amazon.com
2. Select "AWS Management Console" from the "My Account / Console" drop-down list
3. Type your email address in the "My e-mail address is:" text field. Select the "I am a
returning user and my password is:" radio button. Click the "Sign in using our secure
server" button
4. In the "Compute & Networking" section of your AWS management console, click the
"VPC" link
5. In the "Virtual Private Cloud" section of your navigation pane, click "Your VPCs"
6. Click the "Create VPC" button and fill out the Web form as follows:
a. Name tag: mapr-odt-vpc
b. CIDR block: 10.0.0.0/16
c. Tenancy: Default
d. Click the "Yes, Create" button
7. In the "Virtual Private Cloud" section of your navigation pane, click "Subnets"
8. Click the "Create Subnet" button and fill out the Web form as follows:
a. Name tag: mapr-odt-subnet
b. VPC: mapr-odt-vpc
c. Availability Zone: No Preference
d. CIDR block: 10.0.0.0/24
e. Click the "Yes, Create" button
9. Select the "mapr-odt-subnet" checkbox, click the "Modify Auto-Assign Public IP"
button, and proceed as follows:
a. Select the "Enable auto-assign Public IP" checkbox
b. Click the "Save" button
10. In the "Virtual Private Cloud" section of your navigation pane, click "Route Tables"
11. Click the "Create Route Table" button and fill out the Web form as follows:
a. Name tag: mapr-odt-routes
b. VPC: mapr-odt-vpc
c. Click the "Yes, Create" button


12. In the "Virtual Private Cloud" section of your navigation pane, click "Internet Gateways"
13. Click the "Create Internet Gateway" button and fill out the Web form as follows:
a. Name tag: mapr-odt-gw
b. Click the "Yes, Create" button
c. Select the checkbox next to the "mapr-odt-gw" object and click the "Attach to
VPC" button
d. Select "mapr-odt-vpc" from the "VPC" drop-down list and click the "Yes, Attach"
button
14. In the "Virtual Private Cloud" section of your navigation pane, click "Route Tables"
15. Select the "mapr-odt-routes" object, select the "Routes" tab, and click the "Edit" button.
Fill out the Web form as follows:
a. Destination: 0.0.0.0/0
b. Target: mapr-odt-gw
c. Click the "Save" button
16. In the "Virtual Private Cloud" section of your navigation pane, click "Subnets"
17. Select the "mapr-odt-subnet" object and select the "Route Table" tab. Click the "Edit"
button and fill out the form as follows:
a. Select the "Change To" drop-down list and select "mapr-odt-routes"
b. Click the "Save" button
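
For readers who prefer the command line, the networking steps above could be sketched with the AWS CLI roughly as follows. This is a minimal sketch and not part of the original console-based procedure: it assumes the AWS CLI is installed and configured with credentials and a default region, and the function names name_tag and create_lab_network are hypothetical. The resource names and CIDR blocks are the ones used in the steps above.

```shell
#!/bin/sh
# Sketch only (assumption: AWS CLI installed and configured; this is not part
# of the original console procedure). Function names are hypothetical.

# Apply a Name tag, matching the "Name tag" fields filled in on the console.
name_tag() {
    aws ec2 create-tags --resources "$1" --tags "Key=Name,Value=$2"
}

create_lab_network() {
    # Steps 5-6: VPC with CIDR block 10.0.0.0/16.
    vpc_id=$(aws ec2 create-vpc --cidr-block 10.0.0.0/16 \
        --query 'Vpc.VpcId' --output text) || return 1
    name_tag "$vpc_id" mapr-odt-vpc || return 1

    # Steps 7-9: subnet 10.0.0.0/24 with auto-assigned public IPs.
    subnet_id=$(aws ec2 create-subnet --vpc-id "$vpc_id" \
        --cidr-block 10.0.0.0/24 \
        --query 'Subnet.SubnetId' --output text) || return 1
    name_tag "$subnet_id" mapr-odt-subnet || return 1
    aws ec2 modify-subnet-attribute --subnet-id "$subnet_id" \
        --map-public-ip-on-launch || return 1

    # Steps 10-11: route table for the VPC.
    rtb_id=$(aws ec2 create-route-table --vpc-id "$vpc_id" \
        --query 'RouteTable.RouteTableId' --output text) || return 1
    name_tag "$rtb_id" mapr-odt-routes || return 1

    # Steps 12-13: Internet gateway, attached to the VPC.
    igw_id=$(aws ec2 create-internet-gateway \
        --query 'InternetGateway.InternetGatewayId' --output text) || return 1
    name_tag "$igw_id" mapr-odt-gw || return 1
    aws ec2 attach-internet-gateway --internet-gateway-id "$igw_id" \
        --vpc-id "$vpc_id" || return 1

    # Steps 14-15: default route (0.0.0.0/0) through the gateway.
    aws ec2 create-route --route-table-id "$rtb_id" \
        --destination-cidr-block 0.0.0.0/0 --gateway-id "$igw_id" || return 1

    # Steps 16-17: use the new route table for the subnet.
    aws ec2 associate-route-table --route-table-id "$rtb_id" \
        --subnet-id "$subnet_id" || return 1

    echo "vpc=$vpc_id subnet=$subnet_id"
}
```

Nothing runs until create_lab_network is called. The function stops at the first failing command, so a partial run may leave resources behind that need to be removed in the console.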

Create AWS Virtual Machine Instances for Hadoop Installation


You need to provision at least 3 virtual machines in AWS in order to complete the labs in this
course, and you can provision more if you'd prefer. More VMs will allow you to experiment
with different cluster service layout plans (see lesson 2 for more detail), and will give you better
performance when running jobs. However, the VMs needed for the lab environment are not
included in the Free Tier, and they will accrue a nominal charge for the time it takes to perform
the lab exercises. More VMs will also result in a higher charge for their use. Read the
"Managing Your Nodes" section of this manual to learn more about minimizing the EC2 usage
charges.
1. Point your Web browser to http://aws.amazon.com
2. Select "AWS Management Console" from the "My Account / Console" drop-down list
3. Type your email address in the "My e-mail address is:" text field. Select the "I am a
returning user and my password is:" radio button. Click the "Sign in using our secure
server" button
4. In the "Compute & Networking" section of your AWS management console, click the
"EC2" link


5. In the upper right-hand corner of the EC2 Web page, select the region (from
the drop-down list next to the "Help" drop-down list) nearest to where you are
physically located from the following choices. Note that a region will already
be selected based on the contact information you provided when you provisioned your
AWS account:
a. US East (N. Virginia)
b. US West (Oregon)
c. US West (N. California)
d. EU (Ireland)
e. Asia Pacific (Singapore)
f. Asia Pacific (Tokyo)
g. Asia Pacific (Sydney)
h. South America (Sao Paulo)
6. In the "INSTANCES" section of the navigation pane on the left-hand side of the Web
page, click the "Instances" link
7. Click the "Launch Instance" button
8. In the "Step 1: Choose an Amazon Machine Image" Web page, scroll down to the
bottom of the page and select the 64-bit version of an image of Red Hat v6.4 or 6.5.
Note: Red Hat 7.0 is NOT currently supported.
9. In the "Step 2: Choose an Instance Type" Web page, select the checkbox for "m3.large"
type and click the "Next: Configure Instance Details" button
10. In the "Step 3: Configure Instance Details" Web page, fill out the form as follows:
a. Number of instances: 3
b. Purchasing option: leave "Request Spot Instances" unchecked
c. Network: mapr-odt-vpc
d. Subnet: mapr-odt-subnet
e. Auto-assign Public IP: enable
f. IAM role: None
g. Shutdown behavior: Stop
h. Enable termination protection: Check "protect against accidental termination"
checkbox
i. Monitoring: leave "Enable CloudWatch detailed monitoring" unchecked
j. Tenancy: "shared tenancy (multi-tenant hardware)"
k. Click the "Next: Add Storage" button
11. In the "Step 4: Add Storage" Web page:
a. Click the "Add New Volume" button
b. Leave all the defaults, but select the "Delete on termination" checkbox
c. Repeat the above steps 2 more times to add a total of 3 EBS volumes to your
instances
d. Click the "Next: Tag Instance" button
12. In the "Step 5: Tag Instance" Web page, type "mapr-install-node" in the "Value" field
and click the "Next: Configure Security Group" button


13. In the "Step 6: Configure Security Group" Web page, select the "Create new security
group" radio button, type "mapr-sg" in the "Security group name:" field, and perform
the following steps:
a. Click the "Add Rule" button
b. Select "All TCP" from the "Type" drop-down list and select "Anywhere" from the
"Source" drop-down list
c. Click the "Add Rule" button
d. Select "All UDP" from the "Type" drop-down list and select "Anywhere" from
the "Source" drop-down list
e. Click the "Add Rule" button
f. Select "All ICMP" from the "Type" drop-down list and select "Anywhere" from
the "Source" drop-down list
g. Click the "Review and Launch" button
14. In the "Step 7: Review Instance Launch" Web page, review your instance launch details
and click the "Launch" button
15. In the "Select an existing key pair or create a new key pair" pop-up window, perform
one of the following steps:
a. Select "Create a new key pair" and type "mapr-odt-keypair" in the "Key pair
name" text field. Click the "Download Key Pair" button.

OR

b. Select "select an existing key pair" and select the key pair from the "key pair
name" drop-down list

IMPORTANT NOTE: Make sure you save a copy of the new or existing key pair
file in a location where you can access it throughout your training. If you lose
this file, you will lose access to your AWS instances, and will have to create new
ones.
16. Click the "Launch Instances" button
17. In the "Launch Status" Web page, click the "View Instances" button
18. Wait for the instances to get in the "running" state and status checks to complete
19. Record the IP addresses of the VMs for later use.
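
The last step, recording the IP addresses, could also be done from the command line. The following is a minimal sketch under stated assumptions: the AWS CLI is installed and configured, and the tag value typed in step 12 was stored under the tag key "Name". The helper name list_lab_ips is hypothetical.

```shell
#!/bin/sh
# Sketch only (assumptions: AWS CLI installed and configured; the tag value
# from step 12 is stored under the tag key "Name").
# list_lab_ips is a hypothetical helper name.
list_lab_ips() {
    tag_value=${1:-mapr-install-node}
    # Output columns: instance ID, public IP, private (internal) IP.
    aws ec2 describe-instances \
        --filters "Name=tag:Name,Values=$tag_value" \
                  "Name=instance-state-name,Values=running" \
        --query 'Reservations[].Instances[].[InstanceId,PublicIpAddress,PrivateIpAddress]' \
        --output text
}
```

Calling list_lab_ips with no argument lists the running instances tagged "mapr-install-node"; passing a different tag value as the first argument lists those instances instead.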


Create an AWS VM Instance for NFS Access


You will need to launch an instance that will serve as your NFS client. This is the simplest
instance, and will qualify for Free Tier use. Use the following information to launch this instance
in AWS.
1. Point your Web browser to http://aws.amazon.com
2. Select "AWS Management Console" from the "My Account / Console" drop-down list
3. Type your email address in the "My e-mail address is:" text field. Select the "I am a
returning user and my password is:" radio button. Click the "Sign in using our secure
server" button
4. In the "Compute & Networking" section of your AWS management console, click the
"EC2" link
5. In the "INSTANCES" section of the navigation pane on the left-hand side of the Web
page, click the "Instances" link
6. Click the "Launch Instance" button
7. In the "Step 1: Choose an Amazon Machine Image" Web page, scroll down to the
bottom of the page and select the 64-bit version of an image of Red Hat v6.4 or 6.5.
Note: Red Hat 7.0 is NOT currently supported
8. In the "Step 2: Choose an Instance Type" Web page, select the checkbox for "t1.micro"
type and click the "Next: Configure Instance Details" button
9. In the "Step 3: Configure Instance Details" Web page, fill out the form as follows:
a. Number of instances: 1
b. Purchasing option: leave "Request Spot Instances" unchecked
c. Network: mapr-odt-vpc
d. Subnet: mapr-odt-subnet
e. Auto-assign Public IP: enable
f. IAM role: None
g. Shutdown behavior: Stop
h. Enable termination protection: Select "protect against accidental termination"
checkbox
i. Monitoring: leave "Enable CloudWatch detailed monitoring" unchecked
j. Tenancy: "shared tenancy (multi-tenant hardware)"
k. Click the "Next: Add Storage" button
10. In the "Step 4: Add Storage" Web page, click the "Next: Tag Instance" button
11. In the "Step 5: Tag Instance" Web page, type "MapR-NFS-node" in the "Value" field and
click the "Next: Configure Security Group" button
12. In the "Step 6: Configure Security Group" Web page:
a. Select the "select an existing security group" radio button
b. Select the "mapr-sg" checkbox


c. Click the "Review and Launch" button


13. In the "Step 7: Review Instance Launch" Web page, review your instance launch details
and click the "Launch" button
14. In the "Select an existing key pair or create a new key pair" pop-up window:
a. Select "select an existing key pair"
b. Select the "mapr-odt-keypair" key pair from the "key pair name" drop-down list
c. Select the "I acknowledge that I have access to the selected private key file
(name), and that without this file, I won't be able to log into my instance"
checkbox

REMINDER: You must have a copy of this key file in a location where you can
access it throughout your training. If you lose this file, you will lose access to
your AWS instances, and will have to create new ones.

15. Click the "Launch Instances" button
16. In the "Launch Status" Web page, click the "View Instances" button
17. Wait for the instance to get in the "running" state and status checks to complete

Log in to AWS Nodes


To log in to your AWS nodes, you will need to use the SSH key pair that you downloaded
when launching your instances. Your RHEL 6.x instance has only one login account,
"ec2-user", and logging in to it requires the SSH key pair.

1. Open the terminal emulation application on your computer
2. Navigate to the location where the SSH key pair file is saved
3. Change the permission of the SSH key pair file:
$ chmod 600 mapr-odt-keypair.pem
4. Log in as ec2-user, replacing VM_IP_Address with the public IP address of the VM
(such as 54.183.169.43):
$ ssh -i mapr-odt-keypair.pem ec2-user@VM_IP_Address
5. Switch to user root:
$ sudo -s


6. Determine and log the internal IP address of the VM instance (save the result, such as
10.0.0.167):
$ hostname
7. Create a mapr user on this VM:
$ useradd mapr
$ passwd mapr
then type the password for the mapr user when prompted
8. Set the root user password:
$ passwd root
then type the password for the root user when prompted
9. Allow password authentication to the VM:
$ vi /etc/ssh/sshd_config
change PasswordAuthentication no to PasswordAuthentication yes
save and exit vi, then restart the SSH daemon so that the change takes effect:
$ service sshd restart
10. Repeat steps 4-9 for all VM instances and log the hostname of each instance
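
Because the sshd_config change has to be repeated on every VM, it may be convenient to script it instead of editing the file in vi each time. The following is a minimal sketch, not part of the original procedure; the helper name enable_password_auth is hypothetical, and it is meant to be run as root.

```shell
#!/bin/sh
# Sketch only. enable_password_auth is a hypothetical helper; run it as root.
# The optional argument lets you try it on a copy of the file first.
enable_password_auth() {
    config=${1:-/etc/ssh/sshd_config}
    # Keep a backup, then rewrite the original file from the backup so that
    # the file keeps its existing ownership and permissions.
    cp "$config" "$config.bak" || return 1
    # Only an uncommented "PasswordAuthentication no" line is changed.
    sed 's/^PasswordAuthentication[[:space:]][[:space:]]*no[[:space:]]*$/PasswordAuthentication yes/' \
        "$config.bak" > "$config"
}
```

On RHEL 6.x the SSH daemon reads its configuration when it starts, so a typical use would be `enable_password_auth && service sshd restart`. The original file is kept as sshd_config.bak in case the change needs to be reverted.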

Now you have root access on your RHEL virtual machine instance, and you can proceed with
the MapR Hadoop Operations labs.

Managing Your Nodes


AWS charges you by the hour for your instances as long as they are running. You don't need to
keep your nodes running while you are not performing tasks in the lab. You can safely stop your
instances while you are not using them and then restart them when you want to use them again.
This will ensure that you are only charged for the time that you are using your nodes to perform lab
exercises. The Public IPs of your VMs may change when the VMs are stopped, but the internal IPs will
remain consistent. You should check the Public IP address and note any changes, but you will not need
to re-check the VM hostnames.
1. Point your Web browser to http://aws.amazon.com
2. Type your email address in the "My e-mail address is:" text field. Select the "I am a
returning user and my password is:" radio button. Click the "Sign in using our secure
server" button
3. In the "Compute & Networking" section of your AWS management console, click the
"EC2" link
4. In the "INSTANCES" section of the navigation pane on the left-hand side of the Web
page, click the "Instances" link


5. Select the instances that you want to stop, click the "Actions" button, and select
"Stop"
6. Click the "Yes, Stop" button
To restart the instances, repeat these steps and select "Start" in step 5. Remember to
check the Public IP addresses of your VMs and note any changes. The
internal IP addresses will remain consistent, so the passwordless ssh and Hadoop software will
still function normally.
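
Stopping and restarting the nodes could also be sketched with the AWS CLI. This is a minimal sketch under stated assumptions: the AWS CLI is installed and configured, and the instance IDs are passed as arguments. The helper names stop_lab_instances and start_lab_instances are hypothetical.

```shell
#!/bin/sh
# Sketch only (assumption: AWS CLI installed and configured).
# Helper names are hypothetical; arguments are EC2 instance IDs.

stop_lab_instances() {
    aws ec2 stop-instances --instance-ids "$@"
}

start_lab_instances() {
    aws ec2 start-instances --instance-ids "$@" || return 1
    # Block until the instances report the "running" state.
    aws ec2 wait instance-running --instance-ids "$@" || return 1
    # Print instance ID, public IP, and internal IP so that any change to
    # the public addresses can be noted.
    aws ec2 describe-instances --instance-ids "$@" \
        --query 'Reservations[].Instances[].[InstanceId,PublicIpAddress,PrivateIpAddress]' \
        --output text
}
```

The final describe-instances call reflects the point made above: the public addresses printed after a restart may differ from the ones recorded earlier, while the internal addresses stay the same.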

Terminating Your Instances and EBS Storage


When you are finished using your AWS nodes for the class exercises, you should terminate your
nodes. If you did not select the "Delete on termination" checkbox when creating the storage for your
nodes, then you will also need to delete your EBS storage volumes.
1. Point your Web browser to http://aws.amazon.com
2. Type your email address in the "My e-mail address is:" text field. Select the "I am a
returning user and my password is:" radio button. Click the "Sign in using our secure
server" button
3. In the "Compute & Networking" section of your AWS management console, click the
"EC2" link
4. In the "INSTANCES" section of the navigation pane on the left-hand side of the Web
page, click the "Instances" link
5. Disable Termination Protection on each instance, individually. You will have to perform
these steps on each instance, one at a time:
a. Select the instance for which you want to disable termination protection
b. Click the "Actions" button, and select "Change Termination Protection"
c. Click the "Yes, Disable" button
d. Repeat these steps for all instances that you want to terminate.
6. Select the instances that you want to terminate, click the "Actions" button, and
select "Terminate"
7. Click the "Yes, Terminate" button

If you did not select "Delete on termination" when adding storage, follow the steps
below to delete the EBS storage volumes.

8. In the "ELASTIC BLOCK STORE" section of the navigation pane on the left-hand side of
the Web page, click the "Volumes" link.
9. Select the checkbox next to the volumes that you want to remove, click the "Actions"
button, and select "Delete Volumes"
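
The termination steps above could be sketched with the AWS CLI as follows. This is a minimal sketch, assuming the AWS CLI is installed and configured; the helper names terminate_lab_instances and delete_lab_volumes are hypothetical, and the instance and volume IDs are supplied by you.

```shell
#!/bin/sh
# Sketch only (assumption: AWS CLI installed and configured).
# Helper names are hypothetical. Termination and deletion cannot be undone.

terminate_lab_instances() {
    # Termination protection has to be disabled one instance at a time.
    for id in "$@"; do
        aws ec2 modify-instance-attribute --instance-id "$id" \
            --no-disable-api-termination || return 1
    done
    aws ec2 terminate-instances --instance-ids "$@"
}

delete_lab_volumes() {
    # Only needed for volumes created without "Delete on termination".
    for vol in "$@"; do
        aws ec2 delete-volume --volume-id "$vol" || return 1
    done
}
```

A volume can only be deleted once it is no longer attached to an instance, so delete_lab_volumes is intended to be run after the instances have finished terminating.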


Part 2: Set up passwordless ssh access between nodes


When testing hardware nodes and installing Hadoop, you will need to run various commands
and scripts on all of the nodes in the cluster. A tool like clustershell, or clush, will allow you to
propagate these commands from one master node to all of the other nodes on the cluster.
For clush to perform tasks on the other nodes, it needs passwordless ssh access, so that you do
not have to type in a password for every action on every node. Some of the actions we will do in
this course require root account access, and some require mapr account access. You will need to
perform the steps for passwordless ssh twice, once when logged in as root and once as mapr.
1. Log into one of your nodes. If you are using AWS VMs, use the instructions provided
above. We will set up passwordless ssh from this node to the other nodes. This node
will be the master node going forward, and we will run all further commands from this
node.
2. Switch to the root user:
$ sudo -s
3. Generate an ssh key as the root user:
$ ssh-keygen
Enter file in which to save the key (/home): leave as default
Enter passphrase (empty for no passphrase): leave empty
Enter same passphrase again: leave empty
4. Copy the ssh key to the other nodes. We will be using the internal IP addresses
compiled when checking the hostname above:
$ ssh-copy-id IP-address-1
Are you sure you want to continue connecting (yes/no)? yes
5. Test the passwordless connection:
$ ssh IP-address-1
6. Return to the master node:
$ exit
7. Repeat steps 4 through 6 for each internal IP address on your list.
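
With more than a few nodes, the copy-and-test steps above can be wrapped in a loop. The following is a minimal sketch, not part of the original procedure; the helper names run and copy_key_to_nodes and the DRY_RUN variable are hypothetical. It assumes the key from step 3 already exists and that you pass the internal IP addresses as arguments.

```shell
#!/bin/sh
# Sketch only. run, copy_key_to_nodes, and DRY_RUN are hypothetical names.

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

copy_key_to_nodes() {
    for ip in "$@"; do
        # Step 4: copy the public key (prompts for the account password once).
        run ssh-copy-id "$ip" || return 1
        # Step 5: test the connection. BatchMode makes ssh fail instead of
        # prompting if passwordless access is not working.
        run ssh -o BatchMode=yes "$ip" hostname || return 1
    done
}
```

For example, `DRY_RUN=1; copy_key_to_nodes 10.0.0.167` prints the two commands for that node without contacting it. Run the function once as root and once as mapr, as described above.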

Congratulations! Your AWS environment is now set up and ready for you to install Hadoop.
