
Building a serverless BI solution in AWS Cloud
Problem statement and current challenges
Problem Statement

Seamless data processing and analysis of enterprise data from different
sources on the cloud, at minimum cost, using on-demand resources.

Challenge

1. Lack of a unified data processing system.

2. A 360-degree view of the data is not available due to a rigid and complicated
architecture.

3. Capturing streaming data and transforming it into KPI-related data.

4. Server cost and scalability issues.

5. High CapEx, OpEx, and maintenance costs.


BI over AWS Cloud – Solution Overview
Components Used

1. Boto3 – AWS SDK for Python, used to upload data files to AWS S3

2. Kinesis Firehose – processing streaming data

3. AWS S3 – storage service to host the files

4. AWS Lambda – event-driven, serverless compute platform used to run the ETL
automatically and send a notification when the KPI is generated

5. Redshift – data warehouse service of AWS

6. SNS – notification service

7. QuickSight – reporting and dashboarding tool

8. API Gateway – managed service to expose the data through online APIs
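The two ingest paths above (Boto3 for file uploads, Kinesis Firehose for streaming records) can be sketched as follows. This is a minimal illustration, not the original implementation: the function names are ours, and the clients are passed in as parameters so the helpers can be exercised without live AWS credentials.

```python
def upload_batch_file(s3_client, local_path: str, bucket: str, key: str) -> None:
    """Upload a batch data file to an S3 bucket (file-data ingest via Boto3)."""
    # Mirrors the boto3 S3 client API: upload_file(Filename, Bucket, Key)
    s3_client.upload_file(local_path, bucket, key)


def put_streaming_record(firehose_client, stream_name: str, payload: bytes) -> dict:
    """Push a single record into a Kinesis Firehose delivery stream."""
    return firehose_client.put_record(
        DeliveryStreamName=stream_name,
        Record={"Data": payload},
    )
```

In a real deployment these would be called with actual clients, e.g. `upload_batch_file(boto3.client("s3"), "sales.csv", "my-bi-raw-bucket", "raw/sales.csv")`; the bucket and delivery-stream names are placeholders for illustration.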
Key Benefits
1. Serverless compute services run in response to events and automatically manage the underlying compute resources.

2. Resources are available when you need them, with no virtual machines to manage and pay-as-you-use pricing.

3. Lower cost: capital expenses are non-existent; there are only operational expenses, and you pay only for what is used. When the resources are no longer needed, simply stop using them and stop paying for them.

4. Faster time to value: no time spent ordering new hardware, and far less time installing, configuring, and maintaining it. The cloud is a place for rapid
and agile development, and you can work and access it from anywhere.

5. Cloud data centers are not visibly recognisable from outside the building, and the way clusters of data centers are built guarantees that
when one physical data center (availability zone) fails, your services and resources remain available.

6. AWS services like Kinesis and Boto3 are used for uploading streaming and file data, alongside serverless services such as Lambda, Glue (ETL), Redshift, and SNS.

7. Data transformation is done with PySpark in AWS Glue.

8. The KPIs/reports can be viewed on demand in AWS QuickSight.

9. The data can be consumed by external parties using the online API Gateway service, and proactive alert messages can be sent to business users through SNS.
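The notification step described above can be sketched as a Lambda handler that reacts to an S3 event (a generated KPI file landing in a bucket) and publishes an alert through SNS. This is an assumed shape, not the original code: the topic ARN and bucket layout are placeholders, and the SNS client is injectable so the handler can be tested without AWS access.

```python
import json


def lambda_handler(event, context, sns_client=None):
    """S3-triggered handler: alert business users via SNS when a KPI file lands."""
    if sns_client is None:
        import boto3  # boto3 ships with the AWS Lambda Python runtime
        sns_client = boto3.client("sns")

    notified = []
    for record in event.get("Records", []):
        # Standard S3 event-notification structure: bucket name and object key
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        message = f"KPI file generated: s3://{bucket}/{key}"
        sns_client.publish(
            # Placeholder topic ARN; replace with the real alerts topic.
            TopicArn="arn:aws:sns:us-east-1:123456789012:kpi-alerts",
            Subject="KPI report ready",
            Message=message,
        )
        notified.append(message)
    return {"statusCode": 200, "body": json.dumps(notified)}
```

Wiring this up means configuring the KPI output bucket to send `s3:ObjectCreated:*` events to the function and subscribing the business users to the SNS topic.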
Way forward plan & Resource Requirement
Way forward plan

1. Build the POC on AWS Cloud with sample simulated data.

2. Calculate costs and build a template for total cost of ownership.

3. Port the model to Microsoft Azure and Google Cloud.

Resource Requirement

1. An AWS account. Once the POC is successful, Azure and Google Cloud accounts will be required.

2. One resource with knowledge of cloud development.

3. 3 to 4 weeks for the POC on AWS Cloud.
