Data Science
Teams Work
[Diagram: Custodian vs. Aladdin NAV, broken down on each side into Expenses, Reclaims, Assets, and Cash]
Problem Statement
When NAV or any component of its breakdown does not match, an
Exception is raised.
Exceptions are shown on the Exception Monitor.
For 1 year, a person classified Exceptions manually.
Exceptions are generated every day and are not
classified automatically.
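The matching rule above can be sketched as follows. This is only an illustration under stated assumptions: the component names, the `find_breaks` helper, and the 0.01 tolerance are hypothetical and are not taken from the actual system.

```python
from decimal import Decimal

# Components compared between the two sources (names assumed for illustration).
COMPONENTS = ("nav", "expenses", "reclaims", "assets", "cash")


def find_breaks(custodian, aladdin, tolerance=Decimal("0.01")):
    """Return the components whose values differ by more than `tolerance`."""
    breaks = {}
    for name in COMPONENTS:
        diff = Decimal(custodian[name]) - Decimal(aladdin[name])
        if abs(diff) > tolerance:
            breaks[name] = diff
    return breaks


# Made-up values for one portfolio, one record per source.
custodian = {"nav": "1000.00", "expenses": "12.50", "reclaims": "3.00",
             "assets": "950.00", "cash": "59.50"}
aladdin = {"nav": "998.00", "expenses": "14.50", "reclaims": "3.00",
           "assets": "950.00", "cash": "59.50"}

breaks = find_breaks(custodian, aladdin)
# A non-empty result means this portfolio would raise an Exception.
print(breaks)
```

Here the NAV and expenses values differ between the two records, so both components are reported; identical records return an empty dictionary.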
Data Extraction
Data from table port_group
Portfolio and its aggregated sleeves [portfolio_code]
Data Cleaning
Removed outliers
Removed ambiguity in labels [ Accured Expenses, Accrued
Expenses, Accrued Expense -- all are the same ]
Levels of comments: 97, reduced to 30
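One way to remove this kind of label ambiguity is a lookup from known variants to a single canonical label. The sketch below is an assumption about how such a step could look, not the cleaning code that was actually used; the `CANONICAL` table, the `normalize_label` helper, and the "Cash Break" label are hypothetical.

```python
import re

# Hypothetical lookup from normalized variants to one canonical label.
CANONICAL = {
    "accured expenses": "Accrued Expenses",
    "accrued expenses": "Accrued Expenses",
    "accrued expense": "Accrued Expenses",
}


def normalize_label(raw):
    """Collapse case and whitespace, then map known variants to one label."""
    key = re.sub(r"\s+", " ", raw).strip().lower()
    # Unknown labels pass through unchanged (apart from outer whitespace).
    return CANONICAL.get(key, raw.strip())


labels = ["Accured Expenses", "accrued  expenses", " Accrued Expense", "Cash Break"]
distinct = sorted({normalize_label(label) for label in labels})
print(distinct)
```

Applying the same function to every comment before counting distinct values is one way the number of levels could shrink.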
Data
Portfolio_name
Portfolio_code
Portfolio_group
Company Name
P. NAV [continuous]
Q. NAV [continuous]
P. Expenses [continuous]
P. Reclaims [continuous]
P. Asset [continuous]
P. Cash [continuous]
Q. Expenses [continuous]
Q. Reclaims [continuous]
Q. Asset [continuous]
Q. Cash [continuous]
Start time
End time
Status [Class]
Features Selected
P. NAV [continuous]
Q. NAV [continuous]
P. Expenses [continuous]
P. Reclaims [continuous]
P. Asset [continuous]
P. Cash [continuous]
Q. Expenses [continuous]
Q. Reclaims [continuous]
Q. Asset [continuous]
Q. Cash [continuous]
Features Extracted
Model Used
Random Forest
Accuracy: 74% on test data
BlackRock
Software Development
Problem Statement
Given two datasets
Compute the difference between them.
Generate a report (.csv)
Useful for business users
Difference between two .csv files
Product features
Diffs a dataset of 40 lakh (4 million) entries in 5 minutes
Ignore columns can be defined
Ignore the time column [ it will differ in data pulled by a SQL query ]
Code Input
Path of Src_a
Path of Src_b
Key_column: uniquely identifies a row [ e.g.
portfolio_code, transactional_id ]
Ignore_column: creation time
Integer_column
Match set: src_a [ column_name ] operator src_b
[ column_name ] operator value
Rename_column: src_a column and src_b column
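To make the inputs concrete, here is a minimal Python sketch of a keyed diff that honours a key column, ignore columns, and integer columns. It is not the DiffTool implementation: the function names are hypothetical, the two sources are in-memory strings rather than file paths, and both sources are assumed to share the same column names (that is, any renaming has already been applied).

```python
import csv
import io


def load_rows(text, key_column, ignore_columns):
    """Index CSV rows by their key value, dropping the ignored columns."""
    rows = {}
    for row in csv.DictReader(io.StringIO(text)):
        for column in ignore_columns:
            row.pop(column, None)
        rows[row[key_column]] = row
    return rows


def diff_sources(src_a, src_b, key_column, ignore_columns=(), integer_columns=()):
    """Yield (key, column, value_a, value_b) for every mismatch."""
    rows_a = load_rows(src_a, key_column, ignore_columns)
    rows_b = load_rows(src_b, key_column, ignore_columns)
    for key in sorted(rows_a.keys() | rows_b.keys()):
        row_a, row_b = rows_a.get(key), rows_b.get(key)
        if row_a is None or row_b is None:
            # The row exists in only one of the two sources.
            yield (key, "<row>",
                   "missing" if row_a is None else "present",
                   "missing" if row_b is None else "present")
            continue
        for column in row_a:
            if column == key_column:
                continue
            value_a, value_b = row_a[column], row_b[column]
            if column in integer_columns:
                # Compare numerically so "007" and "7" count as equal.
                value_a, value_b = int(value_a), int(value_b)
            if value_a != value_b:
                yield (key, column, value_a, value_b)


# Made-up sample data; creation_time differs on every row and is ignored.
src_a = ("portfolio_code,quantity,creation_time\n"
         "P1,100,2020-01-01 10:00\nP2,007,2020-01-01 10:00\nP3,5,2020-01-01 10:00\n")
src_b = ("portfolio_code,quantity,creation_time\n"
         "P1,100,2020-01-02 11:30\nP2,7,2020-01-02 11:30\n"
         "P3,6,2020-01-02 11:30\nP4,1,2020-01-02 11:30\n")

diffs = list(diff_sources(src_a, src_b, "portfolio_code",
                          ignore_columns=("creation_time",),
                          integer_columns=("quantity",)))

# Write the differences as a .csv report (kept in memory here).
report = io.StringIO()
writer = csv.writer(report)
writer.writerow(["key", "column", "src_a", "src_b"])
writer.writerows(diffs)
print(report.getvalue())
```

Indexing both sources by the key column keeps the comparison to a single pass over the keys, at the cost of holding both datasets in memory; whether that trade-off suits a multi-million-row input depends on the environment.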
Code Output
Sources
Hyperlink
Difference integer
Regression Testing
[Diagram: Shell Script; Run Code [Before] and Run Code [After], each followed by Save DB (.csv); DiffTool (.java); Report]