A 15-minute solution to the missing data problem in RPA

Predict a product category from invoice data with 99.8% accuracy, in 15 minutes. See how it is done, and try it yourself!

Missing data is everywhere

Dataset

How to implement a missing value predictor with Aito

Prerequisites

Upload the dataset with the CLI


mv Train.csv InvoiceData.csv
aito database quick-add-table InvoiceData.csv

Review schema


curl -X GET \
https://$AITO_INSTANCE_NAME.api.aito.ai/api/v1/schema/InvoiceData \
-H "x-api-key: $AITO_API_KEY" \
-H "content-type: application/json"
{
  "columns": {
    "GL_Code": {
      "nullable": false,
      "type": "String"
    },
    "Inv_Amt": {
      "nullable": false,
      "type": "Decimal"
    },
    "Inv_Id": {
      "nullable": false,
      "type": "Int"
    },
    "Item_Description": {
      "analyzer": "english",
      "nullable": false,
      "type": "Text"
    },
    "Product_Category": {
      "nullable": false,
      "type": "String"
    },
    "Vendor_Code": {
      "nullable": false,
      "type": "String"
    }
  },
  "type": "table"
}
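The schema above can also serve as a quick sanity check before rows are sent to the table. The following is a minimal Python sketch, not part of Aito's tooling: `validate_row` and the `PYTHON_TYPES` mapping are hypothetical helpers, and the mapping from schema types to Python types is an assumption made for illustration.

```python
# Schema as returned by the GET request above (the "analyzer" field is omitted for brevity).
SCHEMA = {
    "columns": {
        "GL_Code": {"nullable": False, "type": "String"},
        "Inv_Amt": {"nullable": False, "type": "Decimal"},
        "Inv_Id": {"nullable": False, "type": "Int"},
        "Item_Description": {"nullable": False, "type": "Text"},
        "Product_Category": {"nullable": False, "type": "String"},
        "Vendor_Code": {"nullable": False, "type": "String"},
    },
    "type": "table",
}

# Assumed mapping from schema column types to acceptable Python types.
PYTHON_TYPES = {
    "String": (str,),
    "Text": (str,),
    "Int": (int,),
    "Decimal": (int, float),
}


def validate_row(schema, row):
    """Return a list of problems found in `row`; an empty list means it fits."""
    problems = []
    for name, spec in schema["columns"].items():
        value = row.get(name)
        if value is None:
            if not spec.get("nullable", False):
                problems.append(f"{name}: missing but not nullable")
            continue
        expected = PYTHON_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(
                f"{name}: expected {spec['type']}, got {type(value).__name__}"
            )
    # Flag fields that the table does not know about.
    for name in row:
        if name not in schema["columns"]:
            problems.append(f"{name}: not in schema")
    return problems
```

Because every column is marked as not nullable, a row with a missing value cannot be stored as-is. This is where predicting the missing field first becomes useful.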

Run a prediction


curl -X POST \
https://$AITO_INSTANCE_NAME.api.aito.ai/api/v1/_predict \
-H "x-api-key: $AITO_API_KEY" \
-H "content-type: application/json" \
-d '
{
  "from": "InvoiceData",
  "where": {
    "GL_Code": "GL-6101400",
    "Inv_Amt": 55.93,
    "Item_Description": "Arabian American Development Co Final Site Clean Up 2008-Oct General Requirements General Contractor Store Construction",
    "Vendor_Code": "VENDOR-1254"
  },
  "predict": "Product_Category",
  "limit": 3
}'
{
  "offset": 0,
  "total": 36,
  "hits": [
    {
      "$p": 0.9998137062471193,
      "field": "Product_Category",
      "feature": "CLASS-1522"
    },
    {
      "$p": 7.345473788575557E-5,
      "field": "Product_Category",
      "feature": "CLASS-1828"
    },
    {
      "$p": 3.118734024876525E-5,
      "field": "Product_Category",
      "feature": "CLASS-1983"
    }
  ]
}
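In an RPA workflow the same query would usually be built from an invoice record, and the response reduced to a single decision. The sketch below shows one way to do that in Python. The helper names, the `FEATURE_COLUMNS` list, and the 0.9 probability threshold are all assumptions for illustration; only the query shape and the response fields (`hits`, `$p`, `feature`) come from the example above.

```python
import json

# Columns used as known inputs in the "where" clause of the query above.
FEATURE_COLUMNS = ["GL_Code", "Inv_Amt", "Item_Description", "Vendor_Code"]


def build_predict_query(invoice, target="Product_Category", limit=3):
    """Build a _predict query body from the known fields of an invoice."""
    where = {
        col: invoice[col] for col in FEATURE_COLUMNS if invoice.get(col) is not None
    }
    return {"from": "InvoiceData", "where": where, "predict": target, "limit": limit}


def top_prediction(response, min_probability=0.9):
    """Return the most probable feature, or None if it is below the threshold.

    The threshold is a hypothetical cut-off: anything less certain would be
    handed over to a human instead of being filled in automatically.
    """
    hits = response.get("hits", [])
    if not hits:
        return None
    best = max(hits, key=lambda hit: hit["$p"])
    if best["$p"] < min_probability:
        return None
    return best["feature"]


invoice = {
    "GL_Code": "GL-6101400",
    "Inv_Amt": 55.93,
    "Item_Description": "Arabian American Development Co Final Site Clean Up "
    "2008-Oct General Requirements General Contractor Store Construction",
    "Vendor_Code": "VENDOR-1254",
}
# This string is what would go into the -d argument of the curl call.
body = json.dumps(build_predict_query(invoice))
```

With the response shown above, `top_prediction` would return `CLASS-1522`, since its probability is well above the assumed threshold.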

Evaluating accuracy

{
  "test": {
    "$index": {
      "$mod": [3, 0]
    }
  },
  "evaluate": {
    "from": "InvoiceData",
    "where": {
      "GL_Code": {"$get": "GL_Code"},
      "Inv_Amt": {"$get": "Inv_Amt"},
      "Item_Description": {"$get": "Item_Description"},
      "Vendor_Code": {"$get": "Vendor_Code"}
    },
    "predict": "Product_Category"
  },
  "select": ["trainSamples", "testSamples", "accuracy", "error"]
}
{
  "trainSamples": 3711.0,
  "testSamples": 1856,
  "accuracy": 0.9983836206896551,
  "error": 0.001616379310344862
}
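The numbers in the result can be checked with a little arithmetic. The Python sketch below assumes that `"$mod": [3, 0]` selects every row whose zero-based index leaves remainder 0 when divided by 3. That reading is an assumption, but it is consistent with the reported sample counts. The helper `split_by_index` is hypothetical.

```python
def split_by_index(n_rows, divisor=3, remainder=0):
    """Count train and test rows when rows with index % divisor == remainder are held out."""
    test = sum(1 for i in range(n_rows) if i % divisor == remainder)
    return n_rows - test, test


# Total row count implied by the evaluation result above.
total_rows = 3711 + 1856

train, test = split_by_index(total_rows)

# The reported error rate corresponds to a whole number of misclassified test rows.
errors = round(0.001616379310344862 * test)
accuracy = (test - errors) / test
```

Under this reading, every third row (1856 of 5567) is held out for testing, and the reported 99.8% accuracy corresponds to 3 wrong predictions out of 1856 test invoices.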

Operationalise


curl -X POST \
https://$AITO_INSTANCE_NAME.api.aito.ai/api/v1/data/InvoiceData \
-H "content-type: application/json" \
-H "x-api-key: $AITO_API_KEY" \
-d '
{
  "GL_Code": "GL-9900990",
  "Inv_Amt": 39.00,
  "Inv_Id": 34001,
  "Item_Description": "Predictive database monthly subscription, developer tier",
  "Product_Category": "CLASS-9999",
  "Vendor_Code": "VENDOR-9999"
}'
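The same request can be prepared from a script instead of curl, for example inside an automation step that writes each processed invoice back to the table. This is a minimal Python sketch using only the standard library. `build_insert_request` is a hypothetical helper, and the URL pattern and headers are copied from the curl call above.

```python
import json
import urllib.request


def build_insert_request(instance_name, api_key, table, row):
    """Prepare (but do not send) a POST request that adds one row to a table."""
    url = f"https://{instance_name}.api.aito.ai/api/v1/data/{table}"
    return urllib.request.Request(
        url,
        data=json.dumps(row).encode("utf-8"),
        headers={"content-type": "application/json", "x-api-key": api_key},
        method="POST",
    )


new_row = {
    "GL_Code": "GL-9900990",
    "Inv_Amt": 39.00,
    "Inv_Id": 34001,
    "Item_Description": "Predictive database monthly subscription, developer tier",
    "Product_Category": "CLASS-9999",
    "Vendor_Code": "VENDOR-9999",
}
# "my-instance" and "my-api-key" are placeholders for real credentials.
request = build_insert_request("my-instance", "my-api-key", "InvoiceData", new_row)
```

Passing `request` to `urllib.request.urlopen` would send it. Building the request separately keeps the payload easy to inspect and log before anything is written.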

Comparison

Summary

Aito.ai decision automation in the cloud. #ML for #nocode and #rpa operators.

Love podcasts or audiobooks? Learn on the go with our new app.

Recommended from Medium

Object Localization

An Easy Guide to Gradient Descent and its Variants — Everything you need to know as a Data…

Corona Face Mask Detection with Custom Vision and Tensorflow.js

MLflow: a better way to track your models

Maximum Likelihood Estimation (MLE) for Machine Learning

XGBoost: A Complete Guide to Fine-Tune and Optimize your Model

Implementing EfficientNet in PyTorch Part 3: MBConv, Squeeze-and-Excitation, and More

Reinforced Learning — “Techniques, Applications and benefit over Deep Learning”

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
aito.ai

aito.ai

Aito.ai decision automation in the cloud. #ML for #nocode and #rpa operators.

More from Medium

Experiments with: Altair

AIRBNB — Seattle data Analysis

Seattle city image

An XGBClassifier Application in Audio

Santander Customer Satisfaction — Self Case Study