GCP AI Platform Model Deployment on Distributed Machines
Hands-On Lab
Objective:
We will go over the process of training a pre-packaged machine learning model on AI Platform, deploying the trained model, and running predictions against it using distributed machines.
Actions:
- We will focus on the AI Platform aspects, not TensorFlow
- We will run many commands on the command line with gcloud, using Cloud Shell
Step Summary:
- Run the trainer locally using gcloud ai-platform commands
- Submit training job on AI Platform, both single and distributed
- Deploy trained model, and submit predictions
Machine type:
- STANDARD_1
- 1 master
- 4 workers
- 3 parameter servers
- Training runs across the 4 worker machines plus the 3 parameter servers
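For context, STANDARD_1 is a predefined scale tier that provisions this topology automatically. If you ever need a different layout, AI Platform also accepts a CUSTOM tier described in a YAML file passed via --config; the sketch below is purely illustrative (the machine type values are assumptions, and this lab itself only uses --scale-tier STANDARD_1):
trainingInput:
  scaleTier: CUSTOM
  masterType: standard          # machine type for the single master
  workerType: standard          # machine type for each worker
  workerCount: 4                # matches the 4 workers above
  parameterServerType: standard
  parameterServerCount: 3       # matches the 3 parameter servers above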
If you want a set of scripts to automate much of the process, the scripts and a PDF file of the commands we used are available in the public Cloud Storage bucket referenced below.
To download all scripts and files in a terminal environment, use the following command:
- gsutil cp gs://acg-gcp-course-exercise-scripts/data-engineer/ai-platform/* .
(The period at the end is required to copy to your current location.)
Step 1: Enable the Google Cloud API
- Go to the web console --> Go to Menu --> Search for AI Platform in Products --> Click on Jobs --> Enable API
- Enabling takes a few minutes; you can refresh the page to check
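- Alternatively, the same API can be enabled from Cloud Shell instead of the console (ml.googleapis.com is the AI Platform Training and Prediction service):
- gcloud services enable ml.googleapis.com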
Step 2: Prepare environment
- Go to cloud shell and execute commands
- ### Set region environment variable (we will run in us-central1)
- REGION=us-central1
- ### Download and unzip ML github data for demo
- wget https://github.com/GoogleCloudPlatform/cloudml-samples/archive/master.zip
- unzip master.zip
Step 3: Setup Directory
- Check the files
- ls
- ### Navigate to the cloudml-samples-master > census > estimator directory.
- cd ~/cloudml-samples-master/census/estimator
- ### **Develop and validate trainer on local machine**
- ### Get training data from public GCS bucket
- mkdir data
- gsutil -m cp gs://cloud-samples-data/ml-engine/census/data/* data/
- ### Set path variables for local file paths. These will change later when we use AI Platform
- TRAIN_DATA=$(pwd)/data/adult.data.csv
- EVAL_DATA=$(pwd)/data/adult.test.csv
Step 4: Install the required TensorFlow version
- ### Install from the sample's requirements.txt to ensure we're using the same version of TensorFlow as the sample
- sudo pip install -r ~/cloudml-samples-master/census/requirements.txt
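- ### Optional sanity check (not in the original steps): confirm the installed TensorFlow matches the 1.x line that the jobs below pin with --runtime-version 1.15
- python -c "import tensorflow as tf; print(tf.__version__)"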
Step 5: Specify the Output Directory and Clear It
- ### Specify output directory, set as variable
- MODEL_DIR=output
- ### Best practice is to delete contents of output directory in case
- ### data remains from previous training run
- rm -rf $MODEL_DIR/*
Step 6: Run a local trainer
- Run a local training session using AI Platform
- It will take the ML model from the GitHub repository and train it using our CSV data
- It will train for 1000 steps and evaluate over 100 test steps to check accuracy
- ### Run local training using gcloud; this runs the training process in the local Cloud Shell session
gcloud ai-platform local train \
--module-name trainer.task \
--package-path trainer/ \
--job-dir $MODEL_DIR \
-- \
--train-files $TRAIN_DATA \
--eval-files $EVAL_DATA \
--train-steps 1000 \
--eval-steps 100
- You can view the results in visual form in the TensorBoard dashboard
- tensorboard --logdir=$MODEL_DIR --port=8080
- The output includes a URL you can open directly in Chrome (or use Cloud Shell's web preview on port 8080); it is a local viewer of accuracy, loss, and training progress
- You may hit errors due to code and version mismatches; resolve them in your environment before proceeding
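- ### Optional: inspect what local training produced before moving to the cloud; the export/ subdirectory holds the SavedModel we will later deploy (exact layout is the census sample's usual output, noted here as an assumption)
- ls -R $MODEL_DIR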
Step 7: Execute the Trainer on Distributed Instances
- To run the trainer on GCP AI Platform, we first stage our data in Cloud Storage
- ### Create regional Cloud Storage bucket used for all output and staging
- gsutil mb -l $REGION gs://$DEVSHELL_PROJECT_ID-aip-demo
- ### Upload training and test/eval data to bucket
- cd ~/cloudml-samples-master/census/estimator
- gsutil cp -r data gs://$DEVSHELL_PROJECT_ID-aip-demo/data
- ### Set data variables to point to storage bucket files
- TRAIN_DATA=gs://$DEVSHELL_PROJECT_ID-aip-demo/data/adult.data.csv
- EVAL_DATA=gs://$DEVSHELL_PROJECT_ID-aip-demo/data/adult.test.csv
- ### Copy test.json to storage bucket
- It will be used later for predictions
- gsutil cp ../test.json gs://$DEVSHELL_PROJECT_ID-aip-demo/data/test.json
- ### Set TEST_JSON to point to the same storage bucket file
- TEST_JSON=gs://$DEVSHELL_PROJECT_ID-aip-demo/data/test.json
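- ### Optional verification (not in the original lab): confirm the uploads landed where the variables above expect
- gsutil ls gs://$DEVSHELL_PROJECT_ID-aip-demo/data/
- You should see adult.data.csv, adult.test.csv, and test.json listed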
Step 8: Setup Jobs
- ### Set variables for job name and output path
- JOB_NAME=census_dist_1
- OUTPUT_PATH=gs://$DEVSHELL_PROJECT_ID-aip-demo/$JOB_NAME
- ### Submit a single process job to AI Platform
- ### Job name is JOB_NAME (census_dist_1)
- ### Output path is our Cloud Storage bucket/job_name
- ### Training and evaluation/test data is in our Cloud Storage bucket
gcloud ai-platform jobs submit training $JOB_NAME \
--job-dir $OUTPUT_PATH \
--runtime-version 1.15 \
--module-name trainer.task \
--package-path trainer/ \
--region $REGION \
--scale-tier STANDARD_1 \
-- \
--train-files $TRAIN_DATA \
--eval-files $EVAL_DATA \
--train-steps 1000 \
--verbosity DEBUG \
--eval-steps 100
Step 9: View in AI Platform
- ### You can view streaming logs/output with: gcloud ai-platform jobs stream-logs $JOB_NAME
- ### When complete, inspect the output path with: gsutil ls -r $OUTPUT_PATH
We can verify the training input in the AI Platform console and check the output files in the bucket.
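You can also check job status from the terminal with a standard gcloud command (optional):
- gcloud ai-platform jobs describe $JOB_NAME
- The state field moves from QUEUED/PREPARING through RUNNING to SUCCEEDED when training completes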
Step 10: Prediction Phase
- ### Deploy a model for prediction, setting variables in the process
- cd ~/cloudml-samples-master/census/estimator
- MODEL_NAME=census
- ### Create the ML Engine model
- gcloud ai-platform models create $MODEL_NAME --regions=$REGION
Step 11: Set the job output we want to use and create a model version
- ### This example uses census_dist_1
- OUTPUT_PATH=gs://$DEVSHELL_PROJECT_ID-aip-demo/census_dist_1
- ### CHANGE census_dist_1 to use a different output from a previous run
- ### IMPORTANT - Look up and set the full path for the exported trained model binaries
- ### gsutil ls -r $OUTPUT_PATH/export
- ### Look for the directory $OUTPUT_PATH/export/census/ and copy/paste the timestamp value (without colon) into the command below
- MODEL_BINARIES=gs://$DEVSHELL_PROJECT_ID-aip-demo/census_dist_1/export/census/<timestamp> ### CHANGE ME!
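- ### Optional shortcut (a convenience sketch, assuming only this job's exports live under export/census/): pick up the latest export automatically instead of pasting the timestamp by hand
- MODEL_BINARIES=$(gsutil ls $OUTPUT_PATH/export/census/ | tail -1)
- echo $MODEL_BINARIES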
- ### Create version 1 of your model
gcloud ai-platform versions create v1 \
--model $MODEL_NAME \
--origin $MODEL_BINARIES \
--runtime-version 1.15
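- ### Optional: confirm the version deployed; creation can take a few minutes before v1 reaches the READY state
- gcloud ai-platform versions list --model $MODEL_NAME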
Step 12: Generate an online prediction first
- ### Send an online prediction request to our deployed model using test.json file
- ### Results come back with a direct response
gcloud ai-platform predict \
--model $MODEL_NAME \
--version v1 \
--json-instances \
../test.json
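To see what is being sent, you can peek at the instance file first; each line of test.json is one newline-delimited JSON instance, which is how --json-instances interprets the file:
- head -1 ../test.json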
Step 13: Send a batch prediction job using the same test.json file
- ### Results are exported to a Cloud Storage bucket location
- ### Set job name and output path variables
- JOB_NAME=census_prediction_2
- OUTPUT_PATH=gs://$DEVSHELL_PROJECT_ID-aip-demo/$JOB_NAME
- ### Submit the prediction job
gcloud ai-platform jobs submit prediction $JOB_NAME \
--model $MODEL_NAME \
--version v1 \
--data-format TEXT \
--region $REGION \
--input-paths $TEST_JSON \
--output-path $OUTPUT_PATH/predictions
Step 14: View prediction results in the web console at:
gs://$DEVSHELL_PROJECT_ID-aip-demo/$JOB_NAME/predictions/
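You can also pull the results straight from the terminal once the batch job finishes (the output file names follow AI Platform's usual prediction.results-* pattern, noted here as an assumption):
- gsutil ls $OUTPUT_PATH/predictions/
- gsutil cat $OUTPUT_PATH/predictions/prediction.results-*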